Intelligent Movie Recommendation System Using AI and ML
https://doi.org/10.22214/ijraset.2022.42255
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue V May 2022- Available at www.ijraset.com
Abstract: Recommender systems are systems that suggest products, solutions or results similar to the ones you are looking for.
For example, if you go to a clothing shop and ask for a T-shirt, the shopkeeper shows you T-shirts in different designs and
colors. On websites, this task is performed by recommender systems. A recommendation engine uses several algorithms to filter
data and then recommends the most relevant items to consumers. A movie recommender system recommends the most relevant and
closely related movies for a given search category. If a user visits a movie site for the first time, the site has no previous
history of that user. In such cases, the user can search for recommendations based on genre, year of release, director, actor,
or a favorite movie to get a new movie recommendation.
Keywords: Movie Recommendation Systems, Content-Based Filtering, Movie recommendation, machine learning project.
I. INTRODUCTION
In the 21st century, e-commerce over the internet keeps growing, and the online shopping and entertainment industries are at peak
levels. Doing everything online will be the new normal in the upcoming years. Imagine you are shopping on a website like amazon.com.
It has over 60 million products for sale, and the same goes for Flipkart and other e-commerce websites.
Entertainment websites like Netflix, Amazon Prime and Hotstar have over 10 million movies and series to watch.
If you want anything specific from these websites, you can simply search for it. But what about the rest of the products? What if
you want something similar to, or better than, your search results? Browsing the whole catalog would be like searching for a golden
tree in a forest. You would be lost and never find your way out.
That is where recommendation systems become your ally. A recommendation system acts as a guide within the systems of Amazon,
Netflix, etc. Without recommendation systems, many e-commerce and entertainment websites would be like a plain database, and you
would need to be sure of what you are looking for. It would be a great loss for these companies if people did not buy their
products or watch their movies. Similarly, it would be a great disadvantage for users if they could not find the product they need.
Therefore, having a recommendation system embedded into various websites is a necessity for both industry and users. We have
decided to learn and implement such a recommendation system and take it to the next level.
II. METHODOLOGY
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 611
A. Requirement/Data Gathering
Data is the foundation of any machine learning project. Gathering data from various datasets is key for a recommendation system.
The more data is available, the better the recommendation results.
B. Pre-processing
In the pre-processing stage, we filter the data and make it ready for the project. For example, we build tags that describe each
item and help us calculate its similarity with other items.
C. System Designing
In the system design phase, we design a system that is easily understood by the end user, i.e., user friendly. We draw UML
diagrams and data flow diagrams to understand the system flow, the system modules and the sequence of execution.
D. Website Designing
After we have created a working model, we will turn it into a website. This stage involves designing an immersive UI.
E. Deployment of System
Once functional and non-functional testing is done, the product is deployed in a virtual environment or released over the
internet on a hosting platform such as Heroku.
2) Collaborative Recommender System: In a collaborative recommender system, data collected from users is used to recommend
different products. In effect, the users of the system collaborate with each other to recommend products. The similarity between
users is calculated, and products are recommended from one user to another similar user. For example, suppose User A and User B
are similar. The products liked by User A are recommended to User B. Similarly, the products used by User B can be recommended
to User A. This type of recommendation system can be seen on Facebook and other social media platforms, where you are
suggested friends of your friends or videos that your friends have watched.
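The user-to-user matching described above can be sketched in a few lines of Python. This is only an illustration: the users, the products, the choice of Jaccard overlap as the similarity measure, and the names `jaccard` and `recommend_for` are all assumptions made for the example, not part of the system built in this paper.

```python
# Hypothetical like-history: user -> set of liked products (illustrative data only).
likes = {
    "user_a": {"laptop", "mouse", "keyboard"},
    "user_b": {"laptop", "mouse", "monitor"},
    "user_c": {"novel", "bookmark"},
}


def jaccard(a, b):
    """Overlap between two sets of liked items: 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_for(user, likes):
    """Suggest items liked by the most similar other user but not yet by `user`."""
    others = [u for u in likes if u != user]
    nearest = max(others, key=lambda u: jaccard(likes[user], likes[u]))
    return sorted(likes[nearest] - likes[user])


print(recommend_for("user_a", likes))  # prints ['monitor']
```

Here user_a and user_b share two of four distinct products, so user_b is the nearest neighbor and the one product user_a has not seen is suggested.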
3) Hybrid Recommendation System: A hybrid recommendation system is a combination of the content-based and collaborative
recommendation systems. Here, content or products are recommended based on the data provided by the user, and the system
also provides results based on your previous actions and history. This type of recommendation system takes data from
different platforms and merges it for a better user experience.
Google Search results and suggestions are a good example of a hybrid recommendation system.
C. Comparison
Each recommendation system has its own advantages and disadvantages. The deciding factors are where the system is going to be
used and how effective the selected recommendation system will be under those conditions.
Famous Recommendation Systems:
A personalized recommender system analyzes a huge amount of user behavior data and provides personalized content to different
users. E-commerce, movies, video, music, social networks, reading, location-based services, email and advertising are some of the
fields that widely use such systems. They improve the click rate and conversions of a website.
D. E-Commerce
Websites like Amazon.com and Flipkart use recommender systems based on the user's previous purchase history. These recommender
systems also use filters to select the right product to suggest. The recommender additionally suggests products that can be
used together with other products or that other customers bought at the same time. Following are some of the recommendation
pages on Amazon:
B. Extraction of Data
The extraction step takes into account only the tables and information required to describe a movie. Any unwanted data should be
dropped. For example, the movie length and movie budget do not affect the similarity of movies. Records with null values are also
dropped to avoid missing-value errors and difficulties later.
C. Creation of Tags
A data frame is a result set holding only the necessary values. In our case, these are Movie Title, Genre, Release Date, Cast and
Crew. To make an intelligent movie-to-movie recommendation, we need to create a single column named ‘Tags’ that combines all the
available data and keywords. Each tag represents a single movie from the dataset.
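As a rough illustration of how such tags might be assembled, the sketch below joins the descriptive fields of each movie into one string. The two records, the field names, the helper `build_tag`, and the decision to remove spaces inside names (so that a full name counts as one token) are assumptions for the example, not details confirmed by the paper.

```python
# Hypothetical records; the field names are assumptions chosen for illustration.
movies = [
    {"title": "Movie One", "genres": ["Action", "Science Fiction"],
     "keywords": ["space war"], "cast": ["Jane Doe"], "crew": ["John Smith"]},
    {"title": "Movie Two", "genres": ["Drama"],
     "keywords": ["family"], "cast": ["Jane Doe"], "crew": ["Ann Lee"]},
]


def build_tag(movie):
    """Join all descriptive fields of one movie into a single lowercase tag."""
    parts = []
    for field in ("genres", "keywords", "cast", "crew"):
        # Remove inner spaces so that "Jane Doe" becomes one token, "JaneDoe".
        parts.extend(value.replace(" ", "") for value in movie[field])
    return " ".join(parts).lower()


# One tag per movie, keyed by title.
tags = {movie["title"]: build_tag(movie) for movie in movies}
print(tags["Movie One"])  # prints: action sciencefiction spacewar janedoe johnsmith
```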
D. Normalization
The goal of normalization is to reduce the occurrence of near-duplicate words and of stop words (on, the, are, is, that). All the
available data should first be normalized, field by field. The Count Vectorizer is used to process the data and eliminate these
stop words. We can use a natural language processing toolkit (NLTK) to perform the stemming operation. The Porter Stemmer
replaces inflected forms of a word with a single stem. For example, ‘Loving, Loved, Loves’ would all be replaced by the single
word ‘love’. This makes calculating the similarity between tags easier and more accurate.
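The effect of this step can be illustrated with a small, dependency-free sketch. Note the assumptions: `simple_stem` is a deliberately crude suffix stripper written for this example, not the Porter algorithm, and the stop-word set contains only the five words listed above. A real pipeline would rely on a library stemmer and a fuller stop-word list.

```python
# Stop words taken from the examples in the text; real lists are much longer.
STOP_WORDS = {"on", "the", "are", "is", "that"}


def simple_stem(word):
    """Crude suffix stripping for illustration only (NOT the Porter algorithm)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: len(word) - len(suffix)]
            # Restore a trailing "e" for stems such as "lov" -> "love".
            return stem + "e" if stem.endswith("v") else stem
    return word


def normalize(tag):
    """Lowercase a tag, drop stop words, and stem the remaining words."""
    words = [w for w in tag.lower().split() if w not in STOP_WORDS]
    return " ".join(simple_stem(w) for w in words)


print(normalize("The hero is loving the city that loved him"))
# prints: hero love city love him
```

After normalization, "loving" and "loved" map to the same token, so two tags that use different forms of the word are counted as sharing it.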
E. Vectorization
Each created tag is now treated as a single piece of text and can be converted into a vector.
Vectors are points in a space, described by their coordinates, e.g. A (2, 3).
The conversion of text to vectors can be performed using a text vectorization method.
The methods available include TF-IDF, Word2Vec, etc.
‘Bag of Words’ is one such vectorization technique; it is easy to understand and the easiest to work with.
F. Bag of Words
In Bag of Words, we combine all the words of all the tags into one single long text.
For example: Tag1 + Tag2 + Tag3 + … = Text
As our dataset has 5000 movies, we get 5000 tags representing 5000 movies.
Now we have to find the 5000 most common words, which are used to describe each movie.
By counting the frequency of every word across the tags, we obtain these 5000 words.
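The word-frequency step can be sketched with the standard library alone. The three toy tags, the function name `build_vocabulary`, and the alphabetical tie-breaking rule are assumptions for the example; the vocabulary size is reduced from 5000 to 3 so the result is easy to check by hand.

```python
from collections import Counter

# Toy tags standing in for the 5000 movie tags.
tags = [
    "action space hero",
    "action hero love",
    "love drama family",
]


def build_vocabulary(tags, max_words):
    """Return the `max_words` most frequent words across all tags."""
    counts = Counter(word for tag in tags for word in tag.split())
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:max_words]]


vocabulary = build_vocabulary(tags, max_words=3)
print(vocabulary)  # prints ['action', 'hero', 'love']
```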
G. Matrix of Vectors
In vectorization, we obtained 5000 words with which to describe the 5000 movies.
Now we create a matrix of 5000 movies x 5000 words, of shape (5000, 5000), in which each row is the vector of one movie.
[Figure: movie vectors p (Movie 1), q (Movie 2), u (Movie 3), …, r (Movie 5000), with a distance d and an angle θ marked between vectors]
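A minimal sketch of how such a matrix could be built is given below, assuming a vocabulary has already been chosen. The three-word vocabulary, the toy tags and the helper `vectorize` are illustrative assumptions; with the real data the matrix would have 5000 rows and 5000 columns.

```python
def vectorize(tag, vocabulary):
    """Count how often each vocabulary word occurs in one tag."""
    words = tag.split()
    return [words.count(term) for term in vocabulary]


# Toy vocabulary and tags (assumed for illustration).
vocabulary = ["action", "hero", "love"]
tags = ["action space hero", "action hero love", "love drama family"]

# One row per movie, one column per vocabulary word.
matrix = [vectorize(tag, vocabulary) for tag in tags]
for row in matrix:
    print(row)
# prints:
# [1, 1, 0]
# [1, 1, 1]
# [0, 0, 1]
```

Words outside the vocabulary ("space", "drama", "family") are simply ignored, which is why limiting the vocabulary to the most frequent words keeps the matrix manageable.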
H. Similarity
To calculate the similarity between two movies, we need to calculate the distance between their vectors.
There are two methods to calculate this distance:
1) Euclidean Distance
2) Cosine Distance
I. Euclidean Distance
Euclidean distance is the straight-line distance from the tip of one vector to the tip of another; we can call it the tip-to-tip
distance between two line segments. It is a useful distance measure, but not very effective here, because it depends on the
magnitude of the vectors as well as their direction.
The Euclidean distance between two movies is therefore not an accurate measure of their similarity.
Formula for the Euclidean distance between p and q, where n is the number of words:
$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$
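The weakness of the tip-to-tip distance can be seen in a small numeric sketch. The three count vectors below are made-up examples: q uses exactly the same words as p but three times as often, while r shares only one word with p.

```python
import math


def euclidean_distance(p, q):
    """Straight-line (tip-to-tip) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


p = [1, 1, 0]
q = [3, 3, 0]  # same words as p, three times the counts
r = [0, 1, 1]  # shares only one word with p

print(euclidean_distance(p, q))  # about 2.83
print(euclidean_distance(p, r))  # about 1.41
```

By this measure p appears closer to r than to q, even though p and q describe the same content, which is the kind of distortion that motivates a direction-based measure.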
J. Cosine Distance
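Cosine similarity is measured by taking the cosine of the angle θ between two vectors. It is a very effective measure because it
depends only on the direction of the vectors, and for non-negative word-count vectors its value lies between 0 and 1. The cosine
distance is 1 minus this value.
Formula for the cosine similarity between p (Movie 1) and q (Movie 2):
$$\cos(\theta) = \frac{p \cdot q}{\lVert p \rVert \, \lVert q \rVert} = \frac{\sum_{i=1}^{n} p_i q_i}{\sqrt{\sum_{i=1}^{n} p_i^2} \, \sqrt{\sum_{i=1}^{n} q_i^2}}$$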
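The cosine of the angle between two vectors can be computed directly from its definition, as in the sketch below. The vectors are made-up examples, and returning 0.0 for an all-zero vector is a convention chosen here to avoid division by zero, not something specified in the paper.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # assumed convention for empty (all-zero) vectors
    return dot / (norm_p * norm_q)


p = [1, 1, 0]
q = [3, 3, 0]  # same direction as p, larger magnitude
r = [0, 1, 1]  # shares only one word with p

print(round(cosine_similarity(p, q), 3))  # prints 1.0
print(round(cosine_similarity(p, r), 3))  # prints 0.5
```

Unlike a tip-to-tip distance, this score treats p and q as identical in content because they point in the same direction.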
V. IMPLEMENTATION
We use a content-based recommendation system, as it is easier to build than collaborative and hybrid systems.
Many small businesses use content-based filtering in their e-commerce websites and online markets.
We use the cosine similarity to calculate a numeric quantity that denotes the similarity between two movies. We choose the
cosine similarity score since it is independent of magnitude and is relatively easy and fast to calculate. Mathematically, for
two movie vectors p and q, it is defined as follows:
$$\cos(\theta) = \frac{p \cdot q}{\lVert p \rVert \, \lVert q \rVert}$$
We are now in a good position to define our recommendation function. These are the steps we will follow:
1) Get the index of the movie given its title.
2) Get the list of cosine similarity scores of that particular movie with all movies. Convert it into a list of tuples where the
first element is the position and the second is the similarity score.
3) Sort this list of tuples by similarity score, that is, by the second element.
4) Get the top 10 elements of this list. Ignore the first element, as it refers to the movie itself (the movie most similar to a
particular movie is the movie itself).
5) Return the titles corresponding to the indices of the top elements.
While a system based only on plot descriptions does a decent job of finding movies with similar plots, the quality of its
recommendations is not that great. "The Dark Knight Rises" returns all the Batman movies, while it is more likely that people who
liked that movie would enjoy other Christopher Nolan movies. This cannot be captured by such a system. We therefore use a method
called text vectorization on richer metadata:
the credits, genres, actors and keywords are all combined and converted into tags.
We create a function ‘recommend’ that takes a movie name as input and looks up its index position in the similarity file. The
five highest scores identify the most similar movies, based on the tags we provided and the scores calculated by the similarity
function.
The movie names are retained along with their index positions.
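A function of this kind might look like the sketch below. It is an illustration under several assumptions: the titles and count vectors are made up, similarities are computed on the fly rather than read from a stored file, and the signature of `recommend` (including the `top_n` parameter) is hypothetical. The queried movie is excluded by index rather than by dropping the first sorted element, which gives the same result unless another movie ties with a score of 1.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def recommend(title, titles, vectors, top_n=5):
    """Return the titles of the `top_n` movies most similar to `title`."""
    index = titles.index(title)  # position of the queried movie
    # (position, score) pairs for every movie, including the query itself.
    scores = [(i, cosine_similarity(vectors[index], v)) for i, v in enumerate(vectors)]
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    # Skip the queried movie, then keep the best `top_n` positions.
    best = [i for i, _ in scores if i != index][:top_n]
    return [titles[i] for i in best]


# Toy data (assumed): four movies described by three vocabulary words.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
vectors = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [1, 0, 0]]

print(recommend("Movie A", titles, vectors, top_n=2))  # prints ['Movie B', 'Movie D']
```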
We pass the vectors (movies) through the function ‘recommend’, which calculates the similarity of one vector with every other
vector. We sort the resulting array in descending order of calculated similarity. The first five movies are our output, as shown
below.
Each function returns a different result depending on the name selected from the respective entity.
VI. RESULTS
VII. CONCLUSION
Recommendation systems can be enhanced to meet present and future requirements, improving the quality of recommendation
results.
A recommendation system powered by AI can become your virtual guide on e-commerce platforms.
It would be a great loss for companies like Amazon and Netflix if people did not buy or watch their products.
With the ever-increasing demand for automated solutions, ML has become one of the most rapidly evolving technologies, along
with AI and data science.
Recommender systems will be used in the future to predict demand for products, connect buyers and sellers, and eventually become
the backbone of the supply chain.
Large companies like Amazon, Netflix and Facebook need recommendation systems now more than ever, given their growing numbers of
products and users.