Project- Chapter Two
Literature Review
2.1 Introduction
Among the many applications of these technologies, recommender systems have
emerged as powerful tools that help users navigate the overwhelming amount of
information available and discover personalized content tailored to their interests.
This paper delves into the realm of recommender systems, focusing specifically on
movie recommendations, and explores the latest survey and review articles that
highlight the importance of these systems in enhancing user experiences. A key
focus of this review is user satisfaction, which plays a crucial role in determining the
effectiveness of these systems. We also explore how movie recommendation
systems, designed to personalize movie suggestions for users, aim to enhance user
experience. Additionally, we discuss the importance of evaluating movie content, as
factors like genre, ratings, and user preferences significantly influence the quality
and relevance of recommendations. The review further examines various
recommendation system techniques, such as content-based filtering, collaborative
filtering, and hybrid methods, highlighting their advantages and limitations in relation
to user satisfaction. Moreover, the review addresses key challenges in the field,
including the cold-start problem, scalability, and the balance between accuracy and
diversity in recommendations, along with other factors that influence the
performance of these systems. By reviewing these studies, this chapter identifies
gaps in existing research and lays the foundation for this study, which seeks to
explore ways to improve user satisfaction in movie recommendation systems and
assess their effectiveness.
With the rapid growth of modern technology, the amount of data generated daily has
increased significantly, bringing us into the era of big data. While this digital
transformation has improved many aspects of life, it has also created the problem of
information overload. This occurs when people are presented with more data than
they can process, making it difficult to make decisions or find relevant information. To
tackle this challenge, methods like data mining have become essential for filtering
and organizing data effectively. One of the most practical tools in this area is the
recommendation system [ ].
A dataset is a collection of structured data used to train and test models, such as
recommendation systems. In the context of recommendation systems, datasets
typically consist of user interactions with items, such as ratings, clicks, or views,
along with additional information like user demographics or item features. These
datasets provide the foundation for analyzing patterns and making predictions about
what users might like based on past behavior.
● MovieLens Dataset: A dataset of movie ratings that contains user ratings and
metadata about the movies. It is widely used for research in recommendation
systems, containing information such as movie titles, genres, and user
ratings.
● Amazon Product Dataset: Contains data about products purchased on
Amazon, including user reviews, ratings, and product information like
category, brand, and price. This dataset is often used to build product
recommendation systems.
● Netflix Dataset: A collection of user ratings for movies and TV shows on
Netflix, including metadata about the content like genres, directors, and
actors. It’s used to make personalized content recommendations for users.
● Goodreads Dataset: This dataset contains information about books, authors,
and user reviews. It is typically used in book recommendation systems to
suggest books based on user preferences and ratings.
These datasets help recommendation systems identify patterns in user behavior and
content preferences, enabling them to suggest relevant items.
3. Sparsity Problem
A sparsity issue arises when the user-content interaction data contains insufficient
information, often due to users failing to rate or interact with much content. Sparse
data affects the accuracy of recommendations and highlights the need for reliable
algorithms that can work well with limited data [ ].
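As a rough illustration of how sparsity can be quantified, the sketch below computes the fraction of user-movie pairs that have no rating. The function name and the toy ratings are assumptions made for this example and are not taken from any of the datasets above.

```python
def sparsity(ratings, n_users, n_items):
    """Return the fraction of user-item pairs that have no recorded rating."""
    # Count each (user, item) pair once, even if it appears several times.
    observed = len({(user, item) for user, item, _ in ratings})
    return 1.0 - observed / (n_users * n_items)


# Hypothetical toy data: (user_id, movie_id, rating) triples.
ratings = [(0, 0, 5), (0, 2, 3), (1, 1, 4), (2, 2, 2)]

# 4 of the 12 possible user-movie pairs are rated, so two thirds are empty.
print(f"sparsity = {sparsity(ratings, n_users=3, n_items=4):.2f}")
```

Real rating datasets are typically far sparser than this toy case, which is why the choice of algorithm matters.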
4. Lack of Data
Recommendation systems rely on rich datasets to perform well. Smaller or newer
platforms without access to extensive user or content data struggle to provide
relevant suggestions, making data collection and enhancement essential for better
performance [ ].
6. Unpredictable Content
When new or unusual content is introduced without enough past data, the
recommendation system has a hard time suggesting it. This happens because the
system relies on patterns from past interactions to make suggestions. For instance, if
a new product or movie has no ratings or user engagement yet, the system won't
know how to connect it to what a user might like. As a result, these new products or
movies might not get recommended as much because there isn’t enough information
to make an accurate prediction about them [ ].
7. Scalability Issues
Handling a growing number of users and items while maintaining system
performance is a critical challenge. Algorithms that work well on smaller datasets
may struggle with larger ones. Optimizing the structure of both hardware and
software is essential to ensure efficiency as the system expands [ ].
2.3 Movie Recommendation Systems
When it comes to movies, recommendation systems work by suggesting films based
on a user’s preferences and past viewing history. User profiles in movie
recommendation systems are created by collecting details like age, gender, location,
and personal preferences. This information helps the system understand what kind
of movies someone might enjoy. Similarly, movie profiles are built using features like
genre, director, cast, release year, and language. For example, a movie might be
tagged as "Action," "Directed by Christopher Nolan," or "Starring Leonardo
DiCaprio." How these profiles are created is crucial because it directly impacts how
accurate and effective the recommendations are. By matching the traits in a user’s
profile with the features of different movies, the system can suggest films that better
fit the user’s tastes [ ]. For instance, a child would likely receive recommendations for
cartoons and animations, which are more popular with younger viewers. The system
can also adjust its suggestions by considering what other children of the same age
are watching, ensuring that the recommendations are relevant.
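The matching of user-profile traits to movie features described above can be sketched with a simple set-overlap measure. This is only a minimal illustration: the Jaccard measure, the placeholder movie titles, and the tag sets are assumptions for the example, and real systems may weight or encode features differently.

```python
def jaccard(a, b):
    """Share of tags common to both sets, relative to all tags in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_movies(profile, catalog):
    """Order movie titles from best to worst match with the user profile."""
    return sorted(catalog, key=lambda title: jaccard(profile, catalog[title]),
                  reverse=True)


# Hypothetical user profile and movie profiles, expressed as tag sets.
user_profile = {"Action", "Christopher Nolan", "Leonardo DiCaprio"}
movies = {
    "Movie A": {"Action", "Christopher Nolan", "Leonardo DiCaprio", "Sci-Fi"},
    "Movie B": {"Animation", "Family"},
    "Movie C": {"Action", "Thriller"},
}

print(rank_movies(user_profile, movies))
```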
These systems also help users discover movies that align with their unique preferences, saving
time and effort spent searching for something to watch. Offering personalized
suggestions that cater to individual tastes ensures users feel more connected to the
platform, creating a smoother and more enjoyable experience.
For streaming platforms or cinemas, these systems can drive key business
outcomes. By increasing movie viewership, subscriptions, or purchases through
targeted recommendations, they directly contribute to revenue growth. Moreover,
they help improve customer retention by ensuring users stay satisfied and engaged
with content that appeals to them.
They also introduce users to new genres, directors, or movies they may not have
discovered on their own. By balancing popular content with niche films, these
systems broaden the range of choices, allowing users to explore different types of
content they might not have considered before.
These systems help streaming platforms maximize the value of their entire movie
library, recommending less popular or older content that users might otherwise
overlook. By surfacing hidden gems, they help reduce the "long tail" problem,
ensuring that all content, both mainstream and obscure, gets the attention it
deserves.
Collaborative Filtering
Collaborative filtering uses historical user activity to predict movies a user might
enjoy. It analyzes user information, such as the movies they’ve watched, searched
for, or rated, and compares it with the activities of other users with similar
preferences. For example, in movie recommendation systems, demographic details
like age, gender, or ethnicity are combined with past viewing and search history to
suggest movies that align with the interests of similar users. By identifying patterns
among users with shared tastes, it predicts recommendations for the target user.
A system may contain a very large number of users. This technique identifies users with
similar preferences by analyzing the ratings they have given to specific items. By
comparing these ratings, the system finds patterns of similarity between users. The
strategy relies on the ratings provided by users across a broad catalog of items. These
ratings are usually organized into a user-item matrix, which serves as the foundation for
generating relevant recommendations. In the context of movie recommendation
systems, this structure can be referred to as the user-movie matrix.
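A minimal sketch of user-based collaborative filtering over a small user-movie matrix is given below. It assumes cosine similarity computed over co-rated movies and a similarity-weighted average for prediction; the user names, movie identifiers, and ratings are hypothetical, and production systems typically use more refined variants.

```python
import math

# Hypothetical user-movie matrix stored as nested dicts; a missing key
# means the user has not rated that movie.
R = {
    "alice": {"m1": 5, "m2": 4, "m3": 1},
    "bob": {"m1": 4, "m2": 5, "m4": 5},
    "carol": {"m1": 1, "m3": 5, "m4": 2},
}


def cosine(u, v):
    """Cosine similarity over the movies both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = math.sqrt(sum(u[m] ** 2 for m in common))
    norm_v = math.sqrt(sum(v[m] ** 2 for m in common))
    return dot / (norm_u * norm_v)


def predict(matrix, user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    num = den = 0.0
    for other, ratings in matrix.items():
        if other == user or movie not in ratings:
            continue
        sim = cosine(matrix[user], ratings)
        num += sim * ratings[movie]
        den += abs(sim)
    # None signals that no other user with a rating for this movie was found.
    return num / den if den else None


print(predict(R, "alice", "m4"))
```

In this toy matrix, alice's ratings resemble bob's more than carol's, so the prediction for "m4" is pulled towards bob's high rating.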
Collaborative filtering was first introduced in 1992 by Goldberg et al. with the
development of the Tapestry system. Tapestry was an early attempt at creating a
collaborative recommendation system, designed for smaller user groups. However, it
had its limitations, requiring a lot of user input and not being very scalable for larger
groups. While Tapestry helped to demonstrate the potential of recommendation
systems, it also highlighted the need for improvements.
In the context of movies, collaborative filtering has become one of the most popular methods for
recommending films on streaming services, offering personalized suggestions to
enhance the user experience. For example, platforms like Netflix and IMDb rely on
collaborative filtering techniques to suggest movies, helping users find films they
might enjoy based on the preferences of other users with similar tastes.
The power of collaborative filtering lies in its ability to sift through large amounts of
data and provide personalized recommendations, making it easier for users to
discover things they'll enjoy without needing to search through endless options. It's
become a standard technique for recommending everything from movies to products
across online platforms.
One drawback of collaborative filtering is that its accuracy can be limited, as people with
similar demographic profiles don’t always have similar tastes. Moreover, because it
relies on finding similarities between users or products, it can sometimes lead to
repetitive suggestions, reducing the variety of recommendations offered.
Miguel G. Silva et al. developed a collaborative filtering method that groups users by
analyzing patterns in how they rate items. For example, if two users consistently rate
similar items in the same way, they are grouped together. This grouping helps predict
preferences for users even when data is sparse or inconsistent, leading to more
accurate and personalized recommendations [ ].
Daniel A. Galron et al. introduced a deep learning method that works by improving
how the system identifies similarities between users or items. Instead of relying on
traditional metrics, their approach uses advanced neural networks to process the
available data more effectively. This helps overcome challenges in datasets where
user activity is limited or spread out, resulting in better recommendations [ ].
Ali Fallahi RahmatAbadi and Javad Mohammadzadeh explored how deep learning
can solve common collaborative filtering problems. For instance, they highlighted
how neural networks can predict preferences for new users or items (cold start) by
analyzing related data, such as item features or user profiles. Additionally, they
showed how these methods can handle and process large amounts of data more
efficiently, improving scalability for bigger systems [ ].
Content-Based Filtering
Unlike collaborative filtering, content-based filtering does not rely on the
interactions of other users; it considers only the interests of the individual user.
Collaborative filtering, on the other hand, assumes that grouping users with similar
demographic characteristics will result in effective recommendations. However,
recommendations derived from such groups may not match the preferences of
individual users [7]. For example, the tastes and preferences of people with similar
demographic characteristics can vary significantly: what person X likes may not align
with what person Y enjoys watching [ ]. Content-based filtering first examines the
user's preferences and then suggests movies or other products accordingly. It focuses
on a single user's interests and bases its suggestions on them. In the case of movies,
the technique examines the ratings given by the user and identifies which movies
received high ratings, using the genre categories in the user profile. After analysing
the user profile, the technique recommends movies that suit the user's taste [ ]. The
fundamental principles of content-based filtering can be broken down into two key
steps: (1) analyzing the movies a specific user enjoys, identifying common attributes
such as genre, director, cast, or themes, and storing these preferences in the user’s
profile; and (2) comparing the attributes of other movies with the user’s profile to
recommend films that closely match their preferences [ ].
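The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions: attributes are plain tags, a rating of 4 or more counts as "enjoyed" (the `min_rating` threshold is hypothetical), and a candidate's score is the sum of the profile counts of its attributes.

```python
from collections import Counter

# Hypothetical catalog: movie id -> set of attributes (genres here).
catalog = {
    "m1": {"Action", "Sci-Fi"},
    "m2": {"Action", "Thriller"},
    "m3": {"Romance", "Drama"},
    "m4": {"Sci-Fi", "Thriller"},
    "m5": {"Drama"},
}
user_ratings = {"m1": 5, "m2": 4, "m3": 2}


def build_profile(ratings, movies, min_rating=4):
    """Step 1: count the attributes of the movies the user rated highly."""
    profile = Counter()
    for movie, rating in ratings.items():
        if rating >= min_rating:
            profile.update(movies[movie])
    return profile


def recommend(profile, movies, seen, k=2):
    """Step 2: score unseen movies by how well their attributes fit the profile."""
    scores = {m: sum(profile[a] for a in attrs)
              for m, attrs in movies.items() if m not in seen}
    ranked = sorted(scores, key=lambda m: scores[m], reverse=True)
    # Movies sharing no attribute with the profile are not recommended.
    return [m for m in ranked if scores[m] > 0][:k]


profile = build_profile(user_ratings, catalog)
print(recommend(profile, catalog, seen=user_ratings))
```

Because only attributes the user already likes contribute to the score, a sketch like this also makes the overspecialization risk of content-based filtering easy to see.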
Jieun Son and Seoung Bum Kim suggested a way to improve content-based filtering
for movie recommendation systems by using multiattribute networks. These
networks include detailed information about the movies being recommended. By
analyzing all the attributes through network analysis, their method recommends a
wider variety of movies, effectively solving the overspecialization problem. Their
results also showed improvements in dealing with issues like sparsity and scalability
compared to traditional content-based filtering methods. By leveraging movie
attributes like cast, keywords, crew, and genres, they aim to enhance the
movie-watching experience for users, saving them time and effort in searching for
movies that align with their tastes [ ].
Hybrid Filtering
Hybrid filtering is an information filtering approach that takes users' movie ratings
as input, applies both collaborative and content-based filtering, and generates
recommendation lists [49]. It combines the two techniques and is often considered
superior because it can achieve higher recommendation performance and faster
computation time [ ]. When collaborative filtering or content-based filtering alone
cannot solve a problem, hybrid filtering comes into play, and many of the problems of
each individual method can be resolved. For example, the cold-start problem in
collaborative filtering and the lack of user preference information in content-based
filtering are significant challenges. Applying content-based filtering first and then
collaborative filtering is one way to address them, so a hybrid design can resolve
the problem.
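One simple way to combine the two methods is a weighted blend that falls back to the content-based score when collaborative filtering has no data for a movie. The sketch below illustrates this idea only; it is not the sequential scheme described above, and the weight `alpha`, the movie names, and the scores are hypothetical.

```python
def hybrid_score(cb_score, cf_score, alpha=0.5):
    """Blend content-based and collaborative scores for one movie.

    cf_score is None when collaborative filtering has no usable data
    (for example, a cold-start movie); the content-based score is used alone.
    alpha is a hypothetical weight placed on the collaborative score.
    """
    if cf_score is None:
        return cb_score
    return alpha * cf_score + (1 - alpha) * cb_score


# Hypothetical (content-based, collaborative) scores in [0, 1].
candidates = {
    "established_movie": (0.4, 0.9),
    "new_release": (0.8, None),
    "weak_match": (0.2, 0.3),
}

ranked = sorted(candidates, key=lambda m: hybrid_score(*candidates[m]),
                reverse=True)
print(ranked)
```

Here the new release can still be ranked even though no user has rated it yet, which is the cold-start benefit a hybrid design aims for.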
Yang et al. put forth a hybrid approach based on social similarity and item attributes.
The authors combined collaborative filtering with social similarities and movie
genres, and used BPR-MF (Bayesian Personalized Ranking Matrix Factorization) to
address the problem of sparse data. BPR-MF is a recommendation model that predicts
user preferences by ranking items rather than predicting ratings. It factorizes the
user-item interaction matrix into two smaller matrices, optimizing for item rankings
based on user interactions, and is particularly effective with sparse data and implicit
feedback. The proposed method works in two stages. In the first stage, the BPR-MF
model is used to obtain the candidate set, a group of potential items that could be
recommended to a user, based on the ratings in the training dataset. The unknown
ratings are then predicted from the existing ratings, the ratings are sorted, and the
final candidate set, containing several top items, is obtained for each user. In the
second stage, movies are recommended to users using TF-IDF feature selection.
TF-IDF is a method for evaluating how important a feature (such as a word or
attribute) is, based on how often it appears in one item and how rare it is across the
whole collection. It helps find similarities between items or users in recommendation
systems. The results show that BPR-MF produces more accurate recommendations
than collaborative filtering.
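To make the TF-IDF idea concrete, the sketch below computes basic TF-IDF weights for movie tags. It uses the textbook formulation (term frequency multiplied by the logarithm of the inverse document frequency), which is an assumption: the exact variant used by Yang et al. is not specified here, and the tags are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {movie: {tag: weight}} using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many movies each tag appears at least once.
    df = Counter(tag for tags in docs.values() for tag in set(tags))
    weights = {}
    for movie, tags in docs.items():
        counts = Counter(tags)
        weights[movie] = {
            tag: (count / len(tags)) * math.log(n_docs / df[tag])
            for tag, count in counts.items()
        }
    return weights


# Hypothetical tag lists for three movies.
movie_tags = {
    "m1": ["action", "space", "heist"],
    "m2": ["action", "romance"],
    "m3": ["action", "space"],
}

weights = tf_idf(movie_tags)
print(weights["m1"])
```

A tag that appears in every movie ("action") receives zero weight, while a tag unique to one movie ("heist") receives the highest weight, so rare attributes dominate when items are compared.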
Priscila Valdiviezo and J. Bobadilla proposed a method that combines user ratings
with demographic information such as age, gender, and occupation in a single matrix
model. Collaborative filtering is then used to predict the missing ratings, with the
main aim of improving overall rating prediction. MAE (Mean Absolute Error) is used
to measure the performance of the proposed approach. The data sparsity problem is
also addressed by using demographic features of the user and the item.
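For reference, MAE is the average absolute difference between predicted and actual ratings, so lower values indicate better predictions. A minimal sketch follows; the rating values are hypothetical and are not results from the study above.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out ratings and the ratings a model predicted for them.
actual = [4, 3, 5, 2]
predicted = [3.5, 3, 4, 3]

# Absolute errors are 0.5, 0, 1 and 1, so the mean is 2.5 / 4.
print(mean_absolute_error(actual, predicted))
```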
User satisfaction is achieved when the user’s goals are aligned with the system’s
suggestions. Understanding user perception is key to building a quality
recommender system. Providing the right recommendations at the right time can
boost user satisfaction and encourage engagement. From a reliability standpoint,
previous studies have shown that when novice users understand the reasoning
behind the system's suggestions and can anticipate outcomes, along with having
some control over the system, it helps build trust. This, in turn, enables users to
personalize the agent more effectively. This section will explore the factors
influencing user satisfaction in recommendation systems and the methods used to
measure it effectively.