Recommendation System

The document discusses recommendation systems. It defines recommendation systems as information filtering systems that predict a user's preferences for items. It describes two main types of recommendation systems: content-based systems that examine item properties, and collaborative filtering systems that recommend items liked by similar users. The document then provides examples of applications and methods for building recommendation systems, including memory-based and model-based approaches using matrix factorization.

Uploaded by

TARA TARANNUM
Copyright
© © All Rights Reserved

RECOMMENDATION SYSTEM

UNIT-5
Sowmya V
Assistant Professor
BMSCE, Bangalore
Recommender system or Recommendation system

• A recommender system, or recommendation system (sometimes replacing
'system' with a synonym such as platform or engine), is a subclass
of information filtering system that seeks to predict the "rating" or
"preference" a user would give to an item.

• Recommender systems are used in a variety of areas. Commonly
recognised examples include playlist generators for video and
music services, product recommenders for online stores, and content
recommenders for social media platforms and the open web.
Recommendation systems use a number of different technologies. We can
classify these systems into two broad groups.

• Content-based systems examine properties of the items recommended. For
instance, if a Netflix user has watched many cowboy movies, then the system
recommends a movie classified in the database as having the “cowboy” genre.

• Collaborative filtering systems recommend items based on similarity
measures between users and/or items. The items recommended to a user are
those preferred by similar users.
A Model for Recommendation Systems
The Utility Matrix
• Two classes of entities - users and items.
• Users have preferences for certain items, and these preferences must be
teased out of the data.
• The data itself is represented as a utility matrix, giving for each user-item pair
a value that represents the degree of preference of that user for that item.
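
As a minimal sketch, a utility matrix can be stored sparsely as a dictionary of dictionaries, where a missing entry means "preference unknown" rather than zero. The user names, item names, and ratings below are invented for illustration.

```python
# Hypothetical ratings on a 1-5 scale; users, items, and values are invented.
ratings = [
    ("alice", "movie_a", 5),
    ("alice", "movie_b", 3),
    ("bob", "movie_a", 4),
    ("carol", "movie_c", 2),
]

def build_utility_matrix(triples):
    """Return {user: {item: rating}}; unrated pairs are simply absent."""
    matrix = {}
    for user, item, rating in triples:
        matrix.setdefault(user, {})[item] = rating
    return matrix

utility = build_utility_matrix(ratings)
items = sorted({item for _, item, _ in ratings})

# Print a dense view, using "?" for unknown preferences.
for user in sorted(utility):
    row = [str(utility[user].get(item, "?")) for item in items]
    print(user, " ".join(row))

# Sparsity: fraction of user-item pairs with no known rating.
known = sum(len(row) for row in utility.values())
sparsity = 1 - known / (len(utility) * len(items))
```

Most entries of a real utility matrix are unknown, which is why a sparse representation like this is usually preferred over a full table.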
Applications of Recommendation Systems
1. Product Recommendations: Perhaps the most important use of
recommendation systems is at online retailers.
2. Movie Recommendations: Netflix offers its customers recommendations of
movies they might like.
3. News Articles: News services have attempted to identify articles of interest to
readers, based on the articles that they have read in the past.
Methods
There are two basic architectures for a recommendation system:
1. Content-Based systems focus on properties of items. Similarity of items is determined by measuring the
similarity in their properties.
2. Collaborative-Filtering systems focus on the relationship between users and items. Similarity of items is
determined by the similarity of the ratings of those items by the users who have rated both items.
Content-Based Recommendations / Filtering
• Content-based filtering methods are based on a description of the item and a profile of
the user's preferences.
• These methods are best suited to situations where there is known data on an item
(name, location, description, etc.), but not on the user.
• Content-based recommenders treat recommendation as a user-specific classification
problem and learn a classifier for the user's likes and dislikes based on an item's
features.

In this system, keywords are used to describe the items and a user profile is built to indicate
the type of item this user likes.
To create a user profile, the system mostly focuses on two types of information:
1. A model of the user's preference.
2. A history of the user's interaction with the recommender system.
Consider an example of recommending news articles to users. Let’s say we have 100 articles and a
vocabulary of size N. We first compute the tf-idf score for each of the words for every article. Then we
construct 2 vectors:

1. Item vector: This is a vector of length N. It contains 1 for words that have a high tf-idf score in that
article, otherwise 0.

2. User vector: Again a 1xN vector. For every word, we store the probability of the word occurring (i.e.
having a high tf-idf score) in articles that the user has consumed. Note here, that the user vector is
based on the attributes of the item (tf-idf score of words in this case).

Once we have these profiles, we compute similarities between the users and the items. The items that
are recommended are the ones that 1) have the highest similarity with the user or 2) have the highest
similarity with the other items the user has read.
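
The two vectors described above can be sketched as follows. This is an illustration under stated assumptions: the three toy articles, the threshold of 0.2 for a "high" tf-idf score, and the particular tf-idf variant (term frequency times log inverse document frequency) are all choices made here, not taken from the slides.

```python
import math
from collections import Counter

# Toy corpus; the articles and the threshold are invented for illustration.
articles = {
    "a1": "election vote parliament vote",
    "a2": "football match goal goal",
    "a3": "election football debate",
}
THRESHOLD = 0.2  # assumed cutoff for a "high" tf-idf score

docs = {name: text.split() for name, text in articles.items()}
vocab = sorted({w for words in docs.values() for w in words})
doc_freq = Counter(w for words in docs.values() for w in set(words))

def tfidf(word, words):
    """Term frequency in this article times log inverse document frequency."""
    tf = words.count(word) / len(words)
    idf = math.log(len(docs) / doc_freq[word])
    return tf * idf

# Item vector: 1 where the word has a high tf-idf score in the article, else 0.
item_vectors = {
    name: [1 if tfidf(w, words) > THRESHOLD else 0 for w in vocab]
    for name, words in docs.items()
}

def user_vector(read_articles):
    """Per-word fraction of the user's articles in which the word scores high."""
    n = len(read_articles)
    return [sum(item_vectors[a][k] for a in read_articles) / n
            for k in range(len(vocab))]

# Profile of a hypothetical user who has read articles a1 and a3.
profile = user_vector(["a1", "a3"])
```

Both vectors live in the same N-dimensional word space, which is what makes it possible to compare a user directly with an item.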
Two common methods:

1. Cosine Similarity:
To compute similarity between the user and item, we simply take the cosine similarity between the user
vector and the item vector. This gives us user-item similarity.

To recommend items that are most similar to the items the user has read, we compute cosine similarity
between the articles the user has read and other articles. The ones that are most similar are recommended.
This is item-item similarity.

Cosine similarity is best suited when you have high dimensional features, especially in information
retrieval and text mining.
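
A minimal sketch of user-item cosine similarity, i.e. the dot product of two vectors divided by the product of their lengths. The user profile and the two item vectors are hypothetical, and returning 0.0 for an all-zero vector is a convention assumed here.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical user profile and item vectors over a 4-word vocabulary.
user = [0.5, 0.0, 1.0, 0.5]
items = {"article_x": [1, 0, 1, 0], "article_y": [0, 1, 0, 0]}

# Rank items from most to least similar to the user.
ranked = sorted(items, key=lambda name: cosine_similarity(user, items[name]), reverse=True)
```

The same function gives item-item similarity when both arguments are item vectors.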
2. Jaccard similarity:
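Also known as intersection over union, the formula is as follows:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

where A and B are the sets of features present in the two items being compared.
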
This is used for item-item similarity. We compare item vectors with each other and return the items
that are most similar.

Jaccard similarity is useful only when the vectors contain binary values. If they have rankings or
ratings that can take on multiple values, Jaccard similarity is not applicable.
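
A minimal sketch of Jaccard similarity on binary item vectors: the number of positions set to 1 in both vectors, divided by the number set to 1 in at least one. The two item vectors are hypothetical, and returning 0.0 when both vectors are empty is a convention assumed here.

```python
def jaccard_similarity(u, v):
    """Intersection over union of the positions set to 1 in two binary vectors."""
    a = {k for k, x in enumerate(u) if x}
    b = {k for k, x in enumerate(v) if x}
    union = a | b
    if not union:
        return 0.0  # assumed convention for two all-zero vectors
    return len(a & b) / len(union)

# Two hypothetical binary item vectors: they share 2 features out of 4 in total.
item_1 = [1, 1, 0, 1, 0]
item_2 = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(item_1, item_2)
```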

In addition to the similarity methods, for content based recommendation, we can treat
recommendation as a simple machine learning problem. Here, regular machine learning algorithms
like random forest, XGBoost, etc., come in handy.
Collaborative Filtering
Collaborative filtering is based on the assumption that people who agreed in the past will agree
in the future, and that they will like similar kinds of items as they liked in the past.

The system generates recommendations using only information about rating profiles for
different users or items. By locating peer users/items with a rating history similar to the current
user or item, it generates recommendations using this neighborhood.
Examples of explicit data collection include the following:
∙ Asking a user to rate an item on a sliding scale.
∙ Asking a user to search.
∙ Asking a user to rank a collection of items from favorite to least favorite.
∙ Presenting two items to a user and asking him/her to choose the better one of them.
∙ Asking a user to create a list of items that he/she likes.
Examples of implicit data collection include the following:
∙ Observing the items that a user views in an online store.
∙ Analyzing item/user viewing times.
∙ Keeping a record of the items that a user purchases online.
∙ Obtaining a list of items that a user has listened to or watched on his/her computer.
∙ Analyzing the user's social network and discovering similar likes and dislikes.
Collaborative filtering approaches often suffer from three problems: cold start, scalability, and
sparsity.
∙ Cold start: For a new user or item, there isn't enough data to make accurate
recommendations.
∙ Scalability: In many of the environments in which these systems make recommendations,
there are millions of users and products. Thus, a large amount of computation power is
often necessary to calculate recommendations.
∙ Sparsity: The number of items sold on major e-commerce sites is extremely large. The most
active users will only have rated a small subset of the overall database. Thus, even the
most popular items have very few ratings.
One of the most famous examples of collaborative filtering is item-to-item collaborative filtering
(people who buy x also buy y), an algorithm popularized by Amazon.com's recommender
system.
Memory based approach
For the memory-based approach, the utility matrix is memorized and recommendations are made by comparing
the given user against the rest of the utility matrix. For example, if we have m movies
and u users, we want to find out how much user i likes movie k.

$$\bar{y}_i = \frac{1}{|I_i|} \sum_{j \in I_i} y_{ij}$$

This is the mean rating that user i has given to all the movies she/he has rated, where $I_i$ is the set of
movies rated by user i. Using this, we estimate the user's rating of movie k as follows:

$$\hat{y}_{ik} = \bar{y}_i + \frac{\sum_{a \in U_k} w_{ia}\,(y_{ak} - \bar{y}_a)}{\sum_{a \in U_k} |w_{ia}|}$$

Here $U_k$ is the set of users who have rated movie k. The similarity $w_{ia}$ between users a and i can be
computed using any method such as cosine similarity, Jaccard similarity, or Pearson’s correlation coefficient.
These results are very easy to create and interpret, but once the data becomes too sparse, performance
becomes poor.
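
A minimal sketch of a memory-based prediction, assuming the common mean-centred form: user i's mean rating plus a similarity-weighted average of how far other users' ratings of movie k deviate from their own means. The ratings are invented, and cosine similarity over co-rated movies is just one of the similarity choices mentioned above.

```python
import math

# Hypothetical utility matrix: {user: {movie: rating}}; missing = not rated.
ratings = {
    "u1": {"m1": 5, "m2": 3, "m3": 4},
    "u2": {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
    "u3": {"m1": 1, "m2": 5, "m4": 2},
}

def mean_rating(user):
    """Mean of all ratings the user has given."""
    values = ratings[user].values()
    return sum(values) / len(values)

def similarity(i, a):
    """Cosine similarity over the movies both users have rated."""
    common = set(ratings[i]) & set(ratings[a])
    if not common:
        return 0.0
    dot = sum(ratings[i][m] * ratings[a][m] for m in common)
    norm_i = math.sqrt(sum(ratings[i][m] ** 2 for m in common))
    norm_a = math.sqrt(sum(ratings[a][m] ** 2 for m in common))
    return dot / (norm_i * norm_a)

def predict(i, k):
    """Estimate user i's rating of movie k from users who have rated k."""
    neighbours = [a for a in ratings if a != i and k in ratings[a]]
    weights = {a: similarity(i, a) for a in neighbours}
    total = sum(abs(w) for w in weights.values())
    if total == 0:
        return mean_rating(i)  # no usable neighbours: fall back to the user's mean
    offset = sum(w * (ratings[a][k] - mean_rating(a)) for a, w in weights.items())
    return mean_rating(i) + offset / total

# How much would u1 like m4, which u1 has not rated?
estimate = predict("u1", "m4")
```

Every prediction scans the stored ratings directly, which keeps the method easy to interpret but ties its quality to how many co-rated movies the users share.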
Model based approach
One of the more prevalent implementations of the model-based approach is matrix factorization. In this, we
create representations of the users and items from the utility matrix Y:

$$Y \approx U V^{\top}$$

Thus, our utility matrix decomposes into U and V where U represents the users and V represents the
movies in a low dimensional space. This can be achieved by using matrix decomposition techniques like
SVD or PCA or by learning the 2 embedding matrices using neural networks with the help of some optimizer
like Adam, SGD etc.

For a user i and every movie j, we just need to compute the predicted rating $\hat{y}_{ij}$ (the dot product
of the user's vector and the movie's vector) and recommend the movies with the highest predicted rating.
This approach is most useful when we have a large amount of data that is highly sparse.
Matrix factorization helps by reducing the dimensionality, hence making computation faster. One
disadvantage of this method is that we tend to lose interpretability, as we do not know exactly what the
elements of the user/item vectors mean.
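
A minimal sketch of matrix factorization learned with plain stochastic gradient descent on the observed ratings only. The ratings, the number of factors, the learning rate, and the regularization strength are all assumptions chosen for illustration; real systems typically rely on a numerical library rather than Python lists.

```python
import random

random.seed(0)

# Hypothetical observed ratings as (user index, movie index, rating).
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 4), (2, 2, 5)]
n_users, n_movies, n_factors = 3, 3, 2  # sizes chosen only for illustration

# Small random initial embeddings for users (U) and movies (V).
U = [[random.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
V = [[random.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_movies)]

def predict(i, j):
    """Predicted rating: dot product of user i's and movie j's vectors."""
    return sum(U[i][f] * V[j][f] for f in range(n_factors))

def mse():
    """Mean squared error over the observed ratings."""
    return sum((r - predict(i, j)) ** 2 for i, j, r in observed) / len(observed)

initial_error = mse()
learning_rate, reg = 0.05, 0.01  # assumed hyperparameters
for _ in range(500):
    for i, j, r in observed:
        err = r - predict(i, j)
        for f in range(n_factors):
            u_if, v_jf = U[i][f], V[j][f]
            # Gradient step on squared error with L2 regularization.
            U[i][f] += learning_rate * (err * v_jf - reg * u_if)
            V[j][f] += learning_rate * (err * u_if - reg * v_jf)
final_error = mse()

# Recommend the unrated movie with the highest predicted rating for user 0.
seen = {j for i, j, _ in observed if i == 0}
best = max((j for j in range(n_movies) if j not in seen), key=lambda j: predict(0, j))
```

Only the observed entries contribute to the loss, so the unknown cells of the utility matrix are filled in by the learned vectors rather than being treated as zeros.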
