Project Report On Recommendation System
(Session: 2016-2020)
Submitted by:
Amal Yadav
UE163007
BE CSE SECTION-I
Table of Contents
1. Abstract
2. Introduction
   i. Need of a recommendation system
   ii. Types of filtering in a recommendation system:
      1. Content-based filtering
      2. Collaborative filtering
      3. Hybrid filtering
3. Basic terminologies
4. Project dependencies:
   1. Dataset
   2. Libraries used
   3. Loss function used
   4. UI
5. Methodology
6. Result
7. Application of recommendation systems
8. Advantages of using a recommendation system
9. Conclusion
10. Future scope
11. References
Acknowledgment
I would like to express a deep sense of gratitude and profuse thanks to Mr. Avinash; without his wise counsel and able guidance, it would have been impossible to complete the project in this manner.
I also express my gratitude to the other team members of the Information Technology department of Access Computer Institute for their intellectual support throughout the course of this work.
I perceive this opportunity as a big milestone in my career development. I will strive to use the gained skills and knowledge in the best possible way, and I will continue to work on their improvement in order to attain my desired career objectives. I hope to continue cooperation with all of you in the future.
Abstract
In this project report, I present a summary of my project: a recommendation system that recommends movies to a given user based on a hybrid approach, a combination of the content-based approach (using the user's past history or choices) and the collaborative approach (using the choices of other, similar users).
For this project, I have used the MovieLens 100K dataset to train and test the model so that it can recommend movies for any given user. The LightFM Python library is used to implement a popular recommendation algorithm, a model based on the WARP (Weighted Approximate-Rank Pairwise) loss. The given user's past viewing history and recommended movies are put on a webpage, which shows the name and poster of each movie; a user can even watch the trailer of any movie present there by clicking on its poster.
Introduction
A product recommendation system is a filtering system that seeks to predict and show the items that a user would like to purchase. It may not be entirely accurate, but if it shows what a user likes, then it is doing its job right.
Recommendation engines are basically data filtering tools that use algorithms and data to recommend the most relevant items to a particular user. In simple terms, they are nothing but an automated form of a "shop counter guy".
➢ In the immortal words of Steve Jobs: "A lot of times, people don't know what they want until you show it to them." Customers may love your movie, your product, or your job opening, but they may not know it exists. The job of the recommender system is to open the customer/user up to completely new products and possibilities which they would not think to search for directly themselves.
➢ With the growing amount of information on the internet and a significant rise in the number of users, it is becoming important for companies to search, map, and provide users with the relevant chunk of information according to their preferences and tastes.
Recommendation systems generally use one of the following three types of filtering:
1. Collaborative filtering
2. Content-Based Filtering
3. Hybrid Recommendation Systems
1. Collaborative filtering:
➢ This filtering method is usually based on collecting and analyzing information on users' behaviors, activities, or preferences, and predicting what they will like based on their similarity to other users.
➢ A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content; it is therefore capable of accurately recommending complex items such as movies without requiring an "understanding" of the item itself.
➢ Collaborative filtering is based on the assumption that people who agreed in the past will agree in the future, and that they will like the same kinds of items as they liked in the past.
➢ For example, if a person A likes items 1, 2 and 3 and B likes 2, 3 and 4, then they have similar interests, so A should like item 4 and B should like item 1.
• User-User Collaborative Filtering: Here, we search for lookalike customers and offer products based on what the lookalike has chosen. This algorithm is very effective but takes a lot of time and resources, since it requires computing the similarity information for every pair of customers. So, for platforms with a large user base, this algorithm is hard to put in place.
• Item-Item Collaborative Filtering: This is very similar to the previous algorithm, but instead of finding customer lookalikes, we try to find item lookalikes. Once we have an item-lookalike matrix, we can easily recommend similar items to a customer who has purchased an item from the store. This algorithm requires far fewer resources than user-user collaborative filtering; for a new customer it takes far less time, as we do not need all the similarity scores between customers. Amazon uses this approach in its recommendation engine to show related products, which boosts sales. (A minimal similarity sketch follows this list.)
• Other simpler algorithms: There are other approaches, like market basket analysis, which generally do not have as high predictive power as the algorithms described above.
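To make the item-item idea concrete, here is a minimal sketch that computes cosine similarity between the item columns of a tiny, made-up user-item rating matrix; the ratings and the matrix size are invented purely for illustration.

```python
import numpy as np

# Hypothetical user-item rating matrix: rows are users, columns are items.
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

# Cosine similarity between every pair of item columns.
norms = np.linalg.norm(ratings, axis=0)
norms[norms == 0] = 1.0  # guard against division by zero for unrated items
item_similarity = (ratings.T @ ratings) / np.outer(norms, norms)

# Items most similar to item 0, excluding item 0 itself.
print(np.argsort(-item_similarity[0])[1:])
```

A real system would use sparse matrices and keep only the top-k neighbours per item, but the core computation is the same.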
2. Content-based filtering:
➢ These filtering methods are based on the description of an item and a profile of the user's preferred choices.
➢ In a content-based recommendation system, keywords are used to describe the items, and a user profile is built to state the type of item this user likes. In other words, the algorithm tries to recommend products which are similar to the ones that the user has liked in the past.
➢ The idea of content-based filtering is that if a user likes an item, then he/she will also like a 'similar' item. (A small sketch of genre-based similarity follows this list.)
➢ It fits, for example, when we are recommending the same kind of item, such as a movie or a song. This approach has its roots in information retrieval and information filtering research.
➢ A major issue with content-based filtering is whether the system is able to learn user preferences from users' actions on one content source and replicate them across other content types.
➢ When the system is limited to recommending content of the same type the user is already consuming, the value of the recommendation system is significantly lower than when other content types from other services can be recommended. For example, recommending news articles based on news browsing is useful, but it would be much more useful if music and videos from different services could be recommended based on the news browsing.
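As a small illustration of content-based similarity, the sketch below compares made-up one-hot genre vectors of the kind found in the MovieLens u.item file; the movie names and genre assignments are hypothetical.

```python
import numpy as np

# Hypothetical one-hot genre flags: [Action, Comedy, Romance, Sci-Fi].
movies = {
    "Movie A": np.array([1.0, 0.0, 0.0, 1.0]),  # Action, Sci-Fi
    "Movie B": np.array([1.0, 0.0, 0.0, 0.0]),  # Action
    "Movie C": np.array([0.0, 1.0, 1.0, 0.0]),  # Comedy, Romance
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Recommend the item most similar to one the user liked.
liked = "Movie A"
scores = {m: cosine(movies[liked], v) for m, v in movies.items() if m != liked}
print(max(scores, key=scores.get))  # "Movie B" shares the Action genre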
Fig 1: Filtering method representation of collaborative and content-based filtering
Basic terminologies
1. Label: A label is the thing we're predicting. For example, the label could be the future price of wheat, the kind of animal shown in a picture, the meaning of an audio clip, or just about anything.
2. Feature: A feature is an input variable. For example, in a spam detector, the features could include the words in the email text, the sender's address, etc.
3. Model: A model defines the relationship between features and the label. For example, a spam detection model might associate certain features strongly with "spam".
4. Training means creating or learning the model. That is, the model is shown labeled examples, which enables it to gradually learn the relationships between the features and the label.
5. Inference means applying the trained model to unlabeled examples. That is, you use the trained model to make useful predictions (y').
6. Loss Function: It measures the difference between the model's predictions and the desired output. We want to minimize it during training so that our model becomes more accurate over time.
➢ Loss: Loss is the penalty for a bad prediction. That is, the loss is a number indicating how bad the model's prediction was on a single example. If the model's prediction is perfect, the loss is zero; otherwise, the loss is greater. The goal of training a model is to find a set of weights and biases that have low loss, on average, across all examples. For example, Fig 3 shows a high-loss model on the left and a low-loss model on the right. Note the following about the figure:
• The red arrows represent loss.
• The blue lines represent predictions.
Fig 3: High loss in the left model; low loss in the right model.
The red arrows in the left plot are much longer than their counterparts in the right plot. Clearly, the blue line in the right plot is a much better predictive model than the blue line in the left plot.
➢ Popular Loss Functions
1. Squared loss: The linear regression models we'll examine here use a loss function called squared loss (also known as L2 loss). The squared loss for a single example is the square of the difference between the label and the prediction:

$L_2 = (\text{observation} - \text{prediction}(x))^2 = (y - y')^2$
2. Mean squared error (MSE) is the average squared loss per example over the whole dataset. To calculate MSE, sum up all the squared losses for the individual examples and then divide by the number of examples:

$MSE = \frac{1}{N}\sum_{(x, y) \in D}(y - \text{prediction}(x))^2$

where:
• $(x, y)$ is an example in which
• $x$ is the set of features (for example, chirps/minute, age, gender) that the model uses to make predictions.
• $y$ is the example's label (for example, temperature).
• $\text{prediction}(x)$ is a function of the weights and bias in combination with the set of features $x$.
• $D$ is a data set containing many labeled examples, which are $(x, y)$ pairs.
• $N$ is the number of examples in $D$.
(A short numeric sketch follows this list.)
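As a quick numeric illustration of the formula above, with invented labels and predictions for N = 4 examples:

```python
import numpy as np

y = np.array([3.0, 5.0, 2.5, 4.0])       # labels (hypothetical)
y_pred = np.array([2.5, 5.0, 3.0, 3.0])  # model predictions (hypothetical)

squared_losses = (y - y_pred) ** 2  # per-example L2 loss
mse = squared_losses.mean()         # (1/N) * sum of squared losses
print(squared_losses)  # [0.25 0.   0.25 1.  ]
print(mse)             # 0.375
```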
7. Reducing Loss: Calculating the loss function for every conceivable value of the weight of a feature over the entire data set would be an inefficient way of finding the convergence point. So, we use the following methods to minimize the loss:
i. Gradient Descent: The gradient always points in the direction of the steepest increase in the loss function. The gradient descent algorithm takes a step in the direction of the negative gradient in order to reduce the loss as quickly as possible, then repeats this process, edging ever closer to the minimum.
Fig 4: A gradient step moves us to the next point on the loss curve
c) Learning Rate: Gradient descent algorithms multiply the gradient by a scalar known as the learning rate (also sometimes called the step size) to determine the next point.
d) Batch: The batch is the total number of examples used to calculate the gradient in a single iteration. A very large batch may cause even a single iteration to take a very long time to compute.
ii. SGD (stochastic gradient descent): It uses only a single example (a batch size of 1) per iteration. Given enough iterations, SGD works, but it is very noisy. The term "stochastic" indicates that the one example comprising each batch is chosen at random.
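The following is a minimal, illustrative SGD loop for a linear model y' = wx + b trained on squared loss; the data points and the learning rate are made up for the example.

```python
import random

# Hypothetical data roughly following y = 2x + 1.
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]

w, b = 0.0, 0.0
learning_rate = 0.01

for step in range(2000):
    x, y = random.choice(data)  # batch size of 1: one random example
    error = (w * x + b) - y
    # Step in the direction of the negative gradient of (y' - y)^2.
    w -= learning_rate * 2 * error * x
    b -= learning_rate * 2 * error

print(round(w, 2), round(b, 2))  # should approach w ≈ 2, b ≈ 1
```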
8. Epoch: One epoch is when the entire dataset is passed forward and backward through the neural network exactly once.
➢ Why do we use more than one epoch?
It may not make sense at first that passing the entire dataset through a neural network once is not enough, and that we need to pass the full dataset through the same network multiple times. But we are using a limited dataset, and to optimize the learning we are using gradient descent, which is an iterative process. So, updating the weights with a single pass, i.e. one epoch, is not enough.
One epoch leads to underfitting of the curve in the graph (below).
Fig 5: As the number of epochs increases, the weights of the neural network are updated more times, and the fitted curve goes from underfitting to optimal to overfitting.
Project Dependencies
1. Dataset:
For the project, the MovieLens dataset is used. MovieLens is run by GroupLens, a research lab at the University of Minnesota. The MovieLens 100K dataset is a collection of plain-text files containing 100,000 ratings from 943 users on 1,682 items. Each user has rated at least 20 movies.
DETAILED DESCRIPTIONS OF DATA FILES:
S.No  File            Description
1.    u.data          The full u data set: 100,000 ratings by 943 users on 1,682 items. Each user has rated at least 20 movies. Users and items are numbered consecutively from 1, and the data is randomly ordered. This is a tab-separated list of: user id | item id | rating | timestamp. The timestamps are Unix seconds since 1/1/1970 UTC.
2.    u.info          The number of users, items, and ratings in the u data set.
3.    u.item          Information about the items (movies); a tab-separated list of: movie id | movie title | release date | video release date | IMDb URL | unknown | Action | Adventure | Animation | Children's | Comedy | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western. The last 19 fields are the genres; a 1 indicates the movie is of that genre, a 0 indicates it is not. Movies can be in several genres at once. The movie ids are the ones used in the u.data data set.
4.    u.genre         A list of the genres.
5.    u.user          Demographic information about the users; a tab-separated list of: user id | age | gender | occupation | zip code. The user ids are the ones used in the u.data data set.
6.    u.occupation    A list of the occupations.
7.    u1.base-u5.base The data sets u1.base and u1.test through u5.base and u5.test are
      u1.test-u5.test 80%/20% splits of the u data into training and test data. Each of u1, ..., u5 has a disjoint test set; this is for 5-fold cross-validation (where you repeat your experiment with each training and test set and average the results). These data sets can be generated from u.data by mku.sh.
Table 1: Brief description of the MovieLens dataset
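For illustration, the two most important files can be read with pandas; this sketch assumes u.data and u.item have already been downloaded into the working directory.

```python
import pandas as pd

# u.data: tab-separated "user id | item id | rating | timestamp".
ratings = pd.read_csv("u.data", sep="\t",
                      names=["user_id", "item_id", "rating", "timestamp"])

# u.item: pipe-separated movie metadata; the first two fields are id and title.
movies = pd.read_csv("u.item", sep="|", header=None, encoding="latin-1",
                     usecols=[0, 1], names=["item_id", "title"])

print(len(ratings))                  # expect 100000
print(ratings["user_id"].nunique())  # expect 943
print(ratings["item_id"].nunique())  # expect 1682
```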
2. Libraries Used:
a) LightFM: LightFM is a Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback, including efficient implementations of the BPR and WARP ranking losses. It is easy to use, fast (via multithreaded model estimation), and produces high-quality results. In this project, this library is used to fetch the MovieLens dataset at runtime and to create and train the model using the WARP ranking loss. The implementation uses stochastic gradient descent for training.
b) A library for opening web pages: in this project, it is used to open the HTML page containing the watched and recommended movies for a user, and to play a movie's trailer.
c) A library for file-system paths: in this project, it is used to extract the path of the file name which is used to load the HTML page.
3. Loss Function Used:
When training recommenders, we often do not care about the absolute scores of the items being recommended as much as about their rank relative to one another. However, few loss functions actually optimize for this. LightFM provides the following losses:
1) Logistic: Useful when both positive (1) and negative (-1) interactions are present.
2) BPR: Bayesian Personalised Ranking pairwise loss. It maximizes the prediction difference between a positive example and a randomly chosen negative example. It is useful when only positive interactions are present and optimizing ROC AUC is desired.
3) WARP: Weighted Approximate-Rank Pairwise loss. It maximizes the rank of positive examples by repeatedly sampling negative examples until a rank-violating one is found. It is useful when only positive interactions are present and optimizing the top of the recommendation list (precision@k) is desired.
4) k-OS WARP: k-th order statistic loss. A modification of WARP that uses the k-th positive
example for any given user as a basis for pairwise updates.
For this project, the WARP loss function is used to train the model. WARP is an implicit feedback model: all interactions in the training matrix are treated as positive signals, and products that users did not interact with are treated as products they implicitly do not like. The goal of the model is to score these implicit positives highly while assigning low scores to implicit negatives.
WARP loss was first introduced in 2011, not for recommender systems but for image annotation: it was used to assign to an image the correct label from a very large sample of possible labels. Originally, the motivation for developing this loss, which in particular has a novel sampling technique, was memory efficiency. However, the sampling technique also has additional benefits which make it well suited to training a recommender system.
At a high level, WARP loss randomly samples output labels of a model until it finds a pair which it knows is wrongly ranked, and then applies an update only to these two incorrectly ranked examples.
Consider the following example: a recommender system that recommends one of five candy bars. Suppose a customer's journey is fed through the recommender, and it has generated an output vector which assigns to each candy bar a probability that this customer will purchase it. To train the recommender, there is a target vector, which describes the customer's actual behavior using a 1 if the customer purchased a specific candy bar, and a 0 if they did not:
Highlighted in red is the candy bar the customer actually bought (note that for simplicity, we are
only considering a single purchase, but this loss extends to the case where the customer has made
multiple purchases). This is known as the correct label; let’s label it x³+ for clarity (where the +
highlights that this was the purchased item, and the superscript indicates where the element is in
the vector).
Now we are going to randomly sample the other labels until we find one to which the model assigned a higher probability of purchase (or until we run out of labels to sample). Such a randomly sampled label is known to be wrongly ranked, because the Milky Way bar should have the highest probability, since it is the one the customer actually bought!
For instance, if the first random sample we look at is the Mars bar:
Now we have two variables: the correct label, x³+, and the sampled label, which we take as a sampled negative label, x⁵- (negative because the customer didn't buy it).
In this case, our model was correct: 0.59 > 0.17 (or x³+ > x⁵-), so it correctly ranked the Milky Way higher than the Mars bar. When this happens, we sample another label, and we keep doing this until we find a case where the model was wrong.
Say the second random sample we take is of the Kit Kat (which becomes the sampled negative
label, x²-):
In this case, 0.59 < 0.63 (or x³+ < x²-). Our model was wrong here since it thought the customer
would be more likely to buy the Kit Kat. To tell our model to correct this, x³+ and x²- are the two
examples we will use for the WARP loss, where the loss is the difference between the two values.
In addition to this pair, we would like an idea of how well the model did in general: was the Milky Way bar ranked near the top of all the candy bars, or did the model do poorly and stick it near the bottom?
To avoid having to look at all the examples (remember: efficiency!), we can keep track of this while we do the random sampling. If it takes lots of random samples to find an example where the model was wrong, then we can assume it did pretty well. On the other hand, if the first random sample we looked at had a higher score than the correct label, then we can assume it did pretty poorly. This intuition is captured by weighting the loss by approximately

$\ln\left(\frac{X - 1}{N}\right)$

where $X$ is the total number of labels (5, in this case) and $N$ is the number of samples needed to find an example where the model was wrong (2, in this case: the Mars bar and the Kit Kat). This makes sense: the more samples we have to take (the larger $N$ gets), the more correct our model is, so we want the loss to be small. We take the natural logarithm of this ratio to prevent the loss from exploding when $N$ gets small (since $X$ is generally large).
It's interesting to note that the loss only depends on the two examples which we have sampled (and so only the weights for those two examples will be updated). Nothing is done about the fact that the Twix bar was also ranked higher than the Milky Way, or the fact that Snickers got a 0.35 chance of being bought even though the customer didn't buy it (so in the best model, it should have a 0). The model will only learn that the Milky Way bar should be ranked above the Kit Kat.
For a recommender, this is much more desirable than a model which learns that it should output 1s for all positive examples and 0s for all negative examples, because for recommenders a 0 often does not mean a negative interaction. Just because the customer didn't buy a Twix, it doesn't mean they didn't want to buy it; many other factors could have contributed to their not purchasing it, most notably (considering the case where there are not 5 but 500 products to recommend) that they just didn't see it.
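The sampling procedure described above can be condensed into a short sketch. This is a simplified illustration over the candy-bar scores, not LightFM's actual internals, and the weighting follows the ln((X-1)/N) form discussed above.

```python
import math
import random

# Hypothetical model scores for five candy bars; index 2 is the purchased one.
scores = [0.25, 0.63, 0.59, 0.35, 0.17]
positive = 2
X = len(scores)

negatives = [i for i in range(X) if i != positive]
random.shuffle(negatives)

loss = 0.0
for n, neg in enumerate(negatives, start=1):
    if scores[neg] > scores[positive]:  # found a rank-violating sample
        # Fewer samples needed (small n) implies a badly ranked positive,
        # so the weight ln((X - 1) / n) is larger.
        loss = math.log((X - 1) / n) * (scores[neg] - scores[positive])
        break  # only this pair receives an update

print(loss)
```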
4. UI:
For a better user experience and understanding, the known movie choices for a particular user and the recommended movies are put on an HTML page showing the title and poster of each movie; if the user clicks on the poster of any movie, its trailer is played at the center of the screen.
Methodology
i. LightFM includes functions for fetching and processing the dataset. There is a function (fetch_movielens) which downloads the dataset and automatically pre-processes it into sparse matrices suitable for further calculation. In particular, it prepares the sparse user-item matrices, containing positive entries where a user interacted with a product, and zeros otherwise. (See the end-to-end sketch after this list.)
ii. We have two such matrices, a training and a testing set. Both have around 1,000 users and 1,700 items. We'll train the model on the training matrix but test it on the test matrix.
iii. To run this recommendation system, a user id is required first, just as when a particular user logs in to his/her account: only then is his/her past history known to the system, and based on that past history and the choices of other, similar users, movies are recommended. So, for now, the user id is given to the recommender system at runtime.
iv. Then the LightFM model is created. It is a hybrid latent representation recommender model. The user and item representations are expressed in terms of representations of their features: an embedding is estimated for every feature, and these feature embeddings are then summed together to arrive at representations for users and items. For example, if the movie 'Wizard of Oz' is described by the features 'musical fantasy', 'Judy Garland', and 'Wizard of Oz', then its embedding will be given by taking these features' embeddings and adding them together. The same applies to user features.
v. Then we use the WARP (Weighted Approximate-Rank Pairwise) loss function to train our model. WARP is an implicit feedback model: all interactions in the training matrix are treated as positive signals, and products that users did not interact with are treated as products they implicitly do not like. The goal of the model is to score these implicit positives highly while assigning low scores to implicit negatives.
Model training is accomplished via SGD (stochastic gradient descent). This means that with every pass through the data, an epoch, the model learns to fit the data more and more closely. We'll run it for 10 epochs in this example. We can also run it on multiple cores, so we'll set that to 2. (The dataset in this example is too small for that to make a difference, but it will matter on bigger datasets.)
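Putting steps i-v together, a minimal end-to-end sketch using the LightFM API described above:

```python
from lightfm import LightFM
from lightfm.datasets import fetch_movielens

# Step i: fetch MovieLens 100K and pre-process it into sparse matrices.
data = fetch_movielens()
train, test = data["train"], data["test"]  # step ii: train/test matrices

# Step iv: create the hybrid latent representation model.
model = LightFM(loss="warp")  # step v: WARP ranking loss

# SGD training: 10 epochs on 2 threads.
model.fit(train, epochs=10, num_threads=2)
```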
Result
After training, the model predicts the recommended movies for the user id given as input. The figure below shows the known choices and recommended movies for user id 5.
Fig 6: Showing known choices and recommended movies for a user with id 5
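A sketch of how such a listing can be produced for user id 5, modeled on the LightFM quickstart; the variable names are mine, and the model is trained as in the methodology sketch.

```python
import numpy as np
from lightfm import LightFM
from lightfm.datasets import fetch_movielens

data = fetch_movielens()
model = LightFM(loss="warp")
model.fit(data["train"], epochs=10, num_threads=2)

user_id = 5
n_items = data["train"].shape[1]

# Known positives: movies this user has already interacted with.
known = data["item_labels"][data["train"].tocsr()[user_id].indices]

# Score every item for this user and sort in descending order.
scores = model.predict(user_id, np.arange(n_items))
recommended = data["item_labels"][np.argsort(-scores)]

print("Known positives:", known[:3])
print("Recommended:", recommended[:3])
```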
For a better user interface, the above information is put on an HTML page which contains the title and poster of each movie, with the functionality that if the user clicks on the poster of a movie, its trailer runs at the center of the screen.
Fig 7: HTML page containing the titles and posters of the watched and recommended movies.
Fig 8: Selection of a movie when the mouse hovers over the poster of the movie 'Toy Story'.
Fig 9: The trailer of the movie 'Toy Story' playing on the screen after the user clicks on its poster.
Application of Recommendation System
The following are the applications of recommendation systems:
➢ Recommender systems have become increasingly popular in recent years, and are utilized in a variety of areas including movies, music, news, books, research articles, search queries, social tags, and products in general.
➢ Mostly used in the digital domain, the majority of today's e-commerce sites like eBay, Amazon, Alibaba, etc. make use of their proprietary recommendation algorithms in order to better serve customers with the products they are bound to like.
1. Amazon:
Fig 10: Amazon's recommendation system providing recommendations of products
2. YouTube:
3. Netflix:
Fig 12: Netflix's recommendation system giving movie recommendations for a user
4. Gaana Music App
Fig 13: Gaana music app’s recommendation system recommends songs (Made for you).
Advantages of using a recommendation system
Below are some of the potential benefits of recommendation systems in business, and the companies that use them:
1. "Improving with use" (retention): One of the core potential benefits of recommendation systems is their ability to continuously calibrate to the preferences of the user. This makes products more and more "sticky" in their customer retention as time goes on:
❖ You're much less likely to switch to a Netflix competitor when Netflix has such a wonderful sense of which movies and shows you might want to watch next (i.e. they "know you so well"). Because most of Netflix's revenues come from a fixed-rate recurring subscription billing model, the company's biggest ROI "win" with recommendation systems is retention.
2. Improving cart value: A company with an inventory of thousands and thousands of items would be hard pressed to hard-code product suggestions for all of its products, and it's obvious that such static suggestions would quickly become out-of-date or irrelevant for many customers. By using various means of "filtering", e-commerce giants can find opportune times to suggest (on their site, via email, or through other means) new products that you're likely to buy.
❖ Amazon’s quick delivery and emphasis on customer service have earned them millions of
customers. Recommendation engines play a role not only in helping customers find more
of what they need (and see Amazon as an authority), but these systems also improve cart
value. If Amazon doesn’t have to pay much more for shipping to send you two or three
times as many products, their profit margins improve.
3. Improved engagement and delight: Sometimes seeing an ROI doesn’t involve explicitly
asking for payment. Many companies use these systems to simply encourage engagement
and activity on their product or platform.
❖ YouTube has subscription options, but the majority of the firm’s revenues are driven
through advertisements placed across its wide array of video properties. The company
makes more money when users come back time and time again. YouTube doesn’t
optimize for short-term view length, as this might encourage pushy or flashy tactics that
wouldn’t genuinely delight users. Instead, the service aims to encourage long-term use,
because advertising views are the ROI that these systems serve at YouTube. Facebook is
another obvious example of a similar application of recommendation engines.
However, recommendation systems:
1. are likely only to be a fit for companies with enough data and in-house AI talent to use them well, and
2. are not guaranteed to be a higher-yield approach than the alternatives, so many businesses and business models may be better off not using them.
That being said, there are some sectors (most notably digital media and e-commerce) where such systems seem to be borderline inevitable.
1. According to a paper written by Netflix executives Carlos A. Gomez-Uribe and Neil Hunt, the video streaming service's AI recommendation system saves the company around $1 billion each year. This allows them to invest more money in new content which viewers will continue to view, giving them a good ROI. According to McKinsey, 75 percent of what users watch on Netflix comes from product recommendations.
2. According to YouTube, after the recommendation system had been running for more than a year, it was successful in terms of their stated goals, with recommendations accounting for around 60 percent of video clicks from the homepage.
Recommendation systems can significantly boost revenues, CTRs, conversions, and other
important metrics.
Moreover, they can have positive effects on the user experience as well, which translates into
metrics that are harder to measure but are nonetheless of much importance to online businesses,
such as customer satisfaction and retention.
Conclusion
1. Recommendation engines are basically data filtering tools that make use of algorithms and data to recommend the most relevant items to a particular user.
2. The recommendation system made in this project is able to recommend movies for a particular user, provided its user id is given. Our program fetches the MovieLens dataset, then creates and trains a model using the WARP loss function. It uses a hybrid approach, i.e. a combination of the content-based and collaborative approaches, in order to recommend movies for a user appropriately.
For the evaluation of our results, we can use two metrics of accuracy: precision@k and ROC AUC. Both are ranking metrics: to compute them, we construct recommendation lists for all of our users and check the ranking of known positive movies. For precision at k, we look at whether the known positives are within the first k results on the list; for AUC, we calculate the probability that any known positive is ranked higher on the list than a random negative example.
For instance, the metric values can be computed for the user with id 5, and we can compare the performance of the WARP model with other models using these metric values; a minimal evaluation sketch follows.
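LightFM provides helpers for both metrics; here is a minimal sketch, reusing the trained model from the methodology section:

```python
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k, auc_score

data = fetch_movielens()
model = LightFM(loss="warp")
model.fit(data["train"], epochs=10, num_threads=2)

# Both helpers return one score per user; average them across users.
print("precision@10: %.3f" % precision_at_k(model, data["test"], k=10).mean())
print("AUC: %.3f" % auc_score(model, data["test"]).mean())
```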
3. The need for a recommendation system: with the growing amount of information on the internet and a significant rise in the number of users, it is becoming important for companies to search, map, and provide users with the relevant chunk of information according to their preferences and tastes.
4. Applications of recommendation systems: nowadays, almost all web-service-based businesses use a recommendation system. Examples of popular recommendation systems are those of Netflix, Amazon, YouTube, the Gaana music app, Flipkart, eBay, etc.
Future Scope
The future scope of this project, the recommendation system, is very wide. There are many additional features which are planned to be incorporated during future enhancements of this project. Although all the main objectives have been achieved, there is still room for enhancement:
• This system can be easily upgraded in the future, and many more features can be added to the existing system.
• The recommendation system can be generalized or changed so that it can give recommendations for other things as well, such as music, books, or videos, provided an appropriate dataset is available to create and train the model.
• The Django framework can be used to provide a more realistic user experience, including logging in to the website; the user id derived from the login id would then be processed on the server, which would provide recommendations accordingly.
References
1. https://developers.google.com/machine-learning/crash-course
2. https://medium.com/@gabrieltseng/intro-to-warp-loss-automatic-differentiation-and-pytorch-b6aa5083187a
3. https://movielens.org/
4. https://lyst.github.io/lightfm/docs/quickstart.html
5. https://dataconomy.com/2015/03/an-introduction-to-recommendation-engines/
6. https://www.youtube.com/Siraj-raval
7. https://towardsdatascience.com/what-are-product-recommendation-engines-and-the-various-versions-of-them-9dcab4ee26d5
8. https://www.datasciencecentral.com/profiles/blogs/5-types-of-recommenders
9. https://en.wikipedia.org/wiki/Recommender_system