Paper 3. Recommendation Systems: Techniques, Challenges, Applications, and Evaluations
2 authors, including:
Sandeep Raghuwanshi
Samrat Ashok Technological Institute Vidisha M.P.
All content following this page was uploaded by Sandeep Raghuwanshi on 30 April 2019.
Abstract With the tremendous growth of the Internet, mobile devices, and e-business,
information load is increasing day by day. This has led to the development of systems
that can filter and prioritize relevant information for users. Recommendation systems
address this issue by enabling users to obtain knowledge, products, and services on a
personalized basis. Since the inception of recommender systems, researchers have paid
much attention to them and have developed various filtering techniques to make these
systems effective and efficient in terms of user and system experience. This paper
presents a preliminary survey of recommendation systems, covering filtering techniques,
challenges, applications, and evaluation metrics. The aim of the work is to introduce
researchers and practitioners to the different characteristics and possible filtering
techniques of recommendation systems.
1 Introduction
Having multiple choices confuses people about which item is right for them or fulfills
their requirements. This motivated the development of systems that help people with
selection and eliminate the dilemma. Now, as in the past, humans rely on outside
suggestions for one purpose or another. Based on this, recommender systems have become
tools that narrow our options and present the most suitable suggestions according to
our requirements and taste. The huge volume of information and user preferences
increases the demand for new and effective recommender systems in the current age.
According to Isinkaye et al. [1], a recommender system (RS) must act as an information
filtering system that seeks to predict the preference a user might have for one item
over another, and whether or not the user would prefer a particular item. A precise
definition of a recommender system is given as (Fig. 1): A recommender system or a
recommendation system (sometimes replacing the system with a synonym such as a platform
or an engine) is a subclass of information filtering system that seeks to predict the
rating or preference that a user would give to an item [2].
The existence of recommender systems was identified in the late 1970s, and since then
many researchers have proposed various approaches to developing efficient recommender
systems. The first computer-based recommendation system was developed in 1992 by
Goldberg et al. [3]. It was called Tapestry, a mail recommender system developed at
the Xerox Palo Alto Research Center [3]. Tapestry is an experimental information
filtering system that manages a huge incoming stream of documents such as e-mail, news
stories, and articles. It was predicated on the belief that information filtering can
be more effective when humans are involved in the filtering process. The primary
motivation behind the development of RS is to reduce information load and processing
cost by working on personalized information and data, analyzing the interest and
behavior of users to estimate their preferences over items. This is beneficial for both
users and service providers [4]. Presently, many organizations such as Google, Twitter,
LinkedIn, Netflix, and Amazon use recommendation systems as decision makers to maximize
their profits and minimize risk [5, 6]. Most popular recommender systems of today
Recommendation Systems: Techniques, Challenges … 153
2 Filtering Techniques
Content-based (CB) techniques use the feature list of an item and compare it with the
items previously preferred by a specific user. Items that match in similarity are
recommended to the user. Content-based filtering essentially works in two steps. First,
it stores a user profile based on the item features most commonly preferred by the
user; these features are used to measure the similarity of one item to another through
a similarity equation. Second, it compares each item's features with the user profile
and recommends the items that have a high degree of similarity [7]. A content-based
system requires an item profile to be constructed, which is a record of the essential
characteristics of that item. These characteristics are often easy to discover; for a
movie, for example, the record may contain the list of actors, the director, the year
of release, and the genre (Fig. 3).
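The two steps of content-based filtering can be illustrated with a minimal sketch. It assumes that item profiles are binary genre vectors, that the user profile is the plain average of the liked items' vectors, and that cosine similarity is the similarity equation; the item names and the helper functions `build_user_profile` and `recommend` are hypothetical and are not taken from the surveyed systems.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_user_profile(item_profiles, liked):
    """Step 1: average the feature vectors of the items the user liked."""
    vectors = [item_profiles[name] for name in liked]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def recommend(item_profiles, liked, top_n=2):
    """Step 2: rank unseen items by similarity to the user profile."""
    profile = build_user_profile(item_profiles, liked)
    scores = {
        name: cosine(profile, features)
        for name, features in item_profiles.items()
        if name not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical item profiles: [action, comedy, drama] genre flags.
ITEM_PROFILES = {
    "movie_a": [1, 0, 0],
    "movie_b": [1, 0, 1],
    "movie_c": [0, 1, 0],
    "movie_d": [1, 1, 0],
    "movie_e": [0, 0, 1],
}

print(recommend(ITEM_PROFILES, liked=["movie_a", "movie_b"]))
```

A real item profile would typically also encode actors, director, and year of release, often with weighted rather than binary features.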
Collaborative filtering (CF) is the most popular and most widely used recommendation
technique. The basis for collaborative filtering is that users with similar interests
are inclined to give the same preference to new and future items. This technique works
on two points. First, similarity of interest serves as a criterion to select a group of
similar people whose opinions will be accumulated as the basis for a recommendation
(nearest neighbors). Second, these opinions are also used to form a bigger group and
have a greater impact on the recommendation [8]. Collaborative filtering techniques
involve very large data sets and are applied in diverse application areas such as
finance, weather forecasting, environmental sensing, and e-commerce.
Collaborative filtering techniques make use of a data set of preferences/ratings
given by the users for items to predict additional items that an active user might
like. The model can be expressed as a preferences/rating matrix of order m × n,
$$P_{a,j} = \bar{r}_a + k \sum_{i=1}^{n} S(a,i) \times \left(r_{i,j} - \bar{r}_i\right) \tag{1}$$
where $\bar{r}_a$ is the mean rating of user $a$; $n$ is the number of users in the
database with nonzero $r_{i,j}$; $S(a,i)$ is the similarity between the active user $a$
and each user $i$; and $k$ is a normalizing factor such that the absolute values of the
weights sum to unity.
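The prediction rule in Eq. (1) can be sketched as follows. This is a minimal illustration under stated assumptions: ratings are stored as nested dictionaries, each user's mean is taken over all items that user has rated, and the similarity values are supplied by hand rather than computed. The function `predict_rating` and the sample data are hypothetical.

```python
def mean_rating(user_ratings):
    """Mean of the ratings a user has actually given."""
    return sum(user_ratings.values()) / len(user_ratings)

def predict_rating(active, item, ratings, similarity):
    """Predict the rating of `active` for `item` as in Eq. (1)."""
    # Only users who have rated the item contribute to the sum.
    neighbors = [u for u in ratings if u != active and item in ratings[u]]
    weight_total = sum(abs(similarity[(active, u)]) for u in neighbors)
    base = mean_rating(ratings[active])
    if weight_total == 0:
        return base  # no usable neighbors: fall back to the user's mean
    k = 1.0 / weight_total  # normalizing factor: absolute weights sum to 1
    deviation = sum(
        similarity[(active, u)] * (ratings[u][item] - mean_rating(ratings[u]))
        for u in neighbors
    )
    return base + k * deviation

# Hypothetical ratings and hand-picked similarity weights.
RATINGS = {
    "a": {"i1": 4, "i2": 2},
    "b": {"i1": 3, "i2": 1, "i3": 5},
    "c": {"i1": 2, "i2": 5, "i3": 2},
}
SIMILARITY = {("a", "b"): 0.8, ("a", "c"): -0.2}

print(predict_rating("a", "i3", RATINGS, SIMILARITY))
```

In this example user b rated item i3 above b's own mean and is positively correlated with a, while user c rated it below c's mean and is negatively correlated with a, so both push the prediction above a's mean rating of 3.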
There are many techniques for computing the similarity between users. Each has its
pros and cons in its own area; some of them are:
i. Pearson Correlation Similarity: Pearson correlation measures the linear correlation
between two vectors and takes a value between −1 and 1. The similarity between two
vectors u and v is defined as:
156 S. K. Raghuwanshi and R. K. Pateriya
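$$S_{Pearson}(u,v) = \frac{\sum_{i=1}^{n} (r_{u,i} - \bar{r}_u) \times (r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i=1}^{n} (r_{u,i} - \bar{r}_u)^2} \times \sqrt{\sum_{i=1}^{n} (r_{v,i} - \bar{r}_v)^2}} \tag{2}$$
ii. Cosine Similarity: Cosine similarity is one of the most popular methods for
measuring the similarity between two nonzero real-valued vectors. It is based on the
angle between the two vectors in n-dimensional space and is defined as:
$$S_{Cosine}(u,v) = \frac{\sum_{i=1}^{n} r_{u,i} \cdot r_{v,i}}{\sqrt{\sum_{i=1}^{n} r_{u,i}^2} \times \sqrt{\sum_{i=1}^{n} r_{v,i}^2}} \tag{3}$$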
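The Pearson correlation and cosine similarity measures described above can be sketched in a few lines. The sketch assumes that each user is a dictionary mapping item identifiers to ratings and that both the sums and the means are taken over co-rated items only, which is one common convention and not necessarily the one intended by the authors. The function names are illustrative.

```python
from math import sqrt

def pearson_similarity(u, v):
    """Pearson correlation over the items rated by both users."""
    common = [i for i in u if i in v]
    if len(common) < 2:
        return 0.0  # not enough overlap to measure correlation
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    numerator = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    denominator = (sqrt(sum((u[i] - mean_u) ** 2 for i in common))
                   * sqrt(sum((v[i] - mean_v) ** 2 for i in common)))
    return numerator / denominator if denominator else 0.0

def cosine_similarity(u, v):
    """Cosine of the angle between the co-rated parts of two rating vectors."""
    common = [i for i in u if i in v]
    numerator = sum(u[i] * v[i] for i in common)
    denominator = (sqrt(sum(u[i] ** 2 for i in common))
                   * sqrt(sum(v[i] ** 2 for i in common)))
    return numerator / denominator if denominator else 0.0

# Hypothetical users: v scales u's ratings, w reverses u's preferences.
u = {"i1": 1, "i2": 2, "i3": 3}
v = {"i1": 2, "i2": 4, "i3": 6}
w = {"i1": 3, "i2": 2, "i3": 1}

print(pearson_similarity(u, v), pearson_similarity(u, w))
print(cosine_similarity(u, v), cosine_similarity(u, w))
```

The example shows the practical difference between the two measures: for users with opposite preferences, Pearson correlation is negative because it centers ratings on each user's mean, whereas cosine similarity stays positive because raw ratings are never negative.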
Model-based techniques make use of data mining and machine learning approaches to
predict the preference of a user for an item. These techniques include association
rules [12], clustering [13], decision trees [14], artificial neural networks [15],
Bayesian classifiers [16], regression [17], link analysis [18], and latent factor
models. Among these, latent factor models are the most studied and most used
model-based techniques. They perform dimensionality reduction over the user–item
preference matrix and learn latent variables to predict the preference of a user for
an item in the recommendation process. These methods include matrix factorization [19],
singular value
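To make the idea of a latent factor model concrete, the following is a minimal sketch of matrix factorization trained by stochastic gradient descent on the observed ratings only. It is an illustration under stated assumptions, not the method of any cited work: the number of factors, learning rate, regularization weight, epoch count, and sample ratings are all hypothetical choices, and the helper names are invented for this example.

```python
import random
from math import sqrt

def factorize(ratings, n_users, n_items, n_factors=2,
              epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q so that P[u] . Q[i] ~ rating."""
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            err = r - sum(P[u][f] * Q[i][f] for f in range(n_factors))
            for f in range(n_factors):
                p_uf, q_if = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * q_if - reg * p_uf)
                Q[i][f] += lr * (err * p_uf - reg * q_if)
    return P, Q

def mf_predict(P, Q, u, i):
    """Predicted preference: dot product of the latent vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def training_rmse(ratings, P, Q):
    """Root mean square error over the observed ratings."""
    squared = sum((r - mf_predict(P, Q, u, i)) ** 2
                  for (u, i), r in ratings.items())
    return sqrt(squared / len(ratings))

# Hypothetical observed ratings: (user index, item index) -> rating.
RATINGS = {
    (0, 0): 5, (0, 1): 3, (0, 3): 1,
    (1, 0): 4, (1, 3): 1,
    (2, 0): 1, (2, 1): 1, (2, 3): 5,
    (3, 0): 1, (3, 3): 4,
    (4, 1): 1, (4, 2): 5, (4, 3): 4,
}

P, Q = factorize(RATINGS, n_users=5, n_items=4)
print(training_rmse(RATINGS, P, Q))
print(mf_predict(P, Q, 1, 1))  # estimate for a user-item pair with no rating
```

Missing entries of the user–item matrix are never used during training; once the factors are learned, a prediction for any unrated pair is simply the dot product of the corresponding latent vectors.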
A hybrid filtering technique combines the advantages of two or more filtering
techniques and overcomes their limitations. Such techniques provide more effective and
enhanced recommendation results [27]. Hybrid techniques can adopt one of the following
strategies to develop a hybrid filtering method:
Burke [33] surveyed hybrid recommender systems and grouped them into seven classes:
weighted hybridization, switching hybridization, mixed hybridization, feature-combining
hybridization, cascade hybridization, feature-augmenting hybridization, and meta-level
hybridization.
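Of the seven classes, weighted hybridization is the simplest to illustrate: the scores of several recommenders are combined numerically into a single score. The sketch below assumes two score dictionaries (one content-based, one collaborative) on a comparable scale and a single mixing weight `alpha`; the function names, scores, and weight are hypothetical.

```python
def weighted_hybrid(cb_scores, cf_scores, alpha=0.5):
    """Combine content-based and collaborative scores as a weighted sum.

    An item missing from one recommender is treated as scoring 0 there.
    """
    items = set(cb_scores) | set(cf_scores)
    return {
        item: alpha * cb_scores.get(item, 0.0)
        + (1.0 - alpha) * cf_scores.get(item, 0.0)
        for item in items
    }

def rank(scores):
    """Items ordered from highest to lowest combined score."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical scores from two separate recommenders.
cb = {"x": 0.8, "y": 0.4}
cf = {"x": 0.4, "y": 0.8, "z": 0.4}

combined = weighted_hybrid(cb, cf, alpha=0.25)
print(rank(combined))
```

In practice the weight would be tuned on held-out data, and scores from different recommenders usually need to be normalized before they are mixed.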
3 Challenges
3.1 Data Sparsity
Many e-commerce and shopping Web sites use recommender systems and evaluate very large
item sets. With large item sets, the user–item matrix becomes sparse, which is a
limitation for many recommender systems. Having few ratings/preferences in the
user–item matrix leads to poor predictions. New items cannot be recommended until some
users rate them, and similarly new users do not get good recommendations because they
lack a preference history. To deal with the data sparsity problem, many techniques have
been proposed; among them, dimensionality reduction techniques such as singular value
decomposition [20] and probabilistic matrix factorization [21, 22], and hybrid
techniques such as content-boosted approaches, are popular and widely used (Table 1).
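Sparsity is often quantified as the fraction of the user–item matrix that has no rating. The short sketch below computes that fraction under the assumption that ratings are stored as a dictionary of per-user dictionaries; the function name and the sample data are hypothetical.

```python
def sparsity(ratings, n_users, n_items):
    """Fraction of user-item cells that hold no rating."""
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1.0 - observed / (n_users * n_items)

# Hypothetical data: 3 users, 4 items, only 3 of the 12 cells are rated.
RATINGS = {
    "u1": {"i1": 5},
    "u2": {"i2": 3, "i4": 4},
    "u3": {},
}

print(sparsity(RATINGS, n_users=3, n_items=4))
```

Here u3 is a new user with no history and i3 is an item nobody has rated, which are exactly the cases in which neighborhood-based predictions break down.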
3.2 Scalability
Scalability has always been a challenge for recommendation systems. The performance of
most traditional CF algorithms suffers as the number of users and items increases. A
tremendous increase in the size of the database leads to poor performance of the
algorithm, as the computational requirements go beyond practical limits [34].
3.3 Cold Start
Businesses that adopt recommendation systems face a cold start. Initially, the system
does not have sufficient information about a new user. A considerable amount of time is
required to attract users and get to know them. However, many networks encourage users
to fill in information about themselves so that they can be offered more options. Items
also have a cold start when they have not yet been rated [35].
3.4 Gray Sheep
Users whose opinions are not consistent with any group of people are known as gray
sheep. These users hinder the smooth functioning of collaborative filtering [28]. On
the other hand, black sheep are a special class of users whose idiosyncratic behavior
makes recommendation nearly impossible. An optimal combination of content-based and
collaborative filtering (hybrid techniques) helps to solve the gray sheep problem [36].
Table 1 Comparison of different recommendation systems

| Filtering technique | Method | Advantages | Limitations |
|---|---|---|---|
| Content-based filtering | Use implicit and explicit feedback of users | User independence; transparent; easy to recommend new items | Hard to learn user preference; limited degree of novelty; static |
| Collaborative filtering: memory-based | Neighbor-based approach | Easy implementation; does not need user profile and item features; scalable with co-rated items; new data can be added easily | User preference is needed; performance decreases with sparsity; new user problem |
| Collaborative filtering: model-based | Data mining, machine learning, dimensionality reduction | Work well with sparse data; scalable; better prediction performance | Loss of information due to dimensionality reduction; trade-off between prediction performance and scalability |
| Collaborative filtering: hybrid | Combine memory and model-based | Improved prediction performance; overcome problems such as sparsity and gray sheep | Complex; expensive implementation; sometimes need explicit information |
| Knowledge-based filtering | Case-based, constraint-based, ontologies | Improved personalized prediction; handle new user and cold start problem well | Expensive and complex; need external domain-specific knowledge |
| Hybrid filtering | Combine two or more filtering techniques | More accurate and effective recommendation; suppress the limitation of individual techniques | Expensive and complex |
3.5 Synonymy
Synonymy refers to the tendency of a number of the same or very similar items to
have different names or entries. Most recommender systems are unable to discover
this latent association and thus treat these products differently [37, 38].
3.6 Privacy
A recommendation system inevitably leaks some information to users. The best example
of this is the "People You May Know" feature of Facebook. The issue of trust arises
when evaluating a customer [39].
3.7 Shilling Attacks
Recommendation is a public activity, so people may bias their feedback, giving large
numbers of positive reviews to their own products or items and sometimes negative
reviews to those of their competitors. It therefore becomes necessary for the system to
incorporate a mechanism that discourages this kind of behavior [40].
4 Applications
filtering has mostly been the technique used in these systems. For video content such
as TV (Netflix) and YouTube, social and context-aware techniques play an effective role
alongside traditional content-based and collaborative methods.
Contents: In recent years, recommender systems have become key to e-content systems
for locating information and knowledge in digital libraries. They cover personalized
Web pages, new articles, e-mail filtering, etc.
Service Oriented: The Internet and mobile devices open up great opportunities to access
various types of information. This has also given impetus to the development of many
service-based recommendation systems, such as tourist recommendation, travel services,
matchmaking services, and consultation services.
5 Evaluation
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} \left| r_{(i,j)} - \hat{r}_{i,j} \right| \tag{4}$$
RMSE: Root Mean Square Error is the square root of the average of the squared
differences between predicted and actual values. The lower the RMSE, the better the
recommendation.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( r_{(i,j)} - \hat{r}_{i,j} \right)^2} \tag{5}$$
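Both error metrics are straightforward to compute once actual and predicted ratings are paired up. The sketch below assumes two equal-length lists covering the N rated user–item pairs; the function names and sample values are illustrative.

```python
from math import sqrt

def mae(actual, predicted):
    """Mean absolute error over paired actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error over paired actual and predicted ratings."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return sqrt(squared / len(actual))

# Hypothetical ratings for four user-item pairs.
actual = [4, 3, 5, 2]
predicted = [3, 3, 4, 4]

print(mae(actual, predicted))   # absolute errors 1, 0, 1, 2
print(rmse(actual, predicted))  # squared errors 1, 0, 1, 4
```

Because the errors are squared before averaging, RMSE penalizes the single error of 2 more heavily than MAE does, which is why RMSE is never smaller than MAE on the same data.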
F-measure: Harmonic mean of precision and recall, used to obtain a single value for
comparison purposes.
$$F\text{-measure} = \frac{2\,(\mathrm{Precision} \times \mathrm{Recall})}{\mathrm{Precision} + \mathrm{Recall}} \tag{8}$$
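The F-measure can be computed from a list of recommended items and a set of relevant items. The sketch below assumes the standard definitions of precision (the fraction of recommended items that are relevant) and recall (the fraction of relevant items that are recommended), which are not reproduced in the text above; the function names and item identifiers are hypothetical.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)

# Hypothetical example: 4 recommended items, 3 relevant items, 2 in common.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}

p, r = precision_recall(recommended, relevant)
print(p, r, f_measure(p, r))
```

With 2 hits out of 4 recommendations and 3 relevant items, precision is 1/2 and recall is 2/3, so the harmonic mean lies between the two and closer to the smaller value.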
6 Conclusion
Recommender systems are part of everyone's daily life. With the tremendous growth of
information and knowledge over the Internet, it has become necessary to have more
effective and efficient recommendation systems. These systems enable their users to
access services and products matching their taste that are not readily available. This
paper discusses and highlights various recommendation systems along with their
techniques, challenges, applications, and evaluation metrics. Presently, different
hybridization techniques are used to develop recommendation systems tailored to the
task and to user personalization. The paper helps researchers to understand and improve
the state of current recommendation systems.
References
1. Isinkaye, F.O., Folajimi, Y.O., Ojokoh, B.A.: Recommendation systems: principles, methods,
and evaluation. Egypt. Inf. J. 261–273 (2015)
2. Recommender System Definition: available at https://en.wikipedia.org/wiki/Recommender_system
3. Goldberg, D., Nichols, D., Oki, B.M., Terry, D.: Using collaborative filtering to weave an
information tapestry. Commun. ACM 35(12), 61–70 (1992)
4. Pu, P., Chen, L., Hu, R.: A user-centric evaluation framework for recommender systems. In:
Proceedings of the fifth ACM conference on Recommender Systems (RecSys’11), pp. 57–164.
ACM, New York, NY, USA (2011)
5. Bouneffouf, D., Bouzeghoub, A., Ganarski, A.L.: Risk-aware recommender systems. In: Neural
Information Processing, pp. 57–65. Springer, Berlin, Heidelberg (2013)
6. Chen, L.S., Hsu, F.H., Chen, M.C., Hsu, Y.C.: Developing recommender systems with the
consideration of product profitability for sellers. Inf. Sci. 178(4), 1032–1048 (2008)
7. Pazzani, M., Billsus, D.: Content-based recommendation systems. In: The Adaptive Web, pp.
325–341. Springer, Berlin, Heidelberg (2007)
8. Guo, G., Zhang, J., Yorke-Smith, N.: A novel evidence based bayesian similarity measure for
recommendation systems. J. ACM Trans Web 10(2), 8.1–8.30 (2016)
9. Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell.
1–20 (2009)
10. Breese, J.S., Heckerman, D., Kadie C.: Empirical analysis of predictive algorithms for col-
laborative filtering. In: Proceedings of the Fourteenth Annual Conference on Uncertainty in
Artificial Intelligence, pp. 43–52. July 1998
11. Joonseok, L., Sun, M., Lebanon, G.: A Comparative Study of Collaborative Filtering Algo-
rithms (2012)
12. Mobasher, B., Jin, X., Zhou, Y.: Semantically enhanced collaborative filtering on the web. In:
Web Mining: From Web to Semantic Web, pp. 57–76. Springer, Berlin, Heidelberg (2004)
13. Kużelewska, U.: Advantages of information granulation in clustering algorithms. In: Agents
and Artificial Intelligence, pp. 131–145. Springer, NY (2013)
14. Michael, J.A., Berry, A., Gordon, S., Linoff, L.: Data mining techniques, 2nd ed. Wiley Pub-
lishing Inc., (2004)
15. Larose, T.D.: Discovering knowledge in data. Wiley, Hoboken, (New Jersey) (2005)
16. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Mach. Learn. 29(2–3),
131–163 (1997)
17. Vucetic, S., Obradovic, Z.: Collaborative filtering using a regression based approach. Knowl.
Inf. Syst. 1–22 (2005)
18. Berry, M.J.A., Linoff, G.: Data mining techniques: for marketing, sales, and customer support.
Wiley Computer Publishing, New York (1997)
19. Bell, R., Koren, Y., Volinsky, C.: Matrix factorization techniques for recommender systems.
Computer 42(8), 30–37 (2009)
20. Sali, S.: Movie rating prediction using singular value decomposition. In: Machine Learning
Project Report by University of California, Santa Cruz (2008)
21. Hofmann, T.: Probabilistic latent semantic analysis. In: Proceedings of the 15th Conference on
Uncertainty in AI, pp. 289–296. San Francisco, California (1999)
22. Salakhutdinov, R., Mnih, A.: Probabilistic matrix factorization. In: Proceedings of the 20th
International Conference on Neural Information Processing Systems (NIPS’07) (2007)
23. Lu, Y., Yang, J.: Notes on "Low-Rank Matrix Factorization". e-print arXiv:1507.00333 (2015)
24. Patrik Hoyer, O.: Non-negative matrix factorization with sparseness constraints. J. Mach. Learn.
Res. 5, 1457–1469 (2004)
25. David Blei, M., Andrew Ng, Y., Jordan, M.I.: Latent Dirichlet Allocation. J. Mach. Learn. Res.
3, 993–1022 (2003)
26. Bridge, D., Göker, M.H., McGinty, L., Smyth, B.: Case-based recommender systems.
Knowl. Eng. Rev. 20(3), 315–320 (2005)
27. Adomavicius, G., Zhang, J.: Impact of data characteristics on recommender systems perfor-
mance. ACM Trans. Manage Inf. Syst. 3(1), 3.1–3.17 (2012)
28. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., Sartin, M.: Combining
content-based and collaborative filters in an online newspaper. In: Proceedings of ACM SIGIR
Workshop on Recommender Systems: algorithms and evaluation. Berkeley, California (1999)
29. Billsus, D., Pazzani, M.J.: A hybrid user model for news story classification. In: Kay, J. (ed.)
Proceedings of the Seventh International Conference on User Modelling, pp. 99–108. Banff,
Canada, Springer, New York (1999)
30. Soboroff, I., Nicholas, K.C., Pazzani, M.J.: Workshop on recommender systems: algorithms
and evaluation. In: Conference Proceedings SIGIR Forum, vol. 33, no. 1, pp. 36–43 (1999)
31. Shein, I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and metrics for cold-start rec-
ommendations. In: Proceedings of the 25th International ACM SIGIR Conference on Research
and Development in Information Retrieval SIGIR’02, pp. 253–260. ACM, New York, NY, USA
(2002)
32. Popescul, A., Ungar, L.H., Pennock, D.M., Lawrence, S.: Probabilistic models for unified
collaborative and content-based recommendation in sparse data environments. In: Proceedings
of the 17th Conference on Uncertainty in Artificial Intelligence, UAI’01, pp. 437–444 (2001)
33. Burke, R.: Hybrid recommender systems: survey and experiments. User Model. User-Adapt.
Interact. 12(4), 331–370 (2002)
34. Linden, G., Smith, B., York, J.: Recommendations: item-to-item collaborative filtering. IEEE
Internet Comput. 7(1), 76–80 (2003). www.Amazon.com
35. Rana, M.C.: Survey paper on recommendation system. Int. J. Comput. Sci. Inf. Technol. 3(2),
3460–3462 (2012)
36. Mahony, M.O., Hurley, N., Kushmerick, N., Silvestre, G.: Collaborative recommendation: a
robustness analysis. ACM Trans. Internet Technol. 4(4), 344–377 (2004)
37. Jones, S.K.: A statistical interpretation of term specificity and its applications in retrieval. J.
Doc. 28(1), 11–21 (1972)
38. Gong, M., Xu, Z., Xu, L., Li, Y., Chen, L.: Recommending web service based on user relation-
ships and preferences. In: 20th International conference on web services. IEEE (2013)
39. Canny, J.: Collaborative filtering with privacy via factor analysis. In: Proceedings of the 25th
Annual International ACM SIGIR Conference on Research and Development in Information
Retrieval, pp. 238–245 (2002)
40. Resnick, P., Varian, H.R.: Recommender systems. Commun. ACM 40(3), 56–58 (1997)
41. Herlocker, J.L., Konstan, J.A., Terveen, L.G., Riedl, J.T.: Evaluating recommendation systems.
ACM Trans. Inf. Syst. 22(1), 5–53 (2004)