M03 Item-Based CF-V2 (1)

The document discusses item-based collaborative filtering (CF) as a solution for large-scale e-commerce recommendation systems, highlighting its advantages over user-based CF, particularly in terms of scalability. It explains the use of cosine similarity to measure item similarity and provides examples of how to create similarity matrices and predict user ratings. Additionally, it addresses challenges such as data sparsity and cold start problems, along with various model-based approaches to improve recommendation accuracy.

Uploaded by

Fa Putra

Item-Based Collaborative Filtering
Dr ZK Abdurahman Baizal

Source: Dietmar Jannach et al., 2010, Recommender Systems: An Introduction

Introduction
• Although user-based CF approaches have been applied successfully in
different domains, some serious challenges remain when it comes to
large e-commerce sites.
• With millions of users and millions of catalog items, the need to scan a vast
number of potential neighbors makes it impossible to compute predictions in
real time.
• Large-scale e-commerce sites therefore often implement a different technique,
called item-based recommendation.
Introduction
• Item-to-item collaborative filtering is the technique used by
Amazon.com to recommend books or CDs to its customers.
• The main problem with traditional user-based CF is that the algorithm
does not scale well to such large numbers of users and catalog items.
• User-Based Nearest Neighbor Collaborative Filtering:
• Recommendations are based on the similarity between two users.
• Item-Based Nearest Neighbor Collaborative Filtering:
• Recommendations are based on the similarity between two items, computed
from the ratings people have given to both items.
Cosine Similarity
• Cosine similarity is a metric that measures how similar two
items or documents are, irrespective of their size.
• It is the cosine of the angle between two vectors in a
multi-dimensional space. Because it depends only on the angle and not on
the vector lengths, it can be applied to any kind of item or document that
can be represented as a vector.
Cosine Similarity
The cosine of 0° is 1, and it is less than 1 for any other angle.
Two vectors with the same orientation have a cosine similarity of 1, two
vectors at 90° have a similarity of 0, and two vectors diametrically
opposed have a similarity of -1, independent of their magnitude.
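As a minimal sketch of the definition above: the cosine of the angle between two vectors a and b is their dot product divided by the product of their lengths, cos(θ) = (a · b) / (‖a‖ ‖b‖). The helper name `cosine_similarity`, the use of NumPy, and the handling of zero vectors are assumptions made for illustration and do not come from the slides.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm_product = np.linalg.norm(a) * np.linalg.norm(b)
    if norm_product == 0.0:
        return 0.0  # assumption: a zero vector is treated as "no similarity"
    return float(np.dot(a, b) / norm_product)

same = cosine_similarity([1, 2], [2, 4])        # same orientation, different length
orthogonal = cosine_similarity([1, 0], [0, 3])  # 90 degrees apart
opposite = cosine_similarity([1, 2], [-2, -4])  # diametrically opposed
```

The three calls reproduce the three cases described above (1, 0 and -1), and the first one shows that scaling a vector does not change the result.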
Cosine Similarity
• Because we compute the cosine of the angle between two vectors, the output
always lies between -1 and 1, where -1 indicates that two items are completely
dissimilar and 1 indicates that they are completely similar.
• We will now see how the cosine similarity measure can be used to
determine how similar movies are to each other.
Example
• Suppose we have movie ratings given by different users
Example
• Step 1: We write the user-item ratings in matrix form.

• In this matrix, the user Amy has already watched and rated the movies Pulp Fiction and
The Godfather but has not watched the movie Forrest Gump.
• We will use the above matrix for our example and create an item-item
similarity matrix with the cosine similarity method to determine how similar the movies
are to each other.
Example
• Step 2: To calculate the similarity between the movies Pulp Fiction (P) and Forrest Gump
(F), we first find all the users who have rated both movies. In our case, Calvin (C),
Robert (R) and Bradley (B) have rated both. We now create two vectors:

Therefore, the cosine similarity between Pulp Fiction and Forrest Gump is:
Example
• Similarly, we can calculate the cosine similarity of all the movies and
our final similarity matrix will be:
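The rating table of this example is not reproduced in this text, so the following is only a sketch of the procedure in Steps 1 and 2. The user and movie names come from the example, but every rating value is a made-up placeholder, and the function name `item_similarity_matrix` is hypothetical. For each pair of movies, only the users who rated both are used.

```python
import numpy as np

# Hypothetical ratings (NOT the values from the slides); np.nan = not rated.
users = ["Amy", "Calvin", "Robert", "Bradley"]
movies = ["Pulp Fiction", "The Godfather", "Forrest Gump"]
ratings = np.array([
    [4.0, 5.0, np.nan],  # Amy has not rated Forrest Gump
    [3.0, np.nan, 4.0],  # Calvin
    [5.0, 4.0, 3.0],     # Robert
    [2.0, 3.0, 5.0],     # Bradley
])

def item_similarity_matrix(r):
    """Cosine similarity between item columns, using co-rated users only."""
    n_items = r.shape[1]
    sim = np.eye(n_items)  # every item is fully similar to itself
    for i in range(n_items):
        for j in range(i + 1, n_items):
            both = ~np.isnan(r[:, i]) & ~np.isnan(r[:, j])  # users who rated both
            if not both.any():
                continue  # no common raters: similarity stays 0
            a, b = r[both, i], r[both, j]
            # Assumes ratings are positive, so neither norm is zero.
            sim[i, j] = sim[j, i] = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return sim

sim = item_similarity_matrix(ratings)
```

With these placeholder values, the Pulp Fiction / Forrest Gump entry is computed from the vectors of Calvin, Robert and Bradley only, as described in Step 2.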
Example
• Step 3: Now we can predict and fill in a user's ratings for the items they
have not rated yet. To calculate Amy's rating for the movie
Forrest Gump, we use the calculated similarity matrix together with the
ratings Amy has already given.
$$pred(u, p) = \frac{\sum_{i \in I} r_{u,i} \cdot sim(i, p)}{\sum_{i \in I} sim(i, p)}$$

I = the set of items that the active user has already rated and that are similar to item p
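The prediction formula can be sketched as a similarity-weighted average of the ratings the active user has already given. The function name `predict_rating` and all numbers below are hypothetical placeholders, since the similarity values of the example are not reproduced in this text.

```python
def predict_rating(user_ratings, similarities):
    """Weighted average of the user's ratings, weighted by sim(i, p).

    user_ratings: {item: rating given by the active user}
    similarities: {item: similarity of that item to the target item p}
    """
    items = [i for i in user_ratings if i in similarities]  # the set I
    numerator = sum(user_ratings[i] * similarities[i] for i in items)
    denominator = sum(similarities[i] for i in items)
    if denominator == 0:
        return None  # assumption: no prediction without similar rated items
    return numerator / denominator

# Hypothetical values for Amy and the target movie Forrest Gump.
amy_ratings = {"Pulp Fiction": 4.0, "The Godfather": 5.0}
sim_to_forrest_gump = {"Pulp Fiction": 0.8, "The Godfather": 0.6}
prediction = predict_rating(amy_ratings, sim_to_forrest_gump)
```

Here the prediction is (4.0 · 0.8 + 5.0 · 0.6) / (0.8 + 0.6), which lies between Amy's two known ratings and closer to the rating of the more similar movie.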
Example

Hence, our final matrix would be:

In an implementation, the predicted rating is computed from the items that have a
high similarity to the item whose rating is being predicted. In Amy's case,
we can set a similarity threshold that decides which items are included
in the prediction.
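A similarity threshold can be sketched as a filter applied before the weighted average is taken. The function name `predict_with_threshold`, the threshold values and the similarity values are all hypothetical and chosen only to show the effect.

```python
def predict_with_threshold(user_ratings, similarities, threshold):
    """Predict a rating from rated items whose similarity is >= threshold."""
    neighbours = {
        item: sim for item, sim in similarities.items()
        if item in user_ratings and sim >= threshold
    }
    total = sum(neighbours.values())
    if total == 0:
        return None  # assumption: no prediction when no item passes the threshold
    return sum(user_ratings[i] * s for i, s in neighbours.items()) / total

# Hypothetical values: one strongly and one weakly similar movie.
amy_ratings = {"Pulp Fiction": 4.0, "The Godfather": 5.0}
sims = {"Pulp Fiction": 0.85, "The Godfather": 0.30}

strict = predict_with_threshold(amy_ratings, sims, threshold=0.5)
loose = predict_with_threshold(amy_ratings, sims, threshold=0.0)
```

With the stricter threshold only Pulp Fiction contributes, so the prediction equals Amy's rating for that movie. With the looser threshold both movies contribute.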
Example
Using Adjusted Cosine Similarity
The basic cosine measure does not take differences in the users' average rating
behavior into account.

This problem is solved by the adjusted cosine measure, which subtracts each
user's average rating from that user's ratings. The values of the adjusted cosine
measure correspondingly range from -1 to +1, as in the Pearson measure.
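Written out, the adjusted cosine measure between two items a and b is commonly given as follows, where U is the set of users who rated both items and \bar{r}_u is the average rating of user u. The symbols U and \bar{r}_u are introduced here for illustration.

```latex
sim(a, b) = \frac{\sum_{u \in U} (r_{u,a} - \bar{r}_u)(r_{u,b} - \bar{r}_u)}
                 {\sqrt{\sum_{u \in U} (r_{u,a} - \bar{r}_u)^2} \,
                  \sqrt{\sum_{u \in U} (r_{u,b} - \bar{r}_u)^2}}
```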
Example
        Item1  Item2  Item3  Item4  Item5
Alice     5      3      4      4      ?
User1     3      1      2      3      3
User2     4      3      4      3      5
User3     3      3      1      5      4
User4     1      5      5      2      1

Mean-adjusted ratings matrix (figure not reproduced)

Basic Cosine Similarity

Adjusted Cosine Similarity

Implementation in Python
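The code from the original slides is not reproduced in this text. The following is a minimal sketch, not the author's implementation, that applies both measures to the Alice rating table above. It assumes that each user's mean is taken over that user's known ratings and that the similarity between two items uses only the users who rated both. The names `cosine_on_corated`, `basic` and `adj` are hypothetical.

```python
import numpy as np

# Rating table from the slides; np.nan marks Alice's unknown rating for Item5.
ratings = np.array([
    [5, 3, 4, 4, np.nan],  # Alice
    [3, 1, 2, 3, 3],       # User1
    [4, 3, 4, 3, 5],       # User2
    [3, 3, 1, 5, 4],       # User3
    [1, 5, 5, 2, 1],       # User4
], dtype=float)

# Mean-adjusted ratings: subtract each user's average from that user's ratings.
user_means = np.nanmean(ratings, axis=1, keepdims=True)
adjusted = ratings - user_means

def cosine_on_corated(matrix, i, j):
    """Cosine between item columns i and j over users who rated both."""
    both = ~np.isnan(matrix[:, i]) & ~np.isnan(matrix[:, j])
    a, b = matrix[both, i], matrix[both, j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

basic = cosine_on_corated(ratings, 4, 0)  # Item5 vs Item1, about 0.99
adj = cosine_on_corated(adjusted, 4, 0)   # Item5 vs Item1, about 0.80
```

For Item5 and Item1 the basic cosine is close to 1 because all raw ratings are positive, while the adjusted cosine is lower because it compares how far each rating lies above or below the user's own average.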
Preprocessing data for item-based filtering
• To make item-based recommendation algorithms applicable to
large-scale e-commerce sites without sacrificing recommendation
accuracy, an approach based on offline precomputation of the data is
typically chosen.
• The idea is to construct in advance the item similarity matrix that
describes the pairwise similarity of all catalog items.
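The offline step can be sketched as follows: compute all pairwise item similarities once and store, for each item, only its most similar items, so that prediction time needs only a lookup. The function name `build_item_model`, the parameter `k`, and the convention of storing missing ratings as 0 are assumptions for illustration. The data are the four complete rows (User1 to User4) of the rating table in these slides.

```python
import numpy as np

def build_item_model(ratings, k=2):
    """Offline step: for each item, keep its k most similar items."""
    # Assumption: missing ratings are stored as 0 and thus add nothing to dot products.
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0         # avoid division by zero for unrated items
    sim = (ratings.T @ ratings) / np.outer(norms, norms)
    np.fill_diagonal(sim, -np.inf)  # an item is not its own neighbour
    model = {}
    for item in range(sim.shape[0]):
        top = np.argsort(sim[item])[::-1][:k]  # indices of the k largest similarities
        model[item] = [(int(j), float(sim[item, j])) for j in top]
    return model

# Rows User1..User4, columns Item1..Item5.
ratings = np.array([
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 1, 5, 4],
    [1, 5, 5, 2, 1],
], dtype=float)

model = build_item_model(ratings, k=2)
neighbours_of_item5 = model[4]  # precomputed neighbours, looked up at prediction time
```

Keeping only k neighbours per item is one possible design choice. It trades a little accuracy for a model that is much smaller than the full similarity matrix.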
Model-Based Approach
• In addition to the various preprocessing techniques used in so-called
model-based approaches, another option is to exploit only a certain fraction of
the rating matrix to reduce the computational complexity.
• Basic techniques include subsampling, which can be accomplished by
randomly choosing a subset of the data or by ignoring customer
records that have only a very small set of ratings or that contain only
very popular items.
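Two of the subsampling ideas above can be sketched as follows: dropping users with very few ratings, then keeping a random subset of the remaining users. The function name `subsample_users`, its parameters and the small rating matrix are hypothetical. The filter on users who rated only very popular items is left out to keep the sketch short.

```python
import numpy as np

def subsample_users(ratings, min_ratings=2, fraction=0.5, seed=0):
    """Drop users with few ratings, then keep a random fraction of the rest."""
    counts = np.count_nonzero(~np.isnan(ratings), axis=1)  # ratings per user
    kept = np.flatnonzero(counts >= min_ratings)
    if len(kept) == 0:
        return ratings[kept]  # nothing left to sample from
    rng = np.random.default_rng(seed)
    n_sample = max(1, int(round(fraction * len(kept))))
    chosen = rng.choice(kept, size=n_sample, replace=False)
    return ratings[np.sort(chosen)]

# Hypothetical rating matrix; np.nan = not rated.
ratings = np.array([
    [5, 3, np.nan],
    [np.nan, 4, np.nan],  # only one rating: this user is dropped
    [2, np.nan, 4],
    [1, 5, 3],
    [np.nan, 2, 2],
])

sample = subsample_users(ratings, min_ratings=2, fraction=0.5, seed=0)
```

The fixed seed only makes the sketch reproducible. Which users end up in the sample depends on the random generator.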
Data Sparsity Problem and Cold Start Problem
• Cold start problem
• How to recommend new items? What to recommend to new users?
• Straightforward approaches
• Ask/force users to rate a set of items
• Use another method (e.g., content-based, demographic or simply non-
personalized) in the initial phase
• Alternatives
• Use better algorithms (beyond nearest-neighbor approaches)
• Example:
• In nearest-neighbor approaches, the set of sufficiently similar neighbors might be too
small to make good predictions
• Assume "transitivity" of neighborhoods
Data Sparsity Problem and Cold Start Problem

Ratings database for spreading activation approach. A 0 in this matrix should not be
interpreted as an explicit (poor) rating, but rather as a missing rating.

Graphical representation of user–item relationships
Data Sparsity Problem and Cold Start Problem
• In a standard user-based or item-based CF approach, paths of length 3 will
be considered – that is, Item3 is relevant for User1 because there exists a
three-step path (User1–Item2–User2–Item3) between them.
• Using path length 5, for instance, would allow for the recommendation also
of Item1, as two five-step paths exist that connect User1 and Item1.
• Because the computation of these distant relationships is computationally
expensive, Huang et al. (2004) propose transforming the rating matrix into
a bipartite graph of users and items.
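The path-length idea can be sketched with a small 0/1 matrix, where 1 means "rated" and 0 means "missing". The ratings database of the slide is not reproduced in this text, so the matrix below is an assumption chosen to be consistent with the paths described above. The function name `user_item_walks` is hypothetical. Matrix products count walks, which may revisit nodes, so they are an upper bound on the number of simple paths.

```python
import numpy as np

# Assumed example matrix (rows User1..User3, columns Item1..Item4).
A = np.array([
    [0, 1, 0, 1],  # User1
    [0, 1, 1, 1],  # User2
    [1, 0, 1, 0],  # User3
])

def user_item_walks(a, length):
    """Count walks of odd `length` from each user to each item in the bipartite graph."""
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd number")
    walks = a.copy()
    for _ in range((length - 1) // 2):
        walks = walks @ a.T @ a  # two more steps: item -> user -> item
    return walks

walks3 = user_item_walks(A, 3)
walks5 = user_item_walks(A, 5)

item3_reachable = walks3[0, 2] > 0   # User1 reaches Item3 in three steps
item1_in_three = int(walks3[0, 0])   # Item1 is not reachable in three steps
item1_in_five = int(walks5[0, 0])    # number of five-step walks to Item1
```

With this assumed matrix, Item3 is connected to User1 by three-step paths such as User1–Item2–User2–Item3, while Item1 only becomes reachable at length 5, through two walks.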
Data Sparsity Problem and Cold Start Problem
• Huang et al. (2004) report that the quality of the recommendations can be
significantly improved with the proposed technique based on indirect
relationships, in particular when the ratings matrix is sparse.
• For new users, the algorithm leads to measurable performance
increases when compared with standard collaborative filtering
techniques.
More model-based approaches
• A plethora of different techniques has been proposed in recent years, e.g.,
• Matrix factorization techniques, statistics
• singular value decomposition, principal component analysis
• Association rule mining
• compare: shopping basket analysis
• Probabilistic models
• clustering models, Bayesian networks, probabilistic Latent Semantic Analysis
• Various other machine learning approaches
• Costs of pre-processing
• Usually not discussed
• Incremental updates possible?
Exercises
1. Create an Excel file that computes the predicted rating for the Alice case
from the previous slides (the User-Based Collaborative Filtering
material), using item-based collaborative filtering.
Set the similarity threshold to be used (entered as an input
in the Excel file). Use both cosine similarity and adjusted cosine
similarity, and compare the predictions produced by the two similarity
formulas.
2. Write a Python program that solves the case in exercise 1.
The input may be a rating matrix of arbitrary dimensions.
Exercises
5. Explain the advantages of item-based collaborative filtering over
user-based collaborative filtering.
6. Explain the advantages of adjusted cosine similarity over basic cosine
similarity.
7. Explain the difference between implicit ratings and explicit ratings.
8. Explain what is meant by the sparsity problem. What are its
effects?
9. Explain what is meant by the cold start problem. What are its
effects?
