Intro To Data Sci Slide 21 - User Based Collaborative Filtering

This document describes the process of user-based collaborative filtering to create a recommendation system based on users' listening histories. It involves 4 main steps: 1) choosing an item and checking if a user consumed it, 2) getting similarity scores of the item based on its closest 10 neighbors, 3) getting the user's consumption record, and 4) calculating a score using a formula. These steps are performed using a for loop that iterates through each user and artist. The output is a data frame showing recommended artists for each user based on the highest scores. The results are then ranked to display the top 3 artist recommendations for each user.

Uploaded by

Nicole Fu
Copyright
© All Rights Reserved

INTRO TO DATA SCI

Slide 21 – User Based Collaborative Filtering


• Now, we will take it one step further by applying user-based collaborative filtering
• We’ll sift through users’ consumption histories to build a recommendation system based on what they currently listen to

Slide 22 – Steps 1 & 2


• To do this, we need to create a score matrix that determines our recommendations
• This process consists of four main steps:
  o First, we choose an item and check whether a user has consumed it
    ▪ In this case, we look at a particular artist and determine whether the user has already listened to them
  o Next, we get the similarities of that artist based on its 10 closest neighbours
  o Then we get the user’s consumption record
  o Finally, we calculate the score with the formula shown
• We do this entire process using a for loop

Slide 23 – Steps 3 & 4


• Before running the for loop, we need to create a helper function to calculate the scores and a holder matrix to hold our original dataset
• Once we have done this, we can move on to the for loop
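
The two pieces described above could look something like the following minimal sketch. It is an illustration under stated assumptions, not the presenters' code: the names `get_score`, `listens` and `holder`, the toy data, and the choice to return 0 when all similarities are 0 are all hypothetical. The holder is assumed to share the users and artists of the original dataset so that it can be filled with scores later.

```python
def get_score(history, similarities):
    """Weighted average of a user's history, weighted by item similarity.

    history: 0/1 flags for whether the user consumed each neighbouring item.
    similarities: similarity of each neighbouring item to the target item.
    """
    total = sum(similarities)
    if total == 0:
        # Assumption: with no similarity information, score the item as 0.
        return 0.0
    return sum(h * s for h, s in zip(history, similarities)) / total


# Made-up listening data: one row per user, one column per artist (1 = listened).
listens = {
    "user_a": {"artist_x": 1, "artist_y": 0, "artist_z": 1},
    "user_b": {"artist_x": 0, "artist_y": 1, "artist_z": 1},
}

# Holder with the same users and artists as the data, initialised to 0.
holder = {user: {artist: 0.0 for artist in row} for user, row in listens.items()}
```

For example, `get_score([1, 0, 1], [0.5, 0.3, 0.2])` weights the two consumed neighbours by 0.5 and 0.2 and divides by the total similarity of 1.0.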

Slide 24 – For Loop Step 1


• The loop starts by taking each user (the rows of the dataset) and then enters a second loop that takes each artist (the columns)
• Then, we store the user’s name and the artist’s name in variables so they are easy to use later
• Next, we use an if statement to filter out any artists that the user has already listened to
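
As a rough sketch of this first step, the nested loop and the if statement might be written as follows. The data and the names `listens` and `candidates` are hypothetical, and a dictionary of dictionaries stands in for the dataset purely for illustration.

```python
# Made-up listening data: one row per user, one column per artist (1 = listened).
listens = {
    "user_a": {"artist_x": 1, "artist_y": 0, "artist_z": 1},
    "user_b": {"artist_x": 0, "artist_y": 1, "artist_z": 1},
}

candidates = []  # (user, artist) pairs that still need a score

for user, row in listens.items():          # outer loop: users (rows)
    for artist, listened in row.items():   # inner loop: artists (columns)
        if listened:
            # The user already listened to this artist, so skip it.
            continue
        candidates.append((user, artist))
```

Only the pairs left in `candidates` go on to the scoring steps.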

Slide 25 – For Loop Step 2


• The next step gets the item-based similarity scores for a particular artist, sorted to give that item’s top 10 neighbours
• After doing this, we store the similarity scores and the neighbours’ names
• However, since the first entry always represents the item itself, we must drop it before continuing
• Now, we need to get the user’s purchase (here, listening) history for those top 10 neighbours from the original dataset
• So step 3 involves getting the consumption records and keeping only the ones that belong to that user
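
The neighbour lookup and the user's matching history could be sketched like this. Everything here is an assumption for illustration: the function names `top_neighbours` and `user_history`, the similarity values, and the listening data are made up. The sketch excludes the item by name, which has the same effect as dropping the first entry of the sorted list but does not depend on how ties are ordered.

```python
def top_neighbours(similarity, artist, k=10):
    """Return up to k (name, similarity) pairs most similar to `artist`."""
    ranked = sorted(similarity[artist].items(), key=lambda kv: kv[1], reverse=True)
    # Drop the artist itself, which is always its own closest neighbour.
    return [(name, sim) for name, sim in ranked if name != artist][:k]


def user_history(listens, user, neighbours):
    """Return the user's 0/1 listening flags for each neighbour, in order."""
    return [listens[user][name] for name, _ in neighbours]


# Made-up item-to-item similarities and listening data.
similarity = {
    "artist_x": {"artist_x": 1.0, "artist_y": 0.1, "artist_z": 0.4},
    "artist_y": {"artist_x": 0.1, "artist_y": 1.0, "artist_z": 0.3},
    "artist_z": {"artist_x": 0.4, "artist_y": 0.3, "artist_z": 1.0},
}
listens = {
    "user_a": {"artist_x": 1, "artist_y": 0, "artist_z": 1},
    "user_b": {"artist_x": 0, "artist_y": 1, "artist_z": 1},
}

neighbours = top_neighbours(similarity, "artist_y", k=10)
history = user_history(listens, "user_a", neighbours)
```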

Slide 26 – For Loop


• The last step involves calculating the score for the item and the user
• To do this, we use the formula shown on the screen, which takes the sum product of a user’s purchase history and the similarities, divided by the sum of the similarities
• Once we are done, we can store the results in a data frame
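
The slide's formula is not reproduced in this script, but from the verbal description it can be written as below. The symbols are assumptions introduced here: $h_{u,j}$ is 1 if user $u$ has consumed neighbour $j$ and 0 otherwise, $s_{i,j}$ is the similarity between item $i$ and neighbour $j$, and $N(i)$ is the set of item $i$'s top 10 neighbours.

```latex
\[
\mathrm{score}(u, i) =
  \frac{\sum_{j \in N(i)} h_{u,j}\, s_{i,j}}{\sum_{j \in N(i)} s_{i,j}}
\]
```

As a made-up example with three neighbours, a history of $(1, 0, 1)$ and similarities of $(0.5, 0.3, 0.2)$ give $(0.5 + 0 + 0.2) / (0.5 + 0.3 + 0.2) = 0.7$.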

Slide 27 – For Loop


• Reading this data frame for user 51, the highest values tell us to recommend abba first, then a perfect cycle
• However, we would not recommend ACDC, since its score is 0

Slide 28 – Ranking
• We took the extra step of ranking our values, since they were not organized in descending order, as we saw earlier
• To do this, we created another holder matrix
• For each user, we sorted the scores and stored the artist names in ranked order
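
A minimal sketch of this ranking step, assuming each user's scores are available as a mapping from artist name to score. The function name `rank_artists`, the `top_n` parameter, and the scores are hypothetical and are not the results shown on the slides.

```python
def rank_artists(scores, top_n=3):
    """Return the names of the top_n highest-scoring artists, best first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:top_n]]


# Made-up scores for one user.
user_scores = {"artist_a": 0.2, "artist_b": 0.0, "artist_c": 0.5, "artist_d": 0.4}

# Second holder: one ranked list of artist names per user.
ranked_holder = {"user_a": rank_artists(user_scores, top_n=3)}
```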

Slide 29 – Results & Recommendation


• The final output is displayed here, which gives us our final results and recommendations
• By sorting, we can see that the top 3 artists for user 51 are actually the subways, the kooks and the hives

Through item-based and user-based approaches, we were able to determine a user’s top 10 artists based on their listening history. Thank you for listening, and I hope we’ve taught you something new about collaborative filtering today!
COMPETING WITH ANALYTICS

Slide 6 – Datasets & Applications

Slide 7 – Time Series Dataset Description

Slide 8 – Language Processing Dataset Description

Slide 9 – Potential Applications and Impact
