38.10 - Matrix Factorization For Recommender Systems: Netflix Prize Solution

So the Netflix Prize is a very popular machine learning competition. It ended in 2009, that I remember very well, and I think it started a few years before that, around 2006. There is actually a Wikipedia page on the Netflix Prize where you can get all the details. As I told you, I remember that the competition ended in 2009 because I was in graduate school when it ended, and we ended up reading the research on how this problem was solved. So the problem is very interesting. Netflix is a very popular company that today streams video on demand: you can watch movies and things like that. Netflix is available in multiple geographies, including India; of course, it started in the US. I also have a Netflix account in India, so I know it exists in India for sure. So one of the interesting things is the Netflix Prize. Netflix said: we will give you a bunch of user ratings for a bunch of movies. The exact numbers are here: if you go down the page, it shows that they gave roughly 100 million ratings from about 480,000 users on 17,770 movies. This is the data set that was given. Each record contains the user, the movie, the date of rating, and the rating (or grade) itself. So this is the fun part. Given this data, they also gave a performance metric, to be precise, and said that Netflix itself, internally, has some value of this metric for its own algorithm. The metric they wanted to minimize is the root mean squared error. So let's understand what root mean squared error is.
Imagine that for user i and movie j, an algorithm predicts some rating; let me put a hat on it, r̂_ij, to represent that it is predicted. The actual rating is, let's say, r_ij. Root mean squared error works like this: you take r_ij minus r̂_ij, square it, sum over all (i, j) pairs, divide by n, where n is the number of ratings you are averaging over, and then take the square root:

RMSE = sqrt( (1/n) * sum over (i, j) of (r_ij - r̂_ij)^2 )

This is your typical RMSE. We have seen this in the past, when we learned regression. So you have a square root: that is your "root", the averaging is your "mean", this is your "squared", and the difference is the "error". That's how I remember root mean squared error.
So they said that Netflix itself has some root mean squared error; let's call it RMSE_Netflix. And they said: using this data, if any team or individual can build a better algorithm than what Netflix itself had, with an RMSE that is 10% lower than what Netflix has internally, Netflix promised a million dollars as prize money. For those of you who prefer the Indian system, that is ten lakh US dollars. It was a very hard competition; people took multiple years to crack it. This became a very, very active research area, and the winning team consisted of multiple researchers; in fact, several teams combined together at the end to breach the 10% mark. Some of the core team members were researchers from AT&T Research, et cetera. After the end of the whole competition, the winners wrote a very nice article.
One of the winners is Yehuda Koren, who happened to work at Yahoo Research while I was working there, and I happened to see the talk that he gave. This was around 2009, when I had just joined Yahoo Research, or Yahoo Labs, as it is called in India. So when I joined Yahoo Labs in India, Yehuda Koren, who is one of the team members (not the only member) that won this competition, gave a brilliant lecture on how they won. Thankfully, they also wrote a very nice research paper called "Matrix Factorization Techniques for Recommender Systems", explaining how they built these systems. Truth be told, matrix factorization became very popular for recommender systems only after the Netflix Prize. People had tried to use it before, but not very successfully; it is only after the Netflix Prize that people realized matrix factorization is a very, very powerful technique for recommender systems. Before that, people were mostly doing item-item similarity or user-user similarity type of approaches. Matrix factorization as a core idea became extremely popular after the Netflix Prize, and frankly speaking, it is only after the Netflix Prize that I personally learned about matrix factorization and started applying it to new problems. We have provided a reference link to this research paper in this video. It is a brilliant research paper and not very hard to read; I strongly recommend everyone to read the whole paper. We will cover part of it in this video, but it is very, very readable: the English and the terminology are simple, and it is not a dense research paper. It is beautifully written, and I strongly recommend everyone to read it. Okay, so let's go to the problem itself. I will use the notation that is in the research paper so that it is easy for you to follow later.
Let me introduce some terminology: r_ui is the rating given by a user u to an item i, q_i is an item vector, and p_u is a user vector. Here I am just using the notation used in the research paper so it will be easy for you to follow later. Now, if you think about it logically, r_ui is nothing but a_ij from our previous discussions. I am trying to connect the dots between what we already learned and this notation so that it is easy for you to follow.
So what is the problem we are solving? Let's take the first optimization problem. We are trying to find item vectors q and user vectors p such that, summed across all users and items (the actual notation used in the paper is over (u, i) pairs rather than (i, j)), we minimize (r_ui - q_i^T p_u)^2. This part is exactly the same as (a_ij - b_i^T c_j)^2 in our previous notation; it is your squared loss, I am just using the paper's notation. But the paper says that instead of solving just this problem, it is better to add regularization, so they add lambda times (||q_i||^2 + ||p_u||^2). If you look at this from an optimization standpoint, the first term is your squared loss and the second term is your L2 regularization. And why do you need L2 regularization? As usual, to avoid overfitting.
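Side by side, the squared-loss term in the paper's notation and in our earlier notation (a_ij, b_i, c_j are the symbols from the previous lectures, as referenced in this transcript) is the same quantity:

```latex
\left( r_{ui} - q_i^{\top} p_u \right)^2
\quad\equiv\quad
\left( a_{ij} - b_i^{\top} c_j \right)^2
```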
So even though I first wrote the problem without it, we should always add regularization to an optimization problem like this so that we avoid overfitting. The actual problem we are solving, written out clearly, is: find user vectors and item vectors that minimize, summed over all observed (user, item) pairs, the squared difference between the rating given by user u to item i and the dot product of the item vector and the user vector, plus lambda times the regularization term ||q_i||^2 + ||p_u||^2. And we know how to find lambda using cross-validation. Very simple.
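Written as a formula, this is the regularized objective just described (here kappa denotes the set of (u, i) pairs for which a rating is observed):

```latex
\min_{q,\, p} \;\; \sum_{(u,i) \in \kappa} \left( r_{ui} - q_i^{\top} p_u \right)^2
\;+\; \lambda \left( \lVert q_i \rVert^2 + \lVert p_u \rVert^2 \right)
```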
To put it simply, the first term is your loss and the second is your L2 regularizer. And how do you solve this problem? Of course, one solution is SGD: given this objective, you can compute the derivatives with respect to the things you have to find, namely the q_i and p_u vectors.
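Here is a minimal sketch of what SGD on this objective could look like in Python/NumPy. It is an illustration under my own naming choices (n_factors, lr, lam, n_epochs are assumed hyperparameters, not values from the lecture), using the per-rating update where e_ui = r_ui - q_i^T p_u and each vector is moved against its gradient of the regularized loss:

```python
import numpy as np

def matrix_factorization_sgd(ratings, n_users, n_items, n_factors=10,
                             lr=0.02, lam=0.1, n_epochs=100, seed=0):
    """SGD on the regularized squared loss above.
    ratings: list of (u, i, r_ui) triples for the observed ratings only."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, n_factors))   # user vectors p_u
    Q = 0.1 * rng.standard_normal((n_items, n_factors))   # item vectors q_i
    for _ in range(n_epochs):
        for u, i, r in ratings:
            e = r - Q[i] @ P[u]                   # error e_ui = r_ui - q_i^T p_u
            # gradient steps on the regularized squared loss
            P[u] += lr * (e * Q[i] - lam * P[u])
            Q[i] += lr * (e * P[u] - lam * Q[i])
    return P, Q

# Tiny usage example with made-up (user, item, rating) triples
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (2, 1, 1)]
P, Q = matrix_factorization_sgd(ratings, n_users=3, n_items=2)
print(Q[1] @ P[0])   # predicted rating of user 0 for item 1
```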
