Module-4_Notes_13-12-2024.docx
Syllabus
Recommender Systems:
Datasets, Association rules, collaborative filtering, user-based similarity, item-
based similarity using surprise library, matrix factorization
Text Analytics:
Overview, sentiment classification, Naïve Bayes, model for sentiment
classification, using TF-IDF vectorizer, challenges of text analytics
Textbook: Machine Learning using Python by Manaranjan Pradhan (Consultant, Indian Institute of Management Bangalore) and U Dinesh Kumar (Professor, Indian Institute of Management Bangalore), Wiley, 2019.
Chapters 9 and 10
First Edition: 2019
ISBN: 978-81-265-7990-7; ISBN (ebk): 978-81-265-8855-8
www.wileyindia.com
To illustrate the association rule mining concept, let us consider a set of baskets
and the items in those baskets purchased by customers as depicted in Figure 9.1.
Items purchased in different baskets are:
• Basket 1: egg, beer, sugar, bread, diaper
• Basket 2: egg, beer, cereal, bread, diaper
• Basket 3: milk, beer, bread
• Basket 4: cereal, diaper, bread
Apriori Algorithm
The Apriori algorithm is a widely used data mining technique for discovering
frequent itemsets and association rules from large datasets. It is particularly
popular in market basket analysis, where it helps identify items frequently
purchased together.
How the Apriori Algorithm Works
The Apriori algorithm relies on the Apriori property, which states that:
• "If an itemset is frequent, then all its subsets must also be frequent."
This property reduces the search space by eliminating candidate itemsets that
include infrequent subsets.
Steps of the Apriori Algorithm
1. Generate Candidate Itemsets:
o Start with single-item itemsets (e.g., {A}, {B}, {C}).
o Extend these into larger itemsets by combining frequent itemsets
from the previous iteration.
2. Prune Infrequent Itemsets:
o For a candidate itemset to be frequent, its support (occurrence in
transactions) must meet or exceed a minimum support threshold.
o Remove itemsets that do not meet this threshold.
3. Repeat:
o Increase the size of the itemsets (e.g., 2-itemsets, 3-itemsets, etc.)
until no further frequent itemsets can be generated.
4. Generate Association Rules:
o Once frequent itemsets are identified, association rules are
generated by calculating confidence and lift for potential rules.
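To make these steps concrete, here is a small sketch on the four baskets from Figure 9.1. It uses the mlxtend library, which is an assumed choice since the notes do not name a specific implementation; the thresholds (50% support, 60% confidence) are also illustrative.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

baskets = [
    ["egg", "beer", "sugar", "bread", "diaper"],
    ["egg", "beer", "cereal", "bread", "diaper"],
    ["milk", "beer", "bread"],
    ["cereal", "diaper", "bread"],
]

# One-hot encode the transactions into a boolean item matrix
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit_transform(baskets), columns=te.columns_)

# Frequent itemsets with support >= 50%, then rules with confidence >= 60%
frequent_itemsets = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])

With a minimum support of 50%, an itemset must occur in at least two of the four baskets to survive pruning; the rules table then reports support, confidence, and lift for each surviving rule.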
From the suggested learning resource (the Manaranjan Pradhan textbook):
9.3 | COLLABORATIVE FILTERING
Collaborative filtering is based on the notion of similarity (or distance). For
example, if two users A and B have purchased the same products and have rated
them similarly on a common rating scale, then A and B can be considered
similar in their buying and preference behavior. Hence, if A buys a new product and rates it highly, that product can be recommended to B. Alternatively, the products that A has already bought and rated highly can be recommended to B, if B has not already bought them.
Similarity or distance between users can be computed using the ratings the users have given to the common items they have purchased. If the users are similar, then similarity measures such as the Jaccard coefficient and cosine similarity will have values closer to 1, while distance measures such as Euclidean distance will have low values. Calculating similarity and distance has already been discussed in Chapter 7. The most widely used distances and similarities are Euclidean distance, the Jaccard coefficient, cosine similarity, and Pearson correlation. We will discuss the collaborative filtering technique using the example described below.
The picture in Figure 9.2 depicts three users Rahul, Purvi, and Gaurav and the
books they have bought and rated.
The users are represented by their ratings in the Euclidean space in Figure 9.3. Here the dimensions correspond to the two books Into Thin Air and Missoula, which are the two books commonly bought by Rahul, Purvi, and Gaurav.
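As a rough illustration (the exact rating values from Figure 9.3 are not reproduced in these notes, so the numbers below are assumed), the Euclidean distance between users can be computed directly from their rating vectors:

import numpy as np

# Ratings for [Into Thin Air, Missoula]; values are illustrative only
rahul = np.array([5.0, 4.0])
purvi = np.array([4.5, 5.0])
gaurav = np.array([2.0, 2.5])

print(np.linalg.norm(rahul - purvi))    # small distance => similar users
print(np.linalg.norm(rahul - gaurav))   # larger distance => less similar users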
Collaborative filtering comes in two variations:
1. User-Based Similarity: Finds K similar users based on common items
they have bought.
2. Item-Based Similarity: Finds K similar items based on common users who have bought those items.
Both algorithms are similar to K-Nearest Neighbors (KNN).
9.3.2 | User-Based Similarity
We will use the MovieLens dataset (see https://grouplens.org/datasets/movielens/) for finding similar users based on the common movies the users have watched and how they have rated those movies. The file ratings.csv in the dataset contains ratings given by users. Each line in this file represents a rating given by a user to a movie. The ratings are on a scale of 1 to 5. The dataset has the following features:
1. userId
2. movieId
3. rating
4. timestamp
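A minimal loading sketch is shown below; the file path is an assumption, and the pivot simply rearranges the ratings into the user × movie matrix on which user-based similarity operates.

import pandas as pd

# Columns in ratings.csv: userId, movieId, rating, timestamp
ratings_df = pd.read_csv("ratings.csv")

# Pivot into a user x movie rating matrix (NaN = movie not rated by that user)
user_item = ratings_df.pivot(index="userId", columns="movieId", values="rating")
print(user_item.head())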
1. Jaccard Coefficient
Definition:
The Jaccard coefficient measures the similarity between two sets by comparing
their intersection and union. It is particularly useful for binary data or datasets
with categorical features.
Formula:
J(A, B) = |A ∩ B| / |A ∪ B|
Range:
• Values range from 0 to 1:
o 0: No overlap.
o 1: Perfect overlap.
Use Cases:
• Text similarity (e.g., comparing documents based on word sets).
• Collaborative filtering with binary data (e.g., user preferences).
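A tiny worked example (the book titles echo Figure 9.2; the exact sets are assumed for illustration):

# Jaccard coefficient between the sets of books two users have bought
user_a = {"Into Thin Air", "Missoula", "Siddhartha"}
user_b = {"Into Thin Air", "Missoula", "Into the Wild"}

jaccard = len(user_a & user_b) / len(user_a | user_b)
print(jaccard)   # 2 common books / 4 distinct books = 0.5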
2. Cosine Similarity
Definition:
Cosine similarity measures the cosine of the angle between two non-zero
vectors in a multidimensional space. It is useful for comparing high-
dimensional data such as text embeddings or user-item interactions.
Range:
• Values range from −1 to 1:
o 1: Vectors point in the same direction (high similarity).
o 0: Vectors are orthogonal (no similarity).
o −1: Vectors point in opposite directions (high dissimilarity).
Use Cases:
• Document similarity in Natural Language Processing (e.g., TF-IDF
vectors).
• User-item interaction data in recommender systems.
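A minimal sketch, using the formula cos(θ) = (A · B) / (‖A‖ ‖B‖) on two illustrative rating vectors:

import numpy as np

# Ratings for [Into Thin Air, Missoula]; values are illustrative only
rahul = np.array([5.0, 4.0])
purvi = np.array([4.0, 5.0])

cos_sim = rahul @ purvi / (np.linalg.norm(rahul) * np.linalg.norm(purvi))
print(round(cos_sim, 2))   # close to 1 => very similar preferences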
3. Pearson Correlation
Definition:
Pearson correlation measures the linear relationship between two variables. It indicates whether an increase in one variable corresponds to an increase or decrease in the other.
Formula:
r = Σ(xᵢ − x̄)(yᵢ − ȳ) / √[ Σ(xᵢ − x̄)² · Σ(yᵢ − ȳ)² ]
Range:
• Values range from −1 to 1:
o 1: Perfect positive linear relationship.
o 0: No linear relationship.
o −1: Perfect negative linear relationship.
Use Cases:
• Collaborative filtering in recommender systems (e.g., user-user
similarity).
• Correlation analysis in statistical studies.
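A short sketch with made-up ratings of the same five movies by two users:

import numpy as np

user1 = np.array([5, 3, 4, 4, 2])
user2 = np.array([4, 2, 4, 5, 1])

# Pearson correlation coefficient (off-diagonal entry of the 2x2 matrix)
r = np.corrcoef(user1, user2)[0, 1]
print(round(r, 2))   # close to +1 => the two users' ratings move together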
Comparison
Metric | Use Case | Data Type | Key Characteristic
Jaccard Coefficient | Binary data (e.g., sets, categorical data) | Sets or binary vectors | Measures overlap between sets.
Cosine Similarity | High-dimensional data (e.g., text, vectors) | Numeric vectors | Measures the angle between two vectors.
Pearson Correlation | Ratings or continuous numerical data | Numeric data | Measures linear correlation between variables.
Key Points
• Jaccard Coefficient is best for sets or binary data.
• Cosine Similarity focuses on the direction of vectors, ignoring
magnitude.
• Pearson Correlation evaluates the strength and direction of a linear
relationship between variables.
These metrics serve different purposes, so the choice depends on the data type
and task at hand.
import pandas as pd

# Load the MovieLens movies file (path as used in Google Colab)
movies_df = pd.read_csv("/content/movies.csv")

print(movies_df.head(20))     # first 20 rows
print(movies_df.shape)        # (number of rows, number of columns)
print(movies_df.tail(10))     # last 10 rows
print(movies_df.describe())   # summary statistics of the numeric columns
print(movies_df.columns)      # column names
1. Similarity Matrix:
o A matrix where each entry (i, j) represents the similarity between item i and item j.
o Common similarity measures:
▪ Cosine Similarity: Measures the cosine of the angle
between item vectors.
▪ Pearson Correlation: Measures the linear relationship
between ratings for two items.
▪ Jaccard Similarity: Measures the overlap in users who
interacted with two items.
2. Recommendation Process:
o Identify items the user has interacted with.
o Compute the similarity of these items to all other items.
o Rank items based on their similarity scores and recommend the
top-ranked items.
3. Advantages of Item-Based Similarity:
o More stable compared to user-based similarity because item
relationships change less frequently.
o Scales well for scenarios with many users and fewer items (e.g.,
product recommendation).
4. Challenges:
o Cold-start problem for new items.
o Requires a sufficient number of user interactions to compute
meaningful similarities.
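A compact sketch of this process on a toy ratings matrix (all values assumed): build the item-item cosine similarity matrix, then rank other items by their similarity to the items a user has already rated.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy user x item rating matrix; 0 means "not rated"
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 1],
    [1, 1, 5, 4],
    [0, 1, 5, 4],
])

item_sim = cosine_similarity(ratings.T)   # compare item columns -> 4 x 4 matrix
print(np.round(item_sim, 2))

# Items most similar to item 0, in descending order (excluding item 0 itself)
print(np.argsort(item_sim[0])[::-1][1:])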
9.4 | USING SURPRISE LIBRARY
For real-world implementations, we need a more extensive library which hides all the implementation details and provides abstract Application Programming Interfaces (APIs) to build recommender systems. Surprise is a Python library for accomplishing this. It provides the following features:
1. Various ready-to-use prediction algorithms, such as neighborhood methods (user similarity and item similarity) and matrix factorization-based methods. It also has built-in similarity measures such as cosine, mean squared difference (MSD), and Pearson correlation coefficient.
2. Tools to evaluate, analyze, and compare the performance of the algorithms. It also provides methods to generate recommendations.
We import the required modules or classes from the surprise library.
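The exact import cell from the textbook is not reproduced in these notes; a possible sketch (the file path and rating scale are assumptions) of loading the MovieLens-style ratings into Surprise and running a user-based KNN model is:

import pandas as pd
from surprise import Dataset, Reader, KNNBasic
from surprise.model_selection import cross_validate

# Load the ratings into Surprise from a pandas DataFrame
ratings_df = pd.read_csv("ratings.csv")
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(ratings_df[["userId", "movieId", "rating"]], reader)

# User-based KNN with cosine similarity
algo = KNNBasic(sim_options={"name": "cosine", "user_based": True})
cross_validate(algo, data, measures=["RMSE", "MAE"], cv=5, verbose=True)

Setting user_based to False in sim_options switches the same algorithm to item-based similarity.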
Matrix Factorization
Useful webpages
• Matrix Factorization made easy (Recommender Systems) | by Rohan Naidu |
Analytics Vidhya | Medium
• Recommender Systems: Matrix Factorization from scratch | by Aakanksha NS |
Towards Data Science
We come across recommendations multiple times a day — while deciding what to watch
on Netflix/Youtube, item recommendations on shopping sites, song suggestions on
Spotify, friend recommendations on Instagram, job recommendations on LinkedIn…the
list goes on! Recommender systems aim to predict the “rating” or “preference” a user
would give to an item. These ratings are used to determine what a user might like and
make informed suggestions.
There are two broad types of Recommender systems:
1. Content-Based systems: These systems try to match users with items based on
items’ content (genre, color, etc) and users’ profiles (likes, dislikes, demographic
information, etc.). For example, YouTube might suggest cooking videos to me based on the fact that I'm a chef, and/or that I've watched a lot of baking videos in the past, hence utilizing the information it has about a video's content and my profile.
2. Collaborative filtering: They rely on the assumption that similar users like
similar items. Similarity measures between users and/or items are used to make
recommendations.
This article talks about a very popular collaborative filtering technique called Matrix
factorization.
Matrix Factorization
A recommender system has two entities — users and items. Let’s say we have m users
and n items. The goal of our recommendation system is to build an m × n matrix (called the utility matrix) which consists of the rating (or preference) for each user-item pair.
Initially, this matrix is usually very sparse because we only have ratings for a limited number
of user-item pairs.
Here’s an example. Say we have 4 users and 5 superheroes and we’re trying to predict the
rating each user would give to each superhero. This is what our utility matrix initially looks
like:
[Figure: the initial 4 × 5 utility matrix, with only a few ratings filled in]
Now, our goal is to populate this matrix by finding similarities between users and items. To
get an intuition, for example, we see that User3 and User4 gave the same rating to Batman, so
we can assume the users are similar and they’d feel the same way about Spiderman and
predict that User3 would give a rating of 4 to Spiderman. In practice, however, this is not as
straightforward because there are multiple users interacting with many different items.
In practice, the matrix is populated by decomposing (or factorizing) the utility matrix into two tall and skinny matrices. The decomposition can be written as R ≈ U Vᵀ, where R is the m × n utility matrix, U is an m × k matrix of user factors, and V is an n × k matrix of item factors, with k much smaller than m and n.
Implementation
To implement matrix factorization, we can represent users and items with embedding matrices and use gradient descent to find the optimal decomposition. If you're unfamiliar with embeddings, you can check out this article where I've talked about them in detail.
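As a rough stand-in for that approach (this is not the article's code), the same idea can be sketched with plain NumPy: initialize small user and item embedding matrices and update them with stochastic gradient descent on the observed entries of a toy utility matrix.

import numpy as np

R = np.array([            # toy utility matrix; 0 means "no rating yet"
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

n_users, n_items, k = R.shape[0], R.shape[1], 2   # k latent factors (assumed)
rng = np.random.default_rng(42)
U = rng.normal(scale=0.1, size=(n_users, k))      # user embedding matrix
V = rng.normal(scale=0.1, size=(n_items, k))      # item embedding matrix

lr, reg = 0.01, 0.02                              # learning rate, L2 penalty
for _ in range(500):
    for u, i in zip(*R.nonzero()):                # observed ratings only
        err = R[u, i] - U[u] @ V[i]               # prediction error
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * U[u] - reg * V[i])

print(np.round(U @ V.T, 2))                       # predicted utility matrix

After training, U @ V.T reconstructs the full m × n matrix, so the previously empty cells hold predicted ratings.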
Code
All the code I’ve used in this article can be found here: https://jovian.ml/aakanksha-ns/anime-
ratings-matrix-factorization
Dataset
I’ve used the Anime Recommendations dataset from Kaggle:
Surprise Library
Leveraging Surprise Library for Recommender Systems in Python | by Mario Montalvo
García | Medium
Introduction
Recommender systems play a crucial role in our daily lives, assisting us in discovering new
products, services, and content that align with our preferences. Python provides numerous
libraries for building recommender systems, and one powerful option is the Surprise library.
Surprise is an open-source Python library specifically designed for recommendation tasks,
making it easier to develop and evaluate recommender systems. In this article, we will
explore the uses and applications of the Surprise library and highlight its key features.
Defining Surprise
Surprise is a Python scikit for building and evaluating recommender systems. It provides a
simple and intuitive API, making it accessible even to beginners. Developed on top of SciPy,
Surprise offers a wide range of collaborative filtering algorithms, including matrix
factorization-based methods such as Singular Value Decomposition (SVD) and Non-negative
Matrix Factorization (NMF). It also supports neighborhood-based approaches like k-Nearest
Neighbors (k-NN) and provides tools for model selection and evaluation.
Applications of Surprise
• Movie Recommendations: Surprise is commonly used for movie recommendation
systems. By leveraging collaborative filtering algorithms, Surprise can analyze user
preferences and provide personalized movie suggestions based on similar users’
ratings.
• Music Recommendations: With the rise of music streaming platforms, building
accurate music recommendation systems has become crucial. Surprise can help create
personalized playlists and recommend new songs or artists based on users’ listening
habits.
• Book Recommendations: Recommending books based on user preferences is
another popular application of Surprise. By analyzing past ratings or reviews, the
library can suggest books that align with users’ reading preferences and interests.
• E-commerce Recommendations: Surprise can also be employed in e-commerce
platforms to recommend products to users based on their browsing history, purchase
behavior, and similarities with other users.
Key Features of Surprise
• Easy Integration: Surprise seamlessly integrates with other popular Python libraries,
such as NumPy and Pandas, making it convenient to preprocess and manipulate data
for recommendation tasks.
• Variety of Algorithms: Surprise offers a wide range of built-in algorithms, including
collaborative filtering, matrix factorization, and neighborhood-based methods. These
algorithms provide flexibility and allow developers to choose the most suitable
approach for their specific recommendation problem.
• Built-in Datasets: Surprise provides several built-in datasets, including famous
benchmark datasets like MovieLens and Jester. These datasets can be readily used for
experimentation and evaluation of recommendation models.
• Cross-Validation and Evaluation: Surprise simplifies the evaluation process by
providing built-in functions for cross-validation and performance metrics. Developers
can easily assess the accuracy and performance of their recommendation models
using metrics such as RMSE and MAE.
• Hyperparameter Tuning: The library also includes tools for hyperparameter tuning,
allowing developers to optimize the performance of their models. Grid search and
random search functionalities help in finding the best combination of hyperparameters
for improved recommendation accuracy.
Getting Started with Surprise
To begin using Surprise, follow these steps:
• Install the library: You can install Surprise using pip by running the command pip
install surprise in your terminal.
• Import Surprise and relevant modules: Start by importing Surprise and other
required modules using import surprise.
• Load or create a dataset: You can either load one of the built-in datasets provided by
Surprise or create your own dataset using Pandas or NumPy.
• Choose an algorithm: Select an algorithm from Surprise’s extensive collection based
on your recommendation task. Each algorithm has its own parameters that can be
tuned to improve performance.
• Instantiate and train the model: Create an instance of the chosen algorithm and fit it
to your dataset using the fit() method. This step trains the model on the provided data.
• Generate recommendations: Once the model is trained, you can generate predictions for users by calling the appropriate method, such as predict() for a single user-item pair or test() for an entire test set.
• Evaluate the model: Use Surprise’s built-in evaluation functions to assess the
performance of your model. Compute metrics like RMSE or MAE to measure the
accuracy of your recommendations.
• Fine-tune and iterate: Iterate through the previous steps, experimenting with
different algorithms, parameters, and evaluation metrics to refine your recommender
system.
Building a Book Recommendation System with Surprise
The following example builds a book recommendation system with the Surprise library in Python, using collaborative filtering and the Book-Crossing dataset. It splits the data, trains the model with the SVD algorithm, evaluates accuracy with RMSE, and generates personalized book recommendations. In the sketch below, the built-in MovieLens data stands in for the Book-Crossing ratings, since the original data-loading code is not reproduced in these notes. Here is the code:
from surprise import Dataset, SVD, accuracy
from surprise.model_selection import train_test_split

# The built-in MovieLens 100k dataset stands in for Book-Crossing here,
# since the original data-loading code is not reproduced in these notes.
data = Dataset.load_builtin("ml-100k")
trainset, testset = train_test_split(data, test_size=0.25)

algo = SVD()                        # matrix-factorization algorithm
algo.fit(trainset)                  # train on the training split
predictions = algo.test(testset)    # predict the held-out ratings
accuracy.rmse(predictions)          # evaluate with RMSE