Lesson #9

The document presents an overview of Association Rule Mining, a data mining technique used to find relationships between items in transaction data, commonly applied in business for marketing and store design. It discusses key algorithms like Apriori and their application in generating association rules based on support and confidence levels. Additionally, it touches on recommender systems that utilize similar principles to suggest items based on user affinities.

Uploaded by

cruz filip

Presented by:

Dr. Alexis John M. Rubio

Copyright © 2017 McGraw Hill Education, All Rights Reserved.

PROPRIETARY MATERIAL © 2017 The McGraw Hill Education, Inc. All rights reserved. No part of this PowerPoint slide may be displayed,
reproduced or distributed in any form or by any means, without the prior written permission of the publisher, or used beyond the limited distribution
to teachers and educators permitted by McGraw Hill for their individual course preparation. If you are a student using this PowerPoint slide, you are
using it without permission.
Learning Objectives
To understand Association Rule Mining
To understand its key algorithm
Solved example of Association rule mining
Recommender Systems
Association Rule Mining
A very popular DM method in business
Also known as market basket analysis

Finds interesting relationships (affinities) between variables (items or events)
Assumes all data are categorical
Employs unsupervised learning
There is no output variable
Part of the machine learning family
Often used as an example to describe DM to ordinary people, such as the famous “relationship between diapers and beers!”
Association Rule Mining
Input: the simple point-of-sale transaction data
Output: Most frequent affinities among items
Example: according to the transaction data…
“Customers who bought a laptop computer and virus protection software also bought an extended service plan 70 percent of the time.”

How do you use such a pattern/knowledge?


Put the items next to each other for ease of finding
Promote the items as a package (do not put one on sale if the other(s) are on sale)
Place the items far apart from each other so that the customer has to walk the aisles to search for them, potentially seeing and buying other items along the way
Applications
Applications of association rule mining
In business
○ cross-marketing, cross-selling
○ store design, catalog design, e-commerce site design
○ optimization of online advertising
○ product pricing, and sales/promotion configuration

In medicine
○ relationships between symptoms and illnesses
○ diagnosis and patient characteristics and treatments
○ genes and their functions (genomics projects)…
Algorithms

The algorithms help identify the frequent itemsets, which are then converted to association rules
A large number of algorithms are available
Apriori (most popular, breadth first strategy)
Eclat (depth-first)
FP-Growth (optimized)

Their resulting sets of rules are all the same.
Given a transaction data set T, a minimum support, and a minimum confidence, the set of association rules existing in T is uniquely determined.
Association Rules

Are all association rules interesting and useful?

A Generic Rule: X -> Y [S%, C%]

X, Y: products and/or services
S: Support: how often X and Y go together (the share of all transactions that contain both)
C: Confidence: how often Y goes together with X (the share of transactions containing X that also contain Y)

Example: {Laptop Computer, Antivirus Software} -> {Extended Service Plan} [30%, 70%]
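Under the definitions above, the two measures can be written as follows. The symbols are assumptions introduced here for illustration: N is the total number of transactions, and count(·) is the number of transactions that contain every item in the given set.

```latex
\[
S(X \rightarrow Y) = \frac{\operatorname{count}(X \cup Y)}{N},
\qquad
C(X \rightarrow Y) = \frac{\operatorname{count}(X \cup Y)}{\operatorname{count}(X)}
\]
```

Read this way, [30%, 70%] in the example means that 30% of all transactions contain all three items, and 70% of the transactions that contain a laptop computer and antivirus software also contain the extended service plan.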
Apriori Algorithm
Finds subsets that are common to at least a minimum number of the itemsets (transactions)
Uses a bottom-up approach
○ frequent subsets are extended one item at a time (the size of
frequent subsets increases from one-item subsets to two-item
subsets, then three-item subsets, and so on), and
○ groups of candidates at each level are tested against the data for
minimum support
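The bottom-up procedure described above can be sketched in Python. This is a minimal illustration, not an optimized implementation: the function name `apriori`, the parameter `min_count` (minimum support expressed as a number of transactions), and the toy baskets are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in >= min_count transactions."""
    baskets = [frozenset(t) for t in transactions]
    # Level 1: every single item is a candidate.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    size = 1
    while candidates:
        # Test this level's candidates against the data for minimum support.
        counts = {c: sum(1 for basket in baskets if c <= basket) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Extend frequent itemsets by one item, keeping only candidates
        # whose smaller subsets are all frequent at the current level.
        size += 1
        joined = {a | b for a in level for b in level if len(a | b) == size}
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, size - 1))
        }
    return frequent


# Hypothetical toy data: four baskets, minimum support of 2 transactions.
toy = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
result = apriori(toy, min_count=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The pruning step is what keeps the search manageable: a three-item candidate is only counted if all of its two-item subsets already met the minimum support.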
Example: Apriori Algorithm
Solved exercise: creating association rules from grocery transaction data
Transactions List
1 Milk Egg Bread Butter
2 Milk Butter Egg Ketchup
3 Bread Butter Ketchup
4 Milk Bread Butter
5 Bread Butter Cookies
6 Milk Bread Butter Cookies
7 Milk Cookies
8 Milk Bread Butter
9 Bread Butter Egg Cookies
10 Milk Butter Bread
11 Milk Bread Butter
12 Milk Bread Cookies Ketchup
Creating Frequent Itemsets
Begin with 1-item itemsets

Transactions                       1-item Itemsets
1   Milk Egg Bread Butter          Milk      9
2   Milk Butter Egg Ketchup        Bread    10
3   Bread Butter Ketchup           Butter   10
4   Milk Bread Butter              Egg       3
5   Bread Butter Cookies           Ketchup   3
6   Milk Bread Butter Cookies      Cookies   5
7   Milk Cookies
8   Milk Bread Butter
9   Bread Butter Egg Cookies
10  Milk Butter Bread
11  Milk Bread Butter
12  Milk Bread Cookies Ketchup
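The 1-item counts can be reproduced with a few lines of Python. This is a sketch only; the variable names (`RAW`, `transactions`, `item_counts`) are chosen for this example, and the data is the 12-transaction grocery list from the exercise.

```python
from collections import Counter

# The 12 grocery transactions, one per line, items separated by spaces.
RAW = """\
Milk Egg Bread Butter
Milk Butter Egg Ketchup
Bread Butter Ketchup
Milk Bread Butter
Bread Butter Cookies
Milk Bread Butter Cookies
Milk Cookies
Milk Bread Butter
Bread Butter Egg Cookies
Milk Butter Bread
Milk Bread Butter
Milk Bread Cookies Ketchup"""
transactions = [set(line.split()) for line in RAW.splitlines()]

# Count how many transactions contain each single item.
item_counts = Counter(item for basket in transactions for item in basket)

for item, count in item_counts.most_common():
    print(f"{item:<8} {count:>2}  ({count / len(transactions):.0%})")
```

Storing each transaction as a set means an item is counted at most once per basket, which matches how support is defined.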
Frequent Itemsets (higher levels)
Select only those itemsets that meet the designated support level, say 25%
(with 12 transactions, that means an itemset must occur at least 3 times)

1-item Sets     2-item Sets             3-item Sets
Milk      9     Milk, Bread       7     Milk, Bread, Butter      6
Bread    10     Milk, Butter      7     Milk, Bread, Cookies     2
Butter   10     Milk, Egg         2     Bread, Butter, Cookies   3
Egg       3     Milk, Ketchup     2     Butter, Egg, Cookies     1
Ketchup   3     Milk, Cookies     3
Cookies   5     Bread, Butter     9     4-item Sets
                Bread, Egg        2     Milk, Bread, Butter, Cookies   1
                Bread, Ketchup    2
                Bread, Cookies    4
                Butter, Egg       3
                Butter, Ketchup   2
                Butter, Cookies   3
                Egg, Ketchup      1
                Egg, Cookies      1
                Ketchup, Cookies  1
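As a cross-check on the higher-level counts, the sketch below counts every 1- to 4-item combination in the grocery data and keeps those that meet the 25% support level. It deliberately uses brute-force enumeration rather than Apriori's candidate pruning, and the helper name `count_itemsets` is an assumption made for this example.

```python
from collections import Counter
from itertools import combinations
from math import ceil

RAW = """\
Milk Egg Bread Butter
Milk Butter Egg Ketchup
Bread Butter Ketchup
Milk Bread Butter
Bread Butter Cookies
Milk Bread Butter Cookies
Milk Cookies
Milk Bread Butter
Bread Butter Egg Cookies
Milk Butter Bread
Milk Bread Butter
Milk Bread Cookies Ketchup"""
transactions = [set(line.split()) for line in RAW.splitlines()]

MIN_SUPPORT = 0.25  # the support level used in the exercise
min_count = ceil(MIN_SUPPORT * len(transactions))  # 3 of 12 transactions


def count_itemsets(transactions, size):
    """Count every itemset of the given size that occurs in some transaction."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives each itemset one canonical tuple form.
        for itemset in combinations(sorted(basket), size):
            counts[itemset] += 1
    return counts


frequent = {
    size: {s: n for s, n in count_itemsets(transactions, size).items() if n >= min_count}
    for size in (1, 2, 3, 4)
}

for size, itemsets in frequent.items():
    print(f"{size}-item sets meeting {MIN_SUPPORT:.0%} support:")
    for itemset, n in sorted(itemsets.items(), key=lambda kv: -kv[1]):
        print("   ", ", ".join(itemset), n)
```

Brute force is fine for six distinct items; on real point-of-sale data the number of combinations grows too quickly, which is the reason level-wise pruning exists.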
Association Rule extraction
(for given Support and Confidence levels, say 25% and 50%)
Itemset (Milk, Bread, Butter) occurs 6 times.
That is 6 out of a total of 12 transactions in the database.
For all rules coming out of this itemset, the support level will be 6/12 (or 50%).

Consider the rule: (Bread, Butter) -> Milk.
Itemset (Bread, Butter) occurs 9 times.
Support level is 6/12 (or 50%).
Confidence level is 6/9 (= 67%).
Thus (Bread, Butter) -> Milk {S=50%, C=67%}

Similarly, the rule (Milk, Butter) -> Bread {S=50%, C=6/7 = 86%}
And the rule (Milk, Bread) -> Butter {S=50%, C=6/7 = 86%}

All three rules meet the 25% support and 50% confidence levels.
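The support and confidence figures for these three rules can be checked with a short Python sketch. The helper names `count` and `rule_metrics` are assumptions made for this example, and the code assumes the left-hand side of each rule occurs in at least one transaction.

```python
RAW = """\
Milk Egg Bread Butter
Milk Butter Egg Ketchup
Bread Butter Ketchup
Milk Bread Butter
Bread Butter Cookies
Milk Bread Butter Cookies
Milk Cookies
Milk Bread Butter
Bread Butter Egg Cookies
Milk Butter Bread
Milk Bread Butter
Milk Bread Cookies Ketchup"""
transactions = [set(line.split()) for line in RAW.splitlines()]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for basket in transactions if itemset <= basket)


def rule_metrics(lhs, rhs):
    """Return (support, confidence) for the rule lhs -> rhs."""
    both = count(lhs | rhs)
    support = both / len(transactions)
    confidence = both / count(lhs)  # assumes lhs occurs at least once
    return support, confidence


rules = [
    ({"Bread", "Butter"}, {"Milk"}),
    ({"Milk", "Butter"}, {"Bread"}),
    ({"Milk", "Bread"}, {"Butter"}),
]
for lhs, rhs in rules:
    s, c = rule_metrics(lhs, rhs)
    print(f"({', '.join(sorted(lhs))}) -> {', '.join(sorted(rhs))}  S={s:.0%}, C={c:.0%}")
# (Bread, Butter) -> Milk  S=50%, C=67%
# (Butter, Milk) -> Bread  S=50%, C=86%
# (Bread, Milk) -> Butter  S=50%, C=86%
```

All three rules share the same support because they come from the same itemset; only the denominator of the confidence changes with the left-hand side.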


Recommender Systems

Works on the same principle of affinity


There are three major approaches
Content based recommender systems
Collaborative filtering systems
Latent factor based systems
Content-based Recommender Systems

Create an affinity system of products or items

Create a vector of features for each item of a given type, such as a movie

Compute Jaccard similarity between products of the same type

Recommend items with higher similarity
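The content-based steps above can be illustrated with a small Python sketch. The movie titles and feature tags are hypothetical, and representing each feature vector as a set of tags is an assumption made so that Jaccard similarity (size of the intersection divided by size of the union) applies directly.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical catalog: each movie is described by a set of feature tags.
movies = {
    "Movie A": {"sci-fi", "action", "space"},
    "Movie B": {"sci-fi", "drama", "space"},
    "Movie C": {"romance", "comedy"},
}


def most_similar(title, catalog):
    """Return the other item whose features are most similar to title's."""
    others = (t for t in catalog if t != title)
    return max(others, key=lambda t: jaccard(catalog[title], catalog[t]))


print(jaccard(movies["Movie A"], movies["Movie B"]))  # 2 shared / 4 total = 0.5
print(most_similar("Movie A", movies))                # Movie B
```

A viewer who liked Movie A would be shown Movie B first, since the two share the most features.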


Collaborative Filtering Systems

Create an affinity network of users (or buyers or consumers or raters)
Create a vector of consumption by each user
Measure similarity of the things they consume
Recommend ratings within networks of people with similar affinities
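A minimal user-to-user sketch of these steps follows. The user names, the baskets, the use of Jaccard similarity as the measure, and the `recommend` helper are all assumptions for illustration; practical systems typically work with ratings and many more neighbours.

```python
def jaccard(a, b):
    """Jaccard similarity of two consumption sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, consumption, top_n=1):
    """Suggest items consumed by the top_n most similar users but not by target."""
    mine = consumption[target]
    # Rank the other users by how similar their consumption is to the target's.
    neighbours = sorted(
        (user for user in consumption if user != target),
        key=lambda user: jaccard(mine, consumption[user]),
        reverse=True,
    )[:top_n]
    suggestions = set()
    for user in neighbours:
        suggestions |= consumption[user] - mine
    return sorted(suggestions)


# Hypothetical consumption vectors, stored as sets of items per user.
consumption = {
    "ana": {"Milk", "Bread", "Butter"},
    "ben": {"Milk", "Bread", "Butter", "Cookies"},
    "cy": {"Ketchup", "Egg"},
}

print(recommend("ana", consumption))  # ['Cookies']
```

Here ben is the closest match to ana (similarity 3/4), so the one item ben consumed that ana has not is the recommendation.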
Thank you!
