26CS157F Stefaan Yetimyan Cs175A

1. Data mining uses techniques like association rules to find patterns in large datasets. Association rules find relationships between variables in the data. 2. Association rules have two measures: support measures how frequently an item appears, and confidence measures the conditional probability of one item being present given another. 3. An example transaction database is presented to demonstrate calculating support and confidence for association rules.

Uploaded by

Matthew Marquez

Copyright

© Attribution Non-Commercial (BY-NC)

DATA MINING -ASSOCIATION RULES-

STEFAAN YETIMYAN CS 157A

Outline
1. Data Mining (DM) ~ KDD [Definition]
2. DM Technique -> Association rules [support & confidence]
3. Example
(4. Apriori Algorithm)

1. Data Mining ~ KDD [Definition]

- "Data mining (DM), also called Knowledge Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns using specific DM techniques."

- [more formal definition] KDD ~ "the non-trivial extraction of implicit, previously unknown and potentially useful knowledge from data"

1. Data Mining ~ KDD [Definition]

Data Mining techniques
- Information Visualization
- k-nearest neighbor
- decision trees
- neural networks
- association rules

2. Association rules
Support
Every association rule has a support and a confidence. The support is the percentage of transactions that demonstrate the rule.

Example: Database with transactions ( customer_# : item_a1, item_a2, )

1: 1, 3, 5.
2: 1, 8, 14, 17, 12.
3: 4, 6, 8, 12, 9, 104.
4: 2, 1, 8.

support {8,12} = 2 (or 50% ~ 2 of 4 customers)
support {1, 5} = 1 (or 25% ~ 1 of 4 customers)
support {1} = 3 (or 75% ~ 3 of 4 customers)
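
The support counts above can be checked with a few lines of Python. This is only a sketch: the names `transactions`, `support_count` and `support` are illustrative and do not come from the slides.

```python
# The four example transactions: customer number -> set of items bought.
transactions = {
    1: {1, 3, 5},
    2: {1, 8, 14, 17, 12},
    3: {4, 6, 8, 12, 9, 104},
    4: {2, 1, 8},
}

def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for basket in transactions.values() if items <= basket)

def support(itemset, transactions):
    """Support as a fraction of all transactions."""
    return support_count(itemset, transactions) / len(transactions)

for itemset in ({8, 12}, {1, 5}, {1}):
    count = support_count(itemset, transactions)
    share = support(itemset, transactions)
    print(sorted(itemset), count, f"{share:.0%}")
```

Sets are used for the baskets so that the "contains every item" test is a single subset comparison (`items <= basket`).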

2. Association rules
Support

An itemset is called frequent if its support is equal to or greater than an agreed-upon minimal value, the support threshold.
Added to the previous example: if the threshold is 50%, then itemsets {8,12} and {1} are called frequent.
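
A minimal sketch of the frequency test, applied to the three itemsets from the previous slide. The helper names are illustrative, and the comparison is assumed to be inclusive (support >= threshold), as the definition states.

```python
# The four example transactions from the support slide.
transactions = [
    {1, 3, 5},
    {1, 8, 14, 17, 12},
    {4, 6, 8, 12, 9, 104},
    {2, 1, 8},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    items = set(itemset)
    return sum(1 for basket in transactions if items <= basket) / len(transactions)

def is_frequent(itemset, transactions, threshold):
    """True if the itemset's support reaches the support threshold."""
    return support(itemset, transactions) >= threshold

candidates = [{8, 12}, {1, 5}, {1}]
frequent = [c for c in candidates if is_frequent(c, transactions, 0.5)]
print(frequent)
```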

2. Association rules
Confidence
Every association rule has a support and a confidence. An association rule is of the form: X => Y

X => Y: if someone buys X, he also buys Y

The confidence is the conditional probability that, given X present in a transaction, Y will also be present.

Confidence measure, by definition: Confidence(X=>Y) equals support(X,Y) / support(X)
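
The definition translates directly into code. The sketch below reuses the four-transaction database from the support slides. The rule {8} => {12} is chosen here purely for illustration and is not evaluated in the slides. Note that support(X,Y) is read as the support of the combined itemset, i.e. transactions containing both X and Y.

```python
# The four example transactions from the support slides.
transactions = [
    {1, 3, 5},
    {1, 8, 14, 17, 12},
    {4, 6, 8, 12, 9, 104},
    {2, 1, 8},
]

def support_count(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    items = set(itemset)
    return sum(1 for basket in transactions if items <= basket)

def confidence(x, y, transactions):
    """Confidence of the rule x => y: support(x and y together) / support(x)."""
    x, y = set(x), set(y)
    supp_x = support_count(x, transactions)
    if supp_x == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return support_count(x | y, transactions) / supp_x

# Illustrative rule: of the 3 transactions containing 8, 2 also contain 12.
print(confidence({8}, {12}, transactions))
```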

2. Association rules
Confidence

We should only consider rules derived from itemsets with high support, and that also have high confidence. A rule with low confidence is not meaningful. Rules don't explain anything; they just point out hard facts in data volumes.

3. Example
Example: Database with transactions ( customer_# : item_a1, item_a2, )

1: 3, 5, 8.
2: 2, 6, 8.
3: 1, 4, 7, 10.
4: 3, 8, 10.
5: 2, 5, 8.
6: 1, 5, 6.
7: 4, 5, 6, 8.
8: 2, 3, 4.
9: 1, 5, 7, 8.
10: 3, 8, 9, 10.

Conf ( {5} => {8} ) ?
supp({5}) = 5 , supp({8}) = 7 , supp({5,8}) = 4,
then conf( {5} => {8} ) = 4/5 = 0.8 or 80%

3. Example
Example: Database with transactions ( customer_# : item_a1, item_a2, )

1: 3, 5, 8.
2: 2, 6, 8.
3: 1, 4, 7, 10.
4: 3, 8, 10.
5: 2, 5, 8.
6: 1, 5, 6.
7: 4, 5, 6, 8.
8: 2, 3, 4.
9: 1, 5, 7, 8.
10: 3, 8, 9, 10.

Conf ( {5} => {8} ) ? 80% Done.
Conf ( {8} => {5} ) ?
supp({5}) = 5 , supp({8}) = 7 , supp({5,8}) = 4,
then conf( {8} => {5} ) = 4/7 = 0.57 or 57%
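
Both directions of the rule can be recomputed from the ten transactions. This is a sketch with illustrative helper names (`supp`, `conf`), not code from the slides.

```python
# The ten example transactions: customer number -> set of items bought.
transactions = {
    1: {3, 5, 8},
    2: {2, 6, 8},
    3: {1, 4, 7, 10},
    4: {3, 8, 10},
    5: {2, 5, 8},
    6: {1, 5, 6},
    7: {4, 5, 6, 8},
    8: {2, 3, 4},
    9: {1, 5, 7, 8},
    10: {3, 8, 9, 10},
}

def supp(itemset):
    """Support count: transactions containing every item of `itemset`."""
    items = set(itemset)
    return sum(1 for basket in transactions.values() if items <= basket)

def conf(x, y):
    """Confidence of x => y, i.e. supp(x and y together) / supp(x)."""
    return supp(set(x) | set(y)) / supp(x)

print(supp({5}), supp({8}), supp({5, 8}))
print(f"conf(5 => 8) = {conf({5}, {8}):.2f}")
print(f"conf(8 => 5) = {conf({8}, {5}):.2f}")
```

The two rules share the same numerator, supp({5,8}); only the denominator changes, which is why confidence is not symmetric.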

3. Example
Example: Database with transactions ( customer_# : item_a1, item_a2, )

Conf ( {5} => {8} ) ? 80% Done.
Conf ( {8} => {5} ) ? 57% Done.
Rule ( {5} => {8} ) is more meaningful than Rule ( {8} => {5} )

3. Example
Example: Database with transactions ( customer_# : item_a1, item_a2, )

1: 3, 5, 8.
2: 2, 6, 8.
3: 1, 4, 7, 10.
4: 3, 8, 10.
5: 2, 5, 8.
6: 1, 5, 6.
7: 4, 5, 6, 8.
8: 2, 3, 4.
9: 1, 5, 7, 8.
10: 3, 8, 9, 10.

Conf ( {9} => {3} ) ?
supp({9}) = 1 , supp({3}) = 4 , supp({3,9}) = 1,
then conf( {9} => {3} ) = 1/1 = 1.0 or 100%. OK?

3. Example
Example: Database with transactions ( customer_# : item_a1, item_a2, )

Conf( {9} => {3} ) = 100%. Done. Notice: High Confidence, Low Support. -> Rule ( {9} => {3} ) not meaningful
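
One way to make "high confidence, low support" concrete is to require a rule to pass both a support threshold and a confidence threshold. In the sketch below the thresholds (30% support, 70% confidence) are assumptions chosen for illustration. The slides do not fix any values for this example.

```python
# The ten example transactions from the previous slides.
transactions = [
    {3, 5, 8}, {2, 6, 8}, {1, 4, 7, 10}, {3, 8, 10}, {2, 5, 8},
    {1, 5, 6}, {4, 5, 6, 8}, {2, 3, 4}, {1, 5, 7, 8}, {3, 8, 9, 10},
]

def support(itemset):
    """Fraction of transactions containing every item of `itemset`."""
    items = set(itemset)
    return sum(1 for basket in transactions if items <= basket) / len(transactions)

def evaluate(x, y, min_support, min_confidence):
    """Return (rule support, confidence, passes both thresholds) for x => y."""
    rule_support = support(set(x) | set(y))
    confidence = rule_support / support(x)
    meaningful = rule_support >= min_support and confidence >= min_confidence
    return rule_support, confidence, meaningful

# Hypothetical thresholds, not taken from the slides.
MIN_SUPPORT, MIN_CONFIDENCE = 0.3, 0.7

for x, y in [({5}, {8}), ({8}, {5}), ({9}, {3})]:
    s, c, ok = evaluate(x, y, MIN_SUPPORT, MIN_CONFIDENCE)
    print(f"{sorted(x)} => {sorted(y)}: support={s:.0%} confidence={c:.0%} meaningful={ok}")
```

With these thresholds, {9} => {3} is rejected despite its 100% confidence, because it rests on a single transaction.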

4. APRIORI ALGORITHM

Apriori is an efficient algorithm to find association rules (or, actually, frequent itemsets). The Apriori technique is used for generating large (that is, frequent) itemsets: out of all frequent k-itemsets, generate all candidate (k+1)-itemsets.
(Also: out of one k-itemset, we can produce (2^k) - 2 rules.)
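
The rule count follows from choosing any non-empty, proper subset of the k-itemset as the left-hand side and the remaining items as the right-hand side. A short sketch, where the function name `rules_from_itemset` and the sample itemset {3, 5, 8} are illustrative:

```python
from itertools import combinations

def rules_from_itemset(itemset):
    """All rules X => Y where X and Y are non-empty and split the itemset."""
    items = sorted(itemset)
    rules = []
    # Antecedent sizes 1 .. k-1 exclude the empty set and the full itemset.
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            rules.append((antecedent, consequent))
    return rules

rules = rules_from_itemset({3, 5, 8})
for antecedent, consequent in rules:
    print(antecedent, "=>", consequent)
print(len(rules), "rules")
```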

4. APRIORI ALGORITHM
Example: with k = 3 (& k-itemsets lexicographically ordered)

{3,4,5}, {3,4,7}, {3,5,6}, {3,5,7}, {3,5,8}, {4,5,6}, {4,5,7}

Generate all possible (k+1)-itemsets: each pair of sets of the form {a_1,a_2,...,a_(k-1),X} and {a_1,a_2,...,a_(k-1),Y} results in the candidate {a_1,a_2,...,a_(k-1),X,Y}.

{3,4,5,7}, {3,5,6,7}, {3,5,6,8}, {3,5,7,8}, {4,5,6,7}
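
The join step can be sketched as follows, assuming itemsets are stored as sorted tuples so that "same first k-1 items" is a simple prefix comparison. The function name `generate_candidates` is illustrative.

```python
def generate_candidates(frequent_k):
    """Join step: merge every pair of k-itemsets that share their first k-1 items."""
    itemsets = sorted(tuple(sorted(s)) for s in frequent_k)
    candidates = []
    for i, a in enumerate(itemsets):
        for b in itemsets[i + 1:]:
            # a sorts before b, so with an equal prefix a[-1] < b[-1]
            # and the merged tuple stays in lexicographic order.
            if a[:-1] == b[:-1]:
                candidates.append(a + (b[-1],))
    return candidates

frequent_3 = [
    {3, 4, 5}, {3, 4, 7}, {3, 5, 6}, {3, 5, 7}, {3, 5, 8}, {4, 5, 6}, {4, 5, 7},
]
candidates = generate_candidates(frequent_3)
print(candidates)
```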

4. APRIORI ALGORITHM
Example (CONTINUED):

{3,4,5,7}, {3,5,6,7}, {3,5,6,8}, {3,5,7,8}, {4,5,6,7}

Delete (prune) all itemset candidates with non-frequent subsets. For example, {3,5,6,8} can never be frequent, since its subset {5,6,8} is not frequent.
Here, only one candidate remains: {3,4,5,7}.

Finally, after pruning, determine the support of the remaining itemsets and check whether they meet the threshold.
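
The prune step can be sketched as a subset check against the frequent 3-itemsets of the example. The function name `prune` is illustrative. The final support count is not shown, because the slides give no transaction database for this example.

```python
from itertools import combinations

def prune(candidates, frequent_k):
    """Keep only the candidates whose every k-item subset is frequent."""
    frequent = {frozenset(s) for s in frequent_k}
    kept = []
    for candidate in candidates:
        k = len(candidate) - 1
        if all(frozenset(sub) in frequent for sub in combinations(candidate, k)):
            kept.append(candidate)
    return kept

frequent_3 = [
    {3, 4, 5}, {3, 4, 7}, {3, 5, 6}, {3, 5, 7}, {3, 5, 8}, {4, 5, 6}, {4, 5, 7},
]
candidates_4 = [
    (3, 4, 5, 7), (3, 5, 6, 7), (3, 5, 6, 8), (3, 5, 7, 8), (4, 5, 6, 7),
]
remaining = prune(candidates_4, frequent_3)
print(remaining)
```

Pruning before counting saves passes over the data: any candidate with a non-frequent subset is discarded without touching the transactions.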

THE END

REFERENCES
Textbook: Database System Concepts (Silberschatz et al.)
http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm
http://aaaprod.gsfc.nasa.gov/teas/joel.html
http://www.liacs.nl/
