Chapter - 4 - Association Rule Mining
Key questions in mining frequent patterns and association rules:
How can we efficiently search the huge space of candidate itemsets?
Which association rules are the most interesting?
Which patterns reflect genuine associations or correlations?
How can we take advantage of user preferences or constraints to speed up the mining process?
With massive amounts of data continuously being collected and stored, many
industries are becoming interested in mining frequent itemset patterns from
their databases.
An association rule X ⇒ Y is read as: if the body (X) occurs in a transaction, then the head (Y) is likely to occur in the same transaction, with the stated support and confidence.
Rule support and confidence are two measures of rule interestingness. They
respectively reflect the usefulness and certainty of discovered rules.
Let D, the task-relevant data set, be a set of database transactions, where each transaction T is a set of items such that T ⊆ I (I being the set of all items).
Each transaction is associated with an identifier, called TID (Transaction ID).
Let A be a set of items. A transaction T is said to contain A if and only if A ⊆ T.
An association rule is an implication of the form A ⇒ B, where A ⊂ I, B ⊂ I, and A ∩ B = ∅.
The rule A ⇒ B holds in the transaction set D with support s, where s is the percentage of transactions in D that contain A ∪ B (i.e., the union of itemsets A and B, or say, both A and B).
Support is the probability that all the items (predicates) in A and B occur together in a transaction:
support(A ⇒ B) = P(A ∪ B) = (count of transactions that contain both A and B) / (total number of transactions in the working data set)
confidence(A ⇒ B) = P(B | A) = support_count(A ∪ B) / support_count(A)
• Rules that satisfy both a minimum support threshold (min sup) and a minimum
confidence threshold (min conf) are called strong.
[Figure: Venn-style diagram of transactions — those containing A, those containing both A and B, and those where B occurs without A (C3)]
confidence(A ⇒ B) = P(B | A)
    = support(A ∪ B) / support(A)                       (using relative support)
    = support_count(A ∪ B) / support_count(A)           (using absolute support counts)
The above equation shows that the confidence of rule A ⇒ B can easily be derived from the support counts of A and A ∪ B.
That is, once the support counts of A, B, and A ∪ B are found, it is straightforward to derive the corresponding association rules A ⇒ B and B ⇒ A and check whether they are strong.
Thus, the problem of mining association rules can be reduced to that of mining
frequent itemsets.
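To make the definitions concrete, here is a minimal Python sketch, using a small made-up transaction list (not data from this chapter), of how the support and confidence of a candidate rule A ⇒ B can be computed directly from transaction counts:

# Toy transactions (hypothetical, for illustration only)
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]

def support_count(itemset, transactions):
    # Number of transactions that contain every item in `itemset`.
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    # Relative support: fraction of transactions containing `itemset`.
    return support_count(itemset, transactions) / len(transactions)

def confidence(A, B, transactions):
    # confidence(A => B) = support_count(A ∪ B) / support_count(A)
    return support_count(A | B, transactions) / support_count(A, transactions)

A, B = {"milk"}, {"bread"}
print("support(A ∪ B)     =", support(A | B, transactions))      # 2/4 = 0.5
print("confidence(A => B) =", confidence(A, B, transactions))    # 2/3 ≈ 0.67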
Association Mining from Frequent Patterns:
Support and Confidence Example
In this chapter we will discuss only the first approach (the Apriori algorithm).
Assume:
Lk is the set of all frequent k-itemsets, ordered lexicographically (i.e., the ith itemset in Lk is smaller than the jth itemset iff i < j).
Ck is the set of candidate k-itemsets, a superset of Lk.
li and lj are the ith and jth k-itemsets of a given Lk, and the elements of each itemset are also sorted lexicographically.
Initialization
Generate all the frequent itemsets of cardinality 1 (i.e., L1), in which the elements are sorted lexicographically.
Let L1 be {{i1}, {i4}, {i7}, {i9}, {i11}} (note the ordering).
Join Step:
Generate the candidate k-itemsets Ck by joining Lk-1 with itself.
Join Step:
o Let’s assume L2 = {{i1,i4}, {i1,i9}, {i1,i11}, {i4,i9}, {i4,i11}, {i7,i9}, {i7,i11}}.
o The candidate 3-itemsets are {{i1,i4,i9}, {i1,i4,i11}, {i1,i9,i11}, {i4,i9,i11}, {i7,i9,i11}} (note that the itemsets are sorted and the elements of each itemset are also sorted).
o Note that {i9,i11} is a subset of some of the generated 3-itemsets but is not in L2.
o In other words, the 2-itemset {i9,i11} is not frequent, and hence the candidate 3-itemsets having {i9,i11} as a subset cannot fulfill the requirement to be frequent itemsets.
o This leads to the immediate removal of those 3 candidate 3-itemsets in the next (prune) step.
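A minimal Python sketch of this join step (items are written by their numeric subscripts, so i1 is 1, i4 is 4, and so on; this is only an illustration of the join idea, not a full implementation):

def join_step(L_prev, k):
    # Join Lk-1 with itself: two (k-1)-itemsets are merged when their first
    # k-2 elements agree and the last element of the first is smaller than
    # the last element of the second; the result is a sorted k-itemset.
    candidates = []
    for i in range(len(L_prev)):
        for j in range(i + 1, len(L_prev)):
            l1, l2 = L_prev[i], L_prev[j]
            if l1[:k - 2] == l2[:k - 2] and l1[k - 2] < l2[k - 2]:
                candidates.append(l1 + [l2[k - 2]])
    return candidates

L2 = [[1, 4], [1, 9], [1, 11], [4, 9], [4, 11], [7, 9], [7, 11]]
print(join_step(L2, 3))
# [[1, 4, 9], [1, 4, 11], [1, 9, 11], [4, 9, 11], [7, 9, 11]]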
The Apriori Algorithm
Prune Step:
Generate Ck from the candidate k-itemsets by pruning, a priori, those candidates that have a subset which is not frequent.
This is best done by checking whether a candidate k-itemset has any (k-1)-subset that is not frequent (i.e., not in Lk-1).
If such a candidate exists, it is pruned, since it cannot be frequent.
Generation:
Generate Lk from Ck by eliminating candidates that are not frequent.
This is best done by assigning a count to each candidate k-itemset (by scanning the transactions in D) and keeping only those candidates whose count is at least min_sup.
Input:
D, a database of transactions;
min_sup, the minimum support count threshold.
Output:
L, frequent itemsets in D.
Method:
1.  L1 = find_frequent_1-itemsets(D); // initialize
2.  for (k = 2; Lk-1 ≠ ∅; k++) {
3.      Ck = apriori_gen(Lk-1); // join and prune
4.      for each transaction t ∈ D { // scan D for counts
5.          Ct = subset(Ck, t); // get the subsets of t that are candidates
6.          for each candidate c ∈ Ct
7.              c.count++;
8.      }
9.      Lk = {c ∈ Ck | c.count ≥ min_sup} // generate
10. }
11. return L = ∪k Lk;
The Apriori Algorithm
procedure apriori_gen(Lk-1: frequent (k-1)-itemsets)
1.  for each itemset l1 ∈ Lk-1 {
2.      for each itemset l2 ∈ Lk-1 {
3.          if (l1[1] = l2[1]) ∧ (l1[2] = l2[2]) ∧ ... ∧ (l1[k-2] = l2[k-2]) ∧ (l1[k-1] < l2[k-1]) then {
4.              c = l1 ⋈ l2; // join step: generate candidates
5.              if not has_infrequent_subset(c, Lk-1) then
6.                  add c to Ck;
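A compact Python rendering of the above pseudocode may help make the steps concrete. This is only a sketch under the assumptions that transactions are sets of items and min_sup is an absolute support count; it is not the textbook's code.

from itertools import combinations

def apriori(transactions, min_sup):
    # Find frequent 1-itemsets (L1).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    L = {iset: c for iset, c in counts.items() if c >= min_sup}
    all_frequent = dict(L)
    k = 2
    while L:
        # Represent the previous level as lexicographically sorted tuples.
        L_prev = sorted(tuple(sorted(iset)) for iset in L)
        Ck = apriori_gen(L_prev, k)          # join and prune
        counts = {c: 0 for c in Ck}
        for t in transactions:               # scan D for counts
            for c in Ck:
                if c <= t:
                    counts[c] += 1
        L = {c: n for c, n in counts.items() if n >= min_sup}
        all_frequent.update(L)
        k += 1
    return all_frequent

def apriori_gen(L_prev, k):
    # Join step: merge (k-1)-itemsets agreeing on their first k-2 elements.
    Ck = []
    for i in range(len(L_prev)):
        for j in range(i + 1, len(L_prev)):
            l1, l2 = L_prev[i], L_prev[j]
            if l1[:k - 2] == l2[:k - 2] and l1[k - 2] < l2[k - 2]:
                c = frozenset(l1 + (l2[k - 2],))
                # Prune step: drop c if any (k-1)-subset is not frequent.
                if not has_infrequent_subset(c, L_prev, k):
                    Ck.append(c)
    return Ck

def has_infrequent_subset(c, L_prev, k):
    prev = {frozenset(s) for s in L_prev}
    return any(frozenset(s) not in prev for s in combinations(c, k - 1))

# The Table 1 transactions from the worked example below, with min_sup = 3:
transactions = [{"I1","I2","I3"}, {"I2","I3","I4"}, {"I4","I5"},
                {"I1","I2","I4"}, {"I1","I2","I3","I5"}, {"I1","I2","I3","I4"}]
print(apriori(transactions, 3))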
Generating Association Rules
Example of Apriori: Support threshold = 50%, Confidence threshold = 60%
Table 1
Transaction   List of items
T1            I1, I2, I3
T2            I2, I3, I4
T3            I4, I5
T4            I1, I2, I4
T5            I1, I2, I3, I5
T6            I1, I2, I3, I4
Solution:
Support threshold=50% => 0.5*6= 3 => min_sup=3
1. Count of Each Item (Table 2)
Item   Count
I1     4
I2     5
I3     4
I4     4
I5     2
2. Prune Step: Table 2 shows that item I5 does not meet min_sup = 3, so it is deleted; only I1, I2, I3, and I4 meet the min_sup count.
3. The frequent 2-itemsets (support count ≥ 3) are {I1, I2}, {I1, I3}, {I2, I3}, and {I2, I4}. From these, the candidate 3-itemsets are checked:
3-itemset       Frequent?
{I1, I2, I3}    Yes (support count = 3)
{I1, I2, I4}    No
{I1, I3, I4}    No
{I2, I3, I4}    No
{I1, I2} => {I3}
Confidence = support {I1, I2, I3} / support {I1, I2} = (3/4) * 100 = 75%
{I1, I3} => {I2}
Confidence = support {I1, I2, I3} / support {I1, I3} = (3/3) * 100 = 100%
{I2, I3} => {I1}
Confidence = support {I1, I2, I3} / support {I2, I3} = (3/4) * 100 = 75%
{I1} => {I2, I3}
Confidence = support {I1, I2, I3} / support {I1} = (3/4) * 100 = 75%
How?
For each frequent itemset S, generate all its nonempty proper subsets.
For every such subset α, output the rule "α => β", where β = S − α (so α ∪ β = S), if
    support_count(α ∪ β) * 100 ≥ support_count(α) * min_conf
i.e., if the confidence support_count(S) / support_count(α) is at least min_conf (expressed as a percentage).
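A minimal Python sketch of this rule-generation step, assuming the frequent itemsets and their support counts have already been found (here, the counts from the worked example above):

from itertools import combinations

def generate_rules(frequent, min_conf):
    # frequent: dict mapping frozenset itemsets to their support counts.
    # Yields (antecedent, consequent, confidence in %) for every strong rule.
    for S, sup_S in frequent.items():
        if len(S) < 2:
            continue
        for r in range(1, len(S)):                 # nonempty proper subsets of S
            for alpha in combinations(S, r):
                alpha = frozenset(alpha)
                conf = 100.0 * sup_S / frequent[alpha]
                if conf >= min_conf:
                    yield alpha, S - alpha, conf

# Support counts from the worked example above (Table 1, min_sup = 3):
frequent = {
    frozenset(["I1"]): 4, frozenset(["I2"]): 5, frozenset(["I3"]): 4, frozenset(["I4"]): 4,
    frozenset(["I1", "I2"]): 4, frozenset(["I1", "I3"]): 3,
    frozenset(["I2", "I3"]): 4, frozenset(["I2", "I4"]): 3,
    frozenset(["I1", "I2", "I3"]): 3,
}
for a, b, conf in generate_rules(frequent, min_conf=60):
    print(set(a), "=>", set(b), f"[confidence = {conf:.0f}%]")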
Progressive Deepening
A top-down, progressive deepening approach:
First find strong rules at the high concept levels (e.g., milk ⇒ bread), then progressively deepen the search to find their lower-level, "weaker" rules (e.g., 2% milk ⇒ wheat bread).
[Figure: concept hierarchy — Level 0: Food; Level 1: milk, bread; Level 2: skim, 2%, wheat, white; Level 3: Fraser, Sunset (brand level)]
Progressive Deepening
Variations in mining multiple-level association rules:
Level-crossed association rules, which associate items from different levels of the concept hierarchy (e.g., 2% milk ⇒ bread).
Uniform Support
In this approach, a single (uniform) minimum support threshold is used to assess how frequent a pattern is at all levels of the hierarchy.
Uniform Support (limitations)
If the support threshold is set too high, associations that appear only at low abstraction levels may be missed; if it is set too low, too many uninteresting rules may be generated at high abstraction levels.
[Figure: Uniform Support — Level 1: min_sup = 5%, Milk (support = 10%); the same min_sup is applied at every level]
Reduced Support
In this approach, the algorithm reduces the required minimum support as we go down to the lower concept levels (i.e., the minimum support decreases as the level number increases).
There are different search strategies for implementing reduced-support multilevel association rule mining (see the textbook); a small sketch of the per-level thresholding idea is given after the figure below.
[Figure: Reduced Support — Level 1: min_sup = 5%, Milk (support = 10%); lower levels use progressively smaller min_sup values]
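A minimal sketch of per-level thresholding (the level numbers, thresholds, and pattern supports below are invented for illustration and are not taken from the text):

# Hypothetical per-level minimum support thresholds (level 1 = most general).
min_sup_by_level = {1: 0.05, 2: 0.03, 3: 0.01}

# Hypothetical (pattern, level, support) triples produced by a multilevel miner.
patterns = [
    ("milk -> bread", 1, 0.10),
    ("2% milk -> wheat bread", 2, 0.02),
    ("skim milk -> wheat bread", 2, 0.04),
]

# A pattern is kept only if it meets the minimum support of its own level.
frequent = [(p, lvl, sup) for p, lvl, sup in patterns
            if sup >= min_sup_by_level[lvl]]
print(frequent)   # the level-2 pattern with support 0.02 is filtered out (0.02 < 0.03)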
Redundancy Filtering
Some rules may be redundant due to “ancestor” relationships
between items.
Example
milk ⇒ wheat bread [support = 8%, confidence = 70%]
2% milk ⇒ wheat bread [support = 2%, confidence = 72%]
Redundancy Filtering
What is expected?
The expected support of a descendant rule is derived from its ancestor in the concept hierarchy: if a descendant item accounts for a known fraction of its ancestor's occurrences, the descendant rule is expected to have roughly that fraction of the ancestor rule's support.
Example
milk ⇒ wheat bread [support = 8%, confidence = 70%]
2% milk ⇒ wheat bread [support = 2%, confidence = 72%]
Redundancy Filtering
Consider the two rules below:
milk ⇒ wheat bread [support = 8%, confidence = 70%]
2% milk ⇒ wheat bread [support = 8%, confidence = 72%]
Because both rules have the same support, essentially all the milk sold is 2% milk.
Hence the 2nd (more specific) rule is the better one to keep.
Redundancy Filtering
Consider the rules below, with the expectation that each type of milk accounts for 50% of milk sales:
1. milk ⇒ wheat bread [support = 8%, confidence = 70%]
2. 2% milk ⇒ wheat bread [support = 2%, confidence = 72%]
3. skim milk ⇒ wheat bread [support = 6%, confidence = 69%]
Here there is a significant difference between rules 1 and 2, but rule 3 matches what rule 1 leads us to expect.
Hence rule 3 may be removed, as it does not add extra information given rule 1.
Rule 2 must be retained, because its support is not what was expected.
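As a rough worked reading of the expectation (an interpretation, not stated verbatim in the text): if 2% milk were 50% of milk sales, rule 2 would be expected to have support of about 8% × 50% = 4%; its observed support of 2% deviates markedly from that expectation, which is exactly why it is interesting and must be kept. Rule 3 carries the remaining 6% of rule 1's 8% support, so it tells us little beyond what rule 1 (together with rule 2) already implies.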
Multidimensional Association Rules
This means that, among all the individuals at any age range and income level, 30% of them bought a laptop.
Such rules can involve several dimensions or predicates, e.g., (age, income, buys).
Example:
age(X, ”30-34”) ∧ income(X, ”24K - 48K”) ⇒ buys(X, ”high resolution TV”)
Such a multidimensional rule is evaluated with support and confidence in the same way as a single-dimensional rule.
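A minimal sketch of how the support and confidence of such a multidimensional rule could be computed over (age, income, buys) records (the records below are invented for illustration):

# Hypothetical relational records: (age group, income bracket, item bought).
records = [
    ("30-34", "24K - 48K", "high resolution TV"),
    ("30-34", "24K - 48K", "laptop"),
    ("30-34", "48K - 60K", "high resolution TV"),
    ("35-39", "24K - 48K", "high resolution TV"),
    ("30-34", "24K - 48K", "high resolution TV"),
]

# Rule: age(X, "30-34") AND income(X, "24K - 48K") => buys(X, "high resolution TV")
body = [r for r in records if r[0] == "30-34" and r[1] == "24K - 48K"]
both = [r for r in body if r[2] == "high resolution TV"]

support = len(both) / len(records)     # fraction of all records satisfying body and head
confidence = len(both) / len(body)     # fraction of body-matching records also satisfying the head
print(f"support = {support:.0%}, confidence = {confidence:.0%}")   # support = 40%, confidence = 67%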