Da Exp 9
UID 2021600022
2021600033
Date 10-11-2024
Lab 9
Objective Association rule mining identifies patterns or relationships between items in large datasets.
In market basket analysis, it uncovers frequent item combinations, such as customers
who buy bread and butter also tending to buy milk. These insights help businesses optimize
product placements, promotions, and inventory management.
Theory
1. Association Rule Mining:
Association rule mining is a data mining technique used to find rules that predict the
occurrence of an item based on the occurrences of other items in a transaction. The rules
are typically represented in the form of "If-Then" statements, where if a particular set of
items (antecedent) is present, then there is a likelihood that another item or set of items
(consequent) will also be present in the same transaction.
For example: In a supermarket dataset, a rule like "If a customer buys bread and butter,
then they are 70% likely to buy milk" could be a common association.
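The percentage attached to a rule like this is its confidence, and it is usually reported together with the rule's support. The following is a minimal sketch of how both can be computed; the transactions are made-up illustration data and the helper names `support` and `confidence` are hypothetical, not taken from the experiment.

```python
# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Rule: {bread, butter} -> {milk}
rule_support = support({"bread", "butter", "milk"}, transactions)
rule_confidence = confidence({"bread", "butter"}, {"milk"}, transactions)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data, bread and butter appear together in 3 of 5 transactions and milk joins them in 2 of those, so the rule has support 0.40 and confidence of about 0.67.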
2. Apriori Algorithm:
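The Apriori algorithm is a widely used algorithm in association rule mining for generating
frequent itemsets. It works by identifying the frequent individual items and extending them
to progressively larger itemsets for as long as those itemsets still meet the minimum
support threshold. It proceeds in the following steps: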
BHARATIYA VIDYA BHAVAN’S
SARDAR PATEL INSTITUTE OF TECHNOLOGY
(Empowered Autonomous Institute Affiliated to University of Mumbai)
[Knowledge is Nectar]
1. Identify Frequent 1-itemsets: Find all individual items that meet the minimum
support threshold.
2. Generate Candidates for k-itemsets: From the (k-1)-itemsets that meet the
support threshold, create candidate k-itemsets by combining pairs of (k-1)-itemsets
that share a common prefix, that is, their first k-2 items.
3. Prune Non-frequent Itemsets: Remove any candidate k-itemsets that do not meet
the minimum support.
4. Generate Association Rules: For each frequent itemset, generate association
rules and calculate their confidence. If the confidence meets the threshold, keep the
rule; otherwise, discard it.
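The four steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the code used in the experiment: the function names `apriori` and `generate_rules`, the toy transactions, and the thresholds are all hypothetical. For brevity, candidates are formed by taking unions of pairs of frequent (k-1)-itemsets and then checking that every (k-1)-subset is frequent, which is a simplification of the prefix-based join described in step 2 but yields the same candidates.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent_only(candidates):
        # Count each candidate and keep those meeting the support threshold.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    # Step 1: frequent 1-itemsets.
    current = frequent_only({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:
        # Step 2: candidate k-itemsets from pairs of frequent (k-1)-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Step 3: drop candidates below the minimum support.
        current = frequent_only(candidates)
        frequent.update(current)
        k += 1
    return frequent

def generate_rules(frequent, min_confidence):
    """Step 4: return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = supp / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
]
frequent = apriori(transactions, min_support=0.5)
rules = generate_rules(frequent, min_confidence=0.7)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), f"confidence={conf:.2f}")
```

With these assumed thresholds, the three single items and the three pairs are frequent, while the triple {bread, butter, milk} (support 0.4) is dropped in step 3.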
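Apriori Principle: This principle states that any subset of a frequent itemset must also be
frequent. Equivalently, any itemset that contains an infrequent subset cannot itself be
frequent. The algorithm uses this property to prune the search space: a candidate is
discarded without counting its support as soon as one of its subsets is known to be
infrequent, which reduces the number of candidate itemsets.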
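As a small illustration of this principle, the sketch below checks whether a candidate 3-itemset can be discarded by looking only at its 2-item subsets. The helper name `has_infrequent_subset` and the set of frequent 2-itemsets are hypothetical, chosen only for this example.

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_prev):
    """True if some (k-1)-subset of `candidate` is not in `frequent_prev`."""
    k = len(candidate)
    return any(frozenset(s) not in frequent_prev
               for s in combinations(candidate, k - 1))

# Hypothetical frequent 2-itemsets found in an earlier pass.
frequent_2 = {
    frozenset(pair)
    for pair in [("bread", "butter"), ("bread", "milk"),
                 ("butter", "milk"), ("bread", "eggs")]
}

keep = frozenset({"bread", "butter", "milk"})  # all 2-subsets are frequent
drop = frozenset({"bread", "butter", "eggs"})  # {butter, eggs} is not frequent

print(has_infrequent_subset(keep, frequent_2))  # False
print(has_infrequent_subset(drop, frequent_2))  # True
```

The second candidate is rejected without scanning the transactions at all, which is where the savings come from.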
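# Step 3: Visualization
# Scatter plot of support vs. confidence for the mined rules.
# `rules` is assumed to be the table of association rules (e.g. a pandas
# DataFrame) produced in the earlier steps, which are not shown here; it
# needs 'support' and 'confidence' columns.
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 6))
plt.scatter(rules['support'], rules['confidence'], alpha=0.5, color='purple')
plt.xlabel('Support')
plt.ylabel('Confidence')
plt.title('Support vs Confidence')
plt.show()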
Output
Conclusion Association rule mining, particularly through the Apriori algorithm, is a
powerful tool for discovering meaningful relationships and patterns in large datasets. By
identifying frequently occurring itemsets and generating association rules, it provides
insights that can drive business decisions, optimize product offerings, and improve
customer experiences. The technique is widely applied in areas such as retail,
healthcare, and marketing, where understanding which items occur together is important.
References https://www.youtube.com/watch?v=zi_ydmbWfAs