ECLAT stands for Equivalence Class Clustering and bottom-up Lattice Traversal. It is a data mining algorithm used to find frequent itemsets in a dataset. These frequent itemsets are then used to create association rules, which help identify patterns in the data. ECLAT is often presented as an alternative to the Apriori algorithm that scales better and runs faster on many datasets.
What Makes ECLAT Different from Apriori?
The main difference between the two lies in how they store and search through the data:
- Apriori uses a horizontal format where each transaction is a row and it follows a breadth-first search (BFS) strategy. This means it scans the database multiple times to find frequent item combinations.
- ECLAT, on the other hand, uses a vertical format where each item is linked to a list of transaction IDs (TIDs), and it follows a depth-first search (DFS) strategy. Support is computed by intersecting TID lists instead of rescanning the database.
Because the vertical approach needs far fewer database scans, ECLAT is typically faster than Apriori, especially on large datasets. The trade-off is that the TID lists themselves must be held in memory.
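How the ECLAT Algorithm Works
Let’s walk through an example to better understand how the ECLAT algorithm works. Consider a dataset of nine transactions (T1 to T9) over five items (Bread, Butter, Milk, Coke and Jam), represented as a Boolean matrix in which each row is a transaction and each column is an item.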

The core idea of the ECLAT algorithm is to compute the support of an itemset by intersecting the tidsets of its items, which avoids generating candidate itemsets that cannot be frequent. Here’s a breakdown of the steps:
Step 1: Create the Tidset
The first step is to generate the tidset for each individual item. A tidset is simply the set of transaction IDs in which the item appears, and its size is the item's support. For k = 1 and a minimum support of 2:
| Item | Tidset |
|---|---|
| Bread | {T1, T4, T5, T7, T8, T9} |
| Butter | {T1, T2, T3, T4, T6, T8, T9} |
| Milk | {T3, T5, T6, T7, T8, T9} |
| Coke | {T2, T4} |
| Jam | {T1, T8} |
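The inversion from a horizontal to a vertical layout can be sketched in Python. This is a minimal illustration, not a reference implementation: the `transactions` dictionary is reconstructed from the tidset table above, and the names `build_tidsets` and `frequent_1` are assumptions made for this example.

```python
from collections import defaultdict

# Horizontal layout: one entry per transaction.
# Assumption: these rows are reconstructed from the tidset table above.
transactions = {
    "T1": {"Bread", "Butter", "Jam"},
    "T2": {"Butter", "Coke"},
    "T3": {"Butter", "Milk"},
    "T4": {"Bread", "Butter", "Coke"},
    "T5": {"Bread", "Milk"},
    "T6": {"Butter", "Milk"},
    "T7": {"Bread", "Milk"},
    "T8": {"Bread", "Butter", "Milk", "Jam"},
    "T9": {"Bread", "Butter", "Milk"},
}


def build_tidsets(transactions):
    """Invert transaction -> items into item -> set of transaction IDs."""
    tidsets = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


tidsets = build_tidsets(transactions)

# Keep only the items whose support (tidset size) meets the threshold.
min_support = 2
frequent_1 = {item: tids for item, tids in tidsets.items() if len(tids) >= min_support}

for item in sorted(frequent_1):
    print(item, sorted(frequent_1[item]))
```

The database is read exactly once here; every later step works only on the tidsets.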
Step 2: Calculate the Support of Itemsets by Intersecting Tidsets
ECLAT then proceeds by recursively combining the tidsets. The support of an itemset is the size of the intersection of its items' tidsets. For k = 2 (pairs with an empty intersection, such as {Milk, Coke}, are omitted):
| Itemset | Tidset |
|---|---|
| {Bread, Butter} | {T1, T4, T8, T9} |
| {Bread, Milk} | {T5, T7, T8, T9} |
| {Bread, Coke} | {T4} |
| {Bread, Jam} | {T1, T8} |
| {Butter, Milk} | {T3, T6, T8, T9} |
| {Butter, Coke} | {T2, T4} |
| {Butter, Jam} | {T1, T8} |
| {Milk, Jam} | {T8} |
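The pairwise step can be reproduced with plain set intersections. The sketch below hard-codes the tidsets from Step 1; the variable names `pair_tidsets` and `frequent_pairs` are illustrative choices, not part of any library.

```python
from itertools import combinations

# Tidsets from Step 1 (vertical layout).
tidsets = {
    "Bread": {"T1", "T4", "T5", "T7", "T8", "T9"},
    "Butter": {"T1", "T2", "T3", "T4", "T6", "T8", "T9"},
    "Milk": {"T3", "T5", "T6", "T7", "T8", "T9"},
    "Coke": {"T2", "T4"},
    "Jam": {"T1", "T8"},
}
min_support = 2

# Intersect every pair of item tidsets; skip pairs that never co-occur.
pair_tidsets = {}
for a, b in combinations(tidsets, 2):
    common = tidsets[a] & tidsets[b]
    if common:
        pair_tidsets[(a, b)] = common

# The support of a pair is simply the size of its tidset.
frequent_pairs = {
    pair: tids for pair, tids in pair_tidsets.items() if len(tids) >= min_support
}

for pair, tids in pair_tidsets.items():
    status = "frequent" if pair in frequent_pairs else "below minimum support"
    print(pair, sorted(tids), status)
```

With a minimum support of 2, {Bread, Coke} and {Milk, Jam} each occur in only one transaction, so they drop out and six frequent pairs remain.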
Step 3: Recursive Call and Generation of Larger Itemsets
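The algorithm continues recursively by combining pairs of frequent k-itemsets that share a common prefix and checking the support of each result by intersecting their tidsets. Itemsets whose support falls below the minimum, here {Bread, Coke} and {Milk, Jam}, are discarded and not extended further. The recursion continues until no further frequent itemsets can be generated. For k = 3: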
| Itemset | Tidset |
|---|---|
| {Bread, Butter, Milk} | {T8, T9} |
| {Bread, Butter, Jam} | {T1, T8} |
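The recursion can be sketched as a depth-first search over groups of itemsets that share a prefix. This is one possible minimal formulation under the assumptions of this example (tidsets as Python sets, items processed in table order); the function name `eclat` and its parameters are hypothetical.

```python
def eclat(prefix, items, min_support, out):
    """Depth-first search over itemsets that share `prefix`.

    `items` is a list of (item, tidset) pairs; each tidset already holds the
    transactions containing `prefix` plus that item, and is already frequent.
    """
    for i, (item, tids) in enumerate(items):
        itemset = prefix + (item,)
        out[itemset] = tids
        # Try to extend the current itemset with each later item in the class.
        extensions = []
        for other, other_tids in items[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_support:
                extensions.append((other, common))
        if extensions:
            eclat(itemset, extensions, min_support, out)
    return out


tidsets = {
    "Bread": {"T1", "T4", "T5", "T7", "T8", "T9"},
    "Butter": {"T1", "T2", "T3", "T4", "T6", "T8", "T9"},
    "Milk": {"T3", "T5", "T6", "T7", "T8", "T9"},
    "Coke": {"T2", "T4"},
    "Jam": {"T1", "T8"},
}
min_support = 2

start = [(item, tids) for item, tids in tidsets.items() if len(tids) >= min_support]
frequent = eclat((), start, min_support, {})

for itemset, tids in sorted(frequent.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(itemset, sorted(tids))
```

On this dataset the sketch finds 13 frequent itemsets (five single items, six pairs and the two triples in the table), and no itemset of four items reaches the minimum support.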
Step 4: Stop When No More Frequent Itemsets Can Be Found
The algorithm stops once no more itemset combinations meet the minimum support threshold. For k = 4:
| Itemset | Tidset |
|---|---|
| {Bread, Butter, Milk, Jam} | {T8} |
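We stop at k = 4 because the only candidate, {Bread, Butter, Milk, Jam}, has a support of 1, which is below the minimum support of 2, and there are no more itemset-tidset pairs to combine. From the frequent itemsets of size two and three, we derive the following rules for the given dataset: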
| Items Bought | Recommended Products |
|---|---|
| Bread | Butter |
| Bread | Milk |
| Bread | Jam |
| Butter | Milk |
| Butter | Coke |
| Butter | Jam |
| Bread and Butter | Milk |
| Bread and Butter | Jam |
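The rules in the table can be derived mechanically from the frequent itemsets. The sketch below uses the supports from the tables above and treats the last item of each frequent itemset as the recommendation, mirroring the table. It also computes confidence, the standard measure support(itemset) / support(antecedent); confidence is not part of the walkthrough above and is added here only as an illustration of how rules might be ranked.

```python
# Support counts (tidset sizes) of the frequent itemsets found above.
support = {
    ("Bread",): 6,
    ("Butter",): 7,
    ("Milk",): 6,
    ("Coke",): 2,
    ("Jam",): 2,
    ("Bread", "Butter"): 4,
    ("Bread", "Milk"): 4,
    ("Bread", "Jam"): 2,
    ("Butter", "Milk"): 4,
    ("Butter", "Coke"): 2,
    ("Butter", "Jam"): 2,
    ("Bread", "Butter", "Milk"): 2,
    ("Bread", "Butter", "Jam"): 2,
}

rules = []
for itemset, count in support.items():
    if len(itemset) < 2:
        continue  # a rule needs at least one item on each side
    antecedent, consequent = itemset[:-1], itemset[-1]
    # Confidence: share of transactions with the antecedent that also
    # contain the consequent (an addition to the article's example).
    confidence = count / support[antecedent]
    rules.append((antecedent, consequent, count, confidence))

for antecedent, consequent, count, confidence in rules:
    print(f"{' and '.join(antecedent)} -> {consequent}  "
          f"support={count}  confidence={confidence:.2f}")
```

Other splits of each itemset, such as Butter recommending Bread, are equally valid rules; this sketch generates only the direction shown in the table.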
Advantages of the ECLAT Algorithm
- Efficient in Dense Datasets: Often performs better than Apriori on datasets where items frequently co-occur.
- Memory-Efficient Search: The depth-first traversal only needs the tidsets of the current branch, and the vertical representation avoids repeated database scans.
- Fast Itemset Intersection: Computing itemset support via TID-set intersections is faster than scanning transactions repeatedly.
- Better Scalability: Can often handle larger datasets than Apriori because it avoids repeated scans and explores the search space depth-first.
Disadvantages of the ECLAT Algorithm
- High Memory Requirement: Large TID sets can consume significant memory, since every frequent item carries the IDs of all transactions that contain it.
- Less Suitable for Sparse Data: Works best on dense datasets; on sparse datasets most intersections yield small or empty TID sets, so the advantage over Apriori shrinks.
- Sensitive to Large Transactions: Transactions with many items give rise to many candidate itemsets, so the number of TID-set intersections can become expensive.
Applications of the ECLAT Algorithm
- Market Basket Analysis: Identifying items that are frequently purchased together.
- Recommendation Systems: Suggesting products based on past purchase patterns.
- Medical Diagnosis: Finding co-occurring symptoms in medical records.
- Web Usage Mining: Analyzing web logs to understand user behavior.
- Fraud Detection: Discovering frequent patterns in fraudulent activities.