Topic 6d - Hierarchical Algorithm

Hierarchical clustering is an unsupervised machine learning algorithm that builds nested clusters based on the distances between observations. It does not require the number of clusters as an input. There are two main approaches: agglomerative, which starts with each observation in its own cluster and merges them sequentially, and divisive, which starts with all observations in one cluster and splits them sequentially. The algorithm uses a distance or similarity matrix to merge or split clusters at each step. The results can be visualized as a dendrogram, a tree structure.

TOPIC 6 – PART D

CLUSTERING: HIERARCHICAL APPROACH
HIERARCHICAL CLUSTERING

• Produces a set of nested clusters organized as a hierarchical tree


• Can be visualized as a dendrogram
• A tree-like diagram that records the sequences of merges or splits

[Figure: nested clusters of points 1–6 and the matching dendrogram; merge heights run from 0 to 0.2, with leaves ordered 1, 3, 2, 5, 4, 6]
HIERARCHICAL CLUSTERING

• Uses a distance matrix as the clustering criterion. This method does not require the number of clusters k as an input, but it needs a termination condition.

[Figure: agglomerative clustering (AGNES) runs left to right over Steps 0–4, merging objects a, b, c, d, e into ab and cde and finally abcde; divisive clustering (DIANA) runs the same diagram right to left, from Step 4 back to Step 0, splitting abcde into single objects]
HIERARCHICAL CLUSTERING

 Two main types of hierarchical clustering


◦ Agglomerative:
 Start with the points as individual clusters
 At each step, merge the closest pair of clusters until only one cluster (or k clusters) remains

◦ Divisive:
 Start with one, all-inclusive cluster
 At each step, split a cluster until each cluster contains a single point (or there are k clusters)

 Traditional hierarchical algorithms use a similarity or distance matrix


◦ Merge or split one cluster at a time
AGNES (AGGLOMERATIVE NESTING)

 Introduced in Kaufmann and Rousseeuw (1990)
 Implemented in statistical packages, e.g., S-PLUS
 Uses the single-link method and the dissimilarity matrix
 Merges nodes that have the least dissimilarity (min. Euclidean distance)
 Continues in a non-descending fashion
 Eventually all nodes belong to the same cluster

[Figure: three 10×10 scatter plots showing AGNES progressively merging the closest points into larger clusters]
AGGLOMERATIVE CLUSTERING ALGORITHM

 More popular hierarchical clustering technique

 Basic algorithm:
1. Compute the dissimilarity matrix
2. Let each data point be a cluster
3. Repeat
4. Merge the two closest clusters
5. Update the dissimilarity matrix
6. Until only a single cluster remains

 Key operation is the computation of the dissimilarity of two clusters


◦ Different approaches to defining the distance between clusters
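
A minimal sketch of this loop in Python (not from the slides), assuming a precomputed dissimilarity matrix and a single-link update rule; the helper name `agglomerate` is illustrative:

```python
import numpy as np

def agglomerate(D):
    """Naive single-link agglomerative clustering.

    D is a symmetric (n x n) dissimilarity matrix. Returns the merge
    sequence as (cluster_a, cluster_b, distance) tuples, mirroring
    steps 2-6 of the algorithm above.
    """
    D = D.astype(float).copy()
    np.fill_diagonal(D, np.inf)              # never merge a cluster with itself
    clusters = [{i} for i in range(len(D))]  # step 2: each point is a cluster
    merges = []
    while len(clusters) > 1:                 # steps 3 and 6
        # step 4: find and merge the two closest clusters
        i, j = np.unravel_index(np.argmin(D), D.shape)
        i, j = min(i, j), max(i, j)
        merges.append((clusters[i], clusters[j], D[i, j]))
        clusters[i] = clusters[i] | clusters[j]
        # step 5: single-link update -- the merged cluster's distance to
        # any other cluster is the minimum of its two parts' distances
        D[i, :] = np.minimum(D[i, :], D[j, :])
        D[:, i] = D[i, :]
        D[i, i] = np.inf
        D = np.delete(np.delete(D, j, axis=0), j, axis=1)  # drop row/col j
        del clusters[j]
    return merges
```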
DENDROGRAM
A DENDROGRAM shows how the clusters are merged. It decomposes the data objects into several levels of nested partitioning (a tree of clusters).

A clustering of the data objects is obtained by cutting the dendrogram at the desired level: each connected component then forms a cluster.
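
A short sketch of that "cut at the desired level" idea using SciPy; the random data and the cut height 0.25 are illustrative assumptions, not the slides' example:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
X = rng.random((10, 2))              # ten random 2-D points (illustrative)

Z = linkage(X, method='single')      # build the full merge tree
# Cut the dendrogram at height 0.25: every connected component below
# the cut becomes one cluster.
labels = fcluster(Z, t=0.25, criterion='distance')
print(labels)                        # cluster label for each point
```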
STARTING SITUATION
• Start with clusters of individual points and a dissimilarity matrix

[Figure: points p1–p12, each in its own cluster, next to an empty dissimilarity matrix indexed by p1, p2, p3, …]
INTERMEDIATE SITUATION
• After some merging steps, we have some clusters

[Figure: the points grouped into clusters C1–C5, with the dissimilarity matrix now indexed by c1–c5]
INTERMEDIATE SITUATION
• We want to merge the two closest clusters (C2 and C5) and update the dissimilarity matrix

[Figure: the same clusters C1–C5, with C2 and C5 marked as the pair to be merged]
AFTER MERGING
• The question is "How do we update the dissimilarity matrix?"

[Figure: the dissimilarity matrix after the merge; the row and column for the new cluster C2 ∪ C5 are marked "?", since its distances to C1, C3, and C4 must be recomputed]
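
One way to answer that question in code, as a hedged sketch: for single link the new row is the element-wise minimum of the two merged rows, for complete link the maximum. The function `merge_rows` is illustrative, not a library API:

```python
import numpy as np

def merge_rows(D, i, j, linkage='single'):
    """Fold cluster j into cluster i and shrink the matrix.

    For single link, the merged cluster's distance to every other
    cluster is the MIN of the two old distances; for complete link
    it is the MAX.
    """
    combine = np.minimum if linkage == 'single' else np.maximum
    D = D.copy()
    D[i, :] = combine(D[i, :], D[j, :])  # new row for the merged cluster
    D[:, i] = D[i, :]                    # keep the matrix symmetric
    D[i, i] = 0.0                        # self-distance stays zero
    return np.delete(np.delete(D, j, axis=0), j, axis=1)
```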
HOW TO DEFINE INTER-CLUSTER SIMILARITY

 MIN
 MAX
 Group Average
 Distance Between Centroids
 Other methods driven by an objective function

[Figure: two clusters of points with a "Similarity?" arrow between them, alongside the dissimilarity matrix]
MIN AND MAX
MIN or Single Link
• Similarity of two clusters is based on the two most similar (closest) points in the different clusters
• Determined by one pair of points, i.e., by one link in the dissimilarity graph
• Uses the MIN operator (minimum value)

Example:
dist({3,6},{2,5}) = min (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = min (0.15, 0.25, 0.28, 0.39)
                  = 0.15

MAX or Complete Linkage
• Similarity of two clusters is based on the two least similar (most distant) points in the different clusters
• Determined by all pairs of points in the two clusters
• Uses the MAX operator (maximum value)

Example:
dist({3,6},{2,5}) = max (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = max (0.15, 0.25, 0.28, 0.39)
                  = 0.39
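
The example can be checked directly in Python; the four pairwise distances are the values quoted above:

```python
# Checking the example above: the four pairwise distances between
# cluster {3,6} and cluster {2,5}, as quoted in the slides.
pair_dists = [0.15, 0.25, 0.28, 0.39]  # d(3,2), d(6,2), d(3,5), d(6,5)

print(min(pair_dists))  # single link (MIN)    -> 0.15
print(max(pair_dists))  # complete link (MAX)  -> 0.39
```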
CLUSTER SIMILARITY: MIN OR SINGLE LINK

• Similarity of two clusters is based on the two most similar (closest) points in the different clusters
• Determined by one pair of points, i.e., by one link in the dissimilarity graph

[Figure: five points in a row, 1–5, with the single link drawn between the closest pair in the two clusters]
CLUSTER SIMILARITY: MIN OR SINGLE LINK
Numeric attributes:

Point  x coordinate  y coordinate
p1     0.40          0.53
p2     0.22          0.38
p3     0.35          0.32
p4     0.26          0.19
p5     0.08          0.41
p6     0.45          0.30

Euclidean distance: d(p1, p2) = √((x1 − x2)² + (y1 − y2)²)
e.g., d(p1, p2) = √((0.40 − 0.22)² + (0.53 − 0.38)²)

Dissimilarity matrix:

     p1    p2    p3    p4    p5    p6
p1   0.00
p2   0.24  0.00
p3   0.22  0.15  0.00
p4   0.37  0.20  0.16  0.00
p5   0.34  0.14  0.28  0.29  0.00
p6   0.23  0.25  0.11  0.22  0.39  0.00
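
A sketch of how this matrix could be computed with SciPy from the coordinates above; the rounded values can differ from the slides' table by about ±0.01 because of rounding in the source:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# The six points from the table above.
points = np.array([
    [0.40, 0.53],  # p1
    [0.22, 0.38],  # p2
    [0.35, 0.32],  # p3
    [0.26, 0.19],  # p4
    [0.08, 0.41],  # p5
    [0.45, 0.30],  # p6
])

# Pairwise Euclidean distances as a full 6 x 6 matrix.
D = np.round(squareform(pdist(points)), 2)
print(D)
```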
CLUSTER SIMILARITY: MIN OR SINGLE LINK
• Step 1 – the two shortest distances in the dissimilarity matrix become the first two clusters: cluster (p3, p6) at 0.11 and cluster (p2, p5) at 0.14

     p1    p2    p3    p4    p5    p6
p1   0.00
p2   0.24  0.00
p3   0.22  0.15  0.00
p4   0.37  0.20  0.16  0.00
p5   0.34  0.14  0.28  0.29  0.00
p6   0.23  0.25  0.11  0.22  0.39  0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; {3,6} joins at height 0.11 and {2,5} at 0.14]
CLUSTER SIMILARITY: MIN OR SINGLE LINK

• Step 2 – form the next cluster by finding the minimum distance between the existing clusters and the remaining points, and update the matrix with the two newly formed clusters

[Figure: nested-cluster plot over points 1–6 with {2,5} and {3,6} circled]

Update the dissimilarity matrix (entries to compute are marked ???):

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  ???   0.00
P(3,6)  ???   ???     0.00
p4      0.37  ???     ???     0.00
CLUSTER SIMILARITY: MIN OR SINGLE LINK
Distance (single link):
dist({3,6},{2,5}) = min (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = min (0.15, 0.25, 0.28, 0.39) = 0.15
dist({3,6},{1})   = min (dist(3,1), dist(6,1)) = min (0.22, 0.23) = 0.22
dist({3,6},{4})   = min (dist(3,4), dist(6,4)) = min (0.16, 0.22) = 0.16
dist({2,5},{1})   = min (dist(2,1), dist(5,1)) = min (0.24, 0.34) = 0.24
dist({2,5},{4})   = min (dist(2,4), dist(5,4)) = min (0.20, 0.29) = 0.20

Updated matrix:

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  0.24  0.00
P(3,6)  0.22  0.15    0.00
p4      0.37  0.20    0.16    0.00

The closest clusters are cluster {3,6} and cluster {2,5} at 0.15, so the clusters are merged.
CLUSTER SIMILARITY: MIN OR SINGLE LINK

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; {3,6} and {2,5} now joined at height 0.15]
CLUSTER SIMILARITY: MIN OR SINGLE LINK
• Step 3 – form the next cluster by finding the minimum distance from the existing clusters to the remaining points

Distance (single link):
dist({{3,6},{2,5}}, 4) = min (dist({3,6},4), dist({2,5},4)) = min (0.16, 0.20) = 0.16
dist({{3,6},{2,5}}, 1) = min (dist({3,6},1), dist({2,5},1)) = min (0.22, 0.24) = 0.22

Updated matrix:

                p1    P((2,5),(3,6))  p4
p1              0.00
P((2,5),(3,6))  0.22  0.00
p4              0.37  0.16            0.00

The closest pair is the merged cluster {(2,5),(3,6)} and p4 at 0.16, so they are merged.

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; p4 joins the {(2,5),(3,6)} cluster at height 0.16]
CLUSTER SIMILARITY: MIN OR SINGLE LINK
• Step 4 – form the final cluster by finding the distance from the remaining point p1 to the existing cluster

Distance (single link):
dist({{{3,6},{2,5}},4}, 1) = min (dist({{3,6},{2,5}},1), dist(4,1)) = min (0.22, 0.37) = 0.22

Updated matrix:

                    p1    P(((2,5),(3,6)),4)
p1                  0.00
P(((2,5),(3,6)),4)  0.22  0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; p1 joins at height 0.22]

Final matrix:

                      P(((2,5),(3,6)),4),1
P(((2,5),(3,6)),4),1  0

Terminate.
CLUSTER SIMILARITY: MIN OR SINGLE LINK

Nested Clusters and Dendrogram

[Figure: left, the nested single-link clusters over points 1–6 ({3,6}, then {2,5}, then their union, then +p4, then +p1); right, the dendrogram with leaves 3, 6, 2, 5, 4, 1 and merge heights 0.11, 0.14, 0.15, 0.16, 0.22]
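
For reference, a sketch of the same single-link result with SciPy, fed the slides' dissimilarity matrix (matplotlib is assumed only for the plot):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# The slides' dissimilarity matrix for p1..p6 (symmetric, zero diagonal).
D = np.array([
    [0.00, 0.24, 0.22, 0.37, 0.34, 0.23],
    [0.24, 0.00, 0.15, 0.20, 0.14, 0.25],
    [0.22, 0.15, 0.00, 0.16, 0.28, 0.11],
    [0.37, 0.20, 0.16, 0.00, 0.29, 0.22],
    [0.34, 0.14, 0.28, 0.29, 0.00, 0.39],
    [0.23, 0.25, 0.11, 0.22, 0.39, 0.00],
])

# 'single' is MIN linkage; the merge heights come out as
# 0.11, 0.14, 0.15, 0.16, 0.22, matching the worked example.
Z = linkage(squareform(D), method='single')
print(Z)

dendrogram(Z, labels=['p1', 'p2', 'p3', 'p4', 'p5', 'p6'])
plt.show()
```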
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Similarity of two clusters is based on the two least similar (most distant) points in the different clusters
• Determined by all pairs of points in the two clusters

Note: the inter-cluster distance is calculated using the maximum, but the selection of which clusters to merge is still based on the minimum.

[Figure: five points in a row, 1–5, with the complete link drawn between the most distant pair in the two clusters]
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Step 1 – the two shortest distances in the dissimilarity matrix become the first two clusters: (p3, p6) and (p2, p5)

     p1    p2    p3    p4    p5    p6
p1   0.00
p2   0.24  0.00
p3   0.22  0.15  0.00
p4   0.37  0.20  0.16  0.00
p5   0.34  0.14  0.28  0.29  0.00
p6   0.23  0.25  0.11  0.22  0.39  0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; {3,6} joins at height 0.11 and {2,5} at 0.14]

Update the matrix (entries to compute are marked XX):

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  XX    0.00
P(3,6)  XX    XX      0.00
p4      0.37  XX      XX      0.00
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

Distance (complete link):
dist({3,6},{2,5}) = max (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = max (0.15, 0.25, 0.28, 0.39) = 0.39
dist({3,6},{1})   = max (dist(3,1), dist(6,1)) = max (0.22, 0.23) = 0.23
dist({3,6},{4})   = max (dist(3,4), dist(6,4)) = max (0.16, 0.22) = 0.22
dist({2,5},{1})   = max (dist(2,1), dist(5,1)) = max (0.24, 0.34) = 0.34
dist({2,5},{4})   = max (dist(2,4), dist(5,4)) = max (0.20, 0.29) = 0.29

Updated matrix:

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  0.34  0.00
P(3,6)  0.23  0.39    0.00
p4      0.37  0.29    0.22    0.00
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Step 2 – form the next cluster by finding the minimum distance among the existing clusters and remaining points: the smallest entry is dist({3,6},{4}) = 0.22, so the newly formed cluster is {(3,6), 4}

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  0.34  0.00
P(3,6)  0.23  0.39    0.00
p4      0.37  0.29    0.22    0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; p4 joins {3,6} at height 0.22]
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

Update the matrix:

          p1    P(2,5)  P(3,6),4
p1        0.00
P(2,5)    0.34  0.00
P(3,6),4  0.37  0.39    0.00

Distance (complete link):
dist({{3,6},{4}}, {2,5}) = max (dist({3,6},{2,5}), dist(4,{2,5})) = max (0.39, 0.29) = 0.39
dist({{3,6},{4}}, 1)     = max (dist({3,6},1), dist(4,1)) = max (0.23, 0.37) = 0.37
dist({2,5}, 1)           = 0.34

• Step 3 – form the next cluster by finding the minimum distance among the existing clusters and remaining points: the smallest entry is dist({2,5},1) = 0.34, so the newly formed cluster is {(2,5), 1}

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; p1 joins {2,5} at height 0.34]
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Step 4 – form the final cluster by finding the distance between the two remaining clusters: the newly formed cluster is {{(2,5),1}, {(3,6),4}}

Distance (complete link):
dist({{3,6},{4}}, {{2,5},1}) = max (dist({{3,6},4},{2,5}), dist({{3,6},4},1)) = max (0.39, 0.37) = 0.39

Updated matrix:

          P(2,5),1  P(3,6),4
P(2,5),1  0.00
P(3,6),4  0.39      0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; the two clusters join at height 0.39]

Final matrix:

                           P({{3,6},{4}}, {{2,5},1})
P({{3,6},{4}}, {{2,5},1})  0

Terminate.
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

Nested Clusters and Dendrogram

[Figure: left, the nested complete-link clusters over points 1–6 ({3,6} and {2,5} first, then {(3,6),4} and {(2,5),1}); right, the dendrogram with leaves 3, 6, 4, 1, 2, 5 and merge heights 0.11, 0.14, 0.22, 0.34, 0.39]
HIERARCHICAL CLUSTERING: COMPARISON
Single Link (MIN) vs. Complete Link (MAX)

[Figure: side by side, the single-link nested clusters and dendrogram (leaves 3, 6, 2, 5, 4, 1, heights up to 0.22) next to the complete-link nested clusters and dendrogram (leaves 3, 6, 4, 1, 2, 5, heights up to 0.39)]
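
The contrast can be reproduced with SciPy by running both linkage methods on the same matrix, a sketch reusing the slides' dissimilarity values:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

D = np.array([  # the slides' dissimilarity matrix for p1..p6
    [0.00, 0.24, 0.22, 0.37, 0.34, 0.23],
    [0.24, 0.00, 0.15, 0.20, 0.14, 0.25],
    [0.22, 0.15, 0.00, 0.16, 0.28, 0.11],
    [0.37, 0.20, 0.16, 0.00, 0.29, 0.22],
    [0.34, 0.14, 0.28, 0.29, 0.00, 0.39],
    [0.23, 0.25, 0.11, 0.22, 0.39, 0.00],
])

# The two methods share the same merge loop; only the matrix-update
# rule differs, yet the trees differ: single link chains points
# together (top height 0.22), complete link keeps clusters compact
# (top height 0.39).
for method in ('single', 'complete'):
    Z = linkage(squareform(D), method=method)
    print(method, Z[:, 2].round(2))  # the sequence of merge heights
```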
SUMMARY

• Cluster analysis groups objects based on their similarity or dissimilarity and has wide applications
• Measures of dissimilarity can be computed for various types of data: nominal, binary, ordinal, numeric, and textual
• The clustering algorithms discussed fall into two categories: partitioning methods (k-means) and hierarchical methods
References

1. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 3rd Edition, Morgan Kaufmann, 2012.

2. Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2019.
THANK YOU
Shuzlina Abdul Rahman | Sofianita Mutalib | Siti Nur Kamaliah Kamarudin
