
Chapter 6

Hierarchical Clustering
• A hierarchical clustering method works by grouping data into a tree of clusters.
• Hierarchical clustering begins by treating every data point as a separate cluster.

• Then, it repeatedly executes the following steps (a minimal sketch in Python follows this list):

1. Identify the two clusters that are closest together.
2. Merge these two most similar clusters.
3. Continue these steps until all the clusters are merged together.
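
• A minimal sketch of this loop in Python (the function names, the use of single linkage, and the sample points are illustrative assumptions, not part of these slides):

import math

def single_link(c1, c2):
    # Inter-cluster distance = minimum pairwise distance (single linkage).
    return min(math.dist(p, q) for p in c1 for q in c2)

def agglomerate(points, k=1):
    # Treat every data point as a separate cluster.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Step 1: identify the two clusters that are closest together.
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: single_link(clusters[ab[0]], clusters[ab[1]]),
        )
        # Step 2: merge them; the loop repeats until k clusters remain.
        clusters[i].extend(clusters.pop(j))
    return clusters

print(agglomerate([(0, 0), (0, 1), (5, 5), (5, 6)], k=2))
# -> [[(0, 0), (0, 1)], [(5, 5), (5, 6)]]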
• Hierarchical Clustering
 Two main types of hierarchical clustering:
• Bottom-up, agglomerative (the merging approach)
 Start with the points as individual clusters.
 At each step, merge the closest pair of clusters, until only one cluster (or k clusters) is left.

• Top-down, divisive (the splitting or partitioning approach)
 Start with one, all-inclusive cluster.
 At each step, split a cluster, until each cluster contains an individual point (or there are k clusters). A sketch of this top-down idea follows.
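
• A hedged sketch of the top-down idea via repeated 2-means splits (a "bisecting k-means" style; scikit-learn and the rule "always split the largest cluster" are assumptions for illustration):

import numpy as np
from sklearn.cluster import KMeans

def divisive(X, k):
    clusters = [X]  # start with one, all-inclusive cluster
    while len(clusters) < k:
        # Split the largest remaining cluster into two with 2-means.
        idx = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        big = clusters.pop(idx)
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(big)
        clusters.append(big[labels == 0])
        clusters.append(big[labels == 1])
    return clusters

X = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 8])
print([len(c) for c in divisive(X, 2)])  # roughly [20, 20]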

Hierarchical Clustering
• Build a tree-based hierarchical taxonomy (dendrogram) from a set of documents.

animal
├─ vertebrate: fish, reptile, amphibian, mammal
└─ invertebrate: worm, insect, crustacean


 Construction
• Agglomerative (the merging approach)

Initially, consider every data point as an individual cluster.
• At every step, combine/merge the nearest pairs of clusters (it is a bottom-up method).
• At first, every data point is considered an individual entity or cluster. At every iteration, the clusters merge with other clusters until one cluster is formed.
• Example 1 shows a hierarchical clustering of five labeled points: A, B, C, D, and E. The dendrogram represents the following sequence of nested partitions:

  Clustering   Clusters
  C1           {A}, {B}, {C}, {D}, {E}
  C2           {AB}, {C}, {D}, {E}
  C3           {AB}, {CD}, {E}
  C4           {ABCD}, {E}
  C5           {ABCDE}

• with C(t-1) ⊂ C(t) for t = 2, ..., 5. We assume that A and B are merged before C and D.
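
• This merge sequence can be reproduced with SciPy; the 1-D coordinates below are hypothetical, chosen so that A and B merge first, then C and D, then {AB} with {CD}, and E last:

import numpy as np
from scipy.cluster.hierarchy import linkage

points = np.array([[0.0], [0.9], [4.0], [5.0], [12.0]])  # A, B, C, D, E (assumed)
Z = linkage(points, method='single')
# Each row of Z records one merge: the two cluster indices, their distance,
# and the size of the new cluster (merged clusters get indices 5, 6, ...).
print(Z)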
• Number of Hierarchical Clusterings
The number of different nested or hierarchical clusterings corresponds to the number of different binary rooted trees (dendrograms) with n distinctly labeled leaves.
• Any tree with t nodes has t - 1 edges.
• Also, any rooted binary tree with m leaves has m - 1 internal nodes.
• Thus, a dendrogram with m leaf nodes has a total of t = m + (m - 1) = 2m - 1 nodes,
• and consequently t - 1 = 2m - 2 edges.
• For example, a dendrogram with m = 5 leaves has 2(5) - 1 = 9 nodes and 2(5) - 2 = 8 edges.


• Exercise 1
• Calculate the number of nodes (t) and edges, as a function of the number of leaves (m), in the figures below.
Dendrogram: Hierarchical Clustering

• A clustering is obtained by cutting the dendrogram at a desired level: each connected component forms a cluster.
• Determining clusters
• One of the problems with hierarchical clustering is that there is no objective way to say how many clusters there are.
• If we cut the single-linkage tree at the point shown below, we would say that there are two clusters.
• However, if we cut the tree lower, we might say that there is one cluster and two singletons (single-point clusters, i.e., leaves).
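
• Cutting the tree can be done programmatically; a sketch with SciPy's fcluster (the data and threshold values here are arbitrary examples):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.rand(10, 2)            # hypothetical data
Z = linkage(X, method='single')      # single-linkage tree
# Cut at a desired height: merges above the threshold are undone, and each
# remaining connected component forms a cluster.
labels = fcluster(Z, t=0.3, criterion='distance')
# Alternatively, request a fixed number of clusters directly:
labels_k2 = fcluster(Z, t=2, criterion='maxclust')
print(labels, labels_k2)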
Hierarchical Clustering algorithms
• Agglomerative (bottom-up, the merging approach):
• Starts with each object in a separate cluster.
• Clusters are formed by grouping objects into bigger and bigger clusters.
• Divisive (top-down, the splitting or partitioning approach):
• Starts with all the objects grouped in a single cluster. Clusters are divided or split until each object is in a separate cluster.
• Does not require the number of clusters k in advance.
• Needs a termination/read-out condition.
Hierarchical Agglomerative Clustering (HAC)

• Assumes a similarity function for determining the similarity of two instances.
• Starts with each instance in a separate cluster and then repeatedly joins the two clusters that are most similar, until there is only one cluster.
• The history of merging forms a binary tree or hierarchy.
• Agglomerative methods are commonly used in marketing research.
• They consist of linkage methods, variance methods, and centroid methods.
Hierarchical Agglomerative Clustering: Linkage Methods
• 1- The single linkage method is based on minimum distance, or the nearest-neighbor rule.
How to Define Inter-Cluster Similarity
[Figure: proximity matrix over points p1–p5, with the candidate definitions: MIN, MAX, group average, distance between centroids, and other methods driven by an objective function (Ward's Method uses squared error).]
• 2- The complete linkage method is based on maximum distance, or the furthest-neighbor rule.
• 3- In the average linkage method, the distance between two clusters is defined as the average of the distances between all pairs of objects, one from each cluster.
Linkage Methods of Clustering
[Figure: Single linkage — minimum distance between Cluster 1 and Cluster 2; Complete linkage — maximum distance; Average linkage — average distance.]
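
• The three linkage rules can be compared on the same data with SciPy (random, purely illustrative data):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.random((12, 2))
for method in ('single', 'complete', 'average'):
    # 'single' = minimum distance, 'complete' = maximum distance,
    # 'average' = mean of all pairwise distances between two clusters.
    Z = linkage(X, method=method)
    print(method, fcluster(Z, t=3, criterion='maxclust'))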
Select a Similarity Measure

• Similarity measures can be correlations or distances.
• The most commonly used measure of similarity is the Euclidean distance. The city-block distance is also used.
• If variables are measured in vastly different units, we must standardize the data, and also eliminate outliers.
• Use of different similarity/distance measures may lead to different clustering results.
• Hence, it is advisable to use different measures and compare the results.
• Rules
• 1- The city-block (Manhattan) distance between two points a = (x1, y1) and b = (x2, y2) is defined as
  ρ(a, b) = |x2 - x1| + |y2 - y1|
• 2- The Euclidean distance is defined as (sketched after this list)
  d(a, b) = √((x2 - x1)² + (y2 - y1)²)
• 3- The k-means algorithm uses the concept of a centroid to create k clusters.
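
• A short sketch of rules 1 and 2 (the point values are made up to give a worked result):

import math

def city_block(a, b):
    # Rule 1: |x2 - x1| + |y2 - y1|
    return abs(b[0] - a[0]) + abs(b[1] - a[1])

def euclidean(a, b):
    # Rule 2: sqrt((x2 - x1)^2 + (y2 - y1)^2)
    return math.sqrt((b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2)

a, b = (1, 2), (4, 6)
print(city_block(a, b))  # |4 - 1| + |6 - 2| = 7
print(euclidean(a, b))   # sqrt(9 + 16) = 5.0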

• Note: the main idea of the agglomerative algorithm.
• Let's say we have six data points: A, B, C, D, E, F.
• Step-1:
Consider each letter as a single cluster and calculate the distance of one cluster from all the other clusters.
• Step-2:
Comparable clusters are combined/merged together to form a single cluster. Let's say cluster (B) and cluster (C) are very similar to each other, therefore we merge them into cluster (BC).
• Step-3: similarly with clusters (D) and (E), giving (DE).
• Step-4: we get the clusters [(A), (BC), (DE), (F)].
• Step-5:
We recalculate the proximity according to the algorithm and merge the two nearest clusters, (DE) and (F), to form the new clustering [(A), (BC), (DEF)].
• Step-6:
Repeating the same process, the clusters (DEF) and (BC) are comparable and are merged together to form a new cluster. We are now left with the clusters [(A), (BCDEF)].
• Step-7:
At last, the two remaining clusters are merged together to form a single cluster, [(ABCDEF)]. A sketch that reproduces this sequence follows.
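
• A sketch that replays this merge history with SciPy; the 1-D coordinates are hypothetical, chosen so that average linkage merges in the order described above:

import numpy as np
from scipy.cluster.hierarchy import linkage

names = ['A', 'B', 'C', 'D', 'E', 'F']
# Assumed layout: B and C are close, D and E are close, F is near DE, A is far away.
pts = np.array([[-2.0], [3.0], [3.4], [7.0], [7.5], [9.0]])
Z = linkage(pts, method='average')
for i, j, dist, size in Z:
    a, b = names[int(i)], names[int(j)]
    names.append(a + b)  # the merged cluster gets the next free index
    print(f'merge {a} + {b} at distance {dist:.2f} (new size {int(size)})')
# Expected order: B+C, then D+E, then F joins DE, then BC joins DEF, then A last.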

Nonhierarchical Clustering Methods

• The nonhierarchical clustering methods are frequently referred to as k-means clustering.

-In the sequential threshold method, a cluster center is selected and all
objects within a prespecified threshold value from the center are grouped
together.

-In the parallel threshold method, several cluster centers are selected and
objects within the threshold level are grouped with the nearest center.

-The optimizing partitioning method differs from the two threshold procedures in that objects can later be reassigned to clusters to optimize an overall criterion, such as the average within-cluster distance for a given number of clusters (a minimal sketch follows).
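
• A minimal k-means example in the optimizing-partitioning spirit (scikit-learn assumed; the two-blob data are synthetic):

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
# k-means repeatedly reassigns objects to the nearest centroid, optimizing
# the overall within-cluster squared distance for a given k.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)  # should recover centers near (0, 0) and (6, 6)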
Classification of Clustering Procedures (Fig. 20.4)

Clustering Procedures
├─ Hierarchical
│   ├─ Agglomerative
│   │   ├─ Linkage Methods: Single Linkage, Complete Linkage, Average Linkage
│   │   ├─ Variance Methods: Ward's Method
│   │   └─ Centroid Methods
│   └─ Divisive
└─ Nonhierarchical
    ├─ Sequential Threshold
    ├─ Parallel Threshold
    └─ Optimizing Partitioning
• End
