Topic 6d - Hierarchical Algorithm

Hierarchical clustering is an unsupervised machine learning algorithm that builds nested clusters based on the distances between observations. It does not require the number of clusters as an input. There are two main approaches: agglomerative, which starts with each observation in its own cluster and merges them sequentially, and divisive, which starts with all observations in one cluster and splits them sequentially. The algorithm uses a distance or similarity matrix to merge or split clusters at each step. The results can be visualized as a dendrogram, a tree structure.

TOPIC 6 – PART D

CLUSTERING: HIERARCHICAL APPROACH
HIERARCHICAL CLUSTERING

• Produces a set of nested clusters organized as a hierarchical tree


• Can be visualized as a dendrogram
• A tree-like diagram that records the sequences of merges or splits

[Figure: nested clusters of points 1–6 and the matching dendrogram; merge heights run from 0 to 0.2, with leaves ordered 1, 3, 2, 5, 4, 6]
HIERARCHICAL CLUSTERING

• Uses a distance matrix as the clustering criterion. This method does not require the number of clusters k as an input, but it needs a termination condition.

[Figure: agglomerative clustering (AGNES) runs left to right over Steps 0–4, merging objects a, b, c, d, e into ab and cde and finally abcde; divisive clustering (DIANA) runs the same diagram right to left, from Step 4 back to Step 0, splitting abcde into single objects]
HIERARCHICAL CLUSTERING

 Two main types of hierarchical clustering


◦ Agglomerative:
 Start with the points as individual clusters
 At each step, merge the closest pair of clusters until only one cluster (or k clusters) remains

◦ Divisive:
 Start with one, all-inclusive cluster
 At each step, split a cluster until each cluster contains a single point (or there are k clusters)

 Traditional hierarchical algorithms use a similarity or distance matrix


◦ Merge or split one cluster at a time
AGNES (AGGLOMERATIVE NESTING)

 Introduced in Kaufmann and Rousseeuw (1990)
 Implemented in statistical packages, e.g., S-PLUS
 Uses the single-link method and the dissimilarity matrix
 Merges nodes that have the least dissimilarity (min. Euclidean distance)
 Continues in a non-descending fashion
 Eventually all nodes belong to the same cluster

[Figure: three 10×10 scatter plots showing AGNES progressively merging the closest points into larger clusters]
AGGLOMERATIVE CLUSTERING ALGORITHM

 More popular hierarchical clustering technique

 Basic algorithm:
1. Compute the dissimilarity matrix
2. Let each data point be a cluster
3. Repeat
4. Merge the two closest clusters
5. Update the dissimilarity matrix
6. Until only a single cluster remains

 Key operation is the computation of the dissimilarity of two clusters


◦ Different approaches to defining the distance between clusters
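
A minimal sketch of this loop in Python (not from the slides), assuming a precomputed dissimilarity matrix and a single-link update rule; the helper name `agglomerate` is illustrative:

```python
import numpy as np

def agglomerate(D):
    """Naive single-link agglomerative clustering.

    D is a symmetric (n x n) dissimilarity matrix. Returns the merge
    sequence as (cluster_a, cluster_b, distance) tuples, mirroring
    steps 2-6 of the algorithm above.
    """
    D = D.astype(float).copy()
    np.fill_diagonal(D, np.inf)              # never merge a cluster with itself
    clusters = [{i} for i in range(len(D))]  # step 2: each point is a cluster
    merges = []
    while len(clusters) > 1:                 # steps 3 and 6
        # step 4: find and merge the two closest clusters
        i, j = np.unravel_index(np.argmin(D), D.shape)
        i, j = min(i, j), max(i, j)
        merges.append((clusters[i], clusters[j], D[i, j]))
        clusters[i] = clusters[i] | clusters[j]
        # step 5: single-link update -- the merged cluster's distance to
        # any other cluster is the minimum of its two parts' distances
        D[i, :] = np.minimum(D[i, :], D[j, :])
        D[:, i] = D[i, :]
        D[i, i] = np.inf
        D = np.delete(np.delete(D, j, axis=0), j, axis=1)  # drop row/col j
        del clusters[j]
    return merges
```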
DENDROGRAM
A DENDROGRAM shows how the clusters are merged. It decomposes the data objects into several levels of nested partitioning (a tree of clusters).

A clustering of the data objects is obtained by cutting the dendrogram at the desired level: each connected component then forms a cluster.
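
A short sketch of that "cut at the desired level" idea using SciPy; the random data and the cut height 0.25 are illustrative assumptions, not the slides' example:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
X = rng.random((10, 2))              # ten random 2-D points (illustrative)

Z = linkage(X, method='single')      # build the full merge tree
# Cut the dendrogram at height 0.25: every connected component below
# the cut becomes one cluster.
labels = fcluster(Z, t=0.25, criterion='distance')
print(labels)                        # cluster label for each point
```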
STARTING SITUATION
• Start with clusters of individual points and a dissimilarity matrix

[Figure: points p1–p12, each in its own cluster, next to an empty dissimilarity matrix indexed by p1, p2, p3, …]
INTERMEDIATE SITUATION
• After some merging steps, we have some clusters

[Figure: the points grouped into clusters C1–C5, with the dissimilarity matrix now indexed by c1–c5]
INTERMEDIATE SITUATION
• We want to merge the two closest clusters (C2 and C5) and update the dissimilarity matrix

[Figure: the same clusters C1–C5, with C2 and C5 marked as the pair to be merged]
AFTER MERGING
• The question is "How do we update the dissimilarity matrix?"

[Figure: the dissimilarity matrix after the merge; the row and column for the new cluster C2 ∪ C5 are marked "?", since its distances to C1, C3, and C4 must be recomputed]
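
One way to answer that question in code, as a hedged sketch: for single link the new row is the element-wise minimum of the two merged rows, for complete link the maximum. The function `merge_rows` is illustrative, not a library API:

```python
import numpy as np

def merge_rows(D, i, j, linkage='single'):
    """Fold cluster j into cluster i and shrink the matrix.

    For single link, the merged cluster's distance to every other
    cluster is the MIN of the two old distances; for complete link
    it is the MAX.
    """
    combine = np.minimum if linkage == 'single' else np.maximum
    D = D.copy()
    D[i, :] = combine(D[i, :], D[j, :])  # new row for the merged cluster
    D[:, i] = D[i, :]                    # keep the matrix symmetric
    D[i, i] = 0.0                        # self-distance stays zero
    return np.delete(np.delete(D, j, axis=0), j, axis=1)
```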
HOW TO DEFINE INTER-CLUSTER SIMILARITY

 MIN
 MAX
 Group Average
 Distance Between Centroids
 Other methods driven by an objective function

[Figure: two clusters of points with a "Similarity?" arrow between them, alongside the dissimilarity matrix]
MIN AND MAX
MIN or Single Link
• Similarity of two clusters is based on the two most similar (closest) points in the different clusters
• Determined by one pair of points, i.e., by one link in the dissimilarity graph
• Uses the MIN operator (minimum value)

Example:
dist({3,6},{2,5}) = min (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = min (0.15, 0.25, 0.28, 0.39)
                  = 0.15

MAX or Complete Linkage
• Similarity of two clusters is based on the two least similar (most distant) points in the different clusters
• Determined by all pairs of points in the two clusters
• Uses the MAX operator (maximum value)

Example:
dist({3,6},{2,5}) = max (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = max (0.15, 0.25, 0.28, 0.39)
                  = 0.39
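
The example can be checked directly in Python; the four pairwise distances are the values quoted above:

```python
# Checking the example above: the four pairwise distances between
# cluster {3,6} and cluster {2,5}, as quoted in the slides.
pair_dists = [0.15, 0.25, 0.28, 0.39]  # d(3,2), d(6,2), d(3,5), d(6,5)

print(min(pair_dists))  # single link (MIN)    -> 0.15
print(max(pair_dists))  # complete link (MAX)  -> 0.39
```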
CLUSTER SIMILARITY: MIN OR SINGLE LINK

• Similarity of two clusters is based on the two most similar (closest) points in the different clusters
• Determined by one pair of points, i.e., by one link in the dissimilarity graph

[Figure: five points in a row, 1–5, with the single link drawn between the closest pair in the two clusters]
CLUSTER SIMILARITY: MIN OR SINGLE LINK
Numeric attributes:

Point  x coordinate  y coordinate
p1     0.40          0.53
p2     0.22          0.38
p3     0.35          0.32
p4     0.26          0.19
p5     0.08          0.41
p6     0.45          0.30

Euclidean distance: d(p1, p2) = √((x1 − x2)² + (y1 − y2)²)
e.g., d(p1, p2) = √((0.40 − 0.22)² + (0.53 − 0.38)²)

Dissimilarity matrix:

     p1    p2    p3    p4    p5    p6
p1   0.00
p2   0.24  0.00
p3   0.22  0.15  0.00
p4   0.37  0.20  0.16  0.00
p5   0.34  0.14  0.28  0.29  0.00
p6   0.23  0.25  0.11  0.22  0.39  0.00
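
A sketch of how this matrix could be computed with SciPy from the coordinates above; the rounded values can differ from the slides' table by about ±0.01 because of rounding in the source:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# The six points from the table above.
points = np.array([
    [0.40, 0.53],  # p1
    [0.22, 0.38],  # p2
    [0.35, 0.32],  # p3
    [0.26, 0.19],  # p4
    [0.08, 0.41],  # p5
    [0.45, 0.30],  # p6
])

# Pairwise Euclidean distances as a full 6 x 6 matrix.
D = np.round(squareform(pdist(points)), 2)
print(D)
```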
CLUSTER SIMILARITY: MIN OR SINGLE LINK
• Step 1 – the two shortest distances in the dissimilarity matrix become the first two clusters: cluster (p3, p6) at 0.11 and cluster (p2, p5) at 0.14

     p1    p2    p3    p4    p5    p6
p1   0.00
p2   0.24  0.00
p3   0.22  0.15  0.00
p4   0.37  0.20  0.16  0.00
p5   0.34  0.14  0.28  0.29  0.00
p6   0.23  0.25  0.11  0.22  0.39  0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; {3,6} joins at height 0.11 and {2,5} at 0.14]
CLUSTER SIMILARITY: MIN OR SINGLE LINK

• Step 2 – form the next cluster by finding the minimum distance between the existing clusters and the remaining points, and update the matrix with the two newly formed clusters

[Figure: nested-cluster plot over points 1–6 with {2,5} and {3,6} circled]

Update the dissimilarity matrix (entries to compute are marked ???):

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  ???   0.00
P(3,6)  ???   ???     0.00
p4      0.37  ???     ???     0.00
CLUSTER SIMILARITY: MIN OR SINGLE LINK
Distance (single link):
dist({3,6},{2,5}) = min (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = min (0.15, 0.25, 0.28, 0.39) = 0.15
dist({3,6},{1})   = min (dist(3,1), dist(6,1)) = min (0.22, 0.23) = 0.22
dist({3,6},{4})   = min (dist(3,4), dist(6,4)) = min (0.16, 0.22) = 0.16
dist({2,5},{1})   = min (dist(2,1), dist(5,1)) = min (0.24, 0.34) = 0.24
dist({2,5},{4})   = min (dist(2,4), dist(5,4)) = min (0.20, 0.29) = 0.20

Updated matrix:

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  0.24  0.00
P(3,6)  0.22  0.15    0.00
p4      0.37  0.20    0.16    0.00

The closest clusters are cluster {3,6} and cluster {2,5} at 0.15, so the clusters are merged.
CLUSTER SIMILARITY: MIN OR SINGLE LINK

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; {3,6} and {2,5} now joined at height 0.15]
CLUSTER SIMILARITY: MIN OR SINGLE LINK
• Step 3 – form the next cluster by finding the minimum distance from the existing clusters to the remaining points

Distance (single link):
dist({{3,6},{2,5}}, 4) = min (dist({3,6},4), dist({2,5},4)) = min (0.16, 0.20) = 0.16
dist({{3,6},{2,5}}, 1) = min (dist({3,6},1), dist({2,5},1)) = min (0.22, 0.24) = 0.22

Updated matrix:

                p1    P((2,5),(3,6))  p4
p1              0.00
P((2,5),(3,6))  0.22  0.00
p4              0.37  0.16            0.00

The closest pair is the merged cluster {(2,5),(3,6)} and p4 at 0.16, so they are merged.

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; p4 joins the {(2,5),(3,6)} cluster at height 0.16]
CLUSTER SIMILARITY: MIN OR SINGLE LINK
• Step 4 – form the final cluster by finding the distance from the remaining point p1 to the existing cluster

Distance (single link):
dist({{{3,6},{2,5}},4}, 1) = min (dist({{3,6},{2,5}},1), dist(4,1)) = min (0.22, 0.37) = 0.22

Updated matrix:

                    p1    P(((2,5),(3,6)),4)
p1                  0.00
P(((2,5),(3,6)),4)  0.22  0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 2, 5, 4, 1; p1 joins at height 0.22]

Final matrix:

                      P(((2,5),(3,6)),4),1
P(((2,5),(3,6)),4),1  0

Terminate.
CLUSTER SIMILARITY: MIN OR SINGLE LINK

Nested Clusters and Dendrogram

[Figure: left, the nested single-link clusters over points 1–6 ({3,6}, then {2,5}, then their union, then +p4, then +p1); right, the dendrogram with leaves 3, 6, 2, 5, 4, 1 and merge heights 0.11, 0.14, 0.15, 0.16, 0.22]
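
For reference, a sketch of the same single-link result with SciPy, fed the slides' dissimilarity matrix (matplotlib is assumed only for the plot):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# The slides' dissimilarity matrix for p1..p6 (symmetric, zero diagonal).
D = np.array([
    [0.00, 0.24, 0.22, 0.37, 0.34, 0.23],
    [0.24, 0.00, 0.15, 0.20, 0.14, 0.25],
    [0.22, 0.15, 0.00, 0.16, 0.28, 0.11],
    [0.37, 0.20, 0.16, 0.00, 0.29, 0.22],
    [0.34, 0.14, 0.28, 0.29, 0.00, 0.39],
    [0.23, 0.25, 0.11, 0.22, 0.39, 0.00],
])

# 'single' is MIN linkage; the merge heights come out as
# 0.11, 0.14, 0.15, 0.16, 0.22, matching the worked example.
Z = linkage(squareform(D), method='single')
print(Z)

dendrogram(Z, labels=['p1', 'p2', 'p3', 'p4', 'p5', 'p6'])
plt.show()
```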
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Similarity of two clusters is based on the two least similar (most distant) points in the different clusters
• Determined by all pairs of points in the two clusters

Note: the inter-cluster distance is calculated using the maximum, but the selection of which clusters to merge is still based on the minimum.

[Figure: five points in a row, 1–5, with the complete link drawn between the most distant pair in the two clusters]
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Step 1 – the two shortest distances in the dissimilarity matrix become the first two clusters: (p3, p6) and (p2, p5)

     p1    p2    p3    p4    p5    p6
p1   0.00
p2   0.24  0.00
p3   0.22  0.15  0.00
p4   0.37  0.20  0.16  0.00
p5   0.34  0.14  0.28  0.29  0.00
p6   0.23  0.25  0.11  0.22  0.39  0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; {3,6} joins at height 0.11 and {2,5} at 0.14]

Update the matrix (entries to compute are marked XX):

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  XX    0.00
P(3,6)  XX    XX      0.00
p4      0.37  XX      XX      0.00
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

Distance (complete link):
dist({3,6},{2,5}) = max (dist(3,2), dist(6,2), dist(3,5), dist(6,5))
                  = max (0.15, 0.25, 0.28, 0.39) = 0.39
dist({3,6},{1})   = max (dist(3,1), dist(6,1)) = max (0.22, 0.23) = 0.23
dist({3,6},{4})   = max (dist(3,4), dist(6,4)) = max (0.16, 0.22) = 0.22
dist({2,5},{1})   = max (dist(2,1), dist(5,1)) = max (0.24, 0.34) = 0.34
dist({2,5},{4})   = max (dist(2,4), dist(5,4)) = max (0.20, 0.29) = 0.29

Updated matrix:

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  0.34  0.00
P(3,6)  0.23  0.39    0.00
p4      0.37  0.29    0.22    0.00
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Step 2 – form the next cluster by finding the minimum distance among the existing clusters and remaining points: the smallest entry is dist({3,6},{4}) = 0.22, so the newly formed cluster is {(3,6), 4}

        p1    P(2,5)  P(3,6)  p4
p1      0.00
P(2,5)  0.34  0.00
P(3,6)  0.23  0.39    0.00
p4      0.37  0.29    0.22    0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; p4 joins {3,6} at height 0.22]
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

Update the matrix:

          p1    P(2,5)  P(3,6),4
p1        0.00
P(2,5)    0.34  0.00
P(3,6),4  0.37  0.39    0.00

Distance (complete link):
dist({{3,6},{4}}, {2,5}) = max (dist({3,6},{2,5}), dist(4,{2,5})) = max (0.39, 0.29) = 0.39
dist({{3,6},{4}}, 1)     = max (dist({3,6},1), dist(4,1)) = max (0.23, 0.37) = 0.37
dist({2,5}, 1)           = 0.34

• Step 3 – form the next cluster by finding the minimum distance among the existing clusters and remaining points: the smallest entry is dist({2,5},1) = 0.34, so the newly formed cluster is {(2,5), 1}

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; p1 joins {2,5} at height 0.34]
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

• Step 4 – form the final cluster by finding the distance between the two remaining clusters: the newly formed cluster is {{(2,5),1}, {(3,6),4}}

Distance (complete link):
dist({{3,6},{4}}, {{2,5},1}) = max (dist({{3,6},4},{2,5}), dist({{3,6},4},1)) = max (0.39, 0.37) = 0.39

Updated matrix:

          P(2,5),1  P(3,6),4
P(2,5),1  0.00
P(3,6),4  0.39      0.00

Draw the dendrogram.
[Figure: dendrogram with leaves 3, 6, 4, 1, 2, 5; the two clusters join at height 0.39]

Final matrix:

                           P({{3,6},{4}}, {{2,5},1})
P({{3,6},{4}}, {{2,5},1})  0

Terminate.
CLUSTER SIMILARITY: MAX OR COMPLETE LINKAGE

Nested Clusters and Dendrogram

[Figure: left, the nested complete-link clusters over points 1–6 ({3,6} and {2,5} first, then {(3,6),4} and {(2,5),1}); right, the dendrogram with leaves 3, 6, 4, 1, 2, 5 and merge heights 0.11, 0.14, 0.22, 0.34, 0.39]
HIERARCHICAL CLUSTERING: COMPARISON
Single Link (MIN) vs. Complete Link (MAX)

[Figure: side by side, the single-link nested clusters and dendrogram (leaves 3, 6, 2, 5, 4, 1, heights up to 0.22) next to the complete-link nested clusters and dendrogram (leaves 3, 6, 4, 1, 2, 5, heights up to 0.39)]
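
The contrast can be reproduced with SciPy by running both linkage methods on the same matrix, a sketch reusing the slides' dissimilarity values:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

D = np.array([  # the slides' dissimilarity matrix for p1..p6
    [0.00, 0.24, 0.22, 0.37, 0.34, 0.23],
    [0.24, 0.00, 0.15, 0.20, 0.14, 0.25],
    [0.22, 0.15, 0.00, 0.16, 0.28, 0.11],
    [0.37, 0.20, 0.16, 0.00, 0.29, 0.22],
    [0.34, 0.14, 0.28, 0.29, 0.00, 0.39],
    [0.23, 0.25, 0.11, 0.22, 0.39, 0.00],
])

# The two methods share the same merge loop; only the matrix-update
# rule differs, yet the trees differ: single link chains points
# together (top height 0.22), complete link keeps clusters compact
# (top height 0.39).
for method in ('single', 'complete'):
    Z = linkage(squareform(D), method=method)
    print(method, Z[:, 2].round(2))  # the sequence of merge heights
```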
SUMMARY

• Cluster analysis groups objects based on their similarity or dissimilarity and has wide applications
• Measures of dissimilarity can be computed for various types of data: nominal, binary, ordinal, numeric, and textual
• The clustering algorithms discussed fall into two categories: partitioning methods (k-means) and hierarchical methods
References

1. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 3rd Edition, Morgan Kaufmann, 2012.

2. Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2019.
THANK YOU
Shuzlina Abdul Rahman | Sofianita Mutalib | Siti Nur Kamaliah Kamarudin
