
SE-807/CS-871

Machine Learning
Prof Dr. Hammad Afzal
[email protected]

Data and Text Processing Lab


www.codteem.com

Agenda
• Unsupervised Learning

– K-Means Clustering

– Agglomerative Clustering

Unsupervised Learning

CLUSTERING

Clustering is the partitioning of a data set into subsets (clusters), so that
the data in each subset (ideally) share some common trait, often according to
some defined distance measure.

Clustering is unsupervised classification.
CLUSTERING
There is no explicit teacher; the system forms clusters, "natural groupings",
or structure in the input patterns.
CLUSTERING
• Data WITHOUT classes or labels:

  {x1, x2, x3, …, xn},  x ∈ R^d

• Deals with finding structure in a collection of unlabeled data.

• The process of organizing objects into groups whose members are similar in
  some way.

• A cluster is therefore a collection of objects which are "similar" to one
  another and "dissimilar" to the objects belonging to other clusters.
CLUSTERING

• In this case we easily identify the 4 clusters into which the data can be
  divided.

• The similarity criterion is distance: two or more objects belong to the
  same cluster if they are "close" according to a given distance measure.
Types of Clustering
Hierarchical algorithms
These find successive clusters using previously established clusters.

1. Agglomerative ("bottom-up"):
Agglomerative algorithms begin with each element as a separate cluster and
merge them into successively larger clusters.

2. Divisive ("top-down"):
Divisive algorithms begin with the whole set and proceed to divide it into
successively smaller clusters.
Types of Clustering
• Partitional clustering
  – Constructs a partition of the data set, producing several clusters at once.
  – The process is repeated iteratively until a termination condition is met.
  – Examples:
    ▪ K-means clustering
    ▪ Fuzzy c-means clustering
K-Means Clustering
1. Choose the number (K) of clusters and randomly select the centroids of
   each cluster.

2. For each data point:
   I. Calculate the distance from the data point to each cluster centroid.
   II. Assign the data point to the closest cluster.

3. Recompute the centroid of each cluster.

4. Repeat steps 2 and 3 until there is no further change in the assignment of
   data points (or in the centroids).
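The four steps above translate almost line for line into NumPy. The sketch
below is ours (the slides give no code); it assumes the points are the rows
of a NumPy array and that no cluster ever becomes empty:

import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """Cluster the rows of X (an (n, d) array) into k groups."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick k distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iters):
        # Step 2: distance from every point to every centroid; assign closest.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each centroid as the mean of its assigned points
        # (this sketch assumes no cluster ends up empty).
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Step 4: stop once the centroids no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids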
K MEANS – Example 2
– Suppose we have 4 medicines and each has two attributes (pH
and weight index).
– Our goal is to group these objects into K=2 clusters of medicine

Medicine   Weight   pH-Index
A          1        1
B          2        1
C          4        3
D          5        4
K MEANS – Example 2
– Compute the distance between all samples and K centroids

c1 = A = (1, 1),  c2 = B = (2, 1)

d(D, c1) = √((5 − 1)² + (4 − 1)²) = 5
d(D, c2) = √((5 − 2)² + (4 − 1)²) = 4.24
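These are plain Euclidean distances; a quick check in NumPy (variable names
ours):

import numpy as np

D  = np.array([5, 4])                        # medicine D: (weight, pH-index)
c1, c2 = np.array([1, 1]), np.array([2, 1])  # initial centroids (A and B)
print(np.linalg.norm(D - c1))                # 5.0
print(round(np.linalg.norm(D - c2), 2))      # 4.24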
K MEANS – Example 2
– Assign each sample to its closest cluster.

– An element in a row of the Group matrix below is 1 if and only if the
  object is assigned to that group:

         A  B  C  D
  G1 = [ 1  0  0  0 ]  (cluster 1)
       [ 0  1  1  1 ]  (cluster 2)
K MEANS – Example 2
– Re-calculate the K centroids: knowing the members of each cluster, we
  compute the new centroid of each group based on these new memberships.

c1 = (1, 1)
c2 = ((2 + 4 + 5)/3, (1 + 3 + 4)/3) = (11/3, 8/3) = (3.67, 2.67)
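The new centroid is just the coordinate-wise mean of the cluster's members,
e.g. for cluster 2 (a one-line check, variable name ours):

import numpy as np

cluster2 = np.array([[2, 1], [4, 3], [5, 4]])  # members B, C, D
print(np.round(cluster2.mean(axis=0), 2))      # [3.67 2.67]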
K MEANS – Example 2
• Repeat the above steps: compute the distance of all objects to the new
  centroids.
K MEANS – Example 2

Assign each object to its closest (new) centroid.
K MEANS – Example 2

Knowing the members of each cluster, we compute the new centroid of each
group based on these new memberships:

c1 = ((1 + 2)/2, (1 + 1)/2) = (1.5, 1)
c2 = ((4 + 5)/2, (3 + 4)/2) = (4.5, 3.5)
K MEANS – Example 2
• We obtain G2 = G1: comparing the grouping of the last iteration with this
  iteration reveals that the objects no longer change groups.

• Thus the k-means computation has reached stability and no more iterations
  are needed.
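Running the kmeans sketch from earlier on the four medicines reproduces this
result (illustrative only; the cluster numbering depends on the random
initialization):

import numpy as np

medicines = np.array([[1, 1], [2, 1], [4, 3], [5, 4]])  # A, B, C, D
labels, centroids = kmeans(medicines, k=2)  # the sketch defined earlier
print(labels)     # e.g. [0 0 1 1]: {A, B} and {C, D}
print(centroids)  # e.g. [[1.5 1. ] [4.5 3.5]], matching the worked example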
Kmeans - Examples
Data Points – RGB Values of pixels
Can be used for Image Segmentation

D. Comaniciu and P. Meer, "Robust Analysis of Feature Spaces: Color Image
Segmentation," 1997.
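A hedged sketch of how such a segmentation works (not the cited authors'
method, which is mean-shift based): treat each pixel's RGB value as a 3-D
point, run k-means, and paint every pixel with its cluster centroid. The
file names and the kmeans function from the earlier sketch are our own
assumptions:

import numpy as np
from PIL import Image  # assumes Pillow is installed

img = np.asarray(Image.open("input.png").convert("RGB"), dtype=float)
pixels = img.reshape(-1, 3)              # one 3-D point per pixel
labels, centroids = kmeans(pixels, k=3)  # kmeans sketched earlier
out = centroids[labels].reshape(img.shape).astype(np.uint8)
Image.fromarray(out).save("segmented.png")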
Kmeans - Examples
Extraction of text in degraded documents

[Figure: original image alongside the k-means result with k=3.]
Kmeans - Examples

[Figure: original image alongside quantized versions with K=5 and K=11.]
Kmeans - Examples

• Quantization of colors

Hierarchical Clustering
Hierarchical clustering
Agglomerative and divisive clustering on the data set {a, b, c, d, e}

[Diagram: agglomerative clustering runs left to right over steps 0-4: {a}
and {b} merge into {a, b}; {d} and {e} merge into {d, e}; {c} joins to give
{c, d, e}; finally {a, b} and {c, d, e} merge into {a, b, c, d, e}. Divisive
clustering performs the same splits in reverse order.]
Agglomerative clustering
1. Convert object attributes to a distance matrix.
2. Set each object as a cluster (thus if we have N objects, we will have N
   clusters at the beginning).
3. Repeat until the number of clusters is one (or a known number of
   clusters):
   a. Merge the two closest clusters.
   b. Update the distance matrix.

[Diagram: five points d1-d5 merging step by step into {d1, d2} and {d4, d5},
then {d3, d4, d5}, and finally a single cluster.]
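A naive single-link sketch of these steps (our own code; in practice one
would call scipy.cluster.hierarchy.linkage, which is far more efficient):

import numpy as np

def agglomerative(X, n_clusters=1):
    # Steps 1-2: compute the distance matrix and make every object a cluster.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = [[i] for i in range(len(X))]
    # Step 3: merge the two closest clusters until the target count is reached.
    while len(clusters) > n_clusters:
        best = (0, 1, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single-link distance between clusters a and b
                d = min(D[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        clusters[a] += clusters[b]   # 3a: merge the two closest clusters
        del clusters[b]              # 3b: the matrix update is implicit here,
                                     # since single-link distances are recomputed
                                     # from the point-to-point matrix each round
    return clusters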
Starting Situation
• Start with clusters of individual points and a distance/proximity matrix

[Figure: an empty proximity matrix with one row and one column per point
(p1, p2, p3, p4, p5, …), alongside the individual data points.]
Intermediate situation

• After some merging steps, we have some clusters


[Figure: clusters C1-C5 and the corresponding 5×5 distance matrix.]
Intermediate situation
• How do we compare two clusters?

[Figure: clusters C1-C5.]
Inter-cluster distance measures

Similarity?

• Single Link
• Average Link
• Complete Link
• Distance between centroids
Intermediate situation
• We want to merge the two closest clusters (C2 and C5) and update the
  distance matrix.

[Figure: clusters C1-C5, with C2 and C5 about to be merged, and the distance
matrix before the merge.]
Single link
• Smallest distance between an element in one cluster and an element in the
  other:

  D(ci, cj) = min { D(x, y) : x ∈ ci, y ∈ cj }
Complete link
• Largest distance between an element in one cluster and an element in the
  other:

  D(ci, cj) = max { D(x, y) : x ∈ ci, y ∈ cj }
Average Link
• Average distance between an element in one cluster and an element in the
  other:

  D(ci, cj) = avg { D(x, y) : x ∈ ci, y ∈ cj }
Distance between centroids
• Distance between the centroids of two clusters:

  D(ci, cj) = D(μi, μj), where μi and μj are the centroids (means) of ci
  and cj
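The four measures differ only in how they reduce the set of pairwise
distances between two clusters; a compact sketch (function names ours,
clusters given as lists of NumPy points):

import numpy as np

def pairwise(ci, cj):
    # all point-to-point distances between the two clusters
    return [np.linalg.norm(x - y) for x in ci for y in cj]

def single_link(ci, cj):
    return min(pairwise(ci, cj))              # smallest pairwise distance

def complete_link(ci, cj):
    return max(pairwise(ci, cj))              # largest pairwise distance

def average_link(ci, cj):
    return float(np.mean(pairwise(ci, cj)))   # mean pairwise distance

def centroid_distance(ci, cj):
    # distance between the two cluster centroids (means)
    return np.linalg.norm(np.mean(ci, axis=0) - np.mean(cj, axis=0))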
After Merging
• Update the distance matrix: every entry involving the merged cluster
  C2 ∪ C5 must be recomputed (marked "?" below); the remaining entries are
  unchanged.

Dist      C1   C2 ∪ C5   C3   C4
C1        0    ?         …    …
C2 ∪ C5   ?    0         ?    ?
C3        …    ?         0    …
C4        …    ?         …    0
Agglomerative Clustering - Example
     X1    X2
A    1     1
B    1.5   1.5
C    5     5
D    3     4
E    4     4
F    3     3.5

Data matrix

Euclidean distance: dAB = ((1 − 1.5)² + (1 − 1.5)²)^(1/2) = 0.707

Dist   A     B     C     D     E     F
A      0.00  0.71  5.66  3.61  4.24  3.20
B      0.71  0.00  4.95  2.92  3.54  2.50
C      5.66  4.95  0.00  2.24  1.41  2.50
D      3.61  2.92  2.24  0.00  1.00  0.50
E      4.24  3.54  1.41  1.00  0.00  1.12
F      3.20  2.50  2.50  0.50  1.12  0.00
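The 6×6 matrix above can be reproduced with SciPy (a sketch; variable names
ours):

import numpy as np
from scipy.spatial.distance import pdist, squareform

X = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]])  # A..F
print(np.round(squareform(pdist(X)), 2))  # matches the distance matrix above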
Merge two closest clusters

Agglomerative Clustering - Example

Find the two closest clusters in the distance matrix: D and F, at distance
0.50. Merge them into a single cluster (D, F).

Dist   A     B     C     D     E     F
A      0.00  0.71  5.66  3.61  4.24  3.20
B      0.71  0.00  4.95  2.92  3.54  2.50
C      5.66  4.95  0.00  2.24  1.41  2.50
D      3.61  2.92  2.24  0.00  1.00  0.50
E      4.24  3.54  1.41  1.00  0.00  1.12
F      3.20  2.50  2.50  0.50  1.12  0.00
Update Distance Matrix

Agglomerative Clustering - Example

After merging D and F, the entries involving the new cluster (D, F) are
still unknown:

Dist   A     B     C     D,F   E
A      0.00  0.71  5.66  ?     4.24
B      0.71  0.00  4.95  ?     3.54
C      5.66  4.95  0.00  ?     1.41
D,F    ?     ?     ?     0.00  ?
E      4.24  3.54  1.41  ?     0.00
Update Distance Matrix

Agglomerative Clustering - Example

Min Distance – Single Linkage:

D(D,F)→A = min(dDA, dFA) = min(3.61, 3.20) = 3.20
D(D,F)→B = min(dDB, dFB) = min(2.92, 2.50) = 2.50
D(D,F)→C = min(dDC, dFC) = min(2.24, 2.50) = 2.24
D(D,F)→E = min(dDE, dFE) = min(1.00, 1.12) = 1.00

Dist   A     B     C     D,F   E
A      0.00  0.71  5.66  3.20  4.24
B      0.71  0.00  4.95  2.50  3.54
C      5.66  4.95  0.00  2.24  1.41
D,F    3.20  2.50  2.24  0.00  1.00
E      4.24  3.54  1.41  1.00  0.00
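The row-by-row updates above follow one simple rule: the single-link distance
from the merged cluster to any other cluster is the minimum of the two old
distances. A sketch of the matrix update (our own helper):

import numpy as np

def merge_single_link(D, i, j):
    # Merge clusters i and j in square distance matrix D (single link):
    # the new row/column is the element-wise minimum of rows i and j.
    D = D.copy()
    D[i, :] = np.minimum(D[i, :], D[j, :])
    D[:, i] = D[i, :]
    D[i, i] = 0.0
    keep = [k for k in range(len(D)) if k != j]
    return D[np.ix_(keep, keep)]

# Merging D (index 3) and F (index 5) in the 6x6 matrix reproduces the
# updated row [3.20, 2.50, 2.24, 0.00, 1.00] computed above.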
Merge two closest clusters

Agglomerative Clustering - Example

Dist   A     B     C     D,F   E
A      0.00  0.71  5.66  3.20  4.24
B      0.71  0.00  4.95  2.50  3.54
C      5.66  4.95  0.00  2.24  1.41
D,F    3.20  2.50  2.24  0.00  1.00
E      4.24  3.54  1.41  1.00  0.00

The smallest remaining distance is d(A, B) = 0.71, so A and B are merged
next:

Dist   A,B   C     D,F   E
A,B    0.00  ?     ?     ?
C      ?     0.00  2.24  1.41
D,F    ?     2.24  0.00  1.00
E      ?     1.41  1.00  0.00
Update Distance Matrix

Agglomerative Clustering - Example


D(A,B)→C = min(dCA, dCB) = min(5.66, 4.95) = 4.95
D(A,B)→(D,F) = min(dDA, dDB, dFA, dFB) = min(3.61, 2.92, 3.20, 2.50) = 2.50
D(A,B)→E = min(dAE, dBE) = min(4.24, 3.54) = 3.54

Dist   A,B   C     D,F   E
A,B    0.00  4.95  2.50  3.54
C      4.95  0.00  2.24  1.41
D,F    2.50  2.24  0.00  1.00
E      3.54  1.41  1.00  0.00
Merge two closest clusters/Update Distance Matrix

Agglomerative Clustering - Example

Dist   A,B   C     D,F   E
A,B    0.00  4.95  2.50  3.54
C      4.95  0.00  2.24  1.41
D,F    2.50  2.24  0.00  1.00
E      3.54  1.41  1.00  0.00

The smallest distance is d((D,F), E) = 1.00; merging gives:

Dist      (A,B)  C     (D,F),E
(A,B)     0.00   4.95  2.50
C         4.95   0.00  1.41
(D,F),E   2.50   1.41  0.00
Merge two closest clusters/Update Distance Matrix

Agglomerative Clustering - Example

Dist      (A,B)  C     (D,F),E
(A,B)     0.00   4.95  2.50
C         4.95   0.00  1.41
(D,F),E   2.50   1.41  0.00

The smallest distance is d(((D,F),E), C) = 1.41; merging gives:

Dist          (A,B)  ((D,F),E),C
(A,B)         0.00   2.50
((D,F),E),C   2.50   0.00
Final Result

Agglomerative Clustering - Example

     X1    X2
A    1     1
B    1.5   1.5
C    5     5
D    3     4
E    4     4
F    3     3.5

Data matrix
Dendrogram Representation

Agglomerative Clustering - Example

1. In the beginning we have 6 clusters: A, B, C, D, E and F.
2. We merge clusters D and F into (D, F) at distance 0.50.
3. We merge clusters A and B into (A, B) at distance 0.71.
4. We merge cluster E and (D, F) into ((D, F), E) at distance 1.00.
5. We merge cluster ((D, F), E) and C into (((D, F), E), C) at distance 1.41.
6. We merge clusters (((D, F), E), C) and (A, B) into
   ((((D, F), E), C), (A, B)) at distance 2.50.
7. The last cluster contains all the objects, thus concluding the
   computation.
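The same merge order falls out of SciPy's linkage routine (a sketch; the
slides do these steps by hand):

import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]])  # A..F
Z = linkage(X, method='single', metric='euclidean')
print(np.round(Z, 2))
# Each row is [cluster_i, cluster_j, merge_distance, size]; the merge
# distances come out as 0.50, 0.71, 1.00, 1.41, 2.50, matching steps 2-6.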
Single Link Clustering

[Figure: nested clusters on six points (left) and the corresponding
single-link dendrogram (right).]
Complete Link Clustering

[Figure: nested clusters on the same six points (left) and the corresponding
complete-link dendrogram (right).]
Average Link Clustering

[Figure: nested clusters on the same six points (left) and the corresponding
average-link dendrogram (right).]
Comparison

[Figure: the nested-cluster shapes produced by Single Link, Complete Link,
and Average Link on the same six points, shown side by side.]
Agglomerative Clustering

• Where to cut the tree?

[Figure: the same dendrogram cut at two different heights, yielding a
3-cluster model and a 2-cluster model.]
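Cutting the tree corresponds to choosing how many merges to keep; with the
linkage matrix Z from the earlier SciPy sketch this is one call:

from scipy.cluster.hierarchy import fcluster

labels_3 = fcluster(Z, t=3, criterion='maxclust')  # 3-cluster model
labels_2 = fcluster(Z, t=2, criterion='maxclust')  # 2-cluster model
print(labels_2)  # e.g. [1 1 2 2 2 2]: {A, B} vs {C, D, E, F}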
Thank You
