07Clustering

Uploaded by

hussienayman366

Cluster Analysis

What is Cluster Analysis?

• Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups.
[Figure: intra-cluster distances are minimized; inter-cluster distances are maximized.]
What is Cluster Analysis?
• Cluster: a collection of data objects
  • Similar to one another within the same cluster
  • Dissimilar to the objects in other clusters
• Cluster analysis
  • Grouping a set of data objects into clusters
• Clustering is unsupervised classification: no predefined classes
• Clustering is used:
  • As a stand-alone tool to get insight into data distribution
    • Visualization of clusters may unveil important information
  • As a preprocessing step for other algorithms
    • Efficient indexing or compression often relies on clustering
Some Applications of Clustering
What Is Good Clustering?
• A good clustering method will produce high-quality clusters with
  • high intra-class similarity
  • low inter-class similarity
• The quality of a clustering result depends on both the similarity measure used by the method and its implementation.
• The quality of a clustering method is also measured by its ability to discover some or all of the hidden patterns.
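One way to make "high intra-class similarity, low inter-class similarity" concrete is to compare average distances within and between clusters. The sketch below is only an illustration: the function names and the toy data are assumptions, and Euclidean distance is used as the (dis)similarity measure.

```python
import math
from itertools import combinations

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_intra_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    dists = [euclidean(p, q)
             for pts in clusters
             for p, q in combinations(pts, 2)]
    return sum(dists) / len(dists)

def mean_inter_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    dists = [euclidean(p, q)
             for c1, c2 in combinations(clusters, 2)
             for p in c1
             for q in c2]
    return sum(dists) / len(dists)

# Hypothetical toy data: two tight, well-separated groups.
clusters = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
]
intra = mean_intra_distance(clusters)
inter = mean_inter_distance(clusters)
print(f"mean intra-cluster distance: {intra:.2f}")
print(f"mean inter-cluster distance: {inter:.2f}")
```

For a good clustering, the intra-cluster figure should be small relative to the inter-cluster figure.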
Requirements of Clustering in Data Mining
• Scalability
• Ability to deal with different types of attributes
• Discovery of clusters with arbitrary shape
• Minimal requirements for domain knowledge to determine input parameters
• Able to deal with noise and outliers
• Insensitive to order of input records
• High dimensionality
• Incorporation of user-specified constraints
• Interpretability and usability
Clustering Algorithms
Four of the most widely used clustering algorithms
Distance Measures
K-Means Clustering Algorithm
K-means Clustering

• Partitional clustering approach
• Each cluster is associated with a centroid (center point)
• Each point is assigned to the cluster with the closest centroid
• Number of clusters, K, must be specified
• The basic algorithm is very simple
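The basic algorithm described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the `seed` and `max_iter` parameters, and the sample points are all assumptions, and squared Euclidean distance is used to find the closest centroid.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance; enough for finding the closest centroid."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # pick K initial centroids at random
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:           # no point changed cluster: stop
            break
        labels = new_labels
        # Update step: recompute each centroid as the mean of its points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                    # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    return centroids, labels

# Hypothetical sample data: two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

Which cluster receives index 0 depends on the random initial centroids, so only the grouping of the points is meaningful, not the label values themselves.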
K-means Clustering – Details
• Initial centroids are often chosen randomly.
  • Clusters produced vary from one run to another.
• The centroid is (typically) the mean of the points in the cluster.
• 'Closeness' is measured by Euclidean distance, cosine similarity, correlation, etc.
• Most of the convergence happens in the first few iterations.
  • Often the stopping condition is changed to 'Until relatively few points change clusters'
• Complexity is O(n * K * I * d)
  • n = number of points, K = number of clusters, I = number of iterations, d = number of attributes
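The three closeness measures named above can be written out as follows. This is a sketch with assumed function names and made-up vectors; "correlation" is taken here to mean the Pearson correlation coefficient, which is one common reading.

```python
import math

def euclidean_distance(p, q):
    """Smaller means closer."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cosine_similarity(p, q):
    """Cosine of the angle between p and q; larger means closer.
    Undefined (division by zero) if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def pearson_correlation(p, q):
    """Cosine similarity of the mean-centered vectors; larger means closer.
    Undefined if either vector is constant."""
    mean_p = sum(p) / len(p)
    mean_q = sum(q) / len(q)
    centered_p = [a - mean_p for a in p]
    centered_q = [b - mean_q for b in q]
    return cosine_similarity(centered_p, centered_q)

# Hypothetical vectors: q points in the same direction as p, r runs the opposite way.
p = (1.0, 2.0, 3.0)
q = (2.0, 4.0, 6.0)
r = (3.0, 2.0, 1.0)

print(euclidean_distance(p, q))
print(cosine_similarity(p, q))
print(pearson_correlation(p, r))
```

Note that Euclidean distance is a dissimilarity (assign to the minimum), while cosine similarity and correlation are similarities (assign to the maximum).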
Two different K-means Clusterings
[Figure: three scatter plots of the same data (x from -2 to 2, y from 0 to 3): Original Points, Optimal Clustering, and Sub-optimal Clustering.]
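Because different random starts can end in different clusterings, a common precaution is to run K-means several times and keep the run with the lowest sum of squared errors (SSE). The sketch below assumes that approach; the helper names, the number of runs, and the data are hypothetical and are not taken from the slides.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_once(points, k, rng, max_iter=100):
    """One K-means run from random initial centroids; returns (sse, labels)."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # SSE: squared distance of every point to the centroid of its cluster.
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return sse, labels

def best_of_runs(points, k, n_runs=10, seed=0):
    """Repeat K-means with different random starts and keep the lowest-SSE run."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(n_runs)]
    best = min(runs, key=lambda run: run[0])
    return best, runs

# Hypothetical data: three loose groups along the x axis.
points = [(-1.5, 1.0), (-1.2, 1.4), (-1.6, 1.6),
          (0.0, 0.5), (0.2, 0.9), (-0.2, 0.8),
          (1.4, 1.5), (1.6, 1.1), (1.3, 1.9)]
(best_sse, best_labels), runs = best_of_runs(points, k=3)
print("SSE of each run:", [round(sse, 3) for sse, _ in runs])
print("best SSE:", round(best_sse, 3))
```

If the printed SSE values differ between runs, the higher ones correspond to sub-optimal clusterings of the kind shown in the figure.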

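How does the K-means clustering algorithm work?
Example of K-Means Clustering
The iteration stops when $G_i = G_{i+1}$, that is, when the grouping at iteration i+1 is identical to the grouping at iteration i and no object changes group anymore.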
Hierarchical Clustering Algorithms
How They Work
Step 3 can be done in different ways:
Example
How do we calculate the distance between the newly formed cluster (D,F) and the other clusters?
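The answer depends on the linkage criterion. The sketch below shows three standard choices (single, complete, and average linkage) for updating the distance from a merged cluster to another cluster. The function name and the numeric distances are hypothetical; the slides do not give the values for D, F, or the other objects here.

```python
def merged_distance(d_a, d_b, linkage="single", size_a=1, size_b=1):
    """Distance from the merged cluster (a, b) to some other cluster X,
    given d_a = d(a, X) and d_b = d(b, X)."""
    if linkage == "single":      # nearest neighbour: smallest of the two
        return min(d_a, d_b)
    if linkage == "complete":    # farthest neighbour: largest of the two
        return max(d_a, d_b)
    if linkage == "average":     # size-weighted mean of the two
        return (size_a * d_a + size_b * d_b) / (size_a + size_b)
    raise ValueError(f"unknown linkage: {linkage}")

# Hypothetical distances from D and from F to some other object A.
d_DA = 3.0
d_FA = 4.0

for linkage in ("single", "complete", "average"):
    print(linkage, merged_distance(d_DA, d_FA, linkage))
```

With single linkage, for example, the distance from (D,F) to A is simply the smaller of d(D,A) and d(F,A).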
Assignment
A hierarchical clustering of distances in kilometers between some Italian
cities. The method used is single-linkage.
Input distance matrix (L = 0 for all the clusters):

The process is summarized by the following hierarchical tree.
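For checking a hand computation, the single-linkage procedure can be sketched as a short program that repeatedly merges the two closest clusters and records the level (distance) of each merge. Everything concrete here is an assumption: the item names P, Q, R, S and their distances are invented placeholders, not the Italian-cities matrix from the assignment.

```python
def single_linkage(items, dist):
    """Agglomerative single-linkage clustering.

    items: list of item names.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of merges as (cluster_1, cluster_2, level).
    """
    clusters = [(name,) for name in items]   # every item starts as its own cluster

    def cluster_distance(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(dist[frozenset((a, b))] for a in c1 for b in c2)

    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        level = cluster_distance(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], level))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical placeholder distances between four items.
items = ["P", "Q", "R", "S"]
dist = {
    frozenset(("P", "Q")): 2.0,
    frozenset(("P", "R")): 6.0,
    frozenset(("P", "S")): 10.0,
    frozenset(("Q", "R")): 5.0,
    frozenset(("Q", "S")): 9.0,
    frozenset(("R", "S")): 4.0,
}
merges = single_linkage(items, dist)
for left, right, level in merges:
    print(f"merge {left} + {right} at level {level}")
```

The sequence of merge levels is what the hierarchical tree (dendrogram) displays on its height axis.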
