
Clustering

Unsupervised learning
• Unsupervised learning:
– Data have no target attribute; the goal is to describe hidden structure in unlabeled data.
– Explore the data to find intrinsic structures in it.
• Clustering: the task of grouping a set of objects in such a way that objects
in the same group (called a cluster) are more similar to each other than to
those in other clusters.
• Useful for
– Automatically organizing data.
– Understanding hidden structure in data.
– Preprocessing for further analysis.

Applications
• Biology: classification of plants and animals given their features
• Marketing: Customer Segmentation based on a database of
customer data containing their properties and past buying
records
• Clustering weblog data to discover groups of similar access
patterns.
• Recognize communities in social networks.
Aspects of clustering
• A clustering algorithm, such as
– Partitional clustering, e.g., k-means
– Hierarchical clustering
– Mixture of Gaussians
• A distance or similarity function
– such as Euclidean, Minkowski, cosine
• Clustering quality
– Inter-cluster distance → maximized
– Intra-cluster distance → minimized
The quality of a clustering result depends on the algorithm, the distance function, and the application.
Partitioning Algorithms
• Partitioning method: construct a partition of a database D of m objects into a set of k clusters
• Given k, find a partition into k clusters that optimizes the chosen partitioning criterion
– Global optimum: exhaustively enumerate all partitions
– Heuristic method: k-means (MacQueen, 1967)
Partitioning Algorithms
• Given k
• Construct a partition of m objects $X = \{x_1, x_2, \dots, x_m\}$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{in})$ is a vector in a real-valued space $X \subseteq \mathbb{R}^n$ and n is the number of attributes,
• into a set of k clusters $S = \{S_1, S_2, \dots, S_k\}$.
• The cluster mean $\mu_i$ serves as a prototype of the cluster $S_i$.
• Find the k clusters that optimize a chosen criterion
– E.g., the within-cluster sum of squares (WCSS): the sum of squared distances of each point in a cluster to the cluster mean

$$\arg\min_{S} \sum_{i=1}^{k} \sum_{x \in S_i} \|x - \mu_i\|^2$$

Heuristic method: k-means (MacQueen, 1967)


K-means algorithm
Given k:
1. Randomly choose k data points (seeds) to be the initial cluster centres.
2. Assign each data point to the closest cluster centre.
3. Re-compute the cluster centres using the current cluster memberships.
4. If a convergence criterion is not met, go to 2.

Stopping/convergence criterion (any of the following):
1. no re-assignments of data points to different clusters, OR
2. no (or minimal) change of centroids, OR
3. minimal decrease in the sum of squared error

$$SSE = \sum_{i=1}^{k} \sum_{x \in S_i} \|x - \mu_i\|^2$$
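A minimal NumPy sketch of the four steps above (the function and parameter names are illustrative, not from the slides):

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """K-means as described above. X: (m, n) array, k: number of clusters."""
    rng = np.random.default_rng(seed)
    # 1. Randomly choose k data points (seeds) as the initial cluster centres.
    centres = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iters):
        # 2. Assign each data point to the closest cluster centre.
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Re-compute the cluster centres from the current memberships.
        new_centres = np.array([X[labels == j].mean(axis=0)
                                if np.any(labels == j) else centres[j]
                                for j in range(k)])
        # 4. Convergence criterion: no (or minimal) change of centroids.
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    sse = ((X - centres[labels]) ** 2).sum()  # sum of squared error
    return labels, centres, sse

labels, centres, sse = kmeans(np.random.rand(200, 2), k=3)
```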
K-means illustrated
K-medoids
Medoids
Medoids are representative objects of a dataset, or of a cluster within a dataset, whose sum of distances to the other objects in the cluster is minimal.
K-Means uses centroids. The relationship between centroids and medoids is similar to the relationship between means and medians: medoids and medians are always one of the observations in the data, while that is not necessarily the case for centroids and means.
The main difference between K-Means and K-Medoids is that K-Means forms clusters based on the distance of observations to each centroid, while K-Medoids forms clusters based on the distance to medoids.
Algorithm (see the sketch below)
• Select k initial medoids randomly
• Repeat:
– re-assign each object to its nearest medoid
– swap a medoid m with a non-medoid object o, if the swap improves the clustering quality
• Until a convergence criterion is satisfied
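A naive PAM-style sketch of this swap loop, assuming Euclidean distances and an exhaustive swap search (illustrative only; names are not from the slides):

```python
import numpy as np

def kmedoids(X, k, max_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    m = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    medoids = rng.choice(m, size=k, replace=False)

    def cost(meds):
        # each object joins its nearest medoid; cost is the total distance
        return D[:, meds].min(axis=1).sum()

    for _ in range(max_iters):
        best, improved = cost(medoids), False
        # try swapping each medoid with each non-medoid object
        for i in range(k):
            for o in range(m):
                if o in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = o
                if cost(trial) < best:
                    best, medoids, improved = cost(trial), trial, True
        if not improved:  # convergence: no swap improves the clustering
            break
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels
```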
Hierarchical Clustering
Dendrogram:
Example
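As an illustration of a dendrogram, SciPy can build and plot the merge tree of an agglomerative (hierarchical) clustering; the toy data here is made up:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# two well-separated blobs of 10 points each
X = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(5, 1, (10, 2))])

Z = linkage(X, method="ward")  # agglomerative merge tree
dendrogram(Z)
plt.title("Dendrogram")
plt.show()
```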
Similarity / Distance measures
• Distance metrics (scale-dependent)
– Minkowski family of distance measures

$$d(x_i, x_j) = \left( \sum_{s=1}^{n} |x_{is} - x_{js}|^p \right)^{1/p}$$

Manhattan distance (p = 1), Euclidean distance (p = 2)
– Cosine distance
Similarity / Distance measures
• Correlation coefficients (scale-invariant)
• Mahalanobis distance

$$d(x_i, x_j) = \sqrt{(x_i - x_j)^T \, \Sigma^{-1} \, (x_i - x_j)}$$

• Pearson correlation

$$r(x_i, x_j) = \frac{\mathrm{Cov}(x_i, x_j)}{\sigma_{x_i} \, \sigma_{x_j}}$$
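The measures above in code (the SciPy distance functions are real; the toy vectors and the sample covariance data are assumptions for illustration):

```python
import numpy as np
from scipy.spatial.distance import minkowski, cityblock, euclidean, cosine, mahalanobis

xi = np.array([1.0, 2.0, 3.0])
xj = np.array([2.0, 4.0, 6.0])  # a scaled copy of xi

print(cityblock(xi, xj))       # Minkowski p=1 (Manhattan)
print(euclidean(xi, xj))       # Minkowski p=2 (Euclidean)
print(minkowski(xi, xj, p=3))  # general Minkowski
print(cosine(xi, xj))          # cosine distance: 0.0, scale-invariant

# Mahalanobis needs the inverse covariance of the data the points come from
data = np.random.default_rng(0).normal(size=(100, 3))
VI = np.linalg.inv(np.cov(data.T))
print(mahalanobis(xi, xj, VI))

# Pearson correlation: 1.0 here, since xj = 2 * xi (scale-invariance)
print(np.corrcoef(xi, xj)[0, 1])
```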
Convergence of K-Means
• Re-computation monotonically decreases each squared error, since ($m_j$ is the number of members in cluster j) $\sum \|x_i - a\|^2$ reaches its minimum where the derivative with respect to $a$ vanishes:

$$\sum -2 (x_i - a) = 0 \;\Rightarrow\; \sum x_i = \sum a = m_j \, a \;\Rightarrow\; a = \frac{1}{m_j} \sum x_i = c_j$$

i.e., the minimizer is the centroid $c_j$.
• K-means typically converges quickly.
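A quick numeric check of this fact (toy numbers, not from the slides): the sum of squared distances to a point a is minimized when a is the mean.

```python
import numpy as np

pts = np.array([1.0, 3.0, 7.0])
candidates = np.linspace(0.0, 8.0, 8001)        # grid of candidate values for a
sse = ((pts[:, None] - candidates[None, :]) ** 2).sum(axis=0)
print(candidates[sse.argmin()])  # ~3.667, the grid point nearest the minimizer
print(pts.mean())                # 3.666..., the centroid
```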


Time Complexity
• Computing distance between two items is O(n)
where n is the dimensionality of the vectors.
• Reassigning clusters: O(km) distance
computations, or O(kmn).
• Computing centroids: Each item gets added
once to some centroid: O(mn).
• Assume these two steps are each done once
for t iterations: O(tknm).
Advantages
• Fast, robust, and easy to understand.
• Relatively efficient: O(tkmn)
• Gives the best results when the clusters are distinct or well separated from each other.
Disadvantages
• Requires a priori specification of the number of cluster centres.
• Hard assignment of data points to clusters.
• Euclidean distance measures can unequally weight underlying factors.
• Applicable only when a mean is defined, i.e., it fails for categorical data.
• Finds only a local optimum.
K-Means on an RGB image
Each pixel is a feature vector $x_i = (r_i, g_i, b_i)$. The classifier (K-Means) maps each pixel to a cluster, $x_1 \to C(x_1)$, $x_2 \to C(x_2)$, ..., $x_i \to C(x_i)$, giving the classification results. Each cluster $C_j$ is described by its cluster parameters $\theta_j$: $\theta_1$ for $C_1$, $\theta_2$ for $C_2$, ..., $\theta_k$ for $C_k$.
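A sketch of this pipeline using scikit-learn's KMeans for colour quantization (assumes Pillow is installed; the file names are placeholders):

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

img = np.asarray(Image.open("photo.jpg").convert("RGB"))  # placeholder file name
h, w, _ = img.shape
pixels = img.reshape(-1, 3).astype(float)  # each row is one x_i = (r_i, g_i, b_i)

k = 8
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)

# replace every pixel by the centre of its cluster C(x_i)
quantized = km.cluster_centers_[km.labels_].reshape(h, w, 3).astype(np.uint8)
Image.fromarray(quantized).save("quantized.png")
```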

[Figures: examples from Christopher M. Bishop's book. BCS Summer School, Exeter, 2003 © Christopher M. Bishop]
