Clustering
Assistant Professor
UIET, Panjab University
Chandigarh, India
[email protected]
Chandigarh, India
[email protected]
Abstract: Clustering is an important phase in data mining. A
number of different clustering methods are used to perform cluster
analysis, including partitioning, hierarchical, grid-based,
model-based, graph-based and density-based clustering. The
hierarchical method clusters the data objects in the form of a tree
known as a hierarchy, and each node in the hierarchy is a cluster.
Hierarchical clustering can be performed in two ways: agglomerative
clustering and divisive clustering. Agglomerative clustering is
generally the preferred approach. For a good cluster analysis, the
quality of the clusters should be high. In this paper, we measure the
quality of clusters using three parameters: cohesion measurement,
silhouette index and elapsed time.
Keywords: Data mining, Clustering, Hierarchical Clustering,
Quality, Quality parameters, Cohesion, Silhouette index, Elapsed
time.
I. INTRODUCTION
II. RELATED WORK
III. CLUSTERING METHOD
A. Hierarchical Clustering
Hierarchical clustering is a method in which clusters are
organized as a tree, or hierarchy. Every node in the tree
represents a cluster, and the tree itself is known as a
dendrogram. Hierarchical clustering can be performed in two
ways, depending on whether clusters are split or merged: the
divisive method and the agglomerative method.
The divisive method of hierarchical clustering, also known as
the top-down approach, starts with the whole data set as one
large cluster and repeatedly divides it into smaller subsets
(clusters) until a threshold is reached [7].
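The top-down splitting described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in this paper: the split rule (seeding each split with the two farthest points), the stopping threshold (a maximum cluster diameter) and the names `diameter`, `split` and `divisive` are all assumptions introduced for the example.

```python
import math
from itertools import combinations

def diameter(cluster, points):
    """Largest pairwise distance inside a cluster (0.0 for a singleton)."""
    return max((math.dist(points[i], points[j])
                for i, j in combinations(cluster, 2)), default=0.0)

def split(cluster, points):
    """Assumed split rule: seed with the two farthest points, then
    assign every member to the nearer seed."""
    a, b = max(combinations(cluster, 2),
               key=lambda ij: math.dist(points[ij[0]], points[ij[1]]))
    left = [i for i in cluster
            if math.dist(points[i], points[a]) <= math.dist(points[i], points[b])]
    right = [i for i in cluster if i not in left]
    return left, right

def divisive(points, max_diameter):
    """Start with one cluster holding every point and keep splitting
    until each cluster's diameter is at most max_diameter (assumed >= 0)."""
    pending = [list(range(len(points)))]
    finished = []
    while pending:
        cluster = pending.pop()
        if len(cluster) == 1 or diameter(cluster, points) <= max_diameter:
            finished.append(cluster)
        else:
            pending.extend(split(cluster, points))
    return finished

# Hypothetical toy data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters = divisive(points, max_diameter=2.0)
print(sorted(sorted(c) for c in clusters))
```

Other split rules and stopping criteria (for example a target number of clusters) fit the same loop; the diameter threshold is used here only because it is easy to state.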
The agglomerative method works in the reverse direction of the
divisive method. It starts with a number of small clusters
(typically one per data object) and, at each step, merges the
two clusters that are most similar to each other [7]. These
clusters are merged together until a large cluster is formed.
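As a minimal sketch of the bottom-up merging described above, the following Python code starts with one cluster per point and repeatedly merges the closest pair. The similarity measure (single linkage over Euclidean distance), the `num_clusters` stopping parameter and the function names are assumptions made for illustration; the text does not state which linkage the paper uses.

```python
import math
from itertools import combinations

def single_link(a, b, points):
    """Assumed similarity: smallest Euclidean distance between any
    member of cluster a and any member of cluster b."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)

def agglomerative(points, num_clusters=1):
    """Merge the two closest clusters until num_clusters remain.
    Returns the final clusters and the list of merges performed."""
    clusters = [[i] for i in range(len(points))]  # one cluster per point
    merges = []
    while len(clusters) > num_clusters:
        # Pick the pair of clusters with the smallest linkage distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: single_link(clusters[ij[0]],
                                              clusters[ij[1]], points))
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges

# Hypothetical toy data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, merges = agglomerative(points, num_clusters=3)
print(sorted(sorted(c) for c in clusters))
```

On this toy data, stopping at three clusters leaves the index groups [0, 1], [2, 3] and [4]; with the default `num_clusters=1` the loop runs until one large cluster remains, and the recorded `merges` give the order in which the hierarchy was built. This brute-force search is quadratic in the number of clusters per step and is meant for reading, not for large data sets.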
IV. QUALITY MEASUREMENT
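The abstract names three quality parameters: cohesion, silhouette index and elapsed time. The sketch below shows one way each could be computed in Python. The cohesion definition used here (sum of squared distances from each point to its cluster centroid) is an assumption, since the paper's own formula is not available in this text; the silhouette index follows the standard definition s(i) = (b - a) / max(a, b), and the singleton convention s(i) = 0 is likewise an assumption.

```python
import math
import time

def centroid(cluster, points):
    """Mean position of a cluster, returned as a tuple."""
    dim = len(points[0])
    return tuple(sum(points[i][d] for i in cluster) / len(cluster)
                 for d in range(dim))

def cohesion(clusters, points):
    """Assumed definition: total squared distance of every point to the
    centroid of its own cluster (lower means tighter clusters)."""
    total = 0.0
    for cluster in clusters:
        mu = centroid(cluster, points)
        total += sum(math.dist(points[i], mu) ** 2 for i in cluster)
    return total

def silhouette_index(clusters, points):
    """Mean silhouette over all points; needs at least two clusters
    and assumes the points are not all identical."""
    scores = []
    for cluster in clusters:
        for i in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # assumed convention for singletons
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(points[i], points[j])
                    for j in cluster if j != i) / (len(cluster) - 1)
            # b: smallest mean distance to the members of another cluster
            b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                    for other in clusters if other is not cluster)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy clustering: two tight, well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
clusters = [[0, 1], [2, 3]]

start = time.perf_counter()          # elapsed time: wall-clock seconds
coh = cohesion(clusters, points)
sil = silhouette_index(clusters, points)
elapsed = time.perf_counter() - start

print(f"cohesion={coh:.4f} silhouette={sil:.4f} elapsed={elapsed:.6f}s")
```

A silhouette index close to 1 indicates compact, well-separated clusters, while values near 0 or below indicate overlapping or misassigned points. In an actual experiment the timer would normally wrap the clustering run itself rather than the metric computation shown here.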
V. EXPERIMENTAL DESIGN
Data set size    Value
(unlabeled)      0.092
200x3            0.010
300x3            0.012
400x3            0.020
500x3            0.031

Data set size    Value
100x3            0.8210
200x3            0.8294
300x3            0.8219
400x3            0.8253
500x3            0.8215
VII. CONCLUSION
References
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15] spike sorting by unsupervised clustering with diffusion maps
and silhouettes, Neurocomputing, Vol. 153, pp. 199-210, April 2015.
[16] J. Han and M. Kamber, Data Mining: Concepts and Techniques,
Morgan Kaufmann/Elsevier, India.