
Neural Comput & Applic
DOI 10.1007/s00521-013-1439-2

REVIEW

The latest research progress on spectral clustering


Hongjie Jia • Shifei Ding • Xinzheng Xu • Ru Nie

Received: 3 January 2013 / Accepted: 23 May 2013


© Springer-Verlag London 2013

H. Jia • S. Ding (✉) • X. Xu • R. Nie
School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
e-mail: [email protected]

S. Ding
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Science, Beijing 100190, China

Abstract  Spectral clustering is a clustering method based on algebraic graph theory. It has aroused extensive attention in academia in recent years, due to its solid theoretical foundation as well as its good clustering performance. This paper introduces the basic concepts of graph theory and reviews the main matrix representations of a graph, then compares the objective functions of typical graph cut methods and explores the nature of the spectral clustering algorithm. We also summarize the latest research achievements of spectral clustering and discuss several key issues in spectral clustering, such as how to construct the similarity matrix and the Laplacian matrix, how to select eigenvectors, how to determine the cluster number, and the applications of spectral clustering. At last, we propose several valuable research directions in light of the deficiencies of spectral clustering algorithms.

Keywords  Spectral clustering • Graph theory • Graph cut • Laplacian matrix • Eigen-decomposition

1 Introduction

Clustering is an important research field in data mining. The purpose of clustering is to divide a dataset into natural groups so that data points in the same group are similar, while data points in different groups are dissimilar to each other [56]. Traditional clustering methods, such as the k-means algorithm [30, 41], the FCM algorithm [21], and the EM algorithm [13], are simple, but lack the ability to handle complex data structures. When the sample space is non-convex, these algorithms easily fall into local optima [29].

In recent years, spectral clustering has attracted more and more attention in academia, due to its good clustering performance and solid theoretical foundation [47]. Spectral clustering does not make any assumptions on the global structure of the data. It can converge to the global optimum and performs well for sample spaces of arbitrary shape, being especially suitable for non-convex datasets [16]. The idea of spectral clustering is based on spectral graph theory. It treats the data clustering problem as a graph partitioning problem and constructs an undirected weighted graph with each point in the dataset being a vertex and the similarity value between any two points being the weight of the edge connecting the two vertices [8]. Then, we can decompose the graph into connected components by a certain graph cut method and call those components clusters.


There are a variety of traditional graph cut methods, such as minimum cut, ratio cut, normalized cut and min/max cut. The optimal clustering results can be obtained by minimizing or maximizing the objective function of the graph cut method. However, for the various graph cut methods, seeking the optimal solution of the objective function is often NP-hard. With the help of spectral methods, the original problem can be solved in polynomial time by relaxing the original discrete optimization problem to the real domain [17]. In graph partitioning, a point can be considered as belonging partly to subset A and partly to subset B, rather than strictly belonging to one cluster. It can be proved that the classification information of the vertices is contained in the eigenvalues and eigenvectors of the graph Laplacian matrix, and we can get good clustering results if we make full use of this classification information during the clustering process [35]. Spectral clustering obtains the relaxed solution of the graph cut objective function, which is an approximate optimal solution.

The earliest study on spectral clustering began in 1973. Donath and Hoffman put forward that graph partitioning can be built based on the eigenvectors of the adjacency matrix [18]. In the same year, Fiedler proved that the bipartition of a graph is closely related to the second eigenvector of the Laplacian matrix [23]. He suggested using this eigenvector to conduct graph partitioning. Hagen and Kahng found the relations among clustering, graph partitioning and the eigenvectors of the similarity matrix, and were the first to construct a practical algorithm [25]. They proposed the ratio cut method in 1992. In 2000, Shi and Malik proposed normalized cut [55]. This method considers not only the external connections between clusters but also the internal connections within a cluster, so it can produce balanced clustering results. Then, Ding et al. [14] proposed min/max cut, and Ng et al. [51] proposed the classic NJW algorithm. These algorithms classify data points based on matrix spectral theory, so they are called spectral clustering. Since 2000, spectral clustering has gradually become a research hotspot in data mining. At present, spectral clustering has been successfully applied in many fields, such as computer vision [42, 76], integrated circuit design [2], load balancing [20, 27], biological information [33, 52], and text classification [69]. Spectral clustering algorithms provide a new idea for solving the clustering problem and can effectively deal with many practical problems, so their research has great scientific value and application potential.

This paper is organized as follows: Sect. 2 introduces the basic concepts of algebraic graph theory, compares the objective functions of typical graph cut methods, and explores the nature of the spectral clustering algorithm; Sect. 3 summarizes the latest research achievements of spectral clustering and discusses several key issues in spectral clustering, such as how to construct the similarity matrix and the Laplacian matrix, how to select eigenvectors, how to determine the cluster number, and the applications of spectral clustering; finally, several valuable research directions are proposed in light of the deficiencies of spectral clustering algorithms.

2 Basic concepts of spectral clustering

2.1 Algebraic graph theory

Graph theory, which originated in the famous problem of the Konigsberg Seven Bridges, is an important branch of mathematics. It is the study of theories and methods about graphs. Algebraic graph theory is a cross-field combining graph theory, linear algebra, and matrix computation theory. As one of the branches of graph theory, research on algebraic graph theory began in the 1850s, aiming to use algebraic methods to study graphs, convert graph characteristics into algebraic characteristics, and then use these algebraic characteristics and methods to deduce theorems about graphs [19]. In fact, the main content of algebraic graph theory is the spectrum. Here, the spectrum refers to the eigenvalues of a matrix and their multiplicities. The earliest research on algebraic graph theory was made by Fiedler [23]. He derived an algebraic criterion for the connectivity of a graph: whether a graph is connected or not can be judged by the second smallest eigenvalue of its Laplacian matrix. Later, the eigenvector corresponding to the second smallest eigenvalue was named the Fiedler vector, which contains the instruction information for dividing a graph into two parts.

The adjacency matrix (denoted as A) and the Laplacian matrix (denoted as L) are commonly used representations of a graph. The adjacency matrix of a weighted graph uses real numbers to reflect the different relations between vertices. The Laplacian matrix is L = D - A, where D is a diagonal matrix whose diagonal values equal the absolute row sums of A and whose non-diagonal elements are 0. Most spectral clustering algorithms split graphs based on the spectrum of the Laplacian matrix. There are two kinds of Laplacian matrices: the un-normalized Laplacian matrix (L) and the normalized Laplacian matrix. The normalized Laplacian matrix includes a symmetric form (denoted as Ls) and a random walk form (denoted as Lr). Their expressions are shown in Table 1.

Table 1 Graph matrices for spectral clustering

Graph matrix | Expression | Bipartition of a graph | Multi-partition of a graph
Laplacian matrix | L = D - A; Lr = D^{-1} L; Ls = D^{-1/2} L D^{-1/2} | Based on the Fiedler vector | Based on multiple main eigenvectors
Probability transition matrix | P = D^{-1} A | Based on the eigenvector of the second largest eigenvalue | Based on multiple main eigenvectors
Modularity matrix | B = A - d d^T / (2m) | Based on the eigenvector of the largest eigenvalue | Based on multiple main eigenvectors
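To make the matrix forms in Table 1 concrete, the following is a minimal sketch (Python with NumPy) of how the un-normalized and normalized Laplacian matrices can be built from a weighted adjacency matrix; the small four-vertex graph and all variable names are illustrative assumptions, not taken from the paper.

import numpy as np

# Illustrative weighted adjacency matrix of a small undirected graph
# (two strongly connected pairs of vertices, weakly linked to each other).
A = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 0.8],
              [0.0, 0.1, 0.8, 0.0]])

d = A.sum(axis=1)                          # vertex degrees (row sums of A)
D = np.diag(d)

L = D - A                                  # un-normalized Laplacian
L_rw = np.linalg.inv(D) @ L                # random walk form, Lr = D^-1 L
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt @ L @ D_inv_sqrt        # symmetric form, Ls = D^-1/2 L D^-1/2

# The second smallest eigenvalue of L indicates connectivity, and the signs
# of the corresponding eigenvector (the Fiedler vector) suggest a bipartition.
eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]
print(np.round(eigvals, 3), np.sign(fiedler))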


Mohar introduced some characteristics of the un-normalized Laplacian matrix [46]. Shi and Malik studied the characteristics of the normalized Laplacian matrix [55]. The spectrum of the Laplacian matrix provides very useful information for graph partitioning. Based on the Fiedler vector, we can divide a graph into two parts; based on multiple main eigenvectors, we can divide a graph into k parts. Luxburg made a comprehensive summary of the characteristics of the Laplacian matrix [60]. Whether we should use the un-normalized or the normalized Laplacian matrix is still under discussion. Ng et al. [51] provided evidence that the normalized Laplacian matrix is more suitable for spectral clustering, which means that the performance of normalized spectral clustering is better than that of un-normalized spectral clustering. Higham and Kibble pointed out that under certain conditions, un-normalized spectral clustering can produce better clustering results [28]. However, Luxburg et al. [40] proved that, from the view of statistical consistency theory, the normalized Laplacian matrix is superior to the un-normalized Laplacian matrix.

Table 1 also shows the expression of the probability transition matrix (denoted as P). The probability transition matrix is essentially the normalized form of the similarity matrix. Since the row sums of the normalized similarity matrix are 1, the elements of P can be understood as Markov transition probabilities. The larger the transition probability between two nodes, the greater the possibility that they belong to the same cluster. The spectrum of the probability transition matrix also contains the information necessary to split a graph, but it is slightly different from the spectrum of the Laplacian matrix. In the probability transition matrix, the eigenvector corresponding to the second largest eigenvalue can indicate the bipartition of a graph, and multiple eigenvectors corresponding to the main eigenvalues can indicate the multi-partition of a graph [43].

Another novel matrix is the modularity matrix (denoted as B), which comes from the study of community structure in complex networks [34, 48, 49]. It has a clear physical meaning, and its expression is shown in Table 1. In the expression, d represents the column vector whose elements are the degrees of the nodes; m represents the total weight of the graph edges; each element of B is the difference between the actual edge number and the expected edge number of the corresponding pair of nodes, which represents the extent to which the actual edge number exceeds the expected edge number. Therefore, this matrix leads directly to an objective function stating that the optimal partition should make the edges within communities (corresponding to "clusters") as dense as possible, preferably beyond expectation. As for matrix characteristics, the modularity matrix and the Laplacian matrix have some similarities; for example, their row sums (and column sums) are 0, and 0 is an eigenvalue. But they also have an obvious difference: the modularity matrix is not a positive semi-definite matrix, so some of its eigenvalues may be negative. As for graph partitioning, the bipartition of a network is based on the eigenvector of the largest eigenvalue, and the multi-partition of a network is based on multiple main eigenvectors.

2.2 Graph cut methods

Spectral clustering uses the similarity graph to deal with the problem of clustering. Its final purpose is to find a partition of the graph such that the edges between different groups have very low weights, which means that points in different clusters are dissimilar from each other, and the edges within a group have high weights, which means that points within the same cluster are similar to each other. The prototype of graph cut clustering is the minimum spanning tree (MST) method [59, 72]. The MST clustering method was proposed by Zahn [72]. This algorithm builds the minimum spanning tree from the adjacency matrix of the graph, and then removes the edges with large weights from the minimum spanning tree to get a set of connected components. This method is successful in detecting clearly separated clusters, but if the density of nodes changes, its performance deteriorates. Another disadvantage is that Zahn's research assumes the cluster structure (such as separating clusters, contact clusters, density clusters, etc.) is known in advance.

Cutting a graph means dividing it into multiple connected components by removing certain edges, and the sum of the weights of the removed edges is called the cut. Bames first proposed the minimum cut clustering criterion [6]. Its basic idea is to seek the minimum cut while dividing a graph into k connected sub-graphs. Then Alpert and Yao put forward a spectral method to solve the minimum cut criterion, which laid an important foundation for the later development of spectral clustering [3]. Wu and Leahy applied minimum cut to image segmentation and calculated the minimum cut based on maximum network flow theory [66]. Minimum cut clustering is successful in some applications of image segmentation, but its biggest problem is that it may lead to a seriously uneven split, such as a "solitary point" or a "small cluster". In order to solve this problem, Wei and Cheng proposed ratio cut [65], Sarkar and Soundararajan proposed average cut [54], Shi and Malik proposed normalized cut [55], and Ding et al. [14] proposed min/max cut. The expressions of their objective functions are shown in Table 2. These optimization objectives are able to produce a more balanced split.

Take graph bipartition for example. Assume V is a given set of data points, A represents a subset of V, and B represents V\A. For illustrative purposes, we define the following four terms (a small numerical illustration follows Table 2):

1. Cut(A, B) = Σ_{i∈A, j∈B} w_ij denotes the sum of connection weights between clusters A and B.
2. Cut(A, A) = Σ_{i∈A, j∈A} w_ij denotes the sum of connection weights within cluster A.
3. Vol(A) = Σ_{i∈A} d_i denotes the total degree of the vertices in cluster A.
4. |A| = n_A denotes the number of vertices in cluster A, which is used to describe the size of cluster A.

Table 2 Comparison of graph cut methods

References | Graph cut | Objective function | Advantage | Disadvantage
Bames [6] | Minimum cut | Mcut(A, B) = Cut(A, B) | The algorithm is simple and easy to implement | Does not consider the cluster size; may lead to a seriously uneven split
Wei and Cheng [65] | Ratio cut | Rcut(A, B) = Cut(A, B) / (|A| |B|) | Introduces the size of clusters, which reduces the possibility of over-splitting | Only focuses on reducing the similarity between clusters
Sarkar and Soundararajan [54] | Average cut | Acut(A, B) = Cut(A, B)/|A| + Cut(A, B)/|B| | Can produce more accurate classification | Only takes into account the connections between clusters, while ignoring the connections within a cluster
Shi and Malik [55] | Normalized cut | Ncut(A, B) = Cut(A, B)/Vol(A) + Cut(A, B)/Vol(B) | Takes into account both inter-cluster and intra-cluster connections | The algorithm efficiency is low, and it is unable to deal with the clustering of big data
Ding et al. [14] | Min/Max cut | Mmcut(A, B) = Cut(A, B)/Cut(A, A) + Cut(A, B)/Cut(B, B) | Tends to produce balanced clusters and can avoid clusters containing only a few vertices | The algorithm complexity is relatively high, with slow running speed
Newman [50] | Modularity | Q = (1/(2m)) Σ_{i,j} [A_ij - k_i k_j/(2m)] (v_i v_j + 1)/2 | Suitable for the partition of complex networks, and can efficiently find communities | Does not perform well when there is serious overlap between clusters


The study of complex networks has attracted great attention in the past decade. Complex networks possess a rich, multi-scale structure reflecting the dynamical and functional organization of the systems they model [44]. Newman systematically studied the spectral algorithm for community structure in non-weighted networks, weighted networks, as well as directed networks [34, 48, 50]. He uses the modularity function to detect communities. The modularity criterion is a novel idea. Take a non-weighted graph for example: only if the real proportion of edges within communities is greater than the "expected" proportion is the split considered to be reasonable. The "expected" edge number is derived from a random graph model, which is based on the configuration model. This is obviously different from the starting point of traditional graph cut clustering methods. The modularity function is shown in Table 2, where Q represents modularity, m represents the number of edges contained in the graph, k_i represents the degree of node i (k_j similarly), and v_i and v_j take the value -1 or 1: v_i ≠ v_j indicates that nodes i and j belong to different communities, and v_i = v_j indicates that nodes i and j belong to the same community.

Minimizing ratio cut, average cut, or normalized cut, and maximizing modularity, are all NP-hard discrete optimization problems. Fortunately, the spectral method can provide a relaxed solution within polynomial time for these optimization problems. Here, "relaxed" means relaxing the discrete optimization problem to the real number field, and then using some heuristic approach to re-convert the result to a discrete solution. The essence of graph partitioning can be summarized as the minimization or maximization of a matrix trace, and completing these minimization or maximization tasks relies on the spectral clustering algorithm.

Usually, most spectral clustering algorithms are formed of the following three stages: preprocessing, spectral representation, and clustering [26]. First, construct the graph and the similarity matrix to represent the dataset; then form the associated Laplacian matrix, compute the eigenvalues and eigenvectors of the Laplacian matrix, and map each point to a lower-dimensional representation based on one or more eigenvectors; at last, assign points to two or more classes based on the new representation. In order to partition a dataset or graph into k (k > 2) clusters, there are two basic approaches: recursive 2-way partitioning and k-way partitioning. The comparison of these two methods is shown in Table 3, and a minimal k-way sketch is given after the table.

Table 3 Comparison of recursive 2-way partitioning and k-way partitioning

Partition method | Basic idea | Advantage | Disadvantage
Recursive 2-way partitioning | Divide the graph into two parts by a certain 2-way partitioning algorithm, and then recursively apply the same procedure to the sub-graphs in a hierarchical way, until the number of clusters is enough or the recursive conditions are violated | The idea of this algorithm is simple and easy to realize by programming | This method is unstable, and costs a large amount of computation with low efficiency; it only utilizes the information of a single eigenvector (such as the Fiedler vector)
k-way partitioning | First select several main eigenvectors of the Laplacian matrix that contain classification information in a heuristic way, and then use these eigenvectors to map the original data points to a spectral space to conduct the clustering | Makes full use of the information of multiple eigenvectors; it has lower computational complexity and the clustering results are quite satisfactory | The optimization of its objective function is usually more difficult; it is not easy to select the appropriate eigenvectors
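The following is a minimal sketch of the k-way route described above (Python with NumPy/SciPy), in the spirit of NJW-style algorithms: build a similarity graph, form the normalized Laplacian, embed the points with k eigenvectors, and run k-means on the embedding. The Gaussian similarity, the parameter values and the use of scikit-learn's KMeans are illustrative assumptions, not the specific algorithm of any one paper surveyed here.

import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(X, k, sigma=1.0):
    # 1) Preprocessing: Gaussian similarity matrix (no self-loops).
    W = np.exp(-cdist(X, X, "sqeuclidean") / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # 2) Spectral representation: symmetric normalized Laplacian
    #    Ls = D^-1/2 (D - W) D^-1/2, and its k smallest eigenvectors.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, U = eigh(L_sym, subset_by_index=[0, k - 1])

    # Row-normalize the embedding (as in NJW) before clustering.
    U = U / np.linalg.norm(U, axis=1, keepdims=True)

    # 3) Clustering: k-means in the spectral embedding.
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)

# Example usage (X would be an n-by-d data matrix, e.g. two non-convex clusters):
# labels = spectral_clustering(X, k=2)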


3 The latest development of spectral clustering

Spectral clustering is a large family of grouping methods, and its research is very active in machine learning and data mining, because of the universality, efficiency, and theoretical support of spectral analysis. Next, we will discuss the latest development of spectral clustering from the following aspects: constructing the similarity matrix, forming the Laplacian matrix, selecting eigenvectors, the number of clusters, and the applications of spectral clustering.

3.1 Construct similarity matrix

The key to spectral clustering is to select a good distance measurement which can describe the intrinsic structure of the data points well. Data in the same group should have high similarity and follow space consistency. The similarity measurement is crucial to the performance of spectral clustering [62]. The Gaussian kernel function is usually adopted as the similarity measure. However, with a fixed scaling parameter σ, the similarity between two data points is only determined by their Euclidean distance and is not adaptive to their surroundings. When dealing with complex datasets, a similarity based simply on Euclidean distance cannot reflect the data distribution accurately, which in turn results in poor performance of spectral clustering.

Zhang et al. [74] propose a local density adaptive similarity measure, CNN (Common-Near-Neighbor), which uses the local density between two data points to scale the Gaussian kernel function. The CNN method is based on the following observation: if two points are distributed in the same cluster, they are in the same region, which has a relatively high density. It has the effect of amplifying intra-cluster similarity, thus making the affinity matrix clearly block diagonal. Experimental results show that the spectral clustering algorithm with the local density adaptive similarity measure outperforms the traditional spectral clustering algorithm, the path-based spectral clustering algorithm and the self-tuning spectral clustering algorithm.

Yang et al. [70] develop a density sensitive distance measure. This measure defines an adjustable line segment length, which can adjust the distance in regions with different density. It squeezes the distances in high density regions while widening them in low density regions. With this distance measure, they design a new similarity function for spectral clustering. Compared with spectral clustering based on the conventional Euclidean distance or Gaussian kernel function, the proposed algorithm with the density sensitive similarity measure can obtain desirable clusters with high performance on both synthetic and real life datasets.

Wang et al. [64] present spectral multi-manifold clustering (SMMC), based on the analysis that spectral clustering is able to work well when the similarity values of points belonging to different clusters are relatively low. In their model, the data are assumed to lie on or close to multiple smooth low-dimensional manifolds, where some data manifolds are separated but some are intersecting. Then, local geometric information of the sampled data is incorporated to construct a suitable similarity matrix. Finally, the spectral method is applied to this similarity matrix to group the data. SMMC achieves good performance over a broad range of parameter settings and is able to handle intersections, but its robustness remains to be improved.

In order to better describe the data distribution, Zhang and You propose a random walk based approach to process the Gaussian kernel similarity matrix [75]. In this method, the pairwise similarity between two data points is related not only to the two points but also to their neighbors. Li and Guo develop a new affinity matrix generation method using the neighbor relation propagation principle [36]. The generated affinity matrix can increase the similarity of point pairs that should be in the same cluster and can detect the structure of the data well. Blekas and Lagaris introduce Newtonian spectral clustering based on Newton's equations of motion [7]. They build an underlying interaction model for trajectory analysis and employ Newtonian preprocessing to gain valuable affinity information, which can be used to enrich the affinity matrix.
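As a concrete baseline for this subsection, the following sketch contrasts the fixed-σ Gaussian similarity with a simple locally scaled variant in which each point's scale is taken from the distance to its K-th nearest neighbor (in the spirit of the self-tuning spectral clustering mentioned above as a comparison method). It is an illustrative assumption, not the CNN measure of [74] or the density sensitive measure of [70], which are only summarized here; all names are hypothetical.

import numpy as np
from scipy.spatial.distance import cdist

def gaussian_similarity(X, sigma=1.0):
    # Fixed global scale: similarity depends only on Euclidean distance.
    D2 = cdist(X, X, "sqeuclidean")
    W = np.exp(-D2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def locally_scaled_similarity(X, K=7):
    # Per-point scale sigma_i = distance to the K-th nearest neighbor,
    # so similarities adapt to the local density around each point.
    D = cdist(X, X)
    sigma = np.sort(D, axis=1)[:, K]        # K-th neighbor distance (index 0 is the point itself)
    W = np.exp(-(D ** 2) / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)
    return W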


3.2 Form Laplacian matrix

After the similarity matrix is constructed, the next step is to form the corresponding Laplacian matrix according to the chosen graph cut method. There are three forms of the traditional Laplacian matrix, which are, respectively, suitable for different clustering conditions. The selection of the graph cut method and the construction of the Laplacian matrix both have an important impact on the performance of spectral clustering algorithms.

As a natural nonlinear generalization of the graph Laplacian, the p-Laplacian has recently been applied to two-class cases. Luo et al. [39] propose a full eigenvector analysis of the p-Laplacian and obtain a natural global embedding for multi-class clustering problems, instead of using the greedy search strategy implemented by previous researchers. An efficient gradient descent optimization approach is introduced to obtain the p-Laplacian embedding space, which is guaranteed to converge to feasible local solutions. Empirical results suggest that the greedy search method often fails in many real-world applications with non-trivial data structures, while this approach consistently gets robust clustering results and preserves the local smooth manifold structures of real-world data in the embedding space.

Yang et al. [71] propose a new image clustering algorithm, referred to as clustering using local discriminant models and global integration (LDMGI). A new Laplacian matrix is learnt in LDMGI by exploiting both manifold structure and local discriminant information. This algorithm constructs a local clique for each data point sampled from a nonlinear manifold, and uses a local discriminant model to evaluate the clustering performance of samples within the local clique. Then a unified objective function is proposed to globally integrate the local models of all the local cliques. Compared with normalized cut, LDMGI is more robust to its algorithmic parameter and is more appealing for real image clustering applications, in which the algorithmic parameters are generally not available for tuning.

Most graph Laplacians are based on the Euclidean distance, which does not necessarily reflect the inherent distribution of the data. So Xie et al. [68] propose a method to directly optimize the normalized graph Laplacian by using pairwise constraints. The learned graph is consistent with equivalence and non-equivalence pairwise relationships, and can therefore better represent the similarity between samples. Meanwhile, this approach automatically determines the scaling parameter during the optimization. The learned normalized Laplacian matrix can be directly applied in spectral clustering and semi-supervised learning algorithms.

Frederix and Van Barel use linear algebra techniques to solve the eigenvalue problem of a graph Laplacian and propose a novel sparse spectral clustering method [24]. This method exploits the structure of the Laplacian to construct an approximation, not in terms of a low-rank approximation but in terms of capturing the structure of the matrix. With this approximation, the size of the eigenvalue problem can be reduced. Chen and Feng present a novel k-way spectral clustering algorithm called discriminant cut (Dcut) [10]. It normalizes the similarity matrix with the corresponding regularized Laplacian matrix. Dcut can reveal the internal relationships among data and produce exciting clustering results.

3.3 Select eigenvectors

The eigenvalues and eigenvectors of the Laplacian matrix can be obtained by eigen-decomposition. An analysis of the characteristics of the eigenspace shows that: (a) not every eigenvector of a Laplacian matrix is informative and relevant for clustering; (b) eigenvector selection is critical, because using uninformative or irrelevant eigenvectors could lead to poor clustering results; (c) the corresponding eigenvalues cannot be used to select relevant eigenvectors given a realistic dataset. The NJW algorithm partitions data using the largest k eigenvectors of the normalized Laplacian matrix derived from the dataset. However, some experiments demonstrate that the top k eigenvectors cannot always detect the structure of the data in real pattern recognition problems. So it is necessary to find a better way to select eigenvectors for spectral clustering.

Xiang and Gong propose the concept of "eigenvector relevance", which differs from previous approaches in that only informative and relevant eigenvectors are employed for determining the number of clusters and performing clustering [67]. The key element of their algorithm is a simple but effective relevance learning method, which measures the relevance of an eigenvector according to how well it can separate the dataset into different clusters. Experimental results show that this algorithm is able to estimate the cluster number correctly and reveal natural groupings of the input data/patterns even given sparse and noisy data.

Zhao et al. [77] propose an eigenvector selection method based on entropy ranking for spectral clustering (ESBER). In this method, all the eigenvectors are first ranked according to their importance for clustering, and then a suitable eigenvector combination is obtained from the ranking list. There are two strategies for selecting eigenvectors from the ranking list: one is to directly adopt the first k eigenvectors in the ranking list, which are the most important eigenvectors among all the eigenvectors and differ from the largest k eigenvectors of the NJW method; the other is to search for a suitable eigenvector combination among the first km (km > k) eigenvectors in the ranking list, which can reflect the structure of the original data. The ESBER method is more robust than the NJW method and can obtain satisfying clustering results in most cases.

Rebagliati and Verri identify a fundamental working hypothesis of the NJW algorithm, namely that the optimal partition into k clusters can be obtained from the largest k eigenvectors of the matrix Lsym only if the gap between the k-th and the (k+1)-th eigenvalue of Lsym is sufficiently large. If the gap is small, a perturbation may swap the corresponding eigenvectors and the results can be very different from the optimal ones. So they suggest a weaker working hypothesis: the optimal partition into k clusters can be obtained from a k-dimensional subspace of the first m (m > k) eigenvectors, where m is a parameter chosen by the user [53]. The bound on m is based upon the gap between the (m+1)-th and the k-th eigenvalue and ensures the stability of the solution. This algorithm is robust to small changes of the eigenvalues, and gives satisfying results on real-world graphs by selecting correct k-dimensional subspaces of the linear span of the first m eigenvectors.
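The eigengap condition described above can be checked numerically; the following sketch ranks candidate values of k by the gap between consecutive eigenvalues of the normalized Laplacian. This eigengap heuristic is a common rule of thumb consistent with the hypothesis discussed above, not the specific procedure of [53]; the function and parameter names are illustrative.

import numpy as np

def eigengap_candidates(L_sym, k_max=10):
    # Eigenvalues of the symmetric normalized Laplacian, in ascending order.
    eigvals = np.sort(np.linalg.eigvalsh(L_sym))[:k_max + 1]
    # Gap between the k-th and (k+1)-th eigenvalue for k = 1 .. k_max.
    gaps = np.diff(eigvals)
    # Larger gaps suggest more stable k-dimensional spectral embeddings.
    order = np.argsort(gaps)[::-1] + 1
    return list(order), gaps

# Usage (L_sym built as in the earlier pipeline sketch):
# candidates, gaps = eigengap_candidates(L_sym)
# print("candidate k values ranked by eigengap:", candidates[:3])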


3.4 The number of clusters

In most spectral clustering algorithms, the number of clusters must be set manually, and the result is very sensitive to initialization. How to accurately estimate the number of clusters is one of the major challenges faced by spectral clustering. Some existing approaches attempt to get the optimal cluster number by minimizing some distance-based dissimilarity measure within clusters.

Wang proposes a novel selection criterion, whose key idea is to select the number of clusters as the one maximizing the clustering stability [61]. Since maximizing the clustering stability is equivalent to minimizing the clustering instability, an estimation scheme for the clustering instability is developed based on modified cross-validation. The idea of this scheme is to split the data into two training datasets and one validation dataset, where the two training datasets are used to construct two clusterings and the validation dataset is used to measure the clustering instability. This selection criterion is applicable to all kinds of clustering algorithms, including distance-based and non-distance-based algorithms. However, the data splitting reduces the sizes of the training datasets, so the effectiveness of the cross-validation method remains to be further studied.

The concept of clustering stability can measure the robustness of any given clustering algorithm. Inspired by Wang's method [61], Fang and Wang develop a new estimation scheme for the clustering instability based on the bootstrap, in which the number of clusters is selected so that the corresponding estimated clustering instability is minimized [22]. The implementation of the bootstrap method is straightforward, and it has a number of advantages. First, the bootstrap samples are of the same size as the original data, so the bootstrap method is more efficient. Second, the bootstrap estimate of the clustering instability is the nonparametric maximum likelihood estimate (MLE). Third, the bootstrap method can provide the instability path of a clustering algorithm for any given number of clusters.

Tepper et al. [57] introduce a perceptually driven clustering method, in which the number of clusters is automatically determined by setting a parameter ε that controls the average number of false detections. The detection thresholds are well adapted to accept or reject non-clustered data and can help find the right number of clusters. This method only takes into account inter-point distances and has no random steps. Besides, it is independent of the original data dimensionality, which means that its running time is not affected by an increase in dimensionality. The combination of this method with normalized cuts performs well on both synthetic and real-world datasets, and the detected clusters are perceptually significant.
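To make the stability idea concrete, here is a generic sketch of bootstrap-style instability estimation: for each candidate k, cluster two resamples, extend each clustering to the full dataset by nearest-centroid assignment, and measure how often the two clusterings disagree about whether a pair of points shares a cluster. This is only an illustration of the general principle, not the exact schemes of [61] or [22]; the helper names and the use of k-means are assumptions.

import numpy as np
from sklearn.cluster import KMeans

def pair_instability(X, k, n_rounds=10, rng=np.random.default_rng(0)):
    n = len(X)
    scores = []
    for _ in range(n_rounds):
        labels = []
        for _ in range(2):
            # Bootstrap resample of the same size as the original data.
            idx = rng.integers(0, n, size=n)
            km = KMeans(n_clusters=k, n_init=10).fit(X[idx])
            # Extend the clustering to all points via nearest centroid.
            labels.append(km.predict(X))
        same0 = labels[0][:, None] == labels[0][None, :]
        same1 = labels[1][:, None] == labels[1][None, :]
        # Fraction of point pairs on which the two clusterings disagree.
        scores.append(np.mean(same0 != same1))
    return np.mean(scores)

# Usage: choose the k with the smallest estimated instability.
# best_k = min(range(2, 8), key=lambda k: pair_instability(X, k))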


3.5 The applications of spectral clustering

Nowadays, spectral clustering has been successfully applied in many areas, such as data analysis, speech separation, video indexing, character recognition, and image processing. In such applications, the number of data points to cluster can be enormous. Indeed, a small 256 × 256 gray level image leads to a dataset of 65,536 points, while 4 s of speech sampled at 5 kHz leads to more than 20,000 spectrogram samples. In addition, real-world datasets often contain many outliers and noise points with complex data structures [38]. All these issues should be carefully considered when dealing with specific clustering problems.

Bach and Jordan use spectral clustering to solve the problem of speech separation and present a blind separation algorithm, which separates speech mixtures from a single microphone without requiring models of specific speakers [5]. It works within a time-frequency representation (a spectrogram) and exploits the sparsity of speech signals in this representation. That is, although two speakers might speak simultaneously, there is relatively little overlap in the time-frequency plane if the speakers are different. Thus, speech separation can be formulated as a segmentation problem in the time-frequency plane. Bach et al. have successfully demixed speech signals from two speakers using this approach.

Video indexing requires the efficient segmentation of video into scenes. A scene can be regarded as a series of semantically correlated shots. The visual content of each shot can be represented by one or multiple frames, called key-frames. Chasanis et al. [9] develop a new approach for video scene segmentation. To cluster the shots into groups, they propose an improved spectral clustering method that both estimates the number of clusters and employs the fast global k-means algorithm in the clustering stage after the eigenvector computation of the similarity matrix. The shot similarity is computed based only on visual features, and a label is assigned to each shot according to the group it belongs to. Then, a sequence alignment algorithm is applied to detect when the pattern of shot labels changes, providing the final scene segmentation result. This scene detection method can efficiently summarize the content of each shot and detect most of the scene boundaries accurately, while preserving a good tradeoff between recall and precision.

Zeng et al. [73] apply spectral clustering to recognize handwritten numerals and obtain satisfying results. They first select the Zernike moment features of handwritten numerals based on the principle that the distinction degree of inside-cluster features should be small and the division of features between clusters should be large; they then construct the similarity matrix between handwritten numerals using a similarity measure based on grey relational analysis and apply a transitivity transformation to the similarity matrix for better block symmetry after reformation; finally, they perform spectral decomposition on the Laplacian matrix derived from the reformed similarity matrix, and recognize the handwritten numerals with the eigenvectors corresponding to the second smallest eigenvalue of the Laplacian matrix as the spectral features. This algorithm is robust to outliers, and its recognition is also very effective.

Spectral clustering has been broadly used in image segmentation. Liu et al. [37] take into account the spatial information of the pixels in an image and propose a novel non-local spatial spectral clustering algorithm, which is robust to noise and other imaging artifacts. Nowadays, High-Definition (HD) images are widely used in television broadcasting and movies. Segmenting these high resolution images presents a grand challenge because of the significant computational demands. Wang and Dong develop a multi-level low-rank approximation-based spectral clustering method, which can effectively segment high resolution images [63]. They also develop a fast sampling strategy to select sufficient data samples, leading to accurate approximation and segmentation. In order to deal with large images, Tung et al. [58] propose an enabling scalable spectral clustering algorithm, which combines a blockwise segmentation strategy with stochastic ensemble consensus. The purpose of using stochastic ensemble consensus is to integrate both global and local image characteristics in determining the pixel classifications.

Ding et al. [15] focus on the controlled islanding problem and use spectral clustering to find a suitable islanding solution for preventing the initiation of wide area blackouts. The objective function used in this controlled islanding algorithm is the minimal power-flow disruption. It is demonstrated that this algorithm is computationally efficient when solving the controlled islanding problem, particularly in the case of a large power system. Adefioye et al. [1] develop a multi-view spectral clustering algorithm for chemical compound clustering. The tensor-based spectral methods provide chemically appropriate and statistically significant results when attempting to cluster compounds from multiple data sources. Experiments show that compounds of extremely different chemotypes cluster together, which can help reveal the internal relations of these compounds.

4 Conclusion and prospect

Spectral clustering is an elegant and powerful approach to clustering, which has been widely used in many fields. Especially in the graph and network areas, many specialized and improved algorithms have emerged. There are three main reasons why spectral clustering attracts so many researchers: firstly, it has a solid theoretical foundation, algebraic graph theory; secondly, for complex cluster structures, it can obtain a globally relaxed solution; thirdly, it can solve the problem within polynomial time. However, as a relatively new clustering method, spectral clustering is still in the development stage, and there are many problems worthy of further study.

1. Semi-supervised spectral clustering

Traditional spectral clustering is an unsupervised learning method that does not take into account the clustering intention of the user. User intention is actually a form of prior knowledge, also known as supervised information. A clustering algorithm guided by supervised information is called semi-supervised clustering. Limited prior knowledge, such as pairwise constraints between samples, can be easily obtained in practice. A large number of studies have shown that making full use of prior knowledge in the process of searching for clusters can significantly improve the performance of the clustering algorithm [11, 32]. Therefore, it will be very meaningful to combine prior knowledge with spectral clustering and carry out research on semi-supervised spectral clustering.

2. Fuzzy spectral clustering

Most of the existing spectral clustering algorithms are hard partition methods. They strictly divide each object into one class, and the class of an object is either this or that, which belongs to the scope of classical set theory. In fact, most objects do not have such strict properties. These objects are intermediary in form and class, and are suitable for soft division [45]. Classical set theory often cannot completely solve classification problems with ambiguity. Fuzzy set theory, proposed by Zadeh, provides a powerful tool for this soft division. Applying fuzzy set theory to spectral clustering and studying effective fuzzy spectral clustering algorithms is also very significant.


3. Kernel spectral clustering

Spectral clustering algorithms are mostly based on some similarity measure to classify the samples, so that similar samples are gathered into the same cluster, while dissimilar samples are separated into different clusters. Thus, the clustering process mainly depends on the characteristic differences between the samples. If the distribution of the samples is very complex, conventional methods may not be able to get ideal clustering results. Using kernel methods can enlarge the useful features of the samples by mapping them to a high dimensional feature space, so that the samples are easier to cluster and the convergence speed of the algorithm is accelerated [4]. Therefore, when dealing with complex clustering problems, the combination of kernel methods and spectral techniques can be considered a valuable research direction.

4. Clustering of large datasets

Spectral clustering algorithms involve the calculation of eigenvalues and eigenvectors. The underlying eigen-decomposition takes cubic time and quadratic space with regard to the dataset size [12]. These costs can be reduced by the Nystrom method, which samples only a subset of columns from the matrix. However, the manipulation and storage of these sampled columns can still be expensive when the dataset is large. Time and space complexity has become an obstacle to the generalization of spectral clustering in practical applications. So, it is worthy of deep study to explore effective methods that reduce the computational complexity of spectral clustering algorithms and make them suitable for massive learning problems.
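As an illustration of the Nystrom idea mentioned above, the following sketch approximates the leading eigenvectors of a large affinity matrix from a random subset of its columns; the sampling scheme, the helper names and the lack of any regularization are simplifying assumptions for illustration, not a production implementation.

import numpy as np
from scipy.spatial.distance import cdist

def nystrom_embedding(X, n_landmarks=200, sigma=1.0, rng=np.random.default_rng(0)):
    n = len(X)
    idx = rng.choice(n, size=min(n_landmarks, n), replace=False)
    # Affinities between all points and the landmarks, and among landmarks.
    K_nm = np.exp(-cdist(X, X[idx], "sqeuclidean") / (2 * sigma ** 2))
    K_mm = K_nm[idx, :]
    # Eigen-decompose only the small m x m block ...
    vals, vecs = np.linalg.eigh(K_mm)
    keep = vals > 1e-12
    # ... and extend its eigenvectors to all n points (Nystrom extension).
    U = (K_nm @ vecs[:, keep]) / vals[keep]
    return U, vals[keep]

# The columns of U (suitably normalized) can replace the exact eigenvectors
# in the spectral representation stage when n is too large for a full
# eigen-decomposition.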


5. Spectral clustering ensemble

Traditional spectral clustering algorithms are sensitive to the scaling parameter and have inherent randomness. In order to overcome these problems, a clustering ensemble strategy can be introduced into spectral clustering [31], because it is possible to get better clusters by searching over combinations of multiple clustering results. A clustering ensemble can make full use of the results of learning algorithms obtained under different conditions and find cluster combinations that cannot be obtained by a single clustering algorithm. Thus, the spectral clustering ensemble algorithm is able to improve the quality and stability of the clustering results, with strong robustness to noise, outliers, and sample changes.

Acknowledgments  This work is supported by the National Key Basic Research Program of China (No. 2013CB329502), and the Fundamental Research Funds for the Central Universities (No. 2013XK10).

References

1. Adefioye AA, Liu XH, Moor BD (2013) Multi-view spectral clustering and its chemical application. Int J Comput Biol Drug Des 6(1–2):32–49
2. Alpert CJ, Kahng AB (1995) Multi-way partitioning via geometric embeddings, orderings and dynamic programming. IEEE Trans Comput-Aid Des Integr Circuits Syst 14(11):1342–1358
3. Alpert CJ, Yao SZ (1995) Spectral partitioning: the more eigenvectors, the better. In: Proceedings of the 32nd annual ACM/IEEE design automation conference. ACM, New York, pp 195–200
4. Alzate C, Suykens JAK (2012) Hierarchical kernel spectral clustering. Neural Netw 35:21–30
5. Bach FR, Jordan MI (2006) Learning spectral clustering, with application to speech separation. J Mach Learn Res 7:1963–2001
6. Bames ER (1982) An algorithm for partitioning the nodes of a graph. SIAM J Algebraic Discrete Methods 17(5):541–550
7. Blekas K, Lagaris IE (2013) A spectral clustering approach based on Newton's equations of motion. Int J Intell Syst 28(4):394–410
8. Cai XY, Dai GZ, Yang LB (2008) Survey on spectral clustering algorithms. Comput Sci 35(7):14–18
9. Chasanis VT, Likas AC, Galatsanos NP (2009) Scene detection in videos using shot clustering and sequence alignment. IEEE Trans Multimed 11(1):89–100
10. Chen WF, Feng GC (2012) Spectral clustering with discriminant cuts. Knowl-Based Syst 28:27–37
11. Chen WF, Feng GC (2012) Spectral clustering: a semi-supervised approach. Neurocomputing 77(1):229–242
12. Chen WY, Song YQ, Bai HJ et al (2011) Parallel spectral clustering in distributed systems. IEEE Trans Patt Anal Mach Intell 33(3):568–586
13. Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B Stat Methodol 39(1):1–38
14. Ding CHQ, He X, Zha H et al (2001) A min-max cut algorithm for graph partitioning and data clustering. In: Proceedings of IEEE international conference on data mining (ICDM 2001), pp 107–114
15. Ding L, Gonzalez-Longatt FM, Wall P, Terzija V (2013) Two-step spectral clustering controlled islanding algorithm. IEEE Trans Power Syst 28(1):75–84
16. Ding SF, Jia HJ, Zhang LW et al (2012) Research of semi-supervised spectral clustering algorithm based on pairwise constraints. Neural Comput Appl. doi:10.1007/s00521-012-1207-8
17. Ding SF, Qi BJ, Jia HJ et al (2013) Research of semi-supervised spectral clustering based on constraints expansion. Neural Comput Appl 22(Suppl 1):S405–S410
18. Donath WE, Hoffman AJ (1973) Lower bounds for the partitioning of graph. IBM J Res Dev 17(5):420–425
19. Dong XW, Frossard P, Vandergheynst P, Nefedov N (2012) Clustering with multi-layer graphs: a spectral perspective. IEEE Trans Sig Process 60(11):5820–5831
20. Driessche RV, Roose D (1995) An improved spectral bisection algorithm and its application to dynamic load balancing. Parallel Comput 21(1):29–48
21. Dunn JC (1974) Well-separated clusters and the optimal fuzzy partitions. J Cybern 4(1):95–104
22. Fang YX, Wang JH (2012) Selection of the number of clusters via the bootstrap method. Comput Stat Data Anal 56(3):468–477
23. Fiedler M (1973) Algebraic connectivity of graphs. Czechoslov Math J 23(2):298–305
24. Frederix K, Van Barel M (2013) Sparse spectral clustering method based on the incomplete Cholesky decomposition. J Comput Appl Math 237(1):145–161
25. Hagen L, Kahng AB (1992) New spectral methods for ratio cut partitioning and clustering. IEEE Trans Comput-Aid Des Integr Circuits Syst 11(9):1074–1085
26. Hamad D, Biela P (2008) Introduction to spectral clustering. In: 3rd international conference on information and communication technologies: from theory to applications, 1–5, pp 490–495
27. Hendrickson B, Leland R (1995) An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM J Sci Comput 16(2):452–459
28. Higham DJ, Kibble M (2004) A unified view of spectral clustering. In: University of Strathclyde Mathematics Research Report 02
29. Huang Z (1997) A fast clustering algorithm to cluster very large categorical data sets in data mining. In: Proceedings of the SIGMOD workshop on research issues on data mining and knowledge discovery. Tucson, pp 146–151
30. Huang Z (1998) Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min Knowl Discov 2(3):283–304
31. Jia JH, Xiao X, Liu BX, Jiao LC (2011) Bagging-based spectral clustering ensemble selection. Patt Recogn Lett 32(10):1456–1467
32. Jiao LC, Shang FH, Wang F, Liu YY (2012) Fast semi-supervised clustering with enhanced spectral embedding. Patt Recogn 45(12):4358–4369
33. Kluger Y, Basri R, Chang JT et al (2003) Spectral biclustering of microarray data: coclustering genes and conditions. Genome Res 13(4):703–716
34. Leicht EA, Newman MEJ (2008) Community structure in directed networks. Phys Rev Lett 100(11):118703
35. Li JY, Zhou JG, Guan JH et al (2011) A survey of clustering algorithms based on spectra of graphs. CAAI Trans Intell Syst 6(5):405–414
36. Li XY, Guo LJ (2012) Constructing affinity matrix in spectral clustering based on neighbor propagation. Neurocomputing 97:125–130
37. Liu HQ, Jiao LC, Zhao F (2010) Non-local spatial spectral clustering for image segmentation. Neurocomputing 74(1–3):461–471
38. Liu HQ, Zhao F, Jiao LC (2012) Fuzzy spectral clustering with robust spatial information for image segmentation. Appl Soft Comput 12(11):3636–3647
39. Luo DJ, Huang H, Ding C, Nie FP (2010) On the eigenvectors of p-Laplacian. Mach Learn 81(1):37–51
40. Luxburg U, Belkin M, Bousquet O (2008) Consistency of spectral clustering. Ann Stat 36(2):555–586
41. MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of 5th Berkeley symposium on mathematical statistics, 1, pp 281–297
42. Malik J, Belongie S, Leung T et al (2001) Contour and texture analysis for image segmentation. Int J Comput Vis 43(1):7–27
43. Meila M, Shi JB (2001) Learning segmentation by random walks. Advances in neural information processing systems. MIT Press, Cambridge, pp 873–879
44. Michoel T, Nachtergaele B (2012) Alignment and integration of complex networks by hypergraph-based spectral clustering. Phys Rev E 86(5):056111
45. Mirkin B, Nascimento S (2012) Additive spectral method for fuzzy cluster analysis of similarity data including community structure and affinity matrices. Inf Sci 183(1):16–34
46. Mohar B (1997) Some applications of Laplace eigenvalues of graphs. Graph Symmetry Algebraic Methods Appl 497(22):227–275
47. Nascimento MCV, de Carvalho ACPLF (2011) Spectral methods for graph clustering: a survey. Eur J Oper Res 211(2):221–231
48. Newman MEJ (2004) Analysis of weighted networks. Phys Rev E 70(5):056131
49. Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104
50. Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci USA 103(23):8577–8582
51. Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst 14:849–856
52. Paccanaro A, Chennubhotla C, Casbon JA (2006) Spectral clustering of protein sequences. Nucl Acids Res 34(5):1571–1580
53. Rebagliati N, Verri A (2011) Spectral clustering with more than K eigenvectors. Neurocomputing 74(9):1391–1401
54. Sarkar S, Soundararajan P (2000) Supervised learning of large perceptual organization: graph spectral partitioning and learning automata. IEEE Trans Patt Anal Mach Intell 22(5):504–525
55. Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Patt Anal Mach Intell 22(8):888–905
56. Sun JG, Liu J, Zhao LY (2008) Clustering algorithms research. J Softw 19(1):48–61
57. Tepper M, Muse P, Almansa A, Mejail M (2011) Automatically finding clusters in normalized cuts. Patt Recogn 44(7):1372–1386
58. Tung F, Wong A, Clausi DA (2010) Enabling scalable spectral clustering for image segmentation. Patt Recogn 43(12):4069–4076
59. Urquhart R (1982) Graph theoretical clustering based on limited neighborhood sets. Pattern Recogn 15(3):173–187
60. von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
61. Wang JH (2010) Consistent selection of the number of clusters via cross validation. Biometrika 97(4):893–904
62. Wang L, Bo LF, Jiao LC (2007) Density-sensitive spectral clustering. Acta Electronica Sinica 35(8):1577–1581
63. Wang LJ, Dong M (2012) Multi-level low-rank approximation-based spectral clustering for image segmentation. Patt Recogn Lett 33(16):2206–2215
64. Wang Y, Jiang Y, Wu Y, Zhou ZH (2011) Spectral clustering on multiple manifolds. IEEE Trans Neural Netw 22(7):1149–1161
65. Wei YC, Cheng CK (1989) Toward efficient hierarchical designs by ratio cut partitioning. In: IEEE international conference on CAD. New York, pp 298–301
66. Wu Z, Leahy R (1993) An optimal graph theoretic approach to data clustering: theory and its application to image segmentation. IEEE Trans Patt Anal Mach Intell 15(11):1101–1113
67. Xiang T, Gong S (2008) Spectral clustering with eigenvector selection. Patt Recogn 41(3):1012–1029
68. Xie B, Wang M, Tao DC (2011) Toward the optimization of normalized graph Laplacian. IEEE Trans Neural Netw 22(4):660–666
69. Xie YK, Zhou YQ, Huang XJ (2009) A spectral clustering based coreference resolution method. J Chin Inf Process 23(3):10–16
70. Yang P, Zhu QS, Huang B (2011) Spectral clustering with density sensitive similarity function. Knowl-Based Syst 24(5):621–628
71. Yang Y, Xu D, Nie FP, Yan SC, Zhuang YT (2010) Image clustering using local discriminant models and global integration. IEEE Trans Image Process 19(10):2761–2773
72. Zahn CT (1971) Graph-theoretic methods for detecting and describing gestalt clusters. IEEE Trans Comput 20(1):68–86
73. Zeng S, Sang N, Tong XJ (2011) Hand-written numeral recognition based on spectrum clustering. In: MIPPR 2011: pattern recognition and computer vision, Proceedings of SPIE, p 8004
74. Zhang XC, Li JW, Yu H (2011) Local density adaptive similarity measurement for spectral clustering. Patt Recogn Lett 32(2):352–358
75. Zhang XC, You QZ (2011) An improved spectral clustering algorithm based on random walk. Frontiers Comput Sci China 5(3):268–278
76. Zhang XR, Jiao LC, Liu F (2008) Spectral clustering ensemble applied to SAR image segmentation. IEEE Trans Geosci Rem Sens 46(7):2126–2136
77. Zhao F, Jiao LC, Liu HQ et al (2010) Spectral clustering with eigenvector selection based on entropy ranking. Neurocomputing 73(10–12):1704–1717
