PII: S0925-2312(17)30614-8
DOI: 10.1016/j.neucom.2017.03.068
Reference: NEUCOM 18301
Please cite this article as: Lilin Jie, Weidong Liu, Zheng Sun, Shasha Teng, Hybrid fuzzy clustering methods based on improved self-adaptive cellular genetic algorithm and optimal-selection-based fuzzy c-means, Neurocomputing (2017), doi: 10.1016/j.neucom.2017.03.068
ACCEPTED MANUSCRIPT
Highlights
The new dynamic crossover and entropy-based two-combination mutation operations are
constructed to prevent convergence of the algorithms to a local optimum by
adaptively modifying the probabilities of crossover and mutation, as well as the mutation
step size, according to dynamic adjusting strategies and judging criteria.
An improved self-adaptive cellular genetic algorithm (IDCGA) is presented for a more
efficient search by combining the Arnold cat map with a modified evolution rule, together
with the constructed dynamic crossover and entropy-based two-combination mutation
operators.
Two novel adaptive fuzzy clustering algorithms based on IDCGA, referred to as
IDCGA-FCM and IDCGA2-FCM, are proposed in this paper. The first is a standalone
form of fuzzy clustering built on IDCGA. The second is a hybrid method
based on FCM and IDCGA which takes advantage of the merits of both algorithms.
The experimental results show that the presented algorithms achieve high efficiency and
accuracy.
Order of Authors: Lilin Jie a,b, Weidong Liu a,c,*, Zheng Sun a, Shasha Teng a
a School of Mechatronics Engineering, Nanchang University, Nanchang 330031, China
b School of Aeronautical Manufacturing Engineering, Nanchang Hangkong University, Nanchang 330063, China
c School of Economic Management, Nanchang Hangkong University, Nanchang 330063, China
Address: 999 Xuefu Road, Honggutan New District, Nanchang 330031, Jiangxi, China
ABSTRACT
With the aim of overcoming low efficiency and improving the performance of fuzzy clustering, two novel fuzzy clustering algorithms based on an improved self-adaptive cellular genetic algorithm (IDCGA) are proposed in this paper. New dynamic crossover and entropy-based two-combination mutation operations are constructed to prevent convergence of the algorithms to a local optimum by adaptively modifying the probabilities of crossover and mutation, as well as the mutation step size, according to dynamic adjusting strategies and judging criteria. The Arnold cat map is employed to initialize the population in order to overcome the sensitivity of the algorithms to initial cluster centers. A modified evolution rule is introduced to build a dynamic environment so as to explore the search space more effectively. A new IDCGA that combines these three processes is then used to optimize fuzzy c-means (FCM) clustering (IDCGA-FCM). Furthermore, an optimal-selection-based strategy is presented using the golden section method, and a hybrid fuzzy clustering method (IDCGA2-FCM) is developed by automatically integrating IDCGA with optimal-selection-based FCM according to the variation of population entropy. Experiments were performed on six synthetic datasets and seven real-world datasets to compare the performance of our IDCGA-based clustering algorithms with FCM and other GA-based and PSO-based clustering methods. The results show that the presented algorithms achieve high efficiency and accuracy.
Keywords: Fuzzy clustering; Fuzzy c-means; Cellular genetic algorithm; Dynamic crossover; Two-combination mutation
1. Introduction
With the increasing importance of extracting useful information from huge quantities of data, clustering techniques have been
widely used in a variety of application domains such as data mining, machine learning, pattern recognition, computer vision and
image segmentation [1, 2]. Cluster analysis is an unsupervised learning process of partitioning an unlabeled data set into a
number of groups with the principle of similarity that objects in the same group are more similar to each other than those from
different groups. Clustering methods can be broadly divided into two basic types, namely hard and fuzzy methods [3].
Although hard clustering methods have the characteristics of simplicity and low computational complexity, they might be less
attractive for real-life data sets in which there are no definite boundaries between the groups because of assigning each element to
a single group. Comparatively, fuzzy clustering algorithms can assign each element of a data set to multiple groups
simultaneously in accordance with the membership functions matrix, and therefore overcome the deficiency of the hard
clustering methods.
The most widely used fuzzy clustering algorithm is fuzzy c-means (FCM) proposed by Bezdek (1981) [4]. In spite of its high
popularity and extensive applications, FCM has the following inherent limitations: (i) the equal weight for data attributes without
considering the difference in importance of data attributes, all attributes being treated equally in determining the cluster
memberships of objects, (ii) the poor performance on certain types of data, such as unevenly distributed samples, samples with
noise or outliers, and incomplete data, (iii) the low speed in solving clustering problems of high complexity, (iv) the
clustering results largely determined by the initial cluster centers, (v) the iterative process being trapped into local minima easily.
To address the above-described defects, a number of extensions of FCM have been reported in recent
years. These studies have mainly focused on attribute-weighted fuzzy clustering, modification of fuzzy membership
functions, and effectiveness assessment of fuzzy cluster validity indices, etc. For instance, Zhou et al. [5] presented a study on the
attribute-weighted fuzzy clustering problem and proposed a maximum-entropy-regularized weighted fuzzy FCM (EWFCM)
algorithm. Zhang et al. [6] investigated an interval-weighted FCM method with a genetic heuristic strategy to search for weights
of data attributes in interval-constrained ranges, so as to quantitatively evaluate the relative importance of each attribute to the
clustering performance of FCM. Pimentel and Souza [2] conducted an experimental study of a multivariate FCM method in
which membership degrees by variables and clusters are calculated. That study showed that the membership degrees vary from
one variable to another and from one group to another. With an aim to deal with interval-valued data, Pimentel and Souza [7]
proposed a weighted multivariate FCM in which the membership degree for each cluster is defined by a linear combination of
membership degrees by variables. In another effort to improve the validity of the clustering on non-equilibrium distribution
samples, Xiao et al. [8] reported a variant of FCM with modified fuzzy membership constraint function by relaxing the
normalization condition and constantly correcting the membership in the clustering process. Sabzekar and Naghibzadeh [9]
developed a modified FCM using relaxed constraints support vector machines to tackle the problem of many samples being
assigned to some clusters with low membership values. In order to reduce the convergence time of the FCM, several
generalization schemes for the suppressed FCM were also investigated extensively through experiment and numerical simulation
by Szilágyi et al. [10]. Zhao et al. [11] put forward an optimal-selection-based suppressed fuzzy c-means clustering algorithm
with self-tuning non-local spatial information and applied it to improve segmentation performance on images heavily
contaminated by noise. Li et al. [12] used a GA approach to search for appropriate imputations of missing attributes in
nearest-neighbor intervals to recover the incomplete data set, with the goal of solving the problem of the uncertainty of missing
attributes. Zhou et al. [13] introduced a weighted summation type of fuzzy cluster validity index through properly setting the
weights of the ten indices to evaluate the clustering results and determine the optimal number of clusters.
In spite of many obvious advantages of all the FCM versions discussed above, they still have limitations, such as the
initialization with randomly selected cluster centers and the high probability of falling into local minima. Furthermore, there
remains considerable room to further improve the performance of fuzzy clustering algorithms.
To partly solve these problems, metaheuristic algorithms such as genetic algorithm (GA), simulated annealing (SA), ant colony
optimization (ACO) and particle swarm optimization (PSO) have been successfully applied to optimize the objective function of
FCM, attempting to avoid being trapped in local minima [14]. GA has become one of the most popular search and optimization
tools in many application fields due to its ease of implementation, global perspective and extensive applicability [15].
Consequently, many scholars have been dedicated to applying GAs to cluster analysis. As a result, a variety of GA-based
approaches for the hard and fuzzy clustering problems have been proposed [14]. GA-based approaches for the hard
clustering problem are as follows: Chang et al. [16] proposed a new clustering algorithm based on GA with gene rearrangement
(GAGR). They also compared the performance of GAGR with K-means and other GA methods. A quantum inspired GA for
k-means clustering (KMQGA) was discussed by Xiao et al. [17], in which a Q-bit-based representation is employed for
exploration and exploitation in discrete 0-1 hyperspace using rotation operation of quantum gate as well as the typical genetic
algorithm operations of Q-bits. He and Tan [18] investigated a two-stage genetic clustering algorithm (TGCA) in which
two-stage selection and mutation operations were implemented to exploit the search capability of the algorithm. The study
showed that TGCA is superior to k-means and standard genetic k-means algorithms. Kuo et al. [19] developed a hybrid
evolutionary algorithm based on GA, PSO and the k-means algorithm (HGAPSOA). GA-based approaches for the fuzzy clustering
problem are as follows: Hall et al. [20] presented a GA approach to minimize the clustering objective functions of hard c-means
(HCM) and FCM simultaneously. The results of all possible comparisons revealed that GA-HCM/GA-FCM approaches could
provide a better final partition than HCM/FCM, respectively. To overcome the disadvantages and improve the performance of
FCM significantly, Zhu et al. [21] proposed an adaptive fuzzy clustering algorithm based on GA (AGA-FCM). They obtained the optimal
number of clusters by analyzing a cluster validity function and obtained the initial cluster centers with HCM. Ding and Fu [22]
introduced a kernel-based FCM clustering algorithm based on GA (GAKFCM), which combines the improved GA and the kernel
technique. Wikaisuksakul [23] investigated a multiobjective genetic algorithm-based fuzzy clustering algorithm (FCM-NSGA)
using NSGA-II and FCM to solve the data clustering problem. Ye and Jin [24] conducted a study on the performance of a novel
FCM clustering algorithm based on improved quantum genetic algorithm (IQGA). In order to deal with FCM’s sensitivity to the
initial value, Zhang and Li [25] proposed a hybrid clustering algorithm (GCQPSO-FCM) by combining GA and chaotic particle
swarm optimization (CPSO) with FCM. That study showed that GCQPSO-FCM achieves better performance than the original
algorithm.
As noted above, many of the clustering algorithms based on GAs may adopt GA in standalone form or in combination with
K-means, HCM or FCM. It is worth noting that these GA-based approaches for hard or fuzzy clustering have improved
performance as compared to traditional methods such as K-means, HCM and FCM. However, these clustering methods described
above suffer from the following two difficulties: (i) it takes enormous time for them to evaluate the functions, which leads to
higher computational complexity and therefore limits their practical applications, (ii) when the size of data sets is large, some of
them may converge to local optimal solutions of the objective function due to the loss of the diversity of population during the
evolution. Therefore, it is essential to design more efficient optimization algorithms to improve the performance of fuzzy
clustering.
In an attempt to resolve the dilemmas that could not be settled by GA-based clustering methods, we focus on the cellular
model of GA (CGA) in this study. The key reason for choosing CGA is that, as one of the best-known cellular model
evolutionary algorithms (CEAs), its advantage lies in its ease of implementation and the good tradeoff between exploration and
exploitation compared with other evolutionary methods, which is critical in determining the effectiveness of the algorithm
[26-28]. In a CGA, the concept of neighborhood is intensively used, and thus an individual may only interact with its defined
local neighbors in the breeding loop. The exploration is provided by the overlapped small neighborhoods due to the induced slow
diffusion of solutions through the population, which contributes to the genetic diversity of population and effectively avoids
premature convergence. In contrast, exploitation takes place within each neighborhood through genetic operators aiming at
improving the quality of solutions. CGAs have proven to be efficient in dealing with optimization problems of
various complexities [27, 28]. However, little attention has been paid to their use in the field of cluster analysis. According to the
existing research of hard or fuzzy clustering methods, extremely limited information is available concerning the combination of
CGA with K-means, HCM or FCM. To the best of our knowledge, only Leite et al. [29] used a CGA to solve the problem of pattern
classification; before this study, theirs was the only cellular genetic clustering algorithm for crisp clustering, and it was based on
the canonical CGA.
We are motivated by these investigations to design excellent optimization methods based on CGA that can dynamically adapt
to changes in the diversity of population through accurate balance between exploration and exploitation, and apply the methods
to the fuzzy clustering problem. With this concept in mind, we use the well-known diversity measure, the population entropy [28],
to compute the diversity of population so as to guide the search process. In our design, the widely-used Arnold chaotic map is
employed to initialize population for the purpose of overcoming the sensitivity of clustering algorithms to initial cluster centers.
A modified evolution rule is also introduced to build a dynamic environment so as to maintain the diversity which is in favor of
global exploration. Furthermore, the new dynamic crossover and entropy-based two-combination mutation operator are
constructed to prevent the convergence of the algorithms to a local optimum. Based on the diversity computed, real-valued
coding mode is combined with the Arnold cat map, modified evolution rule, as well as the constructed dynamic crossover and the
entropy based two-combination mutation operators to present an improved self-adaptive cellular genetic algorithm (IDCGA). As
a result, IDCGA has two advantages over the basic CGA, namely (i) the adaptive adjustment of parameters during the evolution
and (ii) the appropriate balance between global search and local search.
In this paper, two new methods for the fuzzy clustering problem based on IDCGA are introduced to achieve better
performance. The first method proposed in this study is a standalone form of fuzzy clustering using IDCGA, called
IDCGA-FCM, in which IDCGA is applied to the fuzzy clustering model with the aim of achieving the best possible results. The second
method is a hybrid between IDCGA and FCM, called IDCGA2-FCM, which dynamically integrates FCM with IDCGA according
to the variation of the diversity of population. This is done to exploit the local search space more efficiently and make the
evolution process converge fast. Thus, IDCGA2-FCM may tackle the two main problems of GA-based clustering methods, i.e.,
the low speed and the tendency to fall into local extrema when the size or dimension of the data is large. Experiments on six
synthetic data sets and seven real-world data sets are reported and the results are compared with the six other clustering
algorithms, including FCM, two GA-based clustering methods (GA-FCM and AGA-FCM), and three PSO-based clustering
methods (PSO-FCM, FCM-FPSO and FCM-IDPSO).
The remainder of this paper is organized as follows. Section 2 provides a description of the FCM algorithm and the basic
CGA algorithm. Section 3 presents the details of the proposed method IDCGA and discusses the IDCGA algorithm for fuzzy
clustering. Section 4 introduces our hybrid clustering method. Section 5 reports the experimental results of different clustering
algorithms on synthetic and real-world data sets. Finally, conclusions and suggestions for further research are given in Section 6.
2. Theoretical basis
The algorithms on which this study is based have already been mentioned in the previous section and now are further
illustrated. In this section, first we present a description of fuzzy c-means clustering; then, we describe the basic cellular genetic
algorithm.
2.1. Fuzzy c-means clustering
Suppose that fuzzy c-means (FCM) partitions a set of $n$ data objects $X = \{x_1, \ldots, x_k, \ldots, x_n\}$ into $c$ ($1 < c < n$) fuzzy clusters, where each object has $d$ attributes and the $k$th object is represented as a vector of quantitative variables $x_k = (x_{1k}, \ldots, x_{jk}, \ldots, x_{dk})$, where $x_{jk} \in R$. Let $V = \{v_1, \ldots, v_i, \ldots, v_c\}$ be a set of cluster centers, where the $i$th cluster center $v_i$ is also represented by a vector of quantitative variables $v_i = (v_{1i}, \ldots, v_{ji}, \ldots, v_{di})$, where $v_{ji} \in R$. Let $U = [u_{ik}]$ be a $c \times n$ fuzzy matrix of membership degrees, in which $u_{ik}$ is the membership degree of the $k$th object with respect to the $i$th cluster center. The properties of $U$ are as follows:

$$\sum_{i=1}^{c} u_{ik} = 1, \; 1 \le k \le n; \qquad \sum_{k=1}^{n} u_{ik} \in (0, n), \; 1 \le i \le c; \qquad u_{ik} \in [0, 1], \; 1 \le i \le c, \; 1 \le k \le n \qquad (1)$$

The goal of FCM is to determine cluster centers $v_i$ ($1 \le i \le c$) and a fuzzy matrix $U$ that minimize an objective function defined as

$$J(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} d_{ik}^{2} \qquad (2)$$

where $m$ is the fuzzification coefficient and $d_{ik}$ is the Euclidean distance that represents the similarity between object $x_k$ and cluster center $v_i$:

$$d_{ik} = \lVert x_k - v_i \rVert \qquad (3)$$
The minimum of $J(U, V)$ can be found with the Lagrange multiplier method, with the cluster centers and membership degrees updated according to Eq. (4) and Eq. (5):

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} x_k}{\sum_{k=1}^{n} u_{ik}^{m}} \qquad (4)$$

$$u_{ik} = \frac{1}{\sum_{l=1}^{c} \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_l \rVert} \right)^{\frac{2}{m-1}}} \qquad (5)$$
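As a concrete illustration, the alternating updates of Eqs. (4) and (5) can be sketched in a few lines of NumPy. This is a minimal sketch under our own naming (`fcm_memberships`, `fcm_centers`, the toy two-cluster data), not the paper's implementation:

```python
import numpy as np

def fcm_memberships(X, V, m=2.0):
    """Eq. (5): u_ik from the distance ratios to all c centers."""
    d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)   # (c, n) distances
    d = np.fmax(d, 1e-12)                                       # guard against /0
    ratio = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=1)                              # columns sum to 1

def fcm_centers(X, U, m=2.0):
    """Eq. (4): centers as fuzzy-weighted means of the data."""
    W = U ** m
    return (W @ X) / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),    # cluster around (0, 0)
               rng.normal(3.0, 0.1, (20, 2))])   # cluster around (3, 3)
V = np.array([[0.5, 0.5], [2.5, 2.5]])           # rough initial centers
for _ in range(30):                              # alternate Eq. (5) and Eq. (4)
    U = fcm_memberships(X, V)
    V = fcm_centers(X, U)
```

After convergence the two rows of `V` sit near the true cluster means, and each column of `U` sums to one, matching the constraint in Eq. (1).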
2.2. Basic cellular genetic algorithm
Cellular genetic algorithm (CGA) is a decentralized GA which combines cellular automata with the genetic algorithm [30]. In
this algorithm model, all individuals are usually located in a toroidal grid of d dimensions (d = 1, 2, 3). Each
individual is assigned to a grid position and only interacts with its defined local neighbors. Four typical neighborhoods used in
CGA are illustrated in Fig. 1. In this paper, we use a population structured in a two-dimensional regular grid, and the
neighborhood defined on it (C9) contains nine individuals: the considered one (position (x, y)) plus the east, west, north, south,
northeast, northwest, southeast and southwest ones (also called the Moore neighborhood).
Fig. 1. Typical neighbourhoods used in CGA: L5, L9, C9 and C25.
More specifically, there are three kinds of operations in CGA, including selection, crossover and mutation, all of which are
restricted within a neighborhood. CGA usually proceeds as follows. Each individual is updated by selecting its parents from its
neighborhood with a given criterion (lines 7 and 8). A crossover operator is applied to the two parents with a
probability Pc to form two offspring (line 9), one of which is then mutated by a mutation operator with a probability Pm (line
10). Afterwards, the new offspring is evaluated and replaces the current individual according to a specified replacement policy
(line 11 and line 12). This process is repeated until all individuals in the population are updated. Then the newly generated
auxiliary population Paux (t ) replaces the current population to start a new generation (line 15), and the whole process is repeated
until specific termination conditions are satisfied. Algorithm1 shows the steps of the basic CGA algorithm.
Algorithm1. Basic CGA
1. Initialize parameters for CGA, including the population size M, Pc, Pm;
2. Generate the initial population P(t), t = 0, and evaluate all individuals;
3. while the termination condition is not satisfied do
4.   for x = 1 to grid width do
5.     for y = 1 to grid height do
6.       Compute the neighborhood of the individual at position (x, y);
7.       Select parent1 from the neighborhood;
8.       Select parent2 from the neighborhood;
9.       Generate two offspring by crossing the parents with probability Pc;
10.      Mutate one offspring with probability Pm;
11.      Evaluate the new offspring;
12.      if the offspring is better than the current individual then insert it into Paux(t);
13.      end if
14.    end for
15.  end for
16.  P(t+1) ← Paux(t)
17.  t ← t+1
18. end while
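The generation sweep just described (neighbourhood selection, crossover with probability Pc, mutation with probability Pm, replace-if-better, synchronous swap of the auxiliary population) can be sketched as follows. The grid size, the one-dimensional toy fitness and the operator settings are our own illustrative assumptions, not the paper's configuration:

```python
import random

def moore_neighbours(x, y, w, h):
    """C9 neighbourhood of cell (x, y) on a toroidal w x h grid (includes the cell)."""
    return [((x + dx) % w, (y + dy) % h)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

def cga_generation(grid, fitness, pc=0.9, pm=0.1):
    """One synchronous CGA generation, written into an auxiliary population."""
    w, h = len(grid), len(grid[0])
    aux = [[None] * h for _ in range(w)]
    for x in range(w):
        for y in range(h):
            nb = moore_neighbours(x, y, w, h)
            # select the two fittest neighbours as parents
            p1, p2 = sorted(nb, key=lambda c: fitness(grid[c[0]][c[1]]),
                            reverse=True)[:2]
            a, b = grid[p1[0]][p1[1]], grid[p2[0]][p2[1]]
            child = 0.5 * (a + b) if random.random() < pc else a   # crossover
            if random.random() < pm:                               # mutation
                child += random.gauss(0.0, 0.1)
            # evaluate and keep the better of child and current cell
            aux[x][y] = child if fitness(child) >= fitness(grid[x][y]) else grid[x][y]
    return aux

random.seed(1)
fit = lambda v: -abs(v - 5.0)                     # toy fitness, maximised at v = 5
grid = [[random.uniform(0.0, 10.0) for _ in range(6)] for _ in range(6)]
initial_best = max((v for row in grid for v in row), key=fit)
for _ in range(40):
    grid = cga_generation(grid, fit)              # P(t+1) <- Paux(t)
best = max((v for row in grid for v in row), key=fit)
```

Because each cell only accepts a fitter replacement, the best individual never degrades, and the overlapped neighbourhoods diffuse good genes slowly across the grid.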
3. Improved self-adaptive cellular genetic algorithm
3.1. Chromosome representation
Real-parameter representations are commonly used for genetic clustering algorithms [12]. It has been shown that the real-valued
representations are more practical due to their consistency with the real-world number system and are more convenient for
further processing [31]. Thus, a real-valued representation is employed to express the chromosome in this paper. Taking both the
computational complexity and useful information of the data into account, we turn our attention to cluster-center-based
representation approach.
Let $Q = \{x_1, x_2, \ldots, x_p, \ldots, x_M\}$ ($1 \le p \le M$) be a cellular population listed by $x_p$. Each individual represents a set of cluster centers, i.e., one pattern partition of the dataset. In other words, each chromosome is described by a sequence of $c \cdot d$ real-valued numbers, where $c$ is the number of clusters and $d$ is the dimension of a data object. This is more natural than the binary representation:

$$x_p = (v_{11}, \ldots, v_{d1}, v_{12}, \ldots, v_{d2}, \ldots, v_{1c}, \ldots, v_{dc}) \qquad (6)$$

where the first $d$ values represent the first cluster center, the next $d$ values represent the second, and so forth.
3.2. Population initialization
On the one hand, as a result of the randomness and blindness of random selection methods, some meaningless clusters
irrelevant to the given data set might be yielded. On the other hand, the suitable initial cluster centers have a significant effect on
the performance of clustering [18]. Based on the above, we focus on chaotic map and apply it to initialization of chromosomes.
Among the chaotic maps, the Arnold cat map introduced by the Russian mathematician V.I. Arnold is one of the most widely
used, owing to its ergodicity and the dynamic properties of chaos variables [32, 33]. In IDCGA-based clustering algorithms, the
Arnold cat map is employed to generate sequences that substitute for random initial cluster centers, aiming at obtaining a diverse
initial population. The map is defined as

$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x_n \\ y_n \end{pmatrix} \pmod{1} \qquad (7)$$

where $x_n, y_n \in [0, 1)$ are the chaotic variables after $n$ iterations, and $x_{pj}$ is the $j$th variable of $x_p$, obtained by mapping the chaotic sequence into the range between the minimum and maximum values of the corresponding attribute of the given data set. This process is repeated until $M$ chromosomes are generated by iteratively using Eq. (7) and Eq. (8), which ensures that the initial population makes the best use of the inherent information of the data set. Thus, the $M$ initial chromosomes are uniformly distributed over the whole search space and contain genes that are as
diverse as possible.
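A minimal sketch of this initialization step, assuming a simple linear scaling of the chaotic sequence into each attribute's [min, max] range in place of the paper's Eq. (8) (whose exact form is not reproduced here); all names are illustrative:

```python
import numpy as np

def arnold_sequence(n, x0=0.3, y0=0.7):
    """Iterate Eq. (7): x' = (x + y) mod 1, y' = (x + 2y) mod 1; return the x track."""
    xs = np.empty(n)
    x, y = x0, y0
    for i in range(n):
        x, y = (x + y) % 1.0, (x + 2.0 * y) % 1.0
        xs[i] = x
    return xs

def init_population(data, c, M):
    """Build M chromosomes of c*d genes, scaled into each attribute's range."""
    d = data.shape[1]
    lo, hi = data.min(axis=0), data.max(axis=0)
    seq = arnold_sequence(M * c * d).reshape(M, c, d)
    return lo + seq * (hi - lo)          # assumed Eq.-(8)-style linear scaling

data = np.array([[0.0, 10.0], [2.0, 30.0], [4.0, 50.0]])
pop = init_population(data, c=2, M=5)    # 5 chromosomes, 2 centers each
```

Every generated center stays inside the attribute ranges of the data, so no chromosome encodes a cluster irrelevant to the given data set.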
3.3. Fitness function
In the IDCGA algorithm, as in other evolutionary algorithms, a fitness function is used to determine the fitness value so as to
evaluate the optimality of each chromosome $x_p$ ($1 \le p \le M$). The fitness value is then used to decide whether a
chromosome is eliminated or retained. In accordance with the principle of survival of the fittest, the more adaptive chromosomes
are kept, and the less adaptive ones are discarded in the process of generating a new population.
For a set of data $X = \{x_1, \ldots, x_k, \ldots, x_n\}$, FCM can provide the fitness metric for genetic optimization in the proposed algorithm framework. The reciprocal of the clustering objective function is defined as the fitness function used to evaluate all chromosomes. Thus, we adopt the fitness function

$$FIT(x_p) = \frac{1}{\sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \lVert x_k - v_i \rVert^{2}} \qquad (9)$$
Note that each chromosome is evaluated according to the clustering objective function of FCM, as shown in Eq. (2). It is then
easy to see that the smaller the objective function value, the better the clustering and the higher the fitness value
$FIT(x_p)$. Therefore, when the clustering objective function reaches its minimum, i.e., $FIT(x_p)$ reaches its maximum,
the best partition and the optimal cluster centers are obtained.
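Eq. (9) can be sketched directly; the toy data, membership matrix and the two candidate center sets below are illustrative:

```python
import numpy as np

def fcm_fitness(X, V, U, m=2.0):
    """Eq. (9): reciprocal of the FCM objective of Eq. (2)."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)   # squared distances (c, n)
    J = ((U ** m) * d2).sum()                                 # Eq. (2)
    return 1.0 / J

X = np.array([[0.0, 0.0], [1.0, 1.0]])
U = np.array([[1.0, 0.0], [0.0, 1.0]])         # crisp memberships for the toy case
V_near = np.array([[0.0, 0.1], [1.0, 0.9]])    # centers close to the data
V_far = np.array([[5.0, 5.0], [6.0, 6.0]])     # centers far from the data
```

`fcm_fitness(X, V_near, U)` is far larger than `fcm_fitness(X, V_far, U)`: the smaller the clustering objective, the higher the fitness.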
3.4. Modified evolution rule
A basic CGA shows a better capacity for global exploration than GA by constraining individuals' genetic operations to their
neighborhoods. However, it ignores the dynamic influence of individuals on one another during the evolutionary
process; that is to say, whether an individual is living or dead has no effect on the population evolution. Lu et al. [34] have
performed extensive experiments to compare the cellular genetic algorithm with evolutionary rule (CEGA), basic CGA and GA,
which indicates that CEGA is more efficient than CGA in terms of maintaining the diversity of population and conducting the
global exploration. Hence, the evolutionary rule is introduced in this paper to build a dynamic environment and synchronously
update states of individuals in the population.
In this dynamic environment, the evolutionary rule is made by the moderate-density cell in the same way as in [34], as
described below. Let $t$ be the current evolutionary generation, $S^t$ the state of the current individual in generation $t$,
$S^{t+1}$ its state in the next generation, and $S$ the number of living individuals in its neighborhood. The rule is
$$\text{If } S^t = 0:\quad S^{t+1} = \begin{cases} 1, & S \in \{4, 5, 6, 7\} \\ 0, & S \notin \{4, 5, 6, 7\} \end{cases} \qquad\qquad \text{If } S^t = 1:\quad S^{t+1} = \begin{cases} 1, & S \in \{2, 3, 4, 5\} \\ 0, & S \notin \{2, 3, 4, 5\} \end{cases} \qquad (10)$$
where 0 represents a dead individual and 1 represents a living individual. From the expression of evolutionary rule, it is seen that,
when the current individual is dead, it will be transformed into a living individual in the next generation as long as the condition
is met, i.e., there are 4, 5, 6 or 7 living individuals among its neighbors. Conversely, when the current individual is living, it will
be transformed into a dead individual in the next generation whenever the number of living individuals among its neighbors is not 2, 3, 4
or 5.
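The rule of Eq. (10) is a small lookup, sketched below (`next_state` is an illustrative name):

```python
def next_state(state, living_neighbours):
    """Eq. (10): revive a dead cell (0) with 4-7 living neighbours; keep a
    living cell (1) alive only with 2-5 living neighbours."""
    if state == 0:
        return 1 if living_neighbours in (4, 5, 6, 7) else 0
    return 1 if living_neighbours in (2, 3, 4, 5) else 0
```

For example, `next_state(0, 5)` revives a dead individual, while `next_state(1, 6)` kills an overcrowded living one.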
3.5. Construction of genetic operators
It is well known that the crossover and mutation operations have a significant impact on the behavior and performance of
CGA. In particular, the probabilities of crossover and mutation (hereafter referred to as Pc and Pm, respectively) largely
determine whether the algorithm can find a near-optimum solution and whether it can do so efficiently. A number of
guidelines have been discussed in the previously reported literature for selecting them [15, 35]. However, these generalized
guidelines were drawn from empirical studies on a fixed set of test problems, and were inadequate since the optimal Pc and Pm
are specific to the problem under consideration. In fact, the optimal Pc and Pm vary with the problem of concern,
even across different stages of the evolutionary process within a problem. At present, however, the genetic operations adopted in
existing CGAs, such as crossover and mutation, usually work at constant probabilities determined in advance [26-29]. Excessively
high probabilities would disrupt excellent individuals, while low values might result in the convergence of the algorithm to a
local optimum. In addition, since the search process of CGA is complex and non-linear, fixed values or a simple linear
transformation for probabilities of crossover and mutation would fail to explore the search space and cannot achieve the best
possible results. Consequently, a judicious choice of Pc and Pm is critical to the successful working of CGA. Furthermore,
biological evolution shows that Pc and Pm are dependent on the evolution state and should be adapted [36]. It is, thus, essential
to design a CGA that adapts itself to the appropriate Pc and Pm instead of using fixed values. In our approach, we aim at
achieving the trade-off between exploration and exploitation of CGA in a different manner, by adjusting them dynamically in
response to the current evolutionary condition of population and the distance between each individual and optimal individual. In
view of this, the new dynamic crossover and entropy-based two-combination mutation operators are constructed to prevent
premature convergence to a local optimum.
The crossover operator is the process of combining the information of two parent chromosomes so that two offsprings are
formed aiming at producing some diversified and promising new chromosomes. Crossover occurs only with some probability
Pc . The crossover probability affects the capability of CGA in exploiting a located hill to reach the local optima. The higher the
value of Pc, the quicker exploitation proceeds. But too large Pc will disrupt individuals faster than they can be exploited.
Hence, a dynamic adjusting strategy based on the variable sigmoid function is introduced to modify the crossover probability
adaptively. The individuals then undergo an arithmetic crossover, which uses a linear combination of the parents.
Let $f_{avg}$ be the average fitness value of the current population, $f_{max}$ the maximum fitness value of the population, and $f(x_i)$ the fitness of the individual to be crossed. $f_{max} - f(x_i)$ denotes the distance between the individual to be crossed and the optimal individual, while $f_{max} - f_{avg}$ represents, to some extent, the current evolutionary condition of the population. The crossover probability is adjusted as

$$P_c^t(i) = Max \cdot \frac{1}{1 + \lambda_1 \exp\!\left( - \frac{f_{max} - f(x_i)}{f_{max} - f_{avg}} \right)} \qquad (11)$$

where $Max$ is the maximum value allowed for $P_c^t(i)$ and $\lambda_1$ is a crossover factor that determines the threshold boundary of $P_c^t(i)$. It is obvious that the range of $P_c^t(i)$ under the proposed adaptation scheme always lies within the interval $\left( \frac{Max}{1 + \lambda_1},\ Max \right)$, and an appropriate range of $P_c^t(i)$ can be obtained by fine-tuning $\lambda_1$. Clearly, when $f_{avg}$ approaches $f_{max}$, $P_c^t(i)$ takes higher values for low-fitness solutions and lower values for high-fitness solutions. That is to say, in terms of the quality of clustering, the value of $P_c^t(i)$ decreases as the fitness of an individual increases, in order to reduce the possibility of disrupting a good solution by crossover.
Here, we use the arithmetic crossover to produce new chromosomes. Let $x_i$ and $x_j$ be the two parent chromosomes to be crossed, where $x_i$ is the current chromosome itself and $x_j$ is the chromosome with the highest fitness chosen from its neighbors. Then

$$x_i' = x_j + \gamma (x_i - x_j), \qquad x_j' = x_i + \gamma (x_j - x_i) \qquad (12)$$
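Eqs. (11) and (12) can be sketched as below. The settings `pc_max` (the paper's Max), `lam1` (λ1) and the combination coefficient `gamma` in the arithmetic crossover are illustrative assumptions, not values fixed by the paper at this point:

```python
import math

def pc_dynamic(f_i, f_avg, f_max, pc_max=0.9, lam1=1.0):
    """Eq. (11): sigmoid-shaped crossover probability in (pc_max/(1+lam1), pc_max)."""
    z = (f_max - f_i) / max(f_max - f_avg, 1e-12)   # scaled distance to the best
    return pc_max / (1.0 + lam1 * math.exp(-z))

def arithmetic_crossover(xi, xj, gamma=0.3):
    """Eq. (12): linear combination of the two parent chromosomes."""
    child_i = [b + gamma * (a - b) for a, b in zip(xi, xj)]
    child_j = [a + gamma * (b - a) for a, b in zip(xi, xj)]
    return child_i, child_j
```

The best individual (f_i = f_max) receives the lowest probability pc_max/(1 + lam1), protecting good solutions, while poor individuals approach pc_max.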
Exploration by mutation depends on the mutation pattern of the genes and the probability of applying this operator. The mutation
probability Pm controls the speed of CGA in exploring a new area. In fact, the diversity of population is likely to remain at a
relatively high level at the early evolution stage. While the generation goes on, all individuals in the population show a high level
of similarity at the later evolution stage, which leads to a relatively low level of diversity, thereby causing the convergence to a
local optimum. However, it is well known that maintaining the diversity of the population is essential for reaching the global
optimum. Thus, for a given problem, we have to carefully investigate the variation tendency of the population diversity
throughout the evolution, and design a two-combination mutation matched to the current condition according to the variation of
the diversity of the population.
With this concept in mind, we adopt the entropy originating from information theory [28] to measure the diversity of the population and guide the search process. Let M be the population size; the entropy of the lth gene in generation t is defined as:

DE_l(t) = − Σ_{i=1}^{M−1} Σ_{j=i+1}^{M} P_ij^t · log P_ij^t    (13)

where P_ij^t represents the similarity between the value of the lth gene of the ith chromosome and that of the lth gene of the jth chromosome:

P_ij^t = 1 − |x_il^t − x_jl^t| / (b_l − a_l)    (14)

where a_l and b_l are the minimum and maximum values of the lth gene, respectively.
The population entropy DE(t) is equal to the average of the entropies of the different genes, as shown in Eq. (15):

DE(t) = (1/L) Σ_{l=1}^{L} DE_l(t)    (15)
The diversity of the population can be derived from Eq. (15). It is worth noting that DE(t) has a theoretical minimum of zero when the population is made up only of copies of the same chromosome. Accordingly, the more varied the chromosomes, the larger the population entropy and the better the diversity.
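Eqs. (13)-(15) can be implemented directly; the sketch below assumes chromosomes are lists of real-valued genes and bounds is a list of (a_l, b_l) pairs (the function names are illustrative).

```python
import math

def gene_entropy(values, a_l, b_l):
    """Entropy DE_l(t) of the lth gene across the population (Eqs. (13)-(14)).
    Identical gene values give P_ij = 1 and contribute nothing, so a fully
    converged population has entropy zero."""
    de = 0.0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            p = 1.0 - abs(values[i] - values[j]) / (b_l - a_l)
            if p > 0.0:
                de -= p * math.log(p)
    return de

def population_entropy(population, bounds):
    """Population entropy DE(t) of Eq. (15): average of per-gene entropies."""
    total = sum(
        gene_entropy([chrom[l] for chrom in population], a_l, b_l)
        for l, (a_l, b_l) in enumerate(bounds)
    )
    return total / len(bounds)
```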
In the following, the definitions presented above are used to design a two-combination mutation operator that works by introducing a general mutation and a large disturbance in two stages, respectively. Here, a dynamic adjusting strategy is also employed to adaptively select an appropriate value of the mutation probability for each individual. Assume the detection threshold of the population entropy is MET_min^1, with MET_min^1 = k1·DE_max, where DE_max is the maximum value of the population entropy and k1 is a disturbance factor that determines the disturbance frequency in the search process.
The adaptive mutation probability Pm_t(i) is constructed in the same sigmoid form as Eq. (11):

Pm_t(i) = Max2 / (1 + exp(α2·(1 − (f_max − g(x_i)) / (f_max − f_avg))))    (16)

where Max2 is the maximum value allowed for Pm_t(i), α2 is a mutation factor that determines the threshold boundary of Pm_t(i), f_avg, f_max, f_max − g(x_i) and f_max − f_avg are the same as defined above, and g(x_i) is the fitness of the chromosome under mutation. In the large-disturbance stage described below, the disturbance probability is β·Pm_t(i), where β is a constant greater than 4 that restrains the mutation rate. Here, the mutation factor is set to 2, the same value as used in Ref. [16]. It is obvious that the range of Pm_t(i) under the proposed adaptation scheme is always within the interval (Max2/(1 + exp(α2)), Max2), and an appropriate range of Pm_t(i) can be obtained by fine-tuning α2.
From the expressions of Pc_t(i) and Pm_t(i), it can be seen that these probabilities, based on variable sigmoid functions, are modified dynamically for each individual in real time, as opposed to existing studies that set single fixed values of Pc and Pm for all individuals in the population. The algorithm can adjust Pc_t(i) and Pm_t(i) adaptively according to the current evolutionary condition of the population and the distance between the current individual and the optimal individual. More specifically, the values of Pc_t(i) and Pm_t(i) are high when the current individual is far away from the current optimal individual of the population; in contrast, when the current individual and the current optimal individual are very close, Pc_t(i) and Pm_t(i) will be relatively low.
Then, with the mutation probability, the corresponding judging criteria for the above two-combination mutation are as follows.
In the first case, i.e., when MET_min^1 ≤ DE(t) ≤ DE_max, a relatively high level of population diversity is maintained, as at the early stage of evolution. As soon as the judging criterion detects this condition, a general mutation operator is implemented using the variant of Gaussian mutation given below, with an adaptive probability Pm_t(i) that is reduced as the fitness increases. Due to epistatic interactions, small changes in the individuals tend to bring about big differences in the fitness values.
In the second case, i.e., when 0 ≤ DE(t) < MET_min^1, the diversity of the population has fallen below the detection threshold MET_min^1, since most individuals in the population are highly similar at the later stage of evolution. As soon as the judging criterion detects this condition, in which the algorithm is close to converging to a local optimum, it tries to jump out of the local optimum by promoting exploration. A large disturbance with a high probability is then performed to introduce more diversity into the population and improve the capability of global search. According to this reasoning, the solutions are disrupted with probability β·Pm_t(i), and thus better solutions can be generated. Therefore, the IDCGA algorithm can escape from the local optimum.
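The two judging criteria above reduce to a single threshold comparison on the population entropy; a minimal sketch (the function name, return labels and the default k1 are illustrative assumptions):

```python
def choose_mutation_mode(de_t, de_max, k1=0.3):
    """Select the stage of the two-combination mutation from the current
    population entropy DE(t) and the threshold MET1 = k1 * DE_max."""
    met1 = k1 * de_max
    if de_t >= met1:
        return "general"      # first case: ordinary adaptive Gaussian mutation
    return "disturbance"      # second case: large disturbance with prob. beta * Pm(i)
```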
Preliminary experiments were carried out to examine the sensitivity of the mutation condition to the choice of β, in which different values representing low to high restrictive conditions were assessed: 4, 6, 8 and 10. Based on these tests, it was found that the global convergence rate increases as β increases, while the run time first decreases and later increases as β grows. This makes the selection of a single value of β for all the considered problems meaningful; thus, the best value of β is selected in terms of efficiency and speed.
The variant of Gaussian mutation on which the mutation process is based was mentioned above and is written as follows. Let x_ij' be the gene value of x_ij after the mutation. Then, with mutation probability Pm_t(i), x_ij' is given by

x_ij' = x_ij + σ(t)·N(0, 1)    (17)
σ(t) = σ(0)·exp(−t / (τ·T_max))    (18)

where N(0, 1) is a normally distributed random number, σ(t) is an adaptive mutation step size that determines the disturbance amplitude, σ(0) is the initial value of the mutation step size and can be set according to the specific clustering problem, t and T_max are the same as defined above, and τ is the time constant. The improved σ(t) in Eq. (18) modifies the mutation step size dynamically in real time according to the current condition, thereby achieving self-adaptive changes: the algorithm searches with a larger step size at the early evolution stage to ensure the ability of global search, and with a smaller step size at the later evolution stage to ensure the ability of local search. Accordingly, it can be stated that the two-combination mutation operation makes full use of the flexible coordination of population entropy and sigmoid functions, overcoming the disadvantages brought about by an improper constant mutation probability and a fixed mutation step size.
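The decaying step size of Eq. (18) and the perturbation of Eq. (17) can be sketched as follows; the defaults for sigma0 and tau are illustrative, not values from the paper.

```python
import math
import random

def mutation_step(t, t_max, sigma0=1.0, tau=0.5):
    """Adaptive mutation step size of Eq. (18): decays from sigma0 so the
    search moves from coarse global steps to fine local ones."""
    return sigma0 * math.exp(-t / (tau * t_max))

def gaussian_mutate(x_ij, t, t_max, sigma0=1.0, tau=0.5):
    """Variant of Gaussian mutation (Eq. (17)): perturb one gene by a
    normally distributed step scaled by the current step size."""
    return x_ij + mutation_step(t, t_max, sigma0, tau) * random.gauss(0.0, 1.0)
```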
The IDCGA algorithm for fuzzy clustering (IDCGA-FCM) can be described as follows:
Algorithm 2. IDCGA-FCM
Require: Dataset and number of clusters c;
1. Initialize the parameters for FCM and IDCGA, including the population size M, Max1, Max2, k1, m and T_max;
2. Adopt the chaotic map to generate initial cluster centers and create an initial population P(0) using Eq. (8);
3. Classify each individual as living or dead on the k × k grid at random;
4. Decode each individual to obtain the cluster centers, and calculate the membership degrees using Eq. (5);
5. Calculate the J of each individual using Eq. (2) based on U and V;
6. Calculate the fitness value of each individual using Eq. (9), and store f_avg and f_max at each iteration;
7. Calculate the population entropy DE(t) using Eq. (15);
…
return the best individual.
4. Hybrid fuzzy c-means and improved cellular genetic algorithm for the clustering problem
The FCM algorithm is faster than the IDCGA algorithm, since it only uses the gradient-descent method and evaluates fewer functions, but it is easily trapped in local minima. In this paper, IDCGA is integrated with FCM to form a hybrid clustering algorithm called IDCGA2-FCM, which takes full advantage of the merits of both IDCGA and FCM. This is expected to lead to a more accurate balance between global exploration and local exploitation, thereby achieving excellent clustering results and a faster convergence speed.
In order to design the combination of IDCGA and FCM more effectively, the key issue is to determine when the FCM iterations are employed and how to perform them adaptively within the algorithm. In fact, the algorithm should ensure the ability of global search and enhance exploration at the early evolution stage; on the contrary, it should ensure the ability of local search and enhance exploitation at the later evolution stage.
On the basis of the above discussion, we also use the population entropy to measure the diversity of the population so as to detect the current condition. Suppose the convergence threshold of the population is MET_min^2, with MET_min^2 = k2·DE_max, where DE_max is the same as defined above and k2 is a combination factor that adjusts the proportion of FCM and IDCGA in the whole search process. The implementation strategy adopted in this paper can be expressed as follows:
In the first stage, if MET_min^2 ≤ DE(t) ≤ DE_max, i.e., the diversity of the population is maintained at a relatively high level in generation t, the algorithm prefers exploration and favors global search. In this case, IDCGA2-FCM only makes full use of the strong global search capability of IDCGA.
In the second stage, if DE(t) < MET_min^2, i.e., the diversity of the population drops to a relatively low level and falls below the convergence threshold MET_min^2, the algorithm prefers exploitation and favors local search. Under this circumstance, FCM is performed after the genetic operations to strengthen the ability of local search, with the goal of guiding the individuals to approach the optimal solution quickly. The specific criterion can be described as follows:
(1) Only the top 38.2% ranked individuals are subjected to accurate local search using FCM iterations as follows:
(A) Set the iteration counter to 0; set the number of iterations Gd; get the cluster centers by decoding each individual.
(B) Calculate the membership degrees using Eq. (5).
(C) Calculate the cluster centers for each individual using Eq. (4).
(D) If the counter is less than Gd, increment it and go to step (B); otherwise stop this loop and do as follows: calculate the J using Eq. (2) and then update each individual by coding the cluster centers.
(2) The unselected 61.8% of individuals maintain the original population structure; thus, the balance between global search and local search is preserved.
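The golden-section split between steps (1) and (2) can be sketched as follows; the helper name is an assumption, and fitness is assumed to be "larger is better":

```python
def golden_section_split(population, fitness):
    """Sort individuals by fitness and return the top 38.2% (sent to FCM
    local search) and the remaining 61.8% (kept unchanged)."""
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    cut = max(1, round(0.382 * len(population)))
    top = [population[i] for i in order[:cut]]
    rest = [population[i] for i in order[cut:]]
    return top, rest
```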
7. Calculate the population entropy DE(t) using Eq. (15);
8. Update the states synchronously by the evolution rule using Eq. (10);
9. Select each living individual and take its neighborhood as parents;
10. Update the crossover probability of each individual using Eq. (11);
11. Update the mutation probability of each individual using Eq. (16);
12. Implement the parents' recombination with an adaptive probability Pc_t(i) using Eq. (12);
13. If MET_min^1 ≤ DE(t) ≤ DE_max, implement each individual's general mutation with an adaptive probability Pm_t(i) using Eq. (17);
…
16. Sort all the individuals within the population in terms of their fitness values, and adopt the golden-section method to determine the excellent individuals; the top 38.2% ranked individuals are then subjected to the following operations, while the unselected 61.8% maintain the original population structure.
(A) Set the iteration counter to 0; set the number of iterations Gd; get the cluster centers by decoding each individual.
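Steps (A)-(D) are the standard FCM alternation; a NumPy sketch, assuming Eq. (5) is the usual membership update and Eq. (4) the usual center update (function name and defaults are illustrative):

```python
import numpy as np

def fcm_refine(X, V, m=2.0, g_d=5):
    """Run g_d FCM iterations (steps (A)-(D)) starting from the cluster
    centers V decoded from one individual; X is (n, d), V is (c, d)."""
    for _ in range(g_d):
        # (B) membership degrees: u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1))
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12
        u = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
        # (C) cluster centers: v_i = sum_j u_ij^m x_j / sum_j u_ij^m
        um = u ** m
        V = (um @ X) / np.sum(um, axis=1, keepdims=True)
    return V, u
```

After the loop, step (D) would evaluate J = Σ_i Σ_j u_ij^m ||x_j − v_i||² and re-encode the refined centers into the individual.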
The time complexity of a clustering algorithm is one of the most significant issues of concern. The analysis of the proposed algorithms is as follows.
Fitness evaluation: Each individual is decoded to obtain the cluster centers and assigned the membership degrees using FCM. On one hand, the time complexity of assigning the membership degrees is O(ncd). On the other hand, the fitness calculation for each individual needs O(n) time. Assuming the population size is M in every generation, the total time complexity for fitness evaluation is O(Mncd).
Measure of population diversity: Because of the adoption of entropy, the time complexity of calculating the population diversity is O(M(M−1)cd/2).
Genetic operations: Each genetic operation requires O(Mcd) time in the worst case. As a result, the time complexity of the population for genetic operations is O(Mcd).
Integration of FCM: As mentioned above, the time complexity of calculating the membership degrees is O(ncd). In addition, the time complexity of updating the cluster centers for each individual is O(cd). Suppose that the number of iterations is equal to Gd and a number of individuals are chosen for the FCM iterations.
Therefore, the time complexity of one iteration cycle of IDCGA-FCM is O(MncdT_max), and the total time complexity of IDCGA-FCM is O(Mcd(1 + nT_max)). As for IDCGA2-FCM, when M is greater than Gd, the time complexity of its iteration cycle is the same as that of IDCGA-FCM; otherwise, the time complexity of its iteration cycle is O(ncdGdT_max), and the total time complexity of IDCGA2-FCM is O(cd(M + nGdT_max)). However, it is worth noting that the FCM iterations are only performed to favor local search when the diversity of the population drops below a certain level, rather than during the whole process. The practical time used by IDCGA2-FCM is therefore much less than that of IDCGA-FCM.
(Figure: flowchart of the IDCGA2-FCM algorithm. Initialize the population using the Arnold cat map; get the cluster centers by decoding each individual; assign the membership degrees using FCM; calculate fitness using the fitness function and store the average and maximum fitness values; update states by the evolution rule; select the living individuals and their neighborhoods as parents; perform the dynamic crossover and the entropy-based two-combination mutation; calculate the population entropy; if DE(t) < MET_min^2, sort all individuals, determine the excellent individuals using the golden-section method and perform the FCM iterations until the stopping criterion on Gd is met; increment the generation counter and repeat until the termination condition is satisfied; then obtain the optimal solution, assign objects into clusters and output the clustering result.)
For the purpose of verifying the performance of both methods proposed in this study (IDCGA-FCM and IDCGA2-FCM), several FCM-type clustering algorithms (the FCM algorithm [4], the standard genetic FCM clustering algorithm (GA-FCM) [20], the adaptive genetic FCM clustering algorithm (AGA-FCM) [21], the particle swarm optimized FCM clustering algorithm (PSO-FCM) [37], the hybrid clustering algorithm based on fuzzy PSO and FCM (FCM-FPSO) [38], and the hybrid clustering algorithm based on improved PSO and FCM (FCM-IDPSO) [39]) are chosen for extensive comparative analysis. A series of experiments was conducted on synthetic data sets and UCI data sets, aiming to verify the accuracy and efficiency of the presented algorithms.
The experiments were implemented on eight synthetic data sets and eight real-world data sets. The former allow for better control of data behavior and a better understanding of the methods' performance; for the latter, the methods are assessed using several well-known data sets from the UCI Machine Learning Repository [40]. As for the synthetic data sets, the data points in each group were randomly generated by Gaussian distributions with the centroids as the mean and a random number as the variance. As shown in Fig. 3, six of the data sets, i.e., DS1, DS2, DS3, DS4, DS5 and DS6, have unequal sizes and show different overlapping levels and cluster shapes, while DS7 and DS8 are two large data sets. All of them represent various difficulties for clustering. As for the real-world data sets, eight widely used data sets were chosen: Iris, Wine, Glass, Heart disease, Cancer, Prima, Image segmentation and Landsat Satellite. A detailed description of these data sets can be obtained from the UCI Repository [40]. For convenience, we summarize the main characteristics of the 16 data sets in Tables 2 and 3. As shown in Table 2, the three columns give the number of objects n, the number of clusters c and the number of variables d. All the possible extreme values of the clustering objective function obtained by FCM over 2000 different runs are listed in Table 3.
Fig. 3. Part of the synthetic data sets. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6.
Table 2
Dataset  n    c  d    Dataset  n    c  d
DS1      200  4  2    Iris     150  3  4
Table 3
DS6                 5.4062e+03 9.8901e+03 1.0924e+04 1.1116e+04 1.1745e+04 1.2121e+04 1.2854e+04 1.3297e+04 1.3668e+04
DS8                 1.3095e+06 1.7158e+06 1.7480e+06 1.7553e+06 1.7640e+06 1.7791e+06 1.7848e+06 1.7907e+06 1.8071e+06
Glass               154.1460 154.9533 157.9265 159.5464 159.9705 160.4936 160.5597 160.6937 165.2214 168.3052
Cancer              1.4917e+04
Prima               3.9868e+06
Image segmentation  5.7975e+06 5.8388e+06 5.8526e+06
Landsat Satellite   7.7086e+06 7.7087e+06 7.7088e+06 7.7257e+06 8.0574e+06 8.5070e+06
From Tables 2 and 3, it can be seen that the chosen data sets cover small/large numbers of objects, low/medium/high dimensionality, small/medium/large numbers of clusters and single/multiple extrema. We consider that these characteristics make the data sets attractive and meaningful enough to support our claims, allowing us to draw clear conclusions about the accuracy and efficiency of the proposed algorithms.
Six fuzzy cluster validity indices (CVIs) are adopted: PC, PE, XB, FS, PBMF and SC, as shown in Table 1. Larger values of the PC and PBMF indices indicate better fuzzy clustering results, whereas smaller values of the PE, XB, FS and SC indices are expected.
Table 1
Fuzzy cluster validity indices.

Partition coefficient:      PC = (1/n) Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m,    Max(PC)
Partition entropy:          PE = −(1/n) Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij log μ_ij,    Min(PE)
Xie-Beni function:          XB = [Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m ||x_j − v_i||²] / [n · min_{i≠j} ||v_i − v_j||²],    Min(XB)
Fukuyama-Sugeno function:   FS = Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m (||x_j − v_i||² − ||v_i − v̄||²),    Min(FS)
Pakhira-Bandyopadhyay-Maulik fuzzy index:  PBMF = ((1/c)·(E_1/E_c)·D_c)²,    Max(PBMF)
    where E_1 = Σ_{j=1}^{n} ||x_j − v̄||, E_c = Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m ||x_j − v_i||, D_c = max_{i,j=1,…,c} ||v_i − v_j||
SC function:                SC = Σ_{i=1}^{c} [Σ_{j=1}^{n} μ_ij² ||x_j − v_i||²] / [Σ_{j=1}^{n} μ_ij · Σ_{l=1}^{c} ||v_i − v_l||²],    Min(SC)
In order to take advantage of each fuzzy CVI and obtain the true partitioning results, we optimize two combined objective functions simultaneously, instead of a single CVI, during the clustering process. They are defined as follows:

Φ(c) = PC(c) − XB(c)    (19)

PC(c) considers only the membership functions, whereas XB(c) considers both the membership functions and the geometrical properties of the data structure, measuring compactness and separation. A larger value of Φ implies better fuzzy clustering results.

Ψ(c) = PE(c) + SC(c)    (20)

PE(c) uses only the fuzzy memberships and may lack a connection to the geometrical structure of the data; SC(c) simultaneously takes into account the fuzzy memberships and the data structure. A smaller value of Ψ means a better solution.
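Under this reading of Eq. (19), the combined objective can be sketched as follows (u is the (c, n) membership matrix; the function names are illustrative and the index implementations follow the definitions in Table 1):

```python
import numpy as np

def pc_index(u, m=2.0):
    """Partition coefficient of Table 1: (1/n) * sum_ij u_ij^m (larger is better)."""
    return np.sum(u ** m) / u.shape[1]

def xb_index(X, V, u, m=2.0):
    """Xie-Beni index of Table 1: compactness over separation (smaller is better)."""
    d2 = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) ** 2
    compactness = np.sum((u ** m) * d2)
    separation = min(
        np.sum((V[i] - V[j]) ** 2)
        for i in range(len(V)) for j in range(len(V)) if i != j
    )
    return compactness / (X.shape[0] * separation)

def phi(X, V, u, m=2.0):
    """Combined objective of Eq. (19): PC(c) - XB(c), to be maximized."""
    return pc_index(u, m) - xb_index(X, V, u, m)
```

A crisp, well-separated partition gives PC close to 1 and XB close to 0, so Φ approaches its maximum.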
A convergence threshold of 10⁻⁵ is set. The probabilities of crossover and mutation for GA-FCM and AGA-FCM are Pc = 0.8 and Pm = 0.05, respectively. Because of the different characteristics of the data sets, we use a different maximum number of generations T_max for each: 500 generations for DS7 and Glass, 1000 generations for DS8, and 3000 generations for Image segmentation and Landsat Satellite. The remaining parameter values are … for α1, 1.5 for α2, 0.3 for k1, 0.5 for k2, 8 for β and 5 for Gd. Finally, all the algorithms share the same stopping criterion: convergence, or reaching the maximum number of generations defined for the data set.
In this section, IDCGA-FCM and IDCGA2-FCM are compared with FCM, GA-FCM and AGA-FCM through experiments on the two kinds of data sets described above. The comparison with FCM is important to demonstrate the significant improvement over the traditional clustering method. The comparisons against GA-FCM and AGA-FCM, the predecessors of IDCGA-FCM and IDCGA2-FCM, were carried out because both of them have shown better results than FCM. Therefore, it is more convincing to use all of them to validate the efficiency and superiority of the proposed algorithms.
Firstly, we compare the speed of convergence of FCM, the two GA-based algorithms and the two proposed IDCGA-based clustering algorithms. In the experiments, the five algorithms were run 100 times independently for each data set, with randomly generated initializations for each repetition; DS8 and Landsat Satellite are two exceptions because of their higher complexity and the large amount of computation time they require compared with the other data sets. Average values were recorded to account for the stochastic nature of the algorithms. For a better view of the results, the average values of the J obtained by the five algorithms on the synthetic and real-world data sets are displayed in the plots of Fig. 4 and Fig. 5, respectively. For a more careful comparison among the algorithms, the mean and standard deviation values of the J of the five algorithms are recorded in Tables 4 and 5, respectively. All the experiments were implemented on a computer with a 3.50 GHz CPU and 4 GB of RAM. Fig. 6 gives the run times of FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM.
Fig. 4. Clustering results for the eight synthetic data sets. The figures plot the average J obtained by FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6. (g) DS7. (h) DS8.
Fig. 5. Clustering results for seven real-world data sets. The figures plot the average J obtained by FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM. (a) Iris. (b) Wine. (c) Heart. (d) Cancer. (e) Glass. (f) Image segmentation. (g) Landsat Satellite.
It can be seen from Figs. 4 and 5 that, compared with the other four algorithms, the hybrid IDCGA2-FCM obtained better results on all the data sets and can escape from local optima. It is also notable that IDCGA-FCM and IDCGA2-FCM converge to the desired values and improve the J over FCM, GA-FCM and AGA-FCM for most data sets, both of which indicate their superiority. By comparison, GA-FCM and AGA-FCM cannot prevent convergence to a local optimum, while FCM is easily trapped in local minima within a short time. Furthermore, IDCGA2-FCM succeeded in converging to the global minimum in relatively few iterations. The key reason for this good performance is that the search of IDCGA-FCM and IDCGA2-FCM is more flexible and better directed, owing to the adoption of the dynamic crossover and two-combination mutation operations, as well as the integration of the FCM iterations.
Table 4
The average and standard deviation values of the J obtained by the five algorithms for the eight synthetic data sets. Best results are highlighted in bold.
As shown in Table 4, IDCGA-FCM and IDCGA2-FCM achieve better performance than FCM, GA-FCM and AGA-FCM on the well-separated, hyperspherical and overlapping clusters of the synthetic data sets, and they can escape from local optima. Moreover, IDCGA-FCM and IDCGA2-FCM had equal averages on the first data set, but with different standard deviations. For the other data sets, IDCGA2-FCM always achieved the best average and the smallest standard deviation of the J, followed by IDCGA-FCM. For example, IDCGA2-FCM reached the optimal J on all synthetic data sets in every run. Nevertheless, GA-FCM and AGA-FCM cannot effectively overcome premature convergence to local optima, which illustrates that these combinations of GAs and FCM perform poorly in the search for the global optimum. In addition, it is noticeable that FCM is the worst approach, since it outperforms neither GA-FCM nor AGA-FCM on any of the synthetic data sets, which represent multiple-extrema problems.
Table 5
The average and standard deviation values of the J obtained by the five algorithms for the eight real data sets. Best results are highlighted in bold.

Dataset    FCM                     GA-FCM                  AGA-FCM                 IDCGA-FCM               IDCGA2-FCM
Wine       1.8232e+06±1.1759e+05   1.7987e+06±3549.7000    1.7967e+06±2166.7000    1.7963e+06±1509.9000    1.7961e+06±2.1080e-06
Cancer     1.4917e+04±1.8115e-07   1.5065e+04±136.5500     1.4999e+04±91.4336      1.4930e+04±64.2690      1.4917e+04±7.2776e-08
Prima      3.9868e+06±1.3061e-06   3.9933e+06±5.5382e+03   3.9919e+06±2.8462e+03   3.9901e+06±2.7909e+03   3.9868e+06±2.8451e-07
Image      5.7516e+06±5.3511e+04   5.8802e+06±4.0529e+04   5.8093e+06±4.0288e+04   5.7398e+06±2.3847e+04   5.7142e+06±2.1472e+04
Satellite  7.9747e+06±9.7453e+04   8.9902e+06±4.5908e+04   8.2143e+06±2.4465e+04   7.7147e+06±1358.7345    7.7086e+06±3.8909e-06
Table 5 demonstrates that the value of the J obtained by IDCGA2-FCM is always better than that obtained by the other algorithms on the same real-world data sets. We can clearly see that IDCGA2-FCM provides adequate efficiency and accuracy for data sets with a large number of clusters or extrema, especially Image segmentation and Landsat Satellite, although there is no obvious advantage on data sets with a single extremum, such as Cancer and Prima. Although IDCGA-FCM is inferior to IDCGA2-FCM, it obtained better results than GA-FCM and AGA-FCM on all data sets, and it surpasses FCM on all data sets but two (Cancer and Prima). Besides, the experimental results reveal that when the size of the data set is small (in number of objects or clusters), GA-FCM and AGA-FCM surpass FCM, but with increasing data set size, FCM obtains better results than GA-FCM and AGA-FCM. Additionally, both IDCGA-FCM and IDCGA2-FCM show better stability and robustness than the two GA-based methods considered in the comparison. The major causes of these differences in clustering results are that, on the one hand, IDCGA-FCM and IDCGA2-FCM take full advantage of IDCGA's global search capability, which prevents premature convergence to local extrema, compared with FCM, GA-FCM and AGA-FCM; on the other hand, IDCGA2-FCM makes use of the strong local search ability of the FCM iterations, which makes it converge exactly to the optimum, compared with GA-FCM, AGA-FCM and IDCGA-FCM.
Fig. 6. Run times of the five algorithms for the 16 data sets. (a) Six synthetic data sets. (b) Six UCI data sets. (c) Four large data sets.
As can be seen from Fig. 6, the run time of IDCGA2-FCM is significantly less than those of GA-FCM, AGA-FCM and IDCGA-FCM, with the time consumption reduced by about 50%-90%. By comparison, the run time of IDCGA-FCM is slightly less than those of GA-FCM and AGA-FCM when the data set is small, but when handling data sets of large size and high dimensionality, the clustering time of IDCGA-FCM is greater than those of GA-FCM and AGA-FCM (see Fig. 6c). Thus, there is still a deficiency in its computation time. However, IDCGA2-FCM makes full use of the FCM iterations to accelerate convergence during the evolution process, while IDCGA-FCM does not; that is why IDCGA2-FCM converges in a relatively shorter time.
By considering Tables 4-5 and Fig. 6 together, we note that the hybrid IDCGA2-FCM obtained results superior to those of all four other algorithms on data sets of high complexity. Therefore, it can be concluded that IDCGA2-FCM is an efficient clustering algorithm that achieves very encouraging results in terms of both the quality of the solutions found and the convergence speed.
In the following, three CVIs are employed to evaluate the clustering accuracy of the five algorithms. We recorded the average and standard deviation values of the three CVIs for the five algorithms on all of the test data sets; the results obtained on each of the 16 data sets are included in Tables 7-9.
Through a careful analysis of the results in Tables 7-9, we find that the three CVIs are not always simultaneously best for the same algorithm; in such situations, we regard the algorithm with the most optimized CVIs as the best. It is obvious that in most cases the three CVIs obtained by IDCGA2-FCM are better than those obtained by the other four algorithms. It is also notable that IDCGA-FCM achieved better CVI values than FCM, GA-FCM and AGA-FCM for most data sets.
Table 7
The average and standard deviation values of the first CVI obtained by the FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM methods for the 16 data sets. Best results are highlighted in bold.
Table 8
The average and standard deviation values of the second CVI obtained by the FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM methods for the 16 data sets. Best results are highlighted in bold.

Glass   1.0720±0.0398       1.1457±0.0367   1.1264±0.0412   1.0905±0.0188   1.0621±4.0967e-06
Cancer  0.5583±1.2271e-07   0.5681±0.0050   0.5630±0.0027   0.5618±0.0041   0.5583±9.8406e-11
Image   1.4483±0.0365       1.4809±0.0691   1.3792±0.0528   1.2709±0.0457   1.2684±0.0111
Table 9
The average and standard deviation values of the third CVI obtained by the FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM methods for the 16 data sets. Best results are highlighted in bold.
Let us analyze Table 7 first: we can observe that IDCGA2-FCM obtains the largest values in all data sets except Wine and Heart, whereas IDCGA-FCM yields the best results in six data sets and FCM only does well in two data sets with a single extremum. Table 8 shows that IDCGA-FCM and IDCGA2-FCM obtain better results than the other three algorithms with respect to the second CVI, since they yield the best values in 4 and 14 out of the 16 data sets, respectively. In terms of the third CVI, IDCGA2-FCM obtains better results than the other four algorithms, as seen in Table 9. For this validity measure, IDCGA2-FCM yields the largest values in 13 out of the 16 data sets, only failing on Wine, Heart and Pima, which are better addressed by IDCGA-FCM. Moreover, the standard deviation values obtained by IDCGA2-FCM in Section 5 tend to be small relative to their corresponding average values, which reveals that its clustering performance is robust to initialization. Consequently, we can conclude that the clustering accuracy of IDCGA2-FCM is better than that of all the compared algorithms in most cases.
The experiments of the previous section showed that both IDCGA-based clustering algorithms proposed in this paper achieve excellent performance, particularly IDCGA2-FCM, which obtained very encouraging results for the J, the convergence speed and the CVIs.
To further verify how competitive IDCGA2-FCM is, it is convincing to compare its performance with other PSO-based clustering methods that are representative of state-of-the-art algorithms. For that purpose, three algorithms were chosen: PSO-FCM [37], FCM-FPSO [38] and FCM-IDPSO [39]. The main reason for this choice is that these methods were published recently and show good results. Next, we briefly describe these methods, including the parameter settings used in the subsequent experiments.
The PSO-FCM employs particle swarm optimization (PSO) to solve the problem of classifying instances in databases. The FCM-FPSO combines fuzzy particle swarm optimization (FPSO) with FCM, making use of the merits of both algorithms. In FCM-FPSO, FCM is applied to FPSO every few iterations so that the fitness value of each particle is improved. The FCM-IDPSO is a hybrid method for fuzzy clustering that combines the improved self-adaptive particle swarm optimization (IDPSO) and FCM. In FCM-IDPSO, FCM is likewise applied to IDPSO every few iterations. Specifically, we used the real-coded versions of the algorithms and the same parameter settings used by their authors. The algorithms have the same population size and stopping criterion as in the previous section. In the experiments, the four algorithms were assigned the following values for their parameters:
for IDCGA2-FCM: the same values used in Section 5;
for PSO-FCM: wmax = 0.9, wmin = 0.4, c1 = c2 = 2.0;
for FCM-FPSO: wmax = 0.9, wmin = 0.1, c1 = c2 = 2.0;
for FCM-IDPSO: winitial = 0.9, wfinal = 0.4, c1(t) = c2(t) = 2.0 at instant t = 1, and the remaining control parameter set to 100.
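The shared idea of these hybrids — running a population-based global search and periodically polishing every candidate with a few FCM iterations — can be sketched as follows. This is a simplified illustration under our own assumptions: the mutate-and-select step stands in for the actual PSO/CGA updates, and all function names are ours, not the papers'.

```python
import numpy as np

def fcm_refine(X, V, m=2.0, iters=2, eps=1e-10):
    """A few standard FCM iterations applied to one candidate center set V."""
    for _ in range(iters):
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + eps
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)          # membership update
        V = ((U ** m) @ X) / (U ** m).sum(axis=1, keepdims=True)  # center update
    return V

def objective_J(X, V, m=2.0, eps=1e-10):
    """Fuzzy objective J for a center set V, with memberships derived from V."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    U = inv / inv.sum(axis=0, keepdims=True)
    return ((U ** m) * d2).sum()

def hybrid_search(X, c, pop_size=10, generations=40, Gd=5, sigma=0.5, seed=0):
    """Toy population search with FCM refinement applied every Gd generations."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initialize candidate center sets by sampling data points
    pop = [X[rng.choice(n, c, replace=False)].copy() for _ in range(pop_size)]
    for g in range(1, generations + 1):
        # Global search step: perturb each candidate, keep the better one
        for i, V in enumerate(pop):
            trial = V + rng.normal(0.0, sigma, V.shape)
            if objective_J(X, trial) < objective_J(X, V):
                pop[i] = trial
        if g % Gd == 0:                       # periodic local refinement by FCM
            pop = [fcm_refine(X, V) for V in pop]
    return min(pop, key=lambda V: objective_J(X, V))
```

In IDCGA2-FCM the analogous refinement interval is the parameter Gd examined later in this section.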
The experiments were conducted here on the six real data sets in the same way as those in the previous section. The average and standard deviation values of the J obtained by the four algorithms are presented in Table 10. Because of divergence, PSO-FCM terminates only when reaching the maximum number of generations defined for each data set. Thus, we compared the convergence speed of FCM-FPSO, FCM-IDPSO and IDCGA2-FCM, as depicted in Fig. 7.
Table 10
The average and standard deviation values of the J obtained by four algorithms. Best results are highlighted in bold.
It can be seen from Table 10 that the results of IDCGA2-FCM for all data sets are better than those of the PSO-based algorithms, which indicates that IDCGA2-FCM clearly outperforms PSO-FCM, FCM-FPSO and FCM-IDPSO. Comparatively, FCM-FPSO and FCM-IDPSO only report the best results for the J on Heart, while PSO-FCM always obtained the largest values. It is also seen that FCM-FPSO, FCM-IDPSO and IDCGA2-FCM required small numbers of iterations as a result of their combination with FCM, as shown in Fig. 7. However, PSO-FCM does not integrate FCM iterations to achieve a faster convergence; thus, it was the worst one concerning convergence speed. Obviously, IDCGA2-FCM shows the smallest number of iterations on all data sets except Image segment. Even though FCM-FPSO and FCM-IDPSO required fewer iterations than IDCGA2-FCM on Image segment, this led to poor results for the J. Accordingly, it can be concluded that IDCGA2-FCM is a highly competitive algorithm and more effective than PSO-FCM, FCM-FPSO and FCM-IDPSO for clustering problems.
[Bar chart omitted: number of iterations of FCM-FPSO, FCM-IDPSO and IDCGA2-FCM on Iris, Wine, Heart, Glass, Pima and Image segment.]
Fig.7. Number of iterations obtained by three algorithms on the six data sets.
To analyze in depth the influence of key parameters, which is crucial for IDCGA2-FCM to achieve better performance, we also designed experiments to examine the effect of the iteration number Gd on the performance of the algorithm. The experiments were carried out by adjusting the value of Gd from one to eight and keeping all other parameters the same as those used in Section 5.
Firstly, the Glass data set was taken as an example to investigate the clustering results of IDCGA2-FCM under eight different parameter values, i.e., the iteration number Gd = 1, 2, 3, 4, 5, 6, 7, 8. The clustering results obtained by IDCGA2-FCM as Gd changes are illustrated in Fig. 8. To show the effects clearly, the run time of IDCGA2-FCM as the value of Gd increases is also provided. As Fig. 8 shows, the average J remains stable. Although there are small fluctuations in the number of iterations and run time, it is obvious that the run time of IDCGA2-FCM is greatly affected by the value of Gd. As depicted in Fig. 9, the run time of IDCGA2-FCM decreases at first and then increases with increasing Gd. We can draw a conclusion from Figs. 8 and 9 that, when the value of Gd is equal to 5, an efficient balance between convergence accuracy and speed is achieved. Hence, Gd = 5 is a more suitable choice.
[Fig. 8 omitted: average J versus iteration number for Gd = 1-8 on the Glass data set. Fig. 9 omitted: run time (s) of IDCGA2-FCM for Gd = 1-8 on the DS1, Wine, Pima and Glass data sets.]
6. Conclusions
Two novel adaptive fuzzy clustering algorithms based on the cellular genetic algorithm, referred to as IDCGA-FCM and IDCGA2-FCM, have been proposed in this paper. The first one is a standalone form for fuzzy clustering on the basis of the improved self-adaptive cellular genetic algorithm (IDCGA). The second one is a hybrid method based on FCM and IDCGA which takes advantage of the merits of both algorithms.
There are two contributions in this study. Firstly, an efficient global optimization method, IDCGA, has been presented for a more efficient search. In detail, the new dynamic crossover and entropy-based two-combination mutation operations have been constructed to effectively prevent the convergence of the algorithms to a local optimum. Adaptive adjusting strategies and judging criteria have been designed to dynamically modify the probabilities of crossover and mutation as well as the mutation step size. In addition, the Arnold cat map has been employed to initialize the population so as to overcome the sensitivity of the algorithms to the initial cluster centers. Also, a modified evolution rule has been introduced to build a dynamic environment so as to carry out global exploration more effectively. Secondly, IDCGA has been successfully used to optimize the fuzzy clustering model. On the one hand, IDCGA-FCM has been discussed for fuzzy clustering by using IDCGA directly. On the other hand, an optimal-selection-based strategy has been presented to select a subset of individuals on which local search is performed, and then IDCGA2-FCM has been developed by integrating IDCGA with the optimal-selection-based FCM automatically according to the variation of the population entropy. This ensures that IDCGA2-FCM achieves an accurate balance between global exploration and local exploitation.
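The Arnold cat map mentioned above is the area-preserving chaotic map (x, y) -> ((x + y) mod 1, (x + 2y) mod 1) [32,33]. How such a map can seed an initial population with well-spread values may be sketched as follows (our own simplification: using the chaotic x-sequence directly as gene values is an assumption, not the paper's exact scheme, and the function name is ours):

```python
import numpy as np

def cat_map_population(pop_size, n_genes, lo, hi, x0=0.3, y0=0.7):
    """Generate a (pop_size, n_genes) population by iterating the Arnold cat map
    (x, y) -> ((x + y) mod 1, (x + 2y) mod 1), then scaling x into [lo, hi]."""
    x, y = x0, y0
    genes = []
    for _ in range(pop_size * n_genes):
        x, y = (x + y) % 1.0, (x + 2.0 * y) % 1.0   # one cat-map iteration
        genes.append(x)
    pop = np.array(genes).reshape(pop_size, n_genes)
    return lo + (hi - lo) * pop   # map chaotic values into the search range
```

For clustering, lo and hi would be the per-dimension bounds of the data, so each row of the population encodes one candidate set of cluster-center coordinates.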
The superiority of the proposed algorithms over the AGA-FCM, GA-FCM, FCM-IDPSO, FCM-FPSO, PSO-FCM and FCM algorithms has been demonstrated by experiments. The results have shown that IDCGA2-FCM exhibits the most desirable behavior among all the compared algorithms in terms of efficiency and accuracy. It not only overcomes the shortcomings of the GA-based and PSO-based fuzzy clustering methods, but also improves the clustering performance greatly. Moreover, the desirable results reported in this paper might increase its attractiveness for use in real-world applications compared to other GA-based and PSO-based methods.
Acknowledgment
This research is supported by the National Natural Science Foundation of China (No. 71461020), the Production-Study-Research Cooperation Program of the Science and Technology Bureau of Guangdong Province (No. 2012B091100175), and the Science and Technology Research Program of the Ministry of Education of Jiangxi Province (No. 11686).
References
[1] R. Xu, D. I. Wunsch, Survey of clustering algorithms, IEEE Transactions on Neural Networks,16(3) (2005) 645-678.
[2] B. A. Pimentel, R.M.C.R. Souza, A multivariate fuzzy c-means method. Applied Soft Computing,13 (4) (2013) 1592-1607.
[3] V. Dey, D. K. Pratihar, G. L. Datta, Genetic algorithm-tuned entropy-based fuzzy C-means algorithm for obtaining distinct and compact clusters, Fuzzy Optimization and Decision Making, 10(2) (2011) 153-166.
[4] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, New York, 1981.
[5] J. Zhou, L. Chen, C. L. Philip Chen, Y. Zhang, H. Li, Fuzzy clustering with the entropy of attribute weights, Neurocomputing,
198 (2016) 125-134.
[6] L. Zhang, W. Pedrycz, W. Lu, X. Liu, L. Zhang, An interval weighed fuzzy c-means clustering by genetically guided
alternating optimization, Expert Systems with Applications, 41(13) (2014) 5960-5971.
[7] B. A. Pimentel, R.M.C.R. Souza, A weighted multivariate fuzzy c-means method in interval-valued scientific production data,
Expert Systems with Applications, 41(7) (2014) 3223-3236.
[8] M. S. Xiao, Z. C. Wen, J. W. Zhang, X. F. Wang, An FCM clustering algorithm with improved membership function, Control
and Decision, 30(12) (2015) 2270-2274.
[9] M. Sabzekar, M. Naghibzadeh, Fuzzy c-means improvement using relaxed constraint support vector machines. Applied Soft
Computing, 13(2) (2013) 881-890.
[10] L. Szilágyi, S M. Szilágyi, Generalization rules for the suppressed fuzzy c-means clustering algorithm, Neurocomputing,
[12] D. Li, H. Gu, L. Zhang, A hybrid genetic algorithm-fuzzy c-means approach for incomplete data clustering based on
nearest-neighbor intervals, Soft Computing, 17(10) (2013) 1787-1796.
[13] K. Zhou, S. Ding, C. Fu, S.L.Yang, Comparison and weighted summation type of fuzzy cluster validity indices. International
[15] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, NewYork, 2012.
[16] D. X. Chang, X. D. Zhang, C. W. Zheng, A genetic algorithm with gene rearrangement for K-means clustering, Pattern Recognition, 42(7) (2009) 1210-1222.
[17] J. Xiao, Y. Yan, J. Zhang, Y. Tang, A quantum-inspired genetic algorithm for k -means clustering, Expert Systems with
Applications, 37(7) (2010) 4966-4973.
[18] H. He, Y. Tan, A two-stage genetic algorithm for automatic clustering, Neurocomputing, 81 (2012) 49-59.
[19] R. J. Kuo, L.M. Lin, Application of a hybrid of genetic algorithm and particle swarm optimization algorithm for order
clustering, Decision Support Systems, 49(4) (2010) 451-462.
[20] L. O. Hall, I. B. Ozyurt, J. C. Bezdek, Clustering with a genetically optimized approach, IEEE Trans. Evol. Comput. 3(2)
(1999) 103–112.
[21] L. Zhu, S. Qu, T. Du, Adaptive fuzzy clustering based on Genetic algorithm, in: IEEE International Conference on Advanced
Computer Control, vol. 5, 2010, pp.79-82
[22] Y. Ding, X. Fu, Kernel-based Fuzzy C-Means Clustering Algorithm Based on Genetic Algorithm, Neurocomputing, 188
(2015) 233-238.
[23] S. Wikaisuksakul, A multi-objective genetic algorithm with fuzzy c-means for automatic data clustering, Applied Soft
Computing, 24 (2014) 679-691.
[24] A. X. Ye, Y. X. Jin, A fuzzy c-means clustering algorithm based on improved quantum genetic algorithm, International
Journal of Database Theory and Application, 9(1) (2016) 227-236.
[25] C N Zhang, Y R. Li, A kind of chaotic particle swarm and fuzzy c-mean clustering based on genetic algorithm, International
Journal of Hybrid Information Technology, 7(4) (2014) 287-298.
[26] E. Alba, B. Dorronsoro, The exploration/exploitation tradeoff in dynamic cellular genetic algorithms, IEEE Transactions on Evolutionary Computation, 9(2) (2005) 126-142.
[27] A. J. Nebro, J. J. Durillo, F. Luna, B. Dorronsoro, E. Alba, MOCell: A cellular genetic algorithm for multiobjective
optimization, International Journal of Intelligent Systems, 24(7) (2009) 726-746.
[28] A. Al-Naqi, A. T. Erdogan, T. Arslan, Adaptive three-dimensional cellular genetic algorithm for balancing exploration and
exploitation processes, Soft Computing, 17(7) (2013) 1145-1157.
[29] N. Leite, F. Melício, A. Rosa. Clustering using cellular genetic algorithms, in: International Conference on Evolutionary
Computation Theory and Applications, vol. 1, 2015, pp. 366-373.
[30] E. Alba, B. Dorronsoro, Cellular Genetic Algorithms. Springer, New York, 2008.
[31] J. P. Su, T. E. Lee, K. W. Yu, A combined hard and soft variable-structure control scheme for a class of nonlinear systems.
IEEE Transactions on Industrial Electronics, 56(9) (2009) 3305-3313.
[32] V. Arnold, A. Avez, Ergodic Problems of Classical Mechanics, Benjamin, New York, 1968.
[33] F. Chen, K. W. Wong, X. Liao, X. Tao, Period distribution of generalized discrete Arnold cat map, Theoretical Computer
Science, 552 (2014) 13-25.
[34] Y. M. Lu, M. Li, L. Li, The cellular genetic algorithm with evolutionary rule, Acta Electronica Sinica, 38(7) (2010)
1603-1607.
[35] J. Grefenstette, Optimization of Control Parameters for Genetic Algorithms, IEEE Transactions on Systems Man &
Cybernetics, 16(1) (1986) 122-128.
[36] J. Reed, R. Toombs, N. A. Barricelli, Simulation of biological evolution and machine learning, Journal of Theoretical
[38] H. Izakian, A. Abraham, Fuzzy c-means and fuzzy swarm for fuzzy clustering problem. Expert Systems with Applications,
38(3) (2011)1835–1838.
[39] T. M. S. Filho, B. A. Pimentel, R. M. C. R. Souza, A. L. I. Oliveira, Hybrid methods for fuzzy clustering based on fuzzy
c-means and improved particle swarm optimization, Expert Systems with Applications, 42 (2015) 6315–6328.
[40] A. Asuncion, D. J. Newman, UCI machine learning repository <http://archive.ics.uci.edu/ml/>, Irvine, CA: University of California, School of Information and Computer Science.
Authors Bio
Lilin Jie received the B.S. degree in computational intelligence and pattern recognition from Nanchang Hangkong University, Nanchang, China, in 2012. She is currently a Ph.D. candidate. Her research interests and experience include reliability engineering, fuzzy clustering, computational intelligence and pattern recognition.
Weidong Liu is a professor in the School of Mechanical & Electrical Engineering, Nanchang Hangkong University. He is also the president of the Association for Quality, Jiangxi, China. He earned his Ph.D. in Mechanical Engineering from the Nanjing University of Science and Technology in 1994. His research interests and experience include quality management, reliability of mechanical and electrical products and equipment, neural networks, and fuzzy clustering.
China. His research interests are reliability engineering and quality management.
China. Her research interests include reliability engineering and quality management.
A list of captions for figures:
Fig. 1. Typical neighbourhoods used in CGA.
Fig. 2. Flowchart of the IDCGA-based clustering algorithms.
Fig. 3. Part of the synthetic data sets. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6.
Fig. 4. Clustering results for the eight synthetic data sets. The figures plot the average J obtained by FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6. (g) DS7. (h) DS8.
Fig. 5. Clustering results for the seven real-world data sets. The figures plot the average J obtained by FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM. (a) Iris. (b) Wine. (c) Heart. (d) Cancer. (e) Glass. (f) Image segment. (g) Landsat Satellite.
Fig. 6. Run time of the five algorithms for the 16 data sets. (a) Six synthetic data sets. (b) Six UCI data sets. (c) Four large data sets.
Fig. 7. Number of iterations obtained by three algorithms on the six data sets.
Fig. 8. The clustering results obtained by IDCGA2-FCM with Gd change.