
Accepted Manuscript

Hybrid fuzzy clustering methods based on improved self-adaptive cellular genetic algorithm and optimal-selection-based fuzzy c-means

Lilin Jie, Weidong Liu, Zheng Sun, Shasha Teng

PII: S0925-2312(17)30614-8
DOI: 10.1016/j.neucom.2017.03.068
Reference: NEUCOM 18301
To appear in: Neurocomputing
Received date: 30 October 2016
Revised date: 6 February 2017
Accepted date: 29 March 2017

Please cite this article as: Lilin Jie, Weidong Liu, Zheng Sun, Shasha Teng, Hybrid fuzzy clustering methods based on improved self-adaptive cellular genetic algorithm and optimal-selection-based fuzzy c-means, Neurocomputing (2017), doi: 10.1016/j.neucom.2017.03.068

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.

Highlights
 The new dynamic crossover and entropy-based two-combination mutation operations are constructed to prevent the convergence of the algorithms to a local optimum by adaptively modifying the probabilities of crossover and mutation, as well as the mutation step size, according to dynamic adjusting strategies and judging criteria.
 An improved self-adaptive cellular genetic algorithm (IDCGA) is presented for a more efficient search by combining the Arnold cat map with a modified evolution rule, as well as the constructed dynamic crossover and entropy-based two-combination mutation operators.
 Two novel adaptive fuzzy clustering algorithms based on IDCGA, referred to as IDCGA-FCM and IDCGA2-FCM, are proposed in this paper. The first is a standalone form of fuzzy clustering on the basis of IDCGA. The second is a hybrid method based on FCM and IDCGA which takes advantage of the merits of both algorithms.
 The experimental results showed that the presented algorithms have high efficiency and accuracy.

Hybrid fuzzy clustering methods based on improved self-adaptive cellular genetic algorithm and optimal-selection-based fuzzy c-means

Corresponding Author: Weidong Liu, E-mail: [email protected]
Fax: 86-0791-83863966, Tel: 86-18179105566
Address: 999 Xuefu Road, Honggutan New District, Nanchang 330031, Jiangxi, China

Order of Authors: Lilin Jie a,b, Weidong Liu a,c,*, Zheng Sun a, Shasha Teng a

a School of Mechatronics Engineering, Nanchang University, Nanchang 330031, China
b School of Aeronautical Manufacturing Engineering, Nanchang Hangkong University, Nanchang 330063, China
c School of Economic Management, Nanchang Hangkong University, Nanchang 330063, China

ABSTRACT

With the aim of overcoming low efficiency and improving the performance of fuzzy clustering, two novel fuzzy clustering algorithms based on an improved self-adaptive cellular genetic algorithm (IDCGA) are proposed in this paper. New dynamic crossover and entropy-based two-combination mutation operations are constructed to prevent the convergence of the algorithms to a local optimum by adaptively modifying the probabilities of crossover and mutation, as well as the mutation step size, according to dynamic adjusting strategies and judging criteria. The Arnold cat map is employed to initialize the population in order to overcome the sensitivity of the algorithms to initial cluster centers. A modified evolution rule is introduced to build a dynamic environment so as to explore the search space more effectively. A new IDCGA that combines these three processes is then used to optimize fuzzy c-means (FCM) clustering (IDCGA-FCM). Furthermore, an optimal-selection-based strategy is constructed using the golden section method, and a hybrid fuzzy clustering method (IDCGA2-FCM) is developed by automatically integrating IDCGA with optimal-selection-based FCM according to the variation of population entropy. Experiments were performed on six synthetic datasets and seven real-world datasets to compare the performance of our IDCGA-based clustering algorithms with FCM and other GA-based and PSO-based clustering methods. The results show that the presented algorithms have high efficiency and accuracy.

Keywords: Fuzzy clustering; Fuzzy c-means; Cellular genetic algorithm; Dynamic crossover; Two-combination mutation

1. Introduction

With the increasing importance of extracting useful information from huge quantities of data, clustering techniques have been
widely used in a variety of application domains such as data mining, machine learning, pattern recognition, computer vision and
image segmentation [1, 2]. Cluster analysis is an unsupervised learning process of partitioning an unlabeled data set into a
number of groups with the principle of similarity that objects in the same group are more similar to each other than those from
different groups. Clustering methods can be broadly divided into two basic types, namely hard and fuzzy methods [3].
Although hard clustering methods have the advantages of simplicity and low computational complexity, they may be less
attractive for real-life data sets in which there are no definite boundaries between the groups, because they assign each element to
a single group. Comparatively, fuzzy clustering algorithms can assign each element of a data set to multiple groups
simultaneously in accordance with the membership functions matrix, and therefore overcome the deficiency of the hard
clustering methods.
The most widely used fuzzy clustering algorithm is fuzzy c-means (FCM) proposed by Bezdek (1981) [4]. In spite of its high
popularity and extensive applications, FCM has the following inherent limitations: (i) all data attributes are weighted equally,
without considering differences in their importance for determining the cluster memberships of objects; (ii) poor performance on
data with uneven sample distributions, noise or outliers, or incomplete records; (iii) low speed on clustering problems of high
complexity; (iv) clustering results largely determined by the initial cluster centers; (v) an iterative process that is easily trapped
in local minima.
To address the above defects, a number of extensions of FCM have been reported in recent years. These studies have mainly
focused on attribute-weighted fuzzy clustering, modification of fuzzy membership
functions, and effectiveness assessment of fuzzy cluster validity indices, etc. For instance, Zhou et al. [5] presented a study on the
attribute-weighted fuzzy clustering problem and proposed a maximum-entropy-regularized weighted fuzzy FCM (EWFCM)
algorithm. Zhang et al. [6] investigated an interval-weighted FCM method with a genetic heuristic strategy to search for weights
of data attributes in interval-constrained ranges, so as to quantitatively evaluate the relative importance of each attribute to the
clustering performance of FCM. Pimentel and Souza [2] conducted an experimental study of a multivariate FCM method in
which membership degrees by variables and clusters are calculated. That study showed that the membership degrees vary from
one variable to another and from one group to another. With an aim to deal with interval-valued data, Pimentel and Souza [7]
proposed a weighted multivariate FCM in which the membership degree for each cluster is defined by a linear combination of
membership degrees by variables. In another effort to improve the validity of the clustering on non-equilibrium distribution
samples, Xiao et al. [8] reported a variant of FCM with modified fuzzy membership constraint function by relaxing the
normalization condition and constantly correcting the membership in the clustering process. Sabzekar and Naghibzadeh [9]
developed a modified FCM using relaxed constraints support vector machines to tackle the problem of many samples being
assigned to some clusters with low membership values. In order to reduce the convergence time of the FCM, several
generalization schemes for the suppressed FCM were also investigated extensively through experiment and numerical simulation
by Szilágyi et al. [10]. Zhao et al. [11] put forward an optimal-selection-based suppressed fuzzy c-means clustering algorithm
with self-tuning non-local spatial information and applied it to improve segmentation performance on images heavily
contaminated by noise. Li et al. [12] used a GA approach to search for appropriate imputations of missing attributes in
nearest-neighbor intervals to recover the incomplete data set, with the goal of solving the problem of the uncertainty of missing
attributes. Zhou et al. [13] introduced a weighted summation type of fuzzy cluster validity index through properly setting the
weights of the ten indices to evaluate the clustering results and determine the optimal number of clusters.
In spite of many obvious advantages of all the FCM versions discussed above, they still have limitations, such as the
initialization with randomly selected cluster centers and the high probability of falling into local minima. Furthermore, there is
considerable room to further improve the performance of fuzzy clustering algorithms.
To partly solve this problem, metaheuristic algorithms such as genetic algorithm (GA), simulated annealing (SA), ant colony
optimization (ACO) and particle swarm optimization (PSO) have been successfully applied to optimize the objective function of
FCM, attempting to avoid being trapped in local minima [14]. GA has become one of the most popular search and optimization
tools in many application fields due to its ease of implementation, global perspective and extensive applicability [15].
Consequently, many scholars have been dedicated to applying GAs to cluster analysis. As a result, a variety of GA-based
approaches to the hard and fuzzy clustering problems have been proposed [14]. GA-based approaches for the hard
clustering problem are as follows: Chang et al. [16] proposed a new clustering algorithm based on GA with gene rearrangement
(GAGR). They also compared the performance of GAGR with K-means and other GA methods. A quantum inspired GA for
k-means clustering (KMQGA) was discussed by Xiao et al. [17], in which a Q-bit-based representation is employed for
exploration and exploitation in discrete 0-1 hyperspace using rotation operation of quantum gate as well as the typical genetic
algorithm operations of Q-bits. He and Tan [18] investigated a two-stage genetic clustering algorithm (TGCA) in which
two-stage selection and mutation operations were implemented to exploit the search capability of the algorithm. The study
showed that TGCA is superior to k-means and standard genetic k-means algorithms. Kuo et al. [19] developed a hybrid
evolutionary algorithm based on GA, PSO and the k-means algorithm (HGAPSOA). GA-based approaches for the fuzzy clustering
problem are as follows: Hall et al. [20] presented a GA approach to minimize the clustering objective functions of hard c-means
(HCM) and FCM simultaneously. The results of all possible comparisons revealed that GA-HCM/GA-FCM approaches could
provide a better final partition than HCM/FCM, respectively. To overcome the disadvantages and improve the performance of
FCM significantly, Zhu et al. [21] proposed an adaptive fuzzy clustering algorithm based on GA (AGA-FCM). They obtained the
optimal number of clusters by analyzing a cluster validity function and obtained the initial cluster centers with HCM. Ding and Fu [22]
introduced a kernel-based FCM clustering algorithm based on GA (GAKFCM), which combines the improved GA and the kernel
technique. Wikaisuksakul [23] investigated a multiobjective genetic algorithm-based fuzzy clustering algorithm (FCM-NSGA)
using NSGA-II and FCM to solve the data clustering problem. Ye and Jin [24] conducted a study on the performance of a novel
FCM clustering algorithm based on improved quantum genetic algorithm (IQGA). In order to deal with FCM’s sensitivity to the
initial value, Zhang and Li [25] proposed a hybrid clustering algorithm (GCQPSO-FCM) by combining GA and chaotic particle
swarm optimization (CPSO) with FCM. That study showed that GCQPSO-FCM achieves better performance than the original
algorithm.
As noted above, many of the clustering algorithms based on GAs may adopt GA in standalone form or in combination with
K-means, HCM or FCM. It is worth noting that these GA-based approaches for hard or fuzzy clustering have improved
performance as compared to traditional methods such as K-means, HCM and FCM. However, these clustering methods described
above suffer from the following two difficulties: (i) evaluating the objective functions takes enormous time, which leads to
higher computational complexity and therefore limits their practical applications, (ii) when the size of data sets is large, some of
them may converge to local optimal solutions of the objective function due to the loss of the diversity of population during the
evolution. Therefore, it is essential to design more efficient optimization algorithms to improve the performance of fuzzy
clustering.
In an attempt to resolve the dilemmas that could not be settled by GA-based clustering methods, we focus on the cellular
model of GA (CGA) in this study. The key reason for choosing CGA is that, as one of the best-known cellular model
evolutionary algorithms (CEAs), its advantages lie in its ease of implementation and the good tradeoff between exploration and
exploitation compared with other evolutionary methods, which is critical in determining the effectiveness of the algorithm
[26-28]. In a CGA, the concept of neighborhood is intensively used, and thus an individual may only interact with its defined
local neighbors in the breeding loop. The exploration is provided by the overlapped small neighborhoods due to the induced slow
diffusion of solutions through the population, which contributes to the genetic diversity of population and effectively avoids
premature convergence. On the contrary, the exploitation takes place within each neighborhood by genetic operators aiming at
improving the quality of solutions. CGAs have proven to be efficient in dealing with optimization problems of
various complexities [27, 28]. However, little attention has been paid to its use in the field of cluster analysis. According to the
existing research of hard or fuzzy clustering methods, extremely limited information is available concerning the combination of
CGA with K-means, HCM or FCM. To the best of our knowledge, only Leite et al. [29] had used a CGA to solve a pattern
classification problem before this study, and theirs was the only cellular genetic clustering algorithm for crisp clustering,
based on the canonical CGA.
We are motivated by these investigations to design excellent optimization methods based on CGA that can dynamically adapt
to changes in the diversity of population through accurate balance between exploration and exploitation, and apply the methods
to the fuzzy clustering problem. With this concept in mind, we use the well-known diversity measure, the population entropy [28],
to compute the diversity of population so as to guide the search process. In our design, the widely-used Arnold chaotic map is
employed to initialize population for the purpose of overcoming the sensitivity of clustering algorithms to initial cluster centers.
A modified evolution rule is also introduced to build a dynamic environment so as to maintain the diversity which is in favor of
global exploration. Furthermore, the new dynamic crossover and entropy-based two-combination mutation operator are
constructed to prevent the convergence of the algorithms to a local optimum. Based on the diversity computed, real-valued
coding mode is combined with the Arnold cat map, modified evolution rule, as well as the constructed dynamic crossover and the
entropy based two-combination mutation operators to present an improved self-adaptive cellular genetic algorithm (IDCGA). As
a result, IDCGA has two advantages over the basic CGA, namely (i) the adaptive adjustment of parameters during the evolution,
and (ii) an appropriate balance between global search and local search.
In this paper, two new methods for the fuzzy clustering problem based on the improved IDCGA are introduced to achieve a
better performance. The first method proposed in this study is a standalone form for fuzzy clustering by using IDCGA, called
IDCGA-FCM. The improved IDCGA is applied in fuzzy clustering model aiming to achieve the best possible results. The second
method is a hybrid between IDCGA and FCM, called IDCGA2-FCM, which dynamically integrates FCM with IDCGA according
to the variation of the diversity of population. This is done to exploit the local search space more efficiently and make the
evolution process converge fast. Thus, IDCGA2-FCM may tackle the two main problems of GA-based clustering methods, i.e.,
the low speed and the tendency to get trapped in local extrema when the size or dimension of the data is large. Experiments on six
synthetic data sets and seven real-world data sets are reported and the results are compared with the six other clustering
algorithms, including FCM, two GA-based clustering methods named GA-FCM and AGA-FCM, and three PSO-based clustering
methods named PSO-FCM, FCM-FPSO and FCM-IDPSO.
The remainder of this paper is organized as follows. Section 2 provides a description of the FCM algorithm and the basic
CGA algorithm. Section 3 presents the details of the proposed method IDCGA and discusses the IDCGA algorithm for fuzzy
clustering. Section 4 introduces our hybrid clustering method. Section 5 reports the experimental results of different clustering
algorithms on synthetic and real-world data sets. Finally, conclusions and suggestions for further research are given in Section 6.
2. Theoretical basis

The algorithms on which this study is based have already been mentioned in the previous section and now are further
illustrated. In this section, first we present a description of fuzzy c-means clustering; then, we describe the basic cellular genetic
algorithm.

2.1. Fuzzy c-means algorithm


Suppose that fuzzy c-means (FCM) partitions a set of $n$ data objects $X = \{\chi_1, \dots, \chi_k, \dots, \chi_n\}$ into $c$ ($1 < c < n$) fuzzy clusters, where each object has $d$ attributes and the $k$th object $\chi_k$ is represented as a vector of quantitative variables $\chi_k = (\chi_{1k}, \dots, \chi_{jk}, \dots, \chi_{dk})$ with $\chi_{jk} \in \mathbb{R}$. Let $V = \{v_1, \dots, v_i, \dots, v_c\}$ be a set of cluster centers, where the $i$th cluster center $v_i$ is also represented by a vector of quantitative variables $v_i = (v_{1i}, \dots, v_{ji}, \dots, v_{di})$ with $v_{ji} \in \mathbb{R}$. Let $U = [u_{ik}]$ be a $c \times n$ fuzzy membership matrix in which $u_{ik}$ is the membership degree of the $k$th object to the $i$th cluster center. $U$ has the following properties:

$$\sum_{i=1}^{c} u_{ik} = 1 \ (1 \le k \le n); \qquad \sum_{k=1}^{n} u_{ik} \in (0, n) \ (1 \le i \le c); \qquad u_{ik} \in [0, 1] \ (1 \le i \le c,\ 1 \le k \le n) \qquad (1)$$

The goal of FCM is to determine cluster centers $v_i$ ($1 \le i \le c$) and a fuzzy matrix $U$ that minimize an objective function $J$, which can be defined as follows:

$$J(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} d_{ik}^{2} \qquad (2)$$

where $m$ is the fuzzification coefficient and $d_{ik}$ is the Euclidean distance that represents the similarity between object $\chi_k$ and center $v_i$. The distance is calculated as:

$$d_{ik} = \lVert \chi_k - v_i \rVert \qquad (3)$$

The minimum of $J(U, V)$ can be obtained by the Lagrange multiplier method, with the cluster centers and membership degrees updated according to Eq. (4) and Eq. (5):

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m}\, \chi_k}{\sum_{k=1}^{n} u_{ik}^{m}} \qquad (4)$$

$$u_{ik} = \frac{1}{\sum_{l=1}^{c} \left( \dfrac{\lVert \chi_k - v_i \rVert^{2}}{\lVert \chi_k - v_l \rVert^{2}} \right)^{\frac{1}{m-1}}} \qquad (5)$$
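To make the alternation of Eqs. (4) and (5) concrete, the update loop can be sketched in NumPy as below. This is an illustrative sketch, not the authors' implementation; the function name, convergence tolerance and iteration cap are our own choices.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Sketch of standard FCM: alternate the center update (Eq. 4)
    and the membership update (Eq. 5) until U stops changing."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)                # each column sums to 1 (Eq. 1)
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)                  # Eq. 4
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1) + 1e-12   # squared distances (Eq. 3)
        U_new = 1.0 / ((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1))).sum(axis=1)  # Eq. 5
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    J = (U ** m * d2).sum()                           # objective (Eq. 2)
    return U, V, J
```

The small constant added to the squared distances guards against division by zero when a center coincides with a data point.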
2.2. Basic cellular genetic algorithm

Cellular genetic algorithm (CGA) is a decentralized GA which combines cellular automata with the genetic algorithm [30]. In

this algorithm model, all individuals are usually located in a specific toroidal grid of d dimensions (d=1, 2, 3). Each
individual is assigned to a grid position and only interacts with its defined local neighbors. Four typical neighborhoods used in
CGA are illustrated in Fig. 1. In this paper, we use a population structured in a two-dimensional regular grid, and the
neighborhood defined on it (C9) contains nine individuals: the considered one (position (x, y)) plus the east, west, north, south,
northeast, northwest, southeast and southwest ones (also called Moore neighborhood).
Fig. 1. Typical neighbourhoods used in CGA (from left to right: L5, L9, C9, C25).
More specifically, there are three kinds of operations in CGA, including selection, crossover and mutation, all of which are
restricted to the neighborhood. CGA usually proceeds as follows: each individual is updated by selecting its parents from its
neighborhood with a given criterion (lines 8 and 9). A crossover operator is applied to the two parents with a
probability Pc to form two offspring (line 10), one of which is then mutated by a mutation operator with a probability Pm (line
11). Afterwards, the new offspring is evaluated and replaces the current individual according to a specified replacement policy
(lines 12 and 13). This process is repeated until all individuals in the population have been updated. Then the newly generated
auxiliary population Paux(t) replaces the current population to start a new generation (line 16), and the whole process is repeated
until specific termination conditions are satisfied. Algorithm 1 shows the steps of the basic CGA.
Algorithm 1. Basic CGA
1. Initialize parameters for CGA, including the population size M, Pc, Pm; randomly generate an initial population P(0);
2. Evaluate(P(0));
3. t ← 0
4. while not Termination_Condition() do
5.   for x = 1 to Row do
6.     for y = 1 to Column do
7.       neighbors ← Get_Neighbors(position(x, y));
8.       parent1 ← position(x, y);
9.       parent2 ← Select(neighbors);
10.      offspring ← Crossover(Pc, parent1, parent2);
11.      offspring ← Mutate(Pm, offspring);
12.      offspring ← Evaluate(offspring);
13.      Replace(position(x, y), offspring, Paux(t));
14.    end for
15.  end for
16.  P(t+1) ← Paux(t)
17.  t ← t + 1
18. end while
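The breeding loop of Algorithm 1 can be sketched as follows for a 2-D toroidal grid with the C9 (Moore) neighborhood. The binary-tournament selection of the second parent and the replace-if-better policy are illustrative assumptions, since Algorithm 1 leaves Select and Replace abstract.

```python
import random

def moore_neighbors(x, y, rows, cols):
    """Indices of the 8 toroidal Moore (C9) neighbours of cell (x, y)."""
    return [((x + dx) % rows, (y + dy) % cols)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if not (dx == 0 and dy == 0)]

def cga_step(pop, fitness, crossover, mutate, pc=0.9, pm=0.1):
    """One synchronous generation of a basic CGA on a 2-D toroidal grid."""
    rows, cols = len(pop), len(pop[0])
    aux = [[None] * cols for _ in range(rows)]      # auxiliary population Paux(t)
    for x in range(rows):
        for y in range(cols):
            nbrs = moore_neighbors(x, y, rows, cols)
            parent1 = pop[x][y]
            # binary-tournament selection of the second parent from the neighbourhood
            a, b = random.sample(nbrs, 2)
            parent2 = max(pop[a[0]][a[1]], pop[b[0]][b[1]], key=fitness)
            child = crossover(parent1, parent2) if random.random() < pc else list(parent1)
            if random.random() < pm:
                child = mutate(child)
            # replace-if-better policy
            aux[x][y] = child if fitness(child) >= fitness(parent1) else parent1
    return aux
```

With the replace-if-better policy, the fitness of each cell is non-decreasing across generations, which makes the sketch easy to sanity-check on a toy problem such as OneMax.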

3. Improved self-adaptive cellular genetic algorithm

3.1. Chromosome representation

For any CGA, the first essential task to be accomplished is the representation of chromosome, which is needed to describe
each individual in the population. Generally, the representation approach is determined mainly by the specific
optimization problem, since the approach determines how the problem being solved is structured in the algorithm. The characters
in the chromosome are called genes, which consist of symbols, floating-point numbers, or binary digits. Among them, binary and
real parameter representations are commonly used for genetic clustering algorithms [12]. It has been shown that the real-valued
representations are more practical due to their consistency with the real-world number system and are more convenient for
further processing [31]. Thus, a real-valued representation is employed to express the chromosome in this paper. Taking both the
computational complexity and the useful information of the data into account, we turn our attention to a cluster-center-based
representation approach.

Let $Q = \{x_1, x_2, \dots, x_p, \dots, x_M\}$ ($1 \le p \le M$) be a cellular population with individuals $x_p$. Each individual represents a set of
cluster centers, i.e., one pattern partition of the dataset. In other words, each chromosome is described by a sequence of c·d
real-valued numbers, where c is the number of clusters and d is the dimension of a data object. This is more natural than the binary
representation. More specifically, the chromosome $x_p$ can be expressed as

$$x_p = [\, x_{p,1}\, x_{p,2} \cdots x_{p,d}\ \ x_{p,d+1}\, x_{p,d+2} \cdots x_{p,2d}\ \cdots\ x_{p,(c-1)d+1}\, x_{p,(c-1)d+2} \cdots x_{p,cd}\,] \qquad (6)$$

where the first $d$ values represent the first cluster, the next $d$ values represent the second cluster, and so forth.
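The encoding of Eq. (6) amounts to a row-major flattening of the c × d matrix of cluster centers; a minimal sketch (the helper names `encode`/`decode` are ours, not from the paper):

```python
import numpy as np

def encode(centers):
    """Flatten a (c, d) array of cluster centers into a length-c*d chromosome (Eq. 6)."""
    return np.asarray(centers, dtype=float).ravel()

def decode(chromosome, c, d):
    """Recover the (c, d) matrix of cluster centers from a chromosome."""
    return np.asarray(chromosome, dtype=float).reshape(c, d)
```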

3.2. Population initialization

On the one hand, as a result of the randomness and blindness of random selection methods, some meaningless clusters
irrelevant to the given data set might be produced. On the other hand, suitable initial cluster centers have a significant effect on
the performance of clustering [18]. For these reasons, we focus on a chaotic map and apply it to the initialization of chromosomes.
Among chaotic maps, the Arnold cat map introduced by the Russian mathematician V.I. Arnold is one of the most widely used,
owing to the ergodic and dynamic properties of its chaos variables [32, 33]. In IDCGA-based clustering algorithms, the
Arnold cat map is employed to generate sequences that substitute for random initial cluster centers, with the aim of obtaining a diverse
initial population. This map is defined as follows:

$$\begin{bmatrix} \gamma_{n+1} \\ \theta_{n+1} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} \gamma_{n} \\ \theta_{n} \end{bmatrix} \pmod{1} \qquad (7)$$

The initial chromosomes based on the Arnold cat map are given by

$$x_{pj} = x_{j}^{\min} + \gamma_{n} \left( x_{j}^{\max} - x_{j}^{\min} \right) \qquad (8)$$

where $\gamma_n \in [0, 1]$ is the chaotic variable after $n$ iterations, $x_{pj}$ is the $j$th variable of $x_p$, and the range of the $j$th variable is the interval $[x_j^{\min}, x_j^{\max}]$, in which $x_j^{\min}$ and $x_j^{\max}$ are derived from the minimum and maximum values of the corresponding attribute of the given data set. This process is repeated, iterating Eq. (7) and Eq. (8), until $M$ chromosomes are generated, which ensures that the initial population makes the best use of the inherent information of the data set. The $M$ initial chromosomes are thus uniformly distributed over the whole search space and contain genes as diverse as possible.
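The initialization of Eqs. (7) and (8) can be sketched as below; the seed pair (γ₀, θ₀) and the choice of driving each gene with the γ component are illustrative assumptions, not details fixed by the paper.

```python
import numpy as np

def arnold_init(M, c, d, x_min, x_max, seed=(0.3, 0.7)):
    """Initialize M chromosomes of length c*d using the Arnold cat map (Eqs. 7-8).
    x_min, x_max: per-attribute bounds derived from the data set (length-d arrays)."""
    g, t = seed                        # (gamma_0, theta_0), any pair in (0, 1)
    pop = np.empty((M, c * d))
    for p in range(M):
        for j in range(c * d):
            g, t = (g + t) % 1.0, (g + 2.0 * t) % 1.0          # Eq. 7
            a = j % d                                           # attribute index of this gene
            pop[p, j] = x_min[a] + g * (x_max[a] - x_min[a])    # Eq. 8
    return pop
```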

3.3. Fitness function

In the IDCGA algorithm, as in other evolutionary algorithms, a fitness function is used to determine a fitness value so as to
evaluate the optimality of each chromosome $x_p$ ($1 \le p \le M$). The fitness value is then used to decide whether a
chromosome is eliminated or retained. In accordance with the principle of survival of the fittest, the more adaptive chromosomes
are kept, and the less adaptive ones are discarded in the process of generating a new population.

For a set of data $X = \{\chi_1, \dots, \chi_k, \dots, \chi_n\}$, FCM provides the fitness metric for genetic optimization in
the proposed algorithm framework. The reciprocal of the clustering objective function is defined as the fitness function used to evaluate all
chromosomes:

$$FIT(x_p) = \frac{1}{\sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \lVert \chi_k - v_i \rVert^{2}} \qquad (9)$$

Note that each chromosome is evaluated according to the clustering objective function of FCM, as shown in Eq. (2). It is then
easy to see that the smaller the objective function value, the better the clustering and the higher the fitness value
$FIT(x_p)$. Therefore, when the clustering objective function reaches its minimum, i.e., $FIT(x_p)$ reaches its maximum,
the best partition and the optimal cluster centers have been obtained.
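Eq. (9) can be evaluated directly from a chromosome by decoding the centers V, deriving U from the FCM membership update (Eq. 5), and taking the reciprocal of the objective (Eq. 2); a sketch under those assumptions:

```python
import numpy as np

def fitness(chromosome, X, c, m=2.0):
    """Fitness of a chromosome (Eq. 9): decode V from the chromosome, derive U
    from V via the FCM membership update (Eq. 5), and invert the objective (Eq. 2)."""
    n, d = X.shape
    V = np.asarray(chromosome, dtype=float).reshape(c, d)
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1) + 1e-12   # (c, n) squared distances
    U = 1.0 / ((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1))).sum(axis=1)  # Eq. 5
    J = (U ** m * d2).sum()                                        # Eq. 2
    return 1.0 / J
```

A chromosome whose decoded centers sit near the true cluster centers yields a small objective and therefore a large fitness.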

3.4. Evolutionary rule


A basic CGA shows a better capacity for global exploration than GA by constraining individual genetic operations to a
neighborhood. However, it ignores the dynamic influence of individuals on one another during the evolutionary
operation; that is to say, whether an individual is living or dead has no effect on the population evolution. Lu et al. [34]
performed extensive experiments comparing the cellular genetic algorithm with an evolutionary rule (CEGA), the basic CGA and GA,
which indicate that CEGA is more efficient than CGA in maintaining population diversity and conducting
global exploration. Hence, an evolutionary rule is introduced in this paper to build a dynamic environment and synchronously
update the states of individuals in the population.
In this dynamic environment, the evolutionary rule is defined by moderate-density cells in the same way as in [34], as described below. Let $t$ be the current evolutionary generation, $S_t$ the state of the current individual in the $t$th generation, $S_{t+1}$ its state in the next generation, and $S$ the number of living individuals in its neighborhood. The state of an individual is then updated as follows:

$$\text{If } S_t = 0,\ \text{then } S_{t+1} = \begin{cases} 1, & S \in \{4, 5, 6, 7\} \\ 0, & S \notin \{4, 5, 6, 7\} \end{cases} \qquad \text{If } S_t = 1,\ \text{then } S_{t+1} = \begin{cases} 1, & S \in \{2, 3, 4, 5\} \\ 0, & S \notin \{2, 3, 4, 5\} \end{cases} \qquad (10)$$

where 0 represents a dead individual and 1 a living individual. From the evolutionary rule it can be seen that, when the current
individual is dead, it becomes living in the next generation as long as there are 4, 5, 6, or 7 living individuals among its
neighbors. Conversely, when the current individual is living, it becomes dead in the next generation whenever the number of
living individuals in its neighborhood is not 2, 3, 4, or 5.
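The rule in Eq. (10) translates directly into a small state-update function; a minimal sketch:

```python
def update_state(s_t, living_neighbors):
    """Evolution-rule state update (Eq. 10): s_t is 0 (dead) or 1 (alive);
    living_neighbors is the number of living individuals in the neighborhood."""
    if s_t == 0:
        # a dead individual revives when 4-7 of its neighbors are alive
        return 1 if living_neighbors in (4, 5, 6, 7) else 0
    # a living individual survives only when 2-5 of its neighbors are alive
    return 1 if living_neighbors in (2, 3, 4, 5) else 0
```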

3.5. Construction of genetic operators

It is well known that the crossover and mutation operations have a significant impact on the behavior and performance of
CGA. In particular, the probabilities of crossover and mutation (hereafter referred to as Pc and Pm, respectively) greatly
determine whether the algorithm could find a near-optimum solution or whether it could find a solution efficiently. A number of
guidelines have been discussed in the previously reported literature for selecting them [15, 35]. However, these generalized
guidelines were drawn from empirical studies on a fixed set of test problems, and are inadequate since the optimal Pc and Pm
are specific to the problem under consideration. In fact, the optimal Pc and Pm vary with the problem of concern, and even
across different stages of the evolutionary process within a problem. At present, however, the genetic operations adopted in the
AN
existing CGA, such as crossover and mutation usually work at a constant probability determined in advance [26-29]. Excessive
high probabilities would disrupt the excellent individual while low values might result in the convergence of the algorithm to a
local optimum. In addition, the search process of CGA is a complex and non-linear process, the fixed values or simple linear
transformation for probabilities of crossover and mutation would fail to explore the search space and cannot achieve the best
M

possible results. Consequently, a judicious choice of Pc and Pm is critical to the successful working of CGA. Furthermore,
biological evolution shows that Pc and Pm are dependent on the evolution state and should be adapted [36]. It is, thus, essential
to design a CGA that adapts itself to the appropriate Pc and Pm instead of using fixed values. In our approach, we aim at
ED

achieving the trade-off between exploration and exploitation of CGA in a different manner, by adjusting them dynamically in
response to the current evolutionary condition of population and the distance between each individual and optimal individual. In
view of this, the new dynamic crossover and entropy-based two-combination mutation operators are constructed to prevent
PT

premature convergence of IDCGA to a local optimum.

3.5.1. Dynamic crossover

The crossover operator combines the information of two parent chromosomes to form two offspring, aiming to produce diversified and promising new chromosomes. Crossover occurs only with some probability Pc. The crossover probability affects the capability of CGA to exploit a located hill and reach the local optima. The higher the value of Pc, the quicker exploitation proceeds; but a too-large Pc will disrupt individuals faster than they can be exploited. Hence, a dynamic adjusting strategy based on a variable sigmoid function is introduced to modify the crossover probability adaptively, and the individuals undergo arithmetic crossover, which uses a linear combination of the parents.

Let f_avg be the average fitness value of the current population, f_max the maximum fitness value of the population, and f(x_i) the fitness of the individual to be crossed. f_max − f(x_i) denotes the distance between the individual to be crossed and the optimal individual, while f_max − f_avg represents, to some extent, the current evolutionary condition of the population. Then the crossover probability Pc^t(i) in generation t is given by


    Pc^t(i) = Max_c / (1 + λ1 · exp(−(f_max − f(x_i)) / (f_max − f_avg)))    (11)

where Max_c is the maximum value allowed for Pc^t(i), and λ1 is a crossover factor that determines the threshold boundary of Pc^t(i). Under the proposed adaptation scheme, the range of Pc^t(i) always lies within the interval [Max_c / (1 + λ1), Max_c], and an appropriate range of Pc^t(i) can be obtained by fine tuning λ1. Clearly, when f_avg → f_max, f(x_i) → f_max, and Pc^t(i) then equals Max_c / (1 + λ1). As can be seen from Eq. (11), the crossover probability Pc^t(i) takes higher values for low-fitness solutions and lower values for high-fitness solutions. That is, in terms of clustering quality, the value of Pc^t(i) decreases as the fitness of the individual increases, in order to reduce the possibility of disrupting a good solution
by crossover.
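A minimal sketch of the adaptive probability in Eq. (11), with the default values of Max_c and λ1 taken from the parameter-settings section (the degenerate-population fallback is our assumption, since Eq. (11) is undefined when f_max = f_avg):

```python
import math

def crossover_prob(f_i, f_avg, f_max, max_c=0.99, lam1=0.5):
    """Adaptive crossover probability Pc^t(i) of Eq. (11) (a sketch)."""
    if f_max == f_avg:
        # degenerate population (all fitness equal): fall back to the lower bound
        return max_c / (1.0 + lam1)
    d = (f_max - f_i) / (f_max - f_avg)   # normalized distance to the best individual
    return max_c / (1.0 + lam1 * math.exp(-d))
```

Note that the result always lies in [Max_c / (1 + λ1), Max_c]: the best individual gets the lower bound, while individuals far below f_max get probabilities close to Max_c.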

Here, we use arithmetic crossover to produce new chromosomes. Let x_i and x_j be the two parent chromosomes to be crossed, where x_i is the current chromosome itself and x_j is the chromosome with the highest fitness chosen from its neighbors; then

    x_i' = x_j + α(x_i − x_j)
    x_j' = x_i + α(x_j − x_i)    (12)

where α ~ D(0, 1) and D(0, 1) is a uniform distribution on the interval [0, 1].
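The arithmetic crossover of Eq. (12) can be sketched as follows (each offspring is a convex combination of the two parents; the function name is ours):

```python
import random

def arithmetic_crossover(x_i, x_j, alpha=None):
    """Arithmetic crossover of Eq. (12) on real-coded chromosomes (a sketch)."""
    if alpha is None:
        alpha = random.random()          # alpha ~ D(0, 1), uniform on [0, 1]
    child_i = [xj + alpha * (xi - xj) for xi, xj in zip(x_i, x_j)]
    child_j = [xi + alpha * (xj - xi) for xi, xj in zip(x_i, x_j)]
    return child_i, child_j
```

Since both offspring are convex combinations of the parents, every offspring gene stays within the interval spanned by the corresponding parent genes.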


3.5.2. Two-combination mutation based on entropy

The mutation operator randomly changes the value of one or more genes of a chosen chromosome with a small probability Pm, and provides a way to bring extra diversity into the population, thereby allowing exploration of the search space. This exploration depends on the mutation pattern of the genes and on the probability of applying the operator. The mutation probability Pm controls the speed at which CGA explores new areas. In fact, the diversity of the population tends to remain at a relatively high level in the early evolution stage. As the generations go on, all individuals in the population show a high level of similarity in the later stage, which leads to a relatively low level of diversity and thereby causes convergence to a local optimum. However, it is well known that maintaining population diversity helps to reach the global optimum. Thus, for the problem at hand, we have to carefully track the variation of population diversity over the whole evolution and design a two-combination mutation that responds to the current condition according to this variation.

With this concept in mind, we adopt the entropy originating from information theory [28] to measure the diversity of the population and guide the search process. Let M be the population size; the entropy of the l-th gene in generation t is defined as:

    DE_l(t) = − Σ_{i=1}^{M} Σ_{j=i+1}^{M} P_ij^t · log P_ij^t    (13)

where P_ij^t represents the similarity between the value of the l-th gene of the i-th chromosome and that of the l-th gene of the j-th chromosome in generation t. It is calculated as:


    P_ij^t = 1 − |x_il^t − x_jl^t| / (b_l − a_l)    (14)

where a_l and b_l are the minimum and maximum values of the l-th gene, respectively.

The population entropy DE(t) is the average of the entropies of the individual genes, as shown in Eq. (15):

    DE(t) = (1/L) · Σ_{l=1}^{L} DE_l(t)    (15)

The diversity of the population can be derived from Eq. (15). It is worth noting that DE(t) reaches its theoretical minimum of zero when the population consists only of copies of the same chromosome. Accordingly, the more varied the chromosomes, the larger the population entropy, and the better the diversity.
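Eqs. (13)-(15) can be sketched directly (a minimal version; the data layout, a list of real-coded chromosomes plus per-gene bounds, is our assumption):

```python
import math

def population_entropy(pop, bounds):
    """Population entropy DE(t) of Eqs. (13)-(15) (a sketch).

    pop:    list of chromosomes, each a list of L gene values
    bounds: list of (a_l, b_l) minimum/maximum values per gene
    """
    m, num_genes = len(pop), len(pop[0])
    total = 0.0
    for l in range(num_genes):
        a_l, b_l = bounds[l]
        de_l = 0.0
        for i in range(m):
            for j in range(i + 1, m):
                # similarity of Eq. (14)
                p = 1.0 - abs(pop[i][l] - pop[j][l]) / (b_l - a_l)
                if p > 0.0:
                    # p = 0 contributes 0 in the limit p*log(p) -> 0, so it is skipped
                    de_l -= p * math.log(p)
        total += de_l
    return total / num_genes      # Eq. (15): average over genes
```

A population of identical chromosomes gives all P_ij = 1 and hence DE(t) = 0, matching the theoretical minimum noted above.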

In the following, the definitions presented above are used to design a two-combination mutation operator that works by introducing a general mutation and a large disturbance in two stages, respectively. Here, a dynamic adjusting strategy is also employed to select an appropriate value of the mutation probability for each individual adaptively. Assume the detection threshold of the population entropy is MET_min^1, with MET_min^1 = k1 · DE_max, where DE_max is the maximum value of the population entropy and k1 is a disturbance factor that determines the disturbance frequency in the search process. Based on the computed population entropy, the mutation probability Pm^t(i) in generation t is given by

    Pm^t(i) = Pm(i),       if MET_min^1 ≤ DE(t) ≤ DE_max
    Pm^t(i) = β · Pm(i),   if 0 ≤ DE(t) < MET_min^1    (16)

where β is a constant (β ≥ 4) that scales the mutation rate, and Pm(i) is an adaptive mutation probability. The expression for Pm(i) is given below:

    Pm(i) = Max_m / (1 + λ2 · exp(−(f_max − g(x_i)) / (κ · (f_max − f_avg))))    (17)

where Max_m is the maximum value allowed for Pm(i), and λ2 is a mutation factor that determines the threshold boundary of Pm(i). f_avg, f_max, f_max − g(x_i) and f_max − f_avg are as defined above, and g(x_i) is the fitness of the chromosome under mutation. Here we let κ = 2, the same value as used in Ref. [16]. Under the proposed adaptation scheme, the range of Pm(i) always lies within the interval [Max_m / (1 + λ2), Max_m], and an appropriate range of Pm(i) can be obtained by fine tuning λ2.

From the expressions of Pc^t(i) and Pm^t(i), it can be seen that these probabilities, based on variable sigmoid functions, are modified dynamically for each individual in real time, as opposed to existing studies that set single fixed values for all individuals in the population. The algorithm adjusts Pc^t(i) and Pm^t(i) adaptively according to the current evolutionary condition of the population and the distance between the current individual and the optimal individual. More specifically, the values of Pc^t(i) and Pm^t(i) are high when the current individual is far from the current optimal individual of the population. In contrast, when the current individual and the current optimal individual are very close, Pc^t(i) and Pm^t(i) will be relatively low.
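The two-stage mutation probability of Eqs. (16)-(17) can be sketched as follows (default parameter values follow the parameter-settings section; the fallback when f_max = f_avg is our assumption):

```python
import math

def mutation_prob(g_i, f_avg, f_max, de_t, de_max,
                  max_m=0.1, lam2=1.5, kappa=2.0, k1=0.3, beta=8.0):
    """Two-stage mutation probability Pm^t(i) of Eqs. (16)-(17) (a sketch)."""
    if f_max == f_avg:
        base = max_m / (1.0 + lam2)      # degenerate population: lower bound
    else:
        # sigmoid of Eq. (17)
        d = (f_max - g_i) / (kappa * (f_max - f_avg))
        base = max_m / (1.0 + lam2 * math.exp(-d))
    met_min1 = k1 * de_max               # detection threshold MET_min^1 = k1 * DE_max
    # Eq. (16): large disturbance (beta * Pm) once diversity drops below the threshold
    return base if de_t >= met_min1 else beta * base
```

With these defaults, an individual at the current optimum in a diverse population gets the lower bound Max_m / (1 + λ2) = 0.04, while the same individual in a collapsed population gets β times that value.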

Then, with the mutation probability, the judging criteria for the above two-combination mutation are as follows.

In the first case, i.e., when MET_min^1 ≤ DE(t) ≤ DE_max, a relatively high level of population diversity is maintained, as in the early stage of evolution. As soon as the judging criterion detects this condition, a general mutation operator is applied, using the variant of Gaussian mutation below with an adaptive probability Pm(i) that decreases as the fitness of the individual increases. Due to epistatic interactions, small changes in the individuals tend to bring about big differences in the fitness values.

In the second case, i.e., when 0 ≤ DE(t) < MET_min^1, the diversity of the population has fallen below the detection threshold MET_min^1, since most individuals in the population have become very similar in the later stage of evolution. As soon as the judging criterion detects this condition, in which the algorithm is close to converging to a local optimum, it tries to jump out of that local optimum by promoting exploration. A large disturbance with a high probability is then performed to introduce more diversity into the population and improve the capability of global search. According to this reasoning, the solutions are disrupted with probability Pm^t(i) = β · Pm(i) so that better solutions can be generated and the IDCGA algorithm can escape the local optimum.

Preliminary experiments were carried out to assess the sensitivity of the mutation condition to the choice of β, using values that represented low to high restrictive conditions: 4, 6, 8 and 10. Based on these tests, it was found that the global convergence rate increases as β increases, while the run time first decreases and later increases as β grows. This observation is very helpful for selecting a single β value for all the considered problems; thus, the best value of β is selected in terms of efficiency and speed.

The variant of Gaussian mutation on which the mutation process is based is written as follows. Let x_ij' be the value of gene x_ij after mutation. Then, with mutation probability Pm^t(i), x_ij' is given by

    x_ij' = x_ij + σ(t) · N(0, 1)
    σ(t) = σ(0) · exp(−t / (τ · T_max))    (18)

where N(0, 1) is a random number drawn from the standard normal distribution, σ(t) is an adaptive mutation step size that determines the disturbance amplitude, σ(0) is the initial value of the mutation step size and can be set according to the specific clustering problem, t and T_max are as defined above, and τ is the time constant. The improved σ(t) in Eq. (18) modifies the mutation step size dynamically in real time according to the current condition, thereby achieving self-adaptive changes. The algorithm should search with a larger step size in the early evolution stage to ensure the ability of global search; in contrast, it should search with a smaller step size in the later stage to ensure the ability of local search. Accordingly, the two-combination mutation operation makes full use of the flexible coordination of population entropy and sigmoid functions, and overcomes the disadvantages brought about by mutation with a constant probability and a fixed mutation step size.
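A sketch of the Gaussian mutation with decaying step size from Eq. (18) (sigma0 and tau are illustrative values, to be tuned per clustering problem):

```python
import math
import random

def step_size(t, t_max, sigma0=1.0, tau=1.0):
    """Decaying mutation step size sigma(t) of Eq. (18) (a sketch)."""
    return sigma0 * math.exp(-t / (tau * t_max))

def gaussian_mutate(x, t, t_max, sigma0=1.0, tau=1.0):
    """Gaussian mutation of Eq. (18): add N(0, 1) noise scaled by sigma(t)."""
    s = step_size(t, t_max, sigma0, tau)
    return [g + s * random.gauss(0.0, 1.0) for g in x]
```

The exponential decay gives large, exploratory steps early in the run and small, exploitative steps near T_max, which is exactly the early/late behavior described in the text.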
The IDCGA algorithm for fuzzy clustering (IDCGA-FCM) can be described as follows:

Algorithm 2. IDCGA-FCM
Require: dataset and number of clusters c;
1. Initialize the parameters of FCM and IDCGA, including the population size M, the maximum crossover and mutation probabilities Max_c and Max_m, k1, m, ε and T_max;
2. Adopt the chaotic map to generate initial cluster centers and create an initial population P(0) using Eq. (8);
3. Classify each individual as living or dead on the k × k grid at random;
4. Decode each individual to obtain the cluster centers, and calculate the membership degrees using Eq. (5);
5. Calculate the J of each individual using Eq. (2) based on U and V;
6. Calculate the fitness value of each individual using Eq. (9), and store f_avg and f_max at each iteration;
7. Calculate the population entropy DE(t) using Eq. (15);
8. Update states synchronously by the evolutionary rule using Eq. (10);
9. Select each living individual and take its neighborhood as parents;
10. Update the crossover probability of each individual using Eq. (11);
11. Update the mutation probability of each individual using Eq. (16);
12. Implement the parents' recombination with an adaptive probability Pc^t(i) using Eq. (12);
13. If MET_min^1 ≤ DE(t) ≤ DE_max, implement the individual's general mutation with an adaptive probability Pm(i) using Eq. (18); otherwise implement the individual's big mutation with a high probability β · Pm(i);
14. Evaluate fitness and replace the existing individual if the new fitness is higher;
15. If IDCGA-FCM has not met the termination condition (|J(t+1) − J(t)| < ε or t > T_max), go to step 4;
return the best individual.
4. Hybrid fuzzy c-means and improved cellular genetic algorithm for clustering problem

The FCM algorithm is faster than the IDCGA algorithm, since it only uses the gradient descent method and evaluates fewer functions, but it is easily trapped in local minima. In this paper, IDCGA is integrated with FCM to form a hybrid clustering algorithm, called IDCGA2-FCM, which takes full advantage of the merits of both IDCGA and FCM. This is expected to lead to a more accurate balance between global exploration and local exploitation, thereby achieving excellent clustering results and a faster convergence speed.

In order to combine IDCGA and FCM effectively, the key issue is to determine when the FCM iterations are employed and how to perform them adaptively within the algorithm. In fact, the algorithm should ensure the ability of global search and enhance exploration in the early evolution stage; on the contrary, it should ensure the ability of local search and enhance exploitation in the later stage.

4.1. Description of IDCGA2-FCM algorithm

On the basis of the above discussion, we again use the population entropy to measure the diversity of the population so as to detect the current condition. Suppose the convergence threshold of the population is MET_min^2, with MET_min^2 = k2 · DE_max, where DE_max is the same as defined above and k2 is a combination factor that adjusts the proportion of FCM and IDCGA in the whole search process. The implementation strategy adopted in this paper can be expressed as follows:

In the first stage, if MET_min^2 ≤ DE(t) ≤ DE_max, i.e., the diversity of the population is maintained at a relatively high level in generation t, the algorithm prefers exploration and favors global search. In this case, IDCGA2-FCM only makes full use of the strong global search capability of IDCGA.

In the second stage, if DE(t) < MET_min^2, i.e., the diversity of the population has dropped to a relatively low level below the convergence threshold MET_min^2, the algorithm prefers exploitation and favors local search. Under this circumstance, FCM is performed after the genetic operations to strengthen the ability of local search, with the goal of guiding the individuals to approach the optimal solution quickly. The specific criterion is described in detail below.


(1) Firstly, all individuals in the population are ranked in descending order of their fitness values, and only a top fraction of the ranked individuals is selected. Here the golden section method is used to determine this fraction; that is, only the top 38.2% ranked individuals are subjected to an accurate local search using FCM iterations as follows:
    (A) Set the iteration counter to 0; set the number of iterations Gd; get the cluster centers by decoding each individual.
    (B) Calculate the membership degrees using Eq. (5).
    (C) Calculate the cluster centers for each individual using Eq. (4).
    (D) If the counter has not reached Gd, go to step (B); otherwise stop the loop, calculate J using Eq. (2), and update each individual by encoding the cluster centers.
(2) The unselected 61.8% of individuals maintain the original population structure; thus the balance between global search and
local search could be achieved.
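The FCM refinement loop of steps (A)-(D) can be sketched with the standard FCM update rules, assuming the paper's Eqs. (4) and (5) are the usual center and membership updates (a NumPy illustration, with Gd = 5 as in the parameter settings):

```python
import numpy as np

def fcm_refine(X, centers, m=2.0, iters=5):
    """FCM local-search iterations used in step (1) of IDCGA2-FCM (a sketch).

    X:       (n, d) data matrix
    centers: (c, d) initial cluster centers decoded from an individual
    Returns the refined centers and the (n, c) membership matrix.
    """
    for _ in range(iters):
        # membership update (standard FCM rule): u_ji = 1 / sum_k (d_ji/d_jk)^(2/(m-1))
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)      # avoid division by zero at a center
        u = 1.0 / np.sum((dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0)),
                         axis=2)
        # center update (standard FCM rule): v_i = sum_j u_ji^m x_j / sum_j u_ji^m
        um = u ** m
        centers = (um.T @ X) / np.sum(um, axis=0)[:, None]
    return centers, u
```

In the hybrid algorithm this refinement is applied only to the decoded centers of the top 38.2% of individuals, after which the refined centers are encoded back into those chromosomes.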


Based on all the descriptions above, Algorithm 3 shows the steps of IDCGA2-FCM. The whole procedure is summarized in Fig. 2 for quick reference.

Algorithm 3. IDCGA2-FCM
Require: dataset and number of clusters c;
1. Initialize the parameters of FCM and IDCGA, including the population size M, the maximum crossover and mutation probabilities Max_c and Max_m, k1, k2, m, ε and T_max;
2. Adopt the chaotic map to generate initial cluster centers and create an initial population P(0) using Eq. (8);
3. Classify each individual as living or dead on the k × k grid at random;
4. Decode each individual to obtain the cluster centers, and assign the membership degrees using Eq. (5);
5. Calculate the J of each individual using Eq. (2) based on U and V;
6. Calculate the fitness of each individual using Eq. (9), and store f_avg and f_max during the iterative process;
7. Calculate the population entropy DE(t) using Eq. (15);
8. Update states synchronously by the evolutionary rule using Eq. (10);
9. Select each living individual and take its neighborhood as parents;
10. Update the crossover probability of each individual using Eq. (11);
11. Update the mutation probability of each individual using Eq. (16);
12. Implement the parents' recombination with an adaptive probability Pc^t(i) using Eq. (12);
13. If MET_min^1 ≤ DE(t) ≤ DE_max, implement the individual's general mutation with an adaptive probability Pm(i) using Eq. (18); otherwise implement the individual's big mutation with a high probability β · Pm(i);
14. Evaluate fitness and replace the existing individual if the new fitness is higher;
15. If DE(t) < MET_min^2, go to step 16; else go to step 4;
16. Sort all the individuals in the population by their fitness values, and adopt the golden section method to determine the excellent individuals; the top 38.2% ranked individuals are then subjected to the following operations, while the unselected 61.8% maintain the original population structure:
    (A) Set the iteration counter to 0; set the number of iterations Gd; get the cluster centers by decoding each individual.
    (B) Calculate the membership degrees using Eq. (5).
    (C) Calculate the cluster centers for each individual using Eq. (4).
    (D) If the counter has not reached Gd, go to step (B); otherwise stop the loop, calculate the criterion J using Eq. (2), and update each individual by encoding the cluster centers.
17. If IDCGA2-FCM has not met the stopping criterion (|J(t+1) − J(t)| < ε or t > T_max), go to step 4;
return the best individual.
PT

4.2. Complexity analysis

The time complexity of a clustering algorithm is one of the most significant issues of concern. The time complexity of our IDCGA-based clustering algorithms is analyzed as follows:

Initialization: The time complexity of the population initialization is O(Mcd).

Fitness evaluation: Each individual is decoded to obtain the cluster centers and the membership degrees are assigned using FCM. On one hand, the time complexity of assigning the membership degrees is O(ncd); on the other hand, the fitness calculation for each individual needs O(n) time. Assuming the population size is M in every generation, the total time complexity of the fitness evaluation is O(Mncd).

Measure of population diversity: Because of the adoption of entropy, the time complexity of calculating the population diversity is O(M(M − 1)cd / 2).

Genetic operations: Each genetic operation requires O(Mcd) time in the worst case. As a result, the time complexity of the genetic operations over the population is O(Mcd).

Integration of FCM: As mentioned above, the time complexity of calculating the membership degrees is O(ncd), and the time complexity of updating the cluster centers for each individual is O(cd). Suppose the number of iterations equals Gd and there are μ individuals chosen for the FCM iterations (μ denotes here the number of selected individuals). Hence, the total time complexity of the integration of FCM is O(μ · Gd · ncd) in the worst case.

Assume that the maximum number of generations equals T_max. It is noticeable that, in general, M is much smaller than 2n. Therefore, the time complexity of the iteration cycle of IDCGA-FCM is O(MncdT_max), and the total time complexity of IDCGA-FCM is O(Mcd(1 + nT_max)). As for IDCGA2-FCM, when M is greater than μ · Gd, the time complexity of its iteration cycle is the same as that of IDCGA-FCM; otherwise, the time complexity of its iteration cycle is O(μ · ncd · Gd · T_max), and the total time complexity of IDCGA2-FCM is O(cd(M + μ · n · Gd · T_max)). However, it is worth noting that the FCM iterations are only performed to favor local search when the diversity of the population drops below a certain level, rather than throughout the whole process, so the practical running time of IDCGA2-FCM is considerably less than that of IDCGA-FCM.

[Fig. 2 appears here: a flowchart of the IDCGA-based clustering algorithms, running from population initialization via the Arnold Cat map, through fitness evaluation, state update by the evolution rule, dynamic crossover and entropy-based two-combination mutation, to the golden-section selection of individuals for FCM iterations and the termination check.]
Fig.2. Flowchart of the IDCGA-based clustering algorithms.

5. Experiments and results

For the purpose of verifying the performance of both methods proposed in this study (IDCGA-FCM and IDCGA2-FCM), several FCM-type clustering algorithms are chosen for extensive comparative analysis: the FCM algorithm [4], the standard genetic FCM clustering algorithm (GA-FCM) [20], the adaptive genetic FCM clustering algorithm (AGA-FCM) [21], the particle swarm optimized FCM clustering algorithm (PSO-FCM) [37], the hybrid clustering algorithm based on fuzzy PSO and FCM (FCM-FPSO) [38], and the hybrid clustering algorithm based on improved PSO and FCM (FCM-IDPSO) [39]. A series of experiments were conducted on synthetic data sets and UCI data sets, aiming to verify the accuracy and efficiency of the presented algorithms.

5.1. Synthetic and UCI data sets

The experiments were performed on the following eight synthetic data sets and eight real-world data sets. The former allow for better control of data behavior and a better understanding of method performance; for the latter, the methods are assessed using several well-known data sets from the UCI Machine Learning Repository [40]. As for the synthetic data sets, the data points in each group of the eight data sets are randomly generated by Gaussian distributions with the centroids as the mean and a random number as the variance. As shown in Fig. 3, six of the data sets, i.e., DS1, DS2, DS3, DS4, DS5 and DS6, have unequal sizes and show different overlapping levels and different cluster shapes; DS7 and DS8 are two large data sets. All of them represent various difficulties for clustering. As for the real-world data sets, eight widely used data sets were chosen: Iris, Wine, Glass, Heart disease, Cancer, Prima, Image segmentation and Landsat Satellite. Detailed descriptions of these data sets can be obtained from the UCI Repository [40]. For convenience, we summarize the main characteristics of the 16 data sets in Tables 2 and 3. As shown in Table 2, the columns give the number of objects n, the number of clusters c and the number of variables d. All the possible extreme values of the clustering objective function obtained by FCM over 2000 different runs
are listed in Table 3.
Fig. 3. Part of the synthetic data sets. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6.

Table 2
Synthetic data sets and real data sets used in the experiments.

Dataset   n     c   d      Dataset                  n     c   d
DS1       200   4   2      Iris                     150   3   4
DS2       280   4   3      Wine                     178   3   13
DS3       300   5   2      Glass                    214   6   9
DS4       250   5   3      Heart disease            270   2   13
DS5       360   6   3      Cancer                   683   2   9
DS6       500   10  2      Prima indians diabetes   768   2   8
DS7       2100  3   12     Image segmentation       2310  7   19
DS8       5400  9   12     Landsat Satellite        6435  6   36

Table 3
All the extreme values of J for the 16 data sets.

Data set           Extreme values
DS1                246.8857  268.3853  274.9073  287.1339  296.9416  312.5941
DS2                310.2638  448.0859  459.8187  474.5458  516.2935
DS3                2.1089  3.0356  3.1825  3.4949
DS4                1.7118  2.5275  3.0224  3.1206  3.3164  4.2953
DS5                3.5275e+04  5.1123e+04  5.1648e+04  6.1335e+04
DS6                5.4062e+03  9.8901e+03  1.0924e+04  1.1116e+04  1.1745e+04  1.2121e+04  1.2854e+04  1.3297e+04  1.3668e+04  1.4297e+04  1.5239e+04  1.5436e+04  1.6764e+04  1.7672e+04  2.1250e+04
DS7                1.5471e+04  2.0893e+04  2.1124e+04
DS8                1.3095e+06  1.7158e+06  1.7480e+06  1.7553e+06  1.7640e+06  1.7791e+06  1.7848e+06  1.7907e+06  1.8071e+06  1.8189e+06  1.8290e+06  1.8321e+06  1.8598e+06  1.8692e+06  1.9859e+06  2.0040e+06  2.0118e+06  2.0144e+06  2.0201e+06  2.1044e+06  2.1205e+06  2.1438e+06  2.5241e+06  2.6327e+06
Iris               60.5760  76.6361  105.8731
Wine               1.7961e+06  2.5323e+06  2.6986e+06
Glass              154.1460  154.9533  157.9265  159.5464  159.9705  160.4936  160.5597  160.6937  165.2214  168.3052
Heart              4.0687e+05  4.8675e+05
Cancer             1.4917e+04
Prima              3.9868e+06
Image seg.         5.6668e+06  5.6802e+06  5.7134e+06  5.7279e+06  5.7404e+06  5.7810e+06  5.7847e+06  5.7975e+06  5.8388e+06  5.8526e+06
Landsat Satellite  7.7086e+06  7.7087e+06  7.7088e+06  7.7257e+06  8.0574e+06  8.5070e+06
From Tables 2 and 3, it can be seen that the chosen data sets cover examples of small/large numbers of objects, low/medium/high dimensionality, small/medium/large numbers of clusters, and single/multiple extrema. We consider that the characteristics of these data sets make them attractive and meaningful enough to support our claims, and thus they allow us to draw clear conclusions about the accuracy and efficiency of the proposed algorithms.

5.2. Performance metrics

To quantitatively evaluate the clustering performance of these algorithms, three kinds of performance metrics are applied in the comparative experiments: the value of J vs. the number of iterations, the run time, and the cluster validity indices (CVIs).

A number of CVIs have been proposed as measures of the goodness or validity of a cluster solution; several of them are described in [13]. Among those indices, we selected six well-known CVIs, namely PC, PE, XB, FS, PBMF and SC, as shown in Table 1. Larger values of the PC and PBMF indices mean better fuzzy clustering results, while smaller values of the PE, XB, FS and SC indices are preferred.
Table 1
A brief description of the six selected CVIs.

Validity index                            Functional description                                                                       Optimal partition
Partition coefficient                     PC = (1/n) Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m                                                    Max(PC)
Partition entropy                         PE = −(1/n) Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij log μ_ij                                            Min(PE)
Xie-Beni function                         XB = [Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m ||x_j − v_i||²] / [n · min_{i≠j} ||v_i − v_j||²]        Min(XB)
Fukuyama-Sugeno function                  FS = Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m (||x_j − v_i||² − ||v_i − v̄||²)                          Min(FS)
Pakhira-Bandyopadhyay-Maulik fuzzy index  PBMF = ((1/c) · (E_1/E_c) · D_c)², with E_1 = Σ_{j=1}^{n} ||x_j − v̄||,                       Max(PBMF)
                                          E_c = Σ_{i=1}^{c} Σ_{j=1}^{n} μ_ij^m ||x_j − v_i||, D_c = max_{i,j} ||v_i − v_j||
SC function                               SC = Σ_{i=1}^{c} [Σ_{j=1}^{n} μ_ij² ||x_j − v_i||²] / [Σ_{j=1}^{n} μ_ij · Σ_{l=1}^{c} ||v_i − v_l||²]    Min(SC)

In order to take advantage of each fuzzy CVI and obtain the true partitioning results, we optimize combined objective functions instead of a single CVI during the clustering process. (The original symbols for the three combined objectives were lost in extraction; they are denoted V1, V2 and V3 here, with the signs chosen so that each combination is consistent with the optimization direction of its constituent indices.) They are defined as follows:

    V1(c) = PC(c) − XB(c)    (19)

PC(c) considers only the membership functions, while XB(c) considers both the membership functions and the geometrical properties of the data structure, measuring compactness and separation. A larger value of V1 implies better fuzzy clustering results.

    V2(c) = PE(c) + SC(c)    (20)

PE(c) uses only fuzzy memberships and may lack a connection to the geometrical structure of the data, whereas SC(c) simultaneously takes into account the fuzzy memberships and the data structure. A smaller value of V2 means a better solution.

    V3(c) = PBMF(c) − FS(c)    (21)

Both PBMF(c) and FS(c) measure the compactness and separation of each cluster. The most compact and separated partition
is found when the maximum value of V3 is achieved.

5.3. Parameter settings

The same parameters are used throughout the experiments in order to arrive at a fair comparison. For all the data sets, the population size is taken as 100. In all of the algorithms, the fuzzification coefficient is set to m = 2 and the convergence threshold to ε = 10^-5. The crossover and mutation probabilities for GA-FCM and AGA-FCM are Pc = 0.8 and Pm = 0.05, respectively. Because of the different characteristics of the data sets, we use a different maximum number of generations T_max for each: 500 generations for DS7 and Glass, 1000 generations for DS8, 3000 generations for Image segmentation and Landsat Satellite, and 100 generations for the remaining 11 data sets.

In order to optimize the performance of IDCGA-FCM and IDCGA2-FCM, fine tuning and preliminary experiments were conducted and the best values for their parameters were chosen. Based on the experimental results, the algorithms achieved the most desirable performance for every data set considered under the following settings: 0.99 for Max_c, 0.1 for Max_m, 0.5 for λ1, 1.5 for λ2, 0.3 for k1, 0.5 for k2, 8 for β, and 5 for Gd. Finally, all the algorithms share the same stopping criterion: convergence or the maximum number of generations. If an algorithm fails to satisfy the convergence criterion, it
terminates when reaching the maximum number of generations defined for that data set.

5.4. Results and analysis

In this section, IDCGA-FCM and IDCGA2-FCM are compared with FCM, GA-FCM and AGA-FCM through experiments on the two kinds of data sets used above. The comparison with FCM is important to demonstrate the significant improvement over the traditional clustering method. The comparison against GA-FCM and AGA-FCM, the predecessors of IDCGA-FCM and IDCGA2-FCM, was carried out because both of them have shown better results than FCM. Using all of them therefore gives a more convincing validation of the efficiency and superiority of the proposed algorithms.
Firstly, we compare the convergence speed of FCM, the two GA-based algorithms and the two proposed IDCGA-based clustering algorithms. In the experiments, the five algorithms were run 100 times independently for each data set, with randomly generated initializations for each repetition. DS8 and Landsat Satellite are two exceptions because of their higher complexity and the larger amount of computation time they require compared to the other datasets. Average values were recorded to account for the stochastic nature of the algorithms. For a better view of the results, the average values of J obtained by the five algorithms for the synthetic and real-world data sets are displayed in a series of plots in Fig. 4 and Fig. 5, respectively. For a more careful comparison among the algorithms, we also recorded the mean and standard deviation of J for the five algorithms, as shown in Tables 4 and 5, respectively. All of the experiments were run on a computer with a 3.50 GHz CPU and 4 GB RAM. Fig. 6 gives the run time of FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM.
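The quantity J plotted and tabulated throughout these experiments is the standard FCM objective; a minimal sketch of how it is computed for one run, and summarized over repetitions, is shown below (array shapes are assumptions):

```python
import numpy as np

def fcm_objective(X, V, U, m=2.0):
    # Standard FCM objective: J = sum_i sum_k u_ik^m * ||x_k - v_i||^2
    # X: (n, dim) data, V: (c, dim) centers, U: (c, n) memberships.
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # (c, n)
    return float((U ** m * d2).sum())

def summarize(runs):
    # Mean and standard deviation over the independent runs, as in Tables 4-5.
    return float(np.mean(runs)), float(np.std(runs))
```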
Fig. 4. Clustering results for the eight synthetic data sets. The figures plot the average J obtained by FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6. (g) DS7. (h) DS8.
Fig. 5. Clustering results for the seven real-world data sets. The figures plot the average J obtained by FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM. (a) Iris. (b) Wine. (c) Heart. (d) Cancer. (e) Glass. (f) Image segment. (g) Landsat Satellite.

It can be seen from Figs. 4 and 5 that, compared with the other four algorithms, the hybrid IDCGA2-FCM obtained better results on all the data sets and is able to escape from local optima. It is also evident that IDCGA-FCM and IDCGA2-FCM converge to the desired values and improve J over FCM, GA-FCM and AGA-FCM for most data sets, both of which indicate their superiority. In contrast, GA-FCM and AGA-FCM cannot prevent convergence to a local optimum, while FCM quickly gets trapped in local minima. Furthermore, IDCGA2-FCM succeeded in converging to the global minimum in relatively few iterations. The key reason for this good performance is that the search of IDCGA-FCM and IDCGA2-FCM is more flexible and better directed owing to the adoption of the dynamic crossover and two-combination mutation operations, as well as the combination with FCM iterations.
Table 4
The average and standard deviation values of J obtained by the five algorithms for the eight synthetic data sets. Best results are highlighted in bold.

      FCM                       GA-FCM                    AGA-FCM                   IDCGA-FCM                 IDCGA2-FCM
DS1   261.8843±19.8328          247.2500±0.4336           247.0500±0.2318           246.8857±1.6623e-05       246.8857±2.4090e-06
DS2   316.0114±28.3238          313.5300±2.5047           312.5000±3.2098           310.9053±1.6511           310.2638±2.0708e-08
DS3   2.1791±0.2604             2.1193±0.0081             2.1146±2.0409             2.1090±1.4816e-05         2.1089±1.1995e-09
DS4   1.9722±0.2885             1.7797±0.0546             1.7695±0.0568             1.7125±0.0002             1.7118±7.6576e-07
DS5   3.7032e+04±6071.2112      3.6796e+04±1515.1000      3.6459e+04±525.4369       3.5278e+04±0.8641         3.5275e+04±4.1443e-07
DS6   9385.4643±4073.8475       7605.1065±1649.9000       7361.3725±1596.5485       5597.1986±1157.4660       5406.2495±3.0083e-08
DS7   1.5528e+04±565.2612       1.5579e+04±15.2766        1.5479e+04±3.2716         1.5471e+04±8.6420e-04     1.5471e+04±7.0237e-11
DS8   1.6013e+06±3.1160e+05     1.8865e+06±3.2060e+04     1.6715e+06±2.3115e+04     1.4600e+06±5.9978e+04     1.3095e+06±1.1527e-06

As shown in Table 4, IDCGA-FCM and IDCGA2-FCM achieve better performances than FCM, GA-FCM and AGA-FCM on the well-separated, hyper-spherical and overlapping clusters of the synthetic data sets, and they can escape from local optima. Moreover, IDCGA-FCM and IDCGA2-FCM had equal averages on the first data set, but with different standard deviations. For the other data sets, IDCGA2-FCM always achieved the best average and the smallest standard deviation of J, followed by IDCGA-FCM. For example, IDCGA2-FCM attained the best J value in 100% of the runs on all synthetic data sets. Nevertheless, GA-FCM and AGA-FCM cannot effectively avoid premature convergence to a local optimum, which illustrates that plain combinations of GAs and FCM perform poorly in the search for the global optimum. In addition, it is also noticeable that FCM is the worst approach, since it outperforms neither GA-FCM nor AGA-FCM on any of the synthetic data sets, all of which represent multiple-extrema problems.

Table 5
The average and standard deviation values of J obtained by the five algorithms for the eight real data sets. Best results are highlighted in bold.

          FCM                       GA-FCM                    AGA-FCM                   IDCGA-FCM                 IDCGA2-FCM
Iris      61.4189±5.6035            61.2735±0.7586            60.9240±0.3453            60.5765±0.0006            60.5760±8.4175e-07
Wine      1.8232e+06±1.1759e+05     1.7987e+06±3549.7000      1.7967e+06±2166.7000      1.7963e+06±1509.9000      1.7961e+06±2.1080e-06
Heart     4.0847e+05±1.1239e+04     4.0734e+05±615.4864       4.0728e+05±296.6500       4.0711e+05±221.8300       4.0687e+05±1.6273e-06
Glass     156.4697±6.1583           158.5232±2.3457           156.4000±1.6378           154.9402±0.9457           154.1460±8.1032e-07
Cancer    1.4917e+04±1.8115e-07     1.5065e+04±136.5500       1.4999e+04±91.4336        1.4930e+04±64.2690        1.4917e+04±7.2776e-08
Prima     3.9868e+06±1.3061e-06     3.9933e+06±5.5382e+03     3.9919e+06±2.8462e+03     3.9901e+06±2.7909e+03     3.9868e+06±2.8451e-07
Image     5.7516e+06±5.3511e+04     5.8802e+06±4.0529e+04     5.8093e+06±4.0288e+04     5.7398e+06±2.3847e+04     5.7142e+06±2.1472e+04
Satellite 7.9747e+06±9.7453e+04     8.9902e+06±4.5908e+04     8.2143e+06±2.4465e+04     7.7147e+06±1358.7345      7.7086e+06±3.8909e-06

Table 5 demonstrates that the value of J obtained by IDCGA2-FCM is always better than that obtained by the other algorithms on the same real-world data sets. We can clearly see that IDCGA2-FCM provides adequate efficiency and accuracy for solving data sets with a great number of clusters or extrema, especially Image segment and Landsat Satellite, but shows no obvious advantage on data sets with a single extremum, such as Cancer and Prima. Although IDCGA-FCM is inferior to IDCGA2-FCM, it obtained better results than GA-FCM and AGA-FCM on all data sets, and it surpasses FCM on all data sets but two (Cancer and Prima). Besides, the experimental results reveal that, when the size of the data set is small (i.e., the number of objects or clusters), GA-FCM and AGA-FCM surpass FCM, but with increasing data set size, FCM obtains better results than GA-FCM and AGA-FCM. Additionally, both IDCGA-FCM and IDCGA2-FCM have better stability and robustness than the two GA-based methods considered in the comparison. The major causes of these differences are that, on the one hand, IDCGA-FCM and IDCGA2-FCM take full advantage of IDCGA's global search capability, which prevents premature convergence to local extrema, compared to FCM, GA-FCM and AGA-FCM; on the other hand, IDCGA2-FCM makes use of the strong local search ability of FCM iterations, which makes it converge exactly to the optimum, compared to GA-FCM, AGA-FCM and IDCGA-FCM.
Fig. 6. Run time of five algorithms for the 16 data sets. (a) Six synthetic data sets. (b) Six UCI data sets. (c) Four large data sets.

As can be seen from Fig. 6, the run time of IDCGA2-FCM is significantly less than that of GA-FCM, AGA-FCM and IDCGA-FCM, with the time consumption reduced by about 50%-90%. Comparatively, the run time of IDCGA-FCM is slightly less than that of GA-FCM and AGA-FCM when the size of the data set is small. But when handling data sets of large size and high dimension, the clustering time of IDCGA-FCM becomes greater than that of GA-FCM and AGA-FCM (see Fig. 6c). Thus, there is still a deficiency in its computation time. However, IDCGA2-FCM makes full use of FCM iterations to accelerate convergence during the evolution process, while IDCGA-FCM does not. That is why IDCGA2-FCM converges in a relatively shorter time.
By comparing Tables 4-5 and Fig. 6 together, we note that the hybrid IDCGA2-FCM obtained superior results to all of the other four algorithms on data sets of high complexity. Therefore, it can be concluded that IDCGA2-FCM is an efficient clustering algorithm that achieves very encouraging results in terms of both the quality of the solution found and the convergence speed.

In the following, the three CVIs described above are employed to evaluate the clustering accuracy of the five algorithms. We recorded the average and standard deviation values of the three CVIs for the five algorithms on all of the test data sets. The results obtained by the five algorithms on each of the 16 datasets are reported in Tables 7-9.
Through a careful analysis of the results in Tables 7-9, we can easily find that the three CVIs are not always simultaneously best. We treat the configuration with the most optimized CVIs as the best. It is obvious that in most cases the three CVIs obtained by IDCGA2-FCM are better than those obtained by the other four algorithms. It is also prominent that IDCGA-FCM achieved better CVI values than FCM, GA-FCM and AGA-FCM for most data sets.
Table 7
The average and standard deviation values of the first CVI obtained by the FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM methods for the 16 data sets. Best results are highlighted in bold.

          FCM                   GA-FCM                AGA-FCM               IDCGA-FCM             IDCGA2-FCM
DS1       0.3284±0.0127         0.3436±0.0010         0.3480±0.0007         0.3496±2.0401e-05     0.3496±1.8685e-06
DS2       0.6870±0.0173         0.6884±0.0026         0.6893±0.0016         0.6904±0.0038         0.6927±2.6438e-08
DS3       0.7134±0.0257         0.7162±0.0007         0.7166±0.0006         0.7183±2.1205e-05     0.7183±2.6703e-08
DS4       0.7677±0.0268         0.7611±0.0056         0.7641±0.0064         0.7722±0.0001         0.7725±4.0621e-06
DS5       0.6541±0.0389         0.6527±0.0125         0.6613±0.0050         0.6704±9.7679e-05     0.6704±1.3147e-08
DS6       0.1575±0.0381         0.7951±0.0284         0.8027±0.0270         0.8427±0.0129         0.8754±1.3040e-09
DS7       0.5906±0.0207         0.5899±0.0016         0.5924±0.0006         0.5932±1.3056E-05     0.5932±4.0183E-10
DS8       -0.2750±0.0142        -1.3609±0.0077        0.4997±0.0053         0.6121±0.0061         0.6512±2.2689E-09
Iris      0.6415±0.0236         0.6398±0.0030         0.6410±0.0015         0.6460±7.2613e-05     0.6461±1.1465e-10
Wine      0.6614±0.0266         0.6642±0.0017         0.6650±0.0010         0.6655±7.5943e-04     0.6652±1.2148e-07
Heart     0.4512±0.0199         0.4551±0.0010         0.4553±0.0007         0.4572±0.0014         0.4564±1.0435e-09
Glass     -1.8701±0.0204        -2.9842±0.0383        -2.8537±0.0206        -2.2610±0.0092        -1.8648±1.5310e-06
Cancer    0.7306±5.9498e-08     0.7260±0.0030         0.7283±0.0016         0.7291±0.0025         0.7306±4.7799e-11
Prima     0.7015±7.3244e-08     0.7004±0.0011         0.7025±0.0009         0.7031±0.0029         0.7015±8.0792e-10
Image     -0.0395±0.0011        -0.1673±0.0092        -0.1351±0.0053        -0.0218±8.8081E-04    -0.0085±1.4081E-04
Satellite 0.0254±0.0003         -0.5196±0.0077        0.0256±0.0006         0.0487±0.0004         0.1096±5.0575E-08

Table 8
The average and standard deviation values of the second CVI obtained by the FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM methods for the 16 data sets. Best results are highlighted in bold.

          FCM                   GA-FCM                AGA-FCM               IDCGA-FCM             IDCGA2-FCM
DS1       0.8115±0.0510         0.7252±0.0036         0.7240±0.0023         0.7228±3.4416e-05     0.7228±3.0707e-06
DS2       0.5432±0.0266         0.5443±0.0047         0.5435±0.0030         0.5419±0.0067         0.5401±5.8459e-08
DS3       0.5479±0.0416         0.5420±0.0013         0.5414±0.0011         0.5402±5.3371e-05     0.5402±6.7130e-08
DS4       0.4255±0.0442         0.4326±0.0109         0.4305±0.0120         0.4190±0.0003         0.4187±7.8671e-06
DS5       0.6844±0.0757         0.6856±0.0244         0.6791±0.0100         0.6600±0.0002         0.6599±2.4315e-08
DS6       0.4104±0.1162         0.3930±0.0653         0.3799±0.0587         0.2968±0.0276         0.2914±2.6947e-09
DS7       0.6949±0.0282         0.6961±0.0027         0.6927±0.0009         0.6916±1.9800E-05     0.6915±6.0127E-10
DS8       0.8718±0.1013         0.9992±0.0137         0.9392±0.0022         0.8339±0.0187         0.7702±4.9232E-09
Iris      0.4601±0.0310         0.4587±0.0046         0.4568±0.0059         0.4541±9.1517e-05     0.4540±8.26816e-11
Wine      0.4471±0.0358         0.4426±0.0020         0.4419±0.0012         0.4415±8.1357e-04     0.4412±1.3813e-07
Heart     0.9749±0.0342         0.9717±0.0020         0.9713±0.0014         0.9682±0.0028         0.9692±1.3645e-09
Glass     1.0720±0.0398         1.1457±0.0367         1.1264±0.0412         1.0905±0.0188         1.0621±4.0967e-06
Cancer    0.5583±1.2271e-07     0.5681±0.0050         0.5630±0.0027         0.5618±0.0041         0.5583±9.8406e-11
Prima     0.5999±1.0393e-07     0.6024±0.0017         0.5980±0.0014         0.5969±0.0044         0.5999±1.1464e-09
Image     1.4483±0.0365         1.4809±0.0691         1.3792±0.0528         1.2709±0.0457         1.2684±0.0111
Satellite 1.1865±0.0131         1.3664±0.0539         1.2578±0.0065         1.1935±0.0062         1.1862±4.9070E-08

Table 9
The average and standard deviation values of the third CVI obtained by the FCM, GA-FCM, AGA-FCM, IDCGA-FCM and IDCGA2-FCM methods for the 16 data sets. Best results are highlighted in bold.

          FCM                        GA-FCM                     AGA-FCM                    IDCGA-FCM                  IDCGA2-FCM
DS1       496.4852±193.7497          659.4360±12.6050           660.6160±7.6293            663.5584±0.1680            663.5598±0.0106
DS2       2541.6470±68.5956          2537.5702±63.0580          2539.2160±45.9410          2540.4944±34.5419          2548.5457±0.0004
DS3       26.0273±0.9640             26.1833±0.3621             26.2145±0.2863             26.1991±0.0195             26.1993±2.0726e-05
DS4       37.7956±1.7521             37.4615±0.8827             37.5937±0.8475             37.9992±0.0801             38.0401±0.0019
DS5       3.8783E+05±3.9639e+04      3.8299E+05±1.5940e+04      3.8776E+05±1.5269e+04      3.9912E+05±641.4502        3.9936E+05±0.0650
DS6       6.0819E+05±4.9726e+04      6.2579E+05±2.3797e+04      6.2851E+05±2.0161e+04      6.5414E+05±1.3974e+04      6.5726E+05±0.0059
DS7       2.6272E+04±2662.4577       2.6146E+04±217.4800        2.6369E+04±835.5656        2.6528E+04±5.0306          2.6540E+04±1.5299E-04
DS8       2.3598E+07±2.9202e+06      2.0257E+07±7.3476e+05      2.2164E+07±9.4674e+05      2.4372E+07±5.3869e+05      2.6300E+07±0.5716
Iris      482.6120±15.3452           482.1903±11.3114           484.4630±0.0013            487.3314±0.2921            487.5008±4.6300e-07
Wine      1.1582E+07±6.6287e+05      1.1667E+07±9.3267e+04      1.1675E+07±4.9859e+04      1.1689E+07±2.9835e+04      1.1683E+07±1.2915
Heart     -1.3534E+05±4.9734e+04     -1.2931E+05±5.8594e+03     -1.2902E+05±4.3475e+03     -1.2837E+05±6.4429e+03     -1.2828E+05±0.0018
Glass     325.6018±30.3772           272.7430±28.5090           282.5500±31.8310           304.1984±23.1498           332.2281±0.0123
Cancer    7371.2323±0.0064           7036.4600±1.2236e+03       7183.6800±934.8458         7207.1001±680.3100         7371.2368±4.1102e-05
Prima     8.6645E+05±1.6752          8.3761E+05±9.1954e+04      8.8490E+05±7.5557e+04      8.8773E+05±1.1141e+05      8.6646E+05±0.0452
Image     7.3914E+06±1.4715e+06      7.3957E+06±2.6108e+06      1.0145E+07±2.6565e+06      1.3082E+07±3.2363e+05      1.3157E+07±2.7635e+05
Satellite 2.4104E+07±4.3593E+05      1.9000E+07±2.8595E+05      2.1795E+07±2.8836E+05      2.5060E+07±4551.4000       2.5546E+07±2.2945

Let us analyze the first CVI (Table 7) first: IDCGA2-FCM obtains the largest values on all data sets except Wine and Heart, whereas IDCGA-FCM yields the best results on six data sets and FCM only does well on two data sets with a single extremum. Table 8 shows that IDCGA-FCM and IDCGA2-FCM obtain better results than the other three algorithms with respect to the second CVI, since they yield the best values in 4 and 14 out of the 16 data sets, respectively. In terms of the third CVI, IDCGA2-FCM obtains better results than the other four algorithms, as seen in Table 9. For this validity measure, IDCGA2-FCM yields the largest values in 13 out of the 16 data sets, failing only on Wine, Heart and Prima, which are better addressed by IDCGA-FCM. Moreover, the standard deviation values obtained by IDCGA2-FCM tend to be small relative to their corresponding average values, which reveals that its clustering performance is robust to initialization. Consequently, we can conclude that the clustering accuracy of IDCGA2-FCM is better than that of all the compared algorithms in most cases.

5.5. Comparison with other PSO-based algorithms

The experiments of the previous section proved that both IDCGA-based clustering algorithms proposed in this paper achieve excellent performance, particularly IDCGA2-FCM, which obtained very encouraging results for J, the convergence speed and the CVIs.
To further verify how competitive IDCGA2-FCM is, it is instructive to compare its performance with other PSO-based clustering methods that are representative of state-of-the-art algorithms. For that purpose, three algorithms were chosen: PSO-FCM [37], FCM-FPSO [38] and FCM-IDPSO [39]. The main reason for this choice is that these methods are recently published and show good results. Next, we briefly describe these methods, including the parameter settings used in the subsequent experiments.
PSO-FCM employs particle swarm optimization (PSO) to solve the problem of classifying instances in databases. FCM-FPSO combines fuzzy particle swarm optimization (FPSO) with FCM, which makes use of the merits of both algorithms; in FCM-FPSO, FCM is applied to FPSO every few iterations so that the fitness value of each particle is improved. FCM-IDPSO is a hybrid method for fuzzy clustering that combines improved self-adaptive particle swarm optimization (IDPSO) and FCM; in FCM-IDPSO, FCM is also applied to IDPSO every few iterations. Specifically, we used the real-coded versions of the algorithms and the same parameter settings used by their authors. The algorithms have the same population size and stopping criterion as in the previous section. In the experiments, the four algorithms were assigned the following values for their parameters:
for IDCGA2-FCM: the same values used in Section 5.3;
for PSO-FCM: wmax = 0.9, wmin = 0.4, c1 = c2 = 2.0;
for FCM-FPSO: wmax = 0.9, wmin = 0.1, c1 = c2 = 2.0;
for FCM-IDPSO: winitial = 0.9, wfinal = 0.4, c1 = c2 = 2.0 at instant t = 1, and 100 for the remaining adaptation parameter.
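All three PSO variants rely on a time-varying inertia weight; the linearly decreasing schedule commonly used with the wmax/wmin settings above (assumed here) can be sketched as:

```python
def inertia_weight(t, T, w_start=0.9, w_end=0.4):
    # Linear decrease from w_start at iteration t = 0 to w_end at t = T.
    return w_start - (w_start - w_end) * (t / T)
```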
The experiments were conducted on the six real data sets in the same way as those in the previous section. The average and standard deviation values of J obtained by the four algorithms are presented in Table 10. Because PSO-FCM diverges, it terminates only when reaching the maximum number of generations defined for each dataset. Thus, we compared the convergence speed of FCM-FPSO, FCM-IDPSO and IDCGA2-FCM, as depicted in Fig. 7.

Table 10
The average and standard deviation values of J obtained by the four algorithms. Best results are highlighted in bold.

       PSO-FCM                    FCM-FPSO                  FCM-IDPSO                 IDCGA2-FCM
Iris   61.5452±2.7795             60.5895±0.0150            60.5821±0.0131            60.5760±8.4175e-07
Wine   1.8003e+06±6814.7708       1.7963e+06±190.5211       1.7963e+06±479.6602       1.7961e+06±2.1080e-06
Heart  4.0786e+05±938.8300        4.0687e+05±2.6266         4.0687e+05±1.2480         4.0687e+05±1.6273e-06
Glass  190.9924±16.0056           161.0099±1.7559           159.8868±0.8932           154.1460±8.1032e-07
Prima  4.0573e+06±9.3247e+04      3.9881e+06±726.9657       3.9878e+06±501.4690       3.9868e+06±2.8451e-06
Image  2.4433e+07±4.7579e+06      6.1724e+06±1.6523e+05     6.0009e+06±6.5018e+04     5.7142e+06±1.1472e+04

It can be seen from Table 10 that the results of IDCGA2-FCM on all data sets are better than those of the PSO-based algorithms, which indicates that IDCGA2-FCM clearly outperforms PSO-FCM, FCM-FPSO and FCM-IDPSO. Comparatively, FCM-FPSO and FCM-IDPSO only match the best result for J on Heart, while PSO-FCM always obtained the largest values. It is also seen that FCM-FPSO, FCM-IDPSO and IDCGA2-FCM required small numbers of iterations as a result of the combination with FCM, as shown in Fig. 7. However, PSO-FCM does not integrate FCM iterations to achieve a faster convergence; thus, it was the worst one concerning convergence speed. Obviously, IDCGA2-FCM shows the smallest number of iterations on all data sets except Image segment. Even though FCM-FPSO and FCM-IDPSO needed fewer iterations than IDCGA2-FCM on Image segment, this led to a poor result for J. Accordingly, it can be concluded that IDCGA2-FCM is a highly competitive algorithm and more effective than PSO-FCM, FCM-FPSO and FCM-IDPSO for clustering problems.

Fig. 7. Number of iterations obtained by three algorithms on the six data sets.

5.6. Parametric selection of IDCGA2-FCM

To analyze in depth the influence of key parameters, which is crucial for IDCGA2-FCM to achieve better performance, we also designed experiments to examine the effect of the iteration number Gd on the algorithm. The experiments were carried out by adjusting the value of Gd from one to eight while keeping all other parameters the same as those used in Section 5.3.
Firstly, Glass was taken as an example to investigate the clustering results of IDCGA2-FCM at eight different parameter values, i.e., Gd = 1, 2, 3, 4, 5, 6, 7, 8. The clustering results obtained by IDCGA2-FCM as Gd changes are illustrated in Fig. 8. To show the effects clearly, the run times of IDCGA2-FCM as Gd increases are also provided; here we give the results for four datasets as examples, as demonstrated in Fig. 9.
As shown in Fig. 8, IDCGA2-FCM always converged to the global optimum in every instance, which reveals that its convergence remains stable. Although there are small fluctuations in the number of iterations and run time, it is obvious that the run time of IDCGA2-FCM is greatly affected by the value of Gd. As depicted in Fig. 9, the run time of IDCGA2-FCM decreases at first and then increases with increasing Gd. We can conclude from Fig. 8 and Fig. 9 that setting Gd to 5 yields an efficient balance between convergence accuracy and speed. Hence, Gd = 5 is the more appropriate choice in the experiments.

appropriate choice in the experiments.


CE

170
Gd=1
168
Gd=2
166 Gd=3
Gd=4
164
AC

Gd=5
Average J

162 Gd=6
Gd=7
160 Gd=8
158

156

154

152
0 5 10 15 20 25 30
Iteration numbers

Fig. 8. The clustering results obtained by IDCGA2-FCM as Gd changes.


Fig. 9. Run time of IDCGA2-FCM as Gd changes.
6. Conclusions

Two novel adaptive fuzzy clustering algorithms based on the cellular genetic algorithm, referred to as IDCGA-FCM and IDCGA2-FCM, have been proposed in this paper. The first is a standalone method for fuzzy clustering based on the improved self-adaptive cellular genetic algorithm (IDCGA). The second is a hybrid method based on FCM and IDCGA which takes advantage of the merits of both algorithms.
There are two contributions in this study. Firstly, an efficient global optimization method, IDCGA, has been presented for a more efficient search. In detail, new dynamic crossover and entropy-based two-combination mutation operations have been constructed to effectively prevent convergence to a local optimum. Adaptive adjusting strategies and judging criteria have been designed to dynamically modify the probabilities of crossover and mutation as well as the mutation step size. In addition, the Arnold cat map has been employed to initialize the population, overcoming the sensitivity of the algorithms to initial cluster centers. Also, a modified evolution rule has been introduced to build a dynamic environment so as to carry out global exploration more effectively. Secondly, IDCGA has been successfully used to optimize the fuzzy clustering model. On one hand, IDCGA-FCM performs fuzzy clustering by using IDCGA directly. On the other hand, an optimal-selection-based strategy has been presented to select some individuals for an exact local search, and IDCGA2-FCM has been developed by integrating IDCGA with optimal-selection-based FCM automatically according to the variation of population entropy. This ensures that IDCGA2-FCM achieves an accurate balance between global exploration and local exploitation.
The superiority of the proposed algorithms over the AGA-FCM, GA-FCM, FCM-IDPSO, FCM-FPSO, PSO-FCM and FCM algorithms has been demonstrated by experiments. The results have proved that IDCGA2-FCM has the most desirable behavior among all the compared algorithms in terms of efficiency and accuracy. It not only overcomes the shortcomings of the GA-based and PSO-based fuzzy clustering methods, but also improves the clustering performance greatly. Moreover, the results reported in this paper might increase its attractiveness for use in real-world applications compared to other GA-based and PSO-based methods.
Our future works include:
1. It would be worth investigating new adjusting strategies for the control parameters and new mechanisms to ensure the diversity of the population. Also, the presented methods will be extended to other fuzzy clustering methods such as the works in [21-25].
2. The number of clusters cannot be determined by IDCGA2-FCM. Future work should extend the proposed algorithms with a method that allows an automatic identification of the number of clusters, i.e., automatic versions of the methods.
3. The application of the algorithms of this paper, or the extensions discussed above, to real-world problems is under consideration, and some results on regional clustering analysis of mainland China will be reported in future papers.

Acknowledgment

This research is supported by the National Natural Science Foundation of China (No. 71461020), the Production-Study-Research Cooperation Program of the Science and Technology Bureau of Guangdong Province (No. 2012B091100175), and the Science and Technology Research Program of the Ministry of Education of Jiangxi Province (No. 11686).

References

[1] R. Xu, D. I. Wunsch, Survey of clustering algorithms, IEEE Transactions on Neural Networks,16(3) (2005) 645-678.
[2] B. A. Pimentel, R.M.C.R. Souza, A multivariate fuzzy c-means method. Applied Soft Computing,13 (4) (2013) 1592-1607.
[3] V. Dey, D. K. Pratihar, G. L. Datta, Genetic algorithm-tuned entropy-based fuzzy C-means algorithm for obtaining distinct

T
and compact clusters, Fuzzy Optimization and Decision Making,10(2) (2011) 153-166.
[4] J. C. Bezdek, Pattern recognition with fuzzy objective function algorithms, Plenum, New York, 1981.

IP
[5] J. Zhou, L. Chen, C. L. Philip Chen, Y. Zhang, H. Li, Fuzzy clustering with the entropy of attribute weights, Neurocomputing,
198 (2016) 125-134.

CR
[6] L. Zhang, W. Pedrycz, W. Lu, X. Liu, L. Zhang, An interval weighed fuzzy c-means clustering by genetically guided
alternating optimization, Expert Systems with Applications, 41(13) (2014) 5960-5971.
[7] B. A. Pimentel, R.M.C.R. Souza, A weighted multivariate fuzzy c-means method in interval-valued scientific production data,
Expert Systems with Applications, 41(7) (2014) 3223-3236.

US
[8] M. S. Xiao, Z. C. Wen, J. W. Zhang, X. F. Wang, An FCM clustering algorithm with improved membership function, Control
and Decision, 30(12) (2015) 2270-2274.
AN
[9] M. Sabzekar, M. Naghibzadeh, Fuzzy c-means improvement using relaxed constraint support vector machines. Applied Soft
Computing, 13(2) (2013) 881-890.
[10] L. Szilágyi, S M. Szilágyi, Generalization rules for the suppressed fuzzy c-means clustering algorithm, Neurocomputing,
M

139(5223) (2014) 298-309.


[11] F. Zhao, J. Fan, H. Liu, Optimal-selection-based suppressed fuzzy c-means clustering algorithm with self-tuning non local
spatial information for image segmentation, Expert Systems with Applications, 41(9) (2014) 4083-4093.
ED

[12] D. Li, H. Gu, L. Zhang, A hybrid genetic algorithm-fuzzy c-means approach for incomplete data clustering based on
nearest-neighbor intervals, Soft Computing, 17(10) (2013) 1787-1796.
[13] K. Zhou, S. Ding, C. Fu, S.L.Yang, Comparison and weighted summation type of fuzzy cluster validity indices. International
PT

Journal of Computers Communications & Control, 9(3) (2014) 370-378.


[14] S. J. Nanda, G. Panda, A survey on nature inspired metaheuristic algorithms for partitional clustering, Swarm and
Evolutionary Computation, 16 (2014) 1-18.
CE

[15] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, NewYork, 2012.
[16] D. X. Chang, X. D. Zhang, C. W. Zheng, A genetic algorithm with gene rearrangement for K-means clustering, Pattern
Recognition, 2009, 42(7) (2009) 1210-1222.
AC

[17] J. Xiao, Y. Yan, J. Zhang, Y. Tang, A quantum-inspired genetic algorithm for k -means clustering, Expert Systems with
Applications, 37(7) (2010) 4966-4973.
[18] H. He, Y. Tan, A two-stage genetic algorithm for automatic clustering, Neurocomputing, 81 (2012) 49-59.
[19] R. J. Kuo, L.M. Lin, Application of a hybrid of genetic algorithm and particle swarm optimization algorithm for order
clustering, Decision Support Systems, 49(4) (2010) 451-462.
[20] L. O. Hall, I. B. Ozyurt, J. C. Bezdek, Clustering with a genetically optimized approach, IEEE Trans. Evol. Comput. 3(2)
(1999) 103–112.
[21] L. Zhu, S. Qu, T. Du, Adaptive fuzzy clustering based on Genetic algorithm, in: IEEE International Conference on Advanced
Computer Control, vol. 5, 2010, pp.79-82
[22] Y. Ding, X. Fu, Kernel-based fuzzy c-means clustering algorithm based on genetic algorithm, Neurocomputing, 188
(2015) 233-238.
[23] S. Wikaisuksakul, A multi-objective genetic algorithm with fuzzy c-means for automatic data clustering, Applied Soft
Computing, 24 (2014) 679-691.
[24] A. X. Ye, Y. X. Jin, A fuzzy c-means clustering algorithm based on improved quantum genetic algorithm, International
Journal of Database Theory and Application, 9(1) (2016) 227-236.
[25] C. N. Zhang, Y. R. Li, A kind of chaotic particle swarm and fuzzy c-mean clustering based on genetic algorithm, International
Journal of Hybrid Information Technology, 7(4) (2014) 287-298.
[26] E. Alba, B. Dorronsoro, The exploration/exploitation tradeoff in dynamic cellular genetic algorithms, IEEE Transactions on
Evolutionary Computation, 9(2) (2005) 126-142.
[27] A. J. Nebro, J. J. Durillo, F. Luna, B. Dorronsoro, E. Alba, MOCell: A cellular genetic algorithm for multiobjective
optimization, International Journal of Intelligent Systems, 24(7) (2009) 726-746.
[28] A. Al-Naqi, A. T. Erdogan, T. Arslan, Adaptive three-dimensional cellular genetic algorithm for balancing exploration and
exploitation processes, Soft Computing, 17(7) (2013) 1145-1157.

[29] N. Leite, F. Melício, A. Rosa, Clustering using cellular genetic algorithms, in: International Conference on Evolutionary
Computation Theory and Applications, vol. 1, 2015, pp. 366-373.
[30] E. Alba, B. Dorronsoro, Cellular Genetic Algorithms, Springer, New York, 2008.

[31] J. P. Su, T. E. Lee, K. W. Yu, A combined hard and soft variable-structure control scheme for a class of nonlinear systems,
IEEE Transactions on Industrial Electronics, 56(9) (2009) 3305-3313.
[32] V. Arnold, A. Avez, Ergodic Problems of Classical Mechanics, Benjamin, New York, 1968.
[33] F. Chen, K. W. Wong, X. Liao, X. Tao, Period distribution of generalized discrete Arnold cat map, Theoretical Computer
Science, 552 (2014) 13-25.
[34] Y. M. Lu, M. Li, L. Li, The cellular genetic algorithm with evolutionary rule, Acta Electronica Sinica, 38(7) (2010)
1603-1607.
[35] J. Grefenstette, Optimization of control parameters for genetic algorithms, IEEE Transactions on Systems, Man, and
Cybernetics, 16(1) (1986) 122-128.
[36] J. Reed, R. Toombs, N. A. Barricelli, Simulation of biological evolution and machine learning, Journal of Theoretical
Biology, 17(3) (1967) 319-342.
[37] I. De Falco, A. Della Cioppa, E. Tarantino, Facing classification problems with Particle Swarm Optimization, Applied Soft
Computing, 7(3) (2007) 652–658.
[38] H. Izakian, A. Abraham, Fuzzy c-means and fuzzy swarm for fuzzy clustering problem, Expert Systems with Applications,
38(3) (2011) 1835-1838.
[39] T. M. S. Filho, B. A. Pimentel, R. M. C. R. Souza, A. L. I. Oliveira, Hybrid methods for fuzzy clustering based on fuzzy
c-means and improved particle swarm optimization, Expert Systems with Applications, 42 (2015) 6315–6328.
[40] A. Asuncion, D. J. Newman, UCI machine learning repository <http://archive.ics.uci.edu/ml/>, Irvine, CA: University of
California, School of Information and Computer Science, 2016.

Author Biographies

Lilin Jie received the B.S. degree in computational intelligence and pattern recognition from Nanchang Hangkong University,
Nanchang, China, in 2012. She is currently a Ph.D. candidate in Mechanical Engineering at Nanchang University. She is also a
Lecturer in the School of Aeronautical Manufacturing Engineering, Nanchang Hangkong University. Her research interests and
experience include reliability engineering, fuzzy clustering, computational intelligence, and pattern recognition.

Weidong Liu is a professor in the School of Mechanical & Electrical Engineering, Nanchang University, Nanchang, China, and
the School of Economic Management, Nanchang Hangkong University. He is also the president of the Association for Quality,
Jiangxi, China. He earned his Ph.D. in Mechanical Engineering from the Nanjing University of Science and Technology in 1994.
His research interests and experience include quality management, reliability of mechanical and electrical products and
equipment, neural networks, and fuzzy clustering.

Zheng Sun is a graduate student at the Research Center of Quality and Reliability Engineering, School of Mechatronics
Engineering, Nanchang University, Nanchang, China. His research interests are reliability engineering and quality management.

Shasha Teng is a graduate student at the Research Center of Quality and Reliability Engineering, School of Mechatronics
Engineering, Nanchang University, Nanchang, China. Her research interests include reliability engineering and quality
management.
A list of captions for figures:
Fig. 1. Typical neighbourhoods used in CGA.
Fig. 2. Flowchart of the IDCGA-based clustering algorithms.
Fig. 3. Part of the synthetic data sets. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6.
Fig. 4. Clustering results for the eight synthetic data sets. The figures plot the averages obtained by FCM, GA-FCM, AGA-FCM,
IDCGA-FCM and IDCGA2-FCM. (a) DS1. (b) DS2. (c) DS3. (d) DS4. (e) DS5. (f) DS6. (g) DS7. (h) DS8.
Fig. 5. Clustering results for the seven real-world data sets. The figures plot the averages obtained by FCM, GA-FCM,
AGA-FCM, IDCGA-FCM and IDCGA2-FCM. (a) Iris. (b) Wine. (c) Heart. (d) Cancer. (e) Glass. (f) Image segment. (g)
Landsat Satellite.
Fig. 6. Run time of the five algorithms for the 16 data sets. (a) Six synthetic data sets. (b) Six UCI data sets. (c) Four large data sets.
Fig. 7. Number of iterations obtained by three algorithms on the six data sets.
Fig. 8. The clustering results obtained by IDCGA2-FCM with change.
Fig. 9. Run time of IDCGA2-FCM with change.