An Improved K-Means Algorithm Based on Fuzzy Metrics
ABSTRACT The traditional K-means algorithm has been widely used in cluster analysis. However, the algorithm uses distance as its only constraint, so it is sensitive to special data points. To address this problem, ambiguity is introduced as a new constraint condition in the process of K-means clustering. On this basis, a new membership equation is proposed, and a method for solving the initial cluster center points is given, so as to reduce the risks caused by random selection of initial points. Besides, an optimized clustering algorithm with Gaussian distribution is derived by using fuzzy entropy as the cost function constraint. Compared with the traditional clustering method, the membership degree of the new equation reflects the relationship between a point and the set more clearly, and it addresses the traditional K-means algorithm's tendency to be trapped in local convergence and to be influenced by noise. Experimental verification shows that the new method needs fewer iterations and achieves better clustering accuracy than other methods, thus having a better clustering effect.
INDEX TERMS K-means, fuzzy entropy, cluster center, membership degree, fuzzy clustering.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
217416 VOLUME 8, 2020
X. Geng et al.: Improved K-Means Algorithm Based on Fuzzy Metrics
obtain better clustering effects. In reference [17] it is proposed to utilize neighboring ideas, taking the overall distribution of data samples as the basis for division, so as to improve clustering effects. In reference [18], the adaptive radius immune algorithm (ARIA) is combined with the K-means algorithm: ARIA is applied to preprocess the data and generate mirror data that represent the distribution and density of the original data, improving the noise resistance and stability of clustering. In reference [19], a K-means clustering algorithm with a random shift of the center of gravity is designed to process single-color images, specifically for the isolated-point problem. In addition, new directions have emerged that combine swarm intelligence and bionic algorithms, such as particle swarm optimization K-means (PSO-Kmeans) based on particle swarms [20], the artificial bee colony K-means algorithm (ABC-Kmeans) based on bee swarms [21], and Gray Wolf K-means (GWO-Kmeans) based on gray wolf optimization [22]. Most of the above optimizations still target the division of a single aspect such as distance or edge data. Among them, only reference [18] exploits the characteristics of the data distribution, which makes effective use of data information beyond the distance between points.

After fuzzy theory appeared, Dunn et al. put forward the fuzzy c-means clustering algorithm (FCM) [23]. By its judgement criteria, clustering is divided into hard clustering and soft clustering. Reference [24] raised a fuzzy C-means algorithm with a divergence-based kernel (FCMDK), which can handle data whose boundaries between clusters are non-linear. Chowdhary, C. L. et al. proposed a novel possibilistic exponential fuzzy c-means (PEFCM) clustering algorithm for segmenting medical images better and earlier [25]. In hard clustering, objects are strictly classified by their exclusive nature, but the objects of practical problems are complex and may carry attributes of multiple categories. Therefore, a soft division method is needed for such fundamental issues. For example, FCM clustering has been widely used in fields like pattern classification and image processing [26]. Because it adds ambiguity to the membership requirement of each pixel, this type of algorithm is significantly better than traditional K-means clustering in image processing results [27]. Although the degree of membership can feed back correlation beyond the distance, it is still obtained from the distance relationship between a single point and the class; therefore, the algorithm remains very susceptible to noise. Reference [28] hybridizes intuitionistic fuzzy sets and rough sets in combination with statistical feature extraction techniques, achieving higher accuracy than a hybrid fuzzy rough set model. In reference [29], a method is advanced to relax the restrictions of membership and improve robustness. In reference [30], membership degrees are weighted to account for the influence of each dimension attribute on clustering. In reference [31], a heuristic method is offered based on the Silhouette criterion to find the number of clusters. Reference [32] designates the digital execution of a model based on intuitionistic fuzzy histogram hyperbolization and a possibilistic fuzzy c-mean clustering algorithm for early breast cancer detection. In reference [33], a clustering method combining the fuzzy c-means algorithm with an entropy-based algorithm is advised to achieve both distinct and compact effects. Reference [34] proposed a novel intuitionistic possibilistic fuzzy c-mean algorithm: possibilistic fuzzy c-mean and intuitionistic fuzzy c-mean are hybridized to overcome the problems of fuzzy c-mean. In reference [35], a novel fuzzy-entropy based clustering measure (FECM) is presented, in which the average symmetric fuzzy cross entropy of membership subset pairs is integrated with the average fuzzy entropy of clusters. The above methods improve clustering effects by using fuzzy entropy to exploit information such as overall distribution characteristics. Inspired by the theory of fuzzy mathematics and references [17] and [18], this paper proposes a fuzzy mean clustering algorithm, namely fuzzy metrics K-means (FMK), with fuzzy entropy as a constraint alongside the distance condition. The algorithm first introduces artificial setting of the initial cluster center to reduce the influence of noise, integrates the overall distribution structure into the membership function, and then compares the overall ambiguity of the cluster after introducing a certain point. The last step is convergence through iteration, finally realizing the clustering of the FMK-means algorithm.

II. RELATED INFORMATION
A. K-MEANS ALGORITHM
As one of the most well-known clustering algorithms, the K-means algorithm can actively select the number of categories and bases its calculation of closeness on the Euclidean distance between points.

In the algorithm, k clusters C = {C_1, C_2, ..., C_k} are randomly selected as partitions, and the n samples of the data set D = {x_1, x_2, ..., x_n}, n > k, are each assigned to the nearest cluster. The center point of each cluster is then recalculated, and the iteration stops when the convergence condition is reached. Generally, the convergence function is defined as follows:

SSE(C) = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - c_k \rVert^2    (1)

where SSE is the least-square error of the cluster division corresponding to the algorithm's sample clustering and c_k is the center point of cluster C_k. In reference [15], the mathematical meaning of the center point is verified and deduced, with the conclusion that the best center of a cluster is the mean value of the points in it. The calculation method is as follows:

c_k = \frac{\sum_{x_i \in C_k} x_i}{|C_k|}    (2)

In summary, the goal of the K-means algorithm is to minimize the clustering result in (1). This optimization problem
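The assignment and center-update steps behind equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustrative implementation, not the authors' code; all function names are hypothetical, and the data set is a toy 1-D example.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(cluster):
    """Equation (2): the best center of a cluster is its mean point."""
    n = len(cluster)
    return [sum(coords) / n for coords in zip(*cluster)]

def sse(clusters, centers):
    """Equation (1): sum of squared distances of points to their center."""
    return sum(dist2(p, c) for cl, c in zip(clusters, centers) for p in cl)

def kmeans(points, k, max_iter=100, seed=0):
    # Random initial centers, as in the traditional algorithm.
    centers = random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        # Assignment step: each sample goes to the nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda j: dist2(p, centers[j]))].append(p)
        # Update step: recompute centers; stop once they no longer move.
        new_centers = [mean(cl) if cl else c for cl, c in zip(clusters, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Two well-separated 1-D groups: the centers converge to the group means.
data = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
centers, clusters = kmeans(data, k=2)
print(sorted(c[0] for c in centers))       # → [1.0, 11.0]
print(round(sse(clusters, centers), 2))    # → 4.0
```

For this toy data any random initialization reaches the same partition, which is exactly the minimum of (1); on harder data the random start is what makes the traditional algorithm sensitive, as discussed below.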
a membership equation with data distribution characteristics and the fuzzy clustering algorithm on entropy (FCMOE) with fuzzy entropy constraints. Both of them have a certain improving influence on the convergence direction of the data. In reference [43], the ABC-kmeans algorithm is supplemented with a data-feature strategy, performing better in clustering. In reference [44], based on the PSO-Kmeans algorithm, a two-step method is proposed that gives priority to a better initial clustering center, finally reducing the cost.

In summary, the introduction of the fuzzy entropy function makes fuzzy entropy the coefficient of the K-means algorithm's objective function. Meanwhile, an optimized clustering algorithm with Gaussian distribution can be derived by utilizing the distribution characteristics of the data to be clustered and taking fuzzy entropy as the cost function constraint, finally improving the anti-noise performance. Moreover, this algorithm gives a better initial center point to reduce iterations.

B. FMK-MEANS ALGORITHM DERIVATION
Plugging fuzzy entropy into equation (1) as a measure of fuzzy degree yields equation (5), in which the constraint condition follows from Theorem 2:

SSE = \frac{4}{C^p} \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}(1-u_{ij})(x_i - c_j)^2,
s.t. \; p = \log_c 4(c-1) - 1    (5)

Concurrently, the distribution characteristic ω of the clustering function is introduced so that the newly added constraint condition contracts according to it:

\omega = \frac{1}{nN} \sum_{i=1}^{n} \sum_{k=1}^{N} \frac{x_{ik}}{\max x_k}    (6)

where N is the number of dimensions of the elements in the cluster, and ω represents the normalized distribution characteristics of the clustered data set. According to the factor ω, the fuzzy entropy is made to fit the data distribution characteristics. The KKT condition of the FMK-means algorithm with the fuzzy entropy factor can be expressed as equation (7):

S = \frac{4}{C^p} \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}(1-u_{ij})(x_i - c_j)^2 + \omega u_{ij} \ln u_{ij},
s.t. \; g(u_{ij}) = u_{ij} \ln u_{ij} \le 0, \quad \omega > 0, \quad \omega g(0) = \omega g(1) = 0    (7)

To find the extremum, Lagrangian multipliers are defined, giving the new function (8):

L = \frac{4}{C^p} \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}(1-u_{ij})(x_i - c_j)^2 + \omega u_{ij} \ln u_{ij} + \sum_{i=1}^{n} \lambda_i \Bigl(1 - \sum_{j=1}^{c} u_{ij}\Bigr),
s.t. \; \sum_{j=1}^{c} u_{ij} = 1    (8)

Take the partial derivative of (8) with respect to u_{ij} and set it to zero. Through constraint (7), the degree of membership can finally be obtained, expressed as equation (9):

u_{ij} = \frac{\exp\left(-\frac{4(1-2u_{ij})(x_i - c_j)^2}{C^p \omega}\right)}{\sum_{t=1}^{c} \exp\left(-\frac{4(1-2u_{it})(x_i - c_t)^2}{C^p \omega}\right)}    (9)

Then take the partial derivative of (8) with respect to the cluster center c_j; the cluster center can be expressed as equation (10):

c_j = \frac{\sum_{i=1}^{n} u_{ij}(1-u_{ij}) x_i}{\sum_{i=1}^{n} u_{ij}(1-u_{ij})}    (10)

C. INITIAL CENTER POINT SOLUTION OF FMK-MEANS ALGORITHM
In the traditional K-means algorithm, the initial center point is randomly allocated. If a noise or edge point is used as the initial point, significant interference is caused to the result. To avoid such situations, the initial point can be given artificially, and it should be as close to the actual center as possible. According to the distribution of clustering centers, the centers of different clusters must be arranged in a nearly linear way in one or more dimensions, so an approximation can be achieved through the average value. Assume that the data set X includes n N-dimensional data divided into c clusters. Then the cluster center of the I-th dimension (I = 1, 2, ..., N) of any cluster i can be expressed as:

v_{iI} = \frac{3i-1}{2cn} \sum_{j=1}^{n} x_{jI}    (11)

This yields the initial center points V_i = \{v_{i1}, v_{i2}, ..., v_{iI}, ..., v_{iN} \mid i = 1, 2, ..., c\}.

D. THE FLOW OF FMK-MEANS ALGORITHM
The traditional K-means algorithm randomly selects the initial point without considering the distribution characteristics of the actual data, and it lacks soft clustering characteristics, which leads to a lack of robustness and stability in the clustering results. Obviously, compared with the traditional K-means algorithm, it is more reasonable for the improved algorithm to work from multiple angles rather than a single one. In addition, the improved algorithm reflects a Gaussian distribution nature and can effectively overcome sensitivity to noise. The initial point can be given automatically, and a soft clustering algorithm with data characteristics is proposed in this paper. The specific steps are as follows:

Step 1: For a given data set, the mean points of all samples are calculated by referring to equation (11). The mean point is taken as the first clustering center, denoted as V^(0). The distribution characteristic of the overall data, ω, is calculated through equation (6).

Step 2: Calculate u_{ij}^{(0)} for all samples; the membership degrees are determined according to equation (9). At the beginning, all the sample points are initialized, which means all of the membership degrees start at zero.
algorithm has the fewest iterations and the highest accuracy rate. As the data volume rises from Iris to Phoneme, the number of iterations of the traditional algorithm increases at an incredible speed, but the soft clustering methods need fewer iterations and their time cost changes smoothly. The main reason is that the degree of membership better reflects the relationship among multiple points rather than simply the distance between two points, thereby speeding up convergence. Moreover, the FMK-means and FCMOE algorithms have a smaller number of iterations due to the artificial initial center. Besides, when the dimensionality increases from Phoneme to Ring, the accuracy of the K-means algorithms decreases, but the accuracy of fuzzy clustering does not experience a similar change, and the clustering algorithms with fuzzy entropy as the constraint possess the highest accuracy. Since fuzzy entropy describes the nature of the overall fuzzy degree, and since the more specific a set is, the lower its fuzzy entropy, FMK-means has a higher accuracy and a lower clustering output. In the Balance data, abnormal feedback results appear in the above algorithms because the Balance data is not convex; its distribution is shown in FIGURE 6. Therefore, in spite of the small data volume of the sample, the accuracy of the traditional K-means algorithm is not high. The FMK-means and FCMOE algorithms, though taking a little longer, can attain high accuracy due to their use of data distribution characteristics as the contraction direction. Both Ring and HTRU are at the level of 100,000 samples, but the impact of their dimensionality on all algorithms is higher than that of quantity.

Even though FCMOE requires fewer iterations than the traditional K-means algorithm and the FCM algorithm, there is still a certain gap in accuracy and convergence results between it and the FMK-means algorithm. The main reason is that fuzzy entropy is used as the coefficient in the optimization algorithm: although both introduce an adjustment factor that indicates the distribution characteristics of the data set, FMK-means is more reliable in the reduction direction. Moreover, owing to the calculation introduced with fuzzy entropy, the output of the FMK-means algorithm is the smallest, and the result of the overall distribution is more unambiguous.

2) ANTI-NOISE ABILITY COMPARISON
To compare the anti-noise ability of the FMK-means algorithm and the other five algorithms, 10%, 15% and 20% noise data are added to the Iris data. These algorithms are then used to complete the clustering, and the anti-noise ability is checked. The experimental results are shown in TABLE 8.

According to the experimental results, the FMK-means algorithm is less affected by noise and has good anti-noise performance. Compared with traditional K-means algorithms, FMK-means, with artificial initial cluster centers,
effectively avoids the interference of randomly selected special points, reduces iterations and possesses some anti-noise ability. First, fuzzy entropy is introduced as a new constraint in the FMK-means algorithm. Second, data distribution characteristics are also introduced into the membership degree, so that the membership function is supplemented with Gaussian distribution characteristics. This ensures that the reducing direction of entropy conforms to that of the distribution characteristics during algorithm convergence, thereby reducing the impact of non-convex clustering.

Comparison of the clustering effects of the different algorithms and analysis of the clustering results make it clear that the improved FMK-means algorithm has stronger advantages in terms of convergence speed, iterations and clustering accuracy.

VI. CONCLUSION
This paper proposes an improved K-means algorithm based on fuzzy entropy. Fuzzy entropy, characterized by its description of a set's fuzzy degree, contributes to solving the problem that the traditional K-means algorithm is extremely sensitive to special points. Meanwhile, a new solution for the initial center point is proposed, which avoids the defect of randomly selected initial points and the risks of a special point or a local optimal solution. In the optimization algorithm, the factor ω representing the data set's distribution characteristics is specified for fuzzy entropy, so that the clustering result has Gaussian distribution characteristics, which improves the accuracy and noise resistance of the clustering algorithm. However, the processing of multi-dimensional data, such as text data and image data, is not covered in this paper. Further research tasks include how to select a correct way to reduce dimensionality and how to reasonably introduce other types of ambiguity.

REFERENCES
[1] Y. Wu, "General overview on clustering algorithms," Comput. Sci., vol. 42, pp. 491–499, 524, Jun. 2015.
[2] Y. Jing and J. Wang, "Tag clustering algorithm LMMSK: Improved K-means algorithm based on latent semantic analysis," J. Syst. Eng. Electron., vol. 28, no. 2, pp. 374–384, Apr. 2017.
[3] S. Sambasivam and N. Theodosopoulos, "Advanced data clustering methods of mining Web documents," Issues Informing Sci. Inf. Technol., vol. 3, pp. 563–579, Jan. 2006.
[4] J. Xu, K. Zheng, M.-M. Chi, Y.-Y. Zhu, X.-H. Yu, and X.-F. Zhou, "Trajectory big data: Data, applications and techniques," J. Commun., vol. 36, no. 12, pp. 97–105, 2015.
[5] IDC. Accessed: Oct. 31, 2020. [Online]. Available: https://www.idc.com/
[6] P. M. Sosa, G. S. Carrazoni, R. Gonçalves, and P. B. Mello-Carpes, "Use of facebook groups as a strategy for continuum involvement of students with physiology after finishing a physiology course," Adv. Physiol. Edu., vol. 44, no. 3, pp. 358–361, Sep. 2020.
[7] F. Yu, K. Peng, and X. Zheng, "Big data and psychology in China," Chin. Sci. Bull., vol. 60, nos. 5–6, pp. 520–533, Feb. 2015.
[8] K. Jahanbin and V. Rahmanian, "Using Twitter and Web news mining to predict COVID-19 outbreak," Asian Pacific J. Tropical Med., vol. 13, pp. 378–380, Jul. 2020.
[9] J.-G. Sun, "Clustering algorithms research," J. Softw., vol. 19, no. 1, pp. 48–61, Jun. 2008.
[10] H. Dai, Z. Chang, and N. Yu, "Understanding data mining," in Introduction to Data Mining, 1st ed. Beijing, China: Tsinghua Univ. Press, 2015, pp. 1–27.
[11] Z. Huang, "A fast clustering algorithm to cluster very large categorical data sets in data mining," DMKD, vol. 3, no. 8, pp. 34–39, 1997.
[12] Z. Huang, "Extensions to the K-means algorithm for clustering large data sets with categorical values," Data Mining Knowl. Discovery, vol. 2, no. 3, pp. 283–304, Sep. 1998.
[13] C. Ding and X. He, "K-nearest-neighbor consistency in data clustering: Incorporating local information into global optimization," in Proc. ACM Symp. Appl. Comput., Nicosia, 2004, pp. 584–589.
[14] G. Zhang, C. Zhang, and H. Zhang, "Improved K-means algorithm based on density canopy," Knowl.-Based Syst., vol. 145, pp. 289–297, Apr. 2018.
[15] Z. Xianchao, "Cluster-based partition algorithm," in Data Clustering, 3rd ed. Beijing, China: Science Press, 2017, pp. 37–62.
[16] D. Yueming, W. Minghui, Z. Ming, and W. Yan, "Optimizing initial cluster centroids by SVD in K-means algorithm for Chinese text clustering," J. Syst. Simul., vol. 30, pp. 244–251, Oct. 2018.
[17] L. T. Z. Can, "Nearest neighbor optimization k-means clustering algorithm," Comput. Sci., vol. 46, pp. 216–219, Nov. 2019.
[18] W. L. L. X. Z. Liangjun, "Research on K-means clustering algorithm based on ARIA," J. Sichuan Univ. Sci. Eng. (Natural Sci. Ed.), vol. 32, pp. 65–70, Apr. 2019.
[19] Y. Xiaoli, "Design of center random drift (CRD) K-means clustering algorithm," J. Changchun Univ., vol. 27, pp. 35–38, Aug. 2017.
[20] C. Guo and Y. Zang, "Clustering algorithm based on density function and nichePSO," J. Syst. Eng. Electron., vol. 23, no. 3, pp. 445–452, Jun. 2012.
[21] Y. Jinping, Z. Jie, and M. Hongbiao, "K-means clustering algorithm based on improved artificial bee colony algorithm," J. Comput. Appl., vol. 34, pp. 1065–1069, Jun. 2014.
[22] L. Jia-Ming, K. Li-Qun, and Y. Hong-Hong, "K-means clustering algorithm optimized by Gray Wolf," Chin. Sciencepaper, vol. 14, pp. 778–782, 807, Aug. 2019.
[23] J. C. Bezdek, R. Ehrlich, and W. Full, "FCM: The fuzzy c-means clustering algorithm," Comput. Geosci., vol. 10, nos. 2–3, pp. 191–203, Jan. 1984.
[24] Y. S. Song and D. C. Park, "Fuzzy C-means algorithm with divergence-based kernel," in Proc. Int. Conf. Fuzzy Syst. Knowl. Discovery, Berlin, Germany, 2006, pp. 99–108.
[25] C. L. Chowdhary and D. P. Acharjya, "Clustering algorithm in possibilistic exponential fuzzy C-mean segmenting medical images," J. Biomimetics, Biomater. Biomed. Eng., vol. 30, pp. 12–23, Jan. 2017.
[26] C. Xia and F. Fang, "Fuzzy clustering methods," Life Sci. Instrum., vol. 11, pp. 33–37, Dec. 2013.
[27] W. Chengmao and S. Jiamei, "Adaptive robust picture fuzzy clustering segmentation algorithm," J. Huazhong Univ. Sci. Technol. (Natural Sci. Ed.), vol. 47, pp. 115–120, Sep. 2019.
[28] C. L. Chowdhary and D. P. Acharjya, "A hybrid scheme for breast cancer detection using intuitionistic fuzzy rough set technique," Int. J. Healthcare Inf. Syst. Informat., vol. 11, no. 2, pp. 38–61, Apr. 2016.
[29] V. Cherkassky and F. Mulier, "Support vector machines," in Learning from Data: Concepts, Theory, and Methods, vol. 1, 9th ed. New York, NY, USA: Wiley, 2007, pp. 413–417.
[30] J. Li, X. B. Gao, and L. C. Jiao, "A new feature weighted fuzzy clustering algorithm," Acta Electronica Sinica, vol. 34, pp. 89–92, Jan. 2006.
[31] C. Palanisamy and S. Selvan, "Efficient subspace clustering for higher dimensional data using fuzzy entropy," J. Syst. Sci. Syst. Eng., vol. 18, no. 1, pp. 95–110, Mar. 2009.
[32] C. L. Chowdhary and D. P. Acharjya, "Breast cancer detection using intuitionistic fuzzy histogram hyperbolization and possibilitic fuzzy C-mean clustering algorithms with texture feature based classification on mammography images," in Proc. Int. Conf. Adv. Inf. Commun. Technol. Comput., Bikaner, India: ACM Press, 2016, pp. 1–6.
[33] V. Dey, D. K. Pratihar, and G. L. Datta, "Genetic algorithm-tuned entropy-based fuzzy C-means algorithm for obtaining distinct and compact clusters," Fuzzy Optim. Decis. Making, vol. 10, no. 2, pp. 153–166, Jun. 2011.
[34] C. L. Chowdhary and D. P. Acharjya, "Segmentation of mammograms using a novel intuitionistic possibilistic fuzzy C-mean clustering algorithm," Nature Inspired Comput., vol. 652, pp. 75–82, Jan. 2018.
[35] H. T. Hong and Yonghong, "Automatic pattern recognition of ECG signals using entropy-based adaptive dimensionality reduction and clustering," Appl. Soft Comput., vol. 55, pp. 238–252, Jun. 2017.
[36] J. Zejun, "Fuzzy set," in Theory and Methods of Fuzzy Mathematics. Beijing, China: Publishing House of Electronics Industry, 2015, pp. 23–27.
[37] M. X. Qing and Sun, "A new clustering effectiveness function: Fuzzy entropy of fuzzy partition," CAAI Trans. Intell. Syst., vol. 10, pp. 75–80, Jan. 2015.
[38] F. Jiulun and W. Chengmao, "Clustering validity function based on fuzzy entropy," Pattern Recognit. Artif. Intell., vol. 14, no. 4, pp. 390–394, Dec. 2001.
[39] Z. Liu and Y. Hu, "Research on FCM clustering optimization algorithm for self-adaptive bacterial foraging," Mod. Electron. Technique, vol. 43, pp. 144–148, Mar. 2020.
[40] X. J. Gao and Pei, "A study of weighting exponent m in a fuzzy C-means algorithm," Acta Electronica Sinica, vol. 28, pp. 80–83, Apr. 2000.
[41] T. Chaira, "A novel intuitionistic fuzzy c means clustering algorithm and its application to medical images," Appl. Soft Comput., vol. 11, no. 2, pp. 1711–1717, Mar. 2011.
[42] S. Liao, J. Zhang, and A. Liu, "Fuzzy C-means clustering algorithm by using fuzzy entropy constraint," J. Chin. Comput. Syst., vol. 35, pp. 189–193, Feb. 2014.
[43] U. H. Atasever, "A novel unsupervised change detection approach based on reconstruction independent component analysis and ABC-kmeans clustering for environmental monitoring," Environ. Monitor. Assessment, vol. 191, no. 7, pp. 1–11, Jun. 2019.
[44] C. Ibrahim and I. Mougharbel, "Two stages K-means and PSO-based method for optimal allocation of multiple parallel DRPs application & deployment," IET Smart Grid, vol. 3, no. 2, pp. 216–225, May 2019.
[45] T. Hu, "Discussion of improving fuzzy K-means clustering," M.S. thesis, School of Data and Computer Science, Sun Yat-sen Univ., Guangzhou, China, 2010.
[46] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, "Scikit-learn: Machine learning in Python," J. Mach. Learn. Res., vol. 12, pp. 2825–2830, Oct. 2011.
[47] L. Buitinck, G. Louppe, M. Blondel, F. Pedregosa, A. Mueller, O. Grisel, V. Niculae, P. Prettenhofer, A. Gramfort, J. Grobler, R. Layton, J. VanderPlas, A. Joly, B. Holt, and G. Varoquaux, "API design for machine learning software: Experiences from the scikit-learn project," in Proc. ECML PKDD Workshop, Lang. Data Mining Mach. Learn., 2013, pp. 108–122.
[48] Iris, UCI. Accessed: Oct. 31, 2020. [Online]. Available: http://archive.ics.uci.edu/ml/datasets/Iris
[49] Balance, UCI. Accessed: Oct. 31, 2020. [Online]. Available: http://archive.ics.uci.edu/ml/datasets/Balance+Scale
[50] Phoneme, DataHub. Accessed: Oct. 31, 2020. [Online]. Available: https://datahub.io/machine-learning/phoneme
[51] Ring, KEEL. Accessed: Oct. 31, 2020. [Online]. Available: https://sci2s.ugr.es/keel/datasets.php
[52] R. J. Lyon, B. W. Stappers, S. Cooper, J. M. Brooke, and J. D. Knowles, "Fifty years of pulsar candidate selection: From simple filters to a new principled real-time classification approach," Monthly Notices Roy. Astronomical Soc., vol. 459, no. 1, pp. 1104–1123, Apr. 2016.
[53] N. R. Pal and J. C. Bezdek, "On cluster validity for the fuzzy c-means model," IEEE Trans. Fuzzy Syst., vol. 3, no. 3, pp. 370–379, Aug. 1995.

XINYU GENG is currently a Professor with the School of Computer Science, Southwest Petroleum University, Chengdu, China. His main research interests include data mining and artificial neural networks.

YUKUN MU received the bachelor's degree in network engineering from Southwest Petroleum University, Chengdu, China, where he is currently pursuing the M.S. degree in computer science and technology. His research interests include fuzzy mathematics and data mining.

SENLIN MAO received the bachelor's degree in city underground space engineering from Southwest Petroleum University, Chengdu, China, where he is currently pursuing the M.S. degree in computer science and technology. His research interests include grey system theory, data mining, and artificial neural networks.

JINCHI YE received the bachelor's degree in information management and information system from Southwest Petroleum University, Chengdu, China, where he is currently pursuing the M.S. degree in computer science and technology. His research interests include grey system theory, data mining, and artificial neural networks.

LIPING ZHU received the bachelor's degree in computer science and technology from Sichuan Normal University, Chengdu, China, where she is currently pursuing the M.S. degree in engineering management. Her research interests include data mining and artificial neural networks.