A Comprehensive Survey of Image Segmentation
https://doi.org/10.1007/s11042-021-10594-9
Abstract
Image segmentation is an essential phase of computer vision in which useful information is extracted from an image; applications range from finding objects while moving across a room to detecting abnormalities in a medical image. As image pixels are generally unlabelled, clustering is the commonly used approach for this task. This paper reviews various existing clustering-based image segmentation methods. Two main families of clustering methods are surveyed, namely hierarchical and partitional clustering methods. As partitional clustering is computationally more efficient, the study then focuses on methods belonging to this class. The literature further divides partitional clustering methods into three categories, namely K-means based methods, histogram-based methods, and meta-heuristic based methods. A survey of various performance parameters for the quantitative evaluation of segmentation results is also included. Finally, the publicly available benchmark datasets for image segmentation are briefed.
1 Introduction
“A picture is worth a thousand words” is a famous idiom which signifies that processing an image may reveal more information than processing textual data. In computer vision, image segmentation is a prime research area which corresponds to partitioning an image into its constituent objects or regions of interest (ROI). Generally, it assembles the image pixels into similar regions. It is a pre-processing phase of many image-based applications like biometric identification, medical imaging, object detection and classification, and pattern recognition [91]. Some of the prominent applications are as follows.
– Content-based image retrieval: It corresponds to searching for query-relevant digital images in large databases. The retrieval results are obtained as per the contents of the query image, and image segmentation is performed to extract those contents.
– Machine vision: It is the image-based technology for robotic inspection and analysis,
especially at the industrial level. Here, segmentation extracts the information from the
captured image related to a machine or processed material.
– Medical imaging: Today, image segmentation helps medical science in a number of
ways from medical diagnosis to medical procedures. Some of the examples include seg-
mentation of tumors for locating them, segmenting tissue to measure the corresponding
volumes, and segmentation of cells for performing various digital pathological tasks
like cell count, nuclei classification and many others.
– Object recognition and detection: Object recognition and detection is an important application of computer vision. Here, an object may be a pedestrian, a face, or some aerial object like roads, forests, crops, etc. This application depends on image segmentation, as the intended object must first be extracted from the image.
– Video surveillance: In this, the video camera captures the movements of the regions of interest and analyses them to perform an intended task, such as identifying the action being performed in the captured video, controlling traffic movement, or counting the number of objects. To perform such analysis, segmentation of the region of interest is required first.
Though segmenting an image into its constituent ROIs may be a trivial task for humans, it is relatively complex from the perspective of computer vision. There are a number of challenges which may affect the performance of an image segmentation method. Figure 1 depicts three major challenges of image segmentation, which are discussed below.
Fig. 1 Challenges in image segmentation: a illumination variation [2], b intra-class variation [2], c background complexity [3]
– Illumination variation: It is a fundamental problem in image segmentation and has
severe effects on pixels. This variation occurs due to the different lighting conditions
during the image capturing. Figure 1a shows an image that is captured in different
illumination conditions. It can be observed that the corresponding pixels in each image
contain varying intensity values which pose difficulties in image segmentation.
– Intra-class variation: One of the major issues in this field is the existence of the region
of interest in a number of different forms or appearances. Figure 1b depicts an exam-
ple of chairs that are shown in different shapes, each having a different appearance.
Such intra-class variation often makes the segmentation procedure difficult. Thus, a segmentation method should be invariant to such variations.
– Background complexity: An image with a complex background is a major challenge for segmentation, as the regions of interest may mingle with the complex surroundings. Figure 1c illustrates an example of such an image, an H&E stained breast cancer histology image. The dark blue regions in the image represent the nuclei, which are generally defined as the regions of interest in histopathological applications like nuclei counting or cancer detection. It can be observed that the background is so complex that the nuclei regions do not have clearly defined boundaries. Such background complexities degrade the performance of segmentation methods.
Further, the essence of image segmentation is to represent an image with a few significant segments instead of thousands of pixels. Moreover, image segmentation may be viewed as a clustering problem in which the pixels that satisfy a criterion are grouped into one cluster while the remaining pixels are placed in other groups. To exemplify this, consider the images in Fig. 2. The first image consists of some animals on a field. To extract the animals from the background, the ideal result would be to group all the pixels belonging to the animals into one cluster and the background pixels into another, as presented in Fig. 2b. However, the pixel labelling is unknown. Moreover, the first image presents a complex scenario in terms of the different colour intensities and shapes of each animal, which correspond to illumination and intra-class variations respectively. To mitigate these challenges and learn from such unlabelled data, the common approach is to group the data based on certain similarity or dissimilarity measures and then label each group. This approach of grouping the data is generally termed clustering.
Fig. 2 Example of clustering based image segmentation: a original image, b segmented image
Generally, clustering has been used in different areas of real-world applications like mar-
ket analysis, social network analysis, online query search, recommendation system, and
image segmentation [66]. The main objective of a clustering method is to classify the unlabelled pixels into homogeneous groups, i.e. to achieve maximum similarity within each cluster and maximum dissimilarity between clusters.
Mathematically, the clustering procedure on an image (X) of size (m × n), defined over
d-dimensions, generates K clusters {C1 , C2 , · · · , CK } subject to the following conditions:
– Ci ≠ ∅, for i = 1, 2, · · · , K
– Ci ∩ Cj = ∅, for i, j = 1, 2, · · · , K and i ≠ j
– ∪_{i=1}^{K} Ci = X
The first condition ensures that there will be at least one pixel in every formed cluster.
The next condition implies that all the formed clusters will be mutually exclusive, i.e. a pixel
will not be assigned to two clusters. The last condition states that the data values assigned
to all the clusters will represent the complete image.
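These three conditions can be checked mechanically for the output of any clustering routine. The following is a minimal Python sketch (an illustration, not from the paper; NumPy assumed) that verifies them for a label vector in which each pixel carries exactly one cluster index:

```python
import numpy as np

def is_valid_partition(labels, K, n_pixels):
    """Check the three partition conditions for a label vector of length n_pixels."""
    counts = np.bincount(labels, minlength=K)
    non_empty = np.all(counts > 0)             # condition 1: no cluster C_i is empty
    covers_all = labels.shape[0] == n_pixels   # condition 3: union of clusters = X
    # condition 2 (disjointness) holds by construction here, since each
    # pixel carries exactly one label and so cannot lie in two clusters
    return bool(non_empty and covers_all)

labels = np.array([0, 1, 1, 2, 0, 2])          # toy labelling of 6 pixels into K = 3
print(is_valid_partition(labels, K=3, n_pixels=6))  # True
```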
In the literature, there are a number of clustering algorithms for image segmentation. However, there is no precise definition of a cluster, and different clustering methods have defined their own techniques to group the data. Therefore, based on how clusters are formed, clustering methods may broadly be classified into two main categories, namely hierarchical and partitional [85]. A taxonomy of the clustering methods is presented in Fig. 3. The following sections present each category of clustering with respect to image segmentation.
2 Hierarchical clustering
Hierarchical clustering methods build a hierarchy of clusters, either divisively or agglomeratively. Divisive clustering follows a top-down approach to form the clusters. All the data items belong to a single cluster initially. This single cluster is further split into smaller clusters until a termination criterion is satisfied or until each data item forms its own cluster. Divisive clustering (DIVCLUS-T) [17] and divisive hierarchical clustering with diameter criterion (DHCDC) [31] are popular methods of this category. On the other side, agglomerative clustering is performed in a bottom-up fashion where data points are merged hierarchically to produce clusters. Initially, each data item defines its own cluster; these are then merged into bigger clusters until a termination criterion is met or until a single cluster is formed consisting of all the data items. Some popular agglomerative methods are balanced iterative reducing and clustering using hierarchies (BIRCH) [94], clustering using representatives (CURE) [32], and chameleon [41].
In general, divisive clustering is more complex than the agglomerative approach, as it partitions the data until each cluster contains a single data item. Divisive clustering becomes computationally efficient only if each cluster is not partitioned down to individual data items. The time complexity of a naive agglomerative clustering is O(n³), which can be reduced to O(n²) using optimization algorithms. On the contrary, the time complexity of divisive clustering is Ω(n²) [66]. Moreover, divisive clustering can be more accurate, since it considers the global distribution of the data when making its top-level partitioning decisions. The pseudocode of the divisive and agglomerative clustering approaches is presented in Algorithms 1 and 2 respectively.
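Since Algorithm 2 is not reproduced here, the following is a minimal Python sketch of the agglomerative (bottom-up) approach, assuming NumPy and SciPy are available; the toy intensity data, the Ward linkage criterion, and K = 2 are illustrative choices rather than the paper's settings:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# toy data: 1D pixel intensities drawn from two populations (hierarchical
# clustering on a full image is usually infeasible, hence a small sample)
rng = np.random.default_rng(0)
pixels = np.concatenate([rng.normal(50, 5, 100), rng.normal(200, 5, 100)])

# bottom-up merging: each point starts as its own cluster and clusters
# are merged until a single cluster remains (the dendrogram Z)
Z = linkage(pixels.reshape(-1, 1), method='ward')

# cut the dendrogram to obtain K = 2 flat clusters
labels = fcluster(Z, t=2, criterion='maxclust')
print(np.bincount(labels)[1:])  # sizes of the two recovered clusters
```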
3 Partitional clustering
Partitional clustering is relatively popular and preferred over hierarchical clustering, especially for large datasets, due to its computational efficiency [13]. In this clustering approach, the notion of similarity is used as the measurement parameter. Generally, partitional clustering groups the data items into clusters according to some objective function such that data items within a cluster are more similar to each other than to data items in the other clusters. To achieve this, the similarity of each data item is measured against every cluster. The general form of the objective function is the minimization of a within-cluster distance criterion, usually computed using the Euclidean distance. The objective function expresses the goodness of each formed cluster and returns the best representation from the generated clusters. Unlike hierarchical clustering, the number of clusters to be formed needs to be defined a priori. Moreover, methods based on partitional clustering assign each data item to a cluster even if it is quite far from the respective cluster centroid. This sometimes results in distortion of the cluster shapes or false results, especially in the presence of noise or outliers.
Table 2 depicts partitional methods which have been used in various areas, such as image segmentation, robotics, wireless sensor networks, web mining, business management, and medical sciences. Each application domain has a different data distribution and complexity. Thus, a single partitional clustering method might not fit all problems, and the suitable method is selected based on the problem and dataset. As shown in Table 2, the partitional clustering methods are categorized into two classes, namely soft and hard clustering methods, which are discussed in the following subsections.
Soft clustering methods iteratively assign each data item to two or more clusters with a degree of belongingness (or membership). The degree of belongingness expresses the level of association of a data item with each cluster. The belongingness of a data item to a cluster is a continuous value in the interval [0, 1] and depends upon the objective function; usually, the minimization of the sum of squared Euclidean distances between the data and the formed clusters is considered. Some popular methods are fuzzy c-means (FCM) [10], fuzzy c-shells (FCS) [27], and the mountain method [88]. FCS considers each cluster as a multidimensional hypersphere and defines the distance function accordingly. The mountain method uses the mountain function to find the cluster centres. FCM is the most widely used and popular method of this approach [53]. It returns a set of K fuzzy clusters by minimizing the objective function defined in (1).
J = Σ_{i=1}^{N} Σ_{k=1}^{K} μ_{ik}^m ||x_i − v_k||², m ≥ 1  (1)
where μ_{ik} ∈ [0, 1] corresponds to the membership degree of the i-th pixel x_i with the k-th cluster centre v_k. Equation (1) is optimized iteratively by updating μ_{ik} and v_k according to (2) and (3) respectively.
μ_{ik} = 1 / Σ_{j=1}^{K} ( ||x_i − v_k|| / ||x_i − v_j|| )^{2/(m−1)}  (2)
v_k = ( Σ_{i=1}^{N} μ_{ik}^m x_i ) / ( Σ_{i=1}^{N} μ_{ik}^m )  (3)
Normally, the exponent of the fuzzy partition matrix, m, is kept as m > 1. It regulates the degree of fuzziness, i.e. the extent to which pixels may hold membership in more than one cluster. The pseudo-code of FCM is presented in Algorithm 3.
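As Algorithm 3 is not reproduced here, the following is a minimal NumPy sketch of the FCM update loop of (1)–(3); the random initialization, tolerance, and fuzzifier value m = 2 are illustrative assumptions:

```python
import numpy as np

def fcm(X, K, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Fuzzy c-means on data X of shape (N, d); returns memberships U and centres V."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((N, K))
    U /= U.sum(axis=1, keepdims=True)              # each row sums to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]   # centroid update, Eq. (3)
        D = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        D = np.fmax(D, 1e-12)                      # guard against zero distances
        # membership update, Eq. (2): mu_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        U_new = 1.0 / ((D[:, :, None] / D[:, None, :]) ** (2.0 / (m - 1.0))).sum(axis=2)
        if np.abs(U_new - U).max() < tol:
            return U_new, V
        U = U_new
    return U, V
```

For image segmentation, X would typically be the flattened pixel intensities (one row per pixel), and each pixel is finally assigned to the cluster with the highest membership.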
Hard clustering methods iteratively partition the data into disjoint clusters according to the objective function. Generally, the objective function is the sum of squared Euclidean distances between the data items and their associated centroids, which is to be minimized. Usually, the centre of the clustered data is taken as the centroid of the cluster. Moreover, in contrast to soft clustering, hard clustering assigns each data item to a single cluster only, i.e. the degree of belongingness is either 0 or 1. The hard clustering approach is relatively simple and scalable, with high computational efficiency, and it is competent for datasets that form well-separated, spherically shaped clusters. However, it suffers from a number of demerits: the formed cluster centroids are relatively poor cluster descriptors, the methods are sensitive to the initial parameter settings, and the number of clusters to be formed must be known in advance. The various hard clustering methods may be grouped into three categories, namely K-means based methods, histogram-based thresholding methods, and meta-heuristic based methods, as depicted in Fig. 5.
In K-means based methods, each cluster centroid is updated by taking the mean of all the data items assigned to the corresponding cluster. This continues iteratively until some defined convergence criterion is met. Although methods of this category have merits like relatively low time complexity, simplicity, and guaranteed convergence, they have a number of limitations: the number of clusters to be formed needs to be known a priori; the solution quality depends on the initial clusters and the number of formed clusters; they are not appropriate for data with a non-convex distribution; they follow a hill-climbing strategy and hence are usually trapped in local optima; and they are relatively sensitive to outliers, noise, and the initialization phase.
Generally, this category includes all the methods inspired by the K-means method, the simplest partitional clustering method and a gold standard for clustering in the literature [66]. An overview of the K-means method is detailed below.
K-means: K-means [48] partitions a set of data points, X = {x_1, · · · , x_n}, into a number of clusters (k). It performs the partition based on a similarity criterion, usually the sum of squared error defined in (4).
J = Σ_{i=1}^{k} Σ_{x_j ∈ c_i} ||x_j − m_i||²  (4)
where m_i is the centroid of cluster c_i; the centroids are collectively represented as M = {m_1, · · · , m_k} for the corresponding clusters C = {c_1, · · · , c_k}. This method iteratively minimizes the criterion function J. The formed clusters C and corresponding centroids M are updated as given by (5) and (6) respectively.
x_j ∈ c_l , if l = argmin_{i=1,···,k} ||x_j − m_i||²  (5)
m_l = ( Σ_{x_j ∈ c_l} x_j ) / |c_l|  (6)
for 1 ≤ j ≤ n and 1 ≤ l ≤ k.
The K-means method has a time complexity of O(nkt), where n, k, and t correspond to the number of data items, the number of clusters to be formed, and the maximum number of iterations respectively. However, this method is sensitive to the initial cluster centroids and usually gets trapped in local minima. Moreover, the solutions vary with the number of clusters. The pseudo-code of the K-means method is presented in Algorithm 4.
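As a companion to Algorithm 4, here is a minimal NumPy sketch of K-means implementing the assignment and update steps of (5) and (6); the random initialization and the commented image-loading usage are illustrative assumptions:

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """K-means on data X of shape (n, d); returns labels and centroids."""
    rng = np.random.default_rng(seed)
    M = X[rng.choice(len(X), size=k, replace=False)].astype(float)  # initial centroids
    for _ in range(max_iter):
        # assignment step, Eq. (5): each point joins its nearest centroid
        labels = ((X[:, None, :] - M[None]) ** 2).sum(axis=2).argmin(axis=1)
        # update step, Eq. (6): centroid = mean of the points assigned to it
        new_M = np.array([X[labels == l].mean(axis=0) if (labels == l).any() else M[l]
                          for l in range(k)])
        if np.allclose(new_M, M):   # convergence: centroids stopped moving
            break
        M = new_M
    return labels, M

# e.g. segmenting a grey image by intensity (img: 2D array of pixel values):
# labels, M = kmeans(img.reshape(-1, 1).astype(float), k=3)
# seg = M[labels].reshape(img.shape)   # each pixel replaced by its centroid
```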
Some other popular methods of this category are bisecting K-means [72], sort-means [62], K-harmonic means [92], K-modes [16], K-medoids [61], partitioning around medoids (PAM) [40], and clustering large applications based upon randomized search (CLARANS) [55]. The K-medoids method is a variant of K-means which defines the cluster centroid as the data point nearest to the centre of the cluster; it is well suited to discrete data. PAM is a medoid-based method with two phases, namely build and swap. It is a greedy search method whose aim is to minimize the dissimilarity between the objects and their closest medoid. In the build phase, a set of data points is chosen as medoids, forming the selected objects set; the remaining data points are kept in another set, termed the unselected objects. The swap phase then tries to improve the cluster quality by swapping data in the selected objects set with data in the unselected objects set, as sketched below. Further, CLARANS is an efficient clustering method, especially for large datasets. It uses the concept of a graph to find the medoids for the given data: it applies a PAM-like search over the complete data but considers only a subset of the possible swaps between the selected and unselected objects.
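The following is a simplified Python sketch of PAM's swap phase over a precomputed dissimilarity matrix; for brevity the greedy build phase is replaced by random initialization, which is an assumption and a departure from the full PAM build step:

```python
import numpy as np

def pam(D, k, max_iter=50, seed=0):
    """Simplified PAM: D is an (n, n) pairwise dissimilarity matrix."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = list(rng.choice(n, size=k, replace=False))  # stand-in for the build phase

    def cost(meds):
        # total dissimilarity of every object to its closest medoid
        return D[:, meds].min(axis=1).sum()

    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        for mi in range(k):                       # swap phase: try replacing each
            for h in range(n):                    # medoid with an unselected object
                if h in medoids:
                    continue
                trial = medoids.copy()
                trial[mi] = h
                c = cost(trial)
                if c < best:                      # greedy: keep any improving swap
                    medoids, best, improved = trial, c, True
        if not improved:
            break
    labels = D[:, medoids].argmin(axis=1)         # assign objects to nearest medoid
    return medoids, labels
```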
Fig. 6 1D view of the grey level histogram of an image taken from BSDS300 [49]: a image, b 1D histogram
To group the image pixels into n clusters {C_1, C_2, · · · , C_n}, n − 1 thresholds (t_1, t_2, · · · , t_{n−1}) are required, which act as represented in (9).

v(x, y) =
  0,                      if v(x, y) ≤ t_1
  (t_1 + t_2)/2,          if t_1 < v(x, y) ≤ t_2
  · · ·
  (t_{n−2} + t_{n−1})/2,  if t_{n−2} < v(x, y) ≤ t_{n−1}
  L − 1,                  if v(x, y) > t_{n−1}      (9)
where v(x, y) corresponds to the pixel intensity at location (x, y) of an M × N image. The partition C_j , 1 ≤ j ≤ n, consists of pixels with intensity values greater than t_{j−1} and less than or equal to t_j. The frequency of cluster C_j for each plane is computed as in (10).
w_j = Σ_{i=t_{j−1}+1}^{t_j} p_i  (10)
The mean of the cluster C_j can be calculated by (11), and the inter-class variance is formulated in (12).
μ_j = ( Σ_{i=t_{j−1}+1}^{t_j} i·p_i ) / w_j  (11)
σ² = Σ_{j=1}^{n} w_j (μ_j − μ)²  (12)
To group the pixels into clusters based on intensity, the maximization of the inter-class variance (12) is considered. Therefore, the objective is to maximize the fitness function defined in (13):
φ = max_{1 < t_1 < ··· < t_{n−1} < L} {σ²(t)}  (13)
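For the bi-level case (n = 2, a single threshold), (10)–(13) can be evaluated exhaustively over all candidate thresholds. The following NumPy sketch, an illustration of Otsu-style thresholding rather than the paper's own algorithm, keeps the threshold maximizing the inter-class variance of (12):

```python
import numpy as np

def otsu_threshold(img, L=256):
    """Return the grey level t maximizing the inter-class variance of Eq. (12)."""
    hist = np.bincount(img.ravel(), minlength=L).astype(float)
    p = hist / hist.sum()                      # normalized histogram p_i
    levels = np.arange(L)
    best_t, best_var = 0, -1.0
    for t in range(1, L - 1):
        w0, w1 = p[:t + 1].sum(), p[t + 1:].sum()            # class frequencies, Eq. (10)
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t + 1] * p[:t + 1]).sum() / w0        # class means, Eq. (11)
        mu1 = (levels[t + 1:] * p[t + 1:]).sum() / w1
        mu = w0 * mu0 + w1 * mu1                             # global mean
        var = w0 * (mu0 - mu) ** 2 + w1 * (mu1 - mu) ** 2    # inter-class variance, Eq. (12)
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

For n > 2 clusters the search space over (t_1, · · · , t_{n−1}) grows combinatorially, which is one motivation for the meta-heuristic methods discussed later.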
Fig. 7 Grey-local 2D histogram of the image depicted in Fig. 6a: a three-dimensional view, b two-dimensional view
In the grey-gradient 2D histogram, the x-axis defines the grey values of the image while the y-axis defines the gradient values of the same image. However, this approach often generates inferior results compared with the grey-local 2D histogram. Figure 8 represents the grey-gradient 2D histogram for the sample image considered in Fig. 6a.
Generally, histogram-based segmentation methods are efficient, as they require only a single pass through the pixels to construct the histogram. However, these methods have drawbacks: prior knowledge about the number of clusters is required, they are highly sensitive to noise, and it is quite difficult to identify significant peaks and valleys in overlapping regions of the histogram. Moreover, limitations particular to 2D histograms are that the diagonal regions are not smooth and the off-diagonal information adds no improvement. A sketch of constructing a grey-local 2D histogram is given below.
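The following is a minimal sketch of how a grey-local 2D histogram may be built, assuming NumPy and SciPy are available; the 3 × 3 neighbourhood size for the local mean is an assumed, conventional choice:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def grey_local_2d_histogram(img, L=256):
    """x-axis: pixel grey value; y-axis: mean grey value of its 3x3 neighbourhood."""
    local_mean = uniform_filter(img.astype(float), size=3)
    H, _, _ = np.histogram2d(img.ravel(), local_mean.ravel(),
                             bins=L, range=[[0, L], [0, L]])
    return H  # (L, L) counts; mass concentrates near the diagonal for clean images
```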
This category uses meta-heuristic approaches to obtain optimal clusters by updating random solutions according to a mathematical formulation and an optimality criterion (or objective function). Meta-heuristics form a special branch of soft computing that has been leveraged to solve complex real-world problems in reasonable time as compared to classical methods. Broadly, algorithms of this category belong to the class of optimization algorithms which can solve computationally hard problems, like NP-complete problems [8]. However, no single meta-heuristic algorithm is efficient for every problem, as stated by the No Free Lunch theorem [84]. Generally, a meta-heuristic method solves an optimization problem whose objective is either the maximization or the minimization of a cost function f(x) under a given set of constraints. The mathematical formulation of a maximization problem is presented in (14).
Maximize_{x ∈ R^d} f(x)  (14)
subject to: a_i(x) ≥ 0,  (15)
b_j(x) = 0,  (16)
c_k(x) ≤ 0  (17)
where x = (x_1, x_2, x_3, · · · , x_d)^T is the set of decision variables defined over d dimensions and R^d is the search space of the problem. a_i(x), b_j(x), and c_k(x) correspond to the different constraints applicable to an optimization problem; the actual constraints depend on the considered optimization problem.
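A common way to handle the constraints (15)–(17) inside a meta-heuristic is a penalty transformation, which folds constraint violations into the fitness so that unconstrained search operators can be used. The sketch below is an illustration of this idea; the penalty weight rho is an assumed parameter, not a value from the paper:

```python
def penalized_fitness(x, f, a=(), b=(), c=(), rho=1e3):
    """Maximize f(x) subject to a_i(x) >= 0, b_j(x) = 0, c_k(x) <= 0 (Eqs. 14-17).

    Each violated constraint subtracts a penalty proportional to its violation."""
    violation = sum(max(0.0, -ai(x)) for ai in a)   # a_i(x) >= 0 violated when a_i < 0
    violation += sum(abs(bj(x)) for bj in b)        # b_j(x) = 0 violated by |b_j|
    violation += sum(max(0.0, ck(x)) for ck in c)   # c_k(x) <= 0 violated when c_k > 0
    return f(x) - rho * violation

# toy usage: maximize f(x) = -(x - 2)^2 subject to x <= 1, i.e. c(x) = x - 1 <= 0
print(penalized_fitness(1.0, lambda x: -(x - 2) ** 2, c=[lambda x: x - 1]))  # -1.0
```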
Over the last three decades, more than sixty meta-heuristic algorithms have been proposed, inspired by nature, to provide optimal solutions. Each meta-heuristic algorithm mimics a particular natural phenomenon, which may be evolutionary, physical, or biological.
Fig. 8 Grey-gradient 2D histogram of the image depicted in Fig. 6a: a three-dimensional view, b two-dimensional view
In the literature, two common aspects of these algorithms are exploration and exploitation. Exploration represents diversification in the search space, wherein the existing solutions are updated with the intention of exploring the search space; it helps in discovering new solutions, prevents stagnation, and is responsible for reaching the global solution. Exploitation, which corresponds to the intensification of the current solutions, performs a local search around the currently generated solutions; its goal is to exploit the search space, and it is responsible for convergence to the optimal solution. Generally, meta-heuristic algorithms may broadly be classified into two categories, namely evolutionary and swarm algorithms, as depicted in Fig. 5.
Evolutionary algorithms are based on evolution theories such as Darwin's theory of evolution. They work on the principle of generating better individuals over the course of generations by combining the best individuals of the current generation. Genetic algorithm (GA) [59], evolutionary strategy (ES) [6], differential evolution (DE) [64], biogeography-based optimization (BBO) [58], and probability-based incremental learning (PBIL) [25] are some examples of evolutionary algorithms. On the other side, swarm-based algorithms mimic the behaviour of a swarm of agents, such as fish or birds, to achieve optimal results. Some algorithms of this category are particle swarm optimization (PSO) [42], ant colony optimization (ACO) [29], gravitational search algorithm (GSA) [54], spider monkey optimization (SMO) [7], grey wolf optimizer (GWO) [75], cuckoo search (CS) [60], and the military dog based optimizer (MDO) [76].
Meta-heuristic algorithms are able to provide promising results for the image clustering problem. Generally, these methods are better than other clustering methods in terms of independence from the initial parameter settings and their ability to return near-globally optimal solutions [52]. As these methods ensure better solutions, they have been widely used in clustering [38]. The basic approach of using meta-heuristic algorithms for clustering was introduced by Selim and Alsultan [69] using simulated annealing. Thereafter, Bezdek et al. [11] presented a genetic algorithm based data clustering method, essentially the first evolutionary method for data clustering. The first swarm-based clustering algorithm was introduced by Lumer et al. [44] using ant colony optimization. A literature review of the existing meta-heuristic based clustering for image segmentation is given below. Moreover, the pseudo-code of a meta-heuristic based clustering method is presented in Algorithm 5.
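In the spirit of Algorithm 5 (not reproduced here), the following is a minimal PSO-based clustering sketch: each particle encodes K candidate centroids and the fitness is the within-cluster sum of squared errors. The swarm size, inertia w, and acceleration coefficients c1 and c2 are conventional illustrative choices, not the paper's settings:

```python
import numpy as np

def sse(X, centroids):
    """Fitness: sum of squared distances of each point to its nearest centroid."""
    d = ((X[:, None, :] - centroids[None]) ** 2).sum(axis=2)
    return d.min(axis=1).sum()

def pso_clustering(X, K, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    lo, hi = X.min(axis=0), X.max(axis=0)
    pos = rng.uniform(lo, hi, (n_particles, K, d))   # each particle = K centroids
    vel = np.zeros_like(pos)
    pbest = pos.copy()                               # personal best positions
    pbest_f = np.array([sse(X, p) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()           # global best particle
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # velocity update: inertia + pull toward personal and global bests
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)             # keep centroids in data range
        f = np.array([sse(X, p) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    labels = ((X[:, None, :] - gbest[None]) ** 2).sum(axis=2).argmin(axis=1)
    return labels, gbest
```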
Evolutionary-based clustering Early works in this direction proposed clustering methods using the genetic algorithm. Moreover, Beyer and Schwefel [9] presented the works done in the area of clustering using evolutionary strategies; it has been observed that hybrid evolutionary strategy based algorithms outperform the standard evolutionary strategy on the clustering problem for the majority of the UCI benchmark datasets. In continuation, Sheng and Liu [70] introduced a K-medoids and genetic-based hybrid algorithm for the clustering of large-scale datasets. In general, hybrid evolutionary algorithms, formed by merging the good features of two or more algorithms, have outperformed their parent algorithms. Liu et al. [46] proposed a gene transposition based algorithm for efficient clustering in which the immune cells (population) are initialized with a vector of K cluster centroids. Likewise, Sun and Tian [73] presented a GA and PSO based application for image segmentation. Jiao et al. [37] applied a memetic clustering algorithm for the segmentation of remote sensing images. Further, Lu et al. [47] combined multiple clusterings using various measures of the partitions and proposed a fast simulated annealing algorithm for efficient clustering. Agusti et al. [4] explored an algorithm named the grouping genetic algorithm to solve clustering problems; the summation of intra-cluster distances was used as the fitness function for partition based clustering, while the FCM objective served as the fitness function for fuzzy clustering. Further, Pal et al. [58] used the biogeography-based optimizer (BBO) for clustering.
Swarm-based clustering Van der Merwe and Engelbrecht [81] proposed a swarm-based algorithm for partitional clustering using PSO. Chuang et al. [19] introduced a chaotic PSO clustering algorithm in which the conventional parameters of PSO were replaced with chaotic operators. On the same footprint, Tsai and Kao [77] presented a PSO based clustering algorithm with selective particle regeneration. Further, a shuffled frog leaping based clustering algorithm was successfully utilized for image segmentation [12]. Moreover, hybrid variants of PSO with K-means [81], K-harmonic means [90], and rough set theory [34] have also been introduced. Zhang et al. [93] introduced a possibilistic c-means and PSO based variant for image segmentation. Furthermore, a number of researchers have proposed hybrid swarm and evolutionary algorithms for effective data clustering and applied them to image segmentation: Xu et al. [86] combined DE with PSO and presented efficient results, and PSO has similarly been integrated with GA and simulated annealing to improve clustering compared with conventional PSO. Zou et al. [95] introduced a cooperative ABC based clustering method. A clustering algorithm based on bacterial foraging optimization was introduced by Wan et al. [83]. Senthilnath et al. [89] developed a firefly based algorithm for the efficient analysis of clusters and tested it on the UCI datasets. Besides this, an invasive weed optimization based clustering algorithm was presented by Chowdhury et al. [18], and subsequently Liu et al. [45] proposed a multi-objective invasive weed optimization based algorithm for efficient clustering. Hatamlou et al. [33] introduced a GSA based clustering algorithm where the cluster centroids are initialized with K-means.
In general, meta-heuristic methods have a number of merits: they are applicable to a wide range of real-world optimization problems, their search behaviour follows a randomized approach, and they are able to find approximate solutions to NP-complete problems. However, they also suffer from a number of demerits: convergence to the global optimum is only probabilistic, they sometimes get trapped in local optima, and their computational complexity is comparatively high.
Table 3 Various performance parameters for the quantitative evaluation of an image segmentation method [51]

Parameter | Formulation
Boundary Displacement Error (BDE) | BDE = (u − v)/(L − 1) if (u − v) > 0; 0 otherwise
Probability Rand Index (PRI) | PRI = (a + b)/(a + b + c + d)
Variation of Information (VoI) | VoI = H(S_1) + H(S_2) − 2I(S_1, S_2)
Global Consistency Error (GCE) | GCE = (1/n) min{ Σ_i E(S_1, S_2, p_i), Σ_i E(S_2, S_1, p_i) }
Structural Similarity Index (SSIM) | SSIM = (2·X̄·Ȳ + c_1)(2·σ_XY + c_2) / ((X̄² + Ȳ² + c_1)(σ_X² + σ_Y² + c_2))
                     Actual Positive | Actual Negative
Predicted Positive | TP              | FP
Predicted Negative | FN              | TN
2. Intersection over Union (IoU): IoU measures the overlap between X and Y, which correspond to the segmented image and ground truth respectively. The formulation of IoU is depicted in (21).
IoU = |X ∩ Y| / |X ∪ Y|  (21)
3. Dice-Coefficient (DC): The Dice coefficient is defined as twice the number of common pixels divided by the total number of pixels in X and Y, where X corresponds to the segmented image and Y to the ground truth. DC is mathematically defined as (22).
DC = 2|X ∩ Y| / (|X| + |Y|)  (22)
4. Boundary Displacement Error (BDE): This parameter computes the average displacement error of boundary pixels between two segmented images, as depicted in (23). The error of one boundary pixel is defined as its distance from the closest pixel in the other boundary image.
μ_LA(u, v) = (u − v)/(L − 1) if u − v > 0; 0 if u − v ≤ 0  (23)
5. Probability Rand Index (PRI): The Probability Rand Index measures the labelling consistency between the segmented image and its ground truth. It counts the fraction of pairs of pixels with consistent labels and averages the result across all ground truths of a given image, as shown in (24).
R = (a + b)/(a + b + c + d) = (a + b)/n  (24)
6. Variation of Information (VOI): The Variation of Information, shown in (25), computes the randomness in one segmentation in terms of its distance from a given segmentation.
VOI(X; Y) = H(X) + H(Y) − 2I(X, Y)  (25)
7. Global Consistency Error (GCE): The value of GCE, given in (26), measures the extent to which one segmentation is a refinement of the other. If two segmentations are related in this way, they are considered consistent, i.e. both can represent the same natural image segmentation at different scales.
GCE = (1/n) min{ Σ_i E(s_1, s_2, p_i), Σ_i E(s_2, s_1, p_i) }  (26)
8. Structural Similarity Index (SSIM): The Structural Similarity Index measures the similarity between two images, taking the initial uncompressed or distortion-free image as the reference, and is computed as in (27). It incorporates important perceptual phenomena such as luminance masking and contrast masking.
SSIM = (2·x̄·ȳ + c_1)(2·σ_xy + c_2) / ((x̄² + ȳ² + c_1)(σ_x² + σ_y² + c_2))  (27)
9. Feature Similarity Index (FSIM): The Feature Similarity Index is a quality score based on phase congruency (PC), a dimensionless measure of the significance of a local structure. It is calculated by (28).
FSIM = Σ_{x∈Ω} S_L(x)·PC_m(x) / Σ_{x∈Ω} PC_m(x)  (28)
10. Root Mean Squared Error (RMSE): The root mean squared error computes the difference between the values predicted by a model or an estimator and the actual values. The formulation is shown in (29).
RMSE = √( (1/MN) Σ_{i=1}^{M} Σ_{j=1}^{N} (x(i, j) − y(i, j))² )  (29)
13. Average Difference (AD): This parameter represents the average difference between the pixel values and is computed by (32).
AD = (1/MN) Σ_{i=1}^{M} Σ_{j=1}^{N} (x(i, j) − y(i, j))  (32)
14. Maximum Difference (MD): This parameter finds the maximum of the error signal, i.e. the difference between the original image and the segmented image, as defined in (33).
MD = max | x(i, j) − y(i, j) |  (33)
15. Normalized Absolute Error (NAE): The normalized absolute difference between the original image and the corresponding segmented image gives NAE and is calculated by (34).
NAE = Σ_{i=1}^{M} Σ_{j=1}^{N} | x(i, j) − y(i, j) | / Σ_{i=1}^{M} Σ_{j=1}^{N} | x(i, j) |  (34)
Among the above-mentioned parameters, IoU, DC, SSIM, FSIM, PRI, PSNR, and NCC indicate better segmentation at higher values: a high value indicates that the segmented image is closer to the ground truth. These parameters are computed from the region of intersection between the segmented image and the ground truth, which corresponds to counting the matching pixels. On the contrary, the remaining indices prefer lower values for better segmentation, as they compute the error between the segmented image and the ground truth and aim at reducing the difference between them. A sketch computing three of these measures is given below.
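The following is a minimal NumPy sketch of IoU, DC, and RMSE following (21), (22), and (29); binary masks are assumed for the first two, and the toy inputs are illustrative:

```python
import numpy as np

def iou(X, Y):
    """Eq. (21): intersection over union of two binary masks."""
    return np.logical_and(X, Y).sum() / np.logical_or(X, Y).sum()

def dice(X, Y):
    """Eq. (22): 2|X intersect Y| / (|X| + |Y|) for two binary masks."""
    return 2 * np.logical_and(X, Y).sum() / (X.sum() + Y.sum())

def rmse(x, y):
    """Eq. (29): root mean squared error between two grey images."""
    return np.sqrt(np.mean((x.astype(float) - y.astype(float)) ** 2))

# toy usage: a segmented mask versus its ground truth
seg = np.array([[1, 1], [0, 0]], dtype=bool)
gt  = np.array([[1, 0], [0, 0]], dtype=bool)
print(iou(seg, gt), dice(seg, gt))   # 0.5  0.666...
```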
The considered dataset is key to a fair analysis of an image segmentation method. Validating a segmentation method against benchmark datasets tests its performance against the challenges posed by those datasets. Generally, a benchmark dataset consists of a variety of images that vary in a number of different aspects. Some of the common challenges which segmentation methods need to handle are illumination variation, intra-class variation, and background complexity. This paper lists in Table 5 some of the popular benchmark datasets which are publicly available for the image segmentation task; they are briefed below.
– Aberystwyth Leaf Evaluation Dataset: This is a dataset of time-lapse images of the Arabidopsis thaliana (Arabidopsis) plant for leaf-level segmentation. It consists of original Arabidopsis plant images with 56 annotated ground truth images [1].
– ADE20K: It is a scene parsing segmentation dataset with around 22K hierarchically
segmented images. For each image, there is a mask to segment objects and different
parts of the objects [3].
– Sky Dataset: This dataset consists of images taken from the Caltech Airplanes Side
dataset. The images, containing sky regions, were selected and their ground truths
were prepared. In total, there are 60 images along with ground truth to perform the
segmentation of sky [71].
– TB-roses-v1 Dataset: This is a small dataset consisting of 35 images of rose bushes
and corresponding manually segmented rose stems [26].
– Tsinghua Road Markings (TRoM): It is a dataset containing road images along with
ground truths for the purpose of segmentation of road markings. There are 19 categories
of road markings [63].
This paper has reviewed different clustering-based methods in the field of image segmentation. The clustering methods may be categorized into two broad classes, namely hierarchical and partitional clustering. Hierarchical clustering methods perform data grouping at different similarity levels and are suitable for both convex and arbitrarily shaped data, since they construct a hierarchy of data elements. However, these methods fail when clusters overlap, and they are computationally expensive, especially on high-dimensional datasets like images, as they scan an N × N distance matrix for the lowest distance in each of N − 1 iterations. On such datasets, partitional clustering methods perform better: they partition the data into the required number of clusters based on a criterion and are preferable for convex-shaped data. Therefore, this study has included an extensive survey from the perspective of partitional clustering methods. In the literature, there exist a number of partitional clustering methods belonging to either soft or hard clustering approaches. The hard partitional clustering methods are categorized into three broad classes, namely K-means based methods, histogram-based methods, and meta-heuristic based methods. A concise overview of the different clustering-based image segmentation methods is presented in Table 6, which compares each method's time complexity, suitable data type and shape, noise/outlier sensitivity, scalability, and suitability for high-dimensional and large-scale data (notation: n — number of data items; k — number of formed clusters; t — number of iterations; s — number of sampled data items; P — population size; C — complexity of the objective function; L — number of levels; M — number of partitions). From the table, the following guidelines are extracted.
– The selection of an appropriate clustering method generally depends on certain param-
eters such as type of data, sensitivity towards noise, scalability, and number of
clusters.
– Hierarchical clustering algorithms are suitable for datasets with arbitrary shapes and attributes of arbitrary type. There are a number of applications in which hierarchical clustering algorithms are used; for example, all files and folders on a hard disk are organized in a hierarchy, which can be easily managed with hierarchical clustering. In hierarchical clustering, storage and time requirements grow at a faster-than-linear rate; therefore, these methods cannot be directly applied to large datasets like images, micro-arrays, etc. The BIRCH clustering method is a computationally efficient hierarchical method; however, it generates low-quality clusters when applied to large datasets. Among all the hierarchical methods, the CURE clustering method is preferred for clustering high-dimensional datasets. The computational complexity of the CURE method is O(n² log n), where n corresponds to the number of data points.
– Partitional clustering methods generate disjoint clusters according to some predefined objective function. They are preferred due to their high computing efficiency and low time complexity [66], and for large datasets they are generally preferred over other clustering methods. Partitional clustering algorithms are used when the number of clusters to be created is known in advance; for example, partitional clustering can be used to group the visitors to a website by their age. Though partitional clustering is computationally efficient, the number of clusters needs to be defined a priori. Moreover, partitional clustering methods are sensitive to outliers and sometimes get trapped in local optima. In particular, FCM and K-means are preferable for clustering spherically shaped data, while histogram-based methods may be advantageous when clustering is to be performed irrespective of the data distribution.
– Meta-heuristic based methods are scalable for clustering large data, especially when the data distribution is complex. Meta-heuristic clustering mitigates the issue of trapping into local optima through its exploration and exploitation properties. However, existing meta-heuristic methods often fail to maintain an efficient balance between exploration and exploitation, which greatly affects the performance and convergence behaviour of the method.
Apart from the statistics of clustering-based image segmentation methods tabulated in Table 6, the different performance parameters required for the quantitative evaluation of a segmentation method have been discussed and briefed along with their formulations. In addition, the publicly available image segmentation datasets, with links active to date, are listed to facilitate research in the domain. A dataset consists of a variety of images that vary in a number of different aspects, like the region of interest, image size, applicability, and many more; this poses challenges for segmentation methods such as illumination variation, intra-class variation, and background complexity.
In future, the clustering algorithms can be compared on different factors such as stability, accuracy, normalization, and the volume of the dataset; analysis of these factors may reveal further quality and performance aspects of the clustering algorithms. Secondly, the applicability of the compared clustering methods to real-time applications needs to be analysed. Thirdly, different cluster validity indices [53] can be considered for the comparison. Efficient variants of existing meta-heuristic methods may also improve performance. Further, based on the findings of this paper, similar work on the remaining issues can be taken up in continuation.
References
9. Beyer H-G, Schwefel H-P (2002) Evolution strategies–a comprehensive introduction. Nat Comput 1:3–
52
10. Bezdek JC (1973) Cluster validity with fuzzy sets. J Cybern 3:58–73
11. Bezdek JC, Boggavarapu S, Hall LO, Bensaid A (1994) Genetic algorithm guided clustering. In: Proc.
of IEEE conference on world congress on computational intelligence. USA, pp 34–39
12. Bhaduri A, Bhaduri A (2009) Color image segmentation using clonal selection-based shuffled frog
leaping algorithm. In: Proc. of IEEE international conference on advances in recent technologies in
communication and computing. India, pp 517–520
13. Bouguettaya A, Yu Q, Liu X, Zhou X, Song A (2015) Efficient agglomerative hierarchical clustering.
Expert Syst Appl 42:2785–2797
14. Brain mri segmentation — kaggle (2020) https://www.kaggle.com/mateuszbuda/lgg-mri-segmentation,
(Accessed 08 Jun 2020)
15. Cad 120 affordance dataset — zenodo (2020) https://zenodo.org/record/495570, (Accessed 08 Jun 2020)
16. Chaturvedi A, Green PE, Caroll JD (2001) K-modes clustering. J Classif 18:35–55
17. Chavent M, Lechevallier Y, Briant O (2007) Divclus-t: A monothetic divisive hierarchical clustering
method. Comput Stat Data Anal 52:687–701
18. Chowdhury A, Bose S, Das S (2011) Automatic clustering based on invasive weed optimization algo-
rithm. In: Proc. of springer international conference on swarm, evolutionary, and memetic computing.
India, pp 105–112
19. Chuang L-Y, Hsiao C-J, Yang C-H (2011) Chaotic particle swarm optimization for data clustering.
Expert Syst Appl 38:14555–14563
20. Coift (2020) http://www.vision.ime.usp.br/lucyacm/thesis/coift.html, (Accessed 08 Jun 2020)
21. Covid-19 - medical segmentation (2020) https://medicalsegmentation.com/covid19/, (Accessed 10 Jun
2020)
22. Cs231n convolutional neural networks for visual recognition (2018) http://cs231n.github.io/classification/,
(Accessed 28 Dec 2018)
23. Cvonline: Image databases (2020) http://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm,
(Accessed 10 Jun 2020)
24. Daimler pedestrian segmentation benchmark (2020) http://www.gavrila.net/Datasets/Daimler
Pedestrian Benchmark D/Daimler Pedestrian Segmentatio/daimler pedestrian segmentatio.html,
(Accessed 08 Jun 2020)
25. Dasgupta D, Michalewicz Z (2013) Evolutionary algorithms in engineering applications. Springer
Science & Business Media, Berlin
26. Data master nicola strisciuglio / rustico gitlab (2020) https://gitlab.com/nicstrisc/RUSTICO/tree/master/
data, (Accessed 08 Jun 2020)
27. Dave RN, Bhaswan K (1992) Adaptive fuzzy c-shells clustering and detection of ellipses. IEEE Trans
Neur Netw 3:643–662
28. Dhillon IS, Mallela S, Kumar R (2003) A divisive information-theoretic feature clustering algorithm for
text classification. J Mach Learn Res 3:1265–1287
29. Dorigo M, Blum C (2005) Ant colony optimization theory: A survey. Theor Comput Sci 344:243–278
30. Evimo - motion segmentation with event cameras (2020) https://better-flow.github.io/evimo/index.html,
(Accessed 08 Jun 2020)
31. Guénoche A., Hansen P, Jaumard B (1991) Efficient algorithms for divisive hierarchical clustering with
the diameter criterion. J Classif 8:5–30
32. Guha S, Rastogi R, Shim K (2001) Cure: An efficient clustering algorithm for large databases. Inf Syst 26:35–58
33. Hatamlou A, Abdullah S, Nezamabadi-Pour H (2012) A combined approach for clustering based on
k-means and gravitational search algorithms. Swarm Evolution Comput 6:47–52
34. Huang KY (2011) A hybrid particle swarm optimization approach for clustering and classification of
datasets. Knowl-Based Syst 24:420–426
35. Icg - 3dpitotidataset (2020) https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/
3dpitotidataset/, (Accessed 08 Jun 2020)
36. Janowczyk A, Madabhushi A (2016) Deep learning for digital pathology image analysis: A comprehen-
sive tutorial with selected use cases. J Pathology Inform 7:17–42
37. Jiao L, Gong M, Wang S, Hou B, Zheng Z, Wu Q (2010) Natural and remote sensing image segmentation
using memetic computing. IEEE Comput Intell Mag 5:78–91
38. Jose-Garcia A, Gómez-Flores W. (2016) Automatic clustering using nature-inspired metaheuristics: A
survey. Appl Soft Comput 41:192–213
39. Junqiangchen/lits-liver-tumor-segmentation-challenge: Lits - liver tumor segmentation challenge
(2020) https://github.com/junqiangchen/LiTS---Liver-Tumor-Segmentation-Challenge, (Accessed 08
Jun 2020)
40. Kanungo T, Mount DM, Netanyahu NS, Piatko CD, Silverman R, Wu AY (2002) An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans Pattern Anal Machine Intell 24:881–892
41. Karypis G, Han E.-H., Kumar V (1999) Chameleon: Hierarchical clustering using dynamic modeling.
Computer 32:68–75
42. Kennedy J (2011) Particle swarm optimization. In: Encyclopedia of machine learning. Springer
43. Krishna K, Murty MN (1999) Genetic k-means algorithm. IEEE Trans Syst Man Cybern Part B (Cybern)
29:433–439
44. Langham AE, Grant P (1999) Using competing ant colonies to solve k-way partitioning problems with
foraging and raiding strategies. In: Proc. of springer european conference on artificial life. Switzerland,
pp 621–625
45. Liu R, Wang X, Li Y, Zhang X (2012) Multi-objective invasive weed optimization algortihm for
clustering. In: Proc. of IEEE congress on evolutionary computation. Australia, pp 1–8
46. Liu T, Zhou Y, Hu Z, Wang Z (2008) A new clustering algorithm based on artificial immune system. In:
Proc. of international conference on fuzzy systems and knowledge discovery. USA, pp 347–351
47. Lu Z, Peng Y, Ip HH (2011) Combining multiple clusterings using fast simulated annealing. Pattern
Recogn Lett 32:1956–1961
48. MacQueen J et al (1967) Some methods for classification and analysis of multivariate observations. In:
Proc. of berkeley symposium on mathematical statistics and probability. USA pp 281–297
49. Martin D, Fowlkes C, Tal D, Malik J (2001) A database of human segmented natural images and its
application to evaluating segmentation algorithms and measuring ecological statistics. In: Proc. of IEEE
international conference on computer vision. Canada, pp 1–11
50. Maulik U, Bandyopadhyay S (2000) Genetic algorithm-based clustering technique. Pattern Recogn
33:1455–1465
51. Mittal H, Saraswat M (2018) An optimum multi-level image thresholding segmentation using non-local
means 2d histogram and exponential kbest gravitational search algorithm. Eng Appl Artif Intell 71:226–
235
52. Mittal H, Saraswat M (2019) An automatic nuclei segmentation method using intelligent gravitational
search algorithm based superpixel clustering. Swarm Evolution Comput 45:15–32
53. Mittal H, Saraswat M (2020) A new fuzzy cluster validity index for hyper-ellipsoid or hyper-spherical
shape close clusters with distant centroids, IEEE Trans Fuzzy Syst
54. Mittal H, Tripathi A, Pandey AC, Pal R (2020) Gravitational search algorithm: A comprehensive analysis
of recent variants. Multimed Tools Appl 1–28
55. Ng RT, Han J (2002) Clarans: A method for clustering objects for spatial data mining. IEEE Trans Knowl
Data Eng 5:1003–1016
56. Opensurfaces - a richly annotated catalog of surface appearance (2020) http://opensurfaces.cs.cornell.
edu/publications/opensurfaces/, (Accessed 08 Jun 2020)
57. Opensurfaces - a richly annotated catalog of surface appearance (2020) http://opensurfaces.cs.cornell.
edu/publications/minc/, (Accessed 08 Jun 2020)
58. Pal R, Saraswat M (2019) Histopathological image classification using enhanced bag-of-feature with
spiral biogeography-based optimization. Appl Intell 1–19
59. Pal R, Yadav S, Karnwal R et al (2020) Eewc: Energy-efficient weighted clustering method based on
genetic algorithm for hwsns. Complex Intell Syst 1–10
60. Pandey AC, Rajpoot DS, Saraswat M (2020) Feature selection method based on hybrid data transforma-
tion and binary binomial cuckoo search. J Ambient Intell Human Comput 11(2):719–738
61. Park HS, Jun CH (2009) A simple and fast algorithm for k-medoids clustering. Expert Syst Appl
36:3336–3341
62. Phillips SJ (2002) Acceleration of k-means and related clustering algorithms. In: Lecture notes of
springer workshop on algorithm engineering and experimentation. USA, pp 166–177
63. roadmarking (2020) http://www.tromai.icoc.me/, (Accessed 08 Jun 2020)
64. Saraswat M, Arya K, Sharma H (2013) Leukocyte segmentation in tissue images using differential
evolution algorithm. Swarm Evolution Comput 11:46–54
65. Sarkar S, Das S (2013) Multilevel image thresholding based on 2d histogram and maximum tsallis
entropy-a differential evolution approach. IEEE Trans Image Process 22:4788–4797
66. Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er MJ, Ding W, Lin CT (2017) A review
of clustering techniques and developments. Neurocomputing 267:664–681
67. Segmentation evaluation database (2020) http://www.wisdom.weizmann.ac.il/vision/Seg Evaluation
DB/index.html, (Accessed 08 Jun 2020)
68. Seifoddini HK (1989) Single linkage versus average linkage clustering in machine cells formation
applications. Comput Indust Eng 16:419–426
69. Selim SZ, Alsultan K (1991) A simulated annealing algorithm for the clustering problem. Pattern Recogn
24:1003–1008
70. Sheng W, Liu X (2004) A hybrid algorithm for k-medoid clustering of large data sets. In: Proc. of IEEE
congress on evolutionary computation. USA, pp 77–82
71. Sky dataset (2020) https://www.ime.usp.br/eduardob/datasets/sky/, (Accessed 08 Jun 2020)
72. Steinbach M, Karypis G, Kumar V et al (2000) A comparison of document clustering techniques. In:
Proc. of ACM international conference on knowledge discovery and data mining workshop on text
mining. USA, pp 1–20
73. Sun F-J, Tian Y (2010) Transmission line image segmentation based ga and pso hybrid algorithm. In:
Proc. of IEEE international conference on computational and information sciences. China, pp 677–680
74. The berkeley segmentation dataset and benchmark (2020) https://www2.eecs.berkeley.edu/Research/
Projects/CS/vision/bsds/, (Accessed 08 Jun 2020)
75. Tripathi AK, Sharma K, Bala M (2018) A novel clustering method using enhanced grey wolf optimizer
and mapreduce. Big Data Res 14:93–100
76. Tripathi AK, Sharma K, Bala M, Kumar A, Menon VG, Bashir AK (2020) A parallel military dog based
algorithm for clustering big data in cognitive industrial internet of things. IEEE Trans Indust Inform
77. Tsai C-Y, Kao I-W (2011) Particle swarm optimization with selective particle regeneration for data
clustering. Expert Syst Appl 38:6565–6576
78. Uc berkeley computer vision group - contour detection and image segmentation - resources (2020)
https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html, (Accessed 08
Jun 2020)
79. Use case 1: Nuclei segmentation - Andrew Janowczyk (2020) http://www.andrewjanowczyk.com/
use-case-1-nuclei-segmentation/, (Accessed 08 Jun 2020)
80. Use case 2: Epithelium segmentation - Andrew Janowczyk (2020) http://www.andrewjanowczyk.com/
use-case-2-epithelium-segmentation/, (Accessed 08 Jun 2020)
81. Van der Merwe D, Engelbrecht AP (2003) Data clustering using particle swarm optimization. In: Proc.
of IEEE congress on evolutionary computation. Australia, pp 215–220
82. Visual geometry group - university of oxford (2020) https://www.robots.ox.ac.uk/vgg/data/pets/,
(Accessed 08 Jun 2020)
83. Wan M, Li L, Xiao J, Wang C, Yang Y (2012) Data clustering using bacterial foraging optimization. J
Intell Inform Syst 38:321–341
84. Wolpert DH, Macready WG (1997) No free lunch theorems for optimization. IEEE Trans Evol Comput
1:67–82
85. Xu D, Tian Y (2015) A comprehensive survey of clustering algorithms. Annals Data Sci 2:165–193
86. Xu R, Xu J, Wunsch DC (2010) Clustering with differential evolution particle swarm optimization. In:
Proc. of IEEE congress on evolutionary computation. Spain, pp 1–8
87. Xue-guang W, Shu-hong C (2012) An improved image segmentation algorithm based on two-
dimensional otsu method. Inform Sci Lett 1:77–83
88. Yager RR, Filev DP (1994) Approximate clustering via the mountain method. IEEE Trans Syst Man
Cybern 24:1279–1284
89. Yang X-S (2010) Firefly algorithm, stochastic test functions and design optimisation. Int J Bio-Inspired
Comput 2:78–84
90. Yang F, Sun T, Zhang C (2009) An efficient hybrid data clustering method based on k-harmonic means
and particle swarm optimization. Expert Syst Appl 36:9847–9852
91. Zaitoun NM, Aqel MJ (2015) Survey on image segmentation techniques. Procedia Comput Sci 65:797–
806
92. Zhang B, Hsu M, Dayal U (2001) K-harmonic means-a spatial clustering algorithm with boosting.
In: Lecture notes of springer workshop on temporal, spatial, and spatio-temporal data mining. France,
pp 31–45
93. Zhang Y, Huang D, Ji M, Xie F (2011) Image segmentation using pso and pcm with mahalanobis
distance. Expert Syst Appl 38:9036–9040
94. Zhang T, Ramakrishnan R, Livny M (1996) Birch: An efficient data clustering method for very large
databases. In: Proc. of ACM sigmod record, pp 103–114
95. Zou W, Zhu Y, Chen H, Sui X (2010) A clustering approach using cooperative artificial bee colony
algorithm. Discret Dyn Nat Soc 2010:22–37