Mining Spatial Data & Enhancing Classification Using Bio-Inspired Approaches
Abstract: Data mining (DM) has become one of the most valuable tools for extracting and manipulating data and for establishing
patterns in order to produce useful information for decision-making. It is a generic term for finding hidden patterns in
data (tabular, spatial, temporal, spatio-temporal, etc.). Spatial data mining is the process of discovering interesting and previously
unknown, but potentially useful, patterns from spatial databases. Extracting interesting and useful patterns from spatial datasets is more
difficult than extracting the corresponding patterns from traditional numeric and categorical data because of the complexity of spatial data
types, spatial relationships and spatial autocorrelation. Spatial data are data related to objects that occupy space. A spatial database
stores spatial objects represented by spatial data types and the spatial relationships among such objects. Clustering is the process of
partitioning a set of data objects into subsets such that the elements in a cluster are similar to one another and different from the
elements of other clusters. The set of clusters resulting from a cluster analysis is referred to as a clustering. Spatial clustering is the
process of grouping a set of spatial objects into clusters so that objects within a cluster are highly similar to one
another but dissimilar to objects in other clusters. In this paper, the classification scheme is enhanced using Honey
Bee Optimization (HBO) and Firefly Optimization (FFO). A number of artificial intelligence techniques help data mining obtain an
optimized result for a query. A hybrid of K-Mean and Ward's Method, Honey Bee Optimization and Firefly Optimization are compared
on the basis of classification performance parameters (precision, recall, cohesion, variance, F-Measure, H-Measure), and the
enhancement is made accordingly.
Keywords: Spatial Data Mining; Clustering; FFO; HBO; Hybrid K-Mean; Ward's Method.
1. Introduction
Recent developments in science and technology have greatly
increased the amount of data held in data warehouses, so
finding the required information has become a cumbersome
task. To solve this problem, various data mining techniques
have been proposed. Data mining simply means extracting
data from a database, but these techniques have to be
efficient in order to return an optimized result for a
query. Data mining is thus a process through which patterns
and interrelationships in the data are discovered, which is
why it has become a powerful tool. The process of data
mining is shown in Figure 1.
2. Proposed Methodology
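The hybrid of K-Mean and Ward's Method, Honey Bee Optimization
and Firefly Optimization will be compared on the basis of
performance parameters of classification (precision, recall,
cohesion, variance, F-Measure, H-Measure), and the enhancement
will be made accordingly. The objectives of the work carried
out in this paper can be stated in the following points: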
(i) To create a hybrid algorithm of K-Mean and Ward's Method.
(ii) To optimize it using the Honey Bee and Firefly algorithms.
(iii) To make enhancements through various performance parameters for evaluation of the classification scheme.
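
The first objective can be illustrated with a minimal sketch. The paper does not spell out how K-Mean and Ward's Method are combined, so the code below assumes one common arrangement: K-Means first over-segments the data into more groups than required, and Ward's minimum-variance criterion then merges those groups until the target number of clusters remains. The function names, the `k_initial` parameter and the sample points are hypothetical and are not taken from the paper.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Centroid of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style K-Means; returns the non-empty groups of points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        new_centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return [g for g in groups if g]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist(mean(a), mean(b))

def ward_merge(clusters, k):
    """Repeatedly merge the pair with the smallest Ward cost until k remain."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        del clusters[j]          # j > i, so index i is unaffected
        clusters[i] = merged
    return clusters

def hybrid_cluster(points, k, k_initial=None, seed=0):
    """Assumed hybrid: over-segment with K-Means, then merge with Ward's criterion."""
    if k_initial is None:
        k_initial = min(len(points), 3 * k)
    return ward_merge(kmeans(points, k_initial, seed=seed), k)

# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
clusters = hybrid_cluster(points, k=2)
for c in clusters:
    print(sorted(c))
```

Over-segmenting first keeps the quadratic-cost Ward merging step small, since it operates on a handful of K-Means groups rather than on every data object.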
5. Experimental Work
Figure 3: GUI used for Regionalization based on the Hybrid K-Mean and Ward's clustering algorithm using different
optimization techniques
Figure 7: Window for Applying the Hybrid K-Mean and Ward's Algorithm for Regionalization
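
The optimization step applied on top of the hybrid clustering can be sketched in the same spirit. The code below is a minimal version of the standard firefly update, in which each firefly moves toward every brighter (lower-cost) firefly with an attractiveness that decays with distance. It is not the implementation used in the paper. The parameter names `alpha`, `beta0` and `gamma` follow the usual firefly-algorithm notation, and the objective (sum of squared distances from each point to its nearest centroid), the data and the bounds are all assumptions made for illustration.

```python
import math
import random

def firefly_minimize(objective, dim, bounds, n_fireflies=15, iters=60,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimize `objective` over a box using a basic firefly update."""
    rng = random.Random(seed)
    lo, hi = bounds
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]
    for _ in range(iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter than firefly i
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                    # Move i toward j, add a small random step, clip to the box.
                    swarm[i] = [
                        min(hi, max(lo, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    cost[i] = objective(swarm[i])
        alpha *= 0.97  # gradually reduce the random step
    best = min(range(n_fireflies), key=lambda i: cost[i])
    return swarm[best], cost[best]

# Hypothetical 2-D points in the unit square (not the paper's dataset).
POINTS = [(0.10, 0.10), (0.15, 0.20), (0.20, 0.10),
          (0.80, 0.90), (0.90, 0.80), (0.85, 0.85)]

def sse(flat_centroids):
    """Sum of squared distances from each point to its nearest centroid.

    `flat_centroids` encodes k centroids as [x1, y1, x2, y2, ...].
    """
    centroids = [flat_centroids[i:i + 2] for i in range(0, len(flat_centroids), 2)]
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, cen)) for cen in centroids)
        for pt in POINTS
    )

# Search for two centroids (dim = 2 centroids x 2 coordinates).
best, best_cost = firefly_minimize(sse, dim=4, bounds=(0.0, 1.0))
print(best, best_cost)
```

In this sketch the brightest firefly never moves, so the best cost in the swarm cannot get worse from one iteration to the next. A fuller implementation would typically seed the swarm with the centroids produced by the hybrid clustering rather than with random positions.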
6. Results
Cohesion
Figure 15 indicates that FFO is more cohesive than both
Hybrid K-Means and Honey Bee Optimization.
Figure 17: Graph showing values of Precision
Recall
The recall of the Hybrid K-Mean & Ward's Method is 0.55, that
of HBO is 0.64, and FFO has the highest value, 0.73.
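
For reference, precision, recall and F-Measure can be computed from classification counts as in the sketch below. The standard definitions are used, with F-Measure taken as the harmonic mean of precision and recall. The counts in the example are hypothetical, and so is the precision of 0.68 paired with the reported FFO recall of 0.73.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-Measure from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f_measure(precision, recall)

# Hypothetical counts: 8 correct assignments, 2 false positives, 2 false negatives.
p, r, f = precision_recall_f(tp=8, fp=2, fn=2)
print(p, r, f)

# Reported FFO recall (0.73) combined with an assumed precision of 0.68.
print(round(f_measure(0.68, 0.73), 2))
```

With these assumed inputs the last line gives an F-Measure of about 0.70.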
Firefly optimization: 0.62, 0.41, 0.68, 0.73, 0.70, 0.71
7. Conclusion
This paper presents regionalization based on the Hybrid K-Means
and Ward's clustering algorithm using two optimization
techniques, Honey Bee Optimization and the Firefly algorithm.
Three algorithms, Hybrid K-Mean and Ward's Method, HBO and FFO,
are implemented on a spatial dataset taken from the UCI Machine
Learning Repository. For each algorithm, the performance
parameters Cohesion, Variance, Precision, Recall, F-Measure
and H-Measure are calculated. It can be concluded that
H-Measure, F-Measure, Cohesion, Recall and Precision on the
dataset are higher for the Firefly algorithm than for the
Hybrid K-Mean and Ward's Method and Honey Bee Optimization,
while Variance is lower for FFO. As seen in this work, FFO has
been implemented successfully and improves on the Hybrid
K-Mean & Ward's algorithm and HBO.
8. Future Work
In the present work we have successfully implemented
FFO-based classification using a spatial dataset taken from
the UCI Repository of Machine Learning Databases. In future
work, other artificial intelligence algorithms can be
combined with it to obtain a more optimized result, and the
enhancement can be evaluated using other parameters as well.