Submitted in Partial Fulfillment of the Requirement for the Award of the Degree of Master of Technology in (Computer Science & Engineering)
Internal Presentation on
ENHANCING CLASSIFICATION SCHEME USING BIO-INSPIRED APPROACHES
Supervisor: Er. Navpreet Rupal
Contents
Introduction
Literature Survey
Problem Formulation
Objectives
Methodology
Results
Conclusion
Future Scope
References
Publication
Introduction
I. Data Mining
II. Different forms of Data Mining
III. Spatial Data Mining Techniques
IV. Clustering
V. Spatial Clustering
VI. K-Means Clustering & Ward's Method
VII. Bio-Inspired Approaches
HBO
FFO
Clustering
Outlier Detection
Association and Co-Location
Classification
Trend-Detection
Spatial Clustering
The process of grouping a set of spatial objects into clusters so that
objects within a cluster are highly similar to one another but
dissimilar to objects in other clusters.
For example, clustering is used to determine "hot spots" in
crime analysis and disease tracking.
Spatial clustering can be applied to group similar spatial
objects together; the implicit assumption is that patterns in
space tend to be grouped rather than randomly located.
Patterns generated by a non-random process can be either
cluster patterns (aggregated patterns) or decluster patterns
(uniformly spaced patterns).
Partitioning Method
Hierarchical Method
Density-based Method
Grid-based Method
Literature Survey
1. Sang Jun Lee et al.
   Features: Describes different techniques and where ...
   The problem ... collecting is ...
2.
4. Diansheng et al.
   Features: Describes the contribution of spatial data mining and
   geographic knowledge discovery to geographic information sciences.
   Advantages: More research domains have gained access to
   high-quality geographic data.
5. Dipika et al.
   Features: Overview of spatio-temporal data mining and its various
   tasks and techniques in detail.
   Advantages: Explains spatio-temporal data models, tasks and models.
   Limitations: Space and time, accuracy, complexity, data models.
6. Sundararajan et al.
   Features: A spatial clustering approach is proposed that is very
   efficient in processing data along with noise and outliers.
   Advantages: Plays an essential role in obtaining interesting spatial
   patterns and characteristics and in capturing inherent associations
   among spatial and non-spatial data.
7. Kiran et al.
   Features: Addresses the problem of mining co-location patterns with
   a novel method called Mediod participation index.
   Advantages: Better performance than other methods.
   Limitations: Time consuming.
Problem Formulation
Recent developments in computer technology have advanced the
generation and consumption of data in our daily life.
As the data held in data warehouses keeps growing, extracting
the relevant information becomes a cumbersome task, and
data mining techniques are used to do so.
A number of artificial intelligence techniques help data
mining obtain an optimized result for a query.
A hybrid of K-Means and Ward's method, Honeybee Optimization
and Firefly Optimization will be compared on the basis of the
performance parameters of classification (precision, recall,
cohesion, variance, F-Measure, H-Measure), and enhancements
will be made accordingly.
Objectives
To create a hybrid algorithm of K-Means and Ward's method.
To optimize it using the Honey-Bee and Firefly algorithms.
To enhance the classification scheme, evaluated through
various performance parameters.
Methodology
Selection of a spatial dataset.
Implementation of the hybrid algorithm of K-Means and
Ward's method to form the clusters.
Implementation of the Honeybee Optimization technique.
Implementation of Firefly Optimization.
Comparison on the basis of the performance parameters
of classification.
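The hybrid clustering step can be sketched as follows. The slides do not say how K-Means and Ward's method are combined, so this sketch assumes a two-level scheme: K-Means first forms many fine clusters, and Ward's criterion then merges them down to the final number. The function names, the deterministic seeding of centroids, and the parameters `k_fine` and `k_final` are illustrative assumptions, not the thesis implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k, iters=50):
    """Plain K-Means; centroids are seeded from evenly spaced input points
    (an assumption made here so the sketch is deterministic)."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[nearest].append(p)
        updated = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if updated == centroids:  # assignments are stable
            break
        centroids = updated
    return [g for g in groups if g]  # drop empty clusters

def ward_merge(clusters, target):
    """Repeatedly merge the pair of clusters whose union gives the smallest
    increase in within-cluster sum of squares (Ward's criterion)."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > target:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                cost = len(a) * len(b) / (len(a) + len(b)) * sq_dist(mean(a), mean(b))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def hybrid_kmeans_ward(points, k_fine, k_final):
    """Assumed two-level hybrid: fine K-Means clusters merged by Ward's method."""
    return ward_merge(kmeans(points, k_fine), k_final)

blob_a = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
blob_b = [(100, 100), (101, 100), (100, 101), (101, 101), (100.5, 100.5)]
clusters = hybrid_kmeans_ward(blob_a + blob_b, k_fine=4, k_final=2)
print([len(c) for c in clusters])
```

Starting Ward's method from K-Means clusters rather than from single points keeps the pairwise merge search small, which is one common motivation for a two-level design.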
Flowchart
Uploading Dataset
HBO Implementation
FFO implementation
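A minimal sketch of Firefly Optimization, assuming the standard formulation in which a firefly moves toward every brighter one with attractiveness beta0 * exp(-gamma * r^2) plus a small random step. The function name, the parameter defaults, and the toy sphere objective are assumptions for illustration; in a clustering setting the objective could be, for example, the within-cluster sum of squares of candidate centroid positions.

```python
import math
import random

def firefly_minimize(objective, population, iters=100,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimize `objective` starting from `population` (a list of vectors).
    Lower objective value means a brighter firefly. Parameter defaults are
    illustrative assumptions, not values from the thesis."""
    rng = random.Random(seed)
    swarm = [list(x) for x in population]
    best = list(min(swarm, key=objective))  # best position seen so far
    for _ in range(iters):
        cost = [objective(x) for x in swarm]
        for i in range(len(swarm)):
            for j in range(len(swarm)):
                if cost[j] < cost[i]:  # j is brighter, so i moves toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
                    cost[i] = objective(swarm[i])
        candidate = min(swarm, key=objective)
        if objective(candidate) < objective(best):
            best = list(candidate)
        alpha *= 0.97  # shrink the random step over time
    return best

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

start_rng = random.Random(1)
start = [[start_rng.uniform(-5, 5) for _ in range(2)] for _ in range(10)]
best = firefly_minimize(sphere, start)
print(sphere(best))
```

Because the best position is tracked separately from the swarm, the returned solution is never worse than the best starting point, even when the random step pushes individual fireflies away from it.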
Performance Parameters
I. Precision
II. Recall
III. Cohesion
IV. Variance
V. F-Measure
VI. H-Measure
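Precision, recall and F-Measure have standard definitions based on true positives, false positives and false negatives, sketched below for a single class of interest. The function name and the toy labels are assumptions for illustration. Cohesion, variance and H-Measure are not included because the slides do not give their formulas.

```python
def precision_recall_f(true_labels, predicted_labels, positive):
    """Return (precision, recall, F-measure) for the class `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure

true_labels = [1, 1, 1, 0, 0]
predicted_labels = [1, 1, 0, 1, 0]
precision, recall, f_measure = precision_recall_f(true_labels, predicted_labels, positive=1)
print(precision, recall, f_measure)
```

In this toy case there are two true positives, one false positive and one false negative, so all three values equal 2/3.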
Conclusion
The thesis presents regionalization based on a hybrid K-Means and
Ward's clustering algorithm combined with two optimization techniques,
Honey Bee Optimization (HBO) and the Firefly algorithm (FFO).
In the thesis, three algorithms (hybrid K-Means and Ward's method,
HBO and FFO) are implemented on a spatial dataset taken from the UCI
Machine Learning Repository.
For each algorithm, performance parameters such as
Cohesion, Variance, Precision, Recall, F-Measure and H-Measure
are calculated. It can be concluded that H-Measure, F-Measure,
Cohesion, Recall and Precision on the dataset are higher for the
Firefly algorithm than for the hybrid K-Means and Ward's method and
Honey Bee Optimization, while Variance is lower for FFO.
As seen in this thesis work, FFO has been implemented successfully
over the hybrid K-Means and Ward's algorithm and HBO.
Future Scope
In the present work we have implemented FFO-based classification
successfully using a spatial dataset taken from the UCI Repository of
Machine Learning Databases.
In future work, other artificial intelligence algorithms can be
combined to get a more optimized result, and further enhancement
can be made using other parameters as well.
References
1. Assunção, Renato M., Marcos Corrêa Neves, Gilberto Câmara, and Corina da Costa
Freitas (2006). "Efficient regionalization techniques for socioeconomic geographical units
using minimum spanning trees." International Journal of Geographical Information
Science 20, no. 7: 797-811.
2. Berry, Michael J. A. (1997). Data-Mining Techniques for Marketing, Sales and Customer
Support. U.S.A: John Wiley and Sons.
3. Cheu, Eng Yeow, Chee Keongg, and Zonglin Zhou (2004). "On the two-level hybrid clustering
algorithm." in International conference on artificial intelligence in science and technology,
pp. 138-142.
4. Christina, J., and K. Komathy (2013). "Analysis of hard clustering algorithms applicable to
regionalization." in Information & Communication Technologies (ICT), 2013 IEEE
Conference on, pp. 606-610.
5. Fayyad, Usama M. (1996). "Data mining and knowledge discovery: Making sense out of
data." IEEE Intelligent Systems 11, no. 5: 20-25.
6. Haddad, Omid Bozorg, Abbas Afshar, and Miguel A. Mariño (2006). "Honey-bees mating
optimization (HBMO) algorithm: a new heuristic approach for water resources
optimization." Water Resources Management 20, no. 5: 661-680.
7. Jafar, O. A. Mohamed, and R. Sivakumar (2013). "A Comparative Study of Hard and Fuzzy Data
Clustering Algorithms with Cluster Validity Indices."
8. Kang, In-Soo, Tae-wan Kim, and Ki-Joune Li (1997). "A spatial data mining method by
Delaunay triangulation." in Proceedings of the 5th ACM international workshop on
Advances in geographic information systems, pp. 35-39. ACM.
9. Kiran, P. Premchand, and T. Venu Gopal. "Mining of spatial co-location pattern from spatial
datasets." International Journal of Computer Applications 42, no. 21: 25-30.
10. Lan, Rongqin, Wenzhong Shi, Xiaomei Yang, and Guangyuan Lin (2005). "Mining fuzzy
spatial configuration rules: methods and applications." in ISPRS Workshop on Service and
Application of Spatial Data Infrastructure, pp. 319-324.
11. Lee, Sang Jun, and Keng Siau (2001). "A review of data mining techniques." Industrial
Management & Data Systems 101, no. 1: 41-46.
12. Li, Sheng-Tun, Shih-Wei Chou, and Jeng-Jong Pan (2000). "Multi-resolution spatio-temporal
data mining for the study of air pollutant regionalization." in System Sciences, Proceedings of
the 33rd Annual Hawaii International Conference on, pp. 7-pp. IEEE.
13. Lyman, P., and Hal R. Varian (2003). "How much storage is enough?" Storage, 1:4.
14. Mennis, Jeremy, and Diansheng Guo (2009). "Spatial data mining and geographic knowledge
discovery: An introduction." Computers, Environment and Urban Systems 33, no. 6: 403-408.
15. Osama Abu Abbas (2008). "Comparison between Data Clustering Algorithm." The
International Arab Journal of Information Technology, Vol. 3, No. 3.
16. Pelczer, Ildiko, Judith Ramos, Ramón Domínguez, and Fernando González (2007).
"Establishment of regional homogeneous zones in a watershed using
clustering algorithms." Harmonizing the Demands of Art and Nature in Hydraulics, IAHR,
Venice.
17. Pham, D. T., A. Ghanbarzadeh, E. Koc, S. Otri, S. Rahim, and M. Zaidi (2006). "The bees
algorithm: a novel tool for complex optimisation problems." in Proceedings of the 2nd Virtual
International Conference on Intelligent Production Machines and Systems (IPROMS 2006),
pp. 454-459.
18. Sabar, Nasser R., Masri Ayob, Graham Kendall, and Rong Qu (2012). "A honey-bee mating
optimization algorithm for educational timetabling problems." European Journal of
Operational Research 216, no. 3: 533-543.
19. Saini, Geetinder, and Kamaljit Kaur (2014). "Regionalization as spatial data mining problem
based on clustering: review."
20. Sharma, Lokesh Kumar, Simon Scheider, Willy Kloesgen, and Om Prakash Vyas (2008).
"Efficient clustering technique for regionalisation of a spatial database." International Journal
of Business Intelligence and Data Mining 3, no. 1: 66-81.
21. Shekhar, Shashi, Pusheng Zhang, Yan Huang, and Ranga Raju Vatsavai (2003). "Trends in
spatial data mining." Data mining: Next generation challenges and future directions, 357-380.
22. Shumway, Robert H., and David S. Stoffer (2010). Time series analysis and its applications:
with R examples. Springer.
23. Srinivas, P. V. V. S., Susanta K. Satpathy, Lokesh K. Sharma, and Ajaya K. Akasapu
(2011). "Regionalisation as Spatial Data Mining Problem: A Comparative Study." Proc.
International Journal of Computer Trends and Technology 18, no. 5: 577-589.
24. Sumathi, N., R. Geetha, and S. Sathiya Bama (2014). "Spatial Data Mining-Techniques Trends
and Its Applications." Journal of Computer Applications 1, no. 4: 28.
25. Sundararajan, S., and S. Karthikeyan (2013). "A Study On Spatial Data Clustering Algorithms
In Data Mining."
26. Teknomo, Kardi (2000). K-Means Clustering. http://people.revoledu.com/kardi/tutorial/kMean/
27. Teodorović, Dušan, and Mauro Dell'Orco (2005). "Bee colony optimization: a cooperative
learning approach to complex transportation problems." in Advanced OR and AI Methods in
Transportation: Proceedings of 16th Mini-EURO Conference and 10th Meeting of EWGT
(13-16 September 2005). Poznan: Publishing House of the Polish Operational and System
Research, pp. 51-60.
28. Wang, Xin, and Howard Hamilton (2008). "Using clustering methods in geospatial
information systems." in Geoinformatics and Joint Conference on GIS and Built
Environment: Advanced Spatial Data Models and Analyses, pp. 71461N-71461N.
29. Xie, Caixiang, Shilin Chen, Fengmei Suo, Dan Yang, and Chengzhong Sun (2010).
"Regionalization of Chinese medicinal plants based on spatial data mining." in Fuzzy
Systems and Knowledge Discovery (FSKD), Seventh International Conference on, vol. 4, pp.
1647-1651. IEEE.
30. Xin Wang, Jing Wang (2009). Using Clustering methods in geospatial information systems,
International Society for Optics and Photonics.
31. Yang, Xin-She, and Xingshi He (2013). "Firefly algorithm: recent advances and
applications." International Journal of Swarm Intelligence 1, no. 1: 36-50.
Publications
THANK YOU !!