Submitted in Partial Fulfillment of The Requirement For The Award of The Degree of Master of Technology IN (Computer Science & Engineering)

This document appears to be a thesis proposal submitted by Poonam Kataria for a Master's degree in Computer Science and Engineering. The proposal discusses enhancing classification schemes using bio-inspired approaches. The objectives are to create a hybrid K-Means and Ward's clustering algorithm, optimize it using honeybee and firefly algorithms, and evaluate performance based on various parameters. The methodology involves applying the algorithms to a spatial dataset and comparing results. A literature review covers related work in spatial data mining and clustering techniques.

Uploaded by

Poonam Kataria

Pre-Submission of Thesis
SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENT
FOR THE AWARD OF THE DEGREE OF
MASTER OF TECHNOLOGY
IN
(COMPUTER SCIENCE & ENGINEERING)

SHAHEED UDHAM SINGH COLLEGE OF ENGINEERING & TECHNOLOGY, TANGORI (MOHALI) 140306

Internal Presentation
on

ENHANCING CLASSIFICATION SCHEME USING BIO-INSPIRED APPROACHES

Thesis for M-Tech (C.S.E.)

Supervisor: Er. Navpreet Rupal

Presented By: Poonam Kataria

Univ. Roll No. 96446582909

Contents

Introduction
Literature Survey
Problem Formulation
Objectives
Methodology
Results
Conclusion
Future Scope
References
Publication

Introduction
I. Data Mining
II. Different forms of Data Mining
III. Spatial Data Mining Techniques
IV. Clustering
V. Spatial Clustering
VI. K-Means Clustering & Ward's Method
VII. Bio-Inspired Approaches
HBO (Honeybee Optimization)
FFO (Firefly Optimization)

Forms of Data Mining

Spatio-temporal Data Mining
Multimedia Mining
Spatial Data Mining
Web Mining

Spatial Data Mining Techniques

Clustering
Outlier Detection
Association and Co-Location
Classification
Trend-Detection

Spatial Clustering
The process of grouping a set of spatial objects into clusters so that
objects within a cluster are highly similar to one another but
dissimilar to objects in other clusters.
For example, clustering is used to determine "hot spots" in
crime analysis and disease tracking.
Spatial clustering can be applied to group similar spatial
objects together; the implicit assumption is that patterns in
space tend to be grouped rather than randomly located.
Patterns generated by a non-random process can be either
cluster patterns (aggregated patterns) or decluster patterns
(uniformly spaced patterns).
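
As a rough illustration of the definition above, the following Python sketch compares the mean distance between points inside one cluster with the mean distance to the points of another cluster. The coordinates and the helper name mean_pairwise_distance are made up for illustration and are not taken from the thesis dataset.

```python
import math
from itertools import combinations


def mean_pairwise_distance(points_a, points_b=None):
    """Mean Euclidean distance within one group, or between two groups."""
    if points_b is None:
        pairs = list(combinations(points_a, 2))
    else:
        pairs = [(p, q) for p in points_a for q in points_b]
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


# Two hypothetical "hot spots" of incident locations (x, y).
hot_spot_a = [(0, 0), (0, 1), (1, 0)]
hot_spot_b = [(10, 10), (10, 11), (11, 10)]

intra = mean_pairwise_distance(hot_spot_a)              # similarity inside a cluster
inter = mean_pairwise_distance(hot_spot_a, hot_spot_b)  # separation between clusters
print(intra < inter)
```

A good spatial clustering keeps the first value small relative to the second.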

Different Types of Clustering Methods

Partitioning Method
Hierarchical Method
Density-based Method
Grid-based Method

Literature Survey

1. Sang Jun Lee et al.
Features: Describes different data mining techniques and where they are used.
Advantages: Ability to handle different types of data, graceful degeneration of DM algorithms, and protection of privacy and data security.
Limitations: The problem [...] collecting is resolved.

2. Abbas et al.
Features: Studies and compares different data clustering algorithms.
Advantages: Compares the performance of k-means for forming clusters; SOM provides more accuracy.
Limitations: k-means is sensitive to noise, which affects the clustering.

3. Eng Yeow Cheu et al.
Features: Designs hybrid clustering algorithms that involve two-level clustering.
Advantages: Highest percentage of samples correctly clustered using two-level clustering.
Limitations: The algorithm does not provide homogeneous clustering.

4. Diansheng et al.
Features: Describes the contribution of spatial data mining and geographic knowledge discovery to the geographic information sciences.
Advantages: More research domains have gained access to high-quality geographic data.
Limitations: Data have become more diverse, complex, dynamic, and much larger, and so more difficult to analyze and understand.

5. Dipika et al.
Features: Overview of spatio-temporal data mining and its various tasks and techniques in detail.
Advantages: Explains spatio-temporal data models, tasks and models.
Limitations: Space and time complexity; accuracy of data models.

6. Sundararajan et al.
Features: Proposes a spatial clustering approach that is very efficient at processing data containing noise and outliers.
Advantages: Plays an essential role in obtaining interesting spatial patterns and characteristics and in capturing inherent associations among spatial and non-spatial data.
Limitations: Predicting the similarity when the spatial dataset is used for clustering.

7. Kiran et al.
Features: Addresses the problem of mining co-location patterns with a novel method called the Medoid participation index.
Advantages: Better performance than other methods.
Limitations: Time consuming.

Problem Formulation
Recent developments in computer technology have accelerated the
generation and consumption of data in daily life.
As the volume of data in data warehouses grows, extracting the
relevant information becomes a cumbersome task, so data mining
techniques are used.
A number of artificial intelligence techniques help data mining
return an optimized result for a query.
The hybrid of K-Means & Ward's Method, Honeybee Optimization
and Firefly Optimization will be compared on the basis of the
performance parameters of classification (precision, recall,
cohesion, variance, F-Measure, H-Measure), and the classification
scheme will be enhanced accordingly.

Objectives
To create a hybrid algorithm of K-Means and
Ward's Method.
To optimize it using the Honeybee and Firefly
algorithms.
To enhance the classification scheme and evaluate
it through various performance parameters.

Methodology
Selection of a spatial dataset.
Implementation of the hybrid algorithm of K-Means and
Ward's Method to form the clusters.
Implementation of the Honeybee Optimization technique.
Implementation of Firefly Optimization.
Comparison on the basis of the performance parameters
of classification.

Flowchart

Hybrid of K-Means & Ward's Method

ALGORITHM:
Step 1. Set the number of clusters K for K-Means.
Step 2. Apply the K-Means algorithm to generate K homogeneous clusters.
Step 3. After obtaining the K clusters, apply Ward's algorithm.
Step 4. Select all clusters as input to Ward's algorithm.
Step 5. Take a random threshold value as the distance and find the maximum
value in the clusters.
Step 6. Check whether the value of the selected cluster is near the maximum
value.
Step 7. Define each cluster according to how near its value is to the maximum
value of the previous cluster.
Step 8. Allocate clusters within the range between the maximum and minimum
values of the previous cluster.
Step 9. Obtain the final, optimized clusters.
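
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the thesis implementation: Steps 1 to 2 are a plain K-Means, and because Steps 5 to 8 do not specify the threshold rule precisely, the sketch substitutes the standard Ward criterion (merge the pair of clusters whose union increases the within-cluster sum of squares the least) and stops once that increase exceeds a user-chosen threshold. All function names and the sample points are hypothetical.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iters=50, seed=0):
    """Steps 1-2: plain K-Means; returns the non-empty clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        new = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if new == centroids:
            break
        centroids = new
    return [g for g in groups if g]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * sq_dist(mean(a), mean(b))


def ward_merge(clusters, threshold):
    """Steps 3-9 (assumed reading): merge until the Ward cost exceeds threshold."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        cost, i, j = min(
            (ward_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if cost > threshold:
            break
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return clusters


def hybrid_kmeans_ward(points, k, threshold, seed=0):
    return ward_merge(kmeans(points, k, seed=seed), threshold)


# Two well-separated square blobs; K is deliberately set too high.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
result = hybrid_kmeans_ward(points, k=4, threshold=10.0)
print(len(result))
```

Starting K-Means with more clusters than needed and letting the Ward stage merge them back is one common way to combine a partitioning method with a hierarchical one; the threshold value is data dependent and would need tuning on a real spatial dataset.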

GUI to implement Hybrid K-Means & Ward's Method

Browsing the data set

Uploading Dataset

Result of Hybrid Algorithm

GUI of Optimization Techniques

HBO Implementation

FFO implementation
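
The slides show the FFO implementation only as a screenshot, so the following is a minimal Python sketch of the standard firefly update rule (as described in the Yang and He reference), not the code used in the thesis. Each firefly moves towards every brighter one with an attractiveness that decays with distance, plus a small random step. The function name firefly_minimize, the parameter values and the sphere test function are assumptions chosen for illustration.

```python
import math
import random


def firefly_minimize(cost, dim, bounds, n_fireflies=15, iters=100,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Minimise cost(x) inside a box; a lower cost means a brighter firefly."""
    rng = random.Random(seed)
    lo, hi = bounds
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)]
             for _ in range(n_fireflies)]
    costs = [cost(x) for x in swarm]
    best_i = min(range(n_fireflies), key=costs.__getitem__)
    best_x, best_cost = list(swarm[best_i]), costs[best_i]

    for _ in range(iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if costs[j] >= costs[i]:
                    continue  # only move towards brighter fireflies
                r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                swarm[i] = [
                    min(hi, max(lo, a + beta * (b - a)
                                + alpha * (rng.random() - 0.5)))
                    for a, b in zip(swarm[i], swarm[j])
                ]
                costs[i] = cost(swarm[i])
                if costs[i] < best_cost:
                    best_x, best_cost = list(swarm[i]), costs[i]
        alpha *= 0.97  # shrink the random step over time
    return best_x, best_cost


def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = firefly_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_cost)
```

To apply this to clustering, one plausible choice (again an assumption) is to let each firefly encode the coordinates of all cluster centroids and to use the within-cluster sum of squared distances as the cost.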

Performance Parameters
I. Precision
II. Recall
III. Cohesion
IV. Variance
V. F-Measure
VI. H-Measure
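
The slides name these parameters without defining them. The sketch below uses the usual textbook definitions of precision, recall and F-Measure for one class, and takes variance to mean the average squared distance of a cluster's points from its centroid; both readings are assumptions. Cohesion and H-Measure are left out because their definitions are not given here. The labels and points are invented for illustration.

```python
def precision_recall_f(true_labels, pred_labels, positive):
    """Precision, recall and F-Measure for one class label."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


def within_cluster_variance(cluster):
    """Mean squared distance of the points in a cluster from its centroid."""
    n = len(cluster)
    centroid = [sum(c) / n for c in zip(*cluster)]
    return sum(
        sum((x - m) ** 2 for x, m in zip(p, centroid)) for p in cluster
    ) / n


true_labels = [1, 1, 1, 0, 0]
pred_labels = [1, 1, 0, 1, 0]
precision, recall, f_measure = precision_recall_f(true_labels, pred_labels, 1)
variance = within_cluster_variance([(0, 0), (2, 0)])
print(precision, recall, f_measure, variance)
```

Higher precision, recall and F-Measure and lower variance indicate better clusters, which matches the direction of comparison used in the Conclusion.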

Comparison Based on Evaluation Parameters

Conclusion
The thesis presents regionalization based on a hybrid K-Means and
Ward's clustering algorithm, optimized with two techniques:
Honey Bee Optimization and the Firefly algorithm.
Three algorithms (Hybrid K-Means and Ward's Method, HBO and FFO)
are implemented on a spatial dataset taken from the UCI
Machine Learning Repository.
For each algorithm, the performance parameters Cohesion, Variance,
Precision, Recall, F-Measure and H-Measure are calculated. On this
dataset, H-Measure, F-Measure, Cohesion, Recall and Precision are
higher for the Firefly algorithm than for the hybrid K-Means and
Ward's Method and Honey Bee Optimization, while Variance is lower
for FFO.
In this thesis work, FFO was implemented successfully and performed
better than the hybrid K-Means & Ward's algorithm and HBO on these
parameters.

Future Scope
In the present work, FFO-based classification has been implemented
successfully on a spatial dataset taken from the UCI Repository of
Machine Learning Databases.
In future work, other artificial intelligence algorithms could be
combined with this approach to obtain a more optimized result, and
further performance parameters could be used for enhancement.

References
1. Assunção, Renato M., Marcos Corrêa Neves, Gilberto Câmara, and Corina da Costa Freitas (2006). "Efficient regionalization techniques for socioeconomic geographical units using minimum spanning trees." International Journal of Geographical Information Science 20, no. 7: 797-811.
2. Berry, Michael J. A. (1997). Data-Mining Techniques for Marketing, Sales and Customer Support. U.S.A.: John Wiley and Sons.
3. Cheu, Eng Yeow, Chee Keongg, and Zonglin Zhou (2004). "On the two-level hybrid clustering algorithm." In International Conference on Artificial Intelligence in Science and Technology, pp. 138-142.
4. Christina, J., and K. Komathy (2013). "Analysis of hard clustering algorithms applicable to regionalization." In Information & Communication Technologies (ICT), 2013 IEEE Conference on, pp. 606-610.
5. Fayyad, Usama M. (1996). "Data mining and knowledge discovery: Making sense out of data." IEEE Intelligent Systems 11, no. 5: 20-25.
6. Haddad, Omid Bozorg, Abbas Afshar, and Miguel A. Mariño (2006). "Honey-bees mating optimization (HBMO) algorithm: a new heuristic approach for water resources optimization." Water Resources Management 20, no. 5: 661-680.
7. Jafar, O. A. Mohamed, and R. Sivakumar (2013). "A Comparative Study of Hard and Fuzzy Data Clustering Algorithms with Cluster Validity Indices."
8. Kang, In-Soo, Tae-wan Kim, and Ki-Joune Li (1997). "A spatial data mining method by Delaunay triangulation." In Proceedings of the 5th ACM International Workshop on Advances in Geographic Information Systems, pp. 35-39. ACM.
9. Kiran, P. Premchand, and T. Venu Gopal. "Mining of spatial co-location pattern from spatial datasets." International Journal of Computer Applications 42, no. 21: 25-30.
10. Lan, Rongqin, Wenzhong Shi, Xiaomei Yang, and Guangyuan Lin (2005). "Mining fuzzy spatial configuration rules: methods and applications." In IPRS Workshop on Service and Application of Spatial Data Infrastructure, pp. 319-324.
11. Lee, Sang Jun, and Keng Siau (2001). "A review of data mining techniques." Industrial Management & Data Systems 101, no. 1: 41-46.
12. Li, Sheng-Tun, Shih-Wei Chou, and Jeng-Jong Pan (2000). "Multi-resolution spatio-temporal data mining for the study of air pollutant regionalization." In System Sciences. Proceedings of the 33rd Annual Hawaii International Conference on, pp. 7-pp. IEEE.
13. Lyman, P., and Hal R. Varian (2003). "How much storage is enough?" Storage, 1:4.
14. Mennis, Jeremy, and Diansheng Guo (2009). "Spatial data mining and geographic knowledge discovery: An introduction." Computers, Environment and Urban Systems 33, no. 6: 403-408.
15. Osama Abu Abbas (2008). "Comparison between Data Clustering Algorithm." The International Arab Journal of Information Technology, Vol. 3, No. 3.
16. Pelczer, Ildiko, Judith Ramos, Ramón Domínguez, and Fernando González (2007). "Establishment of regional homogeneous zones in a watershed using clustering algorithms." Harmonizing the Demands of Art and Nature in Hydraulics, IAHR, Venice.
17. Pham, D. T., A. Ghanbarzadeh, E. Koc, S. Otri, S. Rahim, and M. Zaidi (2006). "The bees algorithm-a novel tool for complex optimisation problems." In Proceedings of the 2nd Virtual International Conference on Intelligent Production Machines and Systems (IPROMS 2006), pp. 454-459.
18. Sabar, Nasser R., Masri Ayob, Graham Kendall, and Rong Qu (2012). "A honey-bee mating optimization algorithm for educational timetabling problems." European Journal of Operational Research 216, no. 3: 533-543.
19. Saini, Geetinder, and Kamaljit Kaur (2014). "Regionalization as spatial data mining problem based on clustering: review."
20. Sharma, Lokesh Kumar, Simon Scheider, Willy Kloesgen, and Om Prakash Vyas (2008). "Efficient clustering technique for regionalisation of a spatial database." International Journal of Business Intelligence and Data Mining 3, no. 1: 66-81.
21. Shekhar, Shashi, Pusheng Zhang, Yan Huang, and Ranga Raju Vatsavai (2003). "Trends in spatial data mining." Data mining: Next generation challenges and future directions: 357-380.
22. Shumway, Robert H., and David S. Stoffer (2010). Time series analysis and its applications: with R examples. Springer.
23. Srinivas, P. V. V. S., Susanta K. Satpathy, Lokesh K. Sharma, and Ajaya K. Akasapu (2011). "Regionalisation as Spatial Data Mining Problem: A Comparative Study." Proc. International Journal of Computer Trends and Technology 18, no. 5: 577-589.
24. Sumathi, N., R. Geetha, and S. Sathiya Bama (2014). "Spatial Data Mining-Techniques Trends and Its Applications." Journal of Computer Applications 1, no. 4: 28.
25. Sundararajan, S., and S. Karthikeyan (2013). "A Study On Spatial Data Clustering Algorithms In Data Mining."
26. Teknomo, Kardi (2000). K-Means Clustering. http:\\people.revoledu.com\kardi\tutorial\kMean\
27. Teodorović, Dušan, and Mauro Dell'Orco (2005). "Bee colony optimization: a cooperative learning approach to complex transportation problems." In Advanced OR and AI Methods in Transportation: Proceedings of 16th MiniEURO Conference and 10th Meeting of EWGT (13-16 September 2005). Poznan: Publishing House of the Polish Operational and System Research, pp. 51-60.
28. Wang, Xin, and Howard Hamilton (2008). "Using clustering methods in geospatial information systems." In Geoinformatics and Joint Conference on GIS and Built Environment: Advanced Spatial Data Models and Analyses, pp. 71461N-71461N.
29. Xie, Caixiang, Shilin Chen, Fengmei Suo, Dan Yang, and Chengzhong Sun (2010). "Regionalization of Chinese medicinal plants based on spatial data mining." In Fuzzy Systems and Knowledge Discovery (FSKD), Seventh International Conference on, vol. 4, pp. 1647-1651. IEEE.
30. Xin Wang, Jing Wang (2009). Using Clustering methods in geospatial information systems. International Society for Optics and Photonics.
31. Yang, Xin-She, and Xingshi He (2013). "Firefly algorithm: recent advances and applications." International Journal of Swarm Intelligence 1, no. 1: 36-50.

Publications

THANK YOU !!
