Data Mining
Article in International Journal of Advance Research in Computer Science and Management · January 2015
3 authors:
Somanjoli Mohapatra
St. Claret College, Bangalore
All content following this page was uploaded by Prakash Chandra Behera on 10 April 2020.
International Journal of Emerging Trends in Engineering Research (IJETER), Vol. 3 No.6, Pages : 157- 167 (2015)
Special Issue of NCTET 2K15 - Held on June 13, 2015 in SV College of Engineering, Tirupati
http://warse.org/IJETER/static/pdf/Issue/NCTET2015sp32.pdf
2.3 Predictive Modelling:
This model permits the value of one variable to be predicted from the known values of other variables.
mining process such as query-driven systems, interactive exploratory systems, or autonomous systems. A comprehensive system would provide a wide variety of data mining techniques to fit different situations and options.
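As an illustration of the predictive modelling idea in 2.3, the following minimal sketch predicts one variable from the known values of another with a least-squares line. The data (advertising spend and sales) and the function names `fit_line` and `predict` are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the unknown variable from a known value x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: advertising spend (known) and sales (to be predicted)
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

model = fit_line(spend, sales)
predicted_sales = predict(model, 5.0)
```

Any other predictive technique could take the place of the straight line; the point is only that the model is fitted on known values and then applied to a new one.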
ISSN 2347 - 3983
4.2 Data Understanding:
This phase starts with an initial data collection and proceeds with activities to get familiar with the data, to identify data quality problems, to discover first insights into the data, or to detect interesting subsets to form hypotheses about hidden information.
4.2.1 Collect initial data
Acquire within the project the data listed in the project resources. This initial collection includes data loading if necessary for data understanding. This effort may lead to initial data preparation steps.
4.2.2 Describe data
Examine the properties of the acquired data and report on the results.
4.2.3 Explore data
This task tackles the data mining questions that can be addressed using querying, visualization and reporting. These analyses may address the data mining goals directly.
4.2.4 Verify data quality
Examine the quality of the data.
4.3 Data Preparation:
This phase brings together the different data sets and carries out the activities needed to construct the data for modelling from the initial raw data.
4.3.1 Select data
Decide on the data to be used for analysis. Criteria include relevance to the data mining goals, quality, and technical constraints such as limits on data volume or data types.
4.3.2 Clean data
Data cleaning may involve selection of clean subsets of the data, the insertion of suitable defaults, or more ambitious techniques such as replacing the dirty data with derived values or building separate models for those entities that possess dirty data.
4.3.3 Construct data
This task includes constructive data preparation operations such as the production of derived attributes, entire new records, or transformed values for existing attributes.
4.3.4 Integrate data
Two methods used for integrating data are merging data and generating aggregate values. In these methods information is combined from multiple tables or other information sources to create new records or values.
4.3.5 Format data
Formatting transformations refer to primarily syntactic modifications made to the data that do not change its meaning, but might be required by the modeling tool.
4.4 Modelling:
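As a rough illustration of the modelling phase, the sketch below separates a small data set into training and test sets, builds a one-threshold model on the training set, and estimates its quality on the test set. The data, the function names (`split_holdout`, `build_stump`, `accuracy`) and the 25% test fraction are assumptions made for this example; they are not prescribed by CRISP-DM.

```python
import random


def split_holdout(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and separate them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def build_stump(train):
    """Pick the threshold on x that classifies the training rows best."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in train}):
        correct = sum((x >= threshold) == label for x, label in train)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the rule x >= threshold."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


# Hypothetical data: (feature value, label), label is True when value >= 50
data = [(v, v >= 50) for v in range(0, 100, 5)]

train, test = split_holdout(data)        # test design
model = build_stump(train)               # build model on the training set
test_accuracy = accuracy(model, test)    # assess model on unseen data
```

Keeping the test rows out of model building is what allows the test accuracy to say something about stability on unseen data.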
4.4.1 Select modeling technique
As the first step in modeling, select the actual modeling technique to be used. If a tool was selected in business understanding (Phase 1), this task refers to selecting the specific modeling technique, e.g., building decision trees or generating a neural network.
4.4.2 Generate test design
Prior to building a model, a procedure needs to be defined to test the model's quality and validity. If the test design specifies that the dataset should be separated into training and test sets, the model is built on the training set and its quality is estimated on the test set.
4.4.3 Build model
The purpose of building models is to use the predictions to make more informed business decisions. The most important goal when building a model is stability, which means that the model should make predictions that will hold true when it is applied to yet unseen data.
4.4.4 Assess model
The model should now be assessed to ensure that it meets the data mining success criteria and passes the desired test criteria. This step is a purely technical assessment based on the outcome of the modeling tasks.
4.5 Evaluation:
In this stage the model is thoroughly evaluated, and the steps executed to construct it are reviewed, to be certain that it properly achieves the business objectives. At the end of this phase, a decision on the use of the data mining results should be reached.
4.5.1 Evaluate results
Previous evaluation steps dealt with factors such as the accuracy and generality of the model. This step assesses the degree to which the model meets the business objectives and seeks to determine if there is some business reason why this chosen model is deficient.
4.5.2 Review process
At this point the resultant model appears to be satisfactory and to satisfy business needs. At this stage of data mining, the review process takes the form of quality assurance.
4.5.3 Determine next steps
According to the assessment results and the process review, the analyst decides how to proceed at this stage. The analyst needs to decide whether
• to finish the project and move on to deployment (Phase 6)
• to initiate further iterations or
• to set up new data mining projects.
4.6 Deployment:
This phase is used to increase knowledge; further, the knowledge is organized and presented in a way that the customer can use it. The deployment phase can be as simple as generating a report or as complex as implementing a repeatable data mining process across the enterprise.
4.6.1 Plan deployment
To deploy the data mining result(s) into the business, this task takes the evaluation results and develops a strategy for deployment. If a general procedure was identified to create the relevant model(s), this procedure is documented here for later deployment.
4.6.2 Plan monitoring and maintenance
Monitoring and maintenance are important issues if the data mining result becomes part of the day-to-day business and its environment. To monitor the deployment of the data mining result(s), the project needs a detailed plan on the monitoring process.
4.6.3 Produce final report
At the end of the project, the project leader and the team write up a final report. Depending on the deployment plan, this report may be only a summary of the project and its experiences or it may be a final and comprehensive presentation of the data mining result(s).
4.6.4 Review project
Assess what went right and what went wrong, what was done well and what needs to be improved.
5. VISUALIZING DATA MINING MODEL
The main objective of data visualization is to give an overall idea of the data mining model. In data mining, most of the time we retrieve data from repositories in which it is held in a hidden form, so visualization of the data mining model helps to provide levels of understanding and trust. The data mining models are of two types: Predictive and Descriptive.
5.1 Predictive Model:
It makes predictions about unknown data values by using the known values, e.g., classification, regression, time series analysis and prediction. Many data mining applications aim to predict the future state of the data. Prediction is the process of analysing the current and past states of an attribute and predicting its future state.
Classification is a technique for mapping the target data to predefined groups or classes. It is supervised learning because the classes are predefined before the target data is examined. Regression involves learning a function that maps a data item to a real-valued prediction variable. In time series analysis, the value of an attribute is examined as it varies over time. Many statistical techniques, such as autoregression methods, are used to analyse time-series data.
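The autoregression idea mentioned in 5.1 for time series analysis can be sketched as a first-order model in which each value is predicted from the previous one. This is a minimal illustration under stated assumptions: the series is generated artificially, and the names `fit_ar1` and `forecast` are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast(series, steps, c, phi):
    """Predict the next `steps` values from the last observed value."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


# Hypothetical series generated by x[t] = 2 + 0.5 * x[t-1]
history = [10.0]
for _ in range(9):
    history.append(2 + 0.5 * history[-1])

c, phi = fit_ar1(history)
next_values = forecast(history, 3, c, phi)
```

Here the current and past states of the attribute are used to estimate `c` and `phi`, and the fitted relation is then applied forward to predict future states.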
5.2 Descriptive model:
It identifies patterns or relationships in data and explores the properties of the data examined, e.g., clustering, summarization, association rules and sequence discovery.
Clustering means analysing data objects without consulting known class labels. It is also referred to as unsupervised learning or segmentation. It is the partitioning or segmentation of the data into groups or clusters. The clusters are defined by domain experts who study the behaviour of the data. Segmentation is the process of partitioning a database into disjoint groupings of similar tuples. Summarization is the technique of presenting summarized information from the data. Association rules find the associations between different attributes. Association rule mining is a two-step process: finding all frequent item sets, then generating strong association rules from the frequent item sets. Sequence discovery is the process of finding sequence patterns in data. These sequences can be used to understand trends.
6. METHODS OF DATA MINING
Data mining methods are broadly classified as: On-Line Analytical Processing (OLAP), Classification, Clustering, Association Rule Mining, Temporal Data Mining, Time Series Analysis, Spatial Mining, Web Mining etc. These methods use different types of algorithms and data. The data source can be a data warehouse, database, flat file or text file. The algorithms may be statistical, decision tree based, nearest neighbour, neural network based, genetic algorithm based, rule based, support vector machine etc.
6.1 Classification:
It is learning a function that maps (classifies) a data item into one of several predefined classes.
6.2 Regression:
It is learning a function that maps a data item to a real-valued prediction variable.
6.3 Clustering:
It is a common descriptive task where one seeks to identify a finite set of categories or clusters to describe the data.
6.4 Summarization:
It involves methods for finding a compact description for a subset of data.
6.5 Dependency modelling:
It consists of finding a model that describes significant dependencies between variables. Dependency models exist at two levels: (1) the structural level of the model specifies which variables are locally dependent on each other and (2) the quantitative level of the model specifies the strengths of the dependencies using some numeric scale.
6.6 Change and deviation detection:
It focuses on discovering the most significant changes in the data from previously measured or normative values.
6.7 Decision trees and rules:
Decision trees and rules that use univariate splits have a simple representational form, making the inferred model relatively easy for the user to comprehend. However, the restriction to a particular tree or rule representation can significantly restrict the functional form (and, thus, the approximation power) of the model.
6.8 Nonlinear Regression and Classification Methods:
These methods consist of a family of techniques for prediction that fit linear and nonlinear combinations of functions to combinations of the input variables.
6.9 Probabilistic Graphic Dependency Models:
Graphic models specify probabilistic dependencies using a graph structure. The model specifies which variables are directly dependent on each other. Typically, these models are used with categorical or discrete-valued variables, but extensions to special cases, such as Gaussian densities, for real-valued variables are also possible.
6.10 Relational Learning Models:
Relational learning (also known as inductive logic programming) uses the more flexible pattern language of first-order logic. A relational learner can easily find formulas such as X = Y. Most research to date on model-evaluation methods for relational learning is logical in nature.
Generally, data mining algorithms depend on two factors:
(i) which type of data sets are used
(ii) what requirements the user has
The knowledge discovery (KD) process involves pre-processing data, choosing a data-mining algorithm, and post-processing the mining results. Intelligent Discovery Assistants (IDA) help users in applying valid knowledge discovery processes.
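The three KD stages (pre-processing, mining, post-processing) can be sketched with the two-step association rule mining of section 5.2 as the chosen algorithm. This is only an illustrative sketch: the transactions, the thresholds, and the function names (`preprocess`, `frequent_itemsets`, `strong_rules`) are assumptions, and the level-wise candidate generation is a simplified variant rather than a tuned implementation.

```python
from itertools import combinations


def preprocess(raw_transactions):
    """Pre-processing: normalise item names and drop empty transactions."""
    cleaned = [frozenset(item.strip().lower() for item in t if item.strip())
               for t in raw_transactions]
    return [t for t in cleaned if t]


def frequent_itemsets(transactions, min_support):
    """Step 1: find all item sets whose support reaches min_support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    size = 1
    candidates = [frozenset([i]) for i in items]
    while candidates:
        level = {}
        for c in candidates:
            support = sum(c <= t for t in transactions) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        size += 1
        # Join frequent item sets of this level into larger candidates
        candidates = list({a | b for a in level for b in level
                           if len(a | b) == size})
    return frequent


def strong_rules(frequent, min_confidence):
    """Step 2: generate rules A -> B whose confidence reaches min_confidence."""
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), k)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent),
                                  set(itemset - antecedent),
                                  confidence))
    return rules


# Hypothetical raw shopping data with inconsistent spelling and an empty row
raw = [["Bread", "milk "], ["bread", "butter"],
       ["bread", "milk", "butter"], ["milk"], []]

transactions = preprocess(raw)
frequent = frequent_itemsets(transactions, min_support=0.5)
# Post-processing: keep only strong rules, strongest first
rules = sorted(strong_rules(frequent, min_confidence=0.8), key=lambda r: -r[2])
```

With these made-up transactions the only rule that survives both thresholds is butter -> bread, since every transaction containing butter also contains bread.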
7. DATA MINING ELEMENTS
• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Provide data access to business analysts and information technology professionals.
• Analyze the data by application software.
• Present the data in a useful format, such as a graph or table.
8. COMPONENT OF DATA MINING ALGORITHM
There are three primary components in a data-mining algorithm: model representation, model evaluation, and search.
dimensions of predictive accuracy, novelty, utility, and understandability of the fitted model.
8.3 Search method:
It consists of two components: (1) parameter search and (2) model search. Once the model representation and the model-evaluation criteria are fixed, the data-mining problem has been reduced to purely an optimization task: find the parameters and models from the selected family that optimize the evaluation criteria. In parameter search, the algorithm must search for the parameters that optimize the model-evaluation criteria given observed data and a fixed model representation.
9. KNOWLEDGE DISCOVERY FROM DATABASE (KDD)
There is an urgent need for a new generation of computational theories and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volumes of digital data. The main KDD application areas are marketing, finance or investment, fraud detection, manufacturing, telecommunications, Internet agents etc. The term data mining has mostly been used by statisticians, data analysts, and the management information systems (MIS) communities. It has also gained popularity in the database field.
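The parameter search of section 8.3 can be illustrated with a fixed model representation (a line through the origin, y = w * x), a fixed evaluation criterion (mean squared error), and a simple grid of candidate parameters. The observed data, the grid, and the names `mean_squared_error` and `parameter_search` are assumptions made for this sketch; real algorithms typically use more efficient optimizers than an exhaustive grid.

```python
def mean_squared_error(w, data):
    """Model-evaluation criterion for the fixed representation y = w * x."""
    return sum((y - w * x) ** 2 for x, y in data) / len(data)


def parameter_search(data, candidates):
    """Return the candidate parameter that optimizes the criterion."""
    return min(candidates, key=lambda w: mean_squared_error(w, data))


# Hypothetical observed data, roughly following y = 2 * x
observed = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]

# Candidate parameters 0.0, 0.1, ..., 4.0
grid = [i / 10 for i in range(0, 41)]

best_w = parameter_search(observed, grid)
```

Model search would add an outer loop over alternative representations, each of which is then handed to a parameter search of this kind.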
9.1 Develop an application:
Develop an understanding of the application domain and the relevant prior knowledge, and identify the goal of the KDD process from the customer's viewpoint.
9.2 Create a target data set:
Select a data set, or focus on a subset of variables or data samples, on which discovery is to be performed.
visualization of the extracted patterns and models or visualization of the data given the extracted models.
9.9 Working on the discovered knowledge:
Using the knowledge directly, incorporating the knowledge into another system for further action, or simply documenting it and reporting it to interested parties. This process also includes checking for and resolving potential conflicts with previously believed (or extracted) knowledge.
• The right model for a given application can only be discovered by experiment, or "There is No Free Lunch for the Data Miner" (NFL-DM)
• There are always patterns (Watkins' Law)
• Data mining amplifies perception in the business domain (Insight Law)
• Prediction increases information locally by generalisation (Prediction Law)
• The value of data mining results is not determined by the accuracy or stability of predictive models (Value Law)
• All patterns are subject to change (Law of Change)
12. DATA MINING APPLICATIONS
12.1 Data Mining Applications in Healthcare
The success of healthcare data mining hinges on the availability of clean healthcare data. In this respect, it is critical that the healthcare industry look into how data can be better captured, stored, prepared and mined. Possible directions include the standardization of clinical vocabulary and the sharing of data across organizations to enhance the benefits of healthcare data mining applications. As healthcare data are not only quantitative, it is also necessary to explore the use of text mining to expand the scope and nature of what healthcare data mining can currently do. Text mining is used in particular to combine all the data and then mine the text.
12.2 Data mining for market basket analysis
Data mining techniques are used in Market Basket Analysis (MBA). When customers buy products, this technique helps to find the associations between the different items that they put in their shopping baskets. The discovery of such associations promotes the business. In this way retailers use data mining to identify customers' intentions. The technique thus serves the profits of the business and also helps customers to purchase related items.
12.3 The data mining in education system
With a huge number of higher education aspirants, we believe that data mining technology can help bridge the knowledge gap in higher educational systems. The hidden patterns, associations, and anomalies that are discovered by data mining techniques from educational data can improve decision making processes in higher educational systems. This improvement can bring advantages such as maximizing educational system efficiency, decreasing the student drop-out rate, increasing the student promotion, retention and transition rates, increasing the educational improvement ratio, student success and student learning outcomes, and reducing the cost of system processes. In the current era we use KDD and data mining tools for extracting knowledge. This knowledge can be used for improving the quality of education.
12.4 Data mining in manufacturing engineering
When data is retrieved from a manufacturing system, the customer uses it for different purposes: to find errors in the data, to enhance the design methodology, to improve the quality of the data, and to determine how best the data can support decision making. Most of the time, the data is first analysed; then, after the hidden patterns have been found, the manufacturing process can be controlled to enhance the quality of the products.
12.5 Data Mining Applications can be generic or domain specific.
A data mining system can be applied generically or to a specific domain. A multi-agent based data mining application has the capability of automatically selecting the data mining technique to be applied. The Multi Agent System is used at different levels: first at the level of concept hierarchy definition, then at the result level to present the best adapted decision to the user. This decision is stored in the knowledge base for use in later decision-making. The Multi Agent System tool used for generic data mining system development uses different agents to perform different tasks.
12.6 A multi-tier data mining system
It consists of basic components such as the user interface, data mining services, data access services and the data. Three different architectures are presented for the data mining system, namely one-tier, two-tier and three-tier architecture. A generic system is required to integrate as many learning algorithms as possible and to decide the most appropriate algorithm to use. CORBA (Common Object Request Broker Architecture) allows reusability in a feasible way and makes it possible to build large and scalable systems.
12.7 Data mining technique in CRM
Work on data mining in CRM aims to give a research summary of the applications of data mining in the CRM domain and of the techniques that are most often used.
12.8 The Domain Specific Applications
Domain specific applications focus on using domain specific data and a data mining algorithm targeted at a specific objective. The applications aim to generate specific knowledge. In different domains the data generating sources generate different types of data. Data can range from simple text and numbers to more complex audio-video
data. To mine the patterns and thus knowledge from this data, different types of data mining algorithms are used.
12.9 In Medical Science
The use of data mining in health care is a widely used application of data mining. Medical data is complex and difficult to analyse. The REMIND (Reliable Extraction and Meaningful Inference from Non-structured Data) system integrates the structured and unstructured clinical data in patient records to automatically create high quality structured clinical data.
Data mining methods are used in web education to improve courseware. Relationships are discovered among the usage data picked up during students' sessions. This knowledge is very useful for the teacher or the author of the course, who can decide what modifications will be most appropriate to improve the effectiveness of the course.
12.11 The Intrusion Detection in the Network
Data mining methods are used to classify network traffic as normal or abnormal. If a TCP header does not belong to any of the existing TCP header clusters, it can be considered an anomaly. Data mining methods are also used to accurately detect malicious executables before they run.
12.12 Sports data mining
A huge number of games are played around the world, with national and international matches scheduled every day, so a huge amount of data has to be maintained. Data mining tools are applied to provide information as and when it is required. Data mining tools like WEKA and RapidMiner are frequently used for sport. In sports the data are available in statistical form, where data mining can be used to discover patterns; these patterns are often used to make forecasts. Data mining can be used for scouting, prediction of performance, selection of players, coaching and training, and strategy planning.
12.13 The Intelligence Agencies
Intelligence agencies collect and analyse information to investigate terrorist activities. One challenge to law enforcement and intelligence agencies is the difficulty of analysing the large volume of data involved in criminal and terrorist activities. Nowadays intelligence agencies use sophisticated data mining algorithms, which make it easier to handle very large databases. Different data mining techniques are used in crime data mining. Clustering techniques are used to group the different objects in crime records. Data mining detects and analyses the crime data. The classification technique is also used to detect email spamming and to find the person who sent the mail.
12.14 The data mining system in Internal Revenue Service
The data mining system implemented at the Internal Revenue Service to identify high-income individuals engaged in abusive tax shelters shows significantly good results. The major lines of investigation included visualization of the relationships and data mining to identify and rank possibly abusive tax avoidance transactions. Data mining techniques can also be used effectively to enhance product quality.
E-commerce is also one of the most promising domains for data mining because data records are plentiful, electronic collection provides reliable data, insight can easily be turned into action, and return on investment can be measured. The integration of e-commerce and data mining significantly improves the results and guides users in generating knowledge and making correct business decisions. This integration effectively solves several major problems associated with horizontal data mining tools, including the enormous effort required in pre-processing the data before it can be used for mining, and making the results of mining actionable.
12.16 The Digital Library Retrieves
Data mining can be applied in the field of the digital library, where the user finds, collects, stores and preserves data in digital form. The advent of electronic resources and their increased use in libraries has brought about significant changes in libraries. The data and information are available in different formats, including text, images, video, audio, pictures and maps; the digital library is therefore a suitable domain for the application of data mining.
12.17 The prediction in engineering applications
Prediction in engineering applications has been treated effectively by a data mining approach. Prediction problems include the cost estimation problem in engineering and the problem of engineering design, which involves decisions where parameters, actions, components, and so on are selected. Data mining techniques are applied to a variety of parameters in engineering applications, using prior data. Once the data is gathered, different models and algorithms can be generated to predict different characteristics.
13. CONCLUSION
In this paper we briefly reviewed the various data mining applications. This review would be helpful to researchers to
focus on the various issues of data mining. Most data mining applications in various fields use a variety of data types, ranging from text to images, stored in a variety of databases and data structures. Different methods of data mining are used to extract the patterns, and thus the knowledge, from these databases. Selection of data and methods for data mining is an important task in this process and needs knowledge of the domain. Several attempts have been made to design and develop a generic data mining system, but no system has been found to be completely generic.
Thus, for every domain the domain expert's assistance is mandatory. The domain experts shall be guided by the system to effectively apply their knowledge in using data mining systems to generate the required knowledge. The domain experts are required to determine the variety of data that should be collected in the specific problem domain, to select the specific data for data mining, to clean and transform the data, to extract patterns for knowledge generation and, finally, to interpret the patterns and generate knowledge. Most of the domain specific data mining applications show accuracy above 90%. Generic data mining applications have limitations. From the study of various data mining applications it is observed that no application called generic is 100% generic. Intelligent interfaces and intelligent agents make the application generic to some extent but have limitations.
The domain experts play an important role in the different stages of data mining. The decisions at different stages are influenced by factors such as domain and data details, the aim of the data mining, and the context parameters. The domain specific applications are aimed at extracting specific knowledge. The domain experts guide the system by considering the user's requirements and other context parameters. Therefore it is concluded that domain specific applications are the more suitable approach to data mining.