July 16, 2009 1 Data Mining
For example, AT&T handles billions of calls per day. Europe's Very
Long Baseline Interferometer (VLBI) has 16 telescopes, each of which
produces 1 gigabit/second of astronomical data over a 25-day
observation session.
Information
Data
Knowledge
6. Pattern evaluation
Interesting patterns are identified by thresholding interestingness measures.
7. Knowledge Discovery
Visualization and presentation methods are used to present
the mined knowledge to the user.
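The thresholding in step 6 can be sketched as a simple filter. This is a minimal illustration only: the rules, the choice of support and confidence as the interestingness measures, and the cutoff values are all assumptions made for the example, not taken from the slides.

```python
# Hypothetical mined patterns with two common interestingness measures.
patterns = [
    {"rule": "bread -> butter", "support": 0.30, "confidence": 0.80},
    {"rule": "milk -> eggs", "support": 0.05, "confidence": 0.90},
    {"rule": "tea -> sugar", "support": 0.25, "confidence": 0.40},
]

# Assumed cutoffs; in practice these are chosen by the analyst.
MIN_SUPPORT = 0.10
MIN_CONFIDENCE = 0.60


def interesting(candidates, min_support, min_confidence):
    """Keep only the patterns that meet both thresholds."""
    return [
        p for p in candidates
        if p["support"] >= min_support and p["confidence"] >= min_confidence
    ]


kept = interesting(patterns, MIN_SUPPORT, MIN_CONFIDENCE)
for p in kept:
    print(p["rule"])
```

Raising either threshold shrinks the set of patterns passed on to the presentation step.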
[Figure: knowledge discovery process. Databases -> Data Cleaning and Data Integration -> Warehouse -> Data Selection -> Task-relevant Data]
Data Mining Tasks
1. Classification
• Classification maps data into predefined groups or classes.
• A classifier may be represented by methods such as decision trees.
Decision tree
• A flowchart-like tree structure.
• Each node denotes a test of an attribute value.
• Each branch represents an outcome of the test.
• Leaves represent classes or class distributions.
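The decision tree structure described above can be sketched in a few lines. This is a hand-built toy tree, not one learned from data; the attribute names (outlook, humidity, windy) and class labels are hypothetical and chosen only for illustration.

```python
# Internal node: {"attribute": name, "branches": {outcome: subtree}}
# Leaf: a plain string holding the class label.
# All attributes and classes here are made up for the example.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "stay_in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "attribute": "windy",
            "branches": {"yes": "stay_in", "no": "play"},
        },
    },
}


def classify(node, record):
    """Walk from the root to a leaf and return the class label found there."""
    while isinstance(node, dict):
        value = record[node["attribute"]]  # test the attribute at this node
        node = node["branches"][value]     # follow the branch for that outcome
    return node


print(classify(tree, {"outlook": "sunny", "humidity": "normal", "windy": "no"}))
```

Each dictionary plays the role of a node, each key in "branches" is a test outcome, and each string is a leaf.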
3. Prediction
Many real-world applications can be seen as
predicting future data states based on
past and current data.
Example - Predicting flooding is a difficult problem.
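As a toy illustration of predicting a future value from past values, the sketch below fits a straight line by least squares and extrapolates one step ahead. The river-level readings are invented, and a linear trend is an assumption made for simplicity; real flood prediction is far harder, as noted above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past observations: day index and river level in metres.
days = [0, 1, 2, 3, 4]
levels = [2.0, 2.5, 3.0, 3.5, 4.0]

slope, intercept = fit_line(days, levels)
predicted = slope * 5 + intercept  # extrapolate to the next day
print(round(predicted, 2))
```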
Text
Images, video
Mixtures of data
DataMind -- neurOagent
Information Discovery -- IDIS
SAS Institute -- SAS/Neuronets
Data Mining Software
RapidMiner and Weka – Defining data mining process
Angoss software
Infor CRM Epiphany
Portrait Software
SAS
SPSS
ThinkAnalytics
Unica
Viscovery
Industry            Application
Finance             Credit card analysis
Insurance           Fraud analysis
Telecommunication   Call record analysis
1. Intelligent Miner
It is IBM's data mining product.
Its distinctive features include the scalability of its mining algorithms and tight
integration with IBM's DB2 relational database system.
5. DBMiner
Developed by DBMiner Technologies Inc.
Its distinctive feature is data-cube-based online analytical
mining.
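Data-cube-based analysis rests on aggregating a measure along chosen dimensions. The sketch below shows a roll-up over a tiny two-dimensional cube. It is not DBMiner's implementation; the cell values are invented and the helper name roll_up is hypothetical.

```python
from collections import defaultdict

# Hypothetical cube cells: (product, region) -> units sold. Values are made up.
cells = {
    ("Video", "Europe"): 120,
    ("Video", "India"): 80,
    ("Audio", "Europe"): 60,
    ("Audio", "Far East"): 90,
}


def roll_up(cube, axis):
    """Sum the measure over every dimension except the one at index `axis`."""
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)


by_product = roll_up(cells, 0)  # totals per product, regions collapsed
by_region = roll_up(cells, 1)   # totals per region, products collapsed
print(by_product)
print(by_region)
```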
[Figure: data cube with labels Household, Telecomm, Video, Audio and a Region axis with values Europe, Far East, India]