Unit 2: Introduction to Data Mining
Knowledge Discovery from Data (KDD) Process
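1. Data cleaning (to remove noise and inconsistent data)
2. Data integration (where multiple data sources may be combined)
3. Data selection (where data relevant to the analysis task are retrieved from the database)
4. Data transformation (where data are transformed and consolidated into forms appropriate for mining, for example by summary or aggregation operations)
5. Data mining (an essential process where intelligent methods are applied to extract data patterns)
6. Pattern evaluation (to identify the truly interesting patterns representing knowledge, based on interestingness measures)
7. Knowledge presentation (where visualization and knowledge representation techniques are used to present the mined knowledge to users)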
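As a rough illustration, a few of these steps (cleaning, selection, mining, and pattern evaluation) might be chained as in the Python sketch below. The toy transactions, the function names, and the use of pair support as the interestingness measure are all assumptions made for this example and do not come from the reference text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw transactions: (region, items). None marks a noisy record.
RAW = [
    ("north", ["milk", "bread"]),
    ("north", ["milk", "bread", "eggs"]),
    ("south", ["milk", "eggs"]),
    ("north", None),                       # noise: missing item list
    ("north", ["bread", "milk", "milk"]),  # inconsistency: duplicated item
    ("north", ["eggs"]),
]

def clean(records):
    """Data cleaning: drop records with no items and remove duplicate items."""
    return [(region, sorted(set(items))) for region, items in records if items]

def select(records, region):
    """Data selection: keep only the transactions relevant to the task."""
    return [items for r, items in records if r == region]

def mine_pairs(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(items, 2))
    return counts

def evaluate(counts, n_transactions, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in counts.items()
        if count / n_transactions >= min_support
    }

transactions = select(clean(RAW), "north")
patterns = evaluate(mine_pairs(transactions), len(transactions), min_support=0.5)
print(patterns)
```

Here support (the fraction of selected transactions that contain a pair) plays the role of the interestingness measure; a real analysis would typically combine several measures.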
Types of Data
• Database-oriented data sets and applications
– Relational database, data warehouse, transactional database
• Advanced data sets and advanced applications
– Data streams and sensor data
– Time-series data, temporal data, sequence data (incl. bio-sequences)
– Structured data, graphs, social networks and multi-linked data
– Object-relational databases
– Heterogeneous databases and legacy databases
– Spatial data and spatiotemporal data
– Multimedia database
– Text databases
– The World-Wide Web
Data Mining Functionalities
• Concept/Class Description: Characterization and Discrimination
• Classification
• Clustering
• Outlier Analysis
• Evolution Analysis
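To make three of these functionalities concrete, the sketch below shows deliberately simplified one-dimensional versions of classification, clustering, and outlier analysis in Python. The techniques chosen (nearest neighbor, a two-center k-means-style loop, and a z-score test), the function names, and the sample values are illustrative assumptions, not methods prescribed by the reference text.

```python
import statistics

def nearest_neighbor_classify(train, point):
    """Classification: give a point the label of its closest training example."""
    closest = min(train, key=lambda example: abs(example[0] - point))
    return closest[1]

def two_means_1d(values, iterations=10):
    """Clustering: split 1-D values into two groups around two moving centers.

    Assumes the values are not all identical.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        a = [v for v in values if abs(v - low) <= abs(v - high)]
        b = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = statistics.mean(a), statistics.mean(b)
    return sorted(a), sorted(b)

def z_score_outliers(values, threshold=2.0):
    """Outlier analysis: flag values far from the mean, in standard deviations."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical labeled examples: (value, class label).
train = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
label = nearest_neighbor_classify(train, 8.2)

clusters = two_means_1d([1, 2, 3, 10, 11, 12])
outliers = z_score_outliers([10, 12, 11, 13, 12, 95])
print(label, clusters, outliers)
```

Classification needs labeled examples, while clustering and outlier analysis work on unlabeled values; that difference is the main point of the comparison.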
Major Issues in Data Mining
• Mining methodology and user interaction issues:
• Performance issues:
Applications of Data Mining
• Financial Analysis
• Retail Industry
• Health care
• Telecommunication Industry
• Higher Education
• Criminal Investigation
• Intrusion Detection
• E-Commerce
• Research Analysis
• Watch out: Is everything “data mining”?
– Simple search and query processing
– (Deductive) expert systems
Reference
1. J. Han, M. Kamber - “Data Mining: Concepts and Techniques”, Morgan Kaufmann, 3rd Edition.