III-IT-Data Mining Unit 1-Session 1-Part2
Data Mining
Unit I – INTRODUCTION
• Introduction – Different Kinds of Data
• Patterns Mined – Applications
• Attribute Types
• Data Preprocessing: Data Cleaning
• Data Integration
• Data Reduction
• Data Transformation
• Data Discretization
• Data Visualization
Different Kinds of Data
• In principle, data mining should be applicable to
any data repository (source)
• Relational databases
• Data warehouses
• Transactional databases
• Advanced database systems
Relational Databases
• Search for trends or data patterns
• Example:
• Predict the credit risk of customers based on their
income, age, and expenses.
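The credit-risk example can be sketched as a tiny classifier. The slide does not name a method, so this minimal sketch assumes a 1-nearest-neighbour rule; the training records, the `low`/`high` labels, and the function names are all hypothetical and invented for illustration.

```python
import math

# Hypothetical training records: (income, age, expenses) -> risk label.
# All values are invented for illustration only.
TRAINING = [
    ((60000, 45, 20000), "low"),
    ((55000, 38, 25000), "low"),
    ((20000, 23, 19000), "high"),
    ((25000, 30, 24000), "high"),
]


def _scale(record, mins, maxs):
    """Min-max scale each attribute to [0, 1] so income does not dominate age."""
    return tuple(
        (value - lo) / (hi - lo) if hi > lo else 0.0
        for value, lo, hi in zip(record, mins, maxs)
    )


def predict_risk(customer, training=TRAINING):
    """Return the label of the closest training record (1-nearest neighbour)."""
    columns = list(zip(*(features for features, _ in training)))
    mins = [min(column) for column in columns]
    maxs = [max(column) for column in columns]
    target = _scale(customer, mins, maxs)
    nearest = min(
        training,
        key=lambda row: math.dist(_scale(row[0], mins, maxs), target),
    )
    return nearest[1]


print(predict_risk((58000, 40, 21000)))  # low
print(predict_risk((22000, 25, 21000)))  # high
```

In practice the records would come from the relational tables themselves, and a real model would be trained and validated on far more data than four rows.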
Data Warehouses
• A data warehouse (DW) is a repository of
information collected from multiple sources,
stored under a unified schema.
• Data warehouse tools support data analysis
• Data mining tools are required to allow more in-depth
and automated analysis
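The idea of "multiple sources, stored under a unified schema" can be illustrated with a short sketch. The two source systems, their field names, and the currency units below are assumptions made up for this example, not part of the original slides.

```python
from collections import defaultdict

# Two hypothetical source systems with different field names and units.
STORE_SALES = [
    {"branch": "Chennai", "amount_inr": 1200.0},
    {"branch": "Madurai", "amount_inr": 800.0},
]
ONLINE_SALES = [
    {"city": "Chennai", "total_paise": 50000},
]


def to_unified(store_rows, online_rows):
    """Map both sources onto one schema: city, amount_inr, channel."""
    unified = []
    for row in store_rows:
        unified.append(
            {"city": row["branch"], "amount_inr": row["amount_inr"], "channel": "store"}
        )
    for row in online_rows:
        # Convert paise to rupees so every row uses the same unit.
        unified.append(
            {"city": row["city"], "amount_inr": row["total_paise"] / 100, "channel": "online"}
        )
    return unified


def sales_by_city(rows):
    """A simple warehouse-style summary: total sales per city."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["city"]] += row["amount_inr"]
    return dict(totals)


warehouse = to_unified(STORE_SALES, ONLINE_SALES)
print(sales_by_city(warehouse))
```

Summaries like `sales_by_city` are the kind of analysis warehouse tools already support; data mining would go further and look for patterns in the unified rows automatically.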
Transactional Databases
• Basic analysis (examples)
• Show me all the items purchased by Kumar.
• How many transactions include item number 5?
• Data mining goes further: which items sold well together?
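The three questions above can be sketched against a small in-memory transaction list. The transactions, customer names other than Kumar, and function names are hypothetical; counting item pairs is only a simplified stand-in for full frequent-itemset mining.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: (transaction id, customer, set of item numbers).
TRANSACTIONS = [
    (1, "Kumar", {1, 5, 7}),
    (2, "Priya", {5, 7}),
    (3, "Kumar", {2, 5}),
    (4, "Ravi", {1, 7}),
]


def items_purchased_by(customer):
    """Basic retrieval: every item the customer bought in any transaction."""
    items = set()
    for _, name, basket in TRANSACTIONS:
        if name == customer:
            items |= basket
    return items


def count_transactions_with(item):
    """Basic retrieval: number of transactions that contain the item."""
    return sum(1 for _, _, basket in TRANSACTIONS if item in basket)


def frequent_pairs(min_count=2):
    """Mining step: item pairs that occur together in at least min_count baskets."""
    pair_counts = Counter()
    for _, _, basket in TRANSACTIONS:
        pair_counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


print(sorted(items_purchased_by("Kumar")))  # [1, 2, 5, 7]
print(count_transactions_with(5))           # 3
print(frequent_pairs())                     # pairs (1, 7) and (5, 7), each seen twice
```

The first two functions are ordinary lookups that a query language handles directly; the third has to examine combinations of items, which is why it is treated as a mining task.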
Advanced Database Systems
• Advanced database systems provide tools for
handling complex data
• Spatial data (e.g., maps)
• Engineering design data (e.g., buildings, system
components)
• Hypertext and multimedia data (text, image, audio,
and video)
• Time-related data (e.g., historical records)
• Stream data (e.g., video surveillance and sensor data)
• The World Wide Web, a huge, widely distributed
information repository made available by the Internet
Data Mining on WWW
• Web usage mining (user access patterns)
• Better marketing decisions (advertisements, user profiles)
• Authoritative Web page analysis
• Ranking Web pages based on their importance
• Automated Web page clustering and
classification
• Group and arrange web pages based on their content
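Ranking pages by importance can be sketched with a link-based score. The slides do not name an algorithm, so this minimal sketch assumes a PageRank-style power iteration; the four-page link graph, the damping factor of 0.85, and the iteration count are illustrative assumptions.

```python
# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing rank along outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page without outgoing links spreads its rank over all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


ranks = pagerank(LINKS)
ranking = sorted(ranks, key=ranks.get, reverse=True)
print(ranking)  # ['C', 'A', 'B', 'D']
```

Page C ranks first because three pages link to it, while D ranks last because nothing links to it. The sketch assumes every linked page also appears as a key in `LINKS`.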
Summary
• Relational databases
• Data warehouses
• Transactional databases
• Advanced database systems