A3 DWDM
Syllabus
UNIT 1
Data Mining – Definition, Concept, Knowledge Discovery from Data process, Kinds of
data that can be mined: Database Data, Data Warehouse, Transactional Data; Patterns that can be
mined; Technologies Used; Applications of Data Mining; Major issues in Data Mining; Syntax
of Data Mining Query Language; Architecture of Data Mining Systems – Coupling
UNIT 2
DATA PREPROCESSING
Attribute Types; Basic Statistical Description of Data: Measuring the Central Tendency,
Measuring the Dispersion of Data, Graphic Display of Basic Statistical Description of Data; Data
Matrix and Dissimilarity Matrix, Proximity Measure; Minkowski Distance
Data Preprocessing – Why Preprocessing, Data Cleaning: Handling missing values, Data
smoothing techniques, Approaches to Data Cleaning; Data Integration: Redundancy and
Correlation Analysis; Data Reduction, Data Transformation and Data Discretization
UNIT 3
DATA WAREHOUSING AND ONLINE ANALYTICAL PROCESSING
Data Warehouse – Definition, key features, Architecture and models, Differences between
OLAP and OLTP; Schemas for Multidimensional Data Models, OLAP Operations; Data
Warehouse Design, Data Warehouse Implementation, Cube Materialization – Computation
Methods; Data Analysis in Cube Space – Exception-Based, Discovery-Driven Cube Space
Exploration
UNIT 4
CLASSIFICATION
Classification, Decision Tree, Classification by Decision Tree Induction, Bayesian Classification,
Rule-Based Classification, Classification by Backpropagation; Support Vector Machines, Lazy
Learners, Introduction to other classification methods – Genetic, Rough Set, Fuzzy Set approach;
Model Evaluations, Model Selection.
UNIT 5
CLUSTER ANALYSIS
Cluster Analysis & its requirements, Categorization of clustering methods, Partitioning Methods;
Hierarchical Methods, Density-Based Methods, Grid-Based Methods, Evaluation of Clustering.
Text Books:
1. Data Mining, Concepts and Techniques, Third Edition, Jiawei Han, Micheline Kamber,
Jian Pei.
References:
1. Data Mining: Concepts and Techniques, Han, Morgan Kaufmann Publishers.
2. Data Mining: Practical Machine Learning Tools and Techniques, 2010, Ian H. Witten
and Eibe Frank, Morgan Kaufmann Publishers.
Course Outcomes
The student shall:
1. Be able to define the basic terms of Data Mining and Data Warehousing, along with the
types of data processing techniques.
2. Gain knowledge of various warehousing tools and mining patterns.
3. Gain in-depth knowledge of Clustering and Classification methods.
4. Understand issues related to data and preprocessing techniques.
5. Understand the concepts of analytical processing and various pattern extraction
techniques.
6. Understand various clustering and classification techniques.
7. Apply the concepts of data mining models and analyze the application of each
methodology.
Knowledge Concepts:
UNIT – I:
Cluster – I:
UNIT – II:
Cluster – II:
DATA PREPROCESSING
CG-1:
KC-1: Attribute Types
KC-2: Basic Statistical Description of Data
Measuring the Central Tendency
Measuring the Dispersion of Data
Graphic Display of Basic Statistical Description of Data
KC-3: Data Matrix and Dissimilarity Matrix, Proximity Measure
KC-4: Minkowski Distance
CG-2:
KC-1: Data Preprocessing – Why Preprocessing
KC-2: Data Cleaning
Handling missing values
Data smoothing techniques
Approaches to Data Cleaning
KC-3: Data Integration
Redundancy and Correlation Analysis
KC-4: Data Reduction, Data Transformation and Data Discretization
UNIT – III:
Cluster – III:
DATA WAREHOUSING AND ONLINE ANALYTICAL PROCESSING
CG-1:
KC-1: Data Warehouse – Definition, key features, Architecture and models
KC-2: Differences between OLAP and OLTP
KC-3: Schemas for Multidimensional Data Models
KC-4: OLAP Operations
CG-2:
KC-1: Data Warehouse Design, Data Warehouse Implementation
KC-2: Cube Materialization – Computation Methods
KC-3: Data Analysis in Cube Space – Exception-Based, Discovery-Driven Cube Space
Exploration
KC-4: Data Analysis in Cube Space – Exception-Based, Discovery-Driven Cube Space
Exploration
Cluster – IV:
MINING FREQUENT PATTERNS, ASSOCIATIONS, AND CORRELATIONS
CG-1:
KC-1: Market Basket Analysis
KC-2: Frequent Itemset Mining
Apriori Algorithm
FP-Growth Algorithm
KC-3: Pattern Evaluation Methods
KC-4: Pattern Mining in Multilevel, Multidimensional Space
CG-2:
KC-1: Constraint-Based Frequent Pattern Mining – Constraints
KC-2: Colossal Patterns
KC-3: Semantic Annotation of Frequent Patterns
KC-4: Applications of Pattern mining
UNIT – IV:
Cluster – V:
CLASSIFICATION
CG-1:
KC-1: Classification, Decision Tree
KC-2: Classification by Decision Tree Induction, Bayesian Classification
KC-3: Rule-Based Classification
KC-4: Classification by Backpropagation
CG-2:
KC-1: Support Vector Machines
KC-2: Lazy Learners
KC-3: Introduction to other classification methods – Genetic, Rough Set, Fuzzy Set
approach
KC-4: Model Evaluations, Model Selection
UNIT – V:
Cluster – VI:
CLUSTER ANALYSIS
CG-1:
KC-1: Cluster Analysis & its requirements
KC-2: Categorization of clustering methods
KC-3: Partitioning Methods
KC-4: Partitioning Methods
CG-2:
KC-1: Hierarchical Methods
KC-2: Density-Based Methods
KC-3: Grid-Based Methods
KC-4: Evaluation of Clustering