A3 DWDM

SEMESTER – VII

A3CIT201 Data Mining & Data Warehousing
L T P C: 3 0 0 3
Total Contact Hours: 45
Prerequisites: DWDM

Syllabus

UNIT 1
Data Mining – Definition, Concept; Knowledge Discovery from Data process; Kinds of
data that can be mined: Database Data, Data Warehouse, Transactional Data; Patterns that can be
mined; Technologies Used; Applications of Data Mining; Major Issues in Data Mining; Syntax
of Data Mining Query Language; Architecture of Data Mining Systems – Coupling

UNIT 2
DATA PREPROCESSING
Attribute Types; Basic Statistical Description of Data: Measuring the Central Tendency,
Measuring the Dispersion of Data, Graphic Display of Basic Statistical Description of Data; Data
Matrix and Dissimilarity Matrix; Proximity Measures; Minkowski Distance.
Data Preprocessing – Why Preprocessing; Data Cleaning: Handling Missing Values, Data
Smoothing Techniques, Approaches to Data Cleaning; Data Integration: Redundancy and
Correlation Analysis; Data Reduction, Data Transformation and Data Discretization

UNIT 3
DATA WAREHOUSING AND ONLINE ANALYTICAL PROCESSING
Data Warehouse – Definition, key features, Architecture and models, Differences between
OLAP and OLTP; Schemas for Multidimensional Data Models, OLAP Operations; Data
Warehouse Design, Data Warehouse Implementation, Cube Materialization-Computation
Methods; Data Analysis in Cube Space – Exception-Based, Discovery-Driven Cube Space
Exploration

MINING FREQUENT PATTERNS, ASSOCIATIONS, AND CORRELATIONS
Market Basket Analysis; Frequent Itemset Mining: Apriori Algorithm, FP-Growth Algorithm;
Pattern Evaluation Methods; Pattern Mining in Multilevel, Multidimensional Space;
Constraint-Based Frequent Pattern Mining – Constraints; Colossal Patterns; Semantic Annotation
of Frequent Patterns; Applications of Pattern Mining.

UNIT 4
CLASSIFICATION
Classification, Decision Tree, Classification by Decision Tree Induction, Bayesian Classification,
Rule-Based Classification, Classification by Backpropagation; Support Vector Machines, Lazy
Learners, Introduction to other classification methods: Genetic, Rough Set, and Fuzzy Set approaches;
Model Evaluation, Model Selection.

UNIT 5
CLUSTER ANALYSIS
Cluster Analysis & its requirements, Categorization of clustering methods, Partitioning Methods;
Hierarchical Methods, Density-Based Methods, Grid-Based Methods, Evaluation of Clustering.

Text Books:
1. Data Mining, Concepts and Techniques, Third Edition, Jiawei Han, Micheline Kamber,
Jian Pei.
Reference:
• Data Mining: Concepts and Techniques, Han, Morgan Kaufmann Publishers.
• Data Mining: Practical Machine Learning Tools and Techniques, 2010, by Ian H. Witten
and Eibe Frank, Morgan Kaufmann Publishers.
Course Outcomes
The student shall:
1. Be able to define the basic terms of Data Mining and Data Warehousing, along with
the types of data processing techniques.
2. Gain knowledge of various warehousing tools and mining patterns.
3. Gain in-depth knowledge of Clustering and Classification methods.
4. Understand issues related to data and preprocessing techniques.
5. Understand the concepts of analytical processing and various pattern extraction
techniques.
6. Understand various clustering and classification techniques.
7. Apply the concepts of data mining models and analyze the application of each
methodology.

Knowledge Concepts:
UNIT – I:
Cluster – I:

INTRODUCTION TO DATA MINING


CG1:
KC1: Data Mining – Definition, Concept
KC2: Knowledge Discovery from Data process
KC3: Kinds of data that can be mined
• Database Data
• Data Warehouse
• Transactional Data
KC4: Patterns that can be mined
CG2:
KC5: Technologies Used, Applications of Data Mining
KC6: Major Issues in Data Mining
KC7: Syntax of Data Mining Query Language
KC8: Architecture of Data Mining Systems – Coupling

UNIT – II:
Cluster – II:
DATA PREPROCESSING
CG1:
KC1: Attribute Types
KC2: Basic Statistical Description of Data
• Measuring the Central Tendency
• Measuring the Dispersion of Data
• Graphic Display of Basic Statistical Description of Data
KC3: Data Matrix and Dissimilarity Matrix, Proximity Measure
KC4: Minkowski Distance
CG2:
KC1: Data Preprocessing – Why Preprocessing
KC2: Data Cleaning
• Handling Missing Values
• Data Smoothing Techniques
• Approaches to Data Cleaning
KC3: Data Integration
• Redundancy and Correlation Analysis
KC4: Data Reduction, Data Transformation and Data Discretization

UNIT – III:
Cluster – III:
DATA WAREHOUSING AND ONLINE ANALYTICAL PROCESSING
CG-1:
KC-1: Data Warehouse – Definition, key features, Architecture and models
KC-2: Differences between OLAP and OLTP
KC-3: Schemas for Multidimensional Data Models
KC-4: OLAP Operations
CG-2:
KC-1: Data Warehouse Design, Data Warehouse Implementation
KC-2: Cube Materialization-Computation Methods
KC-3: Data Analysis in Cube Space – Exception-Based, Discovery-Driven Cube Space
Exploration

Cluster – IV:
MINING FREQUENT PATTERNS, ASSOCIATIONS, AND CORRELATIONS
CG-1:
KC-1: Market Basket Analysis
KC-2: Frequent Itemset Mining
• Apriori Algorithm
• FP-Growth Algorithm
KC-3: Pattern Evaluation Methods
KC-4: Pattern Mining in Multilevel, Multidimensional Space
CG-2:
KC-1: Constraint-Based Frequent Pattern Mining – Constraints
KC-2: Colossal Patterns
KC-3: Semantic Annotation of Frequent Patterns
KC-4: Applications of Pattern Mining

UNIT – IV:
Cluster – V:
CLASSIFICATION
CG-1:
KC-1: Classification, Decision Tree
KC-2: Classification by Decision Tree Induction, Bayesian Classification
KC-3: Rule-Based Classification
KC-4: Classification by Backpropagation
CG-2:
KC-1: Support Vector Machines
KC-2: Lazy Learners
KC-3: Introduction to other classification methods: Genetic, Rough Set, and Fuzzy Set
approaches
KC-4: Model Evaluation, Model Selection

UNIT – V:
Cluster – VI:
CLUSTER ANALYSIS
CG-1:
KC-1: Cluster Analysis & its requirements
KC-2: Categorization of clustering methods
KC-3: Partitioning Methods
CG-2:
KC-1: Hierarchical Methods
KC-2: Density-Based Methods
KC-3: Grid-Based Methods
KC-4: Evaluation of Clustering
