Data Mining Outline

This document outlines a course on advanced data mining. The course aims to teach supervised learning techniques primarily for classification using real-life data sets in R. Over 16 weeks, topics will include classification algorithms like decision trees, naive Bayes, and neural networks; clustering; associations; evaluation methods; data preparation; and visualization. Students will complete exams, assignments, quizzes, and a semester project to assess their understanding of interpreting and applying advanced data mining models.

Uploaded by

Noureen Zafar

PIR MEHR ALI SHAH ARID AGRICULTURE UNIVERSITY

University Institute of Information Technology

CS-775 Advanced Data Mining

Credit Hours: 3(3-0) Prerequisites: Data Mining
Course Learning Outcomes (CLOs)
At the end of the course the students will be able to:

No.  Outcome                                                            Domain  BT Level*
1.   Explain fundamental database concepts.                             C       2
2.   Design conceptual, logical and physical database schemas           C       5
     using different data models.
3.   Identify functional dependencies and resolve database anomalies    C       2
     by normalizing database tables.
4.   Use Structured Query Language (SQL) for database definition        C       4
     and manipulation in any DBMS.

*BT = Bloom's Taxonomy, C = Cognitive domain, P = Psychomotor domain, A = Affective domain

Course Contents:
Topics to be covered include:
• Basic statistical ideas - populations, distributions, samples and random samples
• Classification models and methods, including: linear discriminant analysis; trees; random forests; neural nets; boosting and bagging approaches; support vector machines.
• Linear regression approaches to classification, compared with linear discriminant analysis.
• The training/test approach to assessing accuracy, and cross-validation.
• Strategies in the (common) situation where source and target population differ, typically in time but in other respects also.
• Unsupervised models - k-means, association rules, hierarchical clustering, model-based clusters.
• Low-dimensional views of classification results - distance methods and ordination.
• Strategies for working with large data sets.
• Practical approaches to classification with real-life data sets, using different methods to gain different insights into presentation.
• Privacy and security.
• Use of the R system for handling the calculations.
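
As a rough illustration of the training/test and cross-validation item in the list above, the following sketch estimates held-out accuracy with k-fold cross-validation. It is written in Python purely for illustration (the course itself uses R), the six data points are toy values, and the helper names `k_fold_indices`, `nearest_neighbor_predict` and `cross_val_accuracy` are hypothetical, not part of any course material.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, str]  # (single numeric feature, class label)


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k folds by striding (no shuffling, so results repeat)."""
    return [list(range(i, n, k)) for i in range(k)]


def nearest_neighbor_predict(train: Sequence[Point], x: float) -> str:
    """Predict the label of the training point closest to x (a 1-nearest-neighbor rule)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def cross_val_accuracy(data: Sequence[Point], k: int) -> float:
    """Average accuracy over k folds, each fold held out once as the test set."""
    accuracies = []
    for test_idx in k_fold_indices(len(data), k):
        held_out = set(test_idx)
        # Train only on points outside the current fold.
        train = [p for i, p in enumerate(data) if i not in held_out]
        test = [data[i] for i in test_idx]
        correct = sum(nearest_neighbor_predict(train, x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


# Toy data (an assumption for illustration): two well-separated classes.
data = [(1.0, "a"), (1.2, "a"), (0.8, "a"), (3.0, "b"), (3.1, "b"), (2.9, "b")]
print(cross_val_accuracy(data, k=3))
```

Each point is used for testing exactly once and never appears in the training set of its own fold, which is what separates this estimate from accuracy measured on the training data.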

Course Objective:
The main focus of the course will be supervised learning, primarily for classification. The
emphasis will be on practical applications of the methodologies that are described, with the R
system used for the computations. Attention will be given to
1) Generalizability and predictive accuracy, in the practical contexts in which methods are applied.
2) Low-dimensional visual representation of results, as an aid to diagnosis and insight.
3) Interpretability of model parameters, including potential for misinterpretation.

Lectures, Written Assignments, Practical labs, Semester Project, Presentations
Course Assessment:
Exams, Assignments, Quizzes. The course will be assessed using a combination of written examinations, assignments and quizzes.
Week Contents Theory
1 M1: Introduction: Machine Learning and Data Mining
• Data Flood
• Data Mining Application Examples
• Data Mining and Knowledge Discovery
• Data Mining Tasks

2 M2: Machine Learning and Classification
• Machine Learning and Classification
• Examples
• Learning as Search
• Bias
• Weka

3 M3: Input: Concepts, instances, attributes
• What is a concept?
• What is an example?
• What is an attribute?
• Preparing the data

4 M4: Output: Knowledge Representation
• Decision tables
• Decision trees
• Decision rules
• Rules involving relations
• Instance-based representation

5 M5: Classification - Basic methods
• OneR
• NaiveBayes

6 M6: Classification: Decision Trees
• Top-Down Decision Trees
• Choosing the Splitting Attribute
• Information Gain and Gain ratio

7 M7: Classification: C4.5
• Handling Numeric Attributes
o Finding Best Split
• Dealing with Missing Values
• Pruning
o Pre-pruning, Post-Pruning, Estimating Error Rates
• From Trees to Rules

8 M8: Classification: CART
• CART Overview and Gymtutor Tutorial Example
• Splitting Criteria
• Handling Missing Values
• Pruning
o Finding Optimal Tree

MID TERM

9 M9: Classification: more methods
• Rules
• Regression
• Instance-based (Nearest neighbor)

10 M10: Evaluation and Credibility
• Introduction
• Classification with Train, Test, and Validation sets
o Handling Unbalanced Data; Parameter Tuning
• *Predicting Performance
• Evaluation on "small data": Cross-validation
• *Bootstrap
• Comparing Data Mining Schemes
• *Choosing a Loss Function

11 M11: Evaluation - Lift and Costs
• Lift and Gains charts
• *ROC
• Cost-sensitive learning
• Evaluating numeric predictions
• MDL principle and Occam's razor

12 M12: Data Preparation for Knowledge Discovery
• Data understanding
• Data cleaning
• Data transformation
• Discretization
• False "predictors" (information leakers)
• Feature reduction, leaker detection
• Randomization
• Learning with unbalanced data

13 M13: Clustering
• Introduction
• K-means
• Hierarchical

14 M14: Associations
• Transactions
• Frequent itemsets
• Association rules
• Applications

15 M15: Visualization
• Graphical excellence and lie factor
• Representing data in 1, 2, and 3-D
• Representing data in 4+ dimensions
o Parallel coordinates
o Scatterplots
o Stick figures

16 M19: Data Mining and Society; Future Directions
• Data Mining and Society: Ethics, Privacy, and Security issues
• Future Directions for Data Mining
o Web mining, text mining, multi-media data
• Course Summary

Final Exam
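
The Week 6 topics "Choosing the Splitting Attribute" and "Information Gain and Gain ratio" can be made concrete with a small computation. The sketch below is one possible illustration in Python (the course itself uses R). The function names are hypothetical, and the 14-row toy table, a single nominal attribute with three values against a yes/no class, is an assumption chosen only so the arithmetic is easy to follow.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy, in bits, of the label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values: Sequence[str], labels: Sequence[str]) -> float:
    """Class entropy minus the weighted class entropy after splitting on the attribute."""
    n = len(labels)
    groups: Dict[str, List[str]] = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values: Sequence[str], labels: Sequence[str]) -> float:
    """Information gain divided by the entropy of the split itself (split information)."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


# Toy data: attribute value per row, with class counts
# sunny = 2 yes / 3 no, overcast = 4 yes / 0 no, rainy = 3 yes / 2 no.
outlook = ["sunny"] * 5 + ["overcast"] * 4 + ["rainy"] * 5
play = ["yes", "yes", "no", "no", "no"] + ["yes"] * 4 + ["yes", "yes", "yes", "no", "no"]

print(round(information_gain(outlook, play), 3))  # about 0.247 bits
print(round(gain_ratio(outlook, play), 3))        # about 0.156
```

Gain ratio divides by the split information so that attributes with many distinct values are not favored simply because they fragment the data into small, pure groups.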
