MLDA Syllabus

Handbook of B.Tech. Programmes offered by USICT at Affiliated Institutions of the University.

Data Analytics
L: 3   P: ‐   C: 3

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
EAE                         6          MLDA‐EAE   MLDA‐EAE‐2A   DA‐338T

Marking Scheme:
1. Teachers Continuous Evaluation: 25 marks
2. Term end Theory Examinations: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term end examinations question paper.
2. The first (1st) question should be compulsory and cover the entire syllabus. This question should be
objective, single‐line answer or short‐answer type, carrying a total of 15 marks.
3. Apart from question 1, which is compulsory, the rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in each unit. Individual questions may contain up to
5 sub‐parts / sub‐questions. Each unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions to be asked should be at the level of the prescribed textbook.
5. The requirement of (scientific) calculators / log‐tables / data tables may be specified if required.
Course Objectives:
1. To develop fundamental concepts such as data analysis and data pre‐processing.
2. To learn about the various data modelling techniques.
3. To learn three different mining techniques.
4. To gain exposure to data analytics with R.
Course Outcomes (CO)
CO 1 Discuss various concepts of data analytics
CO 2 Apply classification and regression techniques
CO 3 Explain and apply mining techniques on streaming data
CO 4 Describe the concept of R programming and implement analytics on Big data using R.
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: low, 2: Medium, 3: High)
CO     PO01  PO02  PO03  PO04  PO05  PO06  PO07  PO08  PO09  PO10  PO11  PO12
CO 1   1     ‐     3     ‐     2     ‐     ‐     ‐     ‐     ‐     ‐     ‐
CO 2   1     ‐     3     ‐     ‐     2     ‐     ‐     3     1     ‐     ‐
CO 3   1     ‐     2     ‐     ‐     2     ‐     ‐     ‐     ‐     ‐     ‐
CO 4   1     ‐     3     ‐     ‐     3     1     ‐     2     ‐     ‐     ‐

UNIT‐I

Introduction to Data Analytics: Sources and nature of data, classification of data (structured, semi‐structured,
unstructured), characteristics of data, introduction to Big Data platform, need of data analytics, evolution of
analytic scalability, analytic process and tools, analysis vs reporting, modern data analytic tools, applications of
data analytics. Data Analytics Lifecycle: Need, key roles for successful analytic projects, various phases of data
analytics lifecycle – discovery, data preparation, model planning, model building, communicating results, and
operationalization.

UNIT‐II

Data Analysis: Regression modeling, multivariate analysis, Bayesian modeling, inference and Bayesian
networks, support vector and kernel methods, analysis of time series: linear systems analysis & nonlinear
dynamics, rule induction, neural networks: learning and generalisation, competitive learning, principal
component analysis and neural networks.

Applicable from Batch Admitted in Academic Session 2021-22 Onwards Page 678

UNIT‐III

Mining Data Streams: Introduction to stream concepts, stream data model and architecture, stream
computing, sampling data in a stream, filtering streams, counting distinct elements in a stream, estimating
moments, counting ones in a window, decaying window, Real‐time Analytics Platform (RTAP) applications.
Case studies – real‐time sentiment analysis, stock market predictions.
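
As an illustration of the "sampling data in a stream" topic above, the following is a minimal Python sketch of reservoir sampling (Algorithm R), one common way to keep a fixed‐size uniform sample from a stream whose length is not known in advance. The syllabus does not prescribe a particular method or language; the function name `reservoir_sample` and the toy stream are assumptions made for this example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return up to k items chosen uniformly at random from an iterable.

    The stream is read once and only k items are held in memory.
    `reservoir_sample` is a hypothetical helper name for this sketch.
    """
    rng = rng or random.Random()
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (index + 1).
            slot = rng.randint(0, index)  # both ends inclusive
            if slot < k:
                reservoir[slot] = item
    return reservoir


# Toy stream of 10,000 readings (assumed data), sampled down to 5 items.
sample = reservoir_sample(range(1, 10001), k=5, rng=random.Random(42))
print(sample)
```

If the stream holds fewer than `k` items, the function simply returns all of them.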

UNIT‐IV

Frameworks and Visualization: MapReduce, Hadoop, Pig, Hive, HBase, MapR, sharding, NoSQL databases, S3,
Hadoop Distributed File System (HDFS). Visualization: visual data analysis techniques, interaction techniques, systems
and applications. Introduction to R: R graphical user interfaces, data import and export, attribute and data
types, descriptive statistics, exploratory data analysis, visualization before analysis, analytics for unstructured
data.

Textbook(s):
1. Jiawei Han, Micheline Kamber, “Data Mining: Concepts and Techniques”, Second Edition, Elsevier.
2. Michael Berthold, David J. Hand, “Intelligent Data Analysis”, Springer.

References:
1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business
Intelligence and Analytic Trends for Today's Businesses", Wiley.
2. David Dietrich, Barry Heller, Beibei Yang, “Data Science and Big Data Analytics”, EMC Education Series,
John Wiley

Data Analytics Lab
L: ‐   P: 2   C: 1

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
EAE                         6          MLDA‐EAE   MLDA‐EAE‐2A   DA‐338P

Marking Scheme:
1. Teachers Continuous Evaluation: 40 marks
2. Term end Theory Examinations: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to those of Data Analytics, as this is the practical
component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement under
intimation to the office of the Head of Department / Institution in which the paper is being offered from
the list of practicals below. At least 10 experiments must be performed by the students; they may be asked
to do more. At least 5 experiments must be from the given list.

1. To get input from the user and perform numerical operations (MAX, MIN, AVG, SUM, SQRT, ROUND)
in R.
2. To perform data import/export (.CSV, .XLS, .TXT) operations using data frames in R.
3. To get an input matrix from the user and perform matrix addition, subtraction, multiplication, inverse,
transpose and division operations using the vector concept in R.
4. To perform statistical operations (mean, median, mode and standard deviation) using R.
5. To perform data pre‐processing operations: i) handling missing data, ii) min‐max normalization.
6. To perform dimensionality reduction using PCA for the Houses data set.
7. To perform simple linear regression with R.
8. To perform K‐Means clustering operation and visualize for iris data set
9. Write R script to diagnose any disease using KNN classification and plot the results.
10. To perform market basket analysis using Association Rules (Apriori).
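
Experiment 5 above (handling missing data and min‐max normalization) can be sketched as follows. The lab itself is specified in R; Python is used here only for illustration, and the helper names `impute_mean` and `min_max_normalize`, the mean‐imputation strategy, and the sample values are all assumptions rather than part of the syllabus.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries.

    Mean imputation is only one possible strategy, chosen for this sketch.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column with one missing value.
ages = [20, None, 40, 30]
filled = impute_mean(ages)
scaled = min_max_normalize(filled)
print(filled)  # [20, 30, 40, 30]
print(scaled)  # [0.0, 0.5, 1.0, 0.5]
```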

Data Visualization
L: 3   P: ‐   C: 3

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
EAE                         6          MLDA‐EAE   MLDA‐EAE‐2B   DS‐340T
CST/ITE                     7          PCE        PCE‐5         CIE‐423T

Marking Scheme:
1. Teachers Continuous Evaluation: 25 marks
2. Term end Theory Examinations: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term end examinations question paper.
2. The first (1st) question should be compulsory and cover the entire syllabus. This question should be
objective, single‐line answer or short‐answer type, carrying a total of 15 marks.
3. Apart from question 1, which is compulsory, the rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in each unit. Individual questions may contain up to
5 sub‐parts / sub‐questions. Each unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions to be asked should be at the level of the prescribed textbook.
5. The requirement of (scientific) calculators / log‐tables / data tables may be specified if required.
Course Objectives:
1. To understand the key techniques and theory behind data visualization
2. To use effectively the various visualization structures (like tables, spatial data, tree and network etc.)
3. To evaluate information visualization systems and other forms of visual presentation for their
effectiveness
4. To design and build data visualization systems with box plots, heat maps etc.
Course Outcomes (CO)
CO 1 Understand the key techniques and theory behind data visualization
CO 2 Use effectively the various visualization structures (like tables, spatial data, tree and network etc.)
CO 3 Evaluate information visualization systems and other forms of visual presentation for their
effectiveness
CO 4 Design and build data visualization systems with box plots, heat maps etc.
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: low, 2: Medium, 3: High)
CO     PO01  PO02  PO03  PO04  PO05  PO06  PO07  PO08  PO09  PO10  PO11  PO12
CO 1   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3
CO 2   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3
CO 3   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3
CO 4   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3

UNIT‐I

Value of Visualization – What is Visualization and Why do it: External representation – Interactivity – Difficulty
in Validation. Data Abstraction: Dataset types – Attribute types – Semantics. Task Abstraction – Analyze,
Produce, Search, Query. Four levels of validation – Validation approaches – Validation examples. Marks and
Channels

UNIT‐II

Rules of thumb – Arrange tables: Categorical regions – Spatial axis orientation – Spatial layout density. Arrange
spatial data: Geometry – Scalar fields – Vector fields – Tensor fields. Arrange networks and trees: Connections,
Matrix views – Containment. Map color: Color theory, Color maps and other channels.

UNIT‐III

Manipulate view: Change view over time – Select elements – Changing viewpoint – Reducing attributes. Facet
into multiple views: Juxtapose and Coordinate views – Partition into views – Static and Dynamic layers –
Reduce items and attributes: Filter – Aggregate. Focus and context: Elide – Superimpose – Distort – Case
studies.

UNIT‐IV

Applied Visualizations: Box plot ‐ Density Plot ‐ Area Chart ‐ Heat map ‐ Tree map ‐ Graph Networks
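
To make the box plot topic above concrete, here is a minimal Python sketch of the summary statistics a box plot typically draws. It assumes the common Tukey convention (whiskers reach the most extreme points within 1.5 × IQR of the quartiles) and the "inclusive" quartile method of Python's `statistics.quantiles`; neither choice is specified by the syllabus, and `box_plot_stats` and the sample data are hypothetical.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Compute quartiles, whisker ends and outliers for a box plot.

    Assumes the Tukey 1.5 * IQR rule for whiskers and outliers.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(stats)
```

For this sample the box spans 3.0 to 7.0 with the median at 5.0, the whiskers end at 1 and 8, and 30 is flagged as an outlier.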

Textbook(s):
1. Tamara Munzner, Visualization Analysis and Design, A K Peters Visualization Series, CRC Press, 2014.
2. Scott Murray, Interactive Data Visualization for the Web, O’Reilly, 2013.

References:
1. Alberto Cairo, The Functional Art: An Introduction to Information Graphics and Visualization, New Riders,
2012
2. Nathan Yau, Visualize This: The FlowingData Guide to Design, Visualization and Statistics, John Wiley &
Sons, 2011.

Data Visualization Lab
L: ‐   P: 2   C: 1

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
EAE                         6          MLDA‐EAE   MLDA‐EAE‐2B   DS‐340P
CST/ITE                     7          PCE        PCE‐5         CIE‐423P

Marking Scheme:
1. Teachers Continuous Evaluation: 40 marks
2. Term end Theory Examinations: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to those of Data Visualization, as this is the
practical component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement under
intimation to the office of the Head of Department / Institution in which the paper is being offered from
the list of practicals below. At least 10 experiments must be performed by the students; they may be asked
to do more. At least 5 experiments must be from the given list.

1. Loading and Distinguishing Dependent and Independent parameters
2. Exploring Data Visualization tools
3. Drawing Charts
4. Drawing Graphs
5. Data mapping
6. Creating Scatter Plot maps
7. Using BNF Notations
8. Working with REGEX
9. Visualize Network Data
10. Understanding Data Visualization frameworks

Machine Learning
L: 3   P: ‐   C: 3

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
ECE                         6          PCE        PCE‐3         ECE‐350T
EAE                         6          MLDA‐EAE   MLDA‐EAE‐2C   ML‐342T
CSE/IT/CST/ITE              7          PCE        PCE‐5         CIE‐421T
CSE‐AIML                    7          PC         PC            ML‐407T
EAE                         7          AIML‐EAE   AIML‐EAE‐3    ML‐407T

Marking Scheme:
1. Teachers Continuous Evaluation: 25 marks
2. Term end Theory Examinations: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term end examinations question paper.
2. The first (1st) question should be compulsory and cover the entire syllabus. This question should be
objective, single‐line answer or short‐answer type, carrying a total of 15 marks.
3. Apart from question 1, which is compulsory, the rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in each unit. Individual questions may contain up to
5 sub‐parts / sub‐questions. Each unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions to be asked should be at the level of the prescribed textbook.
5. The requirement of (scientific) calculators / log‐tables / data tables may be specified if required.
Course Objectives:
1. To understand the need for machine learning
2. To learn about regression and feature selection
3. To understand classification algorithms
4. To learn clustering algorithms
Course Outcomes (CO)
CO 1 To formulate machine learning problems
CO 2 Learn about regression and feature selection techniques
CO 3 Apply machine learning techniques such as classification to practical applications
CO 4 Apply clustering algorithms
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: low, 2: Medium, 3: High)
CO     PO01  PO02  PO03  PO04  PO05  PO06  PO07  PO08  PO09  PO10  PO11  PO12
CO 1   3     3     3     3     3     2     2     ‐     ‐     ‐     ‐     2
CO 2   3     3     3     3     3     2     2     ‐     ‐     ‐     ‐     2
CO 3   3     3     3     3     3     2     2     ‐     ‐     ‐     ‐     2
CO 4   3     3     3     3     3     2     2     ‐     ‐     ‐     ‐     2

UNIT‐I

Introduction: Machine learning, terminologies in machine learning, Perspectives and issues in machine
learning, application of Machine learning, Types of machine learning: supervised, unsupervised, semi‐
supervised learning. Review of probability, Basic Linear Algebra in Machine Learning Techniques, Dataset and
its types, Data preprocessing, Bias and Variance in Machine learning, Function approximation, Overfitting

UNIT‐II

Regression Analysis in Machine Learning: Introduction to regression and its terminologies, Types of
regression, Logistic Regression.
Simple Linear Regression: Introduction to Simple Linear Regression and its assumptions, Simple Linear
Regression Model Building, Ordinary Least Squares estimation, Properties of the least‐squares estimators and
the fitted regression model, Interval estimation in simple linear regression, Residuals.
Multiple Linear Regression: Multiple linear regression model and its assumptions, Interpret Multiple Linear
Regression Output (R‐Square, Standard error, F, Significance F, Coefficient P‐values), Assess the fit of the multiple
linear regression model (R‐squared, Standard error).
Feature Selection and Dimensionality Reduction: PCA, LDA, ICA

UNIT‐III

Introduction to Classification and Classification Algorithms: What is Classification? General Approach to
Classification, k‐Nearest Neighbor Algorithm, Random Forests, Fuzzy Set Approaches.
Support Vector Machine: Introduction, Types of support vector kernel (Linear kernel, Polynomial kernel, and
Gaussian kernel), Hyperplane (Decision surface), Properties of SVM, and Issues in SVM.
Decision Trees: Decision tree learning algorithm, ID3 algorithm, Inductive bias, Entropy and information theory,
Information gain, Issues in Decision tree learning.
Bayesian Learning: Bayes theorem, Concept learning, Bayes Optimal Classifier, Naïve Bayes classifier, Bayesian
belief networks, EM algorithm.
Ensemble Methods: Bagging, Boosting, AdaBoost and XGBoost.
Classification Model Evaluation and Selection: Sensitivity, Specificity, Positive Predictive Value, Negative
Predictive Value, Lift Curves and Gain Curves, ROC Curves, Misclassification Cost Adjustment to Reflect Real‐
World Concerns, Decision Cost/Benefit Analysis
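
The four evaluation measures named above (sensitivity, specificity, positive predictive value and negative predictive value) all follow from the counts in a binary confusion matrix. The sketch below shows one way to compute them in plain Python; the function names and the toy labels are assumptions made for illustration only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluation_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, PPV and NPV as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # precision
        "npv": ratio(tn, tn + fn),
    }


# Toy labels (assumed): 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = evaluation_metrics(y_true, y_pred)
print(metrics)
```

Here TP = 3, FN = 1, FP = 1 and TN = 5, so sensitivity and PPV are both 0.75, while specificity and NPV are both 5/6.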

UNIT‐IV

Introduction to Cluster Analysis and Clustering Methods: The Clustering Task and the Requirements for
Cluster Analysis, Overview of Some Basic Clustering Methods: k‐Means Clustering, k‐Medoids Clustering,
Density‐Based Clustering: DBSCAN (Density‐Based Clustering Based on Connected Regions with High Density),
Gaussian Mixture Model algorithm, Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH),
Affinity Propagation clustering algorithm, Mean‐Shift clustering algorithm, Ordering Points to Identify the
Clustering Structure (OPTICS) algorithm, Agglomerative Hierarchical clustering algorithm, Divisive Hierarchical
clustering, Measuring Clustering Goodness.
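
As a small illustration of the k‐Means method listed above, the following Python sketch implements the basic assign‐then‐update loop (Lloyd's algorithm) on 2‐D points. It is a teaching sketch under simplifying assumptions: the first `k` points are used as the initial centroids (real implementations usually randomize or use k‐means++), and the function names and sample points are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, k, max_iterations=100):
    """Cluster points into k groups; returns (centroids, labels)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [mean_point(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged: assignments can no longer change
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Two visually separated groups of made-up points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
print(labels)     # [0, 0, 0, 1, 1, 1]
print(centroids)
```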

Textbook(s):

1. Tom M. Mitchell, “Machine Learning”, McGraw‐Hill Education (India) Private Limited, 2013.
2. M. Gopal, “Applied Machine Learning”, McGraw Hill Education

References:
1. C. M. Bishop (2006), “Pattern Recognition and Machine Learning”, Springer‐Verlag New York, 1st Edition
2. R. O. Duda, P. E. Hart, D. G. Stork (2000), Pattern Classification, Wiley‐Blackwell, 2nd Edition

Machine Learning Lab
L: ‐   P: 2   C: 1

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
ECE                         6          PCE        PCE‐3         ECE‐350P
EAE                         6          MLDA‐EAE   MLDA‐EAE‐2C   ML‐342P
CSE/IT/CST/ITE              7          PCE        PCE‐5         CIE‐421P
CSE‐AIML                    7          PC         PC            ML‐407P
EAE                         7          AIML‐EAE   AIML‐EAE‐3    ML‐407P

Marking Scheme:
1. Teachers Continuous Evaluation: 40 marks
2. Term end Theory Examinations: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to those of Machine Learning, as this is the
practical component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement under
intimation to the office of the Head of Department / Institution in which the paper is being offered from
the list of practicals below. At least 10 experiments must be performed by the students; they may be asked
to do more. At least 5 experiments must be from the given list.

1. Introduction to Jupyter and the Pandas and NumPy libraries
2. Program to demonstrate Simple Linear Regression
3. Program to demonstrate Logistic Regression
4. Program to demonstrate Decision Tree – ID3 Algorithm
5. Program to demonstrate k‐Nearest Neighbor flowers classification
6. Program to demonstrate Naïve Bayes Classifier
7. Program to demonstrate PCA and LDA on Iris dataset
8. Program to demonstrate DBSCAN clustering algorithm
9. Program to demonstrate K‐Medoid clustering algorithm
10. Program to demonstrate K‐Means Clustering Algorithm on Handwritten Dataset
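
As a starting point for experiment 2 above, simple linear regression can be fitted with the closed‐form ordinary least squares estimates, slope = Sxy / Sxx and intercept = ȳ − slope · x̄. The sketch below uses only the Python standard library so that the arithmetic stays visible; the function names and the toy data are assumptions, and a lab solution might instead use NumPy or Pandas as introduced in experiment 1.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination of the fitted line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data lying exactly on the line y = 1 + 2x (assumed for the example).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_simple_linear_regression(xs, ys)
print(intercept, slope)                     # 1.0 2.0
print(r_squared(xs, ys, intercept, slope))  # 1.0
```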

Machine Learning and Data Analytics Case Studies
L: 3   P: ‐   C: 3

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
EAE                         7          MLDA‐EAE   MLDA‐EAE‐5A   ML‐467T

Marking Scheme:
1. Teachers Continuous Evaluation: 25 marks
2. Term end Theory Examinations: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term end examinations question paper.
2. The first (1st) question should be compulsory and cover the entire syllabus. This question should be
objective, single‐line answer or short‐answer type, carrying a total of 15 marks.
3. Apart from question 1, which is compulsory, the rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in each unit. Individual questions may contain up to
5 sub‐parts / sub‐questions. Each unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions to be asked should be at the level of the prescribed textbook.
5. The requirement of (scientific) calculators / log‐tables / data tables may be specified if required.
Course Objectives:
1. This course provides the fundamental concepts in data science.
2. Learn the Basics of statistical data analysis with examples.
3. Basics of Machine Learning and statistical measures.
4. Compile and visualize data using statistical functions.
Course Outcomes (CO)
CO 1 Impart the knowledge of data classification, process of big data technology, user roles and skills in data
science.
CO 2 Understand how data is analysed and visualized using statistical functions
CO 3 Analyze the methodologies of data science
CO 4 Design the code for the problems related to data science using R
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: low, 2: Medium, 3: High)
PO01 PO02 PO03 PO04 PO05 PO06 PO07 PO08 PO09 PO10 PO11 PO12
CO 1 3 3 2 3 ‐ 2 ‐ ‐ ‐ ‐ ‐ ‐
CO 2 ‐ 3 ‐ 2 ‐ ‐ ‐ ‐ ‐ ‐ 2 ‐
CO 3 ‐ ‐ ‐ 3 3 3 ‐ ‐ ‐ 2 3
CO 4 ‐ ‐ 3 2 ‐ 3 ‐ ‐ ‐ ‐ 2 2

UNIT‐I

Unsupervised Machine Learning Algorithms: Dimensionality Reduction, Clustering. Supervised Machine
Learning Problems: Regression and classification.
Case Study: Balanced Scorecard Model for Measuring Organizational Performance, Employee Attrition in an
Organization, Market Capitalization Categories, Performance Appraisal in Organizations, Application of
Technology Acceptance Model in Cloud Computing, Prediction of Customer Buying Intention due to Digital
Marketing.

UNIT‐II

Supervised Machine Learning Algorithms: Naïve Bayes Algorithm, k‐Nearest Neighbors (KNN) Algorithm,
Support Vector Machines (SVMs), Decision Trees.
Case Study: Measuring Acceptability of a New Product, Predicting Phishing Websites, Fraud
Analysis for Credit Card and Mobile Payment Transactions, Artificial Intelligence and Employment.

Machine Learning and Data Analytics Case Studies Lab
L: ‐   P: 2   C: 1

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
EAE                         7          MLDA‐EAE   MLDA‐EAE‐5A   ML‐467P

Marking Scheme:
1. Teachers Continuous Evaluation: 40 marks
2. Term end Theory Examinations: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to those of Machine Learning and Data Analytics
Case Studies, as this is the practical component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement under
intimation to the office of the Head of Department / Institution in which the paper is being offered from
the list of practicals below. At least 10 experiments must be performed by the students; they may be asked
to do more. At least 5 experiments must be from the given list.

Implement a case study mentioned in the syllabus by following the steps below.

1. Define the problem statement: Clearly articulate the problem you want to solve or the objective you want
to achieve using machine learning and data analytics.
2. Gather and preprocess data: Collect relevant data for your case study and perform necessary
preprocessing tasks such as data cleaning, handling missing values, and feature engineering.
3. Exploratory data analysis (EDA): Analyze the dataset to gain insights into the data distribution, identify
patterns, and visualize relationships between variables.
4. Split the data and model training: Divide the dataset into training, validation, and testing sets and train
your models on the training set, tune hyperparameters using the validation set, and evaluate their
performance on the testing set.
5. Model evaluation: Train the selected machine learning models using the training set and evaluate their
performance using appropriate metrics such as accuracy, precision, recall, F1‐score, or mean squared error.
6. Deployment: Implement the chosen model in a real‐world scenario, considering factors such as scalability,
performance, and integration with existing systems.
7. Interpretation and visualization: Interpret the results of your models and visualize them in a meaningful
way. This helps in presenting insights to stakeholders and understanding the impact of different variables
on the outcome.
8. Documentation: Document your case study, including the problem statement, data sources, preprocessing
steps, modeling techniques used, results obtained, and any limitations or assumptions made during the
process.
9. Communication and presentation: Prepare a clear and concise presentation of your case study findings,
highlighting the key insights and recommendations derived from your analysis.
10. Ethical considerations: Consider ethical aspects such as data privacy, bias, and fairness throughout the
entire process. Ensure that your models and analyses are fair, transparent, and aligned with legal and
ethical guidelines.
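
The data split described in step 4 above can be sketched as follows. This is a minimal standard‐library Python illustration, assuming a 60/20/20 split and a fixed shuffle seed for reproducibility; the proportions, the function name `train_val_test_split`, and the placeholder rows are assumptions, not requirements of the syllabus.

```python
import random


def train_val_test_split(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into training, validation and test sets.

    The input is copied first, so the caller's list is left unchanged.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# 100 placeholder records standing in for a case-study data set.
rows = list(range(100))
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))  # 60 20 20
```

Models would then be fitted on `train`, tuned against `val`, and scored once on `test`, so that the reported test performance is not influenced by hyperparameter tuning.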

Statistics, Statistical Modelling & Data Analytics
L: 3   P: ‐   C: 3

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
CSE‐AI/CSE‐AIML/CSE‐DS      6          PC         PC            DA‐304T
EAE                         6          AI‐EAE     AI‐EAE‐2      DA‐304T
EAE                         6          AIML‐EAE   AIML‐EAE‐2    DA‐304T
EAE                         6          DS‐EAE     DS‐EAE‐1      DA‐304T
EAE                         6          SC‐EAE     SC‐EAE‐1      DA‐304T
EAE                         6          MLDA‐EAE   MLDA‐EAE‐1    DA‐304T

Marking Scheme:
1. Teachers Continuous Evaluation: 25 marks
2. Term end Theory Examinations: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term end examinations question paper.
2. The first (1st) question should be compulsory and cover the entire syllabus. This question should be
objective, single‐line answer or short‐answer type, carrying a total of 15 marks.
3. Apart from question 1, which is compulsory, the rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in each unit. Individual questions may contain up to
5 sub‐parts / sub‐questions. Each unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions to be asked should be at the level of the prescribed textbook.
5. The requirement of (scientific) calculators / log‐tables / data tables may be specified if required.
Course Objectives:
1. To impart basic knowledge about Statistics, visualisation and probability.
2. To impart basic knowledge about how to implement regression analysis and interpret the results.
3. To impart basic knowledge about how to describe classes of open and closed sets of R, the concept of
compactness, and metric spaces (metric in Rn).
4. To impart basic knowledge about how to apply eigenvalues and eigenvectors.
Course Outcomes (CO)
CO 1 Ability to learn and understand the basic concepts of statistics, visualisation and probability.
CO 2 Ability to implement regression analysis and interpret the results. Be able to fit a model to data and
comment on the adequacy of the model.
CO 3 Ability to describe classes of open and closed sets of R, the concept of compactness, and metric spaces
(metric in Rn).
CO 4 Ability to apply eigenvalues and eigenvectors.
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: low, 2: Medium, 3: High)
CO     PO01  PO02  PO03  PO04  PO05  PO06  PO07  PO08  PO09  PO10  PO11  PO12
CO 1   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3
CO 2   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3
CO 3   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3
CO 4   3     3     3     3     3     ‐     ‐     1     2     ‐     ‐     3

UNIT‐I

Statistics: Introduction & Descriptive Statistics‐ mean, median, mode, variance, and standard deviation. Data
Visualization, Introduction to Probability Distributions.
Hypothesis testing, Linear Algebra and Population Statistics, Mathematical Methods and Probability Theory,
Sampling Distributions and Statistical Inference, Quantitative analysis.

UNIT‐II

Statistical Modelling: Linear models, regression analysis, analysis of variance, applications in various fields.
Gauss‐Markov theorem; geometry of least squares, subspace formulation of linear models, orthogonal
projections; regression models, factorial experiments, analysis of covariance and model formulae; regression
diagnostics, residuals, influence diagnostics, transformations, Box‐Cox models, model selection and model
building strategies, logistic regression models; Poisson regression models.

UNIT‐III

Data Analytics: Describe classes of open and closed sets. Apply the concept of compactness. Describe metric
spaces (metric in Rn). Use the concepts of Cauchy sequence, completeness, compactness and connectedness to
solve problems.

UNIT‐IV

Advanced concepts in Data Analytics: Describe vector space, subspaces, independence of vectors, basis and
dimension. Describe Eigen values, Eigen vectors and related results.

Textbook(s):
1. Apostol T. M. (1974): Mathematical Analysis, Narosa Publishing House, New Delhi.
2. Malik, S.C., Arora, S. (2012): Mathematical Analysis, New Age International, New Delhi

References:
1. Pringle, R.M. and Rayner, A.(1971): Generalized Inverse of Matrices with Application to Statistics, Griffin,
London
2. Peter Bruce, Andrew Bruce (2017), Practical Statistics for Data Scientists

Statistics, Statistical Modelling & Data Analytics Lab
L: ‐   P: 2   C: 1

Discipline(s) / EAE / OAE   Semester   Group      Sub‐group     Paper Code
CSE‐AI/CSE‐AIML/CSE‐DS      6          PC         PC            DA‐304P
EAE                         6          AI‐EAE     AI‐EAE‐2      DA‐304P
EAE                         6          AIML‐EAE   AIML‐EAE‐2    DA‐304P
EAE                         6          DS‐EAE     DS‐EAE‐1      DA‐304P
EAE                         6          SC‐EAE     SC‐EAE‐1      DA‐304P
EAE                         6          MLDA‐EAE   MLDA‐EAE‐1    DA‐304P

Marking Scheme:
1. Teachers Continuous Evaluation: 40 marks
2. Term end Theory Examinations: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to that of (Statistics, Statistical Modelling & Data
Analytics) as this is the practical component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement under
intimation to the office of the Head of Department / Institution in which the paper is being offered from
the list of practicals below. At least 10 experiments must be performed by the students; they may be asked
to do more. At least 5 experiments must be from the given list.

1. Exercises to implement the basic matrix operations in Scilab.
2. Exercises to find the eigenvalues and eigenvectors in Scilab.
3. Exercises to solve equations by Gauss elimination, the Gauss‐Jordan method and the Gauss‐Seidel method in Scilab.
4. Exercises to verify the associative, commutative and distributive properties of matrices in Scilab.
5. Exercises to find the reduced row echelon form of a matrix in Scilab.
6. Exercises to plot functions and to find their first and second derivatives in Scilab.
7. Exercises to present data as a frequency table in SPSS.
8. Exercises to find the outliers in a dataset in SPSS.
9. Exercises to find the riskier of two mutually exclusive projects in SPSS.
10. Exercises to draw a scatter diagram and residual plots, and to identify outliers, leverage and influential data points in R.
11. Exercises to calculate correlation using R.
12. Exercises to implement time series analysis using R.
13. Exercises to implement linear regression using R.
14. Exercises to implement concepts of probability and distributions in R.


Supervised and Deep Learning
L: 3   P: ‐   C: 3

Discipline(s) / EAE / OAE Semester Group Sub‐group Paper Code


EAE 7 MLDA‐EAE MLDA‐EAE‐3 ML‐463T

Marking Scheme:
1. Teachers Continuous Evaluation: 25 marks
2. Term end Theory Examinations: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term end examinations question paper.
2. The first (1st) question should be compulsory and cover the entire syllabus. This question should be
objective, single line answers or short answer type question of total 15 marks.
3. Apart from question 1 which is compulsory, rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in the unit. Individual questions may contain up to 5
sub‐parts / sub‐questions. Each Unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions to be asked should be at the level of the prescribed textbook.
5. The requirement of (scientific) calculators / log‐tables / data‐tables may be specified if required.
Course Objectives:
1. To introduce students to the fundamentals of Supervised Learning and Deep Learning techniques and
algorithms.
2. To enable students to develop skills in implementing supervised and deep learning algorithms using
Python programming language and popular machine learning libraries.
3. To equip students with the ability to evaluate the performance of supervised and deep learning
models and select the appropriate models for specific problems.
4. To provide students with hands‐on experience in working with real‐world supervised and deep
learning projects.
Course Outcomes (CO)
CO 1 Develop a deep understanding of the concepts and applications of Supervised Learning and Deep
Learning techniques and algorithms.
CO 2 Develop proficiency in using Python programming language and popular machine learning libraries to
implement supervised and deep learning models.
CO 3 Demonstrate the ability to evaluate the performance of supervised and deep learning models and
select the appropriate models for specific problems.
CO 4 Gain hands‐on experience in working with real‐world supervised and deep learning projects, including
image recognition, text analysis, and time‐series analysis.
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: low, 2: Medium, 3: High)
PO01 PO02 PO03 PO04 PO05 PO06 PO07 PO08 PO09 PO10 PO11 PO12
CO 1 3 2 2 2 3 ‐ ‐ ‐ 3 2 2 3
CO 2 3 2 2 2 3 ‐ ‐ ‐ 3 2 2 3
CO 3 3 2 2 2 3 ‐ ‐ ‐ 3 2 2 3
CO 4 3 2 2 2 3 ‐ ‐ ‐ 3 2 2 3

UNIT‐I

Introduction to Machine Learning, Types of Machine Learning, Supervised Learning Basics, Regression and
Classification, Linear Regression, Logistic Regression, Model Evaluation Metrics

UNIT‐II

Introduction to Deep Learning, Artificial Neural Networks, Activation Functions, Loss Functions, Optimization
Algorithms, Backpropagation Algorithm, Regularization Techniques

UNIT‐III

Introduction to CNNs, CNN Architecture, Convolution and Pooling Layers, Object Detection, Image
Segmentation, Transfer Learning, Introduction to RNNs, RNN Architecture, Long Short‐Term Memory (LSTM),
Gated Recurrent Unit (GRU), Text Generation, Language Translation

UNIT – IV

Generative Adversarial Networks (GANs), Autoencoders, Reinforcement Learning, Natural Language Processing
(NLP), Sentiment Analysis, Time Series Analysis

Textbooks:
1. Aurélien Géron, "Hands‐On Machine Learning with Scikit‐Learn, Keras, and TensorFlow", 2nd Edition,
O'Reilly Media, 2019. ISBN: 978‐1492032649
2. Francois Chollet, "Deep Learning with Python", 1st Edition, Manning Publications, 2017.
ISBN: 978‐1617294433

Reference Books:
1. Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
2. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, "Deep Learning", 1st Edition, MIT Press, 2016. ISBN:
978‐0262035613
3. Andrew Ng, "Machine Learning Yearning", eBook, 2018.
4. Sebastian Raschka and Vahid Mirjalili, "Python Machine Learning", 3rd Edition, Packt Publishing, 2019.
ISBN: 978‐1789955750


Supervised and Deep Learning Lab
L: ‐   P: 2   C: 1

Discipline(s) / EAE / OAE Semester Group Sub‐group Paper Code


EAE 7 MLDA‐EAE MLDA‐EAE‐3 ML‐463P

Marking Scheme:
1. Teachers Continuous Evaluation: 40 marks
2. Term end Theory Examinations: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to that of (Supervised and Deep Learning) as this
is the practical component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement under
intimation to the office of the Head of Department / Institution in which the paper is being offered from
the list of practicals below. At least 10 experiments must be performed by the students; they may be asked
to do more. At least 5 experiments must be from the given list.

1. Linear regression: Implement linear regression on a dataset and evaluate the model's performance.
2. Logistic regression: Implement logistic regression on a binary classification dataset and evaluate the
model's performance.
3. k‐Nearest Neighbors (k‐NN): Implement k‐NN algorithm on a dataset and evaluate the model's
performance.
4. Decision Trees: Implement decision trees on a dataset and evaluate the model's performance.
5. Random Forest: Implement random forest algorithm on a dataset and evaluate the model's performance.
6. Support Vector Machines (SVM): Implement SVM on a dataset and evaluate the model's performance.
7. Naive Bayes: Implement Naive Bayes algorithm on a dataset and evaluate the model's performance.
8. Gradient Boosting: Implement gradient boosting algorithm on a dataset and evaluate the model's
performance.
9. Convolutional Neural Networks (CNN): Implement CNN on an image classification dataset and evaluate the
model's performance.
10. Recurrent Neural Networks (RNN): Implement RNN on a text classification dataset and evaluate the
model's performance.
11. Long Short‐Term Memory Networks (LSTM): Implement LSTM on a time‐series dataset and evaluate the
model's performance.
12. Autoencoders: Implement autoencoders on an image dataset and evaluate the model's performance.
13. Generative Adversarial Networks (GANs): Implement GANs on an image dataset and evaluate the model's
performance.
14. Transfer Learning: Implement transfer learning on an image dataset and evaluate the model's
performance.
15. Reinforcement Learning: Implement reinforcement learning on a game environment and evaluate the
model's performance.
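
Illustrative sketch for practical 1 (not part of the prescribed list): one possible way to fit a single‐feature linear regression and evaluate it on held‐out data. It uses only the Python standard library so that the arithmetic is visible; in the lab a library such as scikit‐learn would normally be used instead. The function names and the toy data are assumptions made for this example.

```python
# Ordinary least squares for one feature, standard library only.
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimising squared error for y ~ slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy data, made up for illustration: points lying on y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]
test_x = [5.0, 6.0]
test_y = [11.0, 13.0]

slope, intercept = fit_simple_linear_regression(train_x, train_y)
test_pred = predict(test_x, slope, intercept)

print("slope:", slope, "intercept:", intercept)
print("test MSE:", mean_squared_error(test_y, test_pred))
print("test R^2:", r_squared(test_y, test_pred))
```

The same fit / predict / evaluate pattern on a held‐out split carries over to the other practicals in the list; only the model and the metric change (for example, accuracy or F1 score for the classification practicals).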


Unsupervised Learning
L: 3   P: ‐   C: 3

Discipline(s) / EAE / OAE Semester Group Sub‐group Paper Code


EAE 7 MLDA‐EAE MLDA‐EAE‐4 ML‐465T

Marking Scheme:
1. Teachers Continuous Evaluation: 25 marks
2. Term end Theory Examinations: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term end examinations question paper.
2. The first (1st) question should be compulsory and cover the entire syllabus. This question should be
objective, single line answers or short answer type question of total 15 marks.
3. Apart from question 1 which is compulsory, rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in the unit. Individual questions may contain up to 5
sub‐parts / sub‐questions. Each Unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions to be asked should be at the level of the prescribed textbook.
5. The requirement of (scientific) calculators / log‐tables / data‐tables may be specified if required.
Course Objectives:
1. To learn about unsupervised learning and clustering algorithms.
2. To learn about Gaussian mixture models and linear dimensionality reduction methods.
3. To learn about autoencoders and generative adversarial networks.
4. To learn about outlier detection, density estimation methods and unsupervised learning networks.
Course Outcomes (CO)
CO 1 Apply clustering algorithms to real‐world data.
CO 2 Apply dimensionality reduction techniques for feature extraction and learn about Gaussian mixture models.
CO 3 Learn about autoencoders and generative adversarial networks.
CO 4 Apply outlier and novelty detection and density estimation methods to real‐world data and learn
about unsupervised learning networks.
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: low, 2: Medium, 3: High)
PO01 PO02 PO03 PO04 PO05 PO06 PO07 PO08 PO09 PO10 PO11 PO12
CO 1 3 2 3 3 3 2 2 ‐ ‐ ‐ ‐ 2
CO 2 3 2 3 3 3 2 2 ‐ ‐ ‐ ‐ 2
CO 3 3 2 3 3 3 2 2 ‐ ‐ ‐ ‐ 2
CO 4 3 2 3 3 3 2 2 ‐ ‐ ‐ ‐ 2

UNIT‐I

Unsupervised learning ‐ Introduction, Unsupervised vs Supervised Learning, Applications of Unsupervised
Learning.
Clustering ‐ Clustering as a Machine Learning task, Different types of clustering techniques, Partitioning
methods, Hierarchical clustering, Density‐based methods: DBSCAN.
Biclustering: Spectral co‐clustering, Spectral biclustering.
Finding Patterns using Association Rules ‐ Definition of common terms, Association rule, Apriori algorithm.

UNIT‐II

Gaussian Mixture Models: Gaussian mixture, Variational Bayesian Gaussian mixture
Manifold learning: Introduction, Isomap, Locally linear embedding, Modified locally linear embedding, Spectral
embedding, Multi‐dimensional scaling (MDS), t‐distributed Stochastic Neighbor Embedding (t‐SNE)
Decomposing signals in components (matrix factorization problems): Principal Component Analysis (PCA), Factor
Analysis, Kernel Principal Component Analysis (kPCA), Truncated singular value decomposition and latent
semantic analysis, Independent component analysis (ICA), Non‐negative matrix factorization (NMF or NNMF),
Latent Dirichlet Allocation (LDA)

UNIT‐III

Autoencoders: Architecture, Layers in an autoencoder, Training of an autoencoder, Sparse Coding, Undercomplete
Autoencoders, Regularized Autoencoders, Stochastic Encoders and Decoders, Denoising Autoencoders,
Contractive Autoencoders, Applications of Autoencoders.
Generative Adversarial Networks: Generative vs Discriminative Modeling, Probabilistic Generative Model,
Generative Adversarial Networks (GAN), GAN challenges: Oscillating Loss, Mode Collapse, Uninformative Loss,
Hyperparameters, Tackling GAN challenges, Wasserstein GAN, Cycle GAN, Neural Style Transfer

UNIT ‐ IV

Novelty and outlier detection: Overview of outlier detection methods, Novelty detection, Outlier detection
Density estimation: Histograms and kernel density estimation
Unsupervised Learning Networks: Kohonen Self‐Organizing Feature Maps – architecture, training algorithm,
Kohonen Self‐Organizing Motor Map, Restricted Boltzmann Machine (neural network model)

Textbook(s):
1. Tom M. Mitchell, “Machine Learning”, McGraw‐Hill Education (India) Private Limited, 2013.
2. Benyamin Ghojogh, Mark Crowley, Fakhri Karray, Ali Ghodsi, “Elements of Dimensionality Reduction and
Manifold Learning”, Springer.

References:
1. C. M. Bishop, “Pattern Recognition and Machine Learning”, 1st Edition, Springer‐Verlag New York, 2006.
2. Kevin Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
3. Jennifer Grange, “Machine Learning for Absolute Beginners: A Simple, Concise & Complete Introduction to
Supervised and Unsupervised Learning Algorithms”, Kindle.


Unsupervised Learning Lab
L: ‐   P: 2   C: 1

Discipline(s) / EAE / OAE Semester Group Sub‐group Paper Code


EAE 7 MLDA‐EAE MLDA‐EAE‐4 ML‐465P

Marking Scheme:
1. Teachers Continuous Evaluation: 40 marks
2. Term end Theory Examinations: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to that of (Unsupervised Learning) as this is the
practical component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement under
intimation to the office of the Head of Department / Institution in which the paper is being offered from
the list of practicals below. At least 10 experiments must be performed by the students; they may be asked
to do more. At least 5 experiments must be from the given list.

1. Setting up the Jupyter Notebook and executing a Python program.
2. Installing Keras, TensorFlow, PyTorch, pandas, NumPy and other libraries and making use of them.
3. Apply the EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using
the k‐Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering.
You can add Java/Python ML library classes/API in the program.
4. Program to demonstrate the k‐means clustering algorithm.
5. Program to demonstrate the DBSCAN clustering algorithm.
6. Program to demonstrate PCA and LDA on the Iris dataset.
7. Compare the performance of PCA and autoencoders on a given dataset.
8. Build a generative adversarial model for fake (news/image/audio/video) prediction.
9. Outlier detection in a time series dataset using an RNN.
10. Anomaly detection using a Self‐Organizing Network.
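
Illustrative sketch for practical 4 (not part of the prescribed list): a minimal k‐means implementation (Lloyd's algorithm) using only the Python standard library, together with the within‐cluster sum of squares as one simple measure of clustering quality. The function names, the seed and the toy points are assumptions made for this example; in the lab a library implementation would normally be used.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=100, seed=0):
    """Return (centroids, labels) after running Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, labels


def within_cluster_ss(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))


# Toy data, made up for illustration: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids, labels = k_means(points, k=2)
print("centroids:", centroids)
print("labels:", labels)
print("within-cluster SS:", within_cluster_ss(points, centroids, labels))
```

For practical 3, the same data could be clustered with an EM‐fitted Gaussian mixture and the two sets of labels compared; k‐means makes hard assignments, whereas EM yields soft (probabilistic) memberships.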
