B.Tech - CSE (Data Science) - HITS R21

III B.Tech.- I-Semester


L, T, P: hours per week. Internal (CIE), External (SEE), Total: maximum marks under the scheme of examination.

Code       Course Title                                     Area  L  T  P  Credits  Internal (CIE)  External (SEE)  Total
A1DS501PC  Design and Analysis of Algorithms                PCC   3  -  -  3        30              70              100
A1DS502PC  Computer Networks                                PCC   3  -  -  3        30              70              100
A1DS503PC  Introduction to Data Science                     PCC   3  -  -  3        30              70              100
A1DS504PC  Machine Learning                                 PCC   3  -  -  3        30              70              100
           Professional Elective-I                          PEC   3  -  -  3        30              70              100
A1DS505PC  Data Science Lab                                 PCC   -  -  3  1.5      30              70              100
A1DS506PC  Machine Learning Lab                             PCC   -  -  3  1.5      30              70              100
A1EN507HS  Advanced English Communication Skills Lab        HSMC  -  -  2  1        30              70              100
A1DS508PW  Industrial Training/Mini Project                 PWC   -  -  -  2        -               100             100
           MOOC’s (B.Tech Hon’s Degree)
           Total                                                  15 -  8  21       240             660             900

Mandatory Course (Non-Credit)

A1DS506MC  Constitution of India                            MC    2  -  -  -        100             -               100

III B.Tech.- II-Semester


L, T, P: hours per week. Internal (CIE), External (SEE), Total: maximum marks under the scheme of examination.

Code       Course Title                                     Area  L  T  P  Credits  Internal (CIE)  External (SEE)  Total
A1DS601PC  Data Warehousing and Data Mining                 PCC   3  -  -  3        30              70              100
A1DS602PC  Predictive Analytics and Reinforcement Learning  PCC   3  -  -  3        30              70              100
A1DS603PC  Big Data Analytics                               PCC   3  -  -  3        30              70              100
           Professional Elective-II                         PEC   3  -  -  3        30              70              100
           Professional Elective-III                        PEC   3  -  -  3        30              70              100
           Open Elective-I                                  OEC   3  -  -  3        30              70              100
A1DS604PC  Big Data Analytics Lab                           PCC   -  -  3  1.5      30              70              100
A1DS605PC  Data Warehousing and Data Mining Lab             PCC   -  -  3  1.5      30              70              100
A1DS606PW  Comprehensive Viva                               PWC   -  -  -  1        -               100             100
           MOOC’s (B.Tech Hon’s Degree)
           Total                                                  18 -  6  22       240             660             900

Mandatory Course (Non-Credit)

A1DS607MC  Essence of Indian Traditional Knowledge          MC    2  -  -  -        100             -               100

Holy Mary Institute of Technology & Science (UGC-Autonomous) Page 25


DATA WAREHOUSING AND DATA MINING

III-B.Tech II-Semester L T P C
Course Code: A1DS601PC 3 0 0 3

COURSE OBJECTIVES
1. To understand the principles of data warehousing and Data Mining.
2. To be familiar with the Data warehouse architecture and its Implementation.
3. To know the Architecture of a Data Mining system.
4. To understand the various Data preprocessing Methods.
5. To perform classification and prediction of data.

COURSE OUTCOMES
1. Technical know-how of Data Mining principles and techniques for real-time applications.

UNIT I
Data Warehousing and Business Analysis: Data Warehousing Components – Building a Data Warehouse
– Data Warehouse Architecture – DBMS Schemas for Decision Support – Data Extraction, Cleanup, and Transformation
Tools – Metadata – Reporting – Query Tools and Applications – Online Analytical Processing (OLAP) – OLAP and
Multidimensional Data Analysis.

UNIT II
Data Mining: Data Mining Functionalities – Data Preprocessing – Data Cleaning – Data Integration and
Transformation – Data Reduction – Data Discretization and Concept Hierarchy Generation – Architecture of a Typical
Data Mining System – Classification of Data Mining Systems.
Association Rule Mining: Efficient and Scalable Frequent Itemset Mining Methods – Mining Various Kinds of
Association Rules – From Association Mining to Correlation Analysis – Constraint-Based Association Mining.
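
As an illustration of the association rule topics in this unit, the following minimal Python sketch computes the support of an itemset and the confidence of a rule over a toy transaction list. The transactions and the helper names `support` and `confidence` are hypothetical and are not taken from the prescribed textbook.

```python
from typing import FrozenSet, List

# Hypothetical toy market-basket data, invented for illustration only.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"butter", "milk"}),
    frozenset({"bread", "butter"}),
]


def support(itemset: FrozenSet[str],
            transactions: List[FrozenSet[str]] = TRANSACTIONS) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str],
               consequent: FrozenSet[str],
               transactions: List[FrozenSet[str]] = TRANSACTIONS) -> float:
    """support(A union C) / support(A) for the rule A -> C."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs, confidence is undefined")
    return support(antecedent | consequent, transactions) / base
```

For the toy data, `support(frozenset({"bread", "milk"}))` is 0.5 and the rule bread -> milk has confidence 2/3. A frequent-itemset miner such as Apriori would apply a minimum-support threshold to values like these.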

UNIT III
Classification and Prediction: Issues Regarding Classification and Prediction – Classification by Decision Tree
Induction – Bayesian Classification – Rule-Based Classification – Classification by Backpropagation – Support
Vector Machines – Associative Classification – Lazy Learners – Other Classification Methods – Prediction – Accuracy
and Error Measures – Evaluating the Accuracy of a Classifier or Predictor – Ensemble Methods – Model Selection.

UNIT IV
Cluster Analysis: Types of Data in Cluster Analysis – A Categorization of Major Clustering Methods – Partitioning
Methods – Hierarchical Methods – Density-Based Methods – Grid-Based Methods – Model-Based Clustering Methods –
Clustering High-Dimensional Data – Constraint-Based Cluster Analysis – Outlier Analysis.

UNIT V
Mining Object, Spatial, Multimedia, Text and Web Data:
Multidimensional Analysis and Descriptive Mining of Complex Data Objects – Spatial Data Mining – Multimedia Data
Mining – Text Mining – Mining the World Wide Web.

TEXT BOOK
1. Jiawei Han, Micheline Kamber and Jian Pei, “Data Mining: Concepts and Techniques”, Third Edition, Elsevier,
2011.

REFERENCE BOOKS:
1. Alex Berson and Stephen J. Smith, “Data Warehousing, Data Mining & OLAP”, Tata McGraw-Hill
Edition, Tenth Reprint 2007.
2. K.P. Soman, Shyam Diwakar and V. Ajay, “Insight into Data Mining Theory and Practice”, Eastern Economy
Edition, Prentice Hall of India, 2006.
3. G. K. Gupta, “Introduction to Data Mining with Case Studies”, Eastern Economy Edition, Prentice Hall of
India, 2006.
4. Pang-Ning Tan, Michael Steinbach and Vipin Kumar, “Introduction to Data Mining”, Pearson Education, 2007.


PREDICTIVE ANALYTICS AND REINFORCEMENT LEARNING

III-B.Tech II-Semester L T P C
Course Code: A1DS602PC 3 0 0 3

COURSE OBJECTIVES
The course should enable the students to:
1. Learn how to define RL tasks and the core principles behind RL, including policies, value functions,
and deriving Bellman equations (as assessed by the assignments, an exam and quizzes)
2. Implement in code common algorithms following code standards and libraries used in RL (as assessed by
the assignments and final project)
3. Understand and work with tabular methods to solve classical control problems (as assessed by the
assignments, quizzes and final exam)
4. Understand and work with approximate solutions (deep Q-network based algorithms) (as assessed by the
assignments and final exam)
5. Learn the policy gradient methods from vanilla to more complex cases (as assessed by the assignments,
quizzes and final exam)
6. Explore imitation learning tasks and solutions (as assessed by the quizzes and final exam)
7. Recognize current advanced techniques and applications in RL (as assessed by the final project, quizzes and final
exam)

COURSE OUTCOMES:
After completion of the course, students will be able to:
1. Understand the need for machine learning in various problem-solving settings
2. Become familiar with the basics of Reinforcement Learning
3. Explain various tabular solution methods
4. Become familiar with approximate solution methods
5. Explain classical conditioning and explore a few applications

UNIT-I
INTRODUCTION TO PREDICTIVE ANALYTICS & LINEAR REGRESSION (NOS 2101): What and
Why Analytics, Introduction to Tools and Environment, Application of Modelling in Business, Databases & Types
of Data and Variables, Data Modelling Techniques, Missing Imputations etc. Need for Business Modelling,
Regression – Concepts, BLUE Property Assumptions, Least Squares Estimation, Variable Rationalization, and Model
Building etc.
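
To make the least squares estimation topic concrete, here is a minimal sketch of a one-variable ordinary least squares fit using the closed-form slope and intercept. The function name `least_squares_fit` and the sample data are assumptions made for illustration. The course itself may use a statistical package instead.

```python
from typing import Sequence, Tuple


def least_squares_fit(xs: Sequence[float],
                      ys: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = least_squares_fit([0, 1, 2, 3], [1, 3, 5, 7])
```

With the hypothetical points above, the fit recovers an intercept of 1 and a slope of 2, since the data have no noise.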

UNIT-II
LOGISTIC REGRESSION (NOS 2101): Model Theory, Model Fit Statistics, Model Conclusion, Analytics
Applications to Various Business Domains etc.
Regression vs Segmentation – Supervised and Unsupervised Learning, Tree Building – Regression,
Classification, Overfitting, Pruning and Complexity, Multiple Decision Trees etc.

UNIT – III
Basic Tabular Solution Methods: Finite Markov Decision Processes - Goals, Rewards, Returns, Episodes - Optimal
Policies and Optimal Value Functions. Dynamic Programming: Policy Evaluation (Prediction) - Policy Improvement
- Policy Iteration - Value Iteration - Asynchronous Dynamic Programming - Generalized Policy Iteration. Monte
Carlo Methods: Monte Carlo Prediction - Monte Carlo Estimation of Action Values - Monte Carlo Control - Monte
Carlo Control without Exploring Starts - Off-policy Prediction via Importance Sampling. Temporal-Difference
Learning: TD Prediction - Advantages of TD - Incremental Implementation - Off-policy Monte Carlo Control.
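
The value iteration topic listed under Dynamic Programming can be sketched on a tiny, made-up Markov decision process. The three-state chain, the action names and the discount factor below are all hypothetical. The code is a minimal illustration of the Bellman optimality backup, not a reference implementation from the textbook.

```python
from typing import Dict, List, Tuple

Transition = Tuple[float, int, float]  # (probability, next_state, reward)

# Hypothetical 3-state chain; state 2 is terminal and has no actions.
MDP: Dict[int, Dict[str, List[Transition]]] = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 2, 1.0)]},
    2: {},
}


def value_iteration(mdp: Dict[int, Dict[str, List[Transition]]],
                    gamma: float = 0.9,
                    tol: float = 1e-8) -> Dict[int, float]:
    """Repeat the Bellman optimality backup until values change by < tol."""
    values = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s, actions in mdp.items():
            if not actions:  # terminal state keeps value 0
                continue
            # Best expected one-step return over all available actions.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


optimal_values = value_iteration(MDP)
```

For this chain the values converge to 0.9 for state 0 and 1.0 for state 1, because the only reward is collected on the move from state 1 into the terminal state.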

UNIT – IV
Approximate Solution Methods: On-policy Prediction with Approximation: Value-function Approximation
- The Prediction Objective (VE) - Stochastic-gradient and Semi-gradient Methods - Linear Methods - Feature
Construction for Linear Methods - Nonlinear Function Approximation: Artificial Neural Networks - Least-Squares
TD - Memory-based Function Approximation - Kernel-based Function Approximation.


UNIT – V
Classical Conditioning & Case Studies: Classical Conditioning: Blocking and Higher-order Conditioning - The
Rescorla-Wagner Model - TD Model - Simulations - Instrumental Conditioning - Delayed Reinforcement - Cognitive
Maps. Case Studies: Samuel's Checkers Player, Optimizing Memory Control, Human-level Video Game Play,
Autonomous UAV Navigation and Path Planning, Drones for Field Coverage.

TEXT BOOKS:
1. Richard S. Sutton and Andrew G. Barto, “Introduction to Reinforcement Learning”, 2nd Edition, MIT Press,
2017.
2. Tom M. Mitchell, “Machine Learning”, McGraw-Hill Education (India) Private Limited, 2013.
3. Student’s Handbook for Associate Analytics-III.

REFERENCE BOOKS:
1. Sigaud O. & Buffet O. (editors), “Markov Decision Processes in Artificial Intelligence”, ISTE Ltd., Wiley and Sons
Inc, 2010.
2. Draguna Vrabie, Kyriakos G. Vamvoudakis, Frank L. Lewis, “Optimal Adaptive Control and Differential Games by
Reinforcement Learning Principles”, 2012.

BIG DATA ANALYTICS
III-B.Tech II-Semester L T P C
Course Code: A1DS603PC 3 0 0 3

COURSE OBJECTIVES:
The course should enable the students to learn:
1. Outline the importance of Big Data Analytics
2. Apply statistical techniques for Big data Analytics.
3. Analyze problems appropriate to mining data streams.
4. Apply the knowledge of clustering techniques in data mining.
5. Use Graph Analytics for Big Data and provide solutions
6. Apply Hadoop MapReduce programming for handling Big Data

COURSE OUTCOMES:
At the end of the course the students are able to:
1. Identify Big Data and its Business Implications.
2. List the components of Hadoop and Hadoop Eco-System
3. Access and Process Data on Distributed File System
4. Manage Job Execution in Hadoop Environment
5. Develop Big Data Solutions using Hadoop Eco System
6. Analyze Infosphere BigInsights Big Data Recommendations.
7. Apply Machine Learning Techniques using R.

PREREQUISITES: Should have knowledge of one programming language (preferably Java), practice of
SQL (queries and subqueries), and exposure to the Linux environment.

UNIT-I INTRODUCTION TO BIG DATA


Evolution of Big data - Best Practices for Big data Analytics - Big data characteristics - Validating - The
Promotion of the Value of Big Data - Big Data Use Cases- Characteristics of Big Data Applications - Perception
and Quantification of Value -Understanding Big Data Storage - Evolution Of Analytic Scalability - Analytic
Processes and Tools - Analysis vs Reporting - Modern Data Analytic Tools - Statistical Concepts: Sampling
Distributions - Re-Sampling - Statistical Inference - Prediction Error.

UNIT-II DATA ANALYSIS, CLUSTERING AND CLASSIFICATION


Regression Modeling - Multivariate Analysis - Bayesian Modeling - Support Vector and Kernel Methods -
Analysis of Time Series: Linear Systems Analysis - Nonlinear Dynamics - Rule Induction. Overview of
Clustering - K-means - Use Cases - Overview of the Method - Determining the Number of Clusters -
Diagnostics - Reasons to Choose and Cautions - Classification: Decision Trees - Overview of a Decision Tree
- The General Algorithm - Decision Tree Algorithms - Evaluating a Decision Tree - Decision Trees in R - Naïve
Bayes - Bayes' Theorem - Naïve Bayes Classifier.

UNIT-III STREAM MEMORY


Introduction to Streams Concepts – Stream Data Model and Architecture – Stream Computing – Sampling Data
in a Stream – Filtering Streams – Counting Distinct Elements in a Stream – Estimating Moments – Counting
Ones in a Window – Decaying Window – Real-Time Analytics Platform (RTAP) Applications – Case Studies –
Real-Time Sentiment Analysis, Stock Market Predictions.
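
The decaying window topic can be illustrated with a short sketch of an exponentially decaying sum, where each new arrival first scales the running total by (1 - c) and then adds the new element. The function name `decayed_sum` and the decay constant used here are assumptions for illustration. Real stream applications normally use a much smaller constant.

```python
from typing import Iterable


def decayed_sum(stream: Iterable[float], c: float = 0.1) -> float:
    """Exponentially decaying sum: older elements are weighted by (1 - c)^age."""
    if not 0.0 < c < 1.0:
        raise ValueError("decay constant c must lie strictly between 0 and 1")
    total = 0.0
    for x in stream:
        # Decay everything seen so far, then add the newest element at full weight.
        total = total * (1.0 - c) + x
    return total


# Hypothetical stream of three ones with a large decay constant.
recent_weight = decayed_sum([1, 1, 1], c=0.5)
```

With the hypothetical stream above the result is 1.75, that is 1 + 0.5 + 0.25. The most recent element counts fully and each earlier element counts half as much as its successor.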

UNIT-IV ASSOCIATION AND GRAPH MEMORY


Advanced Analytical Theory and Methods: Association Rules - Overview - Apriori Algorithm - Evaluation of
Candidate Rules - Applications of Association Rules - Finding Association & Finding Similarity - Graph
Analytics for Big Data: Graph Analytics - The Graph Model - Representation as Triples - Graphs and Network
Organization - Choosing Graph Analytics - Graph Analytics Use Cases - Graph Analytics Algorithms and
Solution Approaches - Technical Complexity of Analyzing Graphs- Features of a Graph Analytics Platform.

UNIT-V FRAMEWORKS AND VISUALIZATION
MapReduce – Hadoop, Hive, MapR – Sharding – NoSQL Databases – S3 – Hadoop Distributed File Systems –
Visualizations – Visual Data Analysis Techniques – Interaction Techniques; Systems and Analytics Applications –
Analytics using Statistical Packages – Approaches to Modeling in Analytics: correlation, regression, decision trees,
classification, association – Intelligence from Unstructured Information – Text Analytics – Understanding of Emerging
Trends and Technologies – Industry Challenges and Application of Analytics – Analyzing Big Data with Twitter – Big Data
for E-Commerce – Big Data for Blogs – Review of Basic Data Analytic Methods using R.

TEXT BOOKS:
1. David Loshin, "Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools,
Techniques, NoSQL, and Graph", 2013.
2. Anand Rajaraman and Jeffrey David Ullman, "Mining of Massive Datasets", Cambridge University Press,
2012.
3. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer, 2007.

REFERENCE BOOKS:
1. EMC Education Services, "Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and
Presenting Data", Wiley Publishers, 2015.
2. Bart Baesens, "Analytics in a Big Data World: The Essential Guide to Data Science and its Applications",
Wiley Publishers, 2015.
3. Kim H. Pries and Robert Dunnigan, "Big Data Analytics: A Practical Guide for Managers", CRC Press,
2015.
4. Jimmy Lin and Chris Dyer, "Data-Intensive Text Processing with MapReduce", Synthesis Lectures on
Human Language Technologies, Vol. 3, No. 1, Pages 1-177, Morgan Claypool publishers, 2010.
5. Chris Eaton, Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, "Understanding Big Data:
Analytics for Enterprise Class Hadoop and Streaming Data", McGraw-Hill Publishing, 2012.

QUANTUM COMPUTING
(PROFESSIONAL ELECTIVE-II)

III-B.Tech II-Semester L T P C
Course Code: A1DS604PE 3 0 0 3

COURSE OBJECTIVES
A basic introduction to quantum mechanics and linear algebra, along with familiarity with the Dirac notation, is provided first to get
one’s quantum moorings right. This is followed by an introductory treatment of quantum computation and quantum
information covering aspects of quantum entanglement, quantum algorithms and quantum channels. Rudimentary quantum
computing is introduced using the IBM quantum computer and associated simulators.

UNIT - I
Introduction: Elementary quantum mechanics, linear algebra for quantum mechanics, quantum states in Hilbert space,
the Bloch sphere, density operators, generalized measurements, no-cloning theorem.

UNIT - II
Quantum correlations: Bell inequalities and entanglement, Schmidt decomposition, superdense coding, teleportation.

UNIT - III
Quantum cryptography: quantum key distribution

UNIT - IV
Quantum gates and algorithms: Universal set of gates, quantum circuits, Solovay-Kitaev theorem, Deutsch-Jozsa
algorithm, factoring

UNIT - V
Programming a quantum computer: The IBMQ, coding a quantum computer using a simulator to carry out basic
quantum measurement and state analysis.
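
As a rough idea of what basic measurement and state analysis on a simulator involves, the sketch below tracks a single-qubit state vector in plain Python, applies a Hadamard gate and computes the measurement probabilities. It deliberately does not use the IBM Q tooling. The names `apply_hadamard` and `measurement_probabilities` are hypothetical helpers introduced only for this illustration.

```python
import math
from typing import Tuple

State = Tuple[complex, complex]  # amplitudes of |0> and |1>


def apply_hadamard(state: State) -> State:
    """Apply the Hadamard gate H = (1/sqrt(2)) * [[1, 1], [1, -1]]."""
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))


def measurement_probabilities(state: State) -> Tuple[float, float]:
    """Born rule: probability of each outcome is the squared amplitude magnitude."""
    a, b = state
    return (abs(a) ** 2, abs(b) ** 2)


# Start in |0>, create an equal superposition, then inspect the outcome odds.
zero: State = (1 + 0j, 0j)
plus = apply_hadamard(zero)
p0, p1 = measurement_probabilities(plus)
```

Up to floating-point rounding, `p0` and `p1` are both one half, and applying the Hadamard gate a second time returns the qubit to the starting state.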

TEXT BOOKS:
1. Phillip Kaye, Raymond Laflamme et al., An Introduction to Quantum Computing, Oxford University Press, 2007.
2. Chris Bernhardt, Quantum Computing for Everyone, The MIT Press, Cambridge, 2020.
3. David McMahon, Quantum Computing Explained, Wiley-Interscience, IEEE Computer Society, 2008.

REFERENCE BOOKS:
1. Quantum Computation and Quantum Information, M. A. Nielsen & I. Chuang, Cambridge University Press (2013).
2. Quantum Computing: A Gentle Introduction, Eleanor G. Rieffel and Wolfgang H. Polak.

DEEP LEARNING
(PROFESSIONAL ELECTIVE-II)

III-B.Tech II-Semester L T P C
Course Code: A1DS605PE 3 0 0 3

COURSE OBJECTIVES:
This course covers the basics of machine learning, neural networks and deep learning. Models for deep learning
techniques and the various optimization and generalization mechanisms are included. Major topics in deep learning
and dimensionality reduction techniques are covered. The objectives of this course are:
1. To present the mathematical, statistical and computational challenges of building neural networks
2. To study the concepts of deep learning
3. To introduce dimensionality reduction techniques
4. To enable the students to know deep learning techniques to support real-time applications
5. To examine the case studies of deep learning techniques

COURSE OUTCOMES:
At the end of the course the students are able to:
1. Understand basics of deep learning
2. Implement various deep learning models
3. Realign high dimensional data using reduction techniques
4. Analyze optimization and generalization in deep learning
5. Explore the deep learning applications

UNIT – I INTRODUCTION
Introduction to machine learning - Linear models (SVMs and Perceptrons, logistic regression) - Intro to Neural Nets:
What a shallow network computes - Training a network: loss functions, backpropagation and stochastic gradient
descent - Neural networks as universal function approximators.

UNIT – II DEEP NETWORKS


History of Deep Learning - A Probabilistic Theory of Deep Learning - Backpropagation and regularization, batch
normalization - VC Dimension and Neural Nets - Deep vs Shallow Networks - Convolutional Networks - Generative
Adversarial Networks (GAN), Semi-supervised Learning.

UNIT – III DIMENSIONALITY REDUCTION


Linear (PCA, LDA) and manifolds, metric learning - Autoencoders and dimensionality reduction in networks -
Introduction to Convnet - Architectures: AlexNet, VGG, Inception, ResNet - Training a Convnet: weight
initialization, batch normalization, hyperparameter optimization.

UNIT – IV OPTIMIZATION AND GENERALIZATION


Optimization in deep learning - Non-convex optimization for deep networks - Stochastic Optimization - Generalization
in neural networks - Spatial Transformer Networks - Recurrent networks, LSTM - Recurrent Neural Network
Language Models - Word-Level RNNs & Deep Reinforcement Learning - Computational & Artificial Neuroscience.

UNIT – V CASE STUDY AND APPLICATIONS


ImageNet - Detection - Audio WaveNet - Natural Language Processing Word2Vec - Joint Detection - Bioinformatics - Face
Recognition - Scene Understanding - Gathering Image Captions.

TEXT BOOKS:
1. Deep Learning (Adaptive Computation and Machine Learning Series) by Ian Goodfellow, Yoshua
Bengio and Aaron Courville, MIT Press, 2016.

REFERENCE BOOKS:
1. Cosma Rohilla Shalizi, Advanced Data Analysis from an Elementary Point of View, 2015.
2. Deng & Yu, Deep Learning: Methods and Applications, Now Publishers, 2013.
3. Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016.
4. Michael Nielsen, Neural Networks and Deep Learning, Determination Press, 2015.

CYBER SECURITY
(PROFESSIONAL ELECTIVE-II)

III-B.Tech II-Semester L T P C
Course Code: A1DS606PE 3 0 0 3
COURSE OBJECTIVES:
The course should enable the students to learn:
1. To familiarize students with various types of cyber-attacks and cyber-crimes
2. To give an overview of the cyber laws
3. To study the defensive techniques against these attacks.

COURSE OUTCOMES:
After completion of the course, students will be able to:
1. Understand cyber-attacks, types of cybercrimes and cyber laws, and how to
protect themselves and ultimately the entire Internet community from such attacks.

UNIT – I
Introduction to Cyber Security: Basic Cyber Security Concepts, layers of security, Vulnerability, threat, Harmful
acts, Internet Governance – Challenges and Constraints, Computer Criminals, CIA Triad, Assets and Threat, motive
of attackers, active attacks, passive attacks, Software attacks, hardware attacks, Spectrum of attacks, Taxonomy of
various attacks, IP spoofing, Methods of defense, Security Models, risk management, Cyber Threats-Cyber Warfare,
Cyber Crime, Cyber terrorism, Cyber Espionage, etc., Comprehensive Cyber Security Policy.

UNIT – II
Cyberspace and the Law & Cyber Forensics: Introduction, Cyber Security Regulations, Roles of International
Law. The INDIAN Cyberspace, National Cyber Security Policy. Introduction, Historical background of Cyber
forensics, Digital Forensics Science, The Need for Computer Forensics, Cyber Forensics and Digital evidence,
Forensics Analysis of Email, Digital Forensics Lifecycle, Forensics Investigation, Challenges in Computer Forensics,
Special Techniques for Forensics Auditing.

UNIT – III
Cybercrime: Mobile and Wireless Devices: Introduction, Proliferation of Mobile and Wireless Devices, Trends in
Mobility, Credit card Frauds in Mobile and Wireless Computing Era, Security Challenges Posed by Mobile Devices,
Registry Settings for Mobile Devices, Authentication service Security, Attacks on Mobile/Cell Phones, Mobile
Devices: Security Implications for Organizations, Organizational Measures for Handling Mobile, Organizational
Security Policies and Measures in Mobile Computing Era, Laptops.

UNIT – IV
Cyber Security: Organizational Implications: Introduction, cost of cybercrimes and IPR issues, web threats for
organizations, security and privacy implications, social media marketing: security risks and perils for organizations,
social computing and the associated challenges for organizations. Cybercrime and Cyber terrorism: Introduction,
intellectual property in the cyberspace, the ethical dimension of cybercrimes, the psychology, mindset and skills of
hackers and other cyber criminals.

UNIT – V
Privacy Issues: Basic Data Privacy Concepts: Fundamental Concepts, Data Privacy Attacks, Data linking and
profiling, privacy policies and their specifications, privacy policy languages, privacy in different domains- medical,
financial, etc. Cybercrime: Examples and Mini-Cases. Examples: Official Website of Maharashtra Government
Hacked, Indian Banks Lose Millions of Rupees, Parliament Attack, Pune City Police Bust Nigerian Racket, e-mail
spoofing instances. Mini-Cases: The Indian Case of online Gambling, An Indian Case of Intellectual Property Crime,
Financial Frauds in Cyber Domain.

TEXT BOOKS:
1. Nina Godbole and Sunit Belapure, Cyber Security: Understanding Cyber Crimes, Computer Forensics and
Legal Perspectives, Wiley.
2. B. B. Gupta, D. P. Agrawal, Haoxiang Wang, Computer and Cyber Security: Principles, Algorithm,
Applications, and Perspectives, CRC Press, ISBN 9780815371335, 2018.

REFERENCE BOOKS:
1. Cyber Security Essentials, James Graham, Richard Howard and Ryan Olson, CRC Press.
2. Introduction to Cyber Security, Chwan-Hwa (John) Wu, J. David Irwin, CRC Press T&F Group.

SCRIPTING LANGUAGES
(PROFESSIONAL ELECTIVE-III)
III-B.Tech II-Semester L T P C
Course Code: A1DS607PE 3 0 0 3

PREREQUISITES:
1. A course on “Computer Programming and Data Structures”
2. A course on “Object Oriented Programming Concepts”

COURSE OBJECTIVES:
1. This course introduces the script programming paradigm
2. Introduces scripting languages such as Perl, Ruby and TCL.
3. Learning TCL

COURSE OUTCOMES:
1. Comprehend the differences between typical scripting languages and typical system and application
programming languages.
2. Gain knowledge of the strengths and weaknesses of Perl, TCL and Ruby, and select an appropriate
language for solving a given problem.
3. Acquire programming skills in scripting language

UNIT - I
Introduction: Ruby, Rails, The Structure and Execution of Ruby Programs, Package Management with RubyGems,
Ruby and Web: Writing CGI scripts, cookies, Choice of Webservers, SOAP and web services. RubyTk – Simple Tk
Application, widgets, Binding events, Canvas, scrolling.

UNIT - II
Extending Ruby: Ruby Objects in C, the Jukebox extension, Memory allocation, Ruby Type System, Embedding
Ruby to Other Languages, Embedding a Ruby Interpreter.

UNIT - III
Introduction to PERL and Scripting: Scripts and Programs, Origin of Scripting, Scripting Today, Characteristics
of Scripting Languages, Uses for Scripting Languages, Web Scripting, and the Universe of Scripting Languages.
PERL – Names and Values, Variables, Scalar Expressions, Control Structures, arrays, lists, hashes, strings, patterns
and regular expressions, subroutines.

UNIT - IV
Advanced Perl: Finer points of looping, pack and unpack, filesystem, eval, data structures, packages, modules, objects,
interfacing to the operating system, Creating Internet-aware applications, Dirty Hands Internet Programming, security
issues.

UNIT - V
TCL: Structure, syntax, Variables and Data in TCL, Control Flow, Data Structures, input/output, procedures, strings,
patterns, files. Advanced TCL – eval, source, exec and uplevel commands, Namespaces, trapping errors, event-driven
programs, making applications Internet-aware, Nuts and Bolts Internet Programming, Security Issues, C Interface.
Tk – Visual Tool Kits, Fundamental Concepts of Tk, Tk by example, Events and Binding, Perl-Tk.

TEXT BOOKS:
1. The World of Scripting Languages, David Barron, Wiley Publications.
2. The Ruby Programming Language, David Flanagan and Yukihiro Matsumoto, O’Reilly.
3. “Programming Ruby: The Pragmatic Programmers’ Guide”, Dave Thomas, Second Edition.

REFERENCE BOOKS:
1. Open Source Web Development with LAMP: Using Linux, Apache, MySQL, Perl and PHP, J. Lee and
B. Ware, (Addison Wesley) Pearson Education.
2. Perl by Example, E. Quigley, Pearson Education.
3. Programming Perl, Larry Wall, T. Christiansen and J. Orwant, O’Reilly, SPD.
4. Tcl and the Tk Tool kit, Ousterhout, Pearson Education.
5. Perl Power, J. P. Flynt, Cengage Learning.

MOBILE APPLICATION DEVELOPMENT
(PROFESSIONAL ELECTIVE-III)

III-B.Tech II-Semester L T P C
Course Code: A1DS608PE 3 0 0 3

PREREQUISITES:
1. Acquaintance with JAVA programming
2. A Course on DBMS

COURSE OBJECTIVES:
1. To demonstrate their understanding of the fundamentals of Android operating systems
2. To improve their skills of using Android software development tools
3. To demonstrate their ability to develop software with reasonable complexity on mobile platform
4. To demonstrate their ability to deploy software to mobile devices
5. To demonstrate their ability to debug programs running on mobile devices

COURSE OUTCOMES:
1. Student will understand the practical working of the Android OS.
2. Student will be able to develop Android user interfaces
3. Student will be able to develop, deploy and maintain the Android Applications.

UNIT - I
Introduction to Android Operating System: Android OS design and Features – Android development framework,
SDK features, Installing and running applications on Android Studio, Creating AVDs, Types of Android applications,
Best practices in Android programming, Android tools
Android Application Components – Android Manifest file, Externalizing resources like values, themes, layouts,
menus etc., Resources for different devices and languages, Runtime Configuration Changes
Android Application Lifecycle – Activities, Activity lifecycle, activity states, monitoring state changes

UNIT - II
Android User Interface: Measurements – Device and pixel density independent measuring units
Layouts – Linear, Relative, Grid and Table Layouts
User Interface (UI) Components – Editable and non-editable TextViews, Buttons, Radio and Toggle Buttons,
Checkboxes, Spinners, Dialogs and pickers
Event Handling – Handling clicks or changes of various UI components
Fragments – Creating fragments, Lifecycle of fragments, Fragment states, Adding fragments to Activity, adding,
removing and replacing fragments with fragment transactions, interfacing between fragments and Activities,
Multi-screen Activities

UNIT - III
Intents and Broadcasts: Intent – Using intents to launch Activities, Explicitly starting new Activity, Implicit Intents,
Passing data to Intents, Getting results from Activities, Native Actions, using Intent to dial a number or to send SMS
Broadcast Receivers – Using Intent filters to service implicit Intents, Resolving Intent filters, finding and using Intents
received within an Activity
Notifications – Creating and Displaying notifications, Displaying Toasts

UNIT - IV
Persistent Storage: Files – Using application-specific folders and files, creating files, reading data from files, listing
contents of a directory
Shared Preferences – Creating shared preferences, saving and retrieving data using Shared
Preferences.

UNIT - V
Database – Introduction to SQLite database, creating and opening a database, creating tables, inserting, retrieving and
deleting data, Registering Content Providers, Using Content Providers (insert, delete, retrieve and update)

TEXT BOOKS:
1. Professional Android 4 Application Development, Reto Meier, Wiley India, (Wrox), 2012
2. Android Application Development for Java Programmers, James C Sheusi, Cengage Learning, 2013

REFERENCE BOOK:
1. Beginning Android 4 Application Development, Wei-Meng Lee, Wiley India (Wrox), 2013

DATA HANDLING AND VISUALIZATION
(PROFESSIONAL ELECTIVE-III)

III-B.Tech II-Semester L T P C
Course Code: A1DS609PE 3 0 0 3
OBJECTIVES:
The course should enable the students to learn:

1. To introduce and provide some core and necessary data visualization techniques so that students understand how
to work with large data sets and apply the appropriate data visualization technique to answer business
questions

COURSE OUTCOMES:
After completion of the course, students will be able to:
1. Understand basics of Data Visualization
2. Implement visualization of distributions
3. Write programs on visualization of time series, proportions & associations
4. Apply visualization on Trends and uncertainty
5. Explain principles of proportions

UNIT – I INTRODUCTION TO VISUALIZATION


Visualizing Data-Mapping Data onto Aesthetics, Aesthetics and Types of Data, Scales Map Data Values onto
Aesthetics, Coordinate Systems and Axes- Cartesian Coordinates, Nonlinear Axes, Coordinate Systems with Curved
Axes, Color Scales-Color as a Tool to Distinguish, Color to Represent Data Values, Color as a Tool to Highlight,
Directory of Visualizations- Amounts, Distributions, Proportions, x–y relationships, Geospatial Data

UNIT – II VISUALIZING DISTRIBUTIONS


Visualizing Amounts-Bar Plots, Grouped and Stacked Bars, Dot Plots and Heatmaps, Visualizing Distributions:
Histograms and Density Plots- Visualizing a Single Distribution, Visualizing Multiple Distributions at the Same
Time, Visualizing Distributions: Empirical Cumulative Distribution Functions and Q-Q Plots-Empirical Cumulative
Distribution Functions, Highly Skewed Distributions, Quantile-Quantile Plots, Visualizing Many Distributions at
Once-Visualizing Distributions Along the Vertical Axis, Visualizing Distributions Along the Horizontal Axis
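
As an illustration of the empirical cumulative distribution function named in this unit, the following minimal Python sketch pairs each sorted observation with its rank divided by the sample size. The function name ecdf and the sample values are assumptions made for this example and are not part of the syllabus; tied values are not merged.

```python
def ecdf(values):
    """Pair each sorted observation with its rank divided by the sample size."""
    xs = sorted(values)
    n = len(xs)
    # The i-th smallest observation (1-based) gets cumulative fraction i / n.
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Hypothetical sample, used only to show the shape of the result.
sample = [3.1, 1.2, 4.8, 2.0, 2.9]
points = ecdf(sample)
for x, fraction in points:
    print(x, fraction)
```

Plotting these pairs as a step function gives the ECDF plot; no binning choice is needed, unlike a histogram.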

UNIT – III VISUALIZING ASSOCIATIONS & TIME SERIES


Visualizing Proportions-A Case for Pie Charts, A Case for Side-by-Side Bars, A Case for Stacked Bars and Stacked
Densities, Visualizing Proportions Separately as Parts of the Total, Visualizing Nested Proportions- Nested
Proportions Gone Wrong, Mosaic Plots and Treemaps, Nested Pies, Parallel Sets. Visualizing Associations Among
Two or More Quantitative Variables-Scatterplots, Correlograms, Dimension Reduction, Paired Data. Visualizing
Time Series and Other Functions of an Independent Variable-Individual Time Series, Multiple Time Series and
Dose–Response Curves, Time Series of Two or More Response Variables

UNIT – IV VISUALIZING UNCERTAINTY


Visualizing Trends-Smoothing, Showing Trends with a Defined Functional Form, Detrending and Time-Series
Decomposition, Visualizing Geospatial Data-Projections, Layers, Choropleth Mapping, Cartograms, Visualizing
Uncertainty-Framing Probabilities as Frequencies, Visualizing the Uncertainty of Point Estimates, Visualizing the
Uncertainty of Curve Fits, Hypothetical Outcome Plots

UNIT – V PRINCIPLE OF PROPORTIONAL INK
The Principle of Proportional Ink-Visualizations Along Linear Axes, Visualizations Along Logarithmic Axes,
Direct Area Visualizations, Handling Overlapping Points-Partial Transparency and Jittering, 2D Histograms,
Contour Lines, Common Pitfalls of Color Use-Encoding Too Much or Irrelevant Information, Using
Nonmonotonic Color Scales to Encode Data Values, Not Designing for Color-Vision Deficiency

TEXT BOOKS
1. Claus Wilke, “Fundamentals of Data Visualization: A Primer on Making Informative and Compelling
Figures”, 1st edition, O’Reilly Media Inc, 2019.

REFERENCE BOOKS:
1. Tony Fischetti, Brett Lantz, R: Data Analysis and Visualization, O’Reilly, 2016
2. Ossama Embarak, Data Analysis and Visualization Using Python: Analyze Data to Create
Visualizations for BI Systems, Apress, 2018

PREDICTIVE ANALYTICS
(OPEN ELECTIVE-I)

III-B.Tech II-Semester L T P C
Course Code: A1DS601OE 3 0 0 3

COURSE OBJECTIVES:
The course should enable the students to learn:
1. To introduce the terminology, technology and its applications
2. To introduce the concept of predictive analysis
3. To introduce linear regression, time series concepts
4. To introduce the tools, technologies and programming languages which are used in the day-to-day analytics cycle.

COURSE OUTCOMES:
At the end of the course, student will be able to:
1. Identify basic terminology of Data Analytics
2. Compare learning with classification
3. Develop knowledge, skills and competences using tools and training
4. Analyze the importance of Analytics in business perspective.

UNIT-I
INTRODUCTION TO PREDICTIVE ANALYTICS & LINEAR REGRESSION (NOS 2101): What and
Why Analytics, Introduction to Tools and Environment, Application of Modelling in Business, Databases & Types
of data and variables, Data Modelling Techniques, Missing-value imputation etc. Need for Business Modelling,
Regression: Concepts, BLUE property, assumptions, Least Squares Estimation, Variable Rationalization, and Model
Building etc.
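
The least squares estimation listed in this unit can be sketched for the one-predictor case as follows. This is a minimal Python illustration, not course material: the function name fit_line and the data points are assumptions chosen so that the fitted line is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

Because the sample points lie exactly on a line, the residuals are zero here; with real data the same formulas minimise the sum of squared residuals.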

UNIT-II
LOGISTIC REGRESSION (NOS 2101): Model Theory, Model fit Statistics, Model Conclusion, Analytics
applications to various Business Domains etc.
Regression Vs Segmentation --- Supervised and Unsupervised Learning, Tree Building --- Regression,
Classification, Overfitting, Pruning and Complexity, Multiple Decision Trees etc.

UNIT-III
DEVELOP KNOWLEDGE, SKILL AND COMPETENCES (NOS 9005): Introduction to Knowledge skills &
competences, Training & Development, Learning & Development, Policies and Record keeping, etc.

UNIT-IV
TIME SERIES METHODS / FORECASTING, FEATURE EXTRACTION (NOS 2101): ARIMA, Measures
of Forecast Accuracy, STL approach, Extracting features from the generated model such as Height, Average, Energy etc.,
and analyzing them for prediction.
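
Three commonly used measures of forecast accuracy (mean absolute error, root mean squared error and mean absolute percentage error) can be computed as in the sketch below. The syllabus does not say which measures are covered, so this selection, the function name forecast_errors and the numbers are assumptions for illustration only.

```python
import math


def forecast_errors(actual, forecast):
    """Return (MAE, RMSE, MAPE in percent) for paired actual/forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when an actual value is zero; the sample avoids that.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return mae, rmse, mape


# Hypothetical actual and forecast values.
mae, rmse, mape = forecast_errors([100, 200, 400], [110, 190, 400])
print(mae, rmse, mape)
```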

UNIT-V
WORKING WITH DOCUMENTS (NOS 0703): Standard Operating Procedures for documentation and
Knowledge sharing, Defining purpose and scope documents, Understanding structure of documents – case studies,
articles, white papers, technical reports, minutes of meeting etc., Style and format, Intellectual Property and
Copyright, Document preparation tools – Visio, PowerPoint, Word, Excel etc., Version Control, Accessing and
updating corporate knowledge base, Peer review and feedback.

TEXT BOOKS:
1. Student’s Handbook for Associate Analytics-III.

REFERENCE BOOKS:
1. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning with
Applications in R

VIDEO ANALYTICS
(OPEN ELECTIVE-I)

III-B.Tech II-Semester L T P C
Course Code: A1DS602OE 3 0 0 3
COURSE OBJECTIVES:
To acquire the knowledge of extracting information from surveillance videos, understand the models used for
recognition of objects, humans in videos and perform gait analysis.

COURSE OUTCOMES:
At the end of the course, student will be able to:
1. Understand the basics of video signals and systems.
2. Estimate motion in a video.
3. Detect objects and track them.
4. Recognize activity and analyze behaviour.
5. Evaluate face recognition technologies.

UNIT-I
INTRODUCTION Multidimensional signals and systems: signals, transforms, systems, sampling theorem. Digital
Images and Video: human visual system and color, digital video, 3D video, digital-video applications, image and video
quality.

UNIT-II
MOTION ESTIMATION Image formation, motion models, 2D apparent motion estimation, differential methods,
matching methods, non-linear optimization methods, transform domain methods, 3D motion and structure estimation.

UNIT-III
VIDEO ANALYTICS Introduction- Video Basics - Fundamentals for Video Surveillance- Scene Artifacts- Object
Detection and Tracking: Adaptive Background Modelling and Subtraction- Pedestrian Detection and Tracking- Vehicle
Detection and Tracking- Articulated Human Motion Tracking in Low-Dimensional Latent Spaces

UNIT-IV
BEHAVIORAL ANALYSIS & ACTIVITY RECOGNITION Event Modelling- Behavioral Analysis- Human Activity
Recognition- Complex Activity Recognition- Activity modelling using 3D shape, Video summarization, shape-based
activity models- Suspicious Activity Detection.

UNIT-V
HUMAN FACE RECOGNITION & GAIT ANALYSIS Introduction: Overview of Recognition algorithms – Human
Recognition using Face: Face Recognition from still images, Face Recognition from video, Evaluation of Face
Recognition Technologies- Human Recognition using gait: HMM Framework for Gait Recognition, View Invariant
Gait Recognition, Role of Shape and Dynamics in Gait Recognition.

TEXT BOOKS:
1. Murat Tekalp, “Digital Video Processing”, second edition, Pearson, 2015
2. Rama Chellappa, Amit K. Roy-Chowdhury, Kevin Zhou. S, “Recognition of Humans and their Activities using
Video”, Morgan & Claypool Publishers, 2005.
3. Yunqian Ma, Gang Qian, “Intelligent Video Surveillance: Systems and Technology”, CRC Press (Taylor and
Francis Group), 2009.

REFERENCE BOOKS:
1. Richard Szeliski, “Computer Vision: Algorithms and Applications”, Springer, 2011.
2. Yao Wang, Jörn Ostermann and Ya-Qin Zhang, “Video Processing and Communications”, Prentice Hall, 2001.
3. Thierry Bouwmans, Fatih Porikli, Benjamin Höferlin and Antoine Vacavant, “Background Modeling and
Foreground Detection for Video Surveillance: Traditional and Recent Approaches, Implementations,
Benchmarking and Evaluation”, CRC Press, Taylor and Francis Group, 2014.
4. Md. Atiqur Rahman Ahad, “Computer Vision and Action Recognition-A Guide for Image Processing and
Computer Vision Community for Action Understanding”, Atlantis Press, 2011

BIG DATA ANALYTICS LAB

III-B.Tech II-Semester L T P C
Course Code: A1DS604PC 0 0 3 1.5
COURSE OBJECTIVES:
The course should enable the students to learn:
1. Outline the importance of Big Data Analytics
2. Apply statistical techniques for Big Data Analytics.
3. Analyze problems appropriate to mining data streams.
4. Apply the knowledge of clustering techniques in data mining.
5. Use Graph Analytics for Big Data and provide solutions
6. Apply Hadoop MapReduce programming for handling Big Data

COURSE OUTCOMES:
At the end of the course the students are able to:
1. Identify Big Data and its Business Implications.
2. List the components of Hadoop and Hadoop Eco-System
3. Access and Process Data on Distributed File System
4. Manage Job Execution in Hadoop Environment
5. Develop Big Data Solutions using Hadoop Eco System
6. Analyze Infosphere BigInsights Big Data Recommendations.
7. Apply Machine Learning Techniques using R.

LIST OF EXPERIMENTS:
1. Study of R Programming.
2. Hypothesis Test using R
3. K-means Clustering using R
4. Naïve Bayesian Classifier
5. Implementation of Linear Regression
7. Implement Logistic Regression
8. Time-series Analysis
9. Association Rules using R.
10. Data Analysis-Visualization using R.
11. MapReduce using Hadoop
12. In-database Analytics
13. Implementation of Queries using MongoDB
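
The MapReduce experiment in this list can be previewed without a cluster. The sketch below imitates the map, shuffle and reduce phases of a word count in plain Python; it does not use the Hadoop API, and the function names and input lines are assumptions made for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine the grouped values for one key into a single count."""
    return key, sum(values)


# Hypothetical input split into lines.
lines = ["big data big ideas", "data streams"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

In Hadoop the same three roles are played by the Mapper class, the framework's shuffle and sort step, and the Reducer class, with the data distributed across nodes.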


DATA WAREHOUSING AND DATA MINING LAB

III-B.Tech II-Semester L T P C
Course Code: A1DS605PC 0 0 3 1.5

COURSE OBJECTIVES:
1. To obtain practical experience using data mining techniques on real world data sets
2. Emphasize hands-on experience working with all real data sets.

COURSE OUTCOMES:
1. Ability to add mining algorithms as a component to the existing tools
2. Ability to apply mining techniques for realistic data.

LIST OF SAMPLE PROBLEMS:


Task 1: Credit Risk Assessment
Description:
The business of banks is making loans. Assessing the credit worthiness of an applicant is of crucial importance. You
have to develop a system to help a loan officer decide whether the credit of a customer is good or bad. A bank's
business rules regarding loans must consider two opposing factors. On the one hand, a bank wants to make as many
loans as possible. Interest on these loans is the bank's profit source. On the other hand, a bank cannot afford to make
too many bad loans. Too many bad loans could lead to the collapse of the bank. The bank's loan policy must involve
a compromise: not too strict, and not too lenient. To do the assignment, you first and foremost need some knowledge
about the world of credit. You can acquire such knowledge in a number of ways.
1. Knowledge Engineering. Find a loan officer who is willing to talk. Interview her and try to represent her
knowledge in the form of production rules.
2. Books. Find some training manuals for loan officers or perhaps a suitable textbook on finance.
Translate this knowledge from text form to production rule form.
3. Common sense. Imagine yourself as a loan officer and make up reasonable rules which can be used to judge
the credit worthiness of a loan applicant.
4. Case histories. Find records of actual cases where competent loan officers correctly judged when, and when
not to, approve a loan application.

The German Credit Data:


Actual historical credit data is not always easy to come by because of confidentiality rules.
Here is one such dataset, consisting of 1000 actual cases collected in Germany. It is available as the original credit
dataset and as an Excel spreadsheet version of the German credit data.
In spite of the fact that the data is German, you should probably make use of it for this assignment (unless you
really can consult a real loan officer).

A few notes on the German dataset:


1. DM stands for Deutsche Mark, the unit of currency, worth about 90 cents Canadian (but looks and acts
like a quarter).
2. Owns telephone. German phone rates are much higher than in Canada so fewer people own telephones.
3. Foreign worker. There are millions of these in Germany (many from Turkey). It is very hard to get
German citizenship if you were not born of German parents.
4. There are 20 attributes used in judging a loan applicant. The goal is to classify the applicant into one of two
categories, good or bad.

Subtasks: (Turn in your answers to the following tasks)
1. List all the categorical (or nominal) attributes and the real-valued attributes separately. (5 marks)
2. What attributes do you think might be crucial in making the credit assessment? Come up with some
simple rules in plain English using your selected attributes. (5 marks)
3. One type of model that you can create is a Decision Tree - train a Decision Tree using the complete
dataset as the training data. Report the model obtained after training. (10 marks)
4. Suppose you use your above model trained on the complete dataset, and classify credit good/bad for
each of the examples in the dataset. What % of examples can you classify correctly? (This is also called
testing on the training set) Why do you think you cannot get 100 % training accuracy? (10 marks)
5. Is testing on the training set as you did above a good idea? Why or why not? (10 marks)
6. One approach for solving the problem encountered in the previous question is using cross-validation.
Briefly describe what cross-validation is. Train a Decision Tree again using cross-validation and report
your results. Does your accuracy increase/decrease? Why? (10 marks)
7. Check to see if the data shows a bias against "foreign workers" (attribute 20), or "personal-status"
(attribute 9). One way to do this (perhaps rather simple minded) is to remove these attributes from the
dataset and see if the decision tree created in those cases is significantly different from the full dataset
case which you have already done. To remove an attribute, you can use the preprocess tab in Weka's GUI
Explorer. Did removing these attributes have any significant effect? Discuss. (10 marks)

8. Another question might be, do you really need to input so many attributes to get good results? Maybe
only a few would do. For example, you could try just having attributes 2, 3, 5, 7, 10, 17 (and 21, the
class attribute (naturally)). Try out some combinations. (You had removed two attributes in problem
7. Remember to reload the arff data file to get all the attributes initially before you start selecting the
ones you want.) (10 marks)
9. Sometimes, the cost of rejecting an applicant who actually has good credit (case 1) might be higher
than accepting an applicant who has bad credit (case 2). Instead of counting the misclassifications
equally in both cases, give a higher cost to the first case (say cost 5) and lower cost to the second case.
You can do this by using a cost matrix in Weka. Train your Decision Tree again and report the
Decision Tree and cross validation results. Are they significantly different from results obtained in
problem 6 (using equal cost)? (10 marks)
10. Do you think it is a good idea to prefer simple decision trees instead of having long complex decision
trees? How does the complexity of a Decision Tree relate to the bias of the model? (10 marks)
11. You can make your Decision Trees simpler by pruning the nodes. One approach is to use Reduced
Error Pruning. Explain this idea briefly. Try reduced error pruning for training your Decision Trees
using cross-validation (you can do this in Weka) and report the Decision Tree you obtain. Also, report
your accuracy using the pruned model. Does your accuracy increase? (10 marks)
12. (Extra Credit): How can you convert a Decision Tree into "if-then-else rules"? Make up your own
small Decision Tree consisting of 2-3 levels and convert it into a set of rules. There also exist different
classifiers that output the model in the form of rules. One such classifier in Weka is rules.PART; train
this model and report the set of rules obtained. Sometimes just one attribute can be good enough in
making the decision, yes, just one! Can you predict what attribute that might be in this dataset? The OneR
classifier uses a single attribute to make decisions (it chooses the attribute based on minimum error).
Report the rule obtained by training a OneR classifier. Rank the performance of J48, PART and OneR.
(10 marks)
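
To make the single-attribute idea behind OneR concrete, here is a minimal Python sketch for purely categorical data. It is not Weka's implementation (which also handles numeric attributes and missing values); the function name one_r and the six toy rows are assumptions invented for this illustration and are not taken from the German credit dataset.

```python
from collections import Counter, defaultdict


def one_r(rows, class_index):
    """Return (attribute index, value-to-class rule, training errors) of the best attribute."""
    best = None
    for attr in range(len(rows[0])):
        if attr == class_index:
            continue
        # Count class labels for every value of this attribute.
        by_value = defaultdict(Counter)
        for row in rows:
            by_value[row[attr]][row[class_index]] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[class_index])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical rows: (credit_history, housing, class).
rows = [
    ("good", "own", "good"),
    ("good", "rent", "good"),
    ("bad", "own", "bad"),
    ("bad", "rent", "bad"),
    ("good", "own", "good"),
    ("bad", "own", "good"),
]
attr, rule, errors = one_r(rows, class_index=2)
print(attr, rule, errors)
```

On these toy rows the first attribute wins because it misclassifies one row, while the second misclassifies two. Note that the error counted here is training error, which is exactly the issue raised in subtasks 4 to 6.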

Task Resources:
2. Mentor lecture on Decision Trees
3. Andrew Moore's Data Mining Tutorials (See tutorials on Decision Trees and Cross Validation)
4. Decision Trees (Source: Tan, MSU)
5. Tom Mitchell's book slides (See slides on Concept Learning and Decision Trees)
6. Weka resources:
- Introduction to Weka (html version) (download ppt version)
- Download Weka
- Weka Tutorial
- ARFF format
- Using Weka from command line

Task 2: Hospital Management System


A data warehouse consists of dimension tables and fact tables. Remember the following about dimensions.
The dimension objects (Dimension):
- Name
- Attributes (Levels), with one primary key
- Hierarchies
One time dimension is a must.
About Levels and Hierarchies
Dimension objects (dimension) consist of a set of levels and a set of hierarchies defined over those levels. The levels
represent levels of aggregation. Hierarchies describe parent-child relationships among a set of levels.
For example, a typical calendar dimension could contain five levels. Two hierarchies can be defined on these levels:
H1: YearL > QuarterL > MonthL > WeekL > DayL
H2: YearL > WeekL > DayL
The hierarchies are described from parent to child, so that Year is the parent of Quarter, Quarter the parent of Month,
and so forth.
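
The two calendar hierarchies above can be modelled as ordered lists of levels, parent first. The following Python sketch is only an illustration of the parent-child idea; the dictionary HIERARCHIES and the function parent_level are hypothetical names, not part of Warehouse Builder.

```python
# Levels are listed from parent to child, exactly as H1 and H2 are written above.
HIERARCHIES = {
    "H1": ["YearL", "QuarterL", "MonthL", "WeekL", "DayL"],
    "H2": ["YearL", "WeekL", "DayL"],
}


def parent_level(hierarchy, level):
    """Return the parent of a level within one hierarchy, or None for the top level."""
    levels = HIERARCHIES[hierarchy]
    position = levels.index(level)
    return levels[position - 1] if position > 0 else None


print(parent_level("H1", "MonthL"))
print(parent_level("H2", "WeekL"))
```

The same level can have different parents in different hierarchies: WeekL rolls up to MonthL in H1 but directly to YearL in H2.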
About Unique Key Constraints
When you create a definition for a hierarchy, Warehouse Builder creates an identifier key for each level of the
hierarchy and a unique key constraint on the lowest level (Base Level).
Design a Hospital Management system data warehouse (TARGET) consisting of the dimensions Patient, Medicine,
Supplier and Time. The measures are ‘NO UNITS’ and ‘UNIT PRICE’. Assume the relational database (SOURCE) table
schemas as follows:
TIME (day, month, year)
PATIENT (patient_name, Age, Address, etc.)
MEDICINE (Medicine_Brand_name, Drug_name, Supplier, no_units, Unit_Price, etc.)
SUPPLIER (Supplier_name, Medicine_Brand_name, Address, etc.)
If each dimension has 6 levels, decide the levels and hierarchies. Assume the level names suitably.
Design the Hospital Management system data warehouse using all schemas. Give an example 4-D cube with
assumed names.
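
As a starting point for the 4-D cube, the sketch below stores each fact as a key over the four dimensions (patient, medicine, supplier, day) together with its measures, and rolls the no_units measure up to any subset of dimensions. All identifiers and values (P1, M1, S1, the dates, the function roll_up) are hypothetical placeholders, and the structure is one possible design rather than the expected answer.

```python
from collections import defaultdict

# Each fact: ((patient, medicine, supplier, day), measures). Values are made up.
facts = [
    (("P1", "M1", "S1", "2024-01-05"), {"no_units": 2, "unit_price": 10.0}),
    (("P2", "M1", "S1", "2024-01-05"), {"no_units": 3, "unit_price": 10.0}),
    (("P1", "M2", "S2", "2024-01-06"), {"no_units": 1, "unit_price": 25.0}),
]


def roll_up(facts, keep):
    """Sum no_units over every dimension except those at the positions in keep."""
    totals = defaultdict(int)
    for key, measures in facts:
        totals[tuple(key[i] for i in keep)] += measures["no_units"]
    return dict(totals)


units_per_medicine = roll_up(facts, keep=[1])      # keep only the medicine dimension
units_per_day_supplier = roll_up(facts, keep=[3, 2])
print(units_per_medicine)
print(units_per_day_supplier)
```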

ESSENCE OF INDIAN TRADITIONAL KNOWLEDGE

III-B.Tech II-Semester L T P C
Course Code: A1DS607MC 2 0 0 0

COURSE OBJECTIVES:
The course should enable the students to learn:
To familiarize the students with the concepts of Indian traditional knowledge and to make them understand the
importance of the roots of the knowledge system.

COURSE OUTCOMES:
At the end of the course, the student will be able to:
1. Understand the concept of Traditional knowledge and its importance
2. Know the need and importance of protecting traditional knowledge.
3. Know the various enactments related to the protection of traditional knowledge.
4. Understand the concepts of Intellectual property to protect the traditional knowledge.

UNIT-I
Introduction to traditional knowledge: Define traditional knowledge, nature and characteristics, scope and
importance, kinds of traditional knowledge, the physical and social contexts in which traditional knowledge
develop, the historical impact of social change on traditional knowledge systems. Indigenous Knowledge (IK),
characteristics, traditional knowledge vis-à-vis indigenous knowledge, traditional knowledge Vs western
knowledge, traditional knowledge vis-à-vis formal knowledge

UNIT-II
Protection of traditional knowledge: the need for protecting traditional knowledge, Significance of TK
Protection, value of TK in global economy, Role of Government to harness TK.

UNIT-III
Legal framework and TK: A: The Scheduled Tribes and Other Traditional Forest Dwellers (Recognition of
Forest Rights) Act, 2006, Protection of Plant Varieties and Farmers' Rights Act, 2001 (PPVFR Act); B: The
Biological Diversity Act, 2002 and Rules, 2004, the Protection of Traditional Knowledge Bill, 2016, Geographical
Indications Act, 2003.

UNIT-IV
Traditional knowledge and intellectual property: Systems of traditional knowledge protection, Legal concepts
for the protection of traditional knowledge, Certain non IPR mechanisms of traditional knowledge protection,
Patents and traditional knowledge, Strategies to increase protection of traditional knowledge, global legal fora
for increasing protection of Indian Traditional Knowledge.

UNIT-V
Traditional knowledge in different sectors: Traditional knowledge and engineering, Traditional medicine system,
TK and biotechnology, TK in agriculture, Traditional societies depend on it for their food and healthcare needs,
Importance of conservation and sustainable development of environment, Management of biodiversity, Food
security of the country and protection of TK.

TEXT BOOKS:
1. Traditional Knowledge System in India, by Amit Jha, 2009.
2. Traditional Knowledge System and Technology in India by Basanta Kumar Mohanta and Vipin
Kumar Singh, Pratibha Prakashan 2012.

REFERENCE BOOKS:
1. Traditional Knowledge System in India by Amit Jha Atlantic publishers, 2002
2. "Knowledge Traditions and Practices of India", Kapil Kapoor, Michel Danino

E-RESOURCES:
1. https://www.youtube.com/watch?v=LZP1StpYEPM
2. http://nptel.ac.in/courses/121106003
