Data Science Curriculum

Uploaded by

meetp.keystone

WE DON'T JUST TRAIN WE TRANSFORM CAREERS

DATA SCIENCE CURRICULUM

1. Python Programming
2. Statistics
3. Python Data Analysis
4. SQL + Power BI
5. Machine Learning
6. Deep Learning
7. Computer Vision
8. NLP

Kukatpally: #205, 2nd Floor, Fortune Signature, Near JNTU Metro Station, Kukatpally, Hyderabad, Telangana 500085.
Gachibowli: 2nd Floor, Leeway, BP Raju Marg, Opp. Sarath City Capital Mall, Laxmi Cyber City, Whitefields, Kondapur, Telangana 500081.

Authorized IBM Partner
[Award badges: Most Trusted Indian Company; Best Data Science in India; Leading EdTech Company; Best EdTech in 2023; Data Science Institution]
In partnership with: A MeitY - NASSCOM Digital Skilling Initiative

350+ Batches
10,000+ Career Transformations
Learning Mode: Online & Offline
500+ Hiring Partners

Table of Contents

01 Course Objective, Key Features In The Training
02 Introduction & Walk Through The Course
03 Module 1 - Python Core & Advanced
04 Module 2 - Data Analysis In Python
05 Module 3 - Advanced Statistics
06 Module 4 - Data Base (SQL) + Reporting Tool (Power BI)
07 Module 5 - Machine Learning - Supervised & Un-Supervised Learning
08 Module 6 - Deep Learning
09 Module 7 - CNN & Computer Vision
10 Module 8 - Natural Language Processing

FOLLOW US ON
Instagram, Facebook, LinkedIn, YouTube, Website
www.innomatics.in
+91 9951666670

COURSE OBJECTIVE

► To understand the vital nature of data for organizations.
► To learn the conceptual framework of machine learning.
► To explore and analyze data using supervised and unsupervised learning techniques.
► To develop and deploy machine learning models using Python.
► To work with unstructured data such as text, processing it with NLTK and building models.
► To understand neural networks, build deep networks using TensorFlow and Keras, and work with image processing and computer vision.

KEY FEATURES IN THE TRAINING

► Duration: 6 Months
► Class Duration: 2 hrs (Monday to Friday)
► Online help with doubt clearance, monitoring sessions, career guidance, interview preparation and mock interviews.
► Use cases covered: Python and Statistics - 4, Machine Learning - 10, NLP - 2, DL - 3.
► One big hackathon challenge on Machine Learning.
► Projects:
  ► Python: Data analysis project
  ► Machine Learning: Regression, Classification
  ► NLP: Sentiment Analysis / Chatbot
  ► Deep Learning: Face Recognition
► In addition: topic-wise assignments and quizzes for each module (Python, Statistics, Machine Learning, NLP, and Deep Learning + Computer Vision).
► IBM credentials and certification after completion of the course.
► Guaranteed in-house internship.
► Work on nearly 20 use cases during the course.
► Training materials are provided with lab exercises, data sets, code, quizzes and case studies on real data.
► A video recording and live running notes are provided for every online session.
► Real-time training with live scenarios and applications.
► Job assistance after completion of the course.

INTRODUCTION & WALK THROUGH THE COURSE

Introduction To Data Science


Data Science is a multidisciplinary field that combines techniques from statistics,
mathematics, computer science, and domain-specific knowledge to extract valuable insights
and knowledge from data. It involves the use of various methods, algorithms, and systems to
analyze and interpret complex data sets.

In this introductory section, we'll explore the fundamental concepts of data science, including
its origins, key principles, and the role it plays in solving real-world problems. We'll delve into
the importance of data-driven decision-making and how data science contributes to
innovation across various domains.

Life Cycle of Data Science


The data science life cycle involves stages from data collection to model deployment and
ongoing monitoring, ensuring effective project management.

Skills Required for Data Science


Essential skills include statistical analysis, programming proficiency (Python, R), machine
learning, data wrangling, data visualization, domain knowledge, and strong communication
skills.

Applications of Data Science


The data science life cycle involves stages from data collection to model deployment and
ongoing monitoring, ensuring effective project management.

MODULE 1
PYTHON PROGRAMMING AND FLASK FRAMEWORK

Introduction
► What is Python?
► Why does Data Science require Python?
► Installation of Anaconda
► Understanding Jupyter Notebook (IDE)
► Basic commands in Jupyter Notebook
► Understanding Python Syntax
► Identifiers and Operators

Data Types & Data Structures


► Variables, Data Types, and Strings
► Lists, Sets, Tuples and Dictionaries

Control Flow & Conditional Statements


► Conditional Operators, Arithmetic Operators and Logical Operators
► if, elif and else Statements
► While Loops and control flow
► For Loops and nested loops
► pass, break and continue
► List and Dictionary Comprehensions

Functions and Modules


► What is a function, and types of functions
► Code optimization and argument functions
► Scope
► Lambda Functions
► Map, Filter and Reduce
► Importing a Module
► Using help() and dir()
► Aliasing or Renaming
► Some Important Modules in Python:
math module, random module, datetime and os module
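
As a small illustration of the Lambda, Map, Filter and Reduce topics listed above, here is a minimal sketch. The list of numbers and the variable names are made up for the example.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# Lambda: a small anonymous function; map applies it to every element.
squares = list(map(lambda x: x * x, numbers))

# filter keeps only the elements for which the predicate returns True.
even_squares = list(filter(lambda x: x % 2 == 0, squares))

# reduce folds the sequence into a single value (here, a running sum).
total = reduce(lambda acc, x: acc + x, even_squares, 0)

print(squares)       # [1, 4, 9, 16, 25, 36]
print(even_squares)  # [4, 16, 36]
print(total)         # 56
```

In everyday Python, a list comprehension or the built-in sum() often reads more clearly than map, filter and reduce; the sketch only shows how the three functions fit together.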

File Handling
► Create, Read, Write files and Operations in File Handling
► Errors and Exception Handling

Class and Objects


► Create a class
► Create an object
► The __init__() method
► Modifying Objects
► Object Methods
► Self
► Modify the Object Properties
► Delete Object
► Pass Statements

MODULE 2
DATA ANALYSIS IN PYTHON

Numpy - Numerical Python


► Introduction to Array
► Creation and Printing of array
► Basic Operations in Numpy
► Indexing
► Mathematical Functions of Numpy

Data Manipulation with Pandas


► Series and DataFrames
► Data Importing and Exporting through Excel, CSV Files
► Data Understanding Operations
► Indexing, Slicing and Filtering with Conditional Slicing
► Groupby, Pivot table and Cross Tab
► Concatenating, Merging and Joining
► Descriptive Statistics
► Removing Duplicates
► String Manipulation
► Missing Data Handling

DATA VISUALIZATION

Data Visualization Using Matplotlib And Seaborn


► Introduction to Matplotlib
► Basic Plotting
► Properties of plotting
► About Subplots
► Line plots
► Pie chart and Bar Graph
► Histograms
► Box and Violin Plots
► Scatterplot

Case Study On Exploratory Data Analysis (EDA) & Visualizations


► What is EDA?
► Univariate Analysis
► Bivariate Analysis
► More on Seaborn-based plotting, including pair plots, catplots, heat maps and count plots, along with Matplotlib plots.

UNSTRUCTURED DATA PROCESSING

Regular Expressions
► Structured Data and Unstructured Data
► Literals and Meta Characters
► How to use Regular Expressions with Pandas?
► Inbuilt Methods
► Pattern Matching
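
To make the regular expression topics above concrete, here is a minimal sketch of turning unstructured text into structured values. It uses only Python's built-in re module rather than Pandas, and the sample sentence and patterns are invented for the example.

```python
import re

text = "Contact sales@example.com or support@example.org before 2024-03-15."

# Meta characters: \w matches a word character, + means "one or more".
emails = re.findall(r"[\w.]+@[\w.]+\.\w+", text)

# Capture groups (parentheses) pull out the parts of a date.
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
year, month, day = match.groups()

# re.sub replaces every match; here each digit is masked with '#'.
masked = re.sub(r"\d", "#", text)

print(emails)            # ['sales@example.com', 'support@example.org']
print(year, month, day)  # 2024 03 15
```

The email pattern is deliberately simple and would not cover every valid address; it is only meant to show literals and meta characters working together.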

Project On Web Scraping: Data Mining And Exploratory Data Analysis

► Data Mining (Web Scraping)
This project starts completely from scratch: it involves collecting raw data from different sources and converting the unstructured data into a structured format so that Machine Learning and NLP models can be applied.

► This project covers four main steps of the Data Science Life Cycle:
► Data Collection
► Data Mining
► Data Preprocessing
► Data Visualization
Ex: Text, CSV, TSV, Excel Files, Matrices, Images

MODULE 3
ADVANCED STATISTICS

Data Types in Statistics


► Statistics in Data science:
► What is Statistics?
► How is Statistics used in Data Science?
► Population and Sample
► Parameter and Statistic
► Variable and its types

Data Gathering Techniques


► Data types
► Data Collection Techniques
► Sampling Techniques:
► Convenience Sampling, Simple Random Sampling
► Stratified Sampling, Systematic Sampling and Cluster Sampling

Descriptive Statistics
► What is Univariate and Bivariate Analysis?
► Measures of Central Tendency
► Measures of Dispersion
► Skewness and Kurtosis
► Box Plots and Outliers detection
► Covariance and Correlation

Probability Distribution
► Probability and Limitations
► Discrete Probability Distributions
► Bernoulli, Binomial Distribution, Poisson Distribution
► Continuous Probability Distributions
► Normal Distribution, Standard Normal Distribution

Inferential Statistics
► Sampling variability and Central Limit Theorem
► Confidence Intervals
► Hypothesis Testing
► Z-test, t-test
► Chi-Square Test
► F-Test and ANOVA
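
The confidence interval and Z-test topics above can be sketched with Python's standard library alone. The sample values, the hypothesised mean (50) and the assumed known population standard deviation (3) are all made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test for a mean, assuming the population std (sigma) is known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def confidence_interval(sample, sigma, level=0.95):
    """Confidence interval for the mean, again assuming sigma is known."""
    n = len(sample)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_crit * sigma / sqrt(n)
    centre = mean(sample)
    return centre - margin, centre + margin


sample = [52, 48, 55, 51, 49, 53, 50, 54, 47, 51]
z, p = one_sample_z_test(sample, mu0=50, sigma=3)
low, high = confidence_interval(sample, sigma=3)
print(f"z = {z:.3f}, p = {p:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
```

When the population standard deviation is not known, which is the usual case, a t-test is the appropriate choice instead; the z-test is used here only because it needs nothing beyond the standard library.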

MODULE 4
DATABASE (SQL) + REPORTING TOOL (POWER BI)

SQL for Data Science


► Introduction to Databases
► Basics of SQL
► DML, DDL, DCL and Data Types
► Common SQL commands using SELECT, FROM and WHERE
► Logical Operators in SQL

► Filtering and Sorting


► Advanced filtering using IN, OR and NOT
► Sorting with GROUP BY and ORDER BY

► SQL Joins
► INNER and OUTER joins to combine data from multiple tables
► RIGHT, LEFT joins to combine data from multiple tables

► SQL Aggregations
► Common Aggregations including COUNT, SUM, MIN and MAX
► CASE and DATE functions as well as work with NULL values

► Subqueries and Temp Tables
► Subqueries to run multiple queries together
► Temp tables to access a table with more than one query

► Window Functions
► ROW_NUMBER(), RANK(), DENSE_RANK(), LAG, LEAD, SUM, COUNT, AVG
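
The joins and aggregations listed above can be tried without installing a database server. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the customers and orders tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meena');
INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.0);
""")

# LEFT JOIN keeps customers with no orders; COALESCE turns their NULL sum into 0.
query = """
SELECT c.name,
       COUNT(o.id) AS n_orders,
       COALESCE(SUM(o.amount), 0) AS total
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY total DESC, c.name;
"""
rows = conn.execute(query).fetchall()
conn.close()

for name, n_orders, total in rows:
    print(name, n_orders, total)
```

Swapping LEFT JOIN for INNER JOIN would drop the customer who has no orders, which is a quick way to see the difference between the two join types.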

Introduction To Power BI
► What is Business Intelligence?
► Power BI Introduction
► Quadrant report
► Comparison with other BI tools
► Power BI Desktop overview
► Power BI workflow
► Installation query addressal

Data Import And Visualizations


► Data import options in Power BI
► Import from Web (hands on)
► Why Visualization?
► Visualization types

Data Visualization (Contd.)


► Categorical data visualization
► Trend Data viz
► Visuals for Filtering
► Slicer details and use
► Formatting visuals
► KPI visuals
► Tables and Matrix

Power Queries
► Power Query Introduction
► Data Transformation - its benefits
► Introducing ribbons
► Queries panel
► M Language briefing
► Power BI Datatypes
► Changing Datatypes of columns

Power Queries (Contd.)


► Filtering
► Inbuilt column Transformations
► Inbuilt row Transformations
► Combine Queries
► Merge Queries

Power Pivot And Introduction To Dax


► Power Pivot
► Intro to Data Modelling
► Relationship and Cardinality
► Relationship view
► Calculated Columns vs Measures
► DAX Introduction and Syntax

Data Analysis Expressions


► DAX recap
► DAX logical functions
► DAX text functions
► DAX math and statistical Functions

Data Analysis Expressions (Contd.)


► DAX aggregation functions
► DAX filter functions
► DAX time intelligence functions
► Creating a Date Dimension table
► Related aspects with tables

Login, Publish To Web And RLS


► Power BI services
► Dashboard creation
► Web Content, Image, Text Box, Video
► Dashboard formatting
► Sharing your dashboard
► RLS introduction

Miscellaneous Topics
► Visual Interactions
► Drill Through
► Drilldown
► Conditional Formatting
► Creating buttons in Power BI reports
► Creating Python Script Visuals

MODULE 5
MACHINE LEARNING - SUPERVISED LEARNING

Introduction
► What Is Machine Learning?
► Supervised Versus Unsupervised Learning
► Regression Versus Classification Problems
► Assessing Model Accuracy

Introduction And Linear Algebra


► Supervised Versus Unsupervised Learning
► Introduction to Matrices
► Vector spaces, including dimensions, Euclidean spaces, closure properties and axioms
► Eigenvalues and Eigenvectors, including how to find Eigenvalues and the corresponding
Eigenvectors

REGRESSION TECHNIQUES

Linear Regression
► Simple Linear Regression:
► Estimating the Coefficients
► Assessing the Coefficient Estimates
► R Squared and Adjusted R Squared
► MSE and RMSE
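
The coefficient estimation and evaluation metrics listed above can be worked through by hand for simple linear regression. This is a minimal sketch in plain Python using the ordinary least squares formulas; the function names and the toy data are made up for the example.

```python
from math import sqrt


def fit_simple_linear_regression(x, y):
    """Ordinary least squares estimates for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def regression_metrics(x, y, intercept, slope):
    """R squared, MSE and RMSE of the fitted line on the given data."""
    predictions = [intercept + slope * xi for xi in x]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    mse = ss_res / len(y)
    return {"r2": 1 - ss_res / ss_tot, "mse": mse, "rmse": sqrt(mse)}


x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]  # toy data lying exactly on y = 1 + 2x
intercept, slope = fit_simple_linear_regression(x, y)
metrics = regression_metrics(x, y, intercept, slope)
print(intercept, slope, metrics)
```

Because the toy data lies exactly on a line, R squared comes out as 1 and the errors as 0; with real, noisy data the same functions give values in between.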

Multiple Linear Regression


► Estimating the Regression Coefficients
► OLS Assumptions
► Multicollinearity
► Feature Selection
► Gradient Descent

Evaluating the Metrics of Regression Techniques


► Homoscedasticity and Heteroscedasticity of error terms
► Residual Analysis
► Q-Q Plot
► Cook's distance and Shapiro-Wilk Test
► Identifying the line of best fit
► Other Considerations in the Regression Model
► Qualitative Predictors
► Interaction Terms
► Non-linear Transformations of the Predictors

Polynomial Regression
► Why Polynomial Regression
► Creating polynomial linear regression
► Evaluating the metrics

Regularization Techniques
► Lasso Regularization
► Ridge Regularization
► ElasticNet Regularization

Case Study on Linear, Multiple Linear and Polynomial Regression using Python.
CAPSTONE PROJECT: A project on a use case that will challenge your Data Understanding, EDA, Data Processing and the Regression Techniques above.

CLASSIFICATION TECHNIQUES

Logistic regression
► An Overview of Classification
► Difference Between Regression and classification Models.
► Why Not Linear Regression?
► Logistic Regression:
► The Logistic Model
► Estimating the Regression Coefficients and Making Predictions
► Logit and Sigmoid functions
► Setting the threshold and understanding decision boundary
► Logistic Regression for >2 Response Classes
► Evaluation Metrics for Classification Models:
► Confusion Matrix
► Accuracy and Error rate
► TPR and FPR
► Precision and Recall, F1 Score
► AUC – ROC
► Kappa Score
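
Most of the evaluation metrics listed above follow directly from the four cells of the confusion matrix. The sketch below computes them in plain Python for a binary problem; the labels and predictions are invented for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall (TPR), FPR and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fpr": fp / (fp + tn),
        "f1": 2 * precision * recall / (precision + recall),
    }


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))        # (3, 1, 1, 3)
print(classification_metrics(y_true, y_pred))
```

The sketch does not guard against division by zero (for example, when the model never predicts the positive class), which a production implementation would need to handle.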

Naive Bayes
► Principle of Naive Bayes Classifier
► Bayes Theorem
► Terminology in Naive Bayes
► Posterior probability
► Prior probability of class
► Likelihood
► Types of Naive Bayes Classifier
► Multinomial Naive Bayes
► Bernoulli Naive Bayes and Gaussian Naive Bayes

TREE BASED MODELS

Decision Trees
► Decision Trees (Rule Based Learning):
► Basic Terminology in Decision Tree
► Root Node and Terminal Node
► Regression Trees and Classification Trees
► Trees Versus Linear Models
► Advantages and Disadvantages of Trees
► Gini Index
► Overfitting and Pruning
► Stopping Criteria
► Accuracy Estimation using Decision Trees

Case Study: A Case Study on Decision Tree using Python


► Resampling Methods:
► Cross-Validation
► The Validation Set Approach
► Leave-One-Out Cross-Validation
► k-Fold Cross-Validation
► Bias-Variance Trade-Off for k-Fold Cross-Validation

Ensemble Methods in Tree Based Models


► What is Ensemble Learning?
► What are Bootstrap Aggregation (Bagging) classifiers and how do they work?

Random Forest
► What is it and how does it work?
► Variable selection using Random Forest

Boosting: AdaBoost, Gradient Boosting


► What is it and how does it work?
► Hyperparameters, Pros and Cons

Case Study: Ensemble Methods - Random Forest Techniques using Python.



DISTANCE BASED MODELS

K Nearest Neighbors
► K-Nearest Neighbor Algorithm
► Eager Vs Lazy learners
► How does the KNN algorithm work?
► How do you decide the number of neighbors in KNN?
► Curse of Dimensionality
► Pros and Cons of KNN
► How to improve KNN performance
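
As a rough picture of how the KNN algorithm above works, here is a minimal sketch in plain Python: it measures the Euclidean distance from the query point to every training point and takes a majority vote among the k closest. The function name and the toy points are made up for illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # A lazy learner: there is no training step, all work happens at query time.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    top_labels = [label for _, label in neighbours[:k]]
    return Counter(top_labels).most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.5, 1.5)))  # a
print(knn_predict(points, labels, (6.5, 6.5)))  # b
```

Sorting every training point for each query is fine for a toy example but slow on large data sets, which is one of the drawbacks the topic list above refers to.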

Case Study: A Case Study on k-NN using Python.

Support Vector Machines


► The Maximal Margin Classifier
► HyperPlane
► Support Vector Classifiers and Support Vector Machines
► Hard and Soft Margin Classification
► Classification with Non-linear Decision Boundaries
► Kernel Trick
► Polynomial and Radial
► Tuning Hyper parameters for SVM
► Gamma, Cost and Epsilon
► SVMs with More than Two Classes
Case Study: A Case Study on SVM using Python.

CAPSTONE PROJECT: A project on a use case that will challenge your Data Understanding, EDA, Data Processing and the Classification Techniques above.

UN-SUPERVISED LEARNING

► Why Unsupervised Learning


► How it is Different from Supervised Learning
► The Challenges of Unsupervised Learning

Principal Components Analysis


► Introduction to Dimensionality Reduction and its necessity
► What Are Principal Components?
► Demonstration of 2D PCA and 3D PCA
► EigenValues, EigenVectors and Orthogonality
► Transforming Eigen values into a new data set
► Proportion of variance explained in PCA

Case Study: A Case Study on PCA using Python.

K-Means Clustering
► Centroids and Medoids
► Deciding optimal value of 'k' using Elbow Method
► Linkage Methods

Hierarchical Clustering
► Divisive and Agglomerative Clustering
► Dendrograms and their interpretation
► Applications of Clustering
► Practical Issues in Clustering

Case Study: A Case Study on Clustering using Python.

Recommendation Systems
► What are recommendation engines?
► How does a recommendation engine work?
► Data collection
► Data storage
► Filtering the data
► Content based filtering
► Collaborative filtering
► Cold start problem
► Matrix factorization
► Building a recommendation engine using matrix factorization
► Case Study

MODULE 6
DEEP LEARNING

Introduction to Neural Networks


► Introduction to Perceptron & History of Neural networks
► Activation functions
a) Sigmoid b) ReLU c) Softmax d) Leaky ReLU e) Tanh
► Gradient Descent
► Learning Rate and tuning
► Optimization functions
► Introduction to TensorFlow
► Introduction to Keras
► Back propagation and chain rule
► Fully connected layer
► Cross entropy
► Weight Initialization
► Regularization
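
The activation functions listed above are short formulas, and writing them out in plain Python can help before moving to TensorFlow or Keras. This is a minimal sketch; the leaky ReLU slope of 0.01 is a commonly used default chosen here as an assumption.

```python
from math import exp, tanh


def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1 / (1 + exp(-x))


def relu(x):
    """Passes positive values through and clips negatives to zero."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Like ReLU, but keeps a small slope (alpha) for negative inputs."""
    return x if x > 0 else alpha * x


def softmax(values):
    """Turns a list of scores into probabilities that sum to 1."""
    highest = max(values)  # subtracting the max avoids overflow in exp
    exps = [exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0))          # 0.5
print(relu(-2), relu(3))   # 0.0 3
print(leaky_relu(-2))      # -0.02
print(tanh(0))             # 0.0
print(softmax([1, 2, 3]))
```

Deep learning libraries provide vectorised versions of all of these, so the hand-written forms are for understanding rather than for use in a real network.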

TensorFlow 2.0
► Introducing Google Colab
► TensorFlow basic syntax
► TensorFlow Graphs
► TensorBoard

Artificial Neural Network with TensorFlow


► Neural Network for Regression
► Neural Network for Classification
► Evaluating the ANN
► Improving and tuning the ANN
► Saving and Restoring Graphs

MODULE 7
CNN & COMPUTER VISION

UNIT 1: Working with images & CNN Building Blocks


► Working with Images - Introduction
► Working with Images - Reshaping, understanding image size and pixels, Digitization, Sampling, and Quantization
► Working with images - Filtering
► Hands-on Python Demo: Working with images
► Introduction to Convolutions
► 2D convolutions for Images
► Convolution - Backward
► Transposed Convolution and Fully Connected Layer as a Convolution
► Pooling: Max Pooling and Other pooling options

UNIT 2: CNN Architectures and Transfer Learning


► CNN Architectures and LeNet Case Study
► Case Study: AlexNet
► Case Study: ZFNet and VGGNet
► Case Study: GoogLeNet
► Case Study: ResNet
► GPU vs CPU
► Transfer Learning Principles and Practice
► Hands-on Keras Demo: SVHN Transfer learning from MNIST dataset
► Transfer learning Visualization (run package, occlusion experiment)
► Hands-on demo - t-SNE

UNIT 3: Object Detection


► CNNs at Work - Object Detection with region proposals
► CNNs at Work - Object Detection with YOLO and SSD
► Hands-on demo- Bounding box regressor
► Need to do a semantic segmentation project

UNIT 4: CNNs at Work - Semantic Segmentation


► CNNs at Work - Semantic Segmentation
► Semantic Segmentation process
► U-Net Architecture for Semantic Segmentation
► Hands-on demo - Semantic Segmentation using U-Net

► Other variants of Convolutions


► Inception and Mobile Net models

UNIT 5: CNNs at Work - Siamese Network for Metric Learning


► Metric Learning
► Siamese Network as metric learning
► How to train a Neural Network in Siamese way
► Hands-on demo - Siamese Network

MODULE 8
NATURAL LANGUAGE PROCESSING

Unit 1: Introduction to Statistical NLP Techniques


► Introduction to NLP
► Preprocessing in NLP: tokenization, stop words, normalization, stemming and lemmatization
► Preprocessing in NLP: bag of words, TF-IDF as features
► Language models: probabilistic models, n-gram model and channel model
► Hands on NLTK
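
To show what bag of words and TF-IDF features look like, here is a minimal sketch in plain Python rather than NLTK. The three sample sentences are made up, tokenization is a simple whitespace split, and the weighting is the basic unsmoothed form, so library implementations may give different numbers.

```python
from collections import Counter
from math import log

docs = [
    "the movie was great",
    "the movie was boring",
    "great acting and great story",
]


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def bag_of_words(text):
    """Count how often each token occurs in one document."""
    return Counter(tokenize(text))


def tf_idf(documents):
    """Term frequency times log inverse document frequency, per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


weights = tf_idf(docs)
print(bag_of_words(docs[2]))
print(weights[1])
```

Words that appear in only one document, such as "boring", receive a higher weight than words shared across documents, such as "the", which is the point of the IDF term.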

Unit 2 : Word embedding


► Word2vec
► GloVe
► POS Tagger
► Named Entity Recognition(NER)
► POS with NLTK
► TF-IDF with NLTK

Unit 3: Sequential Models


► Introduction to sequential models
► Introduction to RNN
► Intro to LSTM
► LSTM forward pass
► LSTM backprop through time
► Hands on Keras LSTM

Unit 4 : Applications
► Sentiment Analysis
► Sentence generation
► Machine translation
► Advanced LSTM structures
► Keras - machine translation
► ChatBot

Tools & Technologies

Python + Statistics
