GCD Detailed Syllabus
PROGRAM SNAPSHOT
TERM 01
Data Analysis basics with Python
TERM 02
Data Visualization & EDA
TERM 03
Machine Learning Foundation
PROJECT
Capstone Project - I
TERM 05 [ ELECTIVE ]
Machine Learning Advanced - [E-1]
Tensorflow for Deep Learning - [E-2]
TERM 06
Capstone Project - II & Industry Immersion
Optional Terms
EDA with R
Data visualization with Tableau
Term 1:
DATA ANALYSIS BASICS WITH PYTHON
Term Projects
Analyse key success factors for top cricket team at IPL
The matches dataset contains 18 variables and 600+ observations. The deliveries dataset contains 21 variables and 160k+ observations of the IPL 2018 season.
Analyse the car sale data in Ukraine
The dataset contains 10 variables and 9k+ observations of the car sales data in Ukraine.
Analyse which country has won the most medals at Olympic games
The dataset contains 9 variables and 31.2k observations of the summer olympic games (1896 - 2014).
Analyse the IMDB 1000 most popular movies and come up with interesting insights
The dataset contains 12 variables and 1000 observations of the top 1000 popular movies for past 10 years.
Module 5 : Data Analysis - 2
Exploratory Case study: Analyse mental health of IT professionals
Module 6 : Project presentation
Top 3 teams present their project details to the entire cohort
Module 2 : KNN (K-Nearest neighbours)
Introduction to KNN
Calculate neighbours using distance measures
Find optimal value of K in KNN method
Advantages & disadvantages of KNN
Case Study: Classify malicious websites using close neighbour technique
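The KNN topics above (distance measures, majority vote among neighbours, choosing K) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the two-feature points, the "benign"/"malicious" labels, and the candidate values of K are made-up toy data, not the case-study dataset, and leave-one-out accuracy is just one possible way to compare values of K.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k):
    """Label a query point by majority vote among its k nearest neighbours."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(X, y, k):
    """Hold out each point in turn and check whether KNN recovers its label."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)

# Toy data (hypothetical): two well-separated clusters.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
y = ["benign", "benign", "benign", "malicious", "malicious", "malicious"]

# Compare a few candidate values of K and keep the most accurate one.
best_k = max([1, 3, 5], key=lambda k: leave_one_out_accuracy(X, y, k))
print(best_k, knn_predict(X, y, (1.1, 1.0), best_k))
```

With only three points per class, K = 5 forces every held-out point to be outvoted by the other class, which shows why K that is too large for the data hurts accuracy.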
Module 5 : Support vector machines (SVM)
Introduction to SVM
Figure decision boundaries using support vectors
Identify hyperplane in SVM
Applications of SVM in Machine Learning
Case Study: Predicting wine quality without tasting the wine
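To make the ideas of hyperplane, decision boundary, and support vectors concrete, here is a small sketch for a linear SVM whose hyperplane is already known. The weights `w`, the bias `b`, and the four points are hand-picked assumptions for illustration; no training is performed, and a real SVM would learn `w` and `b` from data.

```python
import math

# Hypothetical hyperplane w.x + b = 0, chosen by hand for this toy data.
w = (0.5, 0.5)
b = -2.0

def decision_value(x):
    """Signed score of a point; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x):
    """Classify as +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(x) >= 0 else -1

# Toy points with labels in {-1, +1}.
points = [((0.0, 0.0), -1), ((1.0, 1.0), -1), ((3.0, 3.0), 1), ((4.0, 4.0), 1)]

# The margin of a linear SVM has width 2 / ||w||.
margin_width = 2 / math.sqrt(sum(wi ** 2 for wi in w))

# Support vectors lie exactly on the margin: label * (w.x + b) == 1.
support_vectors = [x for x, label in points
                   if math.isclose(label * decision_value(x), 1.0)]

print(margin_width, support_vectors)
```

Only the two points closest to the boundary end up as support vectors; moving the outer points would not change the hyperplane.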
Module 6 : Time series forecasting
Introduction to Time Series analysis
Stationary vs non stationary data
Components of time series data
Interpreting autocorrelation & partial autocorrelation functions
Stationarize data and implement ARIMA model
Case Study: Forecast demand for Air travel
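Two of the topics above, stationarizing a series and reading its autocorrelation, can be sketched without any library. The demand figures below are made up, and the sketch covers only first-order differencing and the sample autocorrelation; it does not implement ARIMA itself.

```python
def difference(series, lag=1):
    """Subtract the value `lag` steps back; first differencing removes a linear trend."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (assumes a non-constant series)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[i] - mean) * (series[i - lag] - mean)
              for i in range(lag, n))
    return num / denom

# Hypothetical demand with a steady upward trend (non-stationary).
demand = [100, 104, 108, 112, 116, 120, 124, 128]

print(autocorrelation(demand, 1))  # high lag-1 autocorrelation caused by the trend
print(difference(demand))          # constant differences: the trend is gone
```

A slowly decaying autocorrelation like the one the trending series produces is a common sign that differencing is needed before fitting a model.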
Term Projects
The dataset contains 21 variables and 3k+ observations to identify a voice as male or female using acoustic properties of voice and speech.
The test dataset contains 4 variables and 45000 observations. The train dataset contains 4 variables and 90k observations of 5 years of store-item sales data.
Module 7 : Optimization
Introduction to optimization in ML
Applications of optimization methods
Optimization techniques: Linear Programming using Excel solver
How Stochastic Gradient Descent (SGD) Works?
Case study: Apply SGD on Regression data (sklearn dataset)
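As a rough illustration of how SGD works on regression data, the sketch below fits a line y = w*x + b by updating the parameters after every single example. The toy data, learning rate, and epoch count are assumptions chosen for the example; it uses plain Python rather than the sklearn dataset named in the case study.

```python
import random

def sgd_linear_regression(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y = w*x + b with stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)              # visit the examples in random order
        for i in order:
            error = (w * xs[i] + b) - ys[i]
            w -= lr * error * xs[i]     # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error             # gradient of 0.5 * error**2 w.r.t. b
    return w, b

# Toy data (hypothetical) generated from y = 2x + 1 with no noise.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2 * x + 1 for x in xs]

w, b = sgd_linear_regression(xs, ys)
print(round(w, 3), round(b, 3))
```

Unlike batch gradient descent, each update here looks at one example only, which is what makes the method cheap per step on large datasets.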
Term Projects
Letter recognition
Module 10 : Revision of concepts and Term Project
Project: Build a Chatbot using Slack Class
Integrate Chatbot with Bot server
Term Projects
In this capstone project, students will be provided with data collected by a major
Telecom operator on the demographic behaviour of users using different
handsets.
Students are required to do the initial bit of data cleansing, pre-processing and
then upload this data to SQL server via a web hosting platform that will be
provided to them.
This data from SQL server will be used to create a dashboard for the
company using D3.js scripts. The D3.js scripts will be provided to students upfront.
These dashboards show how interactive visualizations can help companies
shape strategy: which demographics to cater to, how men and women customers
behave differently, which geographies are popular, and which ones need more
investment from the company in terms of finance and marketing.
Term 6:
CAPSTONE PROJECT - II
Demand Planners
This capstone project will focus more on applying machine learning concepts
rather than data gathering and storing aspects. Students will be provided with
data collected by a major Taxi Aggregator of taxi bookings done in a leading city.
As budding data science consultants, students are required to do exploratory data
analysis & present an initial report.
After that, the students are required to create a UI that displays the
observations regarding taxi usage across the city from the analysis. The
website should also have a provision for the company to forecast demand for
taxis at a specific time of the day.
The taxi bookings data provided will be in csv format and dashboards for the
company need to be created using D3.js scripts. The D3.js scripts will be provided
to the students beforehand.
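One very simple starting point for the demand forecast described above is the average number of bookings per hour of day. The sketch below is a naive baseline under stated assumptions: the column name `pickup_time`, its timestamp format, and the sample rows are all hypothetical, since the actual CSV layout is not specified here.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample; the real bookings file will have its own columns.
SAMPLE_CSV = """pickup_time
2020-01-01 08:15:00
2020-01-01 08:40:00
2020-01-01 17:05:00
2020-01-02 08:20:00
2020-01-02 17:30:00
2020-01-02 17:45:00
"""

def average_bookings_by_hour(csv_text):
    """Return {hour of day: average bookings per observed day}."""
    per_hour = Counter()
    days = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.strptime(row["pickup_time"], "%Y-%m-%d %H:%M:%S")
        per_hour[ts.hour] += 1
        days.add(ts.date())
    return {hour: count / len(days) for hour, count in per_hour.items()}

baseline = average_bookings_by_hour(SAMPLE_CSV)
print(baseline.get(8, 0.0))  # naive forecast for the 08:00 hour
```

A baseline like this gives the machine learning models built later something to beat.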
OPTIONAL TERMS
EDA with R
Module 3 : Data objects in R
Data structures
Basic Data management
Playing around with Loops and Functions
Saving output
Exercises: Loops and functions in R
Module 4 : Descriptive statistics - 1
Introduction to Statistics
Descriptive Statistics
Measures of central tendency
Measures of Dispersion and shape
Case Study: Investigation of Crime statistics in Beaufort
Module 5 : Descriptive statistics - 2
Introduction to Probability
Probability Distributions used in Data Science
Quantiles, percentiles, and standard score
Case Study: Analyse student's performance at school
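Quantiles, percentiles, and the standard score can be illustrated with a handful of numbers. The sketch below is written in Python, to match the program's core language, rather than R, and the exam scores are made up. The percentile rank shown uses the "share of values at or below x" definition, which is one of several in common use.

```python
import statistics

# Hypothetical exam scores.
scores = [50, 60, 70, 80, 90]

def standard_score(x, data):
    """z-score: how many population standard deviations x lies from the mean."""
    return (x - statistics.mean(data)) / statistics.pstdev(data)

def percentile_rank(x, data):
    """Percentage of values in data that are less than or equal to x."""
    return 100 * sum(v <= x for v in data) / len(data)

quartiles = statistics.quantiles(scores, n=4)  # three cut points: Q1, median, Q3

print(quartiles)
print(standard_score(90, scores))
print(percentile_rank(80, scores))
```

The standard score makes results comparable across tests with different scales, which is useful when analysing performance across subjects.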
Module 6 : Inferential Statistics - 1
Introduction to Inferential Statistics
Population and Samples
Central Limit Theorem
Case Study: Sampling data for Business analysis
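The Central Limit Theorem can be checked with a short simulation: draw many samples from a skewed population and look at how the sample means behave. The sketch below is in Python, to match the program's core language, rather than R. The exponential population, the sample size of 30, and the 2000 repetitions are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30
NUM_SAMPLES = 2000

def sample_mean():
    """Mean of one sample drawn from a skewed population (exponential, mean 1, sd 1)."""
    return statistics.mean(rng.expovariate(1.0) for _ in range(SAMPLE_SIZE))

means = [sample_mean() for _ in range(NUM_SAMPLES)]

# CLT: the sample means cluster around the population mean (1.0), with a
# spread close to population sd / sqrt(sample size) = 1 / sqrt(30).
mean_of_means = statistics.mean(means)
spread_of_means = statistics.stdev(means)
print(mean_of_means, spread_of_means, 1 / SAMPLE_SIZE ** 0.5)
```

Even though individual exponential draws are heavily skewed, a histogram of `means` would look close to a bell curve, which is what justifies many sampling-based business estimates.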
Term Projects