ACADEMY SYLLABUS

Online-Interactive Learning
learn data science by building

DATA VISUALIZATION SPECIALIZATION

A fun, hands-on, and project-based specialization that helps students gain full proficiency in data visualization systems and tools. Create compelling narratives by combining charting elements with custom aesthetics under the guidance of our instructors.

The learn-by-building module in every workshop applies our project-based learning philosophy to this specialization. The course capstone requires that the student build a real-world application under stringent criteria modeled after real business scenarios.

Programming for Data Science
3-Day Workshop

P4DS

Programming for Data Science is a course that covers the important programming paradigms and tools used by data analysts and data scientists today. You will be guided through a series of coding exercises designed to maximize your familiarity with data science programming in RStudio, an integrated development environment for the statistical computing language R.

Upon completion of this workshop, you will be familiar with the programming language, popular tools, libraries (data science packages) and toolkits required to excel in your data analysis and statistical computing projects.

Module 1: Data Science in R

Data Science in R
R Programming Basics
Why Learn R?
RStudio Interface
Data Structures in R

Working with Data
Reading & Extracting Data
Understanding Statistics
Exploratory Data Analysis

Data Manipulation
Working with your Global Environment
Getting familiar with your Workspace
Continuous and Categorical Data

Module 2: Data Manipulation

Data Manipulation II
Vector Types and Classes
List and Objects
Matrix and Data Frames

Practical Data Cleansing
The Data Transformation Process
Reproducible Data Science Projects
Reading and Writing from your IDE

R in Practice
Programming Exercise: e-Commerce Retail Datasets
In-depth review of Data Frame subsetting
Sampling and Randomization
Cross-Tabulations
Aggregations

Academy Modules
Working with R
R Scripts and Functions
R Markdown
Why Care about Reproducibility
Graded Quiz

Learn by Building Module


Writing your code as R scripts allows for automation and integration with other tools and services, while writing an R Markdown document presents your findings and recommendations in a way that is friendly to non-technical or managerial team members.

R Script to clean & transform the data
Write an R script containing a function (name the function however you want) that reads a dataset as input, performs the necessary transformation and exports a cross-tabulation, numeric result or plot as output.

Reproducible Data Science
Create an R Markdown file that combines your step-by-step
data transformation code with some explanatory text. Add
formatting styles and hierarchical structure using Markdown.

Practical Statistics
2-Day Workshop

PS

Build the statistical foundation for the more advanced machine learning theories later in the specialization by picking up the key ideas in statistical thinking. Learn to interpret correlations, construct confidence intervals and apply other statistical principles that form the basis of many common machine learning models.

The 2-day course is optional for participants of the Data Visualization and Machine Learning Specializations and is intended for learners without prior experience in statistics.

Module 1: Descriptive Statistics

5-Number Summary
Mean, Median and Mode
Measures of Central Tendency
Quantiles in R

Central Tendency & Variability
Visualizing Central Tendency
Variance and Covariance

Standard Score and z-Score
Standard Normal Curve
Central Limit Theorem
z-Score Calculation & Student's t-test

Module 2: Inferential Statistics

Probabilities
Probability Mass Function
Probability Density Function
Expected Values
p-Values

Intervals
Confidence Intervals
Prediction Intervals

Inferential Statistics in Practice
Hypothesis Testing
Deriving Scientific Truths from Data
Case Study
Academy Modules
Tips and Techniques: R for Statisticians
Density Plots
Interpreting Box Plots (Box and Whisker)
Better summary statistics with `skimr`
Pairs Matrix
Graded Quiz

Learn by Building Module


Statistical Treatment of Retail Dataset
Using what you’ve learned, formulate a question and derive a statistical hypothesis test to answer it. You have to demonstrate that you’re able to make decisions using data in a scientific manner.
Examples of questions can be:
- Is there a difference in profitability between standard shipment and same-day shipment?
- Supposing there is no difference in profitability between the different product segments, what is the probability that we obtain the current observation due to pure chance alone?
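
The "pure chance alone" question can be made concrete with a permutation test. The sketch below is only an illustration written in Python (the workshop itself is taught in R). The profit figures are invented placeholder values, and `permutation_p_value` is a hypothetical helper name, not part of the course material.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the share of random relabellings whose absolute mean
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign observations to groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Placeholder profit values, invented for illustration only.
standard = [12.0, 15.5, 9.8, 14.2, 11.1, 13.7]
same_day = [10.4, 16.0, 12.3, 9.9, 13.5, 11.8]
p_value = permutation_p_value(standard, same_day)
print(f"p-value: {p_value:.3f}")
```

A small p-value suggests the observed difference would rarely arise if the shipment type made no difference. A large one means the data are compatible with chance alone.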

Data Visualization in R
4-Day Workshop

DVinR

A fun, hands-on, and project-based workshop that helps students gain full proficiency in data visualization systems and tools. Create compelling narratives by combining charting elements with custom aesthetics under the guidance of our instructors.

The 4-day course follows our learn-by-building approach, in that students are tasked to reproduce a series of plots applying what they’ve learned. While it covers the three main plotting systems in R, its particular focus is on ggplot2 and the additional libraries centered around it that bring interactivity and enhanced aesthetic options to the art of creating rich, powerful visualizations.

Module 1: Plotting Essentials

Base Plotting I
Plots and Lines
Built-in Plot Types
Legends and Annotations
Other built-in Plotting Functionalities

Base Plotting II
Histograms and Curves
Cleveland's Dot Plot
Axis, Titles, Subtitles and Panel Styles
The Notorious Pie Chart

Working with ggplot2
Grammar of Graphics System
Mapping aesthetics
Working with Geometries
Background image

Enhancing with ggplot2
Axis, titles and scales
Adding themes to your plots
Custom aesthetics and styles
Working with Legends

Module 2: Richer Visualization Techniques

Enhancing ggplot2 II
Flipping coordinates and Axis Rotation
Multi-dimensional Faceting
Text Layers and Label Layers
Expected Values

Enhancing ggplot2 III
Enriching: Scatterplots and bubble plots
Enriching: Jitterplots
Enriching: Boxplots and violin plots
Layer transparency

Enhancing ggplot2 IV
Enriching: Column Plots
Enriching: Texts and Labels
Enriching: Horizontal and Vertical Lines
Fills and Colors

Other Visualization Toolset
Discrete, Continuous, and Gradient colors
Facet with wraps and grids
Visualizing Spatial Data
Working with Leaflet and Maps

Academy Modules
Project: Mining Trending Videos on YouTube
Hands-on data visualization
Identifying temporal patterns in trending videos
Combining aesthetics and geometries
Graded Quiz

Learn by Building Module


Creating a Publication-Grade Plot
Applying what you’ve learned, create an economics- or
social-related plot that is polished with the appropriate
annotations, aesthetics and some simple commentary. You
may use the same "YouTube Trending Videos" dataset or any
other dataset for this practice.
Creating an Interactive Map
Applying what you’ve learned, create a web page with an
interactive map embedded on it. Use a custom icon for the
map markers to represent business locations, and show
details about each location pin (“markers”) upon user’s
interaction with it.

Interactive Plotting & Web Dashboard
4-Day Workshop

IP&WD

Building on the foundation from previous classes, we will create a series of interactive plots and gadgets that render multiple visualization elements based on the user’s input. This is the final workshop leading up to the data visualization capstone project.

The 4-day course follows our learn-by-building approach, in that students are tasked to reproduce a series of plots applying what they’ve learned. It covers an exhaustive list of techniques that add interactivity to an R document and sets the stage for the data science capstone project.

Module 1: Interactive Visualization

Working with Plotly
Refresher on dplyr
ggplotly function
Visualization as an HTML widget
Range Slider and other interactivity

Publication & Layout Options
Multiple Plots Arrangement
More export functions
Subplots
Tips and Techniques for Layouts

Module 2: Web Dashboard Development

Flex Dashboard
Creating Flex Dashboard from RStudio
Layouts
Hands-on Practice: Text, Plots, Tables
Demonstration and Practical Advice

Interactive Document
Inputs and Outputs
The renderPlot() function
Embedded Application
Demonstration and Practical Advice

Shiny Web App
Shiny Dashboard
Tabs and Pagination
UI, Server and Shiny Functions
Custom Styles, Structure

Academy Modules
Tips on Web Dashboard Deployment
Working with live data
App deployment solutions
Tips for live dashboard performance

Learn by Building Module


Building an Interactive Dashboard
Applying what you’ve learned, create a paginated web dashboard with a rich set of UI elements coupled with the appropriate server logic. The web dashboard can be of any theme, using any dataset, but must feature an input panel that accepts end-user inputs and renders the output accordingly.

Data Visualization Capstone Project

VISUALIZE YOUR SUCCESS, THEN TAKE ACTION

After having learned and explored appropriate techniques for visualizing data, students are required to deploy an interactive dashboard web application using a Shiny server, containing plotting objects such as ggplot and/or leaflet that display useful insights. In addition, students are free to use their own dataset or datasets from previous classes.

The project is marked out of 30 points; the rubrics for assessment and grading will be discussed in class.

learn data science by building

MACHINE LEARNING SPECIALIZATION

An intensive specialization that strives for a fine balance between practical applications and mathematical rigor in teaching essential machine learning concepts. By taking a learn-by-building approach, you will learn to develop regression and classification algorithms and incorporate them into real-life solutions, data products or business applications.

The modules in every workshop apply our project-based learning philosophy to this specialization. The course capstone requires that the student build a real-world application under stringent criteria modeled after real business scenarios.

Regression Models
4-Day Workshop

RM

This course strives for a fine balance between business applications and mathematical rigor in its treatment of regression models, one of the most essential statistical techniques in the field of machine learning. Its aim is to equip you with the knowledge to investigate relationships between variables in a dataset effectively and rigorously.

We strongly recommend that you complete Practical Statistics prior to taking this course. Upon completion of this workshop, you will acquire a rigorous statistical understanding of machine learning models, allowing you to extrapolate the same ideas into other, more advanced machine learning models.

Module 1: Regression Models I

OLS Regression
Understanding Least Squares
Outliers: Leverage and Influence
Simple Linear Regression

Linear Models in R
Understanding Coefficients
Plotting Regression
Model Construction

Interpreting Linear Models
Residuals Manually
Coefficients Manually
R-Squared Manually

Module 2: Regression Models II

Interpreting Linear Models
Estimates and Standard Errors
t-value and p-value
Adjusted R-Squared
Confidence Interval

Multiple Regression
Model Assumptions
Bias-Variance Trade-off
Outliers: Leverage and Influence
Model Limitation and Evaluation

Dive Deeper: Regression Models
Model Selection and Specification
Step-wise Regression
All-possible Regressions
Residual Plots
Model Diagnostics
Limitations of Regression Models

Academy Modules
Graded Quiz
Learn by Building Module
Recommendation on Lowering Crime Rates
Write a regression analysis report applying what you’ve learned in the workshop. Using the dataset provided to you, write up your findings on the socioeconomic variables most highly correlated with crime rates.
Explain your recommendations where appropriate.

Classification in Machine Learning I
4-Day Workshop

CIML1

Learn to solve binary and multi-class classification problems using machine learning algorithms that are easily understood and readily interpretable. You will learn to write a classification algorithm from scratch, and appreciate the mathematical foundations underpinning logistic regression and nearest neighbors algorithms.

We strongly recommend that you complete the Regression Models workshop prior to taking this course. Upon completion of this workshop, you will acquire the depth to develop, apply, and evaluate two highly versatile algorithms widely used today.

Module 1: Logistic Regression

Relating Probabilities to Odds
Understanding Odds
Understanding Log of Odds
Plotting Odds and Log of Odds

Logistic Regression from First Principles
Sigmoidal Logistic Function
Key Assumptions of Sigmoid Function
Extra Proof: Intuition behind the Sigmoid Function

Logistic Regression in Action
Binary Logistic Regression
Interpreting Coefficients
Interpretation Against Continuous & Discrete Variables

Practical Tips and Case Study
Flight Delay Prediction Examples
Customer Churn and Attrition Examples
Risk Modeling on Loans from Quarter 4, 2017

Performance Evaluation and Model Selection
AIC (Akaike Information Criterion)
Null Deviance and Residual Deviance
Hauck-Donner Effect

Module 2: Nearest Neighbours Algorithm

Closer Look at Classification
Probabilities vs Class Responses
Cross Validation and Out-of-Sample Error
Bias-variance trade-off
Confusion matrix (accuracy, sensitivity, specificity, & precision)

k-NN in Action
Characteristics of k-NN
Positives and Negatives
Diagnosing Breast Cancer with k-NN

Building Blocks of k-NN
Distance Function (Euclidean, Minkowski)
The k Parameter
Standardization vs Min-Max Normalization

k-NN from First Principles
Classifying Customer Segments with k-NN
Writing Your Own k-NN Classifier
Predicting Using Your Own k-NN Classifier
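
The k-NN building blocks named in this module (a distance function, the k parameter, and feature scaling) fit together in a few lines. The sketch below is an illustration in Python rather than the R used in the workshop. The toy data, the labels and all function names are invented for this example and do not come from the course material.

```python
from collections import Counter

def fit_min_max(rows):
    """Return per-column minimums and maximums from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def min_max_scale(row, lows, highs):
    """Rescale one row using the training minimums and maximums.

    Columns that are constant in training map to 0.0.
    """
    return tuple(
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, lows, highs)
    )

def minkowski(a, b, p=2):
    """Minkowski distance; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def knn_predict(train_x, train_y, query, k=3, p=2):
    """Majority vote among the k training points closest to the query."""
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: minkowski(pair[0], query, p),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Invented toy data: two features on very different scales.
train_raw = [(1.0, 200.0), (1.5, 180.0), (1.2, 220.0),
             (5.0, 900.0), (5.5, 950.0), (4.8, 870.0)]
train_labels = ["small", "small", "small", "large", "large", "large"]

lows, highs = fit_min_max(train_raw)
train_scaled = [min_max_scale(row, lows, highs) for row in train_raw]
query = min_max_scale((5.2, 910.0), lows, highs)
prediction = knn_predict(train_scaled, train_labels, query, k=3)
print(prediction)
```

Scaling comes first because an unscaled feature with a large range would otherwise dominate the distance.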

Academy Modules
Graded Quiz
Learn by Building Module
Logistic Regression on Credit Risk
Applying what you’ve learned, present a simple R Markdown
document in which you demonstrate the use of logistic
regression on the lbb_loans.csv dataset. Explain your findings
wherever necessary and show the necessary data preparation
steps. To help you through the exercise, consider the following
questions throughout the document:
How do we correctly interpret the negative coefficients
obtained from your logistic regression?
How do we know which of the variables are more
statistically significant as predictors?
What are some strategies to improve your model?
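
For the question on negative coefficients, it helps to recall how logistic regression coefficients relate to odds. The following uses standard textbook notation (the symbols are not taken from the course material):

```latex
\[
\text{odds} = \frac{p}{1-p},
\qquad
\log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\]
\[
\frac{\text{odds}(x_j + 1)}{\text{odds}(x_j)} = e^{\beta_j}
\]
```

A negative coefficient gives an odds ratio below 1. Holding the other variables fixed, each one-unit increase in that variable multiplies the odds of the positive class by a factor smaller than one.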

Customer Segment Prediction


Applying what you’ve learned, present a simple R Markdown
document in which you demonstrate the use of k-NN on the
wholesale.csv dataset. Compare the k-NN to the logistic
regression model and answer the following questions
throughout the document:
What is your accuracy? Was the logistic regression better
than k-NN in terms of accuracy? (recall the lesson on
obtaining an unbiased estimate of the model’s accuracy)
Was the logistic regression better than our kNN model at
explaining which of the variables are good predictors of a
customer’s industry?
List 1 disadvantage and 1 strength of each approach (k-NN and logistic regression).

Classification in Machine Learning II
4-Day Workshop

CIML2

Learn to apply the law of probabilities, boosting, bootstrap aggregation, k-fold cross-validation, ensembling methods, and a variety of other techniques as we build some of the most widely used machine learning algorithms today. Learn to add performance to your models using mathematically sound principles you’ll learn in this course.

We strongly recommend that you complete the Classification in Machine Learning 1 workshop prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

Module 1: Naive Bayes

Law of Probability
Dependent and Independent Events
Bayes Theorem
Formula for Posterior Probability

Naive Bayes Classifier
Characteristics of a Naive Bayes Classifier
The "naive" assumptions
Customer Churn example

Practical and Performance Considerations
The Case for Smoothing
Laplace (Add-One)
Thinking about Training vs Prediction Speed

Naive Bayes in Action
Spam Classification
Predicting on Text (Corpus)
Predicting Political Party Affiliation

Module 2: Tree-Based Methods and Ensembles

Decision Trees
Advantages and Model Characteristics
Information Gain and Splitting Criterion
Pruning and Tree Size

Decision Trees in Action
Predicting Diabetes from Diagnostics Measurement
AUC Curve
Key Considerations and Practical Advice

Random Forest
Ensemble-based Methods
Case Example: Predicting the Quality of Exercise
Industrial Applications

Machine Learning Theories
Logistic Regression, Naive Bayes and Decision Trees have more in common than you think
Thinking about Decision Boundaries

High-Performance Machine Learning
Bias-Variance Tradeoff revisited
k-Fold Cross Validation
Predicting Exercise Form with Fitness Tracker Data

Academy Modules
Graded Quiz
Learn by Building Module
Identifying Risky Bank Loans
Use any of the 3 classification algorithms you’ve learned in this lesson to predict the risk status of a bank loan. The variable default in the dataset indicates whether the applicant defaulted on the loan issued by the bank.
Use an R Markdown document to lay out your process, and explain the methodology in 1 or 2 brief paragraphs. The student should be awarded the full (3) points when:
- The preprocessing steps are done, and the student shows an understanding of holding out a test/cross-validation set for an estimate of the model’s performance on unseen data.
- The model’s performance is sufficiently explained (accuracy may not be the most helpful metric here! Recall what you’ve learned regarding specificity and sensitivity).
- The student demonstrates extra effort in evaluating his/her model and proposes ways to improve the accuracy obtained from the initial model.
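
The remark that accuracy may not be the most helpful metric can be seen with a small calculation. The sketch below is an illustration in Python (the workshop itself is taught in R). The labels are invented, and `classification_metrics` and `safe_ratio` are hypothetical helper names, not part of the course material.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def classification_metrics(actual, predicted, positive="yes"):
    """Compute confusion-matrix metrics for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return {
        "accuracy": safe_ratio(tp + tn, len(pairs)),
        "sensitivity": safe_ratio(tp, tp + fn),  # share of actual defaults caught
        "specificity": safe_ratio(tn, tn + fp),  # share of non-defaults kept
        "precision": safe_ratio(tp, tp + fp),    # share of flagged loans that default
    }

# Invented example: 2 defaults out of 10 loans, and a model that never flags anyone.
actual = ["no"] * 8 + ["yes"] * 2
predicted = ["no"] * 10
metrics = classification_metrics(actual, predicted)
print(metrics)
```

In this example the model reaches 80% accuracy while catching none of the defaults (sensitivity of 0), which is why the rubric asks for more than accuracy.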

Unsupervised Machine Learning
4-Day Workshop

UML

Learn PCA (Principal Component Analysis), Clustering, and other algorithms to work with unsupervised machine learning tasks where the target variable is not known or defined. Applying what you’ll learn from this workshop, you will be tasked to develop an anomaly detection or an e-commerce product recommendation model that can be related to real-life business scenarios.

We strongly recommend that you complete the pre-requisite courses prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who are new to the field of machine learning.

Module 1: Dimensionality Reduction

Background
Understanding Unsupervised Learning
The "dimensionality" problem
Industrial Use of PCA

Principal Component Analysis
Rethinking about Covariances
The Case for PCA
Eigenvalues and Eigenvectors

PCA from First Principles
Just enough Matrix Algebra
Mathematical Proof
Visualization and Visual Proof

PCA in Action
Dubious Property Sales in NYC
PCA on US Arrests data
Biplot and the variables factor map

PCA in Action II
Eigenfaces
PCA on Credit Loan Data
Deconstructing and Reconstructing Faces with PCA
Principal Components by Hand

Module 2: k-Means Clustering

Understanding Clustering
Centroid-based Clustering Algorithms
The k-Means Procedure
Mathematical Details

k-Means Clustering in Action
Cluster-based Product Recommendation
Scaling and Implementation Details
Visualizing Clusters

Evaluating k-Means
Between sum-of-squares
Within sum-of-squares
Combining k-Means with PCA

Academy Modules
Graded Quiz
Learn by Building Module
Diving into Wholesale Transactions
Using any of the two unsupervised learning algorithms you’ve
learned, produce a simple R markdown document where you
demonstrate an exercise of either clustering or dimensionality
reduction on the wholesale.csv data provided to you.

Digging Deep into NYC Property Sales


Using any of the two unsupervised learning algorithms you’ve
learned, produce a simple R markdown document where you
demonstrate an exercise of either clustering or dimensionality
reduction on the nyc data provided to you.

Explain your choice of parameters (how you choose k for k-means clustering, or how many of the original dimensions you choose to retain for PCA). What are some business utilities for the unsupervised model you’ve developed?
The R Markdown document should be no longer than 4 paragraphs and contain one or two visualizations.

Time Series & Forecasting
4-Day Workshop

TS&F

Decomposition of time series allows us to learn about the underlying seasonality, trend and random fluctuations in a systematic fashion. In this workshop, we learn the methods to account for seasonality and trend, work with autocorrelation models and create industry-scale forecasts using modern tools and frameworks.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

Module 1: Time Series I

Working with Time Series
Application of Time Series
Definition of a ts object
Functions to work with time series

Time Series in Action
Indonesia's gas emissions, 1970-2012
Frequency, Start and End
Time Series Plots

Classical Decomposition
Trend, Seasonality and Residuals
Understanding Lags
Additive vs Multiplicative

Classical Decomposition in Action
Monthly Airline Passenger, 1949-1960
The decompose function
Understanding Smoothing

Techniques to work with Time Series
Adjusting for Seasonality
Detrending
Decomposing Non-Seasonal Time Series

Module 2: Forecasting

Forecasting I
Simple Moving Average
Simple Moving Average from First Principles
Log-transformation

Forecasting II
Forecasting using One-sided SMA
Forecasting using Exponential Smoothing
Holt's Exponential Smoothing

Forecasting III
The alpha, beta, and gamma coefficients
Mathematical Details
Holt-Winters Exponential Smoothing

Advanced Time Series
ACF and PACF
ARMA and ARIMA Models
Stationarity and Differencing

Advanced Time Series II
Augmented Dickey-Fuller (ADF) test
Seasonal ARIMA
Tips to work with xts
Facebook's Prophet
Quantmod for quantitative traders

Academy Modules
Graded Quiz
Learn by Building Module
Forecasting the Crime rate in Chicago
Download the dataset from the Chicago Crime Portal, and use a sample of the data to build a forecasting project where you inspect the seasonality and trend of crime in Chicago. Submit your project as an R Markdown (.Rmd) file, and address the following questions:
- Has crime generally been rising in Chicago over the past decade (last 10 years)?
- Is there a seasonal component to the crime rate?
- Which time series method seems to capture the variation in your time series better? Explain your choice of algorithm and its key assumptions.
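
When comparing methods for the last question, it can help to see how the two simplest smoothers from the workshop behave. The sketch below is an illustration in Python rather than the R used in class. The monthly counts are invented placeholders (not Chicago data), and the function names are hypothetical.

```python
def simple_moving_average(series, window):
    """One-sided (trailing) moving average: one value per full window."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, started from the first observation.

    alpha close to 1 follows recent values; alpha close to 0 smooths heavily.
    """
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Placeholder monthly counts, invented for illustration only.
monthly_counts = [120, 135, 128, 150, 160, 155, 170, 165]
sma = simple_moving_average(monthly_counts, window=3)
ses = exponential_smoothing(monthly_counts, alpha=0.5)
print(sma)
print(ses)
```

Neither smoother models trend or seasonality explicitly, which is the gap that Holt's and Holt-Winters exponential smoothing are meant to address.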

Neural Network & Deep Learning
4-Day Workshop

NN&DL

Develop artificial neural networks that can recognize faces and handwriting patterns, and that are at the core of some of the most cutting-edge cognitive models in the AI landscape. We will learn to create a backpropagation neural network from scratch, and use our neural network for classification tasks. This class is the final course in the Machine Learning Specialization.

We strongly recommend that you complete the pre-requisite workshops prior to taking this course. Some concepts presented throughout the lecture may be less-than-ideal for practitioners who have not completed the pre-requisite courses.

Module 1: Neural Network

Artificial Neural Networks
The biological brain inspiration
Cost function
The building blocks of neural networks

Neural Network Architecture
Layers, Nodes and Signals
Network Topology
Feed-forward vs Recurrent Signal

Neural Network Architecture II
Hidden Layers
Computing with Neural Network
Mathematical Details

Multi-Layer Perceptrons (MLP)
Backpropagation of error
Feed-forward vs Recurrent
Mathematical Details

Module 2: Deep Learning

Neural Networks from First Principles
Sum of Squared Error
Cross-Entropy Error
The Gradient Descent Algorithm

Neural Networks from Scratch
Gradient Descent by hand
Neural Network by hand
Learning Rate and Implementation Details

Neural Networks in Action
Putting it all together
Parameterization and Practical Advice
Deep Learning for Classification and Regression

Deep Learning in Action
Theorizing with Effect of Depth
Activation Functions
Visualizing Logarithmic Loss

Deep Learning in Action II
Predicting Bank Telemarketing Campaign
Visualizing tricks for Deep Neural Networks
Parameterization and Practical Advice

Keras in Action
MNIST handwritten digit recognition
Defining the Model
Training and Evaluation

Academy Modules
Graded Quiz
Learn by Building Module
Image Classification Using Neural Network
Build a neural network capable of classifying images into one
of many classes and explain the choice of your architecture.
Test your neural network using unseen images – can your
algorithm correctly classify 80% of the images?

Machine Learning Capstone Project

LEARN MORE AND DIVE DEEPER

After having learned various machine learning methods and their applications, students are required to choose one project that challenges them to construct an optimal model from the dataset given. The selection of methods includes Forecasting, Regression and Classification.

The project is marked out of 36 points; the rubrics for assessment and grading will be discussed in class.

ENROLL NOW TO OUR ACADEMY!
bit.ly/algo_academy

“Learning how to do data science is like learning to ski. You have to do it.”
~ Claudia Perlich, Chief Scientist, Dstillery.

How to Apply:
1. Go to bit.ly/algo_academy (or scan the QR CODE)
2. Click ENROLL NOW! and fill in the form.
3. One of our Education Consultants will reach out to you within 24 hours (on working days).
4. Arrange payment
5. Congratulations! You're on your way to start your Data Science journey!