
CBDS - 5 Days

This document outlines a 5-day course to teach beginners the basics of data science. The course covers topics like data gathering, cleaning, analysis, visualization, machine learning, databases, and programming in Python and R. Students will learn essential concepts and tools to start a career in data science.

Uploaded by

junitasari

Level 8, Vertical Corporate Tower B, Avenue 10
No 8, Jalan Kerinchi, Bangsar South City 59200
Federal Territory Kuala Lumpur, Malaysia
T: (+60) 03 2242 0231 | E: [email protected]

CERTIFIED BEGINNER DATA SCIENTIST


5 Days (Instructor-Led Hands-on)

Introduction
Our lives are flooded with large amounts of information, but not all of it is useful data.
It is therefore essential to learn how to apply data science to every aspect of
daily life, from personal finances, reading, and lifestyle to making business decisions.
Leveraging this data to make life easier, or to unlock new economic value for a
business, is what you will learn in this course.

This is a hands-on, guided course that teaches the concepts, tools, and
techniques you need to begin learning data science. We will cover the key topics,
from data science to big data, and the processes of gathering, cleaning, and handling
data. The course balances theory and practice, and key concepts are
taught with reference to case studies. Upon completion, participants will be able to
perform basic data handling tasks, collect and analyze data, and present the results
using industry-standard tools.

Target audience
This workshop is intended for individuals who are interested in learning data science,
or who want to begin their career as a data scientist.

Prerequisites
All participants should have a basic understanding of data and relations, and a basic
knowledge of mathematics.

Objectives
Upon completion of this course, you will be able to:
● Identify an appropriate model for different data types.
● Create your own data processing and analysis workflow.
● Define and explain the key concepts and models relevant to data science.
● Differentiate the key stages of the data ETL process, from cleaning and processing to visualization.
● Implement algorithms to extract information from a dataset.
● Apply best practices in data science and become familiar with standard tools.

Course Outline:

Day 1

Introduction to Data Science


● What is Data?

©​ Abundent Sdn. Bhd., All Rights Reserved


● Types of Data
● What is Data Science?
● Knowledge Check
● Lab Activity

Data Science Workflow


● Data Gathering
● Data Preparation & Cleansing
● Data Analysis - Descriptive, Predictive, and Prescriptive
● Data Visualization and Model Deployment
● Knowledge Check

Life of a data scientist


● What is a Data Scientist?
● Data Scientist Roles
● What does a Data Scientist Look Like?
● T-Shaped Skillset
● Data Scientist Roadmap
● Data Scientist Education Framework
● Thinking like a Data Scientist
● Knowns and Unknowns
● Demand and Opportunity
● Labor Market
● Applications of Data Science
● Data Science Principles
● Data-Driven Organization
● Developing Data Products
● Knowledge Check

Data Gathering
● Obtain data from online repositories
● Import data from local file formats (JSON, XML)
● Import data using Web API
● Scrape website for data
● Knowledge check
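
As a rough illustration of importing data from JSON and XML, the sketch below parses both formats with Python's standard library and normalizes them into the same list of records. The sample data, field names (`city`, `temp_c`), and helper functions are hypothetical and are not taken from the course material; in a lab setting the text would normally be read from local files.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for local files.
JSON_TEXT = '[{"city": "Kuala Lumpur", "temp_c": 31}, {"city": "Penang", "temp_c": 29}]'
XML_TEXT = """<readings>
  <reading city="Kuala Lumpur" temp_c="31"/>
  <reading city="Penang" temp_c="29"/>
</readings>"""


def load_json_records(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def load_xml_records(text):
    """Parse <reading> elements into dicts, converting temp_c to int."""
    root = ET.fromstring(text)
    return [
        {"city": el.get("city"), "temp_c": int(el.get("temp_c"))}
        for el in root.iter("reading")
    ]


json_records = load_json_records(JSON_TEXT)
xml_records = load_xml_records(XML_TEXT)

# Both sources describe the same readings once normalized.
print(json_records == xml_records)
```

Normalizing every source into one common record shape early makes the later cleaning and analysis steps independent of the original file format.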

Day 2

Data Science Prerequisites


● Probability and Statistics
● Linear Algebra
● Calculus

● Combinatorics
● Programming

Beginning Databases
● Types of Databases
● Relational Databases
● NoSQL
● Hybrid database
● Knowledge check
Lab activity

Structured Query Language (SQL)


● Performing CRUD (Create, Retrieve, Update, Delete)
● Designing a real-world database
● Normalizing a table
● Knowledge check
Lab Activity
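
The four CRUD operations can be sketched with Python's built-in `sqlite3` module and an in-memory database. The `student` table, its columns, and the sample rows are assumptions made for illustration only; the course may well use a different database engine and schema.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a table and insert rows (hypothetical schema).
cur.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, score REAL)"
)
cur.executemany(
    "INSERT INTO student (name, score) VALUES (?, ?)",
    [("Aisha", 82.5), ("Ben", 74.0)],
)

# Retrieve: read the rows back in a defined order.
cur.execute("SELECT name, score FROM student ORDER BY name")
rows_after_insert = cur.fetchall()

# Update: change one row, using placeholders rather than string formatting.
cur.execute("UPDATE student SET score = ? WHERE name = ?", (90.0, "Ben"))

# Delete: remove one row.
cur.execute("DELETE FROM student WHERE name = ?", ("Aisha",))

cur.execute("SELECT name, score FROM student")
rows_final = cur.fetchall()

conn.commit()
conn.close()

print(rows_after_insert)
print(rows_final)
```

The `?` placeholders let the database driver handle quoting, which avoids SQL injection when values come from user input.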

Introduction to Python
● Basics of Python language
● Functions and packages
● Python lists
● Functional programming in Python
● NumPy and SciPy
● IPython
● Knowledge check
● Lab Activity
Lab: Exploring data using Python

Day 3

Data Preparation and Cleansing


● Extract, Transform and Load (ETL) - Pentaho, Talend, etc.
● Data Cleansing with OpenRefine
● Aggregation, Filtering, Sorting, Joining
● Knowledge Check
Lab Activity

Data Quality
● Raw vs Tidy Data
● Key Features of Data Quality
● Maintenance of Data Quality
● Data Profiling
● Data Completeness and Consistency

Exploratory Data Analysis (Descriptive)


● What is EDA?
● Goals of EDA
● The role of graphics
● Handling outliers
● Dimension reduction
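
One common way to flag outliers during exploratory analysis is the interquartile range (IQR) rule, sketched below with Python's `statistics` module. The outline does not say which outlier method the course teaches, so the choice of the IQR rule, the multiplier `k=1.5`, the helper name `iqr_outliers`, and the sample values are all assumptions for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one suspicious reading.
data = [12, 13, 14, 15, 15, 16, 17, 18, 95]
print(iqr_outliers(data))
```

Flagged values are candidates for inspection, not automatic deletion: an extreme value can be a data entry error or a genuinely interesting observation.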

Introduction to R
● Packages for data import, wrangling, and visualization
● Conditionals and Control Flow
● Loops and Functions
● Knowledge check
● Lab activity
Lab: Exploring data using R

Day 4

Machine Learning (Predictive)


● Bayes' Theorem
● Information Theory
● NLP
● Statistical Algorithms
● Stochastic Algorithms

Introduction to Text Mining


● What is Text Mining?
● Natural Language Processing
● Pre-processing text data
● Extracting features from documents
● Using BeautifulSoup
● Measuring document similarity
● Knowledge check
Lab activity
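
As one possible illustration of pre-processing text, extracting features, and measuring document similarity, the sketch below builds bag-of-words count vectors and compares them with cosine similarity. The outline does not name a specific similarity measure, so cosine similarity, the simple tokenizer, and the sample sentences are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the word-count vectors of two documents."""
    vec_a, vec_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    # Only words present in both documents contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# Hypothetical sample documents.
a = "Data science turns raw data into insight."
b = "Raw data becomes insight through data science."
c = "The weather is hot today."

print(cosine_similarity(a, b) > cosine_similarity(a, c))
```

Raw counts treat every word equally; weighting schemes such as TF-IDF are a common refinement once the basic idea is clear.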

Supervised, Unsupervised, and Semi-supervised Learning


● What is prediction?
● Sampling, training set, testing set.
● Constructing a decision tree.
● Knowledge check
Lab Activity
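
A key step in constructing a decision tree is choosing which feature to split on. The sketch below uses entropy and information gain, one widely used criterion, to pick the first split on a tiny dataset. The outline does not state which splitting criterion the course uses, and the dataset, feature names (`outlook`, `windy`), and helper functions are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, feature, target):
    """Reduction in entropy of `target` after splitting rows on `feature`."""
    parent = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    # Weight each group's entropy by its share of the rows.
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - weighted


# Hypothetical training set: does the outlook or the wind predict "play"?
rows = [
    {"outlook": "sunny", "windy": True, "play": "yes"},
    {"outlook": "sunny", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": True, "play": "no"},
    {"outlook": "rainy", "windy": False, "play": "no"},
]

gains = {f: information_gain(rows, f, "play") for f in ("outlook", "windy")}
best_feature = max(gains, key=gains.get)
print(best_feature)
```

A full tree repeats this choice recursively on each resulting group until the groups are pure or a stopping rule applies.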

Day 5

Data Visualization
● Choosing the right visualization
● Plotting data using Python libraries
● Plotting data using R
● Using Jupyter Notebook to validate scripts
● Knowledge check
● Lab activity

Data Analysis Presentation


● Using Markdown language
● Convert your data into slides
● Data presentation techniques
● The pitfalls of data analysis
● Knowledge check
● Lab activity
● Group presentation
Lab: Mini Project

Big Data Landscape


● What is small data?
● What is big data?
● Big data analytics vs Data Science
● Key elements in Big Data (3Vs)
● Extracting value from big data
● Challenges in Big Data

Big Data Tools and Applications


● Introducing the Hadoop Ecosystem
● Cloudera vs Hortonworks
● Real-world big data applications
● Knowledge check
● Group discussion

What’s Next?
● Preview of Data Science Specialist
● Showing advanced data analysis techniques
● Demo: Interactive visualizations
