
AFRICDSA Certified Data Scientist Syllabus - V1.2


CERTIFIED DATA SCIENTIST

PROGRAM SYLLABUS

Accredited by

CONTENTS
COURSE 1: DATA SCIENCE FOUNDATION
COURSE 2: PYTHON ESSENTIALS FOR DATA SCIENCE
COURSE 4: STATISTICS FOR DATA SCIENCE
COURSE 5: DATA PREPARATION WITH NUMPY & PANDAS
COURSE 6: VISUALIZATION WITH PYTHON
COURSE 7: MACHINE LEARNING ASSOCIATE
COURSE 8: ADVANCED MACHINE LEARNING
COURSE 9: SQL FOR DATA SCIENCE
COURSE 10: DEEP LEARNING – CNN BASICS
COURSE 11: TABLEAU & POWER BI
COURSE 12: ML MODEL DEPLOY - FLASK API
COURSE 13: BIG DATA ESSENTIALS
COURSE 14: DATA SCIENCE PROJECT EXECUTION
CONTACTS

©2023 AFRICDSA. All content in this document is copyrighted.
Reproducing any part of the content requires written permission from AFRICDSA®, IABAC®
COURSE 1 DATA SCIENCE FOUNDATION
MODULE 1 INTRODUCTION TO DATA SCIENCE
WHAT IS DATA SCIENCE?
EVOLUTION OF DATA SCIENCE
DATA SCIENCE TERMINOLOGIES

MODULE 2 DATA SCIENCE VS BUSINESS ANALYTICS VS BIG DATA
COMPARING VARIOUS RELATED DOMAINS WITH DATA SCIENCE

MODULE 3 CLASSIFICATION OF BUSINESS ANALYTICS
DESCRIPTIVE ANALYTICS
PREDICTIVE ANALYTICS
DISCOVERY ANALYTICS AND PRESCRIPTIVE ANALYTICS

MODULE 4 DATA SCIENCE PROJECT WORKFLOW
CRISP-DM FRAMEWORK
DATA SCIENCE PROJECT WORKFLOW

MODULE 5 ROLES IN DATA SCIENCE
INDUSTRY ROLES AND RESPONSIBILITIES

MODULE 6 APPLICATION OF DATA SCIENCE IN VARIOUS INDUSTRIES
INDUSTRY ADOPTION: HEALTH CARE, FINANCE & BANKING, MANUFACTURING, RETAIL, LOGISTICS, HUMAN RESOURCE
KEY USE CASES
TRENDS ON DATA SCIENCE ADOPTION

COURSE 2 PYTHON ESSENTIALS FOR DATA SCIENCE


MODULE 1 PYTHON INSTALLATION AND SETUP
INSTALLING ANACONDA ON LOCAL MACHINE
ADDITIONAL PACKAGE INSTALLATION
INTRODUCTION TO JUPYTER NOTEBOOK
JUPYTER NOTEBOOK KEYBOARD SHORTCUTS

MODULE 2 GOOGLE COLAB
INTRODUCTION TO GOOGLE COLAB
SELECTING RUNTIME ENVIRONMENT GPU/TPU
UPLOADING DATA/FILES IN COLAB
LOADING GOOGLE DRIVE AS FOLDER
SHARING COLAB NOTEBOOK

MODULE 3 PYTHON INTRODUCTION
NATIVE DATA TYPES
KEY PYTHON FUNCTIONS
SLICING OPERATIONS
IMPORTING PACKAGES - DATETIME
PACKAGE, SUB-PACKAGE, METHODS & ATTRIBUTES

MODULE 4 PYTHON DATA STRUCTURES
DATA STRUCTURES INTRODUCTION
LISTS, LIST OPERATIONS
TUPLE
SETS
DICTIONARIES

MODULE 5 PYTHON CONTROL STATEMENTS
LOOPING THROUGH ITERABLE DATA SET
IF - CONDITIONAL STATEMENT
FOR – LOOP
USER DEFINED FUNCTIONS

Certified Data Scientist | Syllabus | ©2023 AFRICDSA®



COURSE 3 STATISTICS FOR DATA SCIENCE


MODULE 1 INTRODUCTION TO STATISTICS
DESCRIPTIVE AND INFERENTIAL STATISTICS
DEFINITIONS
TERMS
TYPES OF DATA

MODULE 2 HARNESSING DATA
TYPES OF SAMPLING DATA
SIMPLE RANDOM SAMPLING
STRATIFIED

MODULE 3 EXPLORATORY ANALYSIS
MEAN
MEDIAN AND MODE
DATA VARIABILITY
STANDARD DEVIATION
Z-SCORE
OUTLIERS

MODULE 4 DISTRIBUTIONS
NORMAL DISTRIBUTION
CENTRAL LIMIT THEOREM
HISTOGRAM
NORMALIZATION
NORMALITY TESTS
SKEWNESS
KURTOSIS

MODULE 5 HYPOTHESIS TESTING
UNDERSTANDING HYPOTHESIS TESTING
NULL AND ALTERNATE HYPOTHESES
MAKING A DECISION

MODULE 5 HYPOTHESIS TESTING - CRITICAL VALUE METHOD
CRITICAL VALUE METHOD
CRITICAL VALUE METHOD – EXAMPLES

MODULE 5 HYPOTHESIS TESTING – P-VALUE METHOD
P-VALUE METHOD
P-VALUE METHOD – EXAMPLES
TYPES OF ERRORS

MODULE 5 T-TESTS
T DISTRIBUTION
ONE SAMPLE T-TEST
INDEPENDENT AND RELATIONAL TWO-SAMPLE TEST
T-TEST HYPOTHESIS TESTING IN PYTHON

MODULE 5 ONE WAY ANOVA TEST / F-TEST
ANALYSIS OF VARIANCE (ANOVA) THEORY
HYPOTHESIS TESTING WITH MORE THAN TWO VARIABLES WITH ANOVA
INDUSTRY EXAMPLE
F-TEST HYPOTHESIS TESTING IN PYTHON

MODULE 5 NON-PARAMETRIC HYPOTHESIS TESTING
CHI-SQUARE TEST THEORY
APPLICATION OF CHI-SQUARE IN PYTHON

MODULE 6 CORRELATION & REGRESSION
DIRECT AND INDIRECT CORRELATION
STRONG AND WEAK CORRELATION
CALCULATING CORRELATION WITH PYTHON
REGRESSION THEORY
SIMPLE LINEAR REGRESSION WITH PYTHON



COURSE 4 DATA PREPARATION WITH NUMPY & PANDAS
MODULE 1 NUMPY: NUMERICAL PYTHON PACKAGE
INTRODUCTION
NUMPY BASICS
CREATING NUMPY ARRAYS
STRUCTURE AND CONTENT OF ARRAYS
SUBSET, SLICE, INDEX AND ITERATE THROUGH ARRAYS
MULTIDIMENSIONAL ARRAYS
PYTHON LISTS VS NUMPY ARRAYS

MODULE 2 OPERATIONS ON NUMPY ARRAYS
BASIC OPERATIONS
OPERATIONS ON ARRAYS
BASIC LINEAR ALGEBRA OPERATIONS

MODULE 3 PANDAS: PANEL DATA PACKAGE
PANDAS BASICS
INDEXING AND SELECTING DATA
MERGE AND APPEND
GROUPING AND SUMMARIZING DATAFRAME
LAMBDA FUNCTION & PIVOT TABLES

MODULE 4 DATA MUNGING WITH PANDAS
PANDAS BASICS
INDEXING AND SELECTING DATA
DATA CLEANING
MERGE AND APPEND
GROUPING AND SUMMARIZING DATAFRAME
LAMBDA FUNCTION & PIVOT TABLES

COURSE 5 VISUALIZATION WITH PYTHON


MODULE 1 BASICS OF VISUALIZATION
COMPONENTS OF A PLOT
DATA VISUALIZATION TOOLKIT
FUNCTIONALITIES OF PLOTS
SUB-PLOTS

MODULE 2 PLOTTING CATEGORICAL AND TIME SERIES DATA
INTRODUCTION
PLOTTING AGGREGATE VALUES ACROSS CATEGORIES
PLOTTING DISTRIBUTIONS ACROSS CATEGORIES
BIVARIATE DISTRIBUTIONS - PLOTTING PAIRWISE RELATIONSHIPS
VECTOR SPACES
VECTORS: THE BASICS

MODULE 3 PLOTTING DATA DISTRIBUTIONS
INTRODUCTION
UNIVARIATE DISTRIBUTIONS
UNIVARIATE DISTRIBUTIONS - RUG PLOTS



COURSE 6 MACHINE LEARNING ASSOCIATE
MODULE 1 MACHINE LEARNING INTRODUCTION
WHAT IS ML? ML VS AI
ML WORKFLOW
STATISTICAL MODELING OF ML
APPLICATION OF ML

MODULE 2 MACHINE LEARNING ALGORITHMS
POPULAR ML ALGORITHMS
CLUSTERING
CLASSIFICATION AND REGRESSION
SUPERVISED VS UNSUPERVISED
CHOICE OF ML ALGORITHMS

MODULE 3 SIMPLE LINEAR REGRESSION
REGRESSION LINE
BEST FIT LINE
ASSUMPTIONS OF SIMPLE LINEAR REGRESSION

MODULE 3 LINEAR REGRESSION IN PYTHON
READING AND UNDERSTANDING THE DATA
HYPOTHESIS TESTING IN LINEAR REGRESSION
BUILDING A LINEAR MODEL
RESIDUAL ANALYSIS AND PREDICTIONS
LINEAR REGRESSION USING SKLEARN

MODULE 3 MULTIPLE LINEAR REGRESSION
SIMPLE LINEAR REG VS MULTIPLE LINEAR REG
MULTICOLLINEARITY
DEALING WITH CATEGORICAL VARIABLES
MODEL ASSESSMENT AND COMPARISON
FEATURE SELECTION

MODULE 4 LOGISTIC REGRESSION BINARY CLASSIFIER
INTRODUCTION: UNIVARIATE LOGISTIC REGRESSION
BINARY CLASSIFICATION
SIGMOID CURVE
FINDING THE BEST FIT SIGMOID CURVE SUMMARY

MODULE 4 LOGISTIC REGRESSION MODEL BUILDING
MULTIVARIATE LOGISTIC REGRESSION
DATA CLEANING AND PREPARATION
BUILDING YOUR FIRST MODEL
FEATURE ELIMINATION USING RFE
CONFUSION MATRIX AND ACCURACY
MANUAL FEATURE ELIMINATION



COURSE 7 MACHINE LEARNING ASSOCIATE
MODULE 4 LOGISTIC REGRESSION MODEL EVALUATION
METRICS BEYOND ACCURACY: SENSITIVITY & SPECIFICITY
FINDING THE OPTIMAL THRESHOLD USING ROC CURVE
METRICS BEYOND ACCURACY: PRECISION & RECALL

MODULE 5 SUPERVISED LEARNING: K NEAREST NEIGHBOR, KNN CLASSIFIER MODEL BUILDING
INTRODUCTION TO KNN
HOW IT WORKS: THEORY
PROS AND CONS OF KNN
APPLICATIONS OF KNN
KNN IN PYTHON SKLEARN
EVALUATION: KNN MODEL

MODULE 6 UNSUPERVISED LEARNING: CLUSTERING
INTRODUCTION
UNDERSTANDING CLUSTERING
PRACTICAL EXAMPLE OF CLUSTERING - CUSTOMER SEGMENTATION

MODULE 6 K MEANS CLUSTERING
INTRODUCTION
STEPS OF THE ALGORITHM
K MEANS ALGORITHM
K MEANS AS COORDINATE DESCENT
VISUALISING THE K MEANS ALGORITHM
PRACTICAL CONSIDERATION IN K MEANS ALGORITHM
CLUSTER TENDENCY

MODULE 6 CASE: IRIS DATASET CLUSTERING
INTRODUCTION
K MEANS IN PYTHON
IRIS DATA PREPARATION
MAKING THE CLUSTERS

MODULE 7 HIERARCHICAL CLUSTERING
INTRODUCTION
HIERARCHICAL CLUSTERING ALGORITHM
INTERPRETING THE DENDROGRAM

MODULE 8 UNSUPERVISED LEARNING: PRINCIPAL COMPONENT ANALYSIS (PCA)
THE WHY'S AND WHAT'S OF PCA
BUILDING BLOCKS OF PCA
ILLUSTRATION - FINDING PRINCIPAL COMPONENTS
COMPREHENSION - CALCULATING THE PRINCIPAL COMPONENTS
SINGULAR VALUE DECOMPOSITION (SVD)



COURSE 8 ADVANCED MACHINE LEARNING
MODULE 1 CLASSIFICATION AND REGRESSION TREE (CART): DECISION TREE
INTRODUCTION TO DECISION TREES
INTERPRETING A DECISION TREE
COMPREHENSION - DECISION TREE CLASSIFICATION IN PYTHON
REGRESSION WITH DECISION TREES

MODULE 2 THEORY OF DECISION TREE
INTRODUCTION
CONCEPT OF HOMOGENEITY
GINI INDEX
ENTROPY AND INFORMATION GAIN
COMPREHENSION - INFORMATION GAIN
SPLITTING BY R-SQUARED

MODULE 3 DECISION TREE HYPERPARAMETER TUNING
BUILDING DECISION TREES IN PYTHON
CHOOSING TREE HYPERPARAMETERS IN PYTHON
COMPREHENSION - HYPERPARAMETERS
TREE TRUNCATION
ADVANTAGES AND DISADVANTAGES OF TREE TRUNCATION

MODULE 4 RANDOM FOREST ENSEMBLE TECHNIQUE
INTRODUCTION
ENSEMBLES
COMPREHENSION - ENSEMBLES
BAGGING
CREATING A RANDOM FOREST
COMPREHENSION - OOB (OUT-OF-BAG) ERROR
RANDOM FORESTS LAB

MODULE 5 NAÏVE BAYES: BAYES THEOREM AND ALGORITHM BUILDING BLOCKS
INTRODUCTION: NAIVE BAYES
CONDITIONAL PROBABILITY AND ITS INTUITION
BAYES' THEOREM
NAIVE BAYES WITH ONE FEATURE
CONDITIONAL INDEPENDENCE IN NAIVE BAYES
DECIPHERING NAIVE BAYES

MODULE 5 NAÏVE BAYES: TEXT CLASSIFICATION, HAM VS SPAM CASE STUDY
INTRODUCTION: NAIVE BAYES FOR TEXT CLASSIFICATION
DOCUMENT CLASSIFIER PRE-PROCESSING STEPS
DOCUMENT CLASSIFIER WORKED OUT EXAMPLE
LAPLACE SMOOTHING
BUILDING SPAM HAM CLASSIFIER
COMPREHENSION: NAIVE BAYES FOR TEXT CLASSIFICATION



COURSE 9 ADVANCED MACHINE LEARNING
MODULE 6 BOOSTING: INTRODUCTION, ADABOOST, GRADIENT BOOSTING, XGBOOST
INTRODUCTION TO BOOSTING
WEAK LEARNERS
ADABOOST ALGORITHM
ADABOOST DISTRIBUTION AND PARAMETER CALCULATION
ADABOOST LAB
UNDERSTANDING GRADIENT BOOSTING
GRADIENT IN GRADIENT BOOSTING
GRADIENT BOOSTING ALGORITHM
XGBOOST
KAGGLE PRACTICE EXERCISE

MODULE 7 SUPPORT VECTOR MACHINE: THEORY
INTRODUCTION TO SVM
CONCEPT OF A HYPERPLANE IN 2D
CONCEPT OF A HYPERPLANE IN 3D
MAXIMAL MARGIN CLASSIFIER
THE SOFT MARGIN CLASSIFIER
THE SLACK VARIABLE
NOTION OF SLACK VARIABLES
COST OF MISCLASSIFICATION

MODULE 7 SVM: IMPLEMENTING SVM IN SKLEARN, CASE STUDY
MAPPING NONLINEAR DATA TO LINEAR DATA
FEATURE TRANSFORMATION
THE KERNEL TRICK
MODELING SVM PYTHON SKLEARN
MODEL EVALUATION

MODULE 8 ARTIFICIAL NEURAL NETWORK (ANN)
INTRODUCTION TO ANN
SIMPLE ANN NETWORK
HOW IT WORKS: BACKPROP ALGORITHM
IMPLEMENTING ANN WITH PYTHON SKLEARN
ANN MODELING AND EVALUATION
COMPREHENSION

MODULE 9 ADVANCED ML CONCEPTS
ADV EVALUATION METRICS: ROC_AUC, R2 THEORY, PRECISION, RECALL, F1 SCORE, RMSE
K-FOLD CROSS-VALIDATION
GRID AND RANDOMIZED SEARCH CV IN SKLEARN
IMBALANCED DATA SET: SMOTE TECHNIQUE
FEATURE SELECTION TECHNIQUES
CHOOSING RIGHT ALGORITHMS



COURSE 10 SQL FOR DATA SCIENCE
MODULE 1 INSTALL SQL PACKAGES AND CONNECTING TO DB
SQLALCHEMY
PYMYSQL

MODULE 2 RDBMS (RELATIONAL DATABASE MANAGEMENT) BASICS
BASICS OF SQL DB
PRIMARY KEY
FOREIGN KEY

MODULE 3 SELECT SQL COMMAND, WHERE CONDITION
RETRIEVING DATA WITH SELECT SQL COMMAND
WHERE CONDITION TO PANDAS DATA FRAME

MODULE 4 ADVANCED SQL
ORDER BY CLAUSE
AGGREGATE FUNCTIONS
GROUP BY CLAUSE
HAVING CLAUSE
NESTED QUERIES
INNER JOIN, OUTER JOINS, MULTI JOIN

COURSE 11 DEEP LEARNING – CNN FOUNDATION


MODULE 1 INTRODUCTION TO DEEP LEARNING
WHAT IS DEEP LEARNING?
VARIOUS DEEP LEARNING MODELS IN PRACTICE AND APPLICATIONS

MODULE 2 INTRODUCTION TO IMAGE BASICS
IMAGE RESOLUTION
PIXELS
IMAGE MANIPULATIONS WITH FILTERS

MODULE 3 CONVOLUTIONAL NEURAL NETWORK (CNN) INTRO
CNN ESSENTIALS
CNN ARCHITECTURE
WORK FLOW OF IMAGE CLASSIFICATION WITH CNN

MODULE 4 CASE STUDY: KERAS–TENSORFLOW IMAGE CLASSIFICATION
CNN HANDS ON APPLICATION FOR IMAGES OF CATS AND DOGS CLASSIFICATION



COURSE 12 DATA VISUALIZATION WITH POWER BI & TABLEAU
MODULE 1 DATA IN POWER BI
Loading & linking datasets in Power BI
Cleaning data & creating calculated columns & measures using DAX
Reports, Data, and Relationship Views

MODULE 2 VISUALS IN POWER BI
Numeric visuals - cards & tables
Graphic visuals - line chart, bar chart, pie chart, column chart and tree chart
Using slicers & custom visuals

MODULE 3 DASHBOARDS
Planning, Designing, and Prototyping
Working with various charts
Working with filters

MODULE 4 VISUAL STORYTELLING
Telling stories with visuals
When to use which visuals
Presentation best practices

TABLEAU

MODULE 1 TABLEAU INTRODUCTION
TABLEAU INTERFACE
DIMENSIONS AND MEASURES
FILTER SHELF

MODULE 2 CONNECTING TO DATA SOURCE
CONNECTING TO SOURCES
EXCEL
DATABASE
PDF

MODULE 3 VISUAL ANALYTICS
CHARTS AND PLOTS WITH SUPERSTORE DATA

MODULE 4 FORECASTING
FORECASTING TIME SERIES DATA
FORECASTING SALES IN TIME PERIODS



COURSE 13 MACHINE LEARNING MODEL DEPLOYMENT - API
MODULE 1 BASICS OF APPLICATION PROGRAM INTERFACE (API)
API BASICS
LOOSELY COUPLED ARCHITECTURE

MODULE 2 INSTALLING FLASK AND FLASK CORS
INSTALLATION AND CONFIGURING FLASK
CROSS DOMAIN AUTHENTICATION WITH FLASK_CORS
EXAMPLE TO USE FLASK AS API SERVER

MODULE 3 END TO END ML PROJECT WITH API DEPLOYMENT
COMPLETE PROJECT FLOW WITH API
DEPLOYMENT AND ASSESSING THROUGH WEBSITE

COURSE 14 BIG DATA ESSENTIALS


MODULE 1 BIG DATA INTRODUCTION
WHAT IS BIG DATA?
VARIOUS BIG DATA FRAMEWORKS

MODULE 2 HADOOP AND SPARK
HADOOP INTRODUCTION
SPARK BIG DATA FOR MACHINE LEARNING
MANAGING BIG DATA IN DATA SCIENCE PROJECTS

COURSE 15 DATA SCIENCE PROJECT EXECUTION

MODULE 1 DATA SCIENCE: PROJECT STRUCTURE
CRISP-DM FRAMEWORK
6-PHASE PROJECT EXECUTION

MODULE 2 BUSINESS ASPECTS
ML USE CASE DEVELOPMENT
PROJECT MANAGEMENT METHODOLOGY
CHALLENGES AND PITFALLS



PROGRAM DETAILS
COURSE NAME: CERTIFIED DATA SCIENTIST
DURATION: 5 MONTHS FULL TIME; 9 MONTHS PART-TIME
LEARNING MODE: LIVE ONLINE OR PHYSICAL TRAINING
[email protected]
Phone: +254111866292

DATA SCIENCE IS RATED AS THE TOP CAREER


HIGHEST PAID – RECESSION PROOF – MILLIONS OF JOBS

AFRICDSA provides the most comprehensive and industry-aligned Data Science Program

1,000+ LEARNERS
20+ ELITE TRAINERS

ENQUIRE NOW

TAKE THE FIRST STEP TOWARDS A DATA SCIENCE CAREER
