IC Outlines for Data Science Machine Learning
Course Instructors
Araf Mustavi
Analyst, IDT Operations
BAT Bangladesh
Ex - Machine Learning Engineer,
ACI Limited
Tools & Technology
***
How to Effectively Apply for a Data Scientist/ML Engineer Position?
Interview Skill Development
Freelancing Guidance
Python Fundamentals
Week 01
● Class 1: Basics of Python Programming Language
❖ Introduction to Python Programming Language.
❖ Introduction to Google Colab & Jupyter Notebook for Writing and Executing
Python Code.
❖ Getting used to the User Interface (UI) of Jupyter Notebook/Google Colab.
❖ Python Data Types: Numeric, Strings, Boolean, List, Dictionary, Tuple, Set,
None.
❖ Iterables & Mutability; Lists vs. Tuples.
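As a small illustration of the mutability topic above, the sketch below contrasts a list with a tuple. The variable names (`scores`, `point`, `distances`) are made up for this example and are not part of the course material.

```python
# Lists are mutable: they can be changed in place.
scores = [70, 85, 90]
scores[0] = 75        # item assignment works
scores.append(100)    # growing the list works

# Tuples are immutable: item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Both lists and tuples are iterables, so both work in a for loop.
total = 0
for value in point:
    total += value

# A tuple of hashable items can be a dictionary key; a list cannot.
distances = {point: 5.0}
```

A common rule of thumb is to use a list for a collection that will grow or change and a tuple for a fixed record such as a coordinate pair.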
*** By the end of the Python Fundamentals module, students will have built a
solid foundation in core Python programming concepts and practices. This is
critical for anyone pursuing a career in data science or as a machine learning (ML)
engineer, as Python is the primary language used in these fields.
Outcomes:
Week 02
● Class 4: Fundamentals of DBMS-I
❖ What is a Relational Database Management System?
❖ ACID Properties.
❖ How is data stored in a relational database?
❖ Concept of Normalization.
❖ OLTP vs. OLAP.
❖ Database vs. Data Warehouse vs. Data Lake/Lakehouse.
❖ What is a NoSQL Database and what are the BASE Properties?
❖ Difference between Relational and NoSQL Database.
❖ Which one should you choose in which case?
❖ For BI Solution, what should you choose to store your data?
❖ What is Snowflake and what advantage does it provide? (Micro-Partitioning)
Week 03
● Class 7: Exploratory Data Analysis using SQL
❖ Basic Queries: SELECT, FROM, WHERE, LIKE, ILIKE, IN, DISTINCT,
BETWEEN, GROUP BY, ORDER BY, LIMIT, OFFSET, ALIAS.
❖ Aggregate Functions: COUNT, SUM, AVG, MIN, MAX.
❖ Difference between WHERE and HAVING Clause.
❖ Some built-in Functions: EXTRACT, DATE_PART, TO_DATE, TO_CHAR.
❖ CASTING, SUBSTRING, POSITION, COALESCE, NULLIF.
❖ Removing Duplicates.
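The difference between WHERE and HAVING listed above can be sketched with a tiny query. The outline does not name a database engine, so this example assumes Python's built-in `sqlite3` module as a stand-in; the `orders` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, status TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Asha", "paid", 120.0),
        ("Asha", "paid", 80.0),
        ("Asha", "cancelled", 500.0),
        ("Babul", "paid", 40.0),
        ("Chitra", "paid", 300.0),
    ],
)

# WHERE filters individual rows before grouping;
# HAVING filters whole groups after aggregation.
query = """
    SELECT customer, SUM(amount) AS total_paid
    FROM orders
    WHERE status = 'paid'
    GROUP BY customer
    HAVING SUM(amount) >= 100
    ORDER BY total_paid DESC
"""
rows = conn.execute(query).fetchall()
conn.close()
```

Here the cancelled order is dropped by WHERE before any sum is computed, while the customer whose paid total is below 100 is dropped by HAVING after grouping.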
Week 04
● Class 10: Window Functions
❖ Window Functions: among the SQL features most widely used by Data Analysts
& Data Scientists.
❖ Window Functions: RANK, DENSE_RANK, ROW_NUMBER, LEAD, LAG,
FIRST_VALUE, AGGREGATE WINDOW FUNCTIONS, FRAME
SPECIFICATION, WINDOW CHAINING.
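To make the ranking functions above concrete, here is a minimal sketch that compares RANK, DENSE_RANK, ROW_NUMBER and LAG on tied values. It assumes Python's built-in `sqlite3` module with an SQLite build of version 3.25 or later (the first version with window functions); the `sales` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 500), ("B", 400), ("C", 400), ("D", 300)],
)

# B and C are tied on amount, which is where the three
# ranking functions start to differ.
query = """
    SELECT rep,
           amount,
           RANK()       OVER (ORDER BY amount DESC)      AS rnk,
           DENSE_RANK() OVER (ORDER BY amount DESC)      AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY amount DESC, rep) AS row_num,
           LAG(amount)  OVER (ORDER BY amount DESC, rep) AS prev_amount
    FROM sales
    ORDER BY amount DESC, rep
"""
rows = conn.execute(query).fetchall()
conn.close()
```

RANK leaves a gap after the tie (1, 2, 2, 4), DENSE_RANK does not (1, 2, 2, 3), ROW_NUMBER numbers every row uniquely, and LAG returns the previous row's amount (NULL for the first row).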
*** By the end of this module, students will be able to confidently perform data
analysis and reporting for different departments of an organisation using SQL.
They will be well-equipped to understand complex queries, validate data, and
help a business by generating important KPIs.
They will also be able to support businesses by performing complex analyses such
as Cohort Analysis and RFM Segmentation. They will understand the importance of
Views, Stored Procedures, and Triggers, which will help them when working with
Python, Power BI, Tableau, etc. with a Database/Data Warehouse as the data
source.
Week 04
● Class 12: Descriptive Statistics
❖ Types of Variables.
❖ Level of Measurement.
❖ Frequency Distribution.
❖ Getting Familiar with Different Statistical Charts.
❖ Which Chart/Table/Frequency Distribution to choose for which Type of
Variable?
Week 05
● Class 13: Measures of Central Tendency & Dispersion
❖ Measures of Central Tendency [Mean, Median, Mode], and Location [Quartile,
Decile, Percentile].
❖ Which Measure is best in which situation?
❖ Measures of Dispersion [Range, Variance, Standard Deviation, Coefficient of
Variation].
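The measures listed above can be tried out with Python's standard `statistics` module. The sketch below uses a made-up salary sample (in thousands) with one extreme value, so the gap between mean and median is visible; the sample variance and standard deviation (n - 1 denominator) are an assumption about which variant the class uses.

```python
import statistics

# Hypothetical salaries in thousands; 115 is an extreme value.
salaries = [30, 35, 35, 40, 45, 50, 115]

# Central tendency
mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)

# Location: the three quartile cut points
q1, q2, q3 = statistics.quantiles(salaries, n=4)

# Dispersion (sample versions, dividing by n - 1)
data_range = max(salaries) - min(salaries)
variance = statistics.variance(salaries)
std_dev = statistics.stdev(salaries)
coeff_of_variation = std_dev / mean
```

In this sample the mean (50) is pulled up by the single large salary while the median (40) is not, which is one reason the median is often preferred for skewed data.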
Week 06
● Class 16: Parametric Hypothesis Test
❖ Assumptions for Choosing a Parametric Hypothesis Test.
❖ T-tests, F-test, Chi-squared Test.
❖ Executing Hypothesis Tests in Python and Interpreting the Results.
❖ When to choose which one?
*** By completing the Statistics with Python for Data Science module, students
will gain a robust understanding of essential statistical concepts and how to apply
them to real-world data science problems using Python. This module aims to
bridge statistical theory and practical implementation, which is critical for data
analysis and predictive modelling.
Overall Achievement:
By the end of this module, students will have the ability to:
Week 06
Class 18: Data Analysis using NumPy
❖ A brief introduction
❖ Jupyter Notebook installation & exploration.
❖ Google Colab
❖ NumPy arrays, arange, linspace
❖ Array methods and attributes.
❖ Indexing, slicing
❖ Broadcasting
❖ Boolean masking
❖ Arithmetic Operations
❖ Universal Functions
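A few of the NumPy topics above (array creation, broadcasting, Boolean masking, universal functions) fit in one short sketch. It assumes NumPy is installed and imported as `np`; the array names are illustrative only.

```python
import numpy as np

# Array creation
evens = np.arange(0, 10, 2)          # start, stop (exclusive), step
grid = np.linspace(0.0, 1.0, 5)      # 5 evenly spaced points, endpoints included

# Broadcasting: a (2, 3) matrix plus a (3,) row vector.
# The row is added to every row of the matrix without an explicit loop.
matrix = np.arange(6).reshape(2, 3)
row = np.array([10, 20, 30])
shifted = matrix + row

# Boolean masking: keep only the elements where the condition holds.
mask = evens > 3
selected = evens[mask]

# Universal function: applied element by element.
roots = np.sqrt(evens)
```

Note that `np.arange` excludes the stop value while `np.linspace` includes it by default, which is a frequent source of off-by-one surprises.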
Week 07
Class 19: Data Analysis using Pandas - 1
❖ Pandas Introduction.
❖ Pandas Data Structures - Series
❖ Pandas Data Structures – DataFrame
❖ Creating DataFrame
❖ Grab data (column-wise)
❖ Grab data (row-wise)
❖ Grabbing an element or a sub-set of the dataframe
❖ Adding a new column
❖ Deleting a column
❖ Boolean mask
❖ reset_index(), set_index(), head(), tail(), info(), describe()
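The DataFrame operations above can be sketched on a three-row table. This assumes pandas is installed and imported as `pd`; the column names and values are invented for the example.

```python
import pandas as pd

# Creating a DataFrame from a dictionary of columns
df = pd.DataFrame({
    "name": ["Asha", "Babul", "Chitra"],
    "dept": ["Sales", "IT", "Sales"],
    "salary": [50000, 65000, 58000],
})

names = df["name"]                 # grab data column-wise (a Series)
first_row = df.iloc[0]             # grab data row-wise, by position
one_value = df.loc[1, "salary"]    # grab a single element by label

df["bonus"] = df["salary"] * 0.10          # adding a new column
sales_only = df[df["dept"] == "Sales"]     # Boolean mask on rows
df = df.drop(columns=["bonus"])            # deleting a column

indexed = df.set_index("name")     # use a column as the index
restored = indexed.reset_index()   # move the index back to a column
```

`df.head()`, `df.tail()`, `df.info()` and `df.describe()` can then be called on any of these frames for a quick look at their contents.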
Week 08
Class 22: Project on Data Analysis using Pandas
*** NumPy and Pandas are two fundamental Python libraries that form the
cornerstone of data analysis and manipulation. Learning these libraries equips you
with the tools to efficiently work with large datasets, extract meaningful insights,
and make data-driven decisions.
Week 09
Communicating Insights:
Machine Learning
Data Processing, Transforming and Feature
Engineering
Week 10
Class 28: Introduction to Machine Learning
❖ Introduction to ML - What, Why
❖ Machine Learning Applications
❖ Supervised Learning
❖ Unsupervised Learning
❖ What is a Machine Learning Model?
❖ Training data and Test data
❖ Splitting Data, Train set & Test set
❖ Underfitting and Overfitting
❖ K-Fold Cross-Validation
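To show what K-Fold Cross-Validation does with the train and test sets, here is a minimal pure-Python sketch that only produces the index splits. The helper `k_fold_indices` is hypothetical and written for this example; it uses contiguous, unshuffled folds, whereas in practice a library routine (often with shuffling) would normally be used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every sample appears in exactly one test fold, so the model is evaluated once on each data point while never being tested on data it was trained on in that round.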
Class 29: Data Processing, Transformation and Feature Extraction
❖ Feature Scaling Theory
❖ Feature Scaling - Hands-on
❖ Feature Extraction
❖ Image to Pixel using CV2
❖ Resize, bitwise NOT and Pixel to Photo
❖ How to prepare a dataset using photos
❖ Linear Discriminant Analysis (LDA)
❖ Bag of words, vectorization
❖ Bag of N-grams
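The bag-of-words and bag-of-n-grams ideas above can be sketched without any library. The two documents, the whitespace tokenizer and the `ngrams` helper are all assumptions made for this illustration; real pipelines usually add steps such as punctuation handling and stop-word removal.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


docs = ["the cat sat", "the cat ran"]            # hypothetical corpus
tokenized = [doc.lower().split() for doc in docs]

# Vocabulary: every distinct word, in a fixed (sorted) order.
vocab = sorted({token for tokens in tokenized for token in tokens})

# Bag of words: one count per vocabulary word for each document.
vectors = [[Counter(tokens)[term] for term in vocab] for tokens in tokenized]

# Bag of n-grams: the same idea, but counting word pairs (n = 2).
bigrams = ngrams(tokenized[0], 2)
```

Each document becomes a fixed-length numeric vector, which is the form most machine learning models expect; word order is lost with single words and partly recovered with n-grams.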
Class 30: Feature Selection, Outlier Detection and Removal
❖ Feature Importance, Feature Selection
❖ Label Encoding
❖ Ordinal Encoding
❖ One Hot Encoding
❖ Outlier Detection and Removal Using IQR
❖ Outlier Detection and Removal using Standard Deviation
❖ Outlier Detection and Removal using Z-Score
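The IQR and Z-score approaches above can be compared on the same small sample. This is a sketch using only the standard `statistics` module; the data, the 1.5 x IQR fences and the Z-score threshold of 2 are assumptions chosen for the example (a threshold of 3 is also common, but in a sample this small the extreme value inflates the standard deviation and can hide itself).

```python
import statistics

values = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]   # hypothetical data

# --- IQR method ---
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
iqr_outliers = [v for v in values if v < lower_fence or v > upper_fence]

# --- Z-score method ---
mean = statistics.mean(values)
std = statistics.pstdev(values)            # population standard deviation
z_threshold = 2                            # assumed cut-off for this sketch
z_outliers = [v for v in values if abs((v - mean) / std) > z_threshold]

# Removal: keep only the points inside the IQR fences.
cleaned = [v for v in values if lower_fence <= v <= upper_fence]
```

The IQR method relies on quartiles, so it is less affected by the outlier itself than the mean and standard deviation used by the Z-score method.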
***By the end of this module, students will have the ability to
Week 13
Class 37: Decision Tree
❖ Decision Tree Theory
❖ Decision Tree Algorithm
❖ Decision Tree Pen & Paper Exercise
❖ Hands-on: Model development using Decision Tree
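A pen-and-paper decision tree exercise usually comes down to scoring candidate splits. The sketch below assumes Gini impurity as the split criterion (the outline does not say which criterion the class uses; entropy is a common alternative), and the labels are invented.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical node with 5 "yes" and 5 "no", and one candidate split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

parent_impurity = gini(parent)
split_impurity = weighted_gini(left, right)
impurity_decrease = parent_impurity - split_impurity
```

The tree-building algorithm repeats this calculation for every candidate split at a node and keeps the one with the largest impurity decrease.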
Week 14
Class 40: Project on Supervised Learning (Classification)
Unsupervised Learning (Clustering)
Week 15
Class 43: Project on Unsupervised Learning (Clustering)
Ensemble Learning
Class 44: Ensemble Learning Models - Random Forest
❖ Ensemble Learning
❖ What is Bagging?
❖ What is a Bagged Tree?
❖ Random Forest
❖ Random Forest Theory
❖ Random Forest Algorithm
❖ Hands-on: Random Forest
Week 16
● Class 46: Introduction to the Basics of Time Series Analysis
❖ Introduction to Time Series Analysis
❖ Moving Averages
❖ Exponential Smoothing
❖ Decomposition
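Moving averages and exponential smoothing, listed above, are both short enough to write by hand. The sketch below assumes a simple (unweighted, trailing) moving average and simple exponential smoothing initialised with the first observation; the `sales` series and the smoothing factor are made up.

```python
def moving_average(series, window):
    """Trailing simple moving average; output starts once a full window exists."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    smoothed = [series[0]]          # assumed initialisation: first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


sales = [10, 20, 30, 40, 50]        # hypothetical monthly sales
ma3 = moving_average(sales, window=3)
ses = exponential_smoothing(sales, alpha=0.5)
```

A larger window or a smaller `alpha` gives a smoother but slower-reacting series; a smaller window or a larger `alpha` follows recent changes more closely.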
Week 18
Class 52: Model Evaluation and Validation Techniques
❖ Evaluation metrics and validation techniques for image classification and OCR
models.
❖ Practical: Evaluating the performance of the OCR pipeline and CNN model.
Week 19
Class 55: Real-World Applications
❖ Advanced image processing for region detection.
MLOps (4 Classes)
Class 57: Introduction to MLOps Concepts
❖ Overview of MLOps and the ML model lifecycle.
❖ Importance of MLOps in production environments and introduction to CI/CD in
ML.
Week 20
Class 58: Building and Managing ML Pipelines
Job Preparation
Extra Week taken by the Support Instructor:
❖ Class 1:
* CV Making & How to Write a Cover Letter?
* Portfolio Building.
❖ Class 2:
* Guidance on Freelancing Career
* Interview Skill Development & How to Effectively Apply for a Position.
* Roadmap for the Future.