
Data Science Machine Learning 17054

The document outlines an 8-month roadmap for becoming an expert in data science, machine learning, and full-stack development. It includes sections on Python programming, data structures and algorithms, pandas, numpy, matplotlib, statistics, machine learning, and various data science and machine learning tools and techniques. The roadmap is divided into monthly sections that cover specific topics, with a focus on building practical skills through hands-on projects over the 8 months.

Uploaded by

itsmeshubham08
Copyright
© All Rights Reserved

Divine-Level

Data Science
Machine Learning
Full-Stack Roadmap

Invest 8 months to build proof of work, skills, knowledge, projects, and a portfolio,
and become industry-ready

Ankit Kumar Singh

https://www.linkedin.com/in/ankit-kumar-singh-983b0820b/
The Roadmap is divided into 16 Sections

Duration: 256 Hours of Learning (8 Months) and many more hours for practice
and project building.

Month 1 — May

1. Python Programming and Logic Building


2. Data Structure & Algorithms

Month 2 — June

3. Pandas Numpy Matplotlib


4. Statistics

Month 3 — July

5. Machine Learning
6. ML Operations

Month 4 — August

7. Natural Language Processing


8. Computer Vision
Month 5 — September

9. Data Visualization with Tableau


10. Structured Query Language (SQL)

Month 6 — October

11. Data Engineering


12. Data System Design

Month 7 — November

13. Five Major Capstone Projects


14. Interview Preparations

Month 8 — December

15. Git & GitHub


16. Personal Branding and Portfolio
Technology Stack

● Python
● Data Structures
● NumPy
● Pandas
● Matplotlib
● Seaborn
● Scikit-Learn
● Statsmodels
● Natural Language Toolkit (NLTK)
● PyTorch
● OpenCV
● Tableau
● Structured Query Language (SQL)
● PySpark
● Azure Fundamentals
● Azure Data Factory
● Databricks
● 5 Major Projects
● Git and GitHub
● AWS
● GCP
● Azure
1 | Python Programming and Logic Building
I prefer the Python programming language. Python is a good choice for starting your
programming journey. Here is the Python roadmap for logic building.

1 | Introduction and Basics

● Installation
● Python Org, Python 3
● Variables
● Print function
● Input from user
● Data Types
● Type Conversion
● First Program

2 | Operators

● Arithmetic Operators
● Relational Operators
● Bitwise Operators
● Logical Operators
● Assignment Operators
● Compound Operators
● Membership Operators
● Identity Operators

3 | Conditional Statements

● If Else
● If
● Else
● Elif (else if)
● If Else Ternary Expression
4 | While Loop

● While loop logic building


● Series based Questions
● Break
● Continue
● Nested While Loops
● Pattern-Based Questions
● pass
● Loop else

5 | Lists

● List Basics
● List Operations
● List Comprehensions / Slicing
● List Methods

6 | Strings

● String Basics
● String Literals
● String Operations
● String Comprehensions / Slicing
● String Methods

7 | For Loops

● Range function
● For loop
● Nested For Loops
● Pattern-Based Questions
● Break
● Continue
● Pass
● Loop else

8 | Functions

● Definition
● Call
● Function Arguments
● Default Arguments
● Docstrings
● Scope
● Special functions Lambda, Map, and Filter
● Recursion
● Functional Programming and Reference Functions
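
As a small illustration of the function topics above, here is a minimal sketch showing a default argument, docstrings, lambda with map and filter, and recursion. The names `greet` and `factorial` and the sample data are invented for this example; they are not part of the roadmap.

```python
def greet(name, greeting="Hello"):
    """Return a greeting; `greeting` is a default argument."""
    return f"{greeting}, {name}!"


def factorial(n):
    """Compute n! recursively for a non-negative integer n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        return 1  # base case stops the recursion
    return n * factorial(n - 1)


numbers = [1, 2, 3, 4, 5, 6]
squares = list(map(lambda x: x * x, numbers))        # apply a function to every item
evens = list(filter(lambda x: x % 2 == 0, numbers))  # keep items that pass a test

print(greet("Asha"))
print(greet("Asha", greeting="Welcome"))
print(squares, evens, factorial(5))
```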

9 | Dictionary

● Dictionaries Basics
● Operations
● Comprehensions
● Dictionaries Methods

10 | Tuple

● Tuples Basics
● Tuples Comprehensions / Slicing
● Tuple Functions
● Tuple Methods

11 | Set

● Sets Basics
● Sets Operations
● Union
● Intersection
● Difference and Symmetric Difference

12 | Object-Oriented Programming

● Classes
● Objects
● Method Calls
● Inheritance and Its Types
● Overloading
● Overriding
● Data Hiding
● Operator Overloading

13 | File Handling

● File Basics
● Opening Files
● Reading Files
● Writing Files
● Editing Files
● Working with different file extensions
14 | Exception Handling

● Common Exceptions
● Exception Handling
● Try
● Except
● Try except else
● Finally
● Raising exceptions
● Assertion
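
The pieces above fit together as shown in this minimal sketch. The functions `safe_divide` and `parse_age` are hypothetical examples chosen only to exercise try, except, else, finally, raise, and assert.

```python
def safe_divide(a, b):
    """Divide a by b, returning None when b is zero."""
    try:
        result = a / b
    except ZeroDivisionError:
        print("Cannot divide by zero")
        return None
    else:
        # Runs only when the try block raised nothing.
        return result
    finally:
        # Runs in every case, even when a return was already reached.
        print("Division attempted")


def parse_age(text):
    """Convert text to an age, raising ValueError for invalid input."""
    assert isinstance(text, str), "text must be a string"  # guards against programmer error
    age = int(text)  # int() itself raises ValueError for non-numeric text
    if age < 0:
        raise ValueError("age cannot be negative")
    return age


print(safe_divide(6, 3))
print(safe_divide(1, 0))
print(parse_age("30"))
```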

15 | Regular Expression

● Basic RE functions
● Patterns
● Meta Characters
● Character Classes
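
A short sketch of the basic `re` functions with a few metacharacters and character classes follows. The sample text and patterns are invented for illustration.

```python
import re

text = "Order 66 shipped on 2023-05-14; order 7 shipped on 2023-06-01."

# \d is the digit character class; {4} and {2} are repetition counts.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# re.search returns the first match or None; the parentheses capture a group.
first_order = re.search(r"[Oo]rder (\d+)", text)

# re.sub replaces every match; here each digit is masked.
masked = re.sub(r"\d", "#", "PIN 4821")

print(dates)
print(first_order.group(1))
print(masked)
```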

16 | Modules & Packages

● Different types of modules


● Inbuilt modules
● OS
● Sys
● Statistics
● Math
● String
● Random
● Create your own module
● Building Packages
● Build your own Python module and publish it to PyPI so it can be installed with pip

17 | Data Structures

● Stack
● Queue
● Linked Lists
● Sorting
● Searching
● Linear Search
● Binary Search
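
To make the difference between the two searches concrete, here is a minimal sketch of both. Binary search assumes the input list is already sorted; the function names and sample data are chosen for this example.

```python
def linear_search(items, target):
    """Check every item in turn; works on unsorted data."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


def binary_search(sorted_items, target):
    """Halve the search range each step; requires sorted input."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


data = [3, 8, 15, 23, 42, 57]
print(linear_search(data, 23))
print(binary_search(data, 23))
print(binary_search(data, 4))
```

Linear search inspects up to n items, while binary search needs about log2(n) comparisons, which is why sorting and searching are usually studied together.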

18 | Higher-Order Functions

● Function as a parameter
● Function as a return value
● Closures
● Decorators
● Map, Filter, Reduce Functions
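
The ideas above can be sketched in a few lines: a function that returns a function (a closure), a decorator that takes a function as a parameter, and `reduce`. The names `make_multiplier`, `count_calls`, and `add` are hypothetical.

```python
from functools import reduce, wraps


def make_multiplier(factor):
    """Return a function that remembers `factor` (a closure)."""
    def multiply(x):
        return x * factor
    return multiply


def count_calls(func):
    """Decorator that counts how many times `func` is called."""
    @wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    return a + b


triple = make_multiplier(3)
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 0)  # fold a list into one value

print(triple(5))
print(add(1, 2), add.calls)
print(total)
```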

19 | Python Web Scraping

● Understanding BeautifulSoup
● Extracting Data from websites
● Extracting Tables
● Data in JSON format
20 | Virtual Environment

● Virtual Environment Setup

21 | Web Application Project

● Flask
● Project Structure
● Routes
● Templates
● Navigations

22 | Git and GitHub

● Git - Version Control System


● GitHub Profile building
● Manage your work on GitHub

23 | Deployment

● Heroku Deployment
● Flask Integration

24 | Python Package Manager

● What is PIP?
● Installation
● PIP Freeze
● Creating Your Own Package
● Upload it to PyPI
25 | Python with MongoDB Database

● SQL and NoSQL


● Connecting to MongoDB URI
● Flask application and MongoDB integration
● CRUD Operations
● Find
● Delete
● Drop

26 | Building API

● API (Application Programming Interface)


● Building API
● Structure of an API
● PUT
● POST
● DELETE
● Using Postman

27 | Statistics with NumPy

● Statistics
● NumPy basics
● Working with Matrix
● Linear Algebra operations
● Descriptive Statistics
28 | Data Analysis with Pandas

● Data Analysis basics


● Dataframe operations
● Working with 2-dimensional data
● Data Cleaning
● Data Grouping

29 | Data Visualization with Matplotlib

● Matplotlib Basics
● Working with plots
● Plot
● Pie Chart
● Histogram

30 | What to do Now?

● Discussions on how to proceed further with this knowledge.


2 | Data Structure & Algorithms
Data structures are among the most important things to learn, not only for data
scientists but for everyone working in computer science. Studying them gives you
an understanding of how software works internally.

0 | Data Structures & Algorithms Starting Point

● Getting Started
● Variables
● Data Types
● Data Structures
● Algorithms
● Analysis of Algorithm
● Time Complexity
● Space Complexity
● Types of Analysis
● Worst
● Best
● Average
● Asymptotic Notations
● Big-O
● Omega
● Theta
Data Structures - Phase 1

1 | Stack

2 | Queue

3 | Linked List

4 | Tree

5 | Graph

Algorithms - Phase 2

6 | List and Array

7 | Swapping and Sorting

8 | Searching

9 | Recursion

10 | Hashing

11 | Strings

12 | Dynamic Programming

Interviews Questions & Solutions


3 | Pandas Numpy Matplotlib
NumPy adds n-dimensional arrays to Python. For two-dimensional (tabular) data,
Pandas is the go-to library for analysis. Drag-and-drop tools can do similar work,
but they are limited to the features they ship with. Because Pandas is driven by
code, it can be customized to fit the real-life problem at hand.

Numpy

● Vectors, Matrix
● Operations on Matrix
● Mean, Variance, and Standard Deviation
● Reshaping Arrays
● Transpose and Determinant of Matrix
● Diagonal Operations, Trace
● Add, Subtract, Multiply, Dot, and Cross Product.

Pandas

● Series and DataFrames


● Slicing, Rows, and Columns
● Operations on DataFrame
● Different ways to create DataFrame
● Read, Write Operations with CSV files
● Handling Missing values, replacing values, and Regular Expression
● GroupBy and Concatenation

Matplotlib

● Graph Basics
● Format Strings in Plots
● Label Parameters, Legend
● Bar Chart, Pie Chart, Histogram, Scatter Plot
4 | Statistics
Descriptive Statistics

● Measure of Frequency and Central Tendency


● Measure of Dispersion
● Probability Distribution
● Gaussian Normal Distribution
● Skewness and Kurtosis
● Regression Analysis
● Continuous and Discrete Functions
● Goodness of Fit
● Normality Test
● ANOVA
● Homoscedasticity
● Linear and Non-Linear Relationship with Regression

Inferential Statistics

● t-Test
● z-Test
● Hypothesis Testing
● Type I and Type II errors
● t-Test and its types
● One way ANOVA
● Two way ANOVA
● Chi-Square Test
● Implementation of continuous and categorical data
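
As one worked instance of the tests above, the one-sample t statistic is t = (sample mean - hypothesized mean) / (s / sqrt(n)), where s is the sample standard deviation. The sketch below computes it with the standard library only. The sample values and the hypothesized mean of 5.0 are made up, and looking up the p-value (with n - 1 degrees of freedom) is left out.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """Return the one-sample t statistic for `sample`."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)
    standard_error = s / math.sqrt(n)
    return (mean - hypothesized_mean) / standard_error


sample = [5.1, 4.9, 5.3, 5.2, 5.0]
t_stat = one_sample_t(sample, 5.0)
print(round(t_stat, 4))
```

A larger absolute t value means the sample mean lies further from the hypothesized mean relative to the sampling noise.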
5 | Machine Learning
The best way to master machine learning algorithms is to work with the
Scikit-Learn framework. Scikit-Learn provides ready-made implementations of these
algorithms, and you can use one simply by creating an instance of its class. These
are the algorithms you must know, covering both supervised and unsupervised
machine learning:

● Linear Regression
● Logistic Regression
● Decision Tree
● Gradient Descent
● Random Forest
● Ridge and Lasso Regression
● Naive Bayes
● Support Vector Machine
● KMeans Clustering
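
To show what happens underneath two of the items above, here is a minimal sketch of linear regression fitted by batch gradient descent in plain Python. It deliberately does not use Scikit-Learn. The function name `fit_line`, the learning rate, the epoch count, and the toy data are all assumptions made for this example.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every point under the current w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

With Scikit-Learn the same model is usually fitted by creating a `LinearRegression` object and calling its `fit` and `predict` methods, which hides this loop behind the class.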

Other Concepts and Topics for ML

● Measuring Accuracy
● Bias-Variance Trade-off
● Applying Regularization
● Elastic Net Regression
● Predictive Analytics
● Exploratory Data Analysis
6 | MLOps
You can master any one of the cloud service providers: AWS, GCP, or Azure. Once
you understand one of them, switching to another is easy.

We will focus on AWS — Amazon Web Services first

● Deploy ML models using Flask

● Amazon Lex — Natural Language Understanding

● Amazon Polly — Text to Speech

● Amazon Transcribe — Speech to Text

● Amazon Textract — Extract Text

● Amazon Rekognition — Image Applications

● Amazon SageMaker — Building and deploying models

● Working with Deep Learning on AWS


7 | Natural Language Processing
If you are interested in working with text, you should try some of the work an NLP
engineer does and understand how language models work.

● Sentiment analysis
● POS Tagging, Parsing
● Text preprocessing
● Stemming and Lemmatization
● Sentiment classification using Naive Bayes
● TF-IDF, N-gram
● Machine Translation, BLEU Score
● Text Generation, Summarization, ROUGE Score
● Language Modeling, Perplexity
● Building a text classifier
● Identifying the gender
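
TF-IDF from the list above can be sketched without any NLP library. Several variants of the formula exist; this sketch assumes tf = term count / document length and idf = log(N / df), with whitespace tokenization. The function name `tf_idf` and the sample sentences are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "the movie was great",
    "the movie was boring",
    "great acting and great story",
]
scores = tf_idf(docs)
for doc_scores in scores:
    print({term: round(score, 3) for term, score in doc_scores.items()})
```

Terms that occur in every document get a score of zero under this variant, while terms unique to one document score highest.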

8 | Computer Vision
Image and video analytics build on computer vision, and working in computer vision
starts with understanding how images are represented.

● PyTorch Tensors
● Understanding pretrained models like AlexNet and ResNet, and the ImageNet dataset they are trained on.
● Neural Networks
● Building a perceptron
● Building a single-layer neural network
● Building a deep neural network
● Recurrent neural network for sequential data analysis
Convolutional Neural Networks

● Understanding the ConvNet topology


● Convolution layers
● Pooling layers
● Image Content Analysis
● Operating on images using OpenCV-Python
● Detecting edges
● Histogram equalization
● Detecting corners
● Detecting SIFT feature points

9 | Data Visualization with Tableau


How to use it, Visual Perception

● What is it, How it works, Why Tableau


● Connecting to Data
● Building charts
● Calculations
● Dashboards
● Sharing our work
● Advanced Charts, Calculated Fields, Calculated Aggregations
● Conditional Calculation, Parameterized Calculation
10 | Structured Query Language (SQL)
● Fundamentals of SQL syntax and installation
● Creating Tables, Modifiers
● Inserting and Retrieving Data: SELECT, INSERT, UPDATE, DELETE
● Aggregating Data using Functions, Filtering, and RegEx
● Subqueries, retrieving data based on conditions, grouping of data
● Practice Questions
● JOINs
● Advanced SQL concepts such as transactions, views, stored procedures, and
functions.
● Database Design principles, normalization, and ER diagrams.
● Practice, Practice, Practice: Practice writing SQL queries on real-world
datasets, and work on projects to apply your knowledge.
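
Several of the SQL topics above can be practiced from Python with the built-in `sqlite3` module and an in-memory database. The tables `customers` and `orders` and their rows are made up for this sketch, and SQLite's dialect may differ in details from other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database that lives in RAM
cur = conn.cursor()

# Creating tables
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    )
""")

# INSERT, UPDATE, DELETE
cur.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                [(1, "Asha"), (2, "Ravi"), (3, "Meena")])
cur.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                [(1, 250.0), (1, 100.0), (2, 75.5)])
cur.execute("UPDATE orders SET amount = 80.0 WHERE customer_id = 2")
cur.execute("DELETE FROM customers WHERE id = 3")
conn.commit()

# SELECT with a JOIN, aggregation, grouping, and filtering on the aggregate
cur.execute("""
    SELECT c.name, COUNT(o.id) AS order_count, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    HAVING SUM(o.amount) > 50
    ORDER BY total DESC
""")
rows = cur.fetchall()
for row in rows:
    print(row)

conn.close()
```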
11 | Data Engineering
BigData

● What is BigData?
● How is BigData applied within Business?

PySpark

● Resilient Distributed Datasets


● Schema
● Lambda Expressions
● Transformations
● Actions

Data Modeling

● Duplicate Data
● Descriptive Analysis of Data
● Visualizations
● MLlib
● ML Packages
● Pipelines

Streaming

● Packaging Spark Applications


12 | Data System Design
What is system design?

● IP and OSI Model


● Domain Name System (DNS)
● Load Balancing
● Clustering
● Caching
● Availability, Scalability, Storage

Databases and DBMS

● SQL databases
● NoSQL databases
● SQL vs NoSQL databases
● Database Replication
● Indexes
● Normalization and Denormalization
● CAP theorem

System Design Interview

● URL Shortener
● WhatsApp, Twitter, Netflix, Uber
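
For the URL shortener question, one common approach (an assumption here, not something the roadmap prescribes) is to give each URL an increasing integer ID and encode that ID in base 62 to get a short code. The sketch below keeps everything in a dictionary; a real design would use a database and handle concurrency. The class and function names are hypothetical.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols


def encode_base62(number):
    """Encode a non-negative integer as a base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code):
    """Decode a base-62 string back to its integer."""
    number = 0
    for char in code:
        number = number * 62 + ALPHABET.index(char)
    return number


class UrlShortener:
    """In-memory stand-in for the storage layer of a URL shortener."""

    def __init__(self):
        self._urls = {}
        self._next_id = 1

    def shorten(self, url):
        code = encode_base62(self._next_id)
        self._urls[code] = url
        self._next_id += 1
        return code

    def resolve(self, code):
        return self._urls.get(code)  # None when the code is unknown


shortener = UrlShortener()
code = shortener.shorten("https://example.com/a/very/long/path")
print(code, shortener.resolve(code))
```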
13 | Five Major Projects and Git
We follow project-based learning and we will work on all the projects in parallel.

14 | Interview Preparation

15 | Git & GitHub


Git & GitHub Course

● Understanding Git
● Commands and How to commit your first code?
● How to use GitHub?
● How to make your first open-source contribution?
● How to work with a team? — Part 1
● How to create your stunning GitHub profile?
● How to build your own viral repository?
● Building a personal landing page for your Portfolio for FREE
● How to grow followers on GitHub?
● How to work with a team? Part 2 — issues, Milestones, and projects