100 Days of Data Engineering
100 days is just a little over 3 months, and I don't believe 3 months is truly sufficient to "become a data engineer", or at the very least it feels a little fast. There is no need to rush. The real purpose of these 100 days is to get you into the habit of practicing. If afterwards you want to dig into specific subjects, do that! Don't let these 100 days limit you.
Day Task
Day 1 For day one, what I recommend is taking the time to answer some questions and write out your plan to comm
Day 2 1. Downloading SQL Server And Creating A Table 2. Joins 3. Case Statements
Day 3 1. SQL Interview Tips 2. Solving More Problems With SQL
Day 4 1. Partition By 2. CTE (Common Table Expression) 3. Stored Procedures
Day 5 1. Loops, Strings And Tuples 2. Functions 3. Mutability 4. Error Handling
Day 6 Basic Linux Commands 1/3
Day 7 Basic Linux Commands 2/3
Day 8 Basic Linux Commands 3/3
Day 9 1. Data Modeling Basics 2. Normalization Vs Denormalization
Day 10 Read Chapters 1, 2, 3 in Kimball's Data Warehouse Toolkit
Day 11 What is Data Pipeline | How to design Data Pipeline ? - ETL vs Data pipeline (2023)
Day 12 SQL Project Example
Day 13 1. 262. Trips and Users 2. Popularity of Hack 3. Average Salaries 4. 626. Exchange Seats
Day 14 4. Do certain programming languages get more answers on average than others?
Day 15 Continue from yesterday with new questions. Come up with some of your own?
Day 16 AWS Certificate Prep
Day 17 AWS Certificate Prep
Day 18 AWS Certificate Prep
Day 19 AWS Certificate Prep
Day 20 AWS Certificate Prep
Day 21 1. GCP Intro 2. GCP and VPC 3. GCP IAM
Day 22 1. GCP BigQuery 2. GCP Cloud Composer
Day 23 1. Azure Vocab 2. Azure Opex Vs Capex 3. Azure Geographics And Regions 4. Azure Basic Compute Services
Day 24 1. Azure Private Networks And VPCs 2. Azure Storage 3. Azure Big Data Services 4. Azure Serverless Computin
Day 25 1. Data Structures And Algorithms Review Chapters 1-5 2. Introduction to Linked Lists (Data Structures & Algor
Day 26 1. Data Structures And Algorithms Review Chapters 8-11 2. Big O Notation
Day 27 1. Web Scraping 2. Reading CSVs, JSON And APIs
Day 28 Keeping time, scheduling tasks, and launching programs
Day 29 ...hitting an API. But take your time and enjoy some free time just trying things out for yourself
Day 30 1. Learn Database Normalization - 1NF, 2NF, 3NF, 4NF, 5NF 2. Logical Data Model
Day 31 1. Database Denormalization 2. Article TBD (I'll be writing one shortly)
Day 32 Read Chapters 4, 5, 6 in Kimball's Data Warehouse Toolkit
Day 33 Agile Data Warehouse Chapters 1, 2 (and 3 if there is time)
Day 34 1. What Is A Data Pipeline 2. ETLs, Data Pipelines, Etc
Day 35 Basic Data Pipeline Project
Day 36 Live QA And Pipeline Sign Up
Day 37 At this point you may need some time to catch up. If that's the case, then the next three days can be used for th
Day 38 1. Airflow Is Not An ETL Tool 2. Databricks Vs Snowflake 3. Data Engineering Vocab
Day 39 1. Why Is Data Engineering Important 2. MongoDB Is Not For Analytics
Day 40 At the end of day 40, you should take a moment and review what you have learned overall (otherwise you'll forget all of your hard work)
Day 41 Read How To Start Your Next Data Engineering Project
Day 42 1. Pick a data source (you can also find some more here and here) 2. Write out 10-15 questions you'd like to an
Day 43 4. Write up your current progress and note down which code or SQL is actually going to be used
Day 44 3. Create a layer that can be used for the analytics (aggregate tables, views, etc)
Day 45 Continue with any uncompleted tasks from the past few days
Day 46 Run some basic data quality checks to ensure your data is accurate
Day 47 Start to create your dashboard and populate it
Day 48 Finish Dashboard
Day 49 Run some final QA and decide how you'd like to display this project (also general catch-up)
Day 50 Write a blog post or create a GitHub repo to share your project
Day 51 Video To Be Filmed By Seattle Data Guy
Day 52 1. What Is Apache Spark 2. Downloading And Working With Spark 3. Quickstart Spark
Day 53 1. RDD Programming 2. PySpark Tutorial
Day 54 Long PySpark Tutorial
Day 55 1. Docker Intro And Setting Up Airflow 2. Docker In An Hour
Day 56 1. Airflow Intro 2. Airflow Tutorial 2 hour walk through
Day 57 2. Set up a basic DAG that pulls data from one of these data sources (TODO)
Day 58 1. Challenges You Will Face With Airflow 2. Common Mistakes You'll Make Setting Up Airflow
Day 59 2. Find a friend who you can teach some of the concepts you've learned (teaching is a great way to learn)
Day 60 Same as the prior day
Day 61 1. Intro To Databricks 2. Setting Up Databricks 3. Load Data Into Databricks
Day 62 1. Databricks Delta Table 2. Databricks Delta Table Video
Day 63 1. What Is Trino 2. Setting Up Trino
Day 64 Continue setting up Trino and working with it
Day 65 Data Governance Book
Day 66 Data Governance Live - Sign Up
Day 67 1. Creating A Data Governance Framework 2. Data Governance for Modern Organizations, Part 1
Day 68 1. What Is A Data Catalog 2. Data Catalog Case Study 3. Datahub Purpose And Architecture
Day 69 1. 6 Pillars Of Data Quality 2. How And Why We Need To Implement Data Quality Now! 3. Data Quality And Exa
Day 70 1. Data Quality Examples With SQL 2. Data Quality With DBT
Day 71 ...application, etc) 4. Decide on some tools you'd like to use
Day 72-73 ...up your infrastructure, load your data, analyze it, figure out what you'd like to display, etc
Day 74-99 Run the project as you've planned out
Day 100 Write a blog post or create a GitHub repo to share your project
Notes
...taking a cert once you're done with this set of videos and some of the projects
...and then I'll attach a link to the live in the future
Category (in schedule order)
SQL Deeper Dive
Cloud (x9)
Programming
Data pipelines
Progress Review And QA
Catch Up (x3)
Write A Review
Mini Project (x10)
Tool Intro
Spark (x3)
Docker
Airflow (x3)
Catch Up and Review (x2)
Databricks (x2)
Trino/Presto
Data Governance (x3)
Data Catalogs And Lineage
Data Quality (x2)
Project Planning
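
Day 46 of the mini project asks you to run some basic data quality checks, so here is a minimal sketch of what that could look like. It is only an illustration: the `trips` table, its columns, the sample rows and the `run_checks` helper are all made up for this example, and it uses Python's built-in `sqlite3` module instead of whatever database you picked for your project. Each check is just a query that counts the rows breaking a rule.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate the checks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (trip_id INTEGER, rider_id INTEGER, fare REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [(1, 10, 12.5), (2, 11, 8.0), (2, 11, 8.0), (3, None, 15.0), (4, 12, -3.0)],
)

# Each check is a query that counts offending rows; zero means the check passes.
checks = {
    "duplicate trip_id": """
        SELECT COUNT(*) FROM (
            SELECT trip_id FROM trips GROUP BY trip_id HAVING COUNT(*) > 1
        )
    """,
    "missing rider_id": "SELECT COUNT(*) FROM trips WHERE rider_id IS NULL",
    "negative fare": "SELECT COUNT(*) FROM trips WHERE fare < 0",
}


def run_checks(connection, checks):
    """Return a dict mapping each check name to its number of offending rows."""
    return {
        name: connection.execute(sql).fetchone()[0]
        for name, sql in checks.items()
    }


results = run_checks(conn, checks)
for name, bad_rows in results.items():
    status = "OK" if bad_rows == 0 else "FAIL"
    print(f"{status:4} {name}: {bad_rows}")
```

With the sample rows above, all three checks report one problem each. In your own project you would swap in your real tables and the rules that matter for your data (uniqueness, required fields, sensible value ranges).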