Modern Data Stack
Presented by: Zine El Abidine ACHAGHOUR, Hamid Elaaly
Supervised by: Pr. Lotfi Najdi
HISTORY OF MDS
Cloud computing and data warehousing have driven the emergence of the modern data stack, shifting
from ETL to ELT workflows for greater connectivity and flexibility. This transition, rooted in the early
2010s, addresses the need for agile analytics and vendor-agnostic solutions.
Advances in cloud technology made this possible. Key players include the cloud data warehouses
BigQuery, Redshift, and Snowflake, alongside BI tools such as Looker and Tableau. Data ingestion tools
such as Stitch and Fivetran handle integration, while MongoDB, Cassandra, and Elasticsearch store and
search large volumes of data.
THE SHIFT FROM ETL TO ELT
The shift from ETL (Extract, Transform, Load) to ELT (Extract, Load, Transform) represents a
fundamental change in the workflow of data processing.
ETL (Extract, Transform, Load): In the traditional ETL approach, data is first extracted from various
sources, then transformed to fit the target data model or schema, and finally loaded into the data
warehouse or destination system. Transformation typically involves cleansing, aggregating, and
formatting the data to make it suitable for analysis.
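The ETL order of operations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample records, the `transform` and `run_etl` names, and the `customer_totals` table are hypothetical, and an in-memory SQLite database stands in for the data warehouse. The point is that cleansing and aggregation happen in application code before anything is loaded.

```python
import sqlite3

# Hypothetical raw records (id, customer, amount) as a source might deliver them.
RAW_ORDERS = [
    ("1", " alice ", "19.99"),
    ("2", "BOB", "5.00"),
    ("3", " alice ", "10.01"),
]


def transform(raw_rows):
    """Cleanse and aggregate in application code, before loading."""
    totals = {}
    for _order_id, customer, amount in raw_rows:
        name = customer.strip().lower()  # cleansing
        totals[name] = totals.get(name, 0.0) + float(amount)  # aggregating
    return sorted((name, round(total, 2)) for name, total in totals.items())


def run_etl(raw_rows):
    """Extract -> Transform -> Load: only the finished result reaches the warehouse."""
    conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    conn.execute("CREATE TABLE customer_totals (customer TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO customer_totals VALUES (?, ?)", transform(raw_rows)
    )
    rows = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_etl(RAW_ORDERS))
```

Note that the raw records never reach the warehouse in this flow, so any new question about them requires re-running the pipeline.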
ELT (Extract, Load, Transform): The ELT approach has gained prominence with the rise of cloud
computing. In ELT, data is extracted and loaded into the target system without significant
upfront transformation. Transformation then occurs inside the data warehouse or data lake, using the
scalability and processing power of modern cloud platforms, which shortens load times and makes
more efficient use of resources.
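A minimal sketch of the ELT order of operations, written in Python. The sample records, the `run_elt` function, and the `raw_orders` and `customer_totals` tables are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for a cloud warehouse. The raw values are loaded untouched, and the cleansing and aggregation are expressed as SQL that runs inside the warehouse.

```python
import sqlite3

# Hypothetical raw records (id, customer, amount) as a source might deliver them.
RAW_ORDERS = [
    ("1", " alice ", "19.99"),
    ("2", "BOB", "5.00"),
    ("3", " alice ", "10.01"),
]


def run_elt(raw_rows):
    """Extract -> Load -> Transform: raw data lands first, SQL reshapes it later."""
    conn = sqlite3.connect(":memory:")  # stand-in for a cloud warehouse
    # Load: the staging table keeps every value exactly as received (raw text).
    conn.execute("CREATE TABLE raw_orders (id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)
    # Transform: cleansing and aggregation run inside the warehouse.
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT lower(trim(customer)) AS customer,
               round(sum(CAST(amount AS REAL)), 2) AS total
        FROM raw_orders
        GROUP BY lower(trim(customer))
        """
    )
    rows = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    raw_count = conn.execute("SELECT count(*) FROM raw_orders").fetchone()[0]
    conn.close()
    return rows, raw_count


if __name__ == "__main__":
    totals, raw_count = run_elt(RAW_ORDERS)
    print(totals, raw_count)
```

Because the raw table is still present after the transformation, new models can be built on it later without extracting the data again.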
The shift to ELT offers several advantages:
A modern data stack relies on cloud computing, whereas a legacy data stack stores data on
on-premises servers. A modern data stack also gives more data professionals access to the data
than a legacy data stack does.
A legacy data stack usually refers to a traditional relational database management system
(RDBMS), which uses Structured Query Language (SQL) to store and process data.
An RDBMS can still be used in a modern data stack, but it is less common there because it is less
well suited to managing big data. SQL, however, remains a popular query language in both
legacy and modern data stacks.
THE BENEFITS OF MDS
Higher Volumes + Lower Costs
Compute & store higher volumes of data at a
significantly faster rate & reduced cost.
Data Accessibility
Self-service analytics programs increase your
organization's data literacy.
Built to Scale
Pricing & products support scale in data volumes, number
of users, and use cases.
Best-of-Breed Technologies
Specialization drives innovation and modularity gives
your team flexibility.
MDS VISUALIZED
SRC: FIVETRAN
MDS - DATA SOURCES
Databases
Files
Applications (categorized by use case)
Event collectors
MDS - INGESTION
Batch processing:
Streaming Data:
Reverse ETL: While ETL and ELT move data from third-party
sources into a data warehouse, reverse ETL does the opposite. It moves
data from the data warehouse to a third-party system and makes sure that
the data meets the formatting requirements of that platform.
By flipping the usual flow from source systems to a centralized
repository, reverse ETL lets organizations use the insights gained from
centralized analysis to drive actions or updates in operational
systems.
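The reverse ETL flow described above can be sketched as follows. Everything named here is an assumption made for illustration: the `customer_scores` table, the payload fields (`contact_email`, `ltv_cents`, `segment`), the 1000 threshold, and the `send` callback that stands in for a real call to an operational tool's API. An in-memory SQLite database plays the role of the warehouse.

```python
import sqlite3


def build_warehouse():
    """Create a stand-in warehouse holding a hypothetical analytics result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_scores (email TEXT, lifetime_value REAL)")
    conn.executemany(
        "INSERT INTO customer_scores VALUES (?, ?)",
        [("a@example.com", 1250.5), ("b@example.com", 80.0)],
    )
    return conn


def to_crm_payload(row):
    """Reshape a warehouse row into the format the (hypothetical) CRM expects."""
    email, lifetime_value = row
    return {
        "contact_email": email,
        "ltv_cents": int(round(lifetime_value * 100)),
        "segment": "high_value" if lifetime_value >= 1000 else "standard",
    }


def reverse_etl(conn, send):
    """Read from the warehouse, format each row, and hand it to the destination."""
    cursor = conn.execute(
        "SELECT email, lifetime_value FROM customer_scores ORDER BY email"
    )
    synced = 0
    for row in cursor:
        send(to_crm_payload(row))  # in practice: a call to the destination's API
        synced += 1
    return synced


if __name__ == "__main__":
    outbox = []  # collects payloads instead of calling a real system
    warehouse = build_warehouse()
    print(reverse_etl(warehouse, outbox.append), outbox)
    warehouse.close()
```

Passing the destination in as a callback keeps the formatting step separate from delivery, so the same query result could be sent to a different system by swapping in another formatter and sender.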
Open Source Strategy: Many modern data companies start with an open-source product and
then add a cloud offering, a hybrid approach that helps build user engagement. Venture
capital investors increasingly prefer startups with open-source strategies.
More SQL in Data Engineering: SQL is crucial in data management: it is simple, widely
understood, and based on common standards. It is the backbone of the data stack and
will likely support predictive analytics in the future.
CONCLUSION
The modern data stack is a powerful set of tools that can help companies make better
data-driven decisions. If you're not already using one, now is the time to start putting
together a modern data stack that works for you.
If you're still using a legacy data stack, consider adopting a modern one. It is not
merely a rising trend; it offers multiple benefits.
In the future, we can expect to see even more innovation in the modern data stack, which will
help companies better scale, manage, and analyze their data.
SOME RESOURCES
https://www.moderndatastack.xyz
https://www.thoughtspot.com/data-trends/best-practices/modern-data-stack