
Azure-Data-Engineer-Learning-Pathway

The Azure Data Engineer Learning Pathway provides a comprehensive guide for data professionals to implement and manage data engineering workloads on Microsoft Azure. It covers key topics such as data storage, processing, security, and monitoring using various Azure services, along with role-based certification preparation for the DP-203 exam. The pathway is designed for those new to Azure or data solutions, offering both foundational and advanced learning resources.

Uploaded by

Vikram sharma

January 2024

Azure Data Engineer Learning Pathway (1/2) www.aka.ms/pathways

Getting started

Learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others.

Explore common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.

Audience Profile:

The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure.

• New to the Cloud or Azure? Start with Azure Fundamentals
• New to data solutions on Azure? Build your knowledge with Data Fundamentals

Get started with data engineering on Azure
• Introduction to data engineering on Azure
• Introduction to Azure Data Lake Storage Gen2
• Introduction to Azure Synapse Analytics

Microsoft Learn/Documentation

Design and implement data storage
• Understand Azure Data Lake Storage Gen2
• Access tiers for Azure Blob Storage
• Storage considerations when using Azure Synapse serverless SQL pools
• Query a Parquet file using Azure Synapse serverless SQL pools
• Dynamic file pruning
• Understand table distribution design
• Partitioning tables in dedicated SQL pool
• Best practices for dedicated SQL pools in Azure Synapse Analytics
• Star Schema
• Multidimensional Schemas and Data Analytics
• Manage retention of historical data in system-versioned temporal tables
• Getting started with temporal tables
• Create and configure a self-hosted integration runtime
• Manage self-hosted integration runtime
• Choosing an analytical data store in Azure
• Synapse Analytics shared metadata tables
• When do you use Apache Spark pools?
• Intro to data classification and protection
• Data Compression
• Exercise - Use table distribution and indexes to improve performance
• Change how a storage account is replicated
• Slowly Changing Dimension Transformation
• Populate slowly changing dimensions
• Create external tables in Azure Synapse serverless SQL pools
• Views in Synapse serverless SQL pools
• Tutorial: Load data to Azure Synapse Analytics SQL pool
• Create, develop, and maintain Synapse notebooks in Azure Synapse Analytics

Design and develop data processing
• Common practices for data loading
• Tutorial: Extract, transform, and load data by using Azure Databricks
• Understand the Streaming Analytics Workflow
• Handling bad records and files
• Prepare and transform data with Azure Synapse
• Analyse complex data types in Azure Synapse Analytics
• Understand data store models
• Prepare and transform data
• Define a modern data warehouse architecture
• Choosing a batch processing technology
• Manage source data files
• Copy activity in Azure Data Factory
• MERGE (Transact-SQL)
• Continuous integration and delivery for Azure Synapse workspace
• Handle SQL truncation error rows in Data Factory mapping data flows

Azure Data Architecture Guide

Role based Certification

DP-203: Azure Data Engineer

Skills Measured
• Design and implement data storage
• Develop data processing
• Secure, monitor, and optimize data storage and data processing

Self Study:
• Get started with data engineering on Azure
• Build data analytics solutions using Azure Synapse serverless SQL pools
• Perform data engineering with Azure Synapse Apache Spark Pools
• Work with Data Warehouses using Azure Synapse Analytics
• Transfer and transform data with Azure Synapse Analytics pipelines
• Work with Hybrid Transactional and Analytical Processing Solutions using Azure Synapse Analytics
• Implement a Data Streaming Solution with Azure Stream Analytics
• Govern data across an enterprise
• Data engineering with Azure Databricks

Course Page, Exam Page, Exam Study Guide, 30 Day Challenge, Practice Assessment, Video on Demand (soon)

Additional Study

Design and develop data processing
• Backup and restore in Azure Synapse Dedicated SQL pool
• Implement workload management
• Use extended Apache Spark history server to debug and diagnose Apache Spark applications
• Enterprise Data Warehouse Architecture
• Stream processing with Azure Databricks
• Azure Synapse Analytics
• Monitoring for performance efficiency
• Work with windowing functions
• Schema drift
• Time handling in Stream Analytics
• Checkpoint and replay concepts in Azure Stream Analytics jobs
• Scale an Azure Stream Analytics job to increase throughput
• Use repartitioning to optimize processing
• Azure Stream Analytics output error policy
• Stream Analytics output to Cosmos DB
• Stream processing with Stream Analytics
• Data Loading best practices
• Get Started with Synapse Analytics
• Monitor your Synapse Workspace

Design and implement data security
• Implement encryption
• Data ingestion security considerations
• Configure authentication
• Access control lists (ACLs) in Azure Data Lake Storage Gen2
• Synapse access control
• Column-level security
• Manage authorization through column and row level security
• Manage user permissions
• Auditing for Azure SQL Database and Azure Synapse Analytics
• Retention Policy on storage accounts
• Understand network security options
• Dynamic Data Masking
• Secure a dedicated SQL pool

Monitor and optimize data storage and data processing
• Auto Optimize in Azure Databricks
• Modify user-defined functions
• Designing distributed tables
• Data spillage scenario - Search and purge
• Quickstart: Create an Azure Synapse workspace using an ARM template
• Indexing dedicated SQL pool tables
• Performance tuning with result set caching
• Optimize Apache Spark jobs
• Troubleshoot library installation errors
• Debug data factory pipelines
• Monitor and Alert Data Factory by using Azure Monitor
• Exercise – Implement workload management
• Monitor your Azure Synapse Analytics dedicated SQL pool workload using DMVs
• Collect custom logs with Log Analytics agent
• Use Synapse Studio to monitor your workspace pipeline runs
• Deploying Apache Airflow in Azure to build and run data pipelines

Microsoft Applied Skills

Targeted validation for real-world scenarios. Demonstrate proficiency in specific, scenario-based skill sets so you can make a bigger impact on every project, at your organization, and in your career.

Explore Applied Skills

30 days to Learn it Challenge

30 Days to Learn It can help you build skills and start your preparation for Microsoft Certifications for AI, DevOps, Microsoft 365, low code, IoT, data science, cloud development, and more. Select your challenge below, work through learning modules, and exchange ideas with peers through a global community forum.

Explore the challenges
