Azure Data Engineer Road Map
• What is ETL and why ETL
• OLTP vs OLAP
• Data warehouse concepts
• Cloud computing concepts
• Services in Azure
• Storage Accounts
• Blob storage
• ADLS
• Blob vs ADLS
• How to create services in Azure portal
• What is Azure Data Factory
• Components of Azure Data Factory
• Integration Runtimes/Linked services/Datasets
• Activities
• Dataflows
• Azure Key Vault
• SQL concepts
• DDL/DML/DCL
• Joins
• Views
• Stored Procedures
• What is Spark
• Spark architecture
• RDD vs DataFrame vs Dataset
• Lazy evaluation
• Transformations/Actions
• Narrow transformations vs wide transformations
• Functions in Spark
• Optimizations in Spark
• What is Delta Lake
• Delta Lake features
• PySpark real-time scenarios
• Python basics
• Data types
• List/Tuple/Set/Dictionary
• Control statements
• What is Databricks
• Types of clusters
• Cluster modes
• Working with DataFrames
• Cache vs Persist
• Broadcast
• Accumulators
• What is MPP
• Synapse Analytics
• Dedicated pool
• Serverless pool
• Dedicated vs serverless pool
• What is distribution
• Types of distributions
• Round-robin distribution
• Hash distribution
• Replicated distribution
• Agile methodology
• Project architecture
• Email alerts using Logic Apps
• Microsoft Fabric introduction
• Components in Fabric
• Workspace creation
• Use cases on Fabric