Data Analyst Azure Power BI Syllabus
2 Introduction to Azure
5 Data exploration and transformation in Azure Databricks
6 Explore, transform, and load data into the Data Warehouse using Apache Spark
13 Other
6.2 Introduction to Apache Spark
6.3 Apache Spark in Azure Synapse Analytics
Knowledge Check
Lesson-end Project: Course Data Management
7.1 Loading data in Azure Synapse Analytics
7.2 Loading data with PolyBase and COPY using T-SQL
7.3 Overview of Azure Data Factory
Knowledge Check
Lesson-end Project: User Information Analysis
8.1 Data Integration with Azure Data Factory or Azure Synapse Pipelines
8.2 Code-less Transformation at Scale with Azure Data Factory or Azure Synapse Pipelines
8.3 Overview of Data Flow Mapping
8.4 Overview of Data Orchestration
Knowledge Check
Lesson-end Project: Course Data Mapping
9.1 Overview of Azure Key Vault Service
9.2 Introduction to Encryption in Azure Data Factory
9.3 Overview of Customer-Managed Keys
9.4 Secure a data warehouse in Azure Synapse Analytics
9.6 Introduction to Data Masking
9.7 Overview of Auditing, Data Discovery and Classification
9.8 Overview of Azure AD Authentication
9.9 Row-level and column-level security
9.10 Role-based access control list
Knowledge Check
11.2 Introduction to Batch Processing
11.3 Introduction to real-time processing
11.4 Sending and receiving events
11.5 Enabling reliable messaging using Azure Event Hubs
11.6 Working with data streams using Azure Stream Analytics
Data Ingestion End-to-End Pipeline
Sales Data Visualization Using Azure Synapse Analytics
Sub Topics Name
Business Scenario
Azure Storage Account
Types of Storage Accounts
Need for Storage Account
General-Purpose Storage Accounts
Blob Storage
Queue
Table Storage
Azure Files
Blob Storage Account
Types of Access Tiers
Hot Access Tier
Cold Access Tier
Azure Storage Replication
Types of Azure Storage Replication
Locally Redundant Storage (LRS)
Zone-Redundant Storage (ZRS)
Geo-Redundant Storage (GRS)
Read-Access Geo-Redundant Storage (RA-GRS)
Object Replication for Block Blob Storage
Assisted Practice: Creating an Azure Account
Business Scenario
Azure SQL Database
Characteristics of Azure SQL Database
Types of Deployment Models
Purchasing Models
vCore-based Purchasing Model
DTU-based Purchasing Model
Assisted Practice: Creating an Azure SQL Database
Business Scenario
Azure Data Lake Storage
Azure Data Lake Storage: Features
Limitless Storage
Large Analytic Workloads
High Availability and Reliability
Security
Big Data Processing
Cost Efficient
Azure Storage: Redundancy
Azure Storage: Lifecycle Management
Azure Data Lake Storage Gen2
Azure Data Lake Storage Gen2: Features
Azure Data Lake Storage Gen2: Best Practices
Business Scenario
Azure Synapse Analytics
Azure Synapse Analytics: Benefits
Azure Synapse Analytics: Features
Working of Azure Synapse Analytics
Azure Synapse Workspace
Assisted Practice: Creating an Azure Synapse Workspace
Business Scenario
Azure Databricks
Benefits of Azure Databricks
SQL Pool
Types of SQL Pools
Dedicated SQL Pool
Benefits of a Dedicated SQL Pool
Serverless SQL Pool
Serverless SQL Pool: Features
Uses of Serverless SQL Pool
Best Practices of Serverless SQL Pool
Assisted Practice: Creating an SQL Pool
Business Scenario
Files Used in Azure
Querying a CSV File
Querying a Parquet File
Querying a JSON File
Forms of Authorization
Managing User Permissions in Azure Synapse
User Permissions in Azure Synapse
Types of Access Control Lists (ACLs)
Types of Permissions in Azure Synapse
Business Scenario
Databricks
Features of Databricks
Databricks in Data Science and Engineering
Business Scenario
Reading Data in CSV Format
Reading Data in JSON Format
Reading Data in Parquet Format
Writing Data
Business Scenario
DataFrame
DataFrame in Azure Databricks
Advanced DataFrame Methods in Azure Databricks
Databricks File System
Advantages of Databricks File System
Databricks File System Permissions
Assisted Practice: Using DataFrames in Azure Databricks
Assisted Practice: Caching a DataFrame
Assisted Practice: Remove duplicate data
Assisted Practice: Manipulate date and time values
Business Scenario
Azure Synapse Notebook
Benefits of Azure Synapse Notebook
What is Data Exploration?
Assisted Practice: Data Exploration in Synapse Studio
Business Scenario
Apache Spark
Features of Apache Spark
Use Cases of Apache Spark
Working with Apache Spark Notebooks
Business Scenario
Spark Pools
Spark Instances
Apache Spark Pool Auto-Scaling
Apache Spark Pool in Azure Synapse Analytics
Business Scenario
Azure Synapse Analytics
Loading data in Azure Synapse Analytics
Assisted Practice: Best Practices for loading data into Azure Synapse Analytics
What is PolyBase?
Loading Data with PolyBase
Loading Data with Copy Statement
Assisted Practice: Perform Loading of Data with PolyBase and COPY using T-SQL
Business Scenario
Components of Azure Data Factory
Benefits of Azure Data Factory
Business Scenario
What Is Petabyte-Scale Ingestion?
Ingesting Data Using Copy Activity
Ingesting Data using the Compute Resources
Ingesting Data using the SSIS Package
Assisted Practice: Perform petabyte-scale ingestion with Azure Synapse Pipelines
Business Scenario
What is Data Integration?
Data Integration Patterns
Data Integration Runtime
Types of Integration Runtime
Transforming Data Using Mapping Data Flow
Transforming Data Using Compute Resources
Transforming Data Using SSIS Package
Types of Azure Data Factory Transformation
Business Scenario
Azure Key Vault
Benefits of Azure Key Vault
Azure Key Vault Roles
Firewall Rules
Virtual Networks
Benefits of a Virtual Network
Private Endpoints
Assisted Practice: Secure Azure Synapse Analytics supporting infrastructure
Assisted Practice: Secure the Azure Synapse Analytics workspace and managed services
Business Scenario
What is Transparent Data Encryption?
Service-Managed Transparent Data Encryption
Dynamic Data Masking
Dynamic Data Masking Policy
Auditing
Auditing for SQL Server
Limitation of Auditing
Data Discovery and Classification
Purpose of Data Discovery and Classification
Data Discovery and Classification Capabilities
What is Authentication?
Types of Security
Business Scenario
Event Hub
Partitions and Consumers
Throughput Units
Business Scenario
Batch Processing
Batch Processing Steps
Real-Time Processing
Benefits of Real Time Processing
Publishing Events
Sending Events with Python
Receiving Events with Python
Business Scenario
Enabling Reliable Messaging Using Azure Event Hubs
Azure Functions: Event Consumption
Business Scenario
Azure Stream Analytics
Creating Stream Analytics Job
Business Scenario
What is Structured Streaming?
What is Unstructured Streaming?
Features of structured streaming
Benefits of structured streaming
Benefits of unstructured streaming
Assisted Practice: Streaming data from a file and writing it out to a distributed file system
Log Pipeline Executions to SQL Table using Azure Data Factory.
Incremental Load or Delta load from SQL to Blob Storage and vice versa in Azure Data Factory.
Copy Data from on-premises to Azure SQL DW with PolyBase and with Bulk Insert.
Migrate AWS S3 Buckets to ADLS Gen2 using ADF (v2): copy data from AWS to Azure.
Flatten Transformation and Rank, Dense_Rank Transformations in Data Flows.
Slowly Changing Dimension Type 1 (SCD1) with Surrogate Key Transformation.
Slowly Changing Dimension Type 2 (SCD2).
How to use Pivot and Unpivot Transformations
Difference Between Join vs. Lookup Transformation & Merge Functionality
How do you ensure data quality and consistency in a large-scale data pipeline?
How do you handle missing or corrupted data in a data pipeline?
How do you handle data security and privacy concerns in a data pipeline?
Databricks Variables, Widget Types, Databricks notebook parameters
Azure Databricks Databases and Tables
Read TSV Files and Pipe-Separated CSV Files
Read Parquet files from Data Lake Storage Gen2
Reading and Writing XML Files
Reading and Writing JSON Files
Reading and Writing ORC and Avro Files
Databricks Integration with Blob Storage and Azure Data Lake Gen2.
Databricks Integration with Azure SQL Database.
Azure Data Factory Integration with Azure Databricks.
Implementing SCD Type 1 and Apache Spark Databricks Delta
Delta Live Tables (at least an overview and the different scenarios where Delta Live Tables can be used)
Characteristics of Delta Live Tables
Data security features offered by Azure Synapse
How to optimize the performance of ADF
Error handling using log files
How to implement security in Azure SQL
Explain Delta Live Tables. Subscription required for Delta Live Tables. Why didn't you use Delta Live Tables for the project? Limitations of Delta Live Tables.
One pipeline end to end from development to production
How will you enhance the performance of the code? E.g., when accessing a link to connect to a storage account, how will you improve the performance of that link?