Introduction to ADF - LwTN
Manuel Quintana
Agenda
Azure Subscription
Must have an existing Azure Subscription
Azure Roles
Member of the Contributor or Owner role, or
An administrator of the Azure subscription
Resource Groups
Data Factory
The name must be globally unique.
Subscription
Resource Group
Version (V1 vs V2)
Location
Version Control
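Because the factory name must be globally unique, it can help to check the name's format locally before trying to create the resource. The sketch below assumes the commonly documented ADF naming rules (3 to 63 characters; letters, digits, and hyphens only; starts and ends with a letter or digit; no consecutive hyphens). The function name is hypothetical, and only Azure itself can confirm whether a name is already taken.

```python
import re

# Assumed rules: letters, digits, and single hyphens between alphanumeric runs.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9]+(-[A-Za-z0-9]+)*")


def looks_like_valid_factory_name(name: str) -> bool:
    """Local format check only; it cannot tell whether the name is available."""
    return 3 <= len(name) <= 63 and _NAME_PATTERN.fullmatch(name) is not None


# Hypothetical example names.
print(looks_like_valid_factory_name("lwtn-adf-demo-01"))
print(looks_like_valid_factory_name("-bad--name-"))
```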
Create Resources
Demo
Data Factory Navigation
Let’s get started – Home Hub
Actions
Ingest (Copy Data Activity Wizard)
Other Areas
Discover More
Recent Resources
Feature showcase
Resources
Author Hub
Design Area
Pipelines
Datasets
Data flows
Power Query
Monitoring Hub
Monitoring Options
Dashboards
Pipeline Runs
Trigger Runs
Integration Runtimes
Data flow debug
Manage Hub
Admin Options
Connections
Source Control
Author
Security
Data Factory Resources
Integration Runtimes
Linked Services
Datasets
Folders
Used to group pipeline resources together
Used to group dataset resources together
Used to group data flows together
Create Linked Service
Demo
Copy Activity Wizard
Task cadence or schedule
Run once now
Run regularly on a schedule (creates a trigger)
Source Data Store
Choose existing data set
Create new data set
Destination data store
Choose existing data set
Create new data set
Settings
Data Integration Unit
Degree of copy parallelism
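The two settings above end up as properties on the Copy activity that the wizard generates. The sketch below builds a rough approximation of that definition as a Python dictionary. The activity and dataset names are hypothetical, the source and sink types are only one possible pairing, and the values 4 and 2 are arbitrary; the wizard's real output contains more properties than shown here.

```python
import json

# Hypothetical names: "CopyProducts", "SourceDataset", "SinkDataset".
copy_activity = {
    "name": "CopyProducts",
    "type": "Copy",
    "inputs": [{"referenceName": "SourceDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "SinkDataset", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "DelimitedTextSource"},
        "sink": {"type": "AzureSqlSink"},
        # "Data Integration Unit" setting (arbitrary example value).
        "dataIntegrationUnits": 4,
        # "Degree of copy parallelism" setting (arbitrary example value).
        "parallelCopies": 2,
    },
}

print(json.dumps(copy_activity, indent=2))
```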
Copy Activity Wizard
Demo
Pipeline Basics
Demo
Get Metadata Activity
Purpose
Retrieve metadata information of data
Metadata options
Item Name
Item Type
Size
Created
Last Modified
Child Items
Content MD5
Structure
Column Count
Exists
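The metadata options listed above are requested through the activity's field list. The sketch below shows roughly what such an activity definition could look like, built as a Python dictionary, together with the expression a later activity might use to read one of the outputs. The activity name and dataset name are hypothetical, and only a few of the options are requested.

```python
import json

# Hypothetical names: "GetFolderMetadata", "SourceFolder".
get_metadata_activity = {
    "name": "GetFolderMetadata",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {"referenceName": "SourceFolder", "type": "DatasetReference"},
        # Each entry corresponds to one metadata option from the list above.
        "fieldList": ["itemName", "itemType", "lastModified", "childItems", "exists"],
    },
}

# Expression a downstream activity could use to read the Child Items output.
child_items_expression = (
    "@activity('" + get_metadata_activity["name"] + "').output.childItems"
)

print(json.dumps(get_metadata_activity, indent=2))
print(child_items_expression)
```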
Output Parameters
Outputs can be used in other activities
Supports
Azure SQL Database
Synapse Analytics (Azure SQL DW)
SQL Server Database
Limitations
No output parameters to ADF
Stored Procedure Activity
Demo
Lookup Activity
Pipeline Design
Design Pattern
Lookup Activity
Purpose
Retrieve a dataset
Supports
Any Azure Data Factory data source
Executing Stored Procedures
Executing SQL Scripts
Output parameters
Outputs
Single Value
Array / Object
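The difference between the two output shapes can be sketched in plain Python. This is a simulation, not ADF code: it assumes the activity's "first row only" option decides the shape, with a single row under `firstRow` when it is on and a `count` plus `value` array when it is off. The function name and sample rows are hypothetical, and the empty-result behaviour is an assumption.

```python
from typing import Any, Dict, List


def simulate_lookup_output(rows: List[Dict[str, Any]], first_row_only: bool) -> Dict[str, Any]:
    """Mimic the shape of a Lookup activity's output for a given result set."""
    if first_row_only:
        # Single value / single row: read as output.firstRow.<column>.
        # Returning an empty dict for no rows is an assumption.
        return {"firstRow": rows[0]} if rows else {}
    # Array of objects: read as output.value, e.g. to drive a loop.
    return {"count": len(rows), "value": rows}


# Hypothetical result set.
sample_rows = [{"TableName": "Product"}, {"TableName": "ProductModel"}]

print(simulate_lookup_output(sample_rows, first_row_only=True))
print(simulate_lookup_output(sample_rows, first_row_only=False))
```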
Lookup Activity
Demo
If Condition Activity
Pipeline Design
If Condition Activity
Purpose
If statement functionality
Boolean expression (True/False)
Supports
ADF Expressions and Functions
If True Activities
If False Activities
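As a rough illustration of how these pieces fit together, the sketch below builds an If Condition activity definition as a Python dictionary. The activity names, the upstream "LookupRowCount" activity, and its "RowCount" column are all hypothetical, and Wait activities stand in for whatever real work each branch would do.

```python
import json

# Hypothetical upstream Lookup activity "LookupRowCount" returning a RowCount column.
if_condition_activity = {
    "name": "IfRowsFound",
    "type": "IfCondition",
    "typeProperties": {
        # Boolean expression built from ADF expression functions.
        "expression": {
            "value": "@greater(activity('LookupRowCount').output.firstRow.RowCount, 0)",
            "type": "Expression",
        },
        # Runs when the expression evaluates to true.
        "ifTrueActivities": [
            {"name": "WaitOnTrue", "type": "Wait", "typeProperties": {"waitTimeInSeconds": 1}}
        ],
        # Runs when the expression evaluates to false.
        "ifFalseActivities": [
            {"name": "WaitOnFalse", "type": "Wait", "typeProperties": {"waitTimeInSeconds": 1}}
        ],
    },
}

print(json.dumps(if_condition_activity, indent=2))
```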
If Condition Activity
Demo
Data Flows
Overview
What are Data Flows
Purpose
Allows for data transformations
Items
Source
Transformations
Sink
How to Execute
Debug
Data Flow Activity
File Format
Column-oriented data storage format (vs. row-oriented)
Benefits
Storage
Performance
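The row-oriented versus column-oriented distinction can be shown with a small in-memory example. This is only a conceptual sketch with made-up sample data; it does not read or write any real file format.

```python
# Row-oriented: each record keeps all of its fields together.
rows = [
    {"ProductID": 1, "Name": "Helmet", "ListPrice": 35.0},
    {"ProductID": 2, "Name": "Sticker", "ListPrice": 0.0},
    {"ProductID": 3, "Name": "Road Frame", "ListPrice": 120.0},
]

# Column-oriented: all values of one field are stored together.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An aggregate over one column only has to touch that column's values,
# which is where the performance benefit comes from. Values of the same
# type sitting side by side also tend to compress better (storage benefit).
total_list_price = sum(columns["ListPrice"])

print(columns)
print(total_list_price)
```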
Source
Available Options
Azure SQL Data Warehouse
Azure SQL Database
Cosmos DB
Azure Blob
ADLS Gen1/2
Synapse Analytics
Items
Minimum of 1 Source
Transformations
Available Options
New Branch
Join
Conditional Split
Derived Column
Lookup
Select
Sort
Filter
Etc…
Expressions
Debug
Lets you see a live, in-progress preview of the data results from the expression you are building
Sink
Available Options
Azure SQL Data Warehouse
Azure SQL Database
Cosmos DB
Azure Blob
ADLS Gen1/2
Synapse Analytics
Items
Minimum of 1 Sink
Setup
Business Scenario
• My business has requested a file that lists all of the products our company sells. (Source)
• They also want the model description of each product, which comes from a different table. (Lookup & Select)
• The shipping weight needs to be included; it is calculated by padding the actual weight by 10% to account for packing. (Derived Column)
• We do not need products that have a list price of $0.00. (Filter)
• Finally, we need to order the data in the file by list price, descending. (Sort & Sink)
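Before building the data flow, the logic of the scenario above can be sketched in plain Python, one step per transformation. The sample rows, column names, and function name are hypothetical; the real data flow would read from the source tables and write to a file sink.

```python
# Hypothetical sample data standing in for the two source tables.
products = [
    {"Name": "Road Frame", "ProductModelID": 10, "Weight": 2.0, "ListPrice": 1400.0},
    {"Name": "Sticker", "ProductModelID": 20, "Weight": 0.1, "ListPrice": 0.0},
    {"Name": "Helmet", "ProductModelID": 20, "Weight": 1.0, "ListPrice": 35.0},
]
model_descriptions = {10: "Lightweight frame", 20: "General accessories"}


def build_product_file_rows(products, model_descriptions):
    # Lookup & Select: attach the model description and keep only needed columns.
    rows = [
        {
            "Name": p["Name"],
            "ModelDescription": model_descriptions.get(p["ProductModelID"]),
            "Weight": p["Weight"],
            "ListPrice": p["ListPrice"],
        }
        for p in products
    ]
    # Derived Column: pad the actual weight by 10% for packing.
    for row in rows:
        row["ShippingWeight"] = round(row["Weight"] * 1.10, 2)
    # Filter: drop products with a list price of $0.00.
    rows = [row for row in rows if row["ListPrice"] != 0]
    # Sort: order by list price, descending, ready for the sink.
    return sorted(rows, key=lambda row: row["ListPrice"], reverse=True)


result = build_product_file_rows(products, model_descriptions)
for row in result:
    print(row)
```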
Data Flow Overview
Demo
Scheduling a Pipeline
Triggers
Schedule trigger
Invokes pipeline on a wall-clock schedule
Event-based trigger
Responds to an event
Schedule Trigger
Schedule Recurrence:
Every Minute
Hourly
Daily
Weekly
Monthly
Pipeline Assignment
Multiple pipelines to single trigger
Assignment performed from pipeline
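A schedule trigger's recurrence and pipeline assignment can be pictured as a small definition like the one sketched below. The helper function, trigger name, pipeline names, and start time are hypothetical, and the definition is trimmed to the parts discussed above; a trigger created in the ADF UI carries additional properties.

```python
import json

# Recurrence frequencies matching the options listed above.
VALID_FREQUENCIES = {"Minute", "Hour", "Day", "Week", "Month"}


def build_schedule_trigger(name, frequency, interval, start_time, pipeline_names):
    """Build a simplified schedule trigger definition as a dictionary."""
    if frequency not in VALID_FREQUENCIES:
        raise ValueError("Unsupported frequency: " + frequency)
    return {
        "name": name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": frequency,
                    "interval": interval,
                    "startTime": start_time,
                    "timeZone": "UTC",
                }
            },
            # One trigger can start several pipelines.
            "pipelines": [
                {
                    "pipelineReference": {
                        "type": "PipelineReference",
                        "referenceName": pipeline_name,
                    },
                    "parameters": {},
                }
                for pipeline_name in pipeline_names
            ],
        },
    }


# Hypothetical daily trigger attached to two pipelines.
daily_trigger = build_schedule_trigger(
    "DailyTrigger", "Day", 1, "2024-01-01T06:00:00Z", ["LoadProducts", "LoadSales"]
)
print(json.dumps(daily_trigger, indent=2))
```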
Schedule Triggers
Demo
Other ADF Features
Triggers
Power Query
Can leverage the Power Query Editor online to transform data in a pipeline
Flowlet
Stores reusable data flow transformation logic