ADF- Part-3

Uploaded by

Rajesh Tarra
Copyright
© All Rights Reserved
PART-3

LEARN AZURE DATA FACTORY (ADF)

C. R. Anil Kumar Reddy


www.linkedin.com/in/chenchuanil
Before we dive into Control Flow and Transformational Activities, let
us discuss an important activity: the Copy Activity.

Copy Activity
The Copy Activity in Azure Data Factory (ADF) is one of the core
activities used for data movement. Its primary function is to copy data
from a source to a destination, supporting a variety of on-premises,
cloud-based, and SaaS data sources.

Purpose of Copy Activity

Data Movement: It moves data from a source to a destination without
making significant changes. It can handle structured, semi-structured,
and unstructured data.

ETL/ELT Processes: It serves as the Extract and Load phases in ETL
(Extract, Transform, Load) or ELT (Extract, Load, Transform)
pipelines, where it moves raw data to a data lake or a data
warehouse for further transformation or processing.
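
As a rough illustration, a Copy Activity is described in pipeline JSON by a source and a sink. The Python sketch below builds such a definition as a dictionary. The activity and dataset names (CopyRawOrders, OrdersCsv, OrdersSqlTable) are hypothetical, and the source and sink types assume a delimited-text file being loaded into Azure SQL.

```python
import json


def copy_activity(name, source_dataset, sink_dataset):
    """Build a Copy activity definition; all names passed in are placeholders."""
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            # Assumed connector types: CSV file in, Azure SQL table out.
            "source": {"type": "DelimitedTextSource"},
            "sink": {"type": "AzureSqlSink"},
        },
    }


copy_orders = copy_activity("CopyRawOrders", "OrdersCsv", "OrdersSqlTable")
print(json.dumps(copy_orders, indent=2))
```

Other stores would use their own source and sink types; only the overall shape is meant to carry over.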

Use Cases of Copy Activity

ETL/ELT Pipelines: Copy Activity moves raw data from operational databases
to a data warehouse or lake for further processing.

Cloud Migration: It helps move large volumes of data from on-premises
systems to cloud-based storage or databases.

Data Synchronization: Copy Activity can synchronize data between multiple
systems (e.g., databases and data lakes).

Summary
The Copy Activity in Azure Data Factory is a versatile tool for moving data from
various sources to destinations. It handles data transfers efficiently, supports
multiple data formats, and offers extensive control over performance, fault
tolerance, and monitoring. It plays a crucial role in the data ingestion and
preparation phases of data processing pipelines in ADF.

Here is a detailed explanation of each control flow and
transformation activity in Azure Data Factory:

Control Flow Activities


1. Execute Pipeline Activity
Purpose: This activity allows you to invoke another pipeline within your
primary pipeline. It helps with modularizing and reusing pipelines, improving
manageability.

Use Case: When you need to break down complex data processes into
smaller, reusable pipelines, or when managing dependencies across
pipelines.

Configuration: Specify the pipeline to execute, pass any necessary
parameters, and configure settings like wait-for-completion or timeout
values.
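
The configuration above could look roughly like this, written as a Python dictionary that mirrors the pipeline JSON. The child pipeline name (LoadSales), the activity name, and the runDate parameter are made up for the example.

```python
import json

# Hypothetical names: a parent pipeline calling a child pipeline "LoadSales".
execute_pipeline = {
    "name": "RunLoadSales",
    "type": "ExecutePipeline",
    "typeProperties": {
        "pipeline": {"referenceName": "LoadSales", "type": "PipelineReference"},
        # Forward the parent's runDate parameter to the child pipeline.
        "parameters": {
            "runDate": {
                "value": "@pipeline().parameters.runDate",
                "type": "Expression",
            }
        },
        # True: the parent waits until the child run finishes.
        "waitOnCompletion": True,
    },
}

print(json.dumps(execute_pipeline, indent=2))
```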

2. Lookup Activity

Purpose: Used to retrieve data from a specified dataset. The result can
be a single row or a list of rows, and it’s commonly used to get
configuration values or check if data exists.

Use Case: When you need to fetch metadata, configuration details, or
conditionally trigger activities based on the retrieved data.

Configuration: Specify the dataset and the source query or table. By
default the activity returns only the first row ("First row only");
clear that option when you need the full list of rows.
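
To make the single-row versus list-of-rows distinction concrete, here is a small sketch. The helper functions, dataset name, and queries are hypothetical, and an Azure SQL source is assumed. It builds a Lookup definition and the expression a later activity would use to read its output.

```python
def lookup_activity(name, dataset, query, first_row_only=True):
    """Build a Lookup activity definition (Azure SQL source assumed)."""
    return {
        "name": name,
        "type": "Lookup",
        "typeProperties": {
            "source": {"type": "AzureSqlSource", "sqlReaderQuery": query},
            "dataset": {"referenceName": dataset, "type": "DatasetReference"},
            "firstRowOnly": first_row_only,
        },
    }


def output_expression(activity, column=None):
    """Return the expression that reads this Lookup's result downstream."""
    name = activity["name"]
    if activity["typeProperties"]["firstRowOnly"]:
        # Single row: columns hang off output.firstRow.
        return f"@activity('{name}').output.firstRow.{column}"
    # Multiple rows: the array of rows is in output.value.
    return f"@activity('{name}').output.value"


watermark = lookup_activity(
    "GetWatermark", "ControlTable",
    "SELECT MAX(ModifiedAt) AS WatermarkValue FROM dbo.Orders",
)
table_list = lookup_activity(
    "GetTableList", "ControlTable",
    "SELECT TableName FROM dbo.TablesToLoad", first_row_only=False,
)

print(output_expression(watermark, "WatermarkValue"))
print(output_expression(table_list))
```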

3. Filter Activity

Purpose: Filters a list of items based on a specified condition.

Use Case: When you want to limit the data processed to only
those items that meet a certain condition.

Configuration: Provide the input array (Items) and define the filter
condition using expressions.
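
A minimal sketch of a Filter definition follows, assuming an earlier Get Metadata activity named ListFiles (hypothetical) that lists the files in a folder. The list comprehension at the end is only a local stand-in for what the condition keeps; it is not how ADF evaluates it.

```python
import json

filter_activity = {
    "name": "KeepCsvFiles",
    "type": "Filter",
    "typeProperties": {
        # The array to filter: here, the file list from a prior activity.
        "items": {
            "value": "@activity('ListFiles').output.childItems",
            "type": "Expression",
        },
        # item() refers to the current element of the array.
        "condition": {
            "value": "@endswith(item().name, '.csv')",
            "type": "Expression",
        },
    },
}
print(json.dumps(filter_activity, indent=2))

# Local illustration with made-up items: same rule, plain Python.
sample_items = [
    {"name": "orders.csv", "type": "File"},
    {"name": "orders.json", "type": "File"},
]
kept = [item for item in sample_items if item["name"].endswith(".csv")]
print(kept)
```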

4. Iteration Activities (ForEach)

Purpose: To iterate over a collection of items and execute activities
for each item.

Use Case: Processing a list of files, rows, or any iterable dataset one
by one.

Configuration: Provide the list or array of items to iterate and define
activities to execute for each iteration.
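
One way to picture this: the loop takes an items expression and a list of inner activities. In the sketch below, the helper, the activity names, and the child pipeline ProcessOneFile are all hypothetical. The items are assumed to come from a Filter activity called KeepCsvFiles.

```python
import json


def for_each(name, items_expression, inner_activities, sequential=False, batch_count=10):
    """Build a ForEach activity definition around the given inner activities."""
    type_properties = {
        "items": {"value": items_expression, "type": "Expression"},
        "isSequential": sequential,
        "activities": inner_activities,
    }
    if not sequential:
        # batchCount caps how many iterations run in parallel.
        type_properties["batchCount"] = batch_count
    return {"name": name, "type": "ForEach", "typeProperties": type_properties}


# Inner activity: call a child pipeline once per file, passing the file name.
process_file = {
    "name": "ProcessFile",
    "type": "ExecutePipeline",
    "typeProperties": {
        "pipeline": {"referenceName": "ProcessOneFile", "type": "PipelineReference"},
        "parameters": {"fileName": {"value": "@item().name", "type": "Expression"}},
        "waitOnCompletion": True,
    },
}

file_loop = for_each(
    "ForEachCsvFile", "@activity('KeepCsvFiles').output.value", [process_file]
)
print(json.dumps(file_loop, indent=2))
```

Setting sequential=True in this sketch drops batchCount, so items would be processed strictly one by one.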

5. Iteration Activities (Until)

Purpose: Executes activities in a loop until a specific condition is
met.

Use Case: Scenarios where processing needs to continue until a dataset
reaches a target value.

Configuration: Define the exit condition and activities that will be
executed repeatedly.
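
A polling loop is a common shape for Until. The sketch below is only an assumption-laden example: the status URL is a placeholder, and the endpoint is assumed to return JSON with a "status" field. The loop checks the status, pauses, and repeats until the status is Done or the timeout is reached.

```python
import json

until_activity = {
    "name": "WaitForJob",
    "type": "Until",
    "typeProperties": {
        # Exit condition, evaluated after each pass through the loop body.
        "expression": {
            "value": "@equals(activity('CheckJob').output.status, 'Done')",
            "type": "Expression",
        },
        # Give up after one hour (format d.hh:mm:ss).
        "timeout": "0.01:00:00",
        "activities": [
            {
                "name": "CheckJob",
                "type": "WebActivity",
                "typeProperties": {
                    "method": "GET",
                    "url": "https://example.com/jobs/42/status",  # placeholder
                },
            },
            {
                "name": "PauseBeforeRetry",
                "type": "Wait",
                "dependsOn": [
                    {"activity": "CheckJob", "dependencyConditions": ["Succeeded"]}
                ],
                "typeProperties": {"waitTimeInSeconds": 30},
            },
        ],
    },
}

print(json.dumps(until_activity, indent=2))
```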

6. Get Metadata Activity

Purpose: Retrieves metadata (like file size, last modified date, or
column count) from a dataset.

Use Case: Checking the properties of files, tables, or other data
sources before further processing.

Configuration: Specify the dataset and the list of metadata fields you
want to retrieve (like file name, size, or schema).
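
For example, the field list might be set up as below. The helper function, activity name, and dataset name are hypothetical. The chosen fields assume a file-based dataset.

```python
import json


def get_metadata(name, dataset, fields):
    """Build a Get Metadata activity definition for the given field list."""
    return {
        "name": name,
        "type": "GetMetadata",
        "typeProperties": {
            "dataset": {"referenceName": dataset, "type": "DatasetReference"},
            "fieldList": list(fields),
        },
    }


file_info = get_metadata("GetFileInfo", "OrdersCsv", ["exists", "size", "lastModified"])

# Expression a later activity could use to read one of the returned fields.
size_expression = f"@activity('{file_info['name']}').output.size"

print(json.dumps(file_info, indent=2))
print(size_expression)
```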

7. Validation Activity

Purpose: Validates whether the dataset exists or meets certain
conditions (e.g., non-empty files).

Use Case: To ensure a dataset is available and valid before proceeding
with further pipeline activities.

Configuration: Select the dataset and configure the validation type,
such as whether the file exists or has a minimum size (for a folder,
whether it contains child items).
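
Sketched as a definition, a Validation activity that waits for a non-empty file could look like this. The names are placeholders. Note that minimumSize is a size in bytes.

```python
import json

validation = {
    "name": "WaitForOrdersFile",
    "type": "Validation",
    "typeProperties": {
        "dataset": {"referenceName": "OrdersCsv", "type": "DatasetReference"},
        # Keep checking for up to two hours (format d.hh:mm:ss).
        "timeout": "0.02:00:00",
        # Seconds to wait between checks.
        "sleep": 60,
        # Treat the file as valid only once it is at least 1 byte.
        "minimumSize": 1,
    },
}

print(json.dumps(validation, indent=2))
```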

8. Conditional Activities (If Condition)

Purpose: Executes activities based on a true/false expression.

Use Case: When you want to branch logic depending on whether a
condition is met.

Configuration: Define an expression to evaluate. If true, certain
activities are executed; if false, alternative activities are
triggered.
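
The true and false branches map onto two activity lists. In this sketch the pipeline names, the helper, and the GetFileInfo activity referenced in the expression are all made up for illustration.

```python
import json


def run_pipeline(name, pipeline):
    """Small helper: an Execute Pipeline activity used as a branch body."""
    return {
        "name": name,
        "type": "ExecutePipeline",
        "typeProperties": {
            "pipeline": {"referenceName": pipeline, "type": "PipelineReference"},
            "waitOnCompletion": True,
        },
    }


if_condition = {
    "name": "IfFileHasData",
    "type": "IfCondition",
    "typeProperties": {
        # Must evaluate to true or false.
        "expression": {
            "value": "@greater(activity('GetFileInfo').output.size, 0)",
            "type": "Expression",
        },
        "ifTrueActivities": [run_pipeline("RunLoad", "LoadSales")],
        "ifFalseActivities": [run_pipeline("RunAlert", "NotifyEmptyFile")],
    },
}

print(json.dumps(if_condition, indent=2))
```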

9. Conditional Activities (Switch)

Purpose: Executes different sets of activities based on the value of an
expression.

Use Case: When branching into more than two paths based on values like
status codes.

Configuration: Define the expression and configure cases for each
possible value.
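
Here is a hedged sketch of a Switch with two cases and a default. The statusCode parameter and the pipeline names are hypothetical. The expression is wrapped in string() because case values are compared as strings.

```python
import json


def run_pipeline(name, pipeline):
    """Small helper: an Execute Pipeline activity used as a case body."""
    return {
        "name": name,
        "type": "ExecutePipeline",
        "typeProperties": {
            "pipeline": {"referenceName": pipeline, "type": "PipelineReference"},
            "waitOnCompletion": True,
        },
    }


switch_activity = {
    "name": "RouteByStatus",
    "type": "Switch",
    "typeProperties": {
        "on": {
            "value": "@string(pipeline().parameters.statusCode)",
            "type": "Expression",
        },
        "cases": [
            {"value": "200", "activities": [run_pipeline("OnSuccess", "LoadSales")]},
            {"value": "404", "activities": [run_pipeline("OnMissing", "NotifyMissingData")]},
        ],
        # Runs when no case value matches.
        "defaultActivities": [run_pipeline("OnOther", "LogUnexpectedStatus")],
    },
}

print(json.dumps(switch_activity, indent=2))
```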

10. Web Activity

Purpose: Invokes an external web service.

Use Case: When you need to interact with REST APIs to send or retrieve
data.

Configuration: Specify the endpoint, headers, body, and authentication
(if needed).
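
The four settings above map fairly directly onto the activity definition. In the sketch below the helper, the URL, and the body are placeholders. The optional authentication block assumes the factory's managed identity is being used.

```python
import json


def web_activity(name, method, url, body=None, authentication=None):
    """Build a Web activity definition; body and authentication are optional."""
    type_properties = {
        "method": method,
        "url": url,
        "headers": {"Content-Type": "application/json"},
    }
    if body is not None:
        type_properties["body"] = body
    if authentication is not None:
        type_properties["authentication"] = authentication
    return {"name": name, "type": "WebActivity", "typeProperties": type_properties}


notify = web_activity(
    "NotifyTeam",
    "POST",
    "https://example.com/api/notify",  # placeholder endpoint
    body={"message": "load finished"},
    # Assumption: the endpoint accepts the factory's managed identity.
    authentication={"type": "MSI", "resource": "https://management.azure.com/"},
)

print(json.dumps(notify, indent=2))
```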

11. WebHook Activity

Purpose: Similar to the Web activity but supports long-running tasks by
calling an endpoint and then waiting for that endpoint to call back.

Use Case: Ideal for asynchronous web service calls.

Configuration: Provide the endpoint URL, headers, body, and a timeout.
ADF adds a callback URI (callBackUri) to the request body, and the
service must call that URI to signal completion.
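
To show both sides of the exchange, the sketch below defines a WebHook activity and a tiny helper for the receiving service. The endpoint, job name, and sample callback address are invented. The one fixed detail is that the service reads callBackUri from the request body it receives and later posts to it.

```python
import json

webhook_activity = {
    "name": "StartLongJob",
    "type": "WebHook",
    "typeProperties": {
        "method": "POST",
        "url": "https://example.com/api/start-job",  # placeholder endpoint
        "headers": {"Content-Type": "application/json"},
        "body": {"jobName": "nightly-load"},
        # How long the pipeline waits for the callback (hh:mm:ss).
        "timeout": "00:30:00",
    },
}


def callback_uri(received_body: str) -> str:
    """Receiver side: pull the callback address out of the request body."""
    return json.loads(received_body)["callBackUri"]


# Made-up example of what the service might receive.
sample_body = json.dumps(
    {"jobName": "nightly-load", "callBackUri": "https://example.com/callback/abc"}
)

print(json.dumps(webhook_activity, indent=2))
print(callback_uri(sample_body))
```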

Transformational Activities

Purpose: These are transformations within mapping data flows that
reshape data as it moves through the pipeline.

Common Transformations: Includes transformations such as joins,
aggregations, filters, and lookups within a data flow.

Use Case: When you need to cleanse, aggregate, or reshape data before
loading it into the final destination.

Configuration: Set up transformations using the data flow UI to perform
complex data operations.
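
The transformations themselves are designed in the data flow UI, but a pipeline still needs an activity to run the data flow. A minimal sketch of that activity is below; the data flow name SalesCleanup and the compute settings are only example values.

```python
import json

data_flow_activity = {
    "name": "CleanAndAggregateSales",
    "type": "ExecuteDataFlow",
    "typeProperties": {
        # Reference to a data flow built in the UI (hypothetical name).
        "dataflow": {"referenceName": "SalesCleanup", "type": "DataFlowReference"},
        # Size of the Spark cluster that runs the transformations.
        "compute": {"coreCount": 8, "computeType": "General"},
    },
}

print(json.dumps(data_flow_activity, indent=2))
```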

Script and Stored Procedure Activities

Script Activity:

Purpose: Executes SQL scripts against a database.

Use Case: When you need to run raw SQL commands, such as schema updates
or bulk data manipulations.

Configuration: Define the script text inline and specify the linked
service for the target database.
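
As an illustration, a Script activity can carry more than one statement, each marked as a query (returns rows) or a non-query. The linked service name and table below are hypothetical.

```python
import json

script_activity = {
    "name": "PrepareStaging",
    "type": "Script",
    "linkedServiceName": {
        "referenceName": "SalesSqlDb",  # hypothetical linked service
        "type": "LinkedServiceReference",
    },
    "typeProperties": {
        "scripts": [
            # NonQuery: statements that change data or schema.
            {"type": "NonQuery", "text": "TRUNCATE TABLE stg.Orders;"},
            # Query: statements that return a result set.
            {"type": "Query", "text": "SELECT COUNT(*) AS row_count FROM stg.Orders;"},
        ]
    },
}

print(json.dumps(script_activity, indent=2))
```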

Stored Procedure Activity:

Purpose: Executes a stored procedure in a relational database.

Use Case: When database logic is encapsulated in stored procedures.

Configuration: Specify the stored procedure name and any necessary
parameters.
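
A short sketch of the name-plus-parameters setup follows. The helper, the procedure dbo.usp_LogLoad, its parameters, and the linked service name are all invented for the example.

```python
import json


def stored_procedure(name, linked_service, procedure, parameters):
    """Build a Stored Procedure activity; parameters maps name -> (value, type)."""
    return {
        "name": name,
        "type": "SqlServerStoredProcedure",
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "storedProcedureName": procedure,
            "storedProcedureParameters": {
                key: {"value": value, "type": param_type}
                for key, (value, param_type) in parameters.items()
            },
        },
    }


log_load = stored_procedure(
    "LogLoad",
    "SalesSqlDb",
    "dbo.usp_LogLoad",
    {"tableName": ("stg.Orders", "String"), "rowCount": ("1200", "Int32")},
)

print(json.dumps(log_load, indent=2))
```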

ANIL REDDY CHENCHU

Torture the data, and it will confess to anything

DATA ANALYTICS

SHARE IF YOU LIKE THE POST

Let's connect to discuss more on data

www.linkedin.com/in/chenchuanil
