Azure Batch Notes

Azure Batch allows users to define compute resources to run applications in parallel at scale without manually configuring infrastructure. It creates and manages a pool of virtual machines, installs the desired applications, and schedules jobs to run across the nodes. Key steps include uploading files and applications, creating a pool and job, downloading inputs to nodes, monitoring task execution, uploading outputs, and downloading results. Billing is based on the underlying resources like VMs and storage.

Uploaded by

sharq
Copyright
© © All Rights Reserved

Azure Batch

Define the Azure compute resources to execute your applications in parallel or at scale without
manually configuring or managing infrastructure

Schedule compute-intensive tasks and dynamically add or remove compute resources based on
your requirements

Azure Batch:

• Creates and manages a pool of compute nodes (virtual machines)


• Installs the applications you want to run
• Schedules jobs to run on the nodes

Parallel workloads

Intrinsically parallel workloads

• Batch works well with intrinsically parallel workloads


• Intrinsically parallel workloads run independently, and each instance completes part of the
work
• While the applications are executing, they do not communicate with other instances of the
application
• They may access common data or other resources
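
The idea of an intrinsically parallel workload can be sketched locally in plain Python. This is only an illustration of the pattern, not of the Batch service: the function names `process_chunk`, `split`, and `run_in_parallel` are hypothetical, and a thread pool stands in for the pool of compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Handle one slice of the input with no knowledge of the other slices."""
    return sum(value * value for value in chunk)


def split(items, parts):
    """Divide items into `parts` contiguous slices of near-equal size."""
    size, extra = divmod(len(items), parts)
    slices, start = [], 0
    for index in range(parts):
        end = start + size + (1 if index < extra else 0)
        slices.append(items[start:end])
        start = end
    return slices


def run_in_parallel(items, workers=4):
    """Run process_chunk on every slice concurrently and combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker only reads its own slice; workers never talk to each other.
        partials = list(pool.map(process_chunk, split(items, workers)))
    return sum(partials)
```

Each call to `process_chunk` completes part of the work on its own, which is the property that lets a scheduler place the instances on any available node.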

Tightly coupled workloads

• Workloads where the applications need to communicate with each other


• Tightly coupled applications normally use the Message Passing Interface (MPI) API
• You can run tightly coupled workloads with Batch using Microsoft MPI or Intel MPI

Azure Batch steps

1. Upload input files and the applications to process those files to an Azure Storage account

• The application files can include scripts or applications that process the data
• The input files can be any data that the application processes, such as financial modeling data,
or video files to be transcoded

2. Create a Batch pool of compute nodes in a Batch account, a job to run the workload on the pool,
and tasks in the job

Batch pool of compute nodes - Pool nodes are the VMs that execute your tasks through a job.
Specify properties such as the number and size of the nodes, a Windows or Linux VM image,
and an application to install when the nodes join the pool. Manage the cost and size of the pool
by using low-priority VMs or automatically scaling the number of nodes as the workload
changes.

Jobs - When you add tasks to a job, the Batch service automatically schedules the tasks for
execution on the compute nodes in the pool.

Tasks - Each task uses the application that you uploaded to process the input files.
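
The relationship between a pool, a job, and its tasks can be sketched as a small data model. This is a minimal illustration and not the Batch API: the classes `Pool`, `Job`, and `Task`, their fields, the VM size string, and the command line are all hypothetical, and the round-robin `schedule` method is only a stand-in for whatever policy the Batch service actually applies.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    pool_id: str
    vm_size: str      # placeholder size name, not a real VM size
    node_count: int


@dataclass
class Task:
    task_id: str
    command_line: str  # what the uploaded application would be asked to run


@dataclass
class Job:
    job_id: str
    pool: Pool
    tasks: list = field(default_factory=list)

    def add_task(self, task):
        self.tasks.append(task)

    def schedule(self):
        """Assign each task to a node index round-robin (assumed policy, for illustration)."""
        return {task.task_id: index % self.pool.node_count
                for index, task in enumerate(self.tasks)}


pool = Pool("demo-pool", "example-size", node_count=2)
job = Job("demo-job", pool)
for n in range(3):
    job.add_task(Task(f"task-{n}", f"python process.py input-{n}.dat"))

assignments = job.schedule()
print(assignments)  # {'task-0': 0, 'task-1': 1, 'task-2': 0}
```

The point of the sketch is the ownership chain: tasks belong to a job, the job is bound to one pool, and the caller never chooses a node for a task.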

3. Download input files and the applications to Batch


• Before each task executes, it can download the input data that it is to process to the assigned
compute node. If the application isn't already installed on the pool nodes, it can be downloaded
here instead. When the downloads from Azure Storage complete, the task executes on the
assigned node.

4. Monitor task execution

• As the tasks run, query Batch to monitor the progress of the job and its tasks
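
Monitoring usually comes down to a polling loop. The sketch below assumes a hypothetical callable `get_states` that returns a mapping of task ID to state string; in a real client it would wrap a query to the Batch service. The state name `"completed"`, the timeout, and the poll interval are assumptions chosen for illustration.

```python
import time


def wait_for_tasks(get_states, timeout_s=60.0, poll_interval_s=1.0, sleep=time.sleep):
    """Poll get_states() until every task reports 'completed' or the timeout passes.

    get_states is a hypothetical callable returning {task_id: state}.
    Returns the last state mapping seen; raises TimeoutError on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        states = get_states()
        if all(state == "completed" for state in states.values()):
            return states
        if time.monotonic() >= deadline:
            pending = sorted(t for t, s in states.items() if s != "completed")
            raise TimeoutError(f"tasks still running: {pending}")
        sleep(poll_interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to test, because a test can supply a no-op instead of actually waiting.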

5. Upload task output

• As the tasks complete, they can upload their result data to Azure Storage. Files can also be
retrieved directly from the file system on a compute node.

6. Download output files

• When monitoring detects that the tasks in the job have completed, the client application or
service can download the output data for further processing.

Billing:

• There is no additional charge for using Batch


• Pay for the underlying resources consumed, such as the virtual machines, storage, and
networking
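
The billing rule can be written as simple arithmetic. The function `estimate_cost` and all of its parameters are hypothetical, the rates are placeholders supplied by the caller rather than real Azure prices, and networking charges are left out to keep the sketch short.

```python
def estimate_cost(node_count, hours, vm_rate_per_hour,
                  storage_gb=0.0, storage_rate_per_gb=0.0):
    """Sum the underlying resource charges; the Batch service itself adds nothing."""
    batch_service_fee = 0.0  # per the notes: no additional charge for using Batch
    compute = node_count * hours * vm_rate_per_hour
    storage = storage_gb * storage_rate_per_gb
    return batch_service_fee + compute + storage
```

For example, with placeholder rates of 0.5 per node-hour and 0.02 per GB, ten nodes running for two hours with 100 GB stored would come to roughly 12 in whatever currency the rates are expressed in.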
