Unit-5 CC

The document discusses resource management and scheduling in cloud computing. It covers policies for admission control, capacity allocation, load balancing, energy optimization, and quality of service guarantees. Control theory and feedback control using dynamic thresholds are approaches for implementing these policies. The stability of a two-level resource allocation architecture is also discussed, with one level of controllers for the service provider and another for applications.


CLOUD COMPUTING-CS72

Syllabus: Unit-4

Cloud Resource Management and Scheduling: Policies and mechanisms for resource
management, Applications of control theory to task scheduling on a cloud, Stability of a two-
level resource allocation architecture, Feedback control based on dynamic thresholds,
Coordination of specialized autonomic performance managers, A utility-based model for cloud-
based web services, Resource bundling, combinatorial auctions for cloud resources, Scheduling
algorithms for computing clouds, fair queuing, Start time fair queuing, Cloud scheduling subject
to deadlines.

Policies and mechanisms for resource management

A policy typically refers to the principles guiding decisions, whereas mechanisms represent the means to implement policies. Separation of policies from mechanisms is a guiding principle in computer science.

Cloud resource management policies can be loosely grouped into five classes:

1. Admission control.
2. Capacity allocation.
3. Load balancing.
4. Energy optimization.
5. Quality-of-service (QoS) guarantees.

The explicit goal of an admission control policy is to prevent the system from accepting
workloads in violation of high-level system policies; for example, a system may not accept an
additional workload that would prevent it from completing work already in progress or
contracted.

Limiting the workload requires some knowledge of the global state of the system; in a dynamic system such knowledge, when available, is at best obsolete. Capacity allocation means allocating resources to individual instances, where an instance is an activation of a service. Locating resources subject to multiple global optimization constraints requires searching a very large space when the state of individual systems changes rapidly.

Load balancing and energy optimization can be done locally, but global load-balancing and energy-optimization policies encounter the same difficulties as the ones we have already discussed.

Load balancing and energy optimization are correlated and affect the cost of providing the
services. Indeed, it was predicted that by 2012 up to 40% of the budget for IT enterprise
infrastructure would be spent on energy.

Dr. Nandini N, Dr. AIT,Bengaluru Page 1



The common meaning of the term load balancing is that of evenly distributing the load across a set of servers. For example, consider four identical servers, A, B, C, and D, whose relative loads are 80%, 60%, 40%, and 20% of their capacity, respectively. With perfect load balancing, all servers would end up with the same load: 50% of each server's capacity.

In cloud computing a critical goal is minimizing the cost of providing the service and, in particular, minimizing the energy consumption. This leads to a different meaning of the term load balancing: instead of having the load evenly distributed among all servers, we want to concentrate it and use the smallest number of servers while switching the others to standby mode, a state in which a server uses less energy.

In our example, the load from D will migrate to A and the load from C will migrate to B; thus, A and B will be loaded at full capacity, whereas C and D will be switched to standby mode.
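The consolidation in this example can be sketched as a greedy packing routine (an illustrative sketch, not a production placement algorithm; the `consolidate` helper and the capacity model are assumptions made for this example):

```python
# Energy-aware load balancing: instead of spreading the load evenly,
# concentrate it on as few servers as possible and put the rest on
# standby.  Server names and loads follow the A/B/C/D example above;
# the greedy packing below is an illustrative sketch, not a production
# placement algorithm.

def consolidate(loads, capacity=100):
    """Pack per-server loads (percent of capacity) onto the fewest
    servers; return (active allocation, servers moved to standby)."""
    active = {}        # server -> consolidated load
    standby = []       # servers whose load migrated elsewhere
    # Consider the most heavily loaded servers first.
    for server, load in sorted(loads.items(), key=lambda kv: -kv[1]):
        for target in active:
            if active[target] + load <= capacity:
                active[target] += load     # migrate load to 'target'
                standby.append(server)
                break
        else:
            active[server] = load          # no room: keep server active
    return active, standby

active, standby = consolidate({"A": 80, "B": 60, "C": 40, "D": 20})
print(active, standby)   # {'A': 100, 'B': 100} ['C', 'D']
```

With the example loads, C's 40% fits on B and D's 20% fits on A, leaving A and B at full capacity and C and D on standby, exactly as described above.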

Quality of service is that aspect of resource management that is probably the most difficult to
address and, at the same time, possibly the most critical to the future of cloud computing.

Allocation techniques in computer clouds must be based on a disciplined approach rather than ad hoc methods.

The four basic mechanisms for the implementation of resource management policies are:

1. Control theory: Control theory uses feedback to guarantee system stability and to predict transient behavior, but it can be used to predict only local rather than global behavior.


2. Machine learning: A major advantage of machine learning techniques is that they do not
need a performance model of the system. This technique could be applied to coordination
of several autonomic system managers.
3. Utility-based: Utility-based approaches require a performance model and a mechanism to
correlate user-level performance with cost.
4. Market-oriented/economic mechanisms: Such mechanisms do not require a model of the system, e.g., combinatorial auctions for bundles of resources.

A distinction should be made between interactive and noninteractive workloads. The management techniques for interactive workloads, e.g., Web services, involve flow control and dynamic application placement, whereas those for noninteractive workloads are focused on scheduling.

Applications of control theory to task scheduling on a cloud :

Control theory has been used to design adaptive resource management for many classes of applications, including power management, task scheduling, QoS adaptation in Web servers, and load balancing.

The classical feedback control methods are used in all these cases to regulate the key operating
parameters of the system based on measurement of the system output; the feedback control in
these methods assumes a linear time-invariant system model and a closed-loop controller.

This controller is based on an open-loop system transfer function that satisfies stability and
sensitivity constraints.

The technique allows multiple QoS objectives and operating constraints to be expressed as a cost
function and can be applied to stand-alone or distributed Web servers, database servers, high-
performance application servers, and even mobile/embedded systems.


Stability of a two-level resource allocation architecture

A server can be viewed as a closed-loop control system, so control theory principles can be applied to resource allocation. A resource allocation architecture based on control theory concepts can be designed for the entire cloud. The automatic resource management is based on two levels of controllers, one for the service provider and one for the application; see Figure 6.2.


The main components of a control system are the inputs, the control system components, and the outputs. The inputs in such models are the offered workload and the policies for admission control, capacity allocation, load balancing, energy optimization, and QoS guarantees in the cloud.

The system components are sensors, used to estimate relevant measures of performance, and controllers, which implement various policies; the output is the resource allocations to the individual applications.

The controllers use the feedback provided by sensors to stabilize the system; stability is related
to the change of the output. If the change is too large, the system may become unstable.

There are three main sources of instability in any control system:

1. The delay in getting the system reaction after a control action.


2. The granularity of the control, the fact that a small change enacted by the controllers
leads to very large changes of the output.
3. Oscillations, which occur when the changes of the input are too large and the control is
too weak, such that the changes of the input propagate directly to the output.
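The granularity and oscillation issues can be illustrated with a minimal discrete-time proportional feedback loop (a sketch with made-up numbers; `run_loop` is a hypothetical helper, not part of any cloud controller). With gain g, the error evolves as e_{k+1} = (1 - g)·e_k, so |1 - g| < 1 is needed for stability; a gain that is too large makes each correction overshoot, and the output oscillates with growing amplitude instead of settling.

```python
# Minimal discrete-time proportional feedback loop (illustrative sketch).
# The controller nudges the current value toward a target utilization;
# g = 0.5 converges, while g = 2.5 overshoots and diverges, illustrating
# the oscillation instability described in the text.

def run_loop(gain, target=50.0, start=0.0, steps=20):
    value = start
    history = [value]
    for _ in range(steps):
        error = target - value
        value += gain * error        # control action proportional to error
        history.append(value)
    return history

stable = run_loop(gain=0.5)
unstable = run_loop(gain=2.5)
print(round(stable[-1], 3))                 # settles near the target: 50.0
print(abs(unstable[-1] - 50.0) > 1000.0)    # the error keeps growing: True
```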

Two types of policies are used in autonomic systems:

(i) threshold-based policies


(ii) sequential decision policies


Feedback control based on dynamic thresholds :

The elements involved in a control system are sensors, monitors, and actuators. The sensors measure the parameter(s) of interest and transmit the measured values to a monitor, which determines whether the system behavior must be changed and, if so, requests that the actuators carry out the necessary actions.

The implementation of such a policy is challenging.

• First, due to the very large number of servers and to the fact that the load changes rapidly
in time, the estimation of the current system load is likely to be inaccurate.
• Second, the ratio of average to maximal resource requirements of individual users
specified in a service-level agreement is typically very high.

Thresholds: A threshold is the value of a parameter related to the state of a system that triggers a
change in the system behavior. Thresholds are used in control theory to keep critical parameters
of a system in a predefined range.

The threshold could be static, defined once and for all, or it could be dynamic. A dynamic threshold could be based on an average of measurements carried out over a time interval, a so-called integral control.

The dynamic threshold could also be a function of the values of multiple parameters at a given
time or a mix of the two. To maintain the system parameters in a given range, a high and a low
threshold are often defined.

The two thresholds determine different actions; for example, a high threshold could force the
system to limit its activities and a low threshold could encourage additional activities.
Proportional thresholding: Several questions arise in the design of such a mechanism:

(1) Is it beneficial to have two types of controllers, (i) application controllers that determine whether additional resources are needed and (ii) cloud controllers that arbitrate requests for resources and allocate the physical resources?
(2) Is it feasible to consider fine control?
(3) Are dynamic thresholds based on time averages better than static ones?
(4) Is it better to have a high and a low threshold, or is it sufficient to define only a high threshold?


The essence of proportional thresholding is captured by the following algorithm:

1. Compute the integral values of the high and the low thresholds as averages of the maximum and, respectively, the minimum of the processor utilization over the process history.
2. Request additional VMs when the average value of the CPU utilization over the current time slice exceeds the high threshold.
3. Release a VM when the average value of the CPU utilization over the current time slice falls below the low threshold.

The conclusions reached based on experiments with three VMs are: (a) dynamic thresholds perform better than static ones and (b) two thresholds are better than one.
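The three steps above can be sketched as follows (an illustrative sketch; the class name and the utilization samples are assumptions, and a real controller would read utilization from the hypervisor rather than take it as an argument):

```python
# Proportional thresholding sketch following the three steps above:
# the high/low thresholds are running averages of the per-slice maxima
# and minima of CPU utilization, and the decision compares the current
# slice's average utilization against them.  All values are made up.

class ProportionalThresholding:
    def __init__(self):
        self.max_seen = []   # per-slice maxima of CPU utilization
        self.min_seen = []   # per-slice minima of CPU utilization

    def thresholds(self):
        """Step 1: integral (averaged) high and low thresholds."""
        high = sum(self.max_seen) / len(self.max_seen)
        low = sum(self.min_seen) / len(self.min_seen)
        return high, low

    def decide(self, slice_samples):
        """Steps 2 and 3: request a VM above the high threshold,
        release one below the low threshold."""
        self.max_seen.append(max(slice_samples))
        self.min_seen.append(min(slice_samples))
        high, low = self.thresholds()
        avg = sum(slice_samples) / len(slice_samples)
        if avg > high:
            return "request VM"
        if avg < low:
            return "release VM"
        return "no action"

controller = ProportionalThresholding()
print(controller.decide([0.5, 0.7]))    # first slice: no action
print(controller.decide([0.9, 0.95]))   # spike above the average: request VM
print(controller.decide([0.1, 0.2]))    # drop below the average: release VM
```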

Coordination of specialized autonomic performance managers :

Can specialized autonomic performance managers cooperate to optimize power consumption? The question was investigated through actual experiments carried out on a set of blades mounted on a chassis (see Figure 6.3 for the experimental setup).


Virtually all modern processors support dynamic voltage scaling (DVS) as a mechanism for
energy saving. Indeed, the energy dissipation scales quadratically with the supply voltage.

Power management controls the CPU frequency and, thus, the rate of instruction execution. For
some compute-intensive workloads the performance decreases linearly with the CPU clock
frequency, whereas for others the effect of lower clock frequency is less noticeable or
nonexistent. The clock frequency of individual blades/servers is controlled by a power manager,
typically implemented in the firmware; it adjusts the clock frequency several times a second.
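The quadratic dependence on the supply voltage can be made concrete with a toy calculation (the dynamic-power relation P ≈ C·V²·f is the standard CMOS approximation; the capacitance, voltage, and frequency values below are made up for illustration):

```python
# Dynamic power of a CMOS processor scales as P ~ C * V^2 * f.
# Since lowering the clock frequency also permits a lower supply
# voltage, DVS saves energy superlinearly.  Values are illustrative.

def dynamic_power(capacitance, voltage, frequency):
    return capacitance * voltage ** 2 * frequency

full = dynamic_power(capacitance=1.0, voltage=1.2, frequency=2.0e9)
# Halving the frequency and scaling the voltage proportionally
# (1.2 V -> 0.6 V) cuts power to one eighth, not one half:
scaled = dynamic_power(capacitance=1.0, voltage=0.6, frequency=1.0e9)
print(scaled / full)   # 0.125
```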

The approach to coordinating power and performance management is based on several ideas.

A utility-based model for cloud-based Web services :

A utility function relates the “benefits” of an activity or service with the “cost” to provide the
service. For example, the benefit could be revenue and the cost could be the power consumption.

A service-level agreement (SLA) often specifies the rewards as well as the penalties associated
with specific performance metrics. Sometimes the quality of services translates into average
response time; this is the case of cloud-based Web services when the SLA often explicitly
specifies this requirement.
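A minimal utility computation can be sketched as follows (the SLA target, reward, penalty, and power cost are all made-up assumptions; a real SLA would specify these figures):

```python
# Toy utility model: benefit (revenue) minus cost (power), with an
# SLA that rewards meeting a target average response time and
# penalizes missing it.  All numbers are illustrative assumptions.

def utility(avg_response_ms, servers_on,
            sla_target_ms=200.0, reward=100.0, penalty=150.0,
            power_cost_per_server=10.0):
    benefit = reward if avg_response_ms <= sla_target_ms else -penalty
    cost = power_cost_per_server * servers_on
    return benefit - cost

print(utility(avg_response_ms=150, servers_on=4))   # 100 - 40 = 60.0
print(utility(avg_response_ms=250, servers_on=2))   # -150 - 20 = -170.0
```

The second call shows why such a model drives resource allocation: missing the SLA target turns the reward into a penalty, so the provider may add servers as long as their power cost is smaller than the avoided penalty.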


Resource bundling: Combinatorial auctions for cloud resources:

Resources in a cloud are allocated in bundles, allowing users to get maximum benefit from a specific combination of resources. Indeed, along with CPU cycles, an application needs specific amounts of main memory, disk space, network bandwidth, and so on.

Resource bundling complicates traditional resource allocation models and has generated interest
in economic models and, in particular, auction algorithms.

Combinatorial auctions: Auctions in which participants can bid on combinations of items, or packages, are called combinatorial auctions. Such auctions provide a relatively simple and scalable solution to cloud resource allocation.


Two recent combinatorial auction algorithms are the simultaneous clock auction and the clock proxy auction.

Pricing and allocation algorithms: A pricing and allocation algorithm partitions the set of users into two disjoint sets, winners and losers, denoted as W and L, respectively.

The algorithm should:

1. Be computationally tractable. Traditional combinatorial auction algorithms such as Vickrey-Clarke-Groves (VCG) fail this criterion because they are not computationally tractable.
2. Scale well. Given the scale of the system and the number of requests for service,
scalability is a necessary condition.
3. Be objective. Partitioning into winners and losers should be based only on the price p_u of a user's bid. If the price exceeds the threshold, the user is a winner; otherwise the user is a loser.
4. Be fair. Make sure that the prices are uniform . All winners within a given resource pool
pay the same price.
5. Indicate clearly at the end of the auction the unit prices for each resource pool.
6. Indicate clearly to all participants the relationship between the supply and the demand in
the system.
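The objectivity and uniform-price requirements (items 3 and 4) can be sketched in a few lines (an illustrative sketch; the user names, bids, and clearing price are made up, and a real auction would also have to compute the clearing price itself):

```python
# Partition users into winners W and losers L based solely on price,
# as the objectivity rule requires; all winners in a resource pool
# pay the same (uniform) clearing price.  Values are illustrative.

def partition(bids, clearing_price):
    """bids: {user: offered_price}.  Returns (winners, losers)."""
    winners = {u for u, p in bids.items() if p >= clearing_price}
    losers = set(bids) - winners
    return winners, losers

bids = {"u1": 12.0, "u2": 7.5, "u3": 10.0}
winners, losers = partition(bids, clearing_price=10.0)
print(sorted(winners))   # ['u1', 'u3'] -- both pay 10.0, the uniform price
print(sorted(losers))    # ['u2']
```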

The constraints in Table 6.4 correspond to our intuition:

(a) The first one states that a user either gets one of the bundles it has opted for or nothing; no
partial allocation is acceptable.
(b) The second constraint expresses the fact that the system awards only available resources;
only offered resources can be allocated.
(c) The third constraint is that the bid of the winners exceeds the final price.

(d) The fourth constraint states that the winners get the least expensive bundles in their
indifference set.
(e) The fifth constraint states that losers bid below the final price.
(f) The last constraint states that all prices are positive numbers.


The ASCA Combinatorial Auction Algorithm.

In the ASCA algorithm the participants in the auction specify the resources and the quantities of each resource offered or desired at the price listed for that time slot. The algorithm then computes the excess vector, the difference between the total demand and the total offer for each resource.


An auctioning algorithm is very appealing because it supports resource bundling and does not
require a model of the system. At the same time, a practical implementation of such algorithms is
challenging. First, requests for service arrive at random times, whereas in an auction all
participants must react to a bid at the same time. Periodic auctions must then be organized, but
this adds to the delay of the response. Second, there is an incompatibility between cloud
elasticity, which guarantees that the demand for resources of an existing application will be
satisfied immediately, and the idea of periodic auctions.

Scheduling algorithms for computing clouds :

Scheduling is a critical component of cloud resource management; it is responsible for resource sharing/multiplexing at several levels.

A server can be shared among several virtual machines, each virtual machine could support
several applications, and each application may consist of multiple threads.


CPU scheduling supports the virtualization of a processor, the individual threads acting as virtual
processors; a communication link can be multiplexed among a number of virtual channels, one
for each flow.

A scheduling algorithm should be efficient, fair, and starvation-free. The objectives of a scheduler for a batch system are to maximize the throughput and to minimize the turnaround time, the interval between a job's submission and its completion.

Schedulers for systems supporting a mix of tasks – some with hard real-time constraints, others with soft or no timing constraints – are often subject to contradictory requirements. Some schedulers are preemptive, allowing a high-priority task to interrupt the execution of a lower-priority one; others are nonpreemptive.

Two distinct dimensions of resource management must be addressed by a scheduling policy:

(a) the amount or quantity of resources allocated and

(b) the timing when access to resources is granted.

Figure 6.7 identifies several broad classes of resource allocation requirements in the space defined by these two dimensions: best-effort, soft requirements, and hard requirements. Hard real-time systems are the most challenging because they require strict timing and precise amounts of resources.


Round-robin, FCFS, shortest-job-first (SJF), and priority algorithms are among the most common scheduling algorithms for best-effort applications. In round-robin scheduling, each thread is given control of the CPU for a definite period of time, called a time slice, in a circular fashion.

The algorithm is fair and starvation-free. The threads are allowed to use the CPU in the order in
which they arrive in the case of the FCFS algorithms and in the order of their running time in the
case of SJF algorithms. Earliest deadline first (EDF) and rate monotonic algorithms (RMA) are
used for real-time applications.
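The difference between FCFS and SJF can be seen in a small batch simulation (an illustrative sketch; the job names and running times are made up, and all jobs are assumed to arrive at time 0):

```python
# Compare FCFS and (nonpreemptive) SJF on the same batch of jobs.
# Each job is (name, running_time); all arrive at time 0.  Average
# turnaround time is the metric a batch scheduler tries to minimize.

def average_turnaround(jobs):
    """jobs: list of (name, running_time) in execution order."""
    clock, total = 0, 0
    for _, burst in jobs:
        clock += burst          # job completes at the current clock
        total += clock          # turnaround = completion - arrival (= 0)
    return total / len(jobs)

jobs = [("J1", 8), ("J2", 4), ("J3", 1)]
fcfs = average_turnaround(jobs)                             # arrival order
sjf = average_turnaround(sorted(jobs, key=lambda j: j[1]))  # shortest first
print(fcfs)   # (8 + 12 + 13) / 3 = 11.0
print(sjf)    # (1 + 5 + 13) / 3 ~ 6.33
```

Running the shortest jobs first lowers the average turnaround time, which is why SJF is provably optimal for this metric when all jobs are available at once.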

Fair queuing

Computing and communication on a cloud are intimately related. Therefore, it should be no surprise that the first algorithm we discuss can be used for scheduling packet transmission as well as threads.

Interconnection networks allow cloud servers to communicate with one another and with users.
These networks consist of communication links of limited bandwidth and switches/routers/gateways of limited capacity. When the load exceeds its capacity, a switch starts dropping packets because it has limited input buffers for the switching fabric and for the outgoing links, as well as limited CPU cycles.

A fair queuing algorithm requires that separate queues, one per flow, be maintained by a switch and that the queues be serviced in a round-robin manner. This algorithm guarantees fairness of buffer space management but does not guarantee fairness of bandwidth allocation; indeed, a flow transporting large packets will benefit from a larger bandwidth (see Figure 6.8).
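The bandwidth bias toward large packets can be seen in a toy round-robin simulation (an illustrative sketch; the flow names and packet sizes are made up):

```python
# Per-flow round-robin fair queuing: one packet per flow per round.
# Buffer space is shared fairly, but a flow sending larger packets
# gets proportionally more bandwidth.  Packet sizes are illustrative.
from collections import deque

def round_robin(flows, rounds):
    """flows: {name: deque of packet sizes}.  Returns bytes sent per flow."""
    sent = {name: 0 for name in flows}
    for _ in range(rounds):
        for name, queue in flows.items():
            if queue:
                sent[name] += queue.popleft()   # one packet per round
    return sent

flows = {"big": deque([1500] * 10), "small": deque([64] * 10)}
print(round_robin(flows, rounds=10))
# {'big': 15000, 'small': 640}: equal packet counts, unequal bandwidth
```

Both flows get exactly one packet per round, yet the flow sending 1,500-byte packets receives more than twenty times the bandwidth of the one sending 64-byte packets, which is precisely the unfairness noted above.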


Start-time fair queuing

The start-time fair queuing (SFQ) algorithm was originally proposed as a hierarchical CPU scheduler for multimedia operating systems. Its basic idea is to organize the consumers of the CPU bandwidth in a tree structure; the root node is the processor and the leaves of this tree are the threads of each application. A scheduler acts at each level of the hierarchy, allocating to each node a fraction of the processor bandwidth.
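A single-level SFQ scheduler can be sketched as follows (an illustrative sketch; the thread names, weights, and quantum are made up). Each quantum of thread i receives a start tag S = max(v, previous finish tag) and a finish tag F = S + quantum/w_i, where v is the virtual time; the scheduler always runs the thread with the smallest start tag, so bandwidth is shared in proportion to the weights:

```python
# Start-time fair queuing (SFQ) sketch for weighted threads.
# The scheduler repeatedly runs the thread with the smallest start
# tag; a thread's next start tag is its previous finish tag, which
# advances more slowly for threads with larger weights.
import heapq

def sfq_schedule(weights, quantum=1.0, steps=6):
    """weights: {thread: weight}.  Returns the execution order."""
    v = 0.0                                  # virtual time
    ready = [(0.0, t) for t in sorted(weights)]   # (start_tag, thread)
    heapq.heapify(ready)
    order = []
    for _ in range(steps):
        start, t = heapq.heappop(ready)      # smallest start tag runs
        v = start                            # virtual time advances
        order.append(t)
        finish = start + quantum / weights[t]
        heapq.heappush(ready, (max(v, finish), t))
    return order

# A thread with weight 2 gets twice as many quanta as one with weight 1:
print(sfq_schedule({"a": 2.0, "b": 1.0}))
```

Over the six quanta, thread "a" runs four times and thread "b" twice, matching the 2:1 weight ratio.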


Cloud scheduling subject to deadlines

An SLA specifies the time when the results of computations done on the cloud should be
available. This motivates us to examine cloud scheduling subject to deadlines, a topic drawing on
a vast body of literature devoted to real-time applications.

There are two types of deadlines:

1. hard deadlines
2. soft deadlines

In the first case, if the task is not completed by the deadline, other tasks that depend on it may be
affected and there are penalties; a hard deadline is strict and expressed precisely as milliseconds
or possibly seconds. Soft deadlines play more of a guideline role and, in general, there are no
penalties.

Soft deadlines can be missed by fractions of the units used to express them, e.g., minutes if the deadline is expressed in hours, or hours if the deadline is expressed in days. The scheduling of tasks on a cloud is generally subject to soft deadlines, though occasionally applications with hard deadlines may be encountered.


There are two partitioning rules: the optimal partitioning rule and the equal partitioning rule.

