Cloud Computing Unit I and II
High-Throughput Computing-HTC
The HTC paradigm pays more attention to high-flux computing. The main applications of high-flux
computing are Internet searches and web services used by millions or more users simultaneously.
Performance is measured as throughput, that is, the number of tasks completed per unit of time.
HTC technology must not only improve batch processing speed, but also address the acute problems
of cost, energy savings, security, and reliability at many data and enterprise computing centers.
• Parallel computing
In parallel computing, all processors are either tightly coupled with centralized
shared memory or loosely coupled with distributed memory.
Interprocessor communication is accomplished through shared memory or via
message passing (a minimal sketch follows this list).
A computer system capable of parallel computing is commonly known as a parallel computer.
Programs running in a parallel computer are called parallel programs. The process of
writing parallel programs is often referred to as parallel programming.
• Distributed computing
A distributed system consists of multiple autonomous computers, each having its own
private memory, communicating through a computer network.
Information exchange in a distributed system is accomplished through message passing.
A computer program that runs in a distributed system is known as a distributed program.
The process of writing distributed programs is referred to as distributed programming.
A distributed computing system uses multiple computers over the Internet to solve
large-scale problems, in contrast to centralized computing, which relies on a single computer.
• Cloud computing
An Internet cloud of resources can be either a centralized or a distributed computing system.
The cloud applies parallel or distributed computing, or both.
Clouds can be built with physical or virtualized resources over large data centers that
are centralized or distributed.
Cloud computing can also be a form of utility computing or service computing.
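As a minimal sketch of the message-passing style used in both parallel and distributed computing (the queue-based design and all names here are illustrative, not from the source), two Python processes can exchange data without any shared memory:

```python
# Minimal message-passing sketch: a worker process receives a task
# through a queue and sends the result back -- no shared memory needed.
from multiprocessing import Process, Queue

def worker(tasks, results):
    x = tasks.get()          # receive a message from the parent process
    results.put(x * x)       # reply with another message

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    p = Process(target=worker, args=(tasks, results))
    p.start()
    tasks.put(7)             # parent -> worker
    print(results.get())     # worker -> parent: prints 49
    p.join()
```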
Degrees of Parallelism
Bit-level parallelism (BLP):
o Converts bit-serial processing to word-level processing gradually.
Instruction-level parallelism (ILP):
o The processor executes multiple instructions simultaneously rather than only one instruction
at a time.
o ILP is executed through pipelining, superscalar computing, VLIW (very long
instruction word) architectures, and multithreading.
o ILP requires branch prediction, dynamic scheduling, speculation, and compiler support to
work efficiently.
Data-level parallelism (DLP):
o DLP is exploited through SIMD (single instruction, multiple data) and vector machines using
vector or array types of instructions.
o DLP requires even more hardware support and compiler assistance to work properly.
Task-level parallelism (TLP):
o Ever since the introduction of multicore processors and chip multiprocessors (CMPs), we
have been exploring TLP.
o TLP is far from being very successful due to the difficulty in programming and compiling
code for efficient execution on multicore CMPs (see the sketch after this list).
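A hedged illustration of DLP versus TLP in Python, where NumPy's vectorized operations stand in for SIMD hardware and a process pool stands in for the cores of a CMP (the library choices and task sizes are assumptions for illustration):

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def task(n):
    # An independent task; many such tasks can run on separate cores (TLP).
    return sum(range(n))

if __name__ == "__main__":
    # DLP: one vectorized operation applies to a million elements at once;
    # NumPy dispatches to SIMD-capable native code under the hood.
    a = np.arange(1_000_000)
    b = a * 2.0

    # TLP: independent tasks scheduled across the cores of a multicore CMP.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(task, [10_000, 20_000, 30_000]))
    print(b[:3], results)
```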
Utility Computing
o Utility computing focuses on a business model in which customers receive computing
resources from a paid service provider. All grid/cloud platforms are regarded as utility
service providers.
Cyber-Physical Systems
o A cyber-physical system (CPS) is the result of interaction between computational
processes and the physical world.
o CPS integrates “cyber” (heterogeneous, asynchronous) with “physical” (concurrent and
information-dense) objects.
o CPS merges the “3C” technologies of computation, communication, and control into
an intelligent closed feedback system.
o IoT emphasizes various networking connections among physical objects, while CPS
emphasizes exploration of virtual reality (VR) applications in the physical world.
Computing cluster
o A computing cluster consists of interconnected stand-alone computers which work
cooperatively as a single integrated computing resource.
Cluster Architecture
o The architecture consists of a typical server cluster built around a low-latency, high-bandwidth
interconnection network.
o To build a larger cluster with more nodes, the interconnection network can be built with
multiple levels of Gigabit Ethernet, Myrinet, or InfiniBand switches.
o Through hierarchical construction using a SAN, LAN, or WAN, one can build
scalable clusters with an increasing number of nodes.
o The cluster is connected to the Internet via a virtual private network (VPN) gateway.
o The gateway IP address locates the cluster.
o Clusters have loosely coupled node computers.
o All resources of a server node are managed by its own OS.
o Most clusters have multiple system images as a result of having many autonomous nodes
under different OS control.
Peer-to-Peer Network-P2P
P2P architecture offers a distributed model of networked systems.
A P2P network is client-oriented instead of server-oriented.
In a P2P system, every node acts as both a client and a server.
Peer machines are simply client computers connected to the Internet.
All client machines act autonomously to join or leave the system freely. This implies that
no master-slave relationship exists among the peers.
No central coordination or central database is needed. The system is self-organizing
with distributed control.
A P2P system operates at two abstraction levels: the physical network formed by the peer
machines on the Internet, and a logical overlay network built on top of it by the links
among the peers.
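A minimal sketch of the client-and-server duality of a peer, assuming an illustrative port number and message format (real P2P systems add peer discovery, routing, and overlay maintenance):

```python
import socket
import threading
import time

def serve(port):
    # Server role: listen for a message from another peer.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen()
        conn, _ = srv.accept()
        with conn:
            print("received:", conn.recv(1024).decode())

def dial(port, msg):
    # Client role: connect out to another peer and send a message.
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(msg.encode())

if __name__ == "__main__":
    # One process playing two peers: peer A listens, peer B connects.
    t = threading.Thread(target=serve, args=(9001,))
    t.start()
    time.sleep(0.2)               # crude wait for the listener to be ready
    dial(9001, "hello from a fellow peer")
    t.join()
```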
Cloud Computing
A cloud is a pool of virtualized computer resources.
A cloud can host a variety of different workloads, including batch-style backend jobs
and interactive, user-facing applications.
Cloud computing applies a virtualized platform with elastic resources on demand
by provisioning hardware, software, and data sets dynamically.
Performance Metrics:
Dimensions of Scalability
Any resource upgrade in a system should be backward compatible with existing hardware and
software resources. System scaling can increase or decrease resources depending on many
practical factors.
• Size scalability
This refers to achieving higher performance or more functionality by increasing the machine
size.
The word “size” refers to adding processors, cache, memory, storage, or I/O channels. The
most obvious way to determine size scalability is to simply count the number of processors
installed.
Not all parallel computers or distributed architectures are equally size-scalable.
For example, the IBM S2 was scaled up to 512 processors in 1997, but in 2008 the
IBM BlueGene/L system scaled up to 65,000 processors.
• Software scalability
This refers to upgrades in the OS or compilers, adding mathematical and engineering
libraries, porting new application software, and installing more user-friendly
programming environments.
Some software upgrades may not work with large system configurations.
Testing and fine-tuning of new software on larger systems is a nontrivial job.
• Application scalability
This refers to matching problem size scalability with machine size scalability.
Problem size affects the size of the data set or the workload increase. Instead of
increasing machine size, users can enlarge the problem size to enhance system efficiency
or cost-effectiveness.
• Technology scalability
This refers to a system that can adapt to changes in building technologies, such as
component and networking technologies.
When scaling a system design with new technology, one must consider three aspects:
time, space, and heterogeneity.
(1) Time refers to generation scalability. When changing to new-generation
processors, one must consider the impact on the motherboard, power supply, packaging
and cooling, and so forth. Based on past experience, most systems upgrade their
commodity processors every three to five years.
(2) Space is related to packaging and energy concerns. Technology scalability demands
harmony and portability among suppliers.
(3) Heterogeneity refers to the use of hardware components or software packages from
different vendors. Heterogeneity may limit scalability.
Amdahl’s Law
Let the program be parallelized or partitioned for parallel execution on a cluster
of many processing nodes.
Assume that a fraction α of the code must be executed sequentially, called the
sequential bottleneck.
Therefore, (1 − α) of the code can be compiled for parallel execution by n processors.
The total execution time of the program is calculated by αT + (1 − α)T/n, where the first
term is the sequential execution time on a single processor and the second term is the
parallel execution time on n processing nodes.
I/O time and exception handling time are not included in the following speedup analysis.
Amdahl’s Law states that the speedup factor of using the n-processor system over the use
of a single processor is expressed by:
S = T / [αT + (1 − α)T/n] = 1 / [α + (1 − α)/n]
The maximum speedup of n is achieved only if the code is fully parallelizable with α = 0.
As the cluster becomes sufficiently large, that is, n → ∞, S approaches 1/α, an upper
bound on the speedup S.
This upper bound is independent of the cluster size n. The sequential bottleneck is
the portion of the code that cannot be parallelized.
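The bound is easy to check numerically; a small sketch, assuming a 5% sequential fraction as an example value:

```python
# Amdahl's Law: S(n) = 1 / (alpha + (1 - alpha) / n)
def amdahl_speedup(alpha, n):
    return 1.0 / (alpha + (1.0 - alpha) / n)

alpha = 0.05                          # assumed: 5% sequential bottleneck
for n in (4, 64, 1024):
    print(n, round(amdahl_speedup(alpha, n), 2))
print("upper bound 1/alpha =", 1 / alpha)   # speedup can never exceed 20x
```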
Gustafson’s Law
To achieve higher efficiency when using a large cluster, we must consider scaling the
problem size to match the cluster capability. This leads to the following speedup law
proposed by John Gustafson (1988), referred to as scaled-workload speedup.
Let W be the workload in a given program.
When using an n-processor system, the user scales the workload to W′ = αW + (1 − α)nW.
Scaled workload W′ is essentially the sequential execution time on a single processor.
The parallel execution of the scaled workload W′ on n processors defines the
scaled-workload speedup as follows:
S′ = W′/W = [αW + (1 − α)nW]/W = α + (1 − α)n
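A companion sketch for Gustafson's Law, using the same assumed 5% sequential fraction; note how the scaled speedup keeps growing with n instead of saturating at 1/α:

```python
# Gustafson's Law: S'(n) = alpha + (1 - alpha) * n
def gustafson_speedup(alpha, n):
    return alpha + (1.0 - alpha) * n

alpha = 0.05                          # assumed: same 5% sequential fraction
for n in (4, 64, 1024):
    print(n, gustafson_speedup(alpha, n))   # grows almost linearly in n
```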
User-Application Level
Virtualization at the application level virtualizes an application as a VM.
On a traditional OS, an application often runs as a process. Therefore, application-level
virtualization is also known as process-level virtualization.
The most popular approach is to deploy high-level language (HLL) VMs. In this scenario, the
virtualization layer sits as an application program on top of the operating system.
The layer exports an abstraction of a VM that can run programs written and compiled to a
particular abstract machine definition.
Any program written in the HLL and compiled for this VM will be able to run on it. The
Microsoft .NET CLR and the Java Virtual Machine (JVM) are two good examples of this class
of VM.
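To make the idea concrete, here is a toy sketch of a process-level HLL VM: a tiny stack machine whose “abstract machine definition” consists of just three made-up opcodes (real VMs such as the JVM are vastly richer):

```python
# Toy process-level VM: any "bytecode" compiled to this 3-opcode
# abstract machine runs wherever the interpreter itself runs.
def run(program):
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "PRINT":
            print(stack.pop())

# "Bytecode" for: print(2 + 3)
run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PRINT",)])
```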
Xen Architecture
Xen is an open source hypervisor program developed by Cambridge University.
Xen is a microkernel hypervisor.
The core components of a Xen system are the hypervisor, kernel, and applications.
The guest OS, which has control ability, is called Domain 0; the others are called
Domain U.
Domain 0 is designed to access hardware directly and manage devices.
Full virtualization
In full virtualization, noncritical instructions run on the hardware directly, while
critical instructions are discovered and replaced with traps into the VMM to be emulated
by software.
VMware puts the VMM at Ring 0 and the guest OS at Ring 1.
The VMM scans the instruction stream and identifies the privileged, control- and
behavior-sensitive instructions.
When these instructions are identified, they are trapped into the VMM, which
emulates the behavior of these instructions.
The method used in this emulation is called binary translation.
Therefore, full virtualization combines binary translation and direct execution.
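A toy sketch of the trap-and-emulate idea behind full virtualization; the instruction names, the CRITICAL set, and the handler are invented for illustration, and real VMMs translate machine code rather than strings:

```python
# Noncritical instructions "run directly"; critical ones trap into the
# VMM, which emulates their effect against the VM's virtual state.
CRITICAL = {"HLT", "OUT"}             # assumed set of privileged opcodes

vm_state = {"halted": False, "log": []}

def vmm_emulate(instr):
    # The VMM emulates the privileged instruction in software.
    if instr == "HLT":
        vm_state["halted"] = True
    elif instr == "OUT":
        vm_state["log"].append("I/O intercepted")

def execute(stream):
    for instr in stream:
        if instr in CRITICAL:
            vmm_emulate(instr)        # trap into the VMM
        else:
            pass                      # stand-in for direct execution on hardware

execute(["ADD", "OUT", "MOV", "HLT"])
print(vm_state)                       # {'halted': True, 'log': ['I/O intercepted']}
```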
CPU Virtualization
• A CPU architecture is virtualizable if it supports the ability to run the VM’s privileged and
unprivileged instructions in the CPU’s user mode while the VMM runs in supervisor
mode.
• Hardware-Assisted CPU Virtualization: This technique attempts to simplify
virtualization, because full virtualization and paravirtualization are complicated.
Memory Virtualization
• In memory virtualization, the operating system maintains mappings of virtual memory to
machine memory using page tables.
• All modern x86 CPUs include a memory management unit (MMU) and a translation
lookaside buffer (TLB) to optimize virtual memory performance.
• A two-stage mapping process must be maintained by the guest OS and the VMM,
respectively: virtual memory to physical memory, and physical memory to machine
memory.
• The VMM is responsible for mapping the guest physical memory to the actual machine
memory.
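A minimal sketch of the two-stage mapping, using Python dictionaries as stand-in page tables (all page numbers are made up):

```python
# Stage 1 (guest OS):  guest virtual page  -> guest physical page
# Stage 2 (VMM):       guest physical page -> machine (host) page
guest_page_table = {0: 5, 1: 9}       # maintained by the guest OS
vmm_page_table   = {5: 42, 9: 77}     # maintained by the VMM

def translate(virtual_page):
    physical = guest_page_table[virtual_page]   # first-stage mapping
    return vmm_page_table[physical]             # second-stage mapping

print(translate(0))                   # virtual page 0 -> machine page 42
```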
Virtual Clusters
• There are four ways to manage a virtual cluster.
• First, you can use a guest-based manager, by which the cluster manager resides on a
guest system.
• Second, you can use a host-based manager, which supervises the guest systems and can
restart a guest system on another physical machine.
• The third way is to use an independent cluster manager on both the host and guest
systems.
• Finally, you can use an integrated cluster manager on the guest and host systems; this
means the manager must be designed to distinguish between virtualized resources and
physical resources.