CCS335-Cloud-Computing-QUESTION BANK
COURSE OBJECTIVES:
To understand the principles of cloud architecture, models and infrastructure.
To understand the concepts of virtualization and virtual machines.
To gain knowledge about virtualization Infrastructure.
To explore and experiment with various Cloud deployment environments.
To learn about the security issues in the cloud environment.
COURSE OUTCOMES:
CO1: Understand the design challenges in the cloud.
CO2: Apply the concept of virtualization and its types.
CO3: Experiment with virtualization of hardware resources and Docker.
CO4: Develop and deploy services on the cloud and set up a cloud environment.
CO5: Explain security challenges in the cloud environment.
TOTAL: 60 PERIODS
TEXT BOOKS
1. Kai Hwang, Geoffrey C Fox, Jack G Dongarra, “Distributed and Cloud Computing, From
Parallel Processing to the Internet of Things”, Morgan Kaufmann Publishers, 2012.
2. James Turnbull, “The Docker Book”, O’Reilly Publishers, 2014.
3. Krutz, R. L., Vines, R. D., “Cloud Security: A Comprehensive Guide to Secure Cloud
Computing”, Wiley Publishing, 2010.
REFERENCES
1. James E. Smith, Ravi Nair, “Virtual Machines: Versatile Platforms for Systems and Processes”,
Elsevier/Morgan Kaufmann, 2005.
2. Tim Mather, Subra Kumaraswamy, and Shahed Latif, “Cloud Security and Privacy: an
enterprise perspective on risks and compliance”, O’Reilly Media, Inc., 2009.
UNIT 1
CLOUD ARCHITECTURE MODELS AND INFRASTRUCTURE
SYLLABUS: Cloud Architecture: System Models for Distributed and Cloud Computing – NIST
Cloud Computing Reference Architecture – Cloud deployment models – Cloud service models; Cloud
Infrastructure: Architectural Design of Compute and Storage Clouds – Design Challenges
PART A
2 Marks
1. What is Cloud Computing?
Cloud Computing is defined as storing and accessing of data and computing services
over the internet. It doesn’t store any data on your personal computer. It is the on-demand
availability of computer services like servers, data storage, networking, databases, etc. The
main purpose of cloud computing is to give access to data centers to many users. Users can
also access data from a remote server.
Examples of Cloud Computing Services: AWS, Azure, etc.
5. Write down the advantages (or pros) of cloud computing?
1. Improved Performance
2. Lower IT Infrastructure Costs
3. Fewer Maintenance Issues
4. Lower Software Costs
5. Instant Software Updates
6. Increased Computing Power
6. Write down the disadvantages of cloud computing?
1. Requires a constant Internet connection
2. Does not work well with low-speed connections
3. Can be slow
4. Stored data might not be secure
5. Stored data can be lost
7. What are the computing Paradigm Distinctions?
➢ Centralized computing
➢ Parallel Computing
➢ Distributed Computing
➢ Cloud Computing
8. What are the differences between Grid computing and cloud computing?
What?
  Grid: Grids enable access to shared computing power and storage capacity from your desktop.
  Cloud: Clouds enable access to leased computing power and storage capacity from your desktop.
Who provides the service?
  Grid: Research institutes and universities federate their services around the world.
  Cloud: Large individual companies, e.g. Amazon and Microsoft.
Who uses the service?
  Grid: Research collaborations, called "Virtual Organizations", which bring together researchers around the world working in the same field.
  Cloud: Small to medium commercial businesses or researchers with generic IT needs.
Who pays for the service?
  Grid: Governments - providers and users are usually publicly funded research organizations.
  Cloud: The cloud provider pays for the computing resources; the user pays to use them.
10. List the actors defined in the NIST cloud computing reference architecture?
The NIST cloud computing reference architecture defines five major actors:
cloud consumer, cloud provider, cloud carrier, cloud auditor and cloud broker. Each
actor is an entity (a person or an organization) that participates in a transaction or process
and/or performs tasks in cloud computing.
PART B
13 Marks
1. Explain in detail the architecture of cloud computing?
(Definition:2 marks, Diagram:4 marks, Explanation:7 marks)
Cloud computing is one of the most in-demand technologies of the current time, and it is giving a new shape to every organization by providing on-demand virtualized services/resources. From small to medium and medium to large, every organization uses cloud computing services for storing information and accessing it from anywhere and at any time, with only the help of the internet.
Transparency, scalability, security and intelligent monitoring are some of the most important properties that every cloud infrastructure should provide. Ongoing research on other important constraints is helping cloud computing systems come up with new features and strategies capable of providing more advanced cloud solutions.
Cloud Computing Architecture :
Architecture of cloud computing is the combination of both SOA (Service Oriented Architecture)
and EDA (Event Driven Architecture). Client infrastructure, application, service, runtime cloud,
storage, infrastructure, management and security all these are the components of cloud computing
architecture.
Frontend: The frontend of the cloud architecture refers to the client side of the cloud computing system. It contains all the user interfaces and applications which are used by the client to access the cloud computing services/resources. For example, use of a web browser to access the cloud platform.
• Client Infrastructure – Client infrastructure is a part of the frontend component. It contains the applications and user interfaces which are required to access the cloud platform.
• In other words, it provides a GUI (Graphical User Interface) to interact with the cloud.
Backend: Backend refers to the cloud itself which is used by the service provider. It contains the
resources as well as manages the resources and provides security mechanisms. Along with this, it
includes huge storage, virtual applications, virtual machines, traffic control mechanisms,
deployment models, etc.
1. Application
Application in the backend refers to the software or platform that the client accesses; it provides the service in the backend as per the client's requirements.
2. Service
Service in the backend refers to the three major types of cloud-based services: SaaS, PaaS and IaaS. It also manages which type of service the user accesses.
3. Runtime Cloud
Runtime cloud in the backend provides the execution and runtime platform/environment for the virtual machines.
4. Storage
Storage in the backend provides a flexible and scalable storage service and management of stored data.
5. Infrastructure
Cloud infrastructure in the backend refers to the hardware and software components of the cloud, including servers, storage, network devices, virtualization software, etc.
6. Management
Management in the backend refers to the management of backend components like application, service, runtime cloud, storage, infrastructure, and other security mechanisms.
7. Security
Security in the backend refers to the implementation of different security mechanisms in the backend to secure cloud resources, systems, files, and infrastructure for end users.
8. Internet
The Internet acts as the medium, or bridge, between the frontend and the backend, and establishes the interaction and communication between them.
9. Database
Database in the backend provides databases for storing structured data, such as SQL and NoSQL databases. Examples of database services include Amazon RDS, Microsoft Azure SQL Database and Google Cloud SQL.
10. Networking
Networking in the backend refers to services that provide networking infrastructure for applications in the cloud, such as load balancing, DNS and virtual private networks.
11. Analytics
Analytics in the backend refers to services that provide analytics capabilities for data in the cloud, such as warehousing, business intelligence and machine learning.
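The frontend-to-backend interaction described above can be pictured with a small client-side sketch in Python. It is an illustration only: the endpoint URL, token and JSON field names are hypothetical placeholders and do not belong to any real provider's API.

import requests

API_BASE = "https://api.example-cloud.test/v1"    # hypothetical backend endpoint
API_TOKEN = "replace-with-your-token"             # hypothetical credential

def list_virtual_machines():
    # Frontend call: ask the backend's service/management layer for the caller's VMs.
    resp = requests.get(
        f"{API_BASE}/compute/instances",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()    # errors are reported over the same interface
    return resp.json()         # the backend returns JSON metadata about the resources

if __name__ == "__main__":
    for vm in list_virtual_machines().get("instances", []):
        print(vm.get("id"), vm.get("state"))

The frontend only deals with the interface; everything behind the endpoint (service, runtime, storage, security) is handled by the backend components listed above.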
Benefits of Cloud Computing Architecture:
• Makes the overall cloud computing system simpler.
• Helps to meet data processing requirements.
• Helps in providing high security.
• Makes the system more modular.
• Results in better disaster recovery.
• Gives good user accessibility.
• Reduces IT operating costs.
• Provides a high level of reliability.
• Provides scalability.
Massive systems are considered highly scalable, and can reach web-scale connectivity, either
physically or logically. In Table 1.2, massive systems are classified into four groups: clusters, P2P
networks, computing grids, and Internet clouds over huge data centers. In terms of node number,
these four system classes may involve hundreds, thousands, or even millions of computers as
participating nodes. These machines work collectively, cooperatively, or collaboratively at various
levels. The table entries characterize these four system classes in various technical and application
aspects.
1. Clusters of Cooperative Computers
A computing cluster consists of interconnected stand-alone computers which work cooperatively
as a single integrated computing resource. In the past, clustered computer systems have
demonstrated impressive results in handling heavy workloads with large data sets.
1.1 Cluster Architecture
Figure 1.15 shows the architecture of a typical server cluster built around a low-latency,
high-bandwidth interconnection network. This network can be as simple as a SAN (e.g., Myrinet)
or a LAN (e.g., Ethernet). To build a larger cluster with more nodes, the interconnection network
can be built with multiple levels of Gigabit Ethernet, Myrinet, or InfiniBand switches. Through
hierarchical construction using a SAN, LAN, or WAN, one can build scalable clusters with an
increasing number of nodes. The cluster is connected to the Internet via a virtual private network
(VPN) gateway. The gateway IP address locates the cluster. The system image of a computer is
decided by the way the OS manages the shared cluster resources. Most clusters have loosely
coupled node computers. All resources of a server node are managed by their own OS. Thus, most
clusters have multiple system images as a result of having many autonomous nodes under different
OS control.
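To make the idea of loosely coupled nodes cooperating as a single integrated resource concrete, the following sketch assumes the mpi4py package and an MPI runtime on the cluster (launched, for example, with mpirun -np 4 python cluster_sum.py). It is an illustrative sketch, not part of the textbook material.

from mpi4py import MPI

comm = MPI.COMM_WORLD        # communicator spanning all participating processes/nodes
rank = comm.Get_rank()       # each autonomous node image gets its own rank
size = comm.Get_size()

# Every node computes a partial result under its own OS image...
partial = sum(range(rank * 1000, (rank + 1) * 1000))

# ...and the partial results are combined, so the cluster acts as one integrated resource.
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print(f"{size} cluster processes cooperated; combined result = {total}")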
2. Grid Computing Infrastructures
Like an electric utility power grid, a computing grid offers an infrastructure that couples computers, software/middleware, special instruments, and people and sensors together. The grid is often constructed across LAN, WAN, or Internet backbone networks at a regional, national, or global scale. Enterprises or organizations present grids as integrated computing resources. They can also be viewed as virtual platforms to support virtual organizations. The computers used in a grid are primarily workstations, servers, clusters, and supercomputers. Personal computers, laptops, and PDAs can be used as access devices to a grid system.
Figure 1.16 shows an example computational grid built over multiple resource sites owned by
different organizations. The resource sites offer complementary computing resources, including
workstations, large servers, a mesh of processors, and Linux clusters to satisfy a chain of
computational needs. The grid is built across various IP broadband networks including LANs and
WANs already used by enterprises or organizations over the Internet. The grid is presented to users
as an integrated resource pool, as shown in the upper half of the figure. Many national and international grids, such as the NSF TeraGrid in the US, EGEE in Europe, and ChinaGrid in China, have been built for various distributed scientific grid applications (reported in Chapter 7).
3. Peer-to-Peer Network Families
In a P2P system, every node acts as both a client and a server. Based on communication or file-sharing needs, the peer IDs form an overlay network at the logical level. This overlay is a virtual network formed by mapping each physical machine with its ID, logically, through a virtual mapping as shown in Figure 1.17. When a new peer joins the system, its peer ID is added as a node in the overlay network. When an existing peer leaves the system, its peer ID is removed from the overlay network automatically. Therefore, it is the P2P overlay network that characterizes the logical connectivity among the peers.
There are two types of overlay networks: unstructured and structured. An unstructured
overlay network is characterized by a random graph. There is no fixed route to send messages or
files among the nodes. Often, flooding is applied to send a query to all nodes in an unstructured
overlay, thus resulting in heavy network traffic and nondeterministic search results. Structured overlay networks follow certain connectivity topology and rules for inserting and removing nodes (peer IDs) from the overlay graph. Routing mechanisms are developed to take advantage of the structured overlays.
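The routing rule of a structured overlay can be illustrated with a tiny consistent-hashing sketch in Python. The peer names, key and ID size below are made up; this is not a full DHT implementation such as Chord.

import hashlib
from bisect import bisect_right

def ring_id(name: str, bits: int = 16) -> int:
    # Map a peer name (or a data key) to a position on the logical ring.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

peers = sorted(ring_id(p) for p in ["peerA", "peerB", "peerC", "peerD"])

def successor(key: str) -> int:
    # Route a key to the first peer ID clockwise from the key's position.
    idx = bisect_right(peers, ring_id(key))
    return peers[idx % len(peers)]    # wrap around the ring

# Joining or leaving peers only changes the sorted ID list, not the routing rule.
print("key 'song.mp3' is stored at peer ID", successor("song.mp3"))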
3.3 P2P Application Families
Based on application, P2P networks are classified into four groups, as shown in Table
1.5. The first family is for distributed file sharing of digital contents (music, videos, etc.) on the
P2P network. This includes many popular P2P networks such as Gnutella, Napster, and BitTorrent, among others. Collaboration P2P networks include MSN or Skype chatting, instant messaging, and collaborative design, among others. The third family is for distributed P2P computing in specific applications. For example, SETI@home provides 25 Tflops of distributed computing power, collectively, over 3 million Internet host machines. Other P2P platforms, such as JXTA, .NET, and FightingAID@home, support naming, discovery, communication, security, and resource aggregation in some P2P applications. We will discuss these topics in more detail in Chapters 8 and 9.
Heterogeneity problems in hardware, software, and network requirements make it too complex to apply P2P technology in many real applications. We need system scalability as the workload
increases. System scaling is directly related to performance and bandwidth. P2P networks do have
these properties. Data location is also important to affect collective performance. Data locality,
network proximity, and interoperability are three design objectives in distributed P2P applications.
P2P performance is affected by routing efficiency and self-organization by participating peers.
Fault tolerance, failure management, and load balancing are other important issues in using overlay
networks. Lack of trust among peers poses another problem. Peers are strangers to one another.
Security, privacy, and copyright violations are major worries by those in the industry in terms of
applying P2P technology in business applications [35]. In a P2P network, all clients provide
resources including computing power, storage space, and I/O bandwidth. The distributed nature of
P2P networks also increases robustness, because limited peer failures do not form a single point of failure. By replicating data across multiple peers, one can avoid losing data when individual nodes fail. On the other hand,
disadvantages of P2P networks do exist. Because the system is not centralized, managing it is
difficult. In addition, the system lacks security. Anyone can log on to the system and cause damage
or abuse. Further, all client computers connected to a P2P network cannot be considered reliable
or virus-free. In summary, P2P networks are reliable for a small number of peer nodes. They are
only useful for applications that require a low level of security and have no concern for data
sensitivity. We will discuss P2P networks in Chapter 8, and extending P2P technology to social
networking in Chapter 9.
4. Cloud Computing over the Internet
Gordon Bell, Jim Gray, and Alex Szalay [5] have advocated: “Computational science is changing
to be data-intensive. Supercomputers must be balanced systems, not just CPU farms but also
petascale I/O and networking arrays.” In the future, working with large data sets will typically
mean sending the computations (programs) to the data, rather than copying the data to the
workstations. This reflects the trend in IT of moving computing and data from desktops to large
data centers, where there is on-demand provision of software, hardware, and data as a service. This
data explosion has promoted the idea of cloud computing.
Cloud computing has been defined differently by many users and designers. For example, IBM,
a major player in cloud computing, has defined it as follows: “A cloud is a pool of virtualized
computer resources. A cloud can host a variety of different workloads, including batch-style backend jobs and interactive, user-facing applications.” Based on this definition, a cloud allows workloads to be deployed and scaled out
quickly through rapid provisioning of virtual or physical machines. The cloud supports redundant,
self-recovering, highly scalable programming models that allow workloads to recover from many
unavoidable hardware/software failures. Finally, the cloud system should be able to monitor resource
use in real time to enable rebalancing of allocations when needed.
4.1 Internet Clouds
Cloud computing applies a virtualized platform with elastic resources on demand by provisioning
hardware, software, and data sets dynamically (see Figure 1.18). The idea is to move desktop
computing to a service-oriented platform using server clusters and huge databases at data centers.
Cloud computing leverages its low cost and simplicity to benefit both users and providers. Machine
virtualization has enabled such cost-effectiveness. Cloud computing intends to satisfy many user
applications simultaneously. The cloud ecosystem must be designed to be secure, trustworthy, and
dependable. Some computer users think of the cloud as a centralized resource pool. Others consider
the cloud to be a server cluster which practices distributed computing over all the servers used.
The cloud architecture is developed at three layers: infrastructure, platform, and application, as demonstrated in Figure 1.15. These three development layers are implemented with virtualization and standardization of hardware and software resources provisioned in the cloud. The services to public, private and hybrid clouds are conveyed to users through networking support over the Internet and the intranets involved.
• It is clear that the infrastructure layer is deployed first to support IaaS services.
• The platform layer is for general-purpose and repeated usage of the collection of software resources. This layer provides users with an environment to develop their applications, to test operation flows and to monitor execution results and performance.
• The platform should be able to assure users of scalability, dependability, and security protection. In a way, the virtualized cloud platform serves as a "system middleware" between the infrastructure and application layers of the cloud.
• The application layer is formed with a collection of all needed software modules for SaaS applications. Service applications in this layer include daily office management work such as information retrieval, document processing, and calendar and authentication services.
• The application layer is also heavily used by enterprises in business marketing and sales, consumer relationship management (CRM), financial transactions and supply chain management.
• From the provider's perspective, the services at various layers demand different amounts of functionality support and resource management from the provider. In general, SaaS demands the most work from the provider, PaaS is in the middle, and IaaS demands the least. For example, Amazon EC2 provides not only virtualized CPU resources to users but also management of these provisioned resources. Services at the application layer demand more work from providers.
• The best example of this is the Salesforce.com CRM service, in which the provider supplies not only the hardware at the bottom layer and the software at the top layer but also the platform and software tools for user application development and monitoring.
• In Market Oriented Cloud Architecture, as consumers rely on cloud providers to meet more of their
computing needs, they will require a specific level of QoS to be maintained by their providers, in
order to meet their objectives and sustain their operations. Market-oriented resource management is
necessary to regulate the supply and demand of cloud resources to achieve market equilibrium
between supply and demand.
o The pricing mechanism decides how service requests are charged. For instance, requests can be charged based on submission time (peak/off-peak), pricing rates (fixed/changing), or availability of resources (supply/demand).
• The VM Monitor mechanism keeps track of the availability of VMs and their resource
entitlements.
The Accounting mechanism maintains the actual usage of resources by requests so that the final
cost can be computed and charged to users.
In addition, the maintained historical usage information can be utilized by the Service Request
Examiner and Admission Control mechanism to improve resource allocation decisions.
The Dispatcher mechanism starts the execution of accepted service requests on allocated VMs. The
Service Request Monitor mechanism keeps track of the execution progress of service requests.
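A rough illustration of the pricing and accounting mechanisms described above is given below; all rates and usage figures are hypothetical assumptions, used only to show how submission time and supply/demand could influence the final charge.

PEAK_HOURS = range(9, 18)    # 09:00-17:59 treated as peak submission time

def price_per_hour(submission_hour: int, demand_ratio: float) -> float:
    # Pricing mechanism: charge by submission time and by supply/demand.
    rate = 0.10                            # hypothetical base rate in $ per VM-hour
    if submission_hour in PEAK_HOURS:
        rate *= 1.5                        # peak-time surcharge
    return rate * max(1.0, demand_ratio)   # raise the price when demand exceeds supply

def accounting(usage_records) -> float:
    # Accounting mechanism: total the actual usage so the final cost can be charged.
    return sum(hours * price_per_hour(hour, demand)
               for hour, hours, demand in usage_records)

# Usage records: (submission hour, VM-hours consumed, demand/supply ratio)
records = [(10, 3, 1.2), (22, 5, 0.8)]
print(f"Amount to charge the consumer: ${accounting(records):.2f}")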
2. Explain in detail the architectural design challenges of (i) service availability and the data lock-in problem and (ii) data privacy and security concerns?
(Concept explanation (i): 7 marks, Concept explanation (ii): 6 marks)
Challenge 1: Service Availability and Data Lock-in Problem
• The management of a cloud service by a single company is often the source of single points of failure. Even if a company has multiple data centers located in different geographic regions, it may use common software infrastructure and accounting systems.
• Therefore, using multiple cloud providers may provide more protection from failures. Another availability obstacle is distributed denial of service (DDoS) attacks, in which criminals threaten to cut off the income of SaaS providers by making their services unavailable. Some utility computing services offer SaaS providers the opportunity to defend against DDoS attacks by using quick scale-ups.
• Software stacks have improved interoperability among different cloud platforms, but the APIs themselves are still proprietary. Thus, customers cannot easily extract their data and programs from one site to run on another.
• The obvious solution is to standardize the APIs so that a SaaS developer can deploy services and data across multiple cloud providers. This will rescue the loss of all data due to the failure of a single company. In addition to mitigating data lock-in concerns, standardization of APIs enables a new usage model in which the same software infrastructure can be used in both public and private clouds.
Such an option could enable surge computing, in which the public cloud is used to capture the
extra tasks that cannot be easily run in the data center of a private cloud.
3. Explain in detail about architectural design challenges of
(i) Unpredictable Performance and Bottlenecks
(ii) Distributed Storage and Widespread Software Bugs?
Challenge 3: Unpredictable Performance and Bottlenecks
• Multiple VMs can share CPUs and main memory well, but I/O sharing is problematic. For example, to run 75 EC2 instances with the STREAM benchmark requires a mean bandwidth of 1,355 MB/second. However, for each of the 75 EC2 instances to write 1 GB files to the local disk requires a mean disk write bandwidth of only 55 MB/second.
• If applications are pulled apart across the boundaries of clouds, this may complicate data placement and transport. Cloud users and providers have to think about the implications of placement and traffic at every level of the system if they want to minimize costs.
• This kind of reasoning can be seen in Amazon's development of its CloudFront service.
• Therefore, data transfer bottlenecks must be removed, bottleneck links must be widened and weak servers should be removed.
Challenge 4: Distributed Storage and Widespread Software Bugs
• The database is always growing in cloud applications. The opportunity is to create a storage system that will not only meet this growth but also combine it with the cloud advantage of scaling arbitrarily up and down on demand.
Challenge 5: Cloud Scalability, Interoperability and Standardization
• The pay-as-you-go model applies to storage and network bandwidth; both are counted in terms of the number of bytes used.
• GAE automatically scales in response to load increases or decreases, and the users are charged by the cycles used.
• AWS charges by the hour for the number of VM instances used, even if the machine is idle. The opportunity here is to scale quickly up and down in response to load variation, in order to save money, but without violating SLAs.
• The Open Virtualization Format (OVF) describes an open, secure, portable, efficient and extensible format for the packaging and distribution of VMs. It also defines a format for distributing software to be deployed in VMs.
• This VM format does not rely on the use of a specific host platform, virtualization platform or guest operating system.
• In terms of cloud standardization, the suggestion is the ability for virtual appliances to run on any virtual platform. Users also need to enable VMs to run on heterogeneous hardware platform hypervisors.
• This requires hypervisor-agnostic VMs. Users also need to realize cross-platform live migration between x86 Intel and AMD technologies, and to support legacy hardware for load balancing.
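A small worked sketch of the pay-as-you-go billing discussed under Challenge 5 follows; the per-hour and per-GB rates are hypothetical, chosen only to show how hour-based and byte-based charges combine.

def monthly_cost(vm_hours: float, gb_transferred: float,
                 rate_per_hour: float = 0.05, rate_per_gb: float = 0.09) -> float:
    # Cost = compute billed per VM-hour + network traffic billed per GB (bytes used).
    return vm_hours * rate_per_hour + gb_transferred * rate_per_gb

# An instance left running all month is billed even while idle:
always_on = monthly_cost(vm_hours=24 * 30, gb_transferred=50)
# Scaling down to 8 hours/day saves money, provided the SLA is still met:
scaled = monthly_cost(vm_hours=8 * 30, gb_transferred=50)
print(f"always-on: ${always_on:.2f}, scaled down: ${scaled:.2f}")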
Challenge 6: Software Licensing and Reputation Sharing
• Many cloud computing providers originally relied on open source software because the licensing model for commercial software is not ideal for utility computing.
• The primary opportunity is either for open source to remain popular or simply for commercial software companies to change their licensing structure to better fit cloud computing.
• One can consider using both pay-for-use and bulk-use licensing schemes to widen the business coverage.
PART C
15 Marks
Cloud computing helps in rendering several services according to roles, companies, etc. The cloud service models are explained below.
• Infrastructure as a service (IaaS)
• Platform as a service (PaaS)
• Software as a service (SaaS)
Infrastructure as a Service (IaaS) is a type of cloud computing that provides virtualized computing resources, such as servers, storage and networking, over the internet; the consumer manages the operating systems and applications, while the provider manages the underlying infrastructure.
Platform as a Service (PaaS) is a type of cloud computing that helps developers to build applications and services over the Internet by providing them with a platform.
PaaS helps developers maintain control over their business applications.
Advantages of PaaS
• PaaS is simple and very convenient for the user, as it can be accessed via a web browser.
• PaaS has the capability to efficiently manage the application lifecycle.
Disadvantages of PaaS
• PaaS gives limited control over the infrastructure: users have less control over the environment and are not able to make some customizations.
• PaaS has a high dependence on the provider.
Software as a Service (SaaS) is a type of cloud computing model that delivers services and applications over the Internet. SaaS applications are called Web-Based Software or Hosted Software.
SaaS accounts for around 60 percent of cloud solutions and, due to this, it is mostly preferred by companies.
Advantages of SaaS
• SaaS applications and their data can be accessed from anywhere over the Internet.
• SaaS provides easy access to features and services.
Disadvantages of SaaS
• SaaS solutions have limited customization, which means they have some restrictions within the platform.
• SaaS gives the user little control over their data.
• SaaS solutions are generally cloud-based, so they require a stable internet connection for proper working.
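In practice, the three service models differ mainly in who manages each layer of the computing stack. The short sketch below encodes that commonly cited split; the layer names and groupings are a general illustration, not any particular provider's definition.

LAYERS = ["application", "data", "runtime", "middleware", "OS",
          "virtualization", "servers", "storage", "networking"]

MANAGED_BY_PROVIDER = {
    "IaaS": {"virtualization", "servers", "storage", "networking"},
    "PaaS": {"runtime", "middleware", "OS", "virtualization",
             "servers", "storage", "networking"},
    "SaaS": set(LAYERS),    # the provider manages everything in SaaS
}

def consumer_managed(model: str):
    # Layers the cloud consumer is still responsible for under a given model.
    return [layer for layer in LAYERS if layer not in MANAGED_BY_PROVIDER[model]]

for model in ("IaaS", "PaaS", "SaaS"):
    print(model, "-> consumer manages:", consumer_managed(model) or "nothing")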
Cloud infrastructure
Cloud computing is one of the most in-demand technologies of the current scenario and has proved to be a revolutionary technology trend for businesses of all sizes. It manages a broad and complex infrastructure setup to provide cloud services and resources to the customers. Cloud infrastructure, which comes under the backend part of the cloud architecture, represents the hardware and software components such as servers, storage, networking, management software, deployment software and virtualization software. In the backend, cloud infrastructure enables the complete cloud computing system.
Why Cloud Computing Infrastructure?
Cloud computing refers to providing on-demand services to the customer anywhere and at any time, and the cloud infrastructure is what activates the complete cloud computing system. Cloud infrastructure is capable of providing the same services as a physical infrastructure to the customers. It is available for private cloud, public cloud, and hybrid cloud systems with low cost, greater flexibility and scalability.
Cloud infrastructure components:
Different components of cloud infrastructure support the computing requirements of a cloud computing model. Cloud infrastructure has a number of key components, not limited to servers, software, network and storage devices. Still, cloud infrastructure is generally categorized into three parts, i.e.
1. Computing
2. Networking
3. Storage
The most important point is that cloud infrastructure should satisfy some basic infrastructural requirements such as transparency, scalability, security and intelligent monitoring.
The figure below represents the components of cloud infrastructure.
1. Hypervisor:
A hypervisor is firmware or a low-level program which is the key enabler of virtualization. It is used to divide and allocate cloud resources between several customers. As it monitors and manages cloud services/resources, the hypervisor is also called the VMM (Virtual Machine Monitor) or Virtual Machine Manager.
2. Management Software :
Management software helps in maintaining and configuring the infrastructure. Cloud
management software monitors and optimizes resources, data, applications and services.
3. Deployment Software :
Deployment software helps in deploying and integrating the application on the cloud. So,
typically it helps in building a virtual computing environment.
4. Network:
The network is one of the key components of cloud infrastructure and is responsible for connecting cloud services over the internet. A network is required for the transmission of data and resources externally and internally.
5. Server:
The server, which represents the computing portion of the cloud infrastructure, is responsible for managing and delivering cloud services to various consumers and partners, maintaining security, etc.
6. Storage:
Storage represents the storage facility provided to different organizations for storing and managing data. Because it keeps many copies of the data, another copy can be retrieved if one of the resources fails.
Along with this, virtualization is also considered one of the important components of cloud infrastructure, because it abstracts the available data storage and computing power away from the actual hardware, and the users interact with their cloud infrastructure through a GUI (Graphical User Interface).
NIST stands for National Institute of Standards and Technology. The goal of the NIST cloud computing reference architecture is to achieve effective and secure cloud computing to reduce costs and improve services.
• NIST composed six major workgroups specific to cloud computing.
• Objectives of the NIST cloud computing reference architecture: to illustrate and understand the various levels of services.
• In general, NIST generates reports for future reference which include surveys and analysis of existing cloud computing reference models, vendors and federal agencies.
The conceptual reference architecture shown in Figure 1.4 involves five actors. Each actor is an entity that participates in cloud computing.
• Cloud consumer: A person or an organization that maintains a business relationship with, and uses services from, cloud providers.
• Cloud provider: A person, organization or entity responsible for making a service available to interested parties.
• Cloud auditor: A party that conducts independent assessment of cloud services, information system operations, performance and security of the cloud implementation.
• Cloud broker: An entity that manages the performance and delivery of cloud services and negotiates relationships between cloud providers and consumers.
• Cloud carrier: An intermediary that provides connectivity and transport of cloud services from cloud providers to consumers.
Figure 1.5 illustrates the common interactions that exist between the cloud consumer and the provider, where the broker is used to provide services to the consumer and the auditor collects the audit information. The interactions between the actors may lead to different use case scenarios.
Figure 1.6 shows one kind of scenario in which the Cloud consumer may request service from a cloud
broker instead of contacting service provider directly. In this case, a cloud broker can create a new
service by combining multiple services
Figure 1.7 illustrates the usage of different kind of Service Level Agreement (SLA) between
consumer, provider and carrier.
The cloud consumer is the principal stakeholder for the cloud computing service and requires service level agreements to specify the performance requirements to be fulfilled by a cloud provider.
There are three kinds of cloud consumers: SaaS consumers, PaaS consumers and IaaS consumers.
• SaaS consumers are members who directly access the software application, for example, document management, content management, social networks, financial billing and so on.
• PaaS consumers deploy, test, develop and manage applications hosted in a cloud environment. Database application deployment, development and testing is an example of this kind of consumer.
• IaaS consumers can access virtual computers, storage and network infrastructure, for example, using an Amazon EC2 instance to deploy a web application.
On the other hand, cloud providers have complete control over the software applications. In the Software as a Service model, the cloud provider is allowed to configure, maintain and update the operation of the software applications.
• Normally, the service layer defines the interfaces for cloud consumers to access the computing services.
• The resource abstraction and control layer contains the system components that the cloud provider uses to provide and manage access to the physical computing resources through software abstraction.
• Resource abstraction covers virtual machine management and virtual storage management. The control layer focuses on resource allocation, access control and usage monitoring.
• The physical resource layer includes physical computing resources such as CPU, memory, routers, switches, firewalls and hard disk drives.
Service orchestration describes the automated arrangement, coordination and management of complex computing systems.
• In cloud service management, business support entails the set of business-related services dealing with consumers and supporting services, which include content management, contract management, inventory management, and accounting and rating services.
• Provisioning of equipment, wiring and transmission is mandatory to set up a new service that provides a specific application to a cloud consumer. Those details are described under provisioning and configuration management.
• Portability refers to the ability to work in more than one computing environment without major rework. Similarly, interoperability means the ability of a system to work with other systems.
Public Cloud: The public cloud is open to all; the infrastructure is owned and operated by a third-party provider and shared by multiple customers over the internet.
Private Cloud: The private cloud is dedicated to a single organization; the infrastructure may be hosted on-premises or by a third party, but it is not shared with other customers.
Community Cloud: The community cloud is shared by several organizations that have common concerns, such as mission, security or compliance requirements.
Advantages of the Community Cloud Model
• Cost Effective: It is cost-effective because the cloud is shared by multiple
organizations or communities.
• Security: The community cloud provides better security than the public cloud.
• Shared resources: It allows you to share resources, infrastructure, etc. with multiple
organizations.
• Collaboration and data sharing: It is suitable for both collaboration and data sharing.
Disadvantages of the Community Cloud Model
• Limited Scalability: Community cloud is relatively less scalable as many
organizations share the same resources according to their collaborative interests.
• Rigid in customization: As the data and resources are shared among different organizations according to their mutual interests, if one organization wants changes according to its needs, it cannot make them, because they would have an impact on the other organizations.
Multi-Cloud
We’re talking about employing multiple cloud providers at the same time under this
paradigm, as the name implies. It’s similar to the hybrid cloud deployment approach, which
combines public and private cloud resources. Instead of merging private and public clouds, multi-
cloud uses many public clouds. Although public cloud providers provide numerous tools to
improve the reliability of their services, mishaps still occur. It’s quite rare that two distinct clouds
would have an incident at the same moment. As a result, multi-cloud deployment improves the
high availability of your services even more.
The overall analysis of these models with respect to different factors is described below.
Factor: Scalability and Flexibility
  Public Cloud: High
  Private Cloud: High
  Community Cloud: Fixed
  Hybrid Cloud: High
UNIT 2
VIRTUALIZATION BASICS
PART A
2 Marks
There are many advantages to using cloud virtual machines instead of physical machines, including:
Low cost: It is cheaper to spin up a virtual machine in the cloud than to procure a physical machine.
Easy scalability: We can easily scale the infrastructure of a cloud virtual machine in or out based on load.
Ease of setup and maintenance: Spinning up virtual machines is very easy compared to buying actual hardware. This helps us get set up quickly.
Shared responsibility: Disaster recovery becomes the responsibility of the cloud provider. We don't need a separate disaster recovery site in case our primary site goes down.
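As a concrete illustration of spinning up a cloud virtual machine, the sketch below uses the AWS boto3 SDK. It assumes that AWS credentials are already configured, and the AMI ID shown is a placeholder, not a real image.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder machine image ID
    InstanceType="t2.micro",           # a small, low-cost instance type
    MinCount=1,
    MaxCount=1,
)

instance_id = response["Instances"][0]["InstanceId"]
print("Launched virtual machine:", instance_id)

# Scaling in is just as easy: terminate the instance when the load drops.
# ec2.terminate_instances(InstanceIds=[instance_id])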
4. List the Benefits of Virtualization?
More flexible and efficient allocation of resources.
Enhance development productivity.
It lowers the cost of IT infrastructure.
Remote access and rapid scalability.
High availability and disaster recovery.
Pay-per-use of the IT infrastructure on demand.
Enables running multiple operating systems.
In general, no; but as an advanced feature, we can allow file sharing between different virtual machines.
7. What are Types of Virtual Machines?
We can classify virtual machines into two types:
1. System Virtual Machine
2. Process Virtual Machine
8. What are Types of Virtualization?
1. Application Virtualization
2. Network Virtualization
3. Desktop Virtualization
4. Storage Virtualization
5. Server Virtualization
6. Data virtualization
9. What are the uses of Virtualization?
Data-integration
Business-integration
Service-oriented architecture data-services
Searching organizational data
10. What is mean by hypervisor?
A hypervisor, also known as a virtual machine monitor or VMM, is software that creates
and runs virtual machines (VMs). A hypervisor allows one host computer to support multiple guest
VMs by virtually sharing its resources, such as memory and processing.
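A short sketch of talking to a hypervisor through a management API is shown below. It assumes the libvirt-python package and a local KVM/QEMU libvirt daemon are available; on machines without them, the connection call will simply fail.

import libvirt

conn = libvirt.open("qemu:///system")    # connect to the VMM on this host
print("Hypervisor host:", conn.getHostname())

# The hypervisor shares the host's resources among the guest VMs it runs:
for dom in conn.listAllDomains():
    state, _ = dom.state()
    running = (state == libvirt.VIR_DOMAIN_RUNNING)
    print("Guest VM:", dom.name(), "running" if running else "not running")

conn.close()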
PART B
13 Marks
Virtualization is a technique for separating a service from the underlying physical delivery of that service. It is the process of creating a virtual version of something like computer hardware. It was
initially developed during the mainframe era. It involves using specialized software to create a
virtual or software-created version of a computing resource rather than the actual version of the
same resource. With the help of Virtualization, multiple operating systems and applications can run
on the same machine and its same hardware at the same time, increasing the utilization and
flexibility of hardware. In other words, one of the main cost-effective, hardware-reducing, and
energy-saving techniques used by cloud providers is Virtualization. Virtualization allows
sharing of a single physical instance of a resource or an application among multiple customers and
organizations at one time. It does this by assigning a logical name to physical storage and providing
a pointer to that physical resource on demand. The term virtualization is often synonymous with
hardware virtualization, which plays a fundamental role in efficiently delivering Infrastructure-as-
a-Service (IaaS) solutions for cloud computing. Moreover, virtualization technologies provide a
virtual environment for not only executing applications but also for storage, memory, and
networking.
Virtualization
• Host Machine: The machine on which the virtual machine is going to be built is known
as Host Machine.
• Guest Machine: The virtual machine is referred to as a Guest Machine.
Work of Virtualization in Cloud Computing
Virtualization has a prominent impact on cloud computing. In the case of cloud computing, users store data in the cloud, but with the help of virtualization, users have the extra benefit of sharing the infrastructure. Cloud vendors take care of the required physical resources, but these cloud providers charge a significant amount for these services, which impacts every user or organization. Virtualization helps users or organizations maintain the services required by a company through external (third-party) providers, which helps in reducing costs to the company. This is the way virtualization works in cloud computing.
Benefits of Virtualization
• More flexible and efficient allocation of resources.
• Enhance development productivity.
• It lowers the cost of IT infrastructure.
• Remote access and rapid scalability.
• High availability and disaster recovery.
• Pay-per-use of the IT infrastructure on demand.
• Enables running multiple operating systems.
Drawbacks of Virtualization
• High Initial Investment: Clouds have a very high initial investment, but it is also true
that it will help in reducing the cost of companies.
• Learning New Infrastructure: As the companies shifted from Servers to Cloud, it
requires highly skilled staff who have skills to work with the cloud easily, and for this,
you have to hire new staff or provide training to current staff.
• Risk of Data: Hosting data on third-party resources can put the data at risk; it has a higher chance of being attacked by a hacker or cracker.
Characteristics of Virtualization
• Increased Security: The ability to control the execution of a guest program in a
completely transparent manner opens new possibilities for delivering a secure, controlled
execution environment. All the operations of the guest programs are generally performed
against the virtual machine, which then translates and applies them to the host programs.
• Managed Execution: In particular, sharing, aggregation, emulation, and isolation are
the most relevant features.
• Sharing: Virtualization allows the creation of a separate computing environment
within the same host.
• Aggregation: It is possible to share physical resources among several guests, but
virtualization also allows aggregation, which is the opposite process.
Types of Virtualization
1. Application Virtualization
2. Network Virtualization
3. Desktop Virtualization
4. Storage Virtualization
5. Server Virtualization
6. Data virtualization
6. Data Virtualization: This is the kind of virtualization in which data is collected from various sources and managed in a single place, without the interested people, stakeholders and users needing to know technical details such as how the data is collected, stored and formatted; the data is arranged logically so that its virtual view can be accessed remotely through various cloud services. Many big companies provide data virtualization services, such as Oracle, IBM, AtScale, CData, etc.
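The idea of a single logical view over data from different sources can be sketched with pandas. The two in-memory tables below are stand-ins for, say, a SQL table and a CSV export; the column names and values are made up for illustration.

import pandas as pd

# Source 1: customer records (as if read from a relational database)
customers = pd.DataFrame({"cust_id": [1, 2], "name": ["Asha", "Ravi"]})

# Source 2: order records (as if read from a CSV export or a REST API)
orders = pd.DataFrame({"cust_id": [1, 1, 2], "amount": [250, 100, 400]})

def virtual_view() -> pd.DataFrame:
    # Consumers query this single view without knowing how or where the data is stored.
    totals = orders.groupby("cust_id", as_index=False)["amount"].sum()
    return customers.merge(totals, on="cust_id")

print(virtual_view())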
Uses of Virtualization
• Data-integration
• Business-integration
• Service-oriented architecture data-services
• Searching organizational data
S.NO 7
  Cloud Computing: The total cost of cloud computing is higher than virtualization.
  Virtualization: The total cost of virtualization is lower than cloud computing.
Hypervisor
• In this model, the guest is represented by the operating system, the host by the physical computer hardware, the virtual machine by its emulation, and the virtual machine manager by the hypervisor.
• Hardware-level virtualization is also called system virtualization, since it provides the ISA to virtual machines, which is the representation of the hardware interface of a system. This is to differentiate it from process virtual machines, which expose the ABI to virtual machines.
• Type I hypervisors take the place of the operating system and interact directly with the ISA interface exposed by the underlying hardware, and they emulate this interface in order to allow the management of guest operating systems. This type of hypervisor is also called a native virtual machine, since it runs natively on hardware.
• Type II hypervisors require the support of an operating system to provide virtualization services. This means that they are programs managed by the operating system, which interact with it through the ABI and emulate the ISA of virtual hardware for guest operating systems. This type of hypervisor is also called a hosted virtual machine, since it is hosted within an operating system.
4. What are the Taxonomy of virtual machines?
(Concept explanation:10 marks, Diagram:3 marks)
Virtualization techniques fall into two major categories: process-level techniques and system-level techniques. Process-level techniques are implemented on top of an existing operating system, which has full control of the hardware. System-level techniques are implemented directly on hardware and do not require, or require a minimum of, support from an existing operating system.
Within these two categories we can list various techniques that offer the guest a different type of virtual computation environment:
• Bare hardware
• Operating system resources
• Low-level programming language
• Application libraries
Therefore, execution virtualization can be implemented directly on top of the hardware, by the operating system, by an application, or by libraries (dynamically or statically) linked to an application image.
• Modern computing systems can be expressed in terms of the machine reference model, whose lowest layer, the Instruction Set Architecture (ISA), defines the hardware interface.
• The ISA is important to the operating system (OS) developer (System ISA) and to developers of applications that directly manage the underlying hardware (User ISA).
• The application binary interface (ABI) separates the operating system layer from the applications and libraries, which are managed by the OS. The ABI covers details such as low-level data types, alignment and call conventions, and defines a format for executable programs. System calls are defined at this level. This interface allows portability of applications and libraries across operating systems that implement the same ABI.
• The highest level of abstraction is represented by the application programming interface (API), which interfaces applications to libraries and the underlying operating system.
• For this purpose, the instruction set exposed by the hardware has been divided into different security classes that define who can operate with them. The first distinction can be made between privileged and non-privileged instructions.
• Non-privileged instructions are those instructions that can be used without interfering with other tasks, because they do not access shared resources. This category contains, for example, all the floating-point, fixed-point, and arithmetic instructions.
• Privileged instructions are those that are executed under specific restrictions and are mostly used for sensitive operations, which expose (behavior-sensitive) or modify (control-sensitive) the privileged state.
• Some types of architecture feature more than one class of privileged instructions and implement a finer control of how these instructions can be accessed. For instance, a possible implementation features a hierarchy of privileges, illustrated in Figure 2.2, in the form of ring-based security: Ring 0, Ring 1, Ring 2, and Ring 3; Ring 0 is the most privileged level and Ring 3 the least privileged level. Ring 0 is used by the kernel of the OS, Rings 1 and 2 are used by OS-level services, and Ring 3 is used by the user. Recent systems support only two levels, with Ring 0 for supervisor mode and Ring 3 for user mode.
All the current systems support at least two different execution modes: supervisor mode and user mode.
o The supervisor mode denotes an execution mode in which all the instructions (privileged and non-
privileged) can be executed without any restriction. This mode, also called master mode or kernel
mode, is generally used by the operating system (or the hypervisor) to perform sensitive operations
on hardware level resources.
o In user mode, there are restrictions on controlling machine-level resources. The distinction between user and supervisor mode allows us to understand the role of the hypervisor and why it is called that. Conceptually, the hypervisor runs above the supervisor mode, and from here the prefix "hyper" is used.
o In reality, hypervisors are run in supervisor mode, and the division between privileged and non-privileged instructions has posed challenges in designing virtual machine managers.
o The virtual machine represents an emulated environment in which the guest is executed.
o This level of indirection allows the virtual machine manager to control and filter the activity of the guest, thus preventing some harmful operations from being performed.
Managed execution
Virtualization of the execution environment not only allows increased security, but a wider range of features can also be implemented. In particular, sharing, aggregation, emulation, and isolation are the most relevant features.
Sharing
o Virtualization allows the creation of a separate computing environment within the same host. In this way it is possible to fully exploit the capabilities of a powerful guest, which would otherwise be underutilized.
Aggregation
o Not only is it possible to share physical resources among several guests, but virtualization also allows aggregation, which is the opposite process. A group of separate hosts can be tied together and represented to guests as a single virtual host.
Emulation
o Guest programs are executed within an environment that is controlled by the virtualization layer, which ultimately is a program. This allows for controlling and tuning the environment that is exposed to guests.
Isolation
o Virtualization allows providing guests, whether they are operating systems, applications, or other entities, with a completely separate environment in which they are executed. The guest program performs its activity by interacting with an abstraction layer, which provides access to the underlying resources.
o Benefits of isolation:
  First, it allows multiple guests to run on the same host without interfering with each other.
  Second, it provides a separation between the host and the guest.
Performance tuning
o This feature is a reality at present, given the considerable advances in hardware and software supporting virtualization. It becomes easier to control the performance of the guest by finely tuning the properties of the resources exposed through the virtual environment.
Portability
o The concept of portability applies in different ways according to the specific type of virtualization considered. In the case of a hardware virtualization solution, the guest is packaged into a virtual image that, in most cases, can be safely moved and executed on top of different virtual machines.
The virtualization layer is responsible for converting portions of the real hardware into virtual machines.
• Therefore, different operating systems such as Linux and Windows can run on the same physical machine simultaneously.
• Depending on the position of the virtualization layer, there are several classes of VM architectures, namely the hypervisor architecture, para-virtualization and host-based virtualization. The hypervisor is also known as the VMM (Virtual Machine Monitor); they both perform the same virtualization operations.
Hypervisor and Xen architecture
• The hypervisor supports hardware-level virtualization on bare-metal devices like CPU, memory, disk and network interfaces.
• The hypervisor software sits directly between the physical hardware and its OS. This virtualization layer is referred to as either the VMM or the hypervisor. The hypervisor provides hypercalls for the guest OSes and applications.
• Depending on the functionality, a hypervisor can assume a microkernel architecture, like Microsoft Hyper-V, or a monolithic hypervisor architecture, like the VMware ESX for server virtualization.
• A microkernel hypervisor includes only the basic and unchanging functions (such as physical memory management and processor scheduling). The device drivers and other changeable components are outside the hypervisor.
A monolithic hypervisor implements all the aforementioned functions, including those of the
device drivers. Therefore, the size of the hypervisor code of a micro-kernel hypervisor is smaller
than that of a monolithic hypervisor. Essentially, a hypervisor must be able to convert physical
devices into virtual resources dedicated for the deployed VM to use.
Xen architecture
• Xen is an open source hypervisor program developed by Cambridge University. • Xen is a microkernel
hypervisor, which separates the policy from the mechanism.
• The Xen hypervisor implements all the mechanisms, leaving the policy to be handled by Domain
0. Figure 2.4 shows architecture of Xen hypervisor.
Xen does not include any device drivers natively. It just provides a mechanism by which a guest OS
can have direct access to the physical devices.
• Xen provides a virtual environment located between the hardware and the OS.
The core components of a Xen system are the hypervisor, kernel, and applications.
• Like other virtualization systems, many guest OSes can run on top of the hypervisor.
• However, not all guest OSes are created equal, and one in particular controls the others.
• The guest OS which has control ability is called Domain 0, and the others are called Domain U.
• Domain 0 is a privileged guest OS of Xen. It is first loaded when Xen boots, without any file system drivers being available. Domain 0 is designed to access hardware directly and manage devices. Therefore, one of the responsibilities of Domain 0 is to allocate and map hardware resources for the guest domains (the Domain U domains).
• For example, Xen is based on Linux and its security level is C2. Its management VM is named Domain 0, which has the privilege to manage other VMs implemented on the same host.
• If Domain 0 is compromised, the hacker can control the entire system. So, in the VM system, security policies are needed to improve the security of Domain 0.
• Domain 0, behaving as a VMM, allows users to create, copy, save, read, modify, share, migrate and roll back VMs as easily as manipulating a file, which flexibly provides tremendous benefits for users.
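Domain 0's management role can be illustrated with the Xen xl toolstack, which is normally run from Domain 0 itself. The sketch below assumes a Xen host with xl installed; the guest configuration path shown is a hypothetical example.

import subprocess

def list_domains() -> str:
    # Ask the Xen toolstack for all domains; Domain-0 is always listed first.
    result = subprocess.run(["xl", "list"], capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(list_domains())
    # Other Domain 0 operations follow the same pattern, for example:
    # subprocess.run(["xl", "create", "/etc/xen/guest.cfg"], check=True)   # start a guest
    # subprocess.run(["xl", "shutdown", "guest"], check=True)              # stop a guest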
Full virtualization
• Full virtualization does not need to modify the host OS. It relies on binary translation to trap and virtualize the execution of certain sensitive, non-virtualizable instructions. The guest OSes and their applications consist of noncritical and critical instructions.
• With full virtualization, noncritical instructions run on the hardware directly, while critical instructions are discovered and replaced with traps into the VMM to be emulated by software. Both the hypervisor and VMM approaches are considered full virtualization.
• The VMM scans the instruction stream and identifies the privileged, control-sensitive and behavior-sensitive instructions. When these instructions are identified, they are trapped into the VMM, which emulates the behavior of these instructions. The method used in this emulation is called binary translation. Full virtualization therefore combines binary translation and direct execution.
Host-based virtualization
• In a host-based system, both a host OS and a guest OS are used, and a virtualization software layer is built between the host OS and the guest OS.
• The guest OSes are installed and run on top of the virtualization layer. Dedicated applications may run on the VMs. Certainly, some other applications can also run with the host OS directly.
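The trap-and-emulate behaviour described above can be illustrated with a toy simulation. The "instructions" below are made-up strings; the sketch only mimics the control flow in which noncritical instructions run directly while critical ones trap into the VMM.

CRITICAL = {"IO_WRITE", "SET_PAGE_TABLE", "HALT"}    # made-up "sensitive" instructions

def vmm_emulate(instr: str, guest_state: dict) -> None:
    # The VMM emulates the trapped instruction against the guest's virtual state.
    guest_state.setdefault("emulated", []).append(instr)

def run_guest(instructions, guest_state):
    for instr in instructions:
        if instr in CRITICAL:
            vmm_emulate(instr, guest_state)                        # trap into the VMM
        else:
            guest_state.setdefault("direct", []).append(instr)     # direct execution

state = {}
run_guest(["ADD", "LOAD", "IO_WRITE", "MUL", "HALT"], state)
print("Ran directly on hardware:", state["direct"])
print("Trapped and emulated by the VMM:", state["emulated"])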
• The host-based architecture has some distinct advantages, as enumerated next. First, the user can install this VM architecture without modifying the host OS. Second, the host-based approach appeals to many host machine configurations.
Para-virtualization
• When the x86 processor is virtualized, a virtualization layer is inserted between the hardware and the OS.
• According to the x86 ring definitions, the virtualization layer should also be installed at Ring 0. Different instructions at Ring 0 may cause some problems. Although para-virtualization reduces the overhead, it has incurred other problems.
• First, its compatibility and portability may be in doubt, because it must support the unmodified OS as well. Second, the cost of maintaining para-virtualized OSes is high, because they may require deep OS kernel modifications.
• Finally, the performance advantage of para-virtualization varies greatly due to workload variations.
Compared with full virtualization, para virtualization is relatively easy and more practical. The main
problem in full virtualization is its low performance in binary translation. ●◆ KVM is a Linux para
virtualization system. It is a part of the Linux version 2.6.20 kernel. ◆● In KVM, Memory management
and scheduling activities are carried out by the existing Linux kernel.
● ◆ The KVM does the rest, which makes it simpler than the hypervisor that controls the entire
machine. KVM is a hardware assisted and para virtualization tool, which improves performance and supports
unmodified guest OSes such as Windows, Linux, Solaris, and other UNIX variants.
Unlike the full virtualization architecture which intercepts and emulates privileged and sensitive
instructions at runtime, para virtualization handles these instructions at compile time. The guest OS
kernel is modified to replace the privileged and sensitive instructions with hyper calls to the hypervisor
● ◆ The guest OS running in a guest domain may run at Ring 1 instead of at Ring 0. This implies that
the guest OS may not be able to execute some privileged and sensitive instructions. The privileged
instructions are implemented by hyper calls to the hypervisor.
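By contrast, the short hypothetical Python sketch below illustrates the paravirtualization idea described above: the guest kernel source is modified so that a privileged operation is issued as an explicit hypercall to the hypervisor (all class and call names are invented for illustration).

    # Hypothetical sketch of a paravirtualized guest issuing hypercalls.

    class Hypervisor:
        def hypercall(self, name, **args):
            # The hypervisor validates and performs the privileged operation.
            print(f"hypercall {name} handled by hypervisor with {args}")

    class ParavirtGuestKernel:
        """Guest kernel whose privileged instructions were replaced by hypercalls."""
        def __init__(self, hypervisor):
            self.hv = hypervisor

        def set_page_table(self, base_addr):
            # Instead of writing the page-table register directly (a privileged
            # instruction), the modified kernel asks the hypervisor to do it.
            self.hv.hypercall("update_page_table", base=base_addr)

    if __name__ == "__main__":
        ParavirtGuestKernel(Hypervisor()).set_page_table(0x1000)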
7. What are the Types of Virtualization?
(Definition:3 marks, Concept explanation:10 marks)
1. Full virtualization
● ◆ Full virtualization refers to the ability to run a program, most likely an operating system,
directly on top of a virtual machine and without any modification, as though it were run on the raw
hardware. To make this possible, virtual machine managers are required to provide a complete
emulation of the entire underlying hardware. ◆● The principal advantages of full virtualization are complete
isolation (and hence security), ease of emulation of different architectures, and coexistence of different
systems on the same platform.
● ◆ Whereas it is a desired goal for many virtualization solutions, full virtualization poses
important concerns related to performance and technical implementation.
● ◆ A key challenge is the interception of privileged instructions such as I/O instructions: Since they
change the state of the resources exposed by the host, they have to be contained within the virtual machine
manager.
● ◆ A simple solution to achieve full virtualization is to provide a virtual environment for all the
instructions, thus posing some limits on performance.
2. Para virtualization
• Para virtualization is a non-transparent virtualization solution that allows implementing thin virtual
machine managers.
● ◆ Para virtualization techniques expose a software interface to the virtual machine that is slightly
modified from the host and, as a consequence, guests need to be modified.
● ◆ The aim of para virtualization is to provide the capability to demand the execution of performance
critical operations directly on the host, thus preventing performance losses that would otherwise be experienced
in managed execution. This allows a simpler implementation of virtual machine managers that have to simply
transfer the execution of these operations, which were hard to virtualize, directly to the host.
To take advantage of such an opportunity, guest operating systems need to be modified and
explicitly ported by remapping the performance critical operations through the virtual machine
software interface.
This is possible when the source code of the operating system is available, and this is the reason that
Para virtualization was mostly explored in the open source and academic environment.
This technique has been successfully used by Xen for providing virtualization solutions for Linux- based
operating systems specifically.
• Operating systems that cannot be ported can still take advantage of para virtualization by using ad
hoc device drivers that remap the execution of critical instructions to the Para virtualization APIs
exposed by the hypervisor. Xen provides this solution for running Windows based operating systems
on x86 architectures.
● ◆ Other solutions using Para virtualization include VMWare, Parallels, and some solutions for
embedded and real-time environments such as TRANGO, Wind River, and XtratuM.
3. Hardware assisted virtualization
● ◆ This technique was originally introduced in the IBM System/370. At present, examples of
hardware assisted virtualization are the extensions to the x86 architecture introduced with Intel VT
(formerly known as Vanderpool) and AMD-V (formerly known as Pacifica). These extensions,
which differ between the two vendors, are meant to reduce the performance penalties experienced
by emulating x86 hardware with hypervisors.
● ◆ The reason for this is that by design the x86 architecture did not meet the formal requirements
introduced by Popek and Goldberg and early products were using binary translation to trap some
sensitive instructions and provide an emulated version. • Products such as VMware Virtual Platform,
introduced in 1999 by
VMware, which pioneered the field of x86 virtualization, were based on this technique.
After 2006, Intel and AMD introduced processor extensions, and a wide range of virtualization
solutions took advantage of them: Kernel-based Virtual Machine (KVM), VirtualBox, Xen,
VMware, Hyper-V, Sun xVM, Parallels, and others.
4. Partial virtualization
● ◆ Partial virtualization provides a partial emulation of the underlying hardware, thus not allowing
the complete execution of the guest operating system in complete isolation.
● ◆ Partial virtualization allows many applications to run transparently, but not all the features of
the operating system can be supported as happens with full virtualization. An example of partial
virtualization is address space virtualization used in time sharing systems; this allows multiple
applications and users to run concurrently in a separate memory space, but they still share the same
hardware resources (disk, processor, and network).
PART C
15 marks
A traditional computer runs with a host operating system specially tailored for its hardware
architecture, as shown in Figure 3.1(a). After virtualization, different user applications managed
by their own operating systems (guest OS) can run on the same hardware, independent of the host
OS. This is often done by adding additional software, called a virtualization layer as shown in
Figure 3.1(b). This virtualization layer is known as hypervisor or virtual machine monitor (VMM)
[54]. The VMs are shown in the upper boxes, where applications run with their own guest OS over
the virtualized CPU, memory, and I/O resources.
The main function of the software layer for virtualization is to virtualize the physical hardware of
a host machine into virtual resources to be used by the VMs, exclusively. This can be implemented
at various operational levels, as we will discuss shortly. The virtualization software creates the
abstraction of VMs by interposing a virtualization layer at various levels of a
computer system. Common virtualization layers include the instruction set architecture (ISA)
level, hardware level, operating system level, library support level, and application level (see
Figure 3.2).
1.1 Instruction Set Architecture Level
At the ISA level, virtualization is performed by emulating a given ISA by the ISA of the host
machine. For example, MIPS binary code can run on an x86-based host machine with the help of
ISA emulation. With this approach, it is possible to run a large amount of legacy binary code written
for various processors on any given new hardware host machine. Instruction set emulation
leads to virtual ISAs created on any hardware machine.
The basic emulation method is through code interpretation. An interpreter program interprets
the source instructions to target instructions one by one. One source instruction may require tens
or hundreds of native target instructions to perform its function. Obviously, this process is
relatively slow. For better performance, dynamic binary translation is desired. This approach
translates basic blocks of dynamic source instructions to target instructions. The basic blocks can
also be extended to program traces or super blocks to increase translation efficiency. Instruction
set emulation requires binary translation and optimization. A virtual instruction set architecture
(V-ISA) thus requires adding a processor-specific software translation layer to the compiler.
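A minimal Python sketch of code interpretation at the ISA level is given below; the three-instruction "source ISA" and the register names are invented purely to illustrate the one-instruction-at-a-time emulation that dynamic binary translation speeds up.

    # Toy interpreter: emulates a tiny, made-up source ISA on the host, one
    # instruction at a time (the slow baseline that binary translation improves on).

    def interpret(program):
        regs = {"R0": 0, "R1": 0}
        for op, *args in program:
            if op == "LOAD":          # LOAD reg, imm
                regs[args[0]] = args[1]
            elif op == "ADD":         # ADD dst, src
                regs[args[0]] += regs[args[1]]
            elif op == "PRINT":       # PRINT reg
                print(regs[args[0]])
            else:
                raise ValueError(f"unknown instruction {op}")
        return regs

    if __name__ == "__main__":
        interpret([("LOAD", "R0", 2), ("LOAD", "R1", 3),
                   ("ADD", "R0", "R1"), ("PRINT", "R0")])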
1.2 Hardware Abstraction Level
Hardware-level virtualization is performed right on top of the bare hardware. On the
one hand, this approach generates a virtual hardware environment for a VM. On the
other hand, the process manages the underlying hardware through virtualization. The
idea is to virtualize a computer’s resources, such as its processors, memory, and I/O
devices. The intention is to upgrade the hardware utilization rate by multiple users
concurrently. The idea was implemented in the IBM VM/370 in the 1960s. More
recently, the Xen hypervisor has been applied to virtualize x86- based machines to
run Linux or other guest OS applications. We will discuss hardware virtualization
approaches in more detail in Section 3.3.
1.3 Operating System Level
This refers to an abstraction layer between the traditional OS and user applications. OS-level
virtualization creates isolated containers on a single physical server and the OS
instances to utilize the hardware and software in data centers. The containers behave
like real servers. OS-level virtualization is commonly used in creating virtual hosting
environments to allocate hardware resources among a large number of mutually
distrusting users. It is also used, to a lesser extent, in consolidating server hardware
by moving services on separate hosts into containers or VMs on one server. OS-level
virtualization is depicted in Section 3.1.3.
1.4 Library Support Level
Most applications use APIs exported by user-level libraries rather than lengthy system calls to the OS. Since most systems
provide well-documented APIs, such an interface becomes another candidate for
virtualization. Virtualization with library interfaces is possible by controlling the
communication link between applications and the rest of a system through API hooks.
The software tool WINE has implemented this approach to support Windows
applications on top of UNIX hosts. Another example is the vCUDA which allows
applications executing within VMs to leverage GPU hardware acceleration. This
approach is detailed in Section 3.1.4.
1.5 User-Application Level
Virtualization at the application level virtualizes an application as a VM. On a
traditional OS, an application often runs as a process. Therefore, application-level
virtualization is also known as process-level virtualization. The most popular
approach is to deploy high level language (HLL)
VMs. In this scenario, the virtualization layer sits as an application program on top
of the operating system, and the layer exports an abstraction of a VM that can run
programs written and compiled to a particular abstract machine definition. Any
program written in the HLL and compiled for this VM will be able to run on it. The
Microsoft .NET CLR and Java Virtual Machine (JVM) are two good examples of this
class of VM.
Other forms of application-level virtualization are known as application isolation,
application sandboxing, or application streaming. The process involves wrapping the
application in a layer that is isolated from the host OS and other applications.
2. Explain in detail about virtualization of CPU, memory and I/O devices?
(Definition:2 marks, Diagram:5 marks, Concept Explanation:8 marks)
To support virtualization, processors such as the x86 employ a special running
mode and instructions, known as hardware-assisted virtualization. In this way, the
VMM and guest OS run in different modes and all sensitive instructions of the guest
OS and its applications are trapped in the VMM. To save processor states, mode
switching is completed by hardware. For the x86 architecture, Intel and AMD have
proprietary technologies for hardware-assisted virtualization.
1. Hardware Support for Virtualization
Modern operating systems and processors permit multiple processes to run
simultaneously. If there were no protection mechanism in a processor, all instructions
from different processes would access the hardware directly and could cause a system crash.
Therefore, all processors have at least two modes, user mode and supervisor mode,
to ensure controlled access of critical hardware.
Instructions running in supervisor mode are called privileged instructions. Other
instructions are unprivileged instructions. In a virtualized environment, it is more
difficult to make OSes and
applications run correctly because there are more layers in the machine stack.
Example 3.4 discusses Intel’s hardware support approach.
At the time of this writing, many hardware virtualization products were available.
The VMware Workstation is a VM software suite for x86 and x86-64 computers.
This software suite allows users to set up multiple x86 and x86-64 virtual computers
and to use one or more of these VMs simultaneously with the host operating system.
The VMware Workstation assumes the host-based virtualization. Xen is a hypervisor
for use in IA-32, x86-64, Itanium, and PowerPC 970 hosts. Actually, Xen modifies
Linux as the lowest and most privileged layer, or a hypervisor.
One or more guest OS can run on top of the hypervisor. KVM (Kernel-based
Virtual Machine) is a Linux kernel virtualization infrastructure. KVM can support
hardware-assisted virtualization and Para virtualization by using the Intel VT-x or
AMD-v and VirtIO framework, respectively. The VirtIO framework includes a Para
virtual Ethernet card, a disk I/O controller, a balloon device for adjusting guest
memory usage, and a VGA graphics interface using VMware drivers.
Example 3.4 Hardware Support for Virtualization in the Intel x86 Processor
Although x86 processors are not primarily designed to be virtualizable, great effort has been taken to
virtualize them. They are widely used and, compared with RISC processors, the bulk of
x86-based legacy systems cannot be discarded easily. Virtualization of x86 processors is
detailed in the following sections. Intel’s VT-x technology is an example of
hardware-assisted virtualization, as shown in Figure
3.11. Intel calls the privilege level of x86 processors the VMX Root Mode. In order
to control the start and stop of a VM and allocate a memory page to maintain the
CPU state for VMs, a set of additional instructions is added. At the time of this
writing, Xen, VMware, and the Microsoft Virtual PC all implement their hypervisors
by using the VT-x technology.
Generally, hardware-assisted virtualization should have high efficiency. However,
since the transition from the hypervisor to the guest OS incurs high overhead switches
between processor modes, it sometimes cannot outperform binary translation. Hence,
virtualization systems such as VMware now use a hybrid approach, in which a few
tasks are offloaded to the hardware but the rest is still done in software. In addition,
para-virtualization and hardware-assisted virtualization can be combined to improve
the performance further.
3. Memory Virtualization
In a virtualized environment, a two-stage mapping process must be maintained by the guest OS and
the VMM, respectively: virtual memory to physical memory and physical memory
to machine memory. Furthermore, MMU virtualization should be supported, which
is transparent to the guest OS. The guest OS continues to control the mapping of
virtual addresses to the physical memory addresses of VMs. But the guest OS cannot
directly access the actual machine memory. The VMM is responsible for mapping
the guest physical memory to the actual machine memory. Figure 3.12 shows the
two-level memory mapping procedure.
Since each page table of the guest OSes has a separate page table in the VMM
corresponding to it, the VMM page table is called the shadow page table. Nested page
tables add another layer of indirection to virtual memory. The MMU already handles
virtual-to-physical translations as defined by the OS. Then the physical memory
addresses are translated to machine addresses using another set of page tables defined
by the hypervisor. Since modern operating systems maintain a set of page tables for
every process, the shadow page tables will get flooded. Consequently, the
performance overhead and cost of memory will be very high.
VMware uses shadow page tables to perform virtual-memory-to-machine-memory
address translation. Processors use TLB hardware to map the virtual memory
directly to the machine
memory to avoid the two levels of translation on every access. When the guest OS
changes the virtual memory to a physical memory mapping, the VMM updates the
shadow page tables to enable a direct lookup. The AMD Barcelona processor has
featured hardware-assisted memory virtualization since 2007. It provides hardware
assistance to the two-stage address translation in a virtual execution environment by
using a technology called nested paging.
If the required entry in the guest page table is a page fault, the CPU will generate a page fault interrupt and will let the
guest OS kernel handle the interrupt. When the GPA of the L3 page table is obtained,
the CPU will look for the EPT to get the HPA of the L3 page table, as described
earlier. To get the HPA corresponding to a GVA, the CPU needs to look for the EPT five times, and each time, the memory
needs to be accessed four times. Therefore, there are 20 memory accesses in the
worst case, which is still very slow. To overcome this shortcoming, Intel increased
the size of the EPT TLB to decrease the number of memory accesses.
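The two-stage mapping can be illustrated with a simplified Python sketch; real MMUs walk multi-level page tables, so the flat dictionaries and the addresses below are assumptions made only for clarity.

    # Simplified sketch of two-stage memory mapping in a virtualized system.
    # Real MMUs use multi-level page tables; flat dictionaries are used here for clarity.

    guest_page_table = {0x1000: 0x4000}   # GVA page -> GPA page (maintained by guest OS)
    ept = {0x4000: 0x9000}                # GPA page -> HPA page (maintained by VMM/hypervisor)

    PAGE_MASK = 0xFFF

    def translate(gva):
        gpa_page = guest_page_table[gva & ~PAGE_MASK]   # stage 1: guest virtual -> guest physical
        hpa_page = ept[gpa_page]                        # stage 2: guest physical -> host physical
        return hpa_page | (gva & PAGE_MASK)             # keep the page offset

    if __name__ == "__main__":
        print(hex(translate(0x1ABC)))   # -> 0x9abc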
4. I/O Virtualization
I/O virtualization involves managing the routing of I/O requests between virtual
devices and the shared physical hardware. At the time of this writing, there are three
ways to implement I/O virtualization: full device emulation, para-virtualization, and
direct I/O. Full device emulation is the first approach for I/O virtualization.
Generally, this approach emulates well-known, real-world devices.
All the functions of a device or bus infrastructure, such as device enumeration,
identification, interrupts, and DMA, are replicated in software. This software is
located in the VMM and acts as a virtual device. The I/O access requests of the guest
OS are trapped in the VMM which interacts with the I/O devices. The full device
emulation approach is shown in Figure 3.14.
A single hardware device can be shared by multiple VMs that run concurrently.
However, software emulation runs much slower than the hardware it emulates. The
para- virtualization method of I/O virtualization is typically used in Xen. It is also
known as the split driver model consisting of a frontend driver and a backend driver.
The frontend driver is running in Domain U and the backend driver is running in
Domain 0. They interact with each other via a block of shared memory. The frontend
driver manages the I/O requests of the guest OSes and the backend driver is
responsible for managing the real I/O devices and multiplexing the I/O data of
different VMs. Although para-I/O-virtualization achieves better device performance
than full device emulation, it comes with a higher CPU overhead.
Direct I/O virtualization lets the VM access devices directly. It can achieve close-to-native
performance without high CPU costs. However, current direct I/O
virtualization implementations focus on networking for mainframes. There are a lot
of challenges for commodity hardware devices. For example, when a physical device
is reclaimed (required by workload migration) for later reassignment, it may have
been set to an arbitrary state (e.g., DMA to some arbitrary memory locations) that can
cause it to function incorrectly or even crash the whole system. Since software-based I/O
virtualization requires a very high overhead of device emulation, hardware-assisted
I/O virtualization is critical. Intel VT-d supports the remapping of I/O DMA transfers
and device- generated interrupts. The architecture of VT-d provides the flexibility to
support multiple usage models that may run unmodified, special-purpose, or
“virtualization-aware” guest OSes.
Another way to help I/O virtualization is via self-virtualized I/O (SV-IO) [47]. The
key idea of SV-IO is to harness the rich resources of a multicore processor. All tasks
associated with virtualizing an I/O device are encapsulated in SV-IO. It provides
virtual devices and an associated access API to VMs and a management API to the
VMM. SV-IO defines one virtual interface (VIF) for every kind of virtualized I/O
device, such as virtual network interfaces, virtual block devices (disk), virtual camera
devices, and others. The guest OS interacts with the VIFs via VIF device drivers.
Each VIF consists of two message queues. One is for outgoing messages to the
devices and the other is for incoming messages from the devices. In addition, each
VIF has a unique ID for identifying it in SV-IO.
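A toy Python sketch of the split-driver/VIF idea is shown below; Python queues stand in for the shared-memory rings used by real Xen, and all names are illustrative only.

    # Sketch of the split-driver idea: a frontend driver in Domain U places I/O
    # requests on a shared queue, and a backend driver in Domain 0 services them.

    from queue import Queue

    class VirtualInterface:
        """A VIF with one outgoing and one incoming message queue, plus a unique ID."""
        def __init__(self, vif_id):
            self.vif_id = vif_id
            self.outgoing = Queue()   # guest -> device
            self.incoming = Queue()   # device -> guest

    def frontend_write(vif, data):
        vif.outgoing.put(("write", data))          # Domain U side

    def backend_service(vif, physical_device):
        op, data = vif.outgoing.get()              # Domain 0 side multiplexes real I/O
        physical_device.append(data)
        vif.incoming.put(("ack", len(data)))

    if __name__ == "__main__":
        disk = []
        vif = VirtualInterface(vif_id=1)
        frontend_write(vif, b"block-0 payload")
        backend_service(vif, disk)
        print(vif.incoming.get(), disk)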
● ◆ The hypervisor supports hardware level virtualization on bare metal devices like
CPU, memory, disk and network interfaces.
◆ The hypervisor software sits directly between the physical hardware and its OS. This
virtualization layer is referred to as either the VMM or the hypervisor.
The hypervisor provides hyper calls for the guest OSes and applications. Depending on
the functionality, a hypervisor can assume microkernel architecture like the Microsoft
Hyper-V.
◆ It can assume monolithic hypervisor architecture like the VMware ESX for server
virtualization.
● ◆ A micro kernel hypervisor includes only the basic and unchanging functions (such
as physical memory management and processor scheduling).
● ◆ The device drivers and other changeable components are outside the hypervisor.
A monolithic hypervisor implements all the aforementioned functions, including those of the
device drivers. Therefore, the size of the hypervisor code of a microkernel hypervisor
is smaller than that of a monolithic hypervisor.
Essentially, a hypervisor must be able to convert physical devices into virtual resources dedicated for the deployed VM to use.
Xen architecture
• The Xen hypervisor implements all the mechanisms, leaving the policy to be handled
by Domain 0. Figure 2.4 shows the architecture of the Xen hypervisor.
Xen does not include any device drivers natively. It just provides a mechanism by
which a guest OS can have direct access to the physical devices.
● ◆ As a result, the size of the Xen hypervisor is kept rather small.
• Xen provides a virtual environment located between the hardware and the OS.
The core components of a Xen system are the hypervisor, kernel, and applications. ◆● The
organization of the three components is important.
● ◆ Like other virtualization systems, many guest OSes can run on top of the hypervisor.
● ◆ However, not all guest OSes are created equal, and one in particular controls the
others.
● ◆ The guest OS, which has control ability, is called Domain 0, and the others are called
Domain U.
● ◆ Domain 0 is a privileged guest OS of Xen. It is first loaded when Xen boots without
any file system drivers being available. Domain 0 is designed to access hardware directly and
manage devices. Therefore, one of the responsibilities of Domain 0 is to allocate and map hardware
resources for the guest domains (the Domain U domains).
● ◆ For example, Xen is based on Linux and its security level is C2. Its management VM
is named Domain 0 which has the privilege to manage other VMs implemented on the
same host.
● ◆ If Domain 0 is compromised, the hacker can control the entire system. So, in the VM
system, security policies are needed to improve the security of Domain 0.
● ◆ Domain 0, behaving as a VMM, allows users to create, copy, save, read, modify,
share, migrate and roll back VMs as easily as manipulating a file, which flexibly provides
tremendous benefits for users.
UNIT III
VIRTUALIZATION INFRASTRUCTURE AND DOCKER
SYLLABUS: Desktop Virtualization – Network Virtualization – Storage
Virtualization – System-level of Operating Virtualization – Application Virtualization
– Virtual clusters and Resource Management – Containers vs. Virtual Machines –
Introduction to Docker – Docker Components – Docker Container – Docker Images
and Repositories.
PART A
2 Marks
The guest can share the same network interface of the host and use Network Address
Translation (NAT) to access the network; the virtual machine manager can emulate,
and install on the host, an additional network device, together with the driver; or the guest
can have a private network only with the host.
3. What is Hardware-level virtualization?
Hardware-level virtualization is performed right on top of the bare hardware. It generates a virtual
hardware environment for a VM and manages the underlying hardware through virtualization, so that
a computer's resources such as processors, memory, and I/O devices can be shared by multiple users.
4. Define hypervisor?
A hypervisor is a software layer (possibly assisted by hardware) that allows the abstraction of the
underlying physical hardware. The hypervisor, also called the virtual machine manager (VMM), is the
fundamental element of hardware virtualization.
There are different techniques for storage virtualization, one of the most popular being
network based virtualization by means of storage area networks (SANs). SANs use a
network accessible device through a large bandwidth connection to provide storage
facilities.
The main aspects of VM migration are memory migration, file system migration, and network migration.
Containers and virtual machines are two types of virtualization technologies that share
many similarities.
Virtualization is a process that allows a single resource, such as RAM, CPU, Disk, or
Networking, to be virtualized and represented as multiple resources.
However, the main difference between containers and virtual machines is that virtual
machines virtualize the entire machine, including the hardware layer, while containers
only virtualize software layers above the operating system level.
Bridge: This is the default network driver and is suitable for different containers that
need to communicate with the same Docker host.
Host: This network is used when there is no need for isolation between the container
and the host.
Overlay: This network allows swarm services to communicate with each other.
None: This network disables all networking.
Macvlan: This assigns a Media Access Control (MAC) address to containers, which
makes each container appear as a physical device on the network.
15. What is the purpose of Docker Hub?
The Docker Hub is a cloud-based repository service where users can push their Docker
Container Images and access them from anywhere via the internet. It offers the option
to push images as private or public and is primarily used by DevOps teams.
The Docker Hub can be used from any operating system. It
functions as a storage system for Docker images and allows users to pull the required
images when needed.
PART B
13 Marks
1. The guest can share the same network interface of the host and use Network
Address Translation (NAT) to access the network; The virtual machine manager
can emulate, and install on the host, an additional network device, together with
the driver.
2. The guest can have a private network only with the guest.
There are different techniques for storage virtualization, one of the most popular
being network based virtualization by means of storage area networks (SANs).
● Operating system level virtualization offers the opportunity to create different and
separated execution environments for applications that are managed concurrently.
• Wine is a software application allowing Unix-like operating systems to execute programs written
for the Microsoft Windows platform. Wine features a software application acting as a container for the
guest application and a set of libraries, called Winelib, that developers can use to
compile applications to be ported on Unix systems. ◆● Wine takes its inspiration from
a similar product from Sun, Windows Application Binary Interface (WABI), which
implements the Win 16 API specifications on Solaris.
• A similar solution for the Mac OS X environment is CrossOver, which allows
running Windows applications directly on the Mac OS X operating system.
● ◆ VMware ThinApp, another product in this area, allows capturing the setup of
an installed application and packaging it into an executable image isolated from
the hosting operating system.
Virtual clusters are built using virtual machines installed across one or more physical
clusters, logically interconnected by a virtual network across several physical networks.
They can also be replicated in multiple servers to promote distributed parallelism, fault
tolerance, and disaster recovery.
Virtual cluster sizes can grow or shrink dynamically, similar to overlay networks in
peer-to-peer networks.
Physical node failures may disable some virtual machines, but virtual machine failures
will not affect the host system
The system should be capable of quick deployment, which involves creating and
distributing software stacks (including the OS, libraries, and applications) to physical
nodes within clusters, as well as rapidly
switching runtime environments between virtual clusters for different users.
When a user is finished using their system, the corresponding virtual cluster should be
quickly shut down or suspended to free up resources for other users. The concept of
"green computing" has gained attention recently, which focuses on reducing energy
costs by applying energy-efficient techniques across clusters of homogeneous
workstations and specific applications. Live migration of VMs allows workloads to be
transferred from one node to another; designing migration strategies for green
computing without compromising cluster performance is a challenge.
Virtualization also enables load balancing of applications within a virtual cluster using
the load index and user login frequency.
This load balancing can be used to implement an automatic scale-up and scale-down
mechanism for the virtual cluster.
Virtual clustering provides a flexible solution for building clusters consisting of both
physical and virtual machines.
It is widely used in various computing systems such as cloud platforms, high-
performance computing systems, and computational grids.
Virtual clustering enables the rapid deployment of resources upon user demand or in
response to node failures. There are four different ways to manage virtual clusters,
including having the cluster manager reside on the guest or host systems, using
independent cluster managers, or an integrated cluster manager designed to distinguish
between virtualized and physical resources.
A VM can be in one of four states, including an inactive state, an active state, a paused
state, and a suspended state. The live migration of VMs allows for VMs to be moved
from one physical machine to another.
In the event of a VM failure, another VM running with the same guest OS can replace
it on a different node. During migration, the VM state file is copied from the storage
area to the host machine
The main aspects of VM migration are memory migration, file system migration, and network migration.
One crucial aspect of VM migration is memory migration, which involves moving the
memory instance of a VM from one physical host to another.
The efficiency of this process depends on the characteristics of the application/workloads
supported by the guest OS. In today's systems, memory migration can range from a few
hundred megabytes to several gigabytes.
The Internet Suspend-Resume (ISR) technique takes advantage of temporal locality,
where memory states are likely to have significant overlap between the suspended and
resumed instances of a VM. The ISR technique represents each file in the file system
as a tree of sub files, with a copy existing in both the suspended and resumed VM
instances.
By caching only the changed files, this approach minimizes transmission overhead.
However, the ISR technique is not suitable for situations where live machine
migration is necessary, as it results in high downtime compared to other techniques.
For a system to support VM migration, it must ensure that each VM has a consistent
and location-independent view of the file system that is available on all hosts.
One possible approach is to assign each VM with its own virtual disk and map the file
system to it.
However, due to the increasing capacity of disks, it's not feasible to transfer the entire
contents of a disk over a network during migration.
Another alternative is to implement a global file system that is accessible across all
machines, where a VM can be located without the need to copy files between machines.
To ensure remote systems can locate and communicate with the migrated VM, it must
be assigned a virtual IP address that is known to other entities.
This virtual IP address can be distinct from the IP address of the host machine where the VM is currently located.
Additionally, each VM can have its own virtual MAC address, and the VMM maintains
a mapping of these virtual IP and MAC addresses to their corresponding VMs in an
ARP table.
At Duke University, COD (Cluster-on-Demand) was developed to enable dynamic resource allocation with a
virtual cluster management system, and at Purdue University, the VIOLIN cluster was
constructed to demonstrate the benefits of dynamic adaptation using multiple VM
clustering.
Containers operate in isolation from each other and come bundled with their own
software, libraries, and configuration files. They can communicate with each other
through well-defined channels.
Unlike virtual machines, all containers share a single operating system kernel, which
results in lower resource consumption
Key Terminologies
A Docker Image is a file containing multiple layers of instructions used to create
and run a Docker container. It provides a portable and reproducible way to package
and distribute applications.
A Docker Container is a lightweight and isolated runtime environment created from an
image. It encapsulates an application and its dependencies, providing a consistent and
predictable environment for running the application.
A Dockerfile is a text file containing the instructions used to build a Docker Image.
It defines the base image, application code, dependencies, and configuration needed
to create a custom Docker Image.
Docker Engine is the software that enables the creation and management of Docker
containers. It consists of three main components: a server (the Docker daemon), a REST
API, and a command line interface (CLI) client.
Docker Hub is a cloud-based registry that provides a centralized platform for storing,
sharing, and discovering Docker Images. It offers a vast collection of pre-built Docker
Images that developers can use to build, test, and deploy their applications.
Features of Docker
Open-source platform
An easy, lightweight, and consistent way of delivering applications
Fast and efficient development life cycle
Segregation of duties
Service-oriented architecture
Security
Scalability
Reduction in size
Image management
Networking
Volume management
8. What are Docker Components?
(Definition:2 marks, Diagram:4 marks, Concept explanation:7marks)
Docker implements a client-server model where the Docker client communicates with
the Docker daemon to create, manage, and distribute containers.
The Docker client can be installed on the same system as the daemon or connected
remotely.
Communication between the client and the daemon takes place through a REST API, either
over a UNIX socket or a network interface.
The Docker daemon is responsible for managing various Docker services and
communicates with other daemons to do so. Using Docker's API requests, the daemon
manages Docker objects such as images, containers, networks, and volumes.
Docker Client
The Docker client allows users to interact with Docker and utilize its functionalities. It
communicates with the Docker daemon using the Docker API.
The Docker client has the capability to communicate with multiple daemons. When a
user runs a Docker command on the terminal, the instructions are sent to the daemon.
The Docker daemon receives these instructions in the form of commands and REST
API requests from the Docker client.
The primary purpose of the Docker client is to facilitate actions such as pulling
images from the Docker registry and running them on the Docker host.
Commonly used commands by Docker clients include docker build, docker pull, and
docker run.
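As a rough illustration, the commonly used client commands mentioned above can be driven from a small Python script via subprocess; a running Docker daemon is assumed, and the hello-world image name is only an example.

    # Illustrative Python wrapper (via subprocess) around the Docker CLI commands
    # mentioned above. Requires a running Docker daemon.

    import subprocess

    def docker(*args):
        """Run a docker CLI command and return its output."""
        return subprocess.run(["docker", *args], check=True,
                              capture_output=True, text=True).stdout

    if __name__ == "__main__":
        print(docker("pull", "hello-world"))          # pull an image from the registry
        print(docker("run", "--rm", "hello-world"))   # create and run a container
        print(docker("images"))                       # list local images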
Docker Host
The Docker host is the machine on which the Docker daemon runs; it provides the environment
(images, containers, networks, and storage) in which containers are built and executed.
Docker Registry
Docker images are stored in the Docker registry, which can either be a public registry
like Docker Hub, or a private registry that can be set up.
To obtain required images from a configured registry, the 'docker run' or 'docker pull'
commands can be used. Conversely, to push images into a configured registry, the
'docker push' command can be used.
Docker Objects
When working with Docker, various objects such as images, containers, volumes, and
networks are created and utilized.
Docker Images
A Docker image is a set of instructions used to create a container, serving as a read-
only template that can store and transport applications.
Images play a critical role in the Docker ecosystem by enabling collaboration among
developers in ways that were previously impossible
Docker Storage
Docker storage is responsible for storing data within the writable layer of the container,
and this function is carried out by a storage driver. The storage driver is responsible for
managing and controlling the images and containers on the Docker host. There are
several types of Docker storage.
o Data Volumes, which can be mounted directly into the container's file system, are
essentially directories or files on the Docker Host file system.
o Volume Container is used to maintain the state of the containers' data produced by
the running container, where Docker volumes file systems are mounted on Docker
containers. These volumes are stored on the host, making it easy for users to exchange
file systems among containers and backup data.
Docker networking
Docker networking provides complete isolation for containers, allowing users to link
them to multiple networks with minimal OS instances required to run workloads.
There are different types of Docker networks available, including:
o Bridge: This is the default network driver and is suitable for different containers
that need to communicate with the same Docker host.
o Host: This network is used when there is no need for isolation between the container
and the host.
o Overlay: This network allows swarm services to communicate with each other.
o None: This network disables all networking.
o Macvlan: This assigns a Media Access Control (MAC) address to containers, which
makes each container appear as a physical device on the network.
By default, a container is isolated from other containers and its host machine. It is
possible to control the level of isolation for a container's network, storage or other
underlying subsystems from other containers or from the host machine.
Any changes made to a container's state that are not stored in persistent storage will
be lost once the container is removed.
Docker provides a consistent environment for running applications from design and
development to production and maintenance, which eliminates production issues and
allows developers to focus on introducing quality features instead of debugging errors
and resolving configuration/compatibility issues.
Docker also allows for instant creation and deployment of containers for every process,
without needing to boot the OS, which saves time and increases agility. Creating,
destroying, stopping or starting a container can be done with ease, and YAML
configuration files can be used to automate deployment and scale the infrastructure.
Docker enables significant infrastructure cost reduction, with minimal costs for running
applications when compared with VMs and other technologies. This can lead to
increased ROI and operational cost savings with smaller engineering teams
PART C
15 Marks
● ◆ It consists of a virtual machine executing the byte code of a program, which is the
result of the compilation process.
● ◆ At runtime, the byte code can be either interpreted or compiled on the fly against
the underlying hardware instruction set.
● ◆ Other important examples of the use of this technology have been the UCSD
Pascal and Smalltalk
The Java virtual machine was originally designed for the execution of programs written
in the Java language, but support for other languages such as Python, Pascal, Groovy and Ruby
was later added.
◆ ● The ability to support multiple programming languages has been one of the
key elements of the Common Language Infrastructure (CLI) which is the
specification behind .NET Framework
● ◆ Cloud users do not need to know and have no way to discover physical
resources that are involved while processing a service request. In addition,
application developers do not care about some infrastructure issues such as
scalability and fault tolerance. Application developers focus on service logic. In
many cloud computing systems, virtualization software is used to virtualize the
hardware. System virtualization software is a special kind of software which
simulates the execution of hardware and runs even unmodified operating systems.
The development environment and deployment environment can now be the same,
which eliminates some runtime problems.
VMs provide flexible runtime services to free users from worrying about the system
environment.
An environment that meets one user's requirements often cannot satisfy another user.
Users have full access to their own VMs, which are completely separate from other
user's VMs.
● ◆ Multiple VMs can be mounted on the same physical server. Different VMs may
run with different OSes. The virtualized resources form a resource pool.
The virtualization is carried out by special servers dedicated to generating the
virtualized resource pool. The virtualized infrastructure (black box in the middle) is
built with many virtualizing integration managers.
These managers handle loads, resources, security, data, and provisioning functions.
Figure 3.2 shows two VM platforms.
● ◆ Each platform carries out a virtual solution to a user job. All cloud services
are managed in the boxes at the top.
AWS provides extreme flexibility (VMs) for users to execute their own applications.
GAE provides limited application level virtualization for users to build applications
only based on the services that are created by Google.
● ◆ The Microsoft tools are used on PCs and some special servers.
VM technology has increased in ubiquity. This has enabled users to create customized
environments atop physical infrastructure for cloud computing.
Use of VMs in clouds has the following distinct benefits:
o VMs have the ability to run legacy code without interfering with other APIs.
o VMs can be used to improve security through creation of sandboxes for running applications with
questionable reliability.
o Virtualized cloud platforms can apply performance isolation, letting providers offer
some guarantees and better QoS to customer applications
Containers are software packages that are lightweight and self- contained, and they
comprise all the necessary dependencies to run an application.
The dependencies include external third-party code packages, system libraries, and
other operating system-level applications.
These dependencies are organized in stack levels that are higher than the operating
system.
Advantages:
One advantage of using containers is their fast iteration speed. Due to their lightweight
nature and focus on high-level software, containers can be quickly modified and
updated.
Disadvantages:
o As containers share the same hardware system beneath the operating system layer,
any vulnerability in one container can potentially break out of the container and affect
the underlying shared hardware.
o Although many container runtimes offer public repositories of pre-built containers,
there is a security risk associated with using these containers, as they may contain
exploits or be susceptible to hijacking by malicious actors.
Examples:
o Docker is the most widely used container runtime. It offers Docker Hub, a public
repository of containerized applications that can be easily deployed to a local Docker
runtime.
CRI-O, on the other hand, is a lightweight alternative to using Docker as the runtime
for Kubernetes, implementing the Kubernetes Container Runtime Interface (CRI) to
support Open Container Initiative (OCI)-compatible runtimes.
Virtual Machines
Virtual machines are software packages that provide a complete emulation of low-
level hardware devices, such as CPU, disk, and networking devices. They may also
include a complementary software stack that can run on the emulated hardware.
Together, these hardware and software packages create a functional snapshot of a
computational system.
Advantages:
o Virtual machines provide full isolation and security since they operate as standalone
systems, which means that they are protected from any interference or exploits from
other virtual machines on the same host.
o Though a virtual machine can still be hijacked by an exploit, the affected virtual
machine will be isolated and cannot contaminate other adjacent virtual machines.
o One can manually install software to the virtual machine and snapshot the virtual
machine to capture the present configuration state.
o The virtual machine snapshots can then be utilized to restore the virtual machine to that
particular point in time or create additional virtual machines with that configuration.
Disadvantages:
o Virtual machines are known for their slow iteration speed due to the fact that they
involve a complete system stack.
o Any changes made to a virtual machine snapshot can take a considerable amount of
time to rebuild and validate that they function correctly.
o Another issue with virtual machines is that they can occupy a significant amount of
storage space, often several gigabytes in size.
o This can lead to disk space constraints on the host machine where the virtual
machines are stored.
Examples:
o VirtualBox is an open source emulation system that emulates the x86 architecture and is
owned by Oracle. It is widely used and has a set of additional tools to help develop and
distribute virtual machine images.
o VMware is a publicly traded company that provides a hypervisor along with its virtual
machine platform, which allows deployment and management of multiple virtual
machines. VMware offers a robust UI for managing virtual machines and is a
popular enterprise virtual machine option.
o QEMU is a powerful virtual machine option that can emulate any generic hardware
architecture. However, it lacks a graphical user interface for configuration or execution
and is a command line only utility; nevertheless, QEMU is noted as one of the fastest virtual
machine options available.
3. Explain Docker Repositories with its features?
(Definition:2 marks, Concept explanation:13 marks)
The Docker Hub is a cloud-based repository service where users can push their Docker
Container Images and access them from anywhere via the internet. It offers the option
to push images as private or public and is primarily used by DevOps teams.
The Docker Hub can be used from any operating system. It
functions as a storage system for Docker images and allows users to pull the required
images when needed. However, it is necessary to have a basic knowledge of Docker to
push or pull images from the Docker Hub. If a developer team wants to share a project
along with its dependencies for testing, they can push the code to Docker Hub. To do
this, the developer must create images and push them to Docker Hub. The testing team
can then pull the same image from Docker Hub without needing any files, software, or
plugins, as the developer has already shared the image with all dependencies.
Docker Hub simplifies the storage, management, and sharing of images with others. It
provides security checks for images and generates comprehensive reports on any
security issues.
Additionally, Docker Hub can automate processes like Continuous Deployment and
Continuous Testing by triggering web hooks when a new image is uploaded.
Through Docker Hub, users can manage permissions for teams, users, and
organizations.
Moreover, Docker Hub can be integrated with tools like GitHub and Jenkins,
streamlining workflows.
Advantages of Docker Hub
Docker Container Images have a lightweight design, which enables us to push images
in a matter of minutes using a simple command.
This method is secure and offers the option of pushing private or public images.
Making code, software or any type of file available to the public can be done easily by
publishing the images on Docker Hub.
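As a sketch of the typical push workflow (assuming `docker login` has already been performed; the image and repository names myapp and myuser/myapp are hypothetical):

    # Sketch of pushing a local image to Docker Hub from Python.

    import subprocess

    def sh(*cmd):
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        sh("docker", "tag", "myapp:latest", "myuser/myapp:1.0")   # name the image for the registry
        sh("docker", "push", "myuser/myapp:1.0")                  # upload it to Docker Hub
        # A teammate can now retrieve exactly the same image with:
        # docker pull myuser/myapp:1.0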
UNIT IV
CLOUD DEPLOYMENT ENVIRONMENT
SYLLABUS: Google App Engine – Amazon AWS – Microsoft Azure; Cloud Software
Environments – Eucalyptus – OpenStack.
PART A
2 Marks
Google's App Engine (GAE) which offers a PaaS platform supporting various cloud
and web applications. This platform specializes in supporting scalable (elastic) web
applications. GAE enables users to run their applications on a large number of data
centers associated with Google's search engine operations.
GFS is used for storing large amounts of data. Map Reduce is for use in application
program development. Chubby is used for distributed application lock services. Big
Table offers a storage service for accessing structured data.
Blob, Queue, File, and Disk Storage, Data Lake Store, Backup, and Site Recovery.
Well-known GAE applications include the Google Search Engine, Google Docs,
Google Earth, and Gmail. These applications can support large numbers of users
simultaneously. Users can interact with Google applications via the web interface
provided by each application. Third-party application providers can use GAE to build
cloud applications for providing services.
Mainframe
Client-Server
Cloud Computing
Mobile Computing
Grid Computing
AWS can present a challenge due to its vast array of services and functionalities, which
may be hard to comprehend and utilize, particularly for inexperienced users. The cost
of AWS can be high, particularly for high-traffic applications or when operating
multiple services.
PART B
13 Marks
Google has the world's largest search engine facilities. The company has extensive
experience in massive data processing that has led to new insights into data-center
design and novel programming models that scale to incredible sizes.
Google platform is based on its search engine expertise. Google has hundreds of data
centers and has installed more than 460,000 servers worldwide.
For example, 200 Google data centers are used at one time for a number of cloud
applications. Data items are stored in text, images, and video and are replicated to
tolerate faults or failures.
Google's App Engine (GAE) which offers a PaaS platform supporting various cloud
and web applications. Google has pioneered cloud development by leveraging the large
number of data centers it operates.
For example, Google pioneered cloud services in Gmail, Google Docs, and Google
Earth, among other applications. These applications can support a large number of users
simultaneously with HA. Notable technology achievements include the
Google File System (GFS), Map Reduce, and Big Table. Google announced the GAE
web application platform, which is becoming a common platform for many small cloud
service providers. This platform specializes in supporting scalable (elastic) web
applications. GAE enables users to run their applications on a large number of data
centers associated with Google's search engine operations.
1.1 GAE Architecture
GFS is used for storing large amounts of data.
Map Reduce is for use in application program development. Chubby is used for
distributed application lock services. BigTable offers a storage service for accessing
structured data.
Users can interact with Google applications via the web interface provided by each
application.
Third-party application providers can use GAE to build cloud applications for
providing services.
The applications all run in data centers under tight management by Google engineers.
Inside each data center, there are thousands of servers forming different clusters
Figure 4.1 shows the overall architecture of the Google cloud infrastructure.
A typical cluster configuration can run the Google File System, Map Reduce jobs and
Big Table servers for structure data.
• Extra services such as Chubby for distributed locks can also run in the clusters.
• GAE runs the user program on Google's infrastructure. As it is a platform running
third-party programs, application developers now do not need to worry about the
maintenance of servers.
GAE can be thought of as the combination of several software components. The
frontend is an application framework which is similar to other web application
frameworks such as ASP, J2EE and JSP. At the time of this writing, GAE supports
Python and Java programming environments. The applications can run similar to web
application containers. The frontend can be used as the dynamic web serving
infrastructure which can provide the full support of common technologies.
The GAE platform comprises the following five major components. The GAE is not
an infrastructure platform, but rather an application development platform for users.
The data store offers object-oriented, distributed, structured data storage services based
on Big Table techniques. The data store secures data management operations.
The application runtime environment offers a platform for scalable web programming
and execution. It supports two development languages: Python and Java.
o The software development kit (SDK) is used for local application development. The
SDK allows users to execute test runs of local applications and upload application code.
o The administration console is used for easy management of user application development cycles.
o The GAE web service infrastructure provides special interfaces to guarantee flexible use and
management of storage and network resources by GAE.
Best-known GAE applications include the Google Search Engine, Google Docs,
Google Earth and Gmail. These applications can support large numbers of users
simultaneously. Users can interact with Google applications via the web interface
provided by each application. Third-party application providers can use GAE to build
cloud applications for providing services. The applications all run in Google data centers. Inside each data center, there might be
thousands of server nodes to form different clusters. Each cluster can run multipurpose
servers. GAE supports many web applications.
One is a storage service to store application specific data in the Google infrastructure.
The data can be persistently stored in the backend storage server while still providing
the facility for queries, sorting and even transactions similar to traditional database
systems.
GAE also provides Google specific services, such as the Gmail account service. This
can eliminate the tedious work of building customized user management components
in web applications.
An update of an entity occurs in a transaction that is retried a fixed number of times if other processes are trying to update
the same entity simultaneously.
The user application can execute multiple data store operations in a single transaction
which either all succeed or all fail together.
The data store implements transactions across its distributed network using entity
groups. A transaction manipulates entities within a single group. Entities of the same
group are stored together for efficient execution of transactions.
The user GAE application can assign entities to groups when the entities are created.
The performance of the data store can be enhanced by in-memory caching using the
memcache, which can also be used independently of the data store.
Recently, Google added the blob store which is suitable for large files as its size limit
is 2 GB.
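A brief sketch using the legacy (first-generation) GAE Python APIs illustrates the datastore transaction and memcache ideas described above; the Account model, key names, and amounts are invented for illustration.

    # Sketch using the legacy Google App Engine Python APIs (ndb datastore + memcache).
    # The Account model, key names, and amounts are hypothetical.

    from google.appengine.ext import ndb
    from google.appengine.api import memcache

    class Account(ndb.Model):
        balance = ndb.IntegerProperty(default=0)

    @ndb.transactional
    def deposit(account_key, amount):
        # All datastore operations inside this function succeed or fail together,
        # and are retried automatically on contention.
        account = account_key.get()
        account.balance += amount
        account.put()

    def cached_balance(account_key):
        value = memcache.get(account_key.urlsafe())      # in-memory cache in front of the datastore
        if value is None:
            value = account_key.get().balance
            memcache.set(account_key.urlsafe(), value, time=60)
        return value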
There are several mechanisms for incorporating external resources.
The Google SDC Secure Data Connection can tunnel through the Internet and link your
intranet to an external GAE application. The URL Fetch operation provides the ability
for applications to fetch resources and communicate with other hosts over the Internet
using HTTP and HTTPS requests.
There is a specialized mail mechanism to send e-mail from your GAE application.
Applications can access resources on the Internet, such as web services or other data,
using GAE's URL fetch service. The URL fetch service retrieves web resources using
the same high- speed Google infrastructure that retrieves web pages for many other
Google products. There are dozens of Google "corporate" facilities including maps,
sites, groups, calendar, docs, and YouTube, among others. These support the Google
Data API which can be used inside GAE. An application can use Google Accounts for
user authentication. Google Accounts handles user account creation and sign-in, and a
user that already has a Google account (such as a Gmail account) can use that account
with your app.
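The URL Fetch and Google Accounts services mentioned above can be sketched as follows (legacy GAE Python runtime; the URL is only an example).

    # Sketch of the GAE URL Fetch and Google Accounts services referred to above.

    from google.appengine.api import urlfetch, users

    def fetch_status():
        result = urlfetch.fetch("https://www.example.com/")   # outbound HTTP via Google infrastructure
        return result.status_code, len(result.content)

    def greet_current_user():
        user = users.get_current_user()                       # None if not signed in
        if user:
            return "Hello, %s" % user.nickname()
        return "Please sign in: %s" % users.create_login_url("/")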
GAE provides the ability to manipulate image data using a dedicated Images service
which can resize, rotate, flip, crop and enhance images. An application can perform
tasks outside of responding to web requests. A GAE application is configured to
consume resources up to certain limits or quotas. With quotas, GAE ensures that your
application would not exceed your budget and that other applications running on GAE
would not impact the performance of your app. In particular, GAE use is free up to
certain quotas. GFS was built primarily as the fundamental storage service for Google's
search engine. As the size of the web data that was crawled and saved was quite
substantial, Google needed a distributed file system to redundantly store massive
amounts of data on cheap and unreliable computers.
In addition, GFS was designed for Google applications and Google applications were
built for GFS.
In traditional file system design, such a philosophy is not attractive, as there should be
a clear interface between applications and the file system such as a POSIX interface.
GFS typically will hold a large number of huge files, each 100 MB or larger, with files
that are multiple GB in size quite common. Thus, Google has chosen its file data block
size to be 64 MB instead of the 4 KB used in typical traditional file systems. For example,
a 1 TB file maps to only 16,384 chunks of 64 MB each, versus roughly 268 million 4 KB
blocks, which keeps the amount of metadata the master must track small. The I/O pattern
in the Google application is also special. Files are typically written once, and the write
operations often append data blocks to the end of files.
Multiple appending operations might be concurrent.
Big Table was designed to provide a service for storing and retrieving structured and
semi structured data. Big Table applications include storage of web pages, per-user
data, and geographic locations.
This is one reason to rebuild the data management system; the resultant system can
be applied across many projects at a low incremental cost.
The other motivation for rebuilding the data management system is performance.
Low level storage optimizations help increase performance significantly which is
much harder to do when running on top of a traditional database layer. The design and
implementation of the Big Table system has the following goals.
The applications want asynchronous processes to be continuously updating different
pieces of data and want access to the most current data at all times. The database needs
to support very high read/write rates and the scale might be millions of operations per
second. The application may need to examine data changes over time. Thus, Big Table
can be viewed as a distributed multilevel map. It provides a fault tolerant and persistent
database as in a storage service.
The BigTable system is scalable, which means the system has thousands of servers,
terabytes of in-memory data, petabytes of disk-based data, millions of reads/writes per
second, and efficient scans. BigTable is a self-managing system: servers can be added
or removed dynamically, and it features automatic load balancing.
Chubby, Google's Distributed Lock Service
Chubby is intended to provide a coarse-grained locking service.
It can store small files inside Chubby storage which provides a simple namespace as a
file system tree. The files stored in Chubby are quite small compared to the huge files
in GFS.
AWS Lambda:
AWS Lambda is a serverless, event-driven compute service that enables code
execution without server management. Compute time consumed is the only factor
in payment, and there is no charge when code is not running. AWS Lambda offers the
ability to run code for any application type with no need for administration.
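A minimal sketch of a Python Lambda handler; the event field and greeting below are hypothetical, and packaging, the execution role, and the event trigger are configured separately in AWS:

    import json

    def lambda_handler(event, context):
        # Entry point invoked by Lambda for each event; billed only for compute time used.
        name = event.get('name', 'world')      # hypothetical event field
        return {
            'statusCode': 200,
            'body': json.dumps({'message': 'Hello, %s' % name})
        }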
Amazon S3
Amazon S3 (Simple Storage Service) is a web service interface for object storage that
enables you to store and retrieve any amount of data from any location on the web. It
is designed to provide limitless storage with a 99.999999999% durability guarantee.
Amazon S3 can be used as the primary storage solution for cloud-native applications,
as well as for backup and recovery and disaster recovery purposes. It delivers
unmatched scalability, data availability, security, and performance.
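A brief sketch using the boto3 SDK, assuming AWS credentials are already configured; the bucket and key names are placeholders:

    import boto3

    s3 = boto3.client('s3')

    # Store an object and read it back; S3 addresses data by bucket and key.
    s3.put_object(Bucket='example-backup-bucket', Key='reports/day1.txt', Body=b'sample data')
    obj = s3.get_object(Bucket='example-backup-bucket', Key='reports/day1.txt')
    data = obj['Body'].read()

    # List objects under a prefix.
    listing = s3.list_objects_v2(Bucket='example-backup-bucket', Prefix='reports/')
    for item in listing.get('Contents', []):
        print(item['Key'], item['Size'])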
Amazon Glacier:
Amazon Glacier is a highly secure and cost-effective storage service designed for long-
term backup and data archiving. It offers reliable durability and ensures the safety of
your data. However, since data retrieval may take several hours, Amazon Glacier is
primarily intended for archiving purposes.
Amazon RDS
Amazon Relational Database Service (Amazon RDS) simplifies the process of setting
up, managing, and scaling a relational database in the cloud. Additionally, it offers
resizable and cost-effective capacity and is available on multiple database instance
types that are optimized for memory, performance, or I/O. Amazon RDS offers a choice
of six popular database engines: Amazon Aurora, PostgreSQL, MySQL, MariaDB,
Oracle, and Microsoft SQL Server.
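Because RDS exposes standard database engines, applications connect with ordinary client libraries; a minimal sketch using PyMySQL against a hypothetical MySQL-flavoured RDS endpoint (the endpoint, credentials, and schema below are placeholders):

    import pymysql

    # RDS provides the endpoint after the instance is created; it behaves like a normal MySQL server.
    conn = pymysql.connect(
        host='mydb.abc123xyz.us-east-1.rds.amazonaws.com',   # placeholder endpoint
        user='admin',
        password='example-password',
        database='inventory',
    )
    try:
        with conn.cursor() as cur:
            cur.execute('SELECT id, name FROM products WHERE stock > %s', (0,))
            for row in cur.fetchall():
                print(row)
    finally:
        conn.close()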
Amazon DynamoDB
Amazon DynamoDB is a NoSQL database service that offers fast and flexible storage
for applications requiring consistent, low-latency access at any scale. It's fully managed
and supports both document and key-value
data models. Its versatile data model and dependable performance make it well-suited
for various applications such as mobile, web, gaming, Internet of Things (IoT), and
more.
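A short sketch with boto3, assuming a table named 'Users' with partition key 'user_id' already exists; both names are hypothetical:

    import boto3

    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('Users')          # hypothetical table with partition key 'user_id'

    # Key-value / document style writes and reads at consistently low latency.
    table.put_item(Item={'user_id': 'u42', 'name': 'Asha', 'score': 1200})
    response = table.get_item(Key={'user_id': 'u42'})
    item = response.get('Item')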
Azure is a cost-effective platform with simple pricing based on the "Pay As You Go"
model, which means users pay only for the resources they use. This makes it a
convenient option for setting up large servers without requiring significant
investments.
History
Windows Azure was announced by Microsoft in October 2008 and became available
in February 2010. In 2014, Microsoft renamed it Microsoft Azure. It offered a
platform for various services including .NET Services, SQL Services, and Live
Services. However, some people were uncertain about using cloud technology.
Nevertheless, Microsoft Azure is constantly evolving, with new tools and
functionalities being added. The platform has two releases: v1 and v2. The earlier
version was JSON script-oriented, while the newer version features an interactive UI
for easier learning and simplification. Microsoft Azure v2 is still in the preview stage.
Advantages of Azure
Azure offers a cost-effective solution as it eliminates the need for expensive hardware
investments. With a pay-as-you-go subscription model, users can manage their costs
to match their needs. Setting up an Azure account is a simple process through the
Azure Portal, where you can choose the desired subscription and begin using the platform.
One of the major advantages of Azure is its low operational cost. Since it operates on
dedicated servers specifically designed for cloud functionality, it provides greater
reliability compared to on-site servers. By utilizing Azure, the user can eliminate the
need for hiring a dedicated technical support team to monitor and troubleshoot servers.
This results in significant cost savings for an organization. Azure provides easy backup
and recovery options for valuable data. In the event of a disaster, the user can quickly
recover the data with a single click, minimizing any impact on end user business.
Cloud-based backup and recovery solutions offer convenience, avoid upfront
investments, and provide expertise from third-party providers. Implementing
business models in Azure is straightforward, with intuitive features and user-friendly
interfaces. Additionally, numerous tutorials are available to expedite the learning and
deployment process.
Azure offers robust security measures, ensuring the protection of your critical data and
business applications. Even in the face of natural disasters, Azure serves as a reliable
safeguard for the resources. The cloud infrastructure remains operational, providing
continuous protection.
Azure services
Azure offers a wide range of services and tools for different needs. These include
Compute, which covers Virtual Machines, Virtual Machine Scale Sets, Functions for
serverless computing, Batch for containerized batch workloads, Service Fabric for
microservices and container orchestration, and Cloud Services for building cloud-
based apps and APIs. The Networking tools in Azure offer several options such as
Virtual Network, Load Balancer, Application Gateway, VPN Gateway, Azure DNS for
domain hosting, Content Delivery Network, Traffic Manager, ExpressRoute for
dedicated private network fiber connections, and Network Watcher for monitoring and
diagnostics. The Storage tools available in Azure include Blob, Queue, File, and Disk
Storage, Data Lake Store, Backup, and Site Recovery, among others. Web + Mobile
services make it easy to create and deploy web and mobile applications. Azure also
includes tools for Containers, Databases, Data + Analytics, AI + Cognitive Services,
Internet of Things, Security + Identity, and Developer Tools, such as Visual Studio
Team Services, Azure DevTest Labs, HockeyApp for mobile app deployment, and
Monitoring.
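As one concrete example of the Storage tools mentioned above, a sketch using the azure-storage-blob Python SDK; the connection string, container, and blob names are placeholders:

    from azure.storage.blob import BlobServiceClient

    # The connection string would come from the storage account's access keys in the Azure Portal.
    service = BlobServiceClient.from_connection_string('<storage-account-connection-string>')
    container = service.get_container_client('reports')

    container.upload_blob(name='daily/summary.txt', data=b'hello azure', overwrite=True)
    downloaded = container.download_blob('daily/summary.txt').readall()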
Client-Server: In this environment, client devices access resources and services from
a central server, facilitating the sharing of data and processing capabilities.
Cloud Computing: Cloud computing leverages the Internet to provide resources and
services that can be accessed through web browsers or client software. It offers
scalability, flexibility, and on-demand availability.
Grid Computing: Grid computing involves the sharing of computing resources and
services across multiple computers, enabling large-scale computational tasks and data
processing.
Embedded Systems: Embedded systems integrate software into devices and products,
typically with limited processing power and memory. These systems perform specific
functions within various industries, from consumer electronics to automotive and
industrial applications.
Each computing environment has its own set of advantages and disadvantages, and the
choice of environment depends on the specific requirements of the software application
and the available resources. Computing has become an integral part of modern life,
where computers are utilized extensively to manage, process, and communicate
information efficiently.
7. Explain in detail about Eucalyptus and its components?
(Definition:2 marks, Diagram:3 marks, Concept explanation:6 marks, Advantages:2
marks)
Eucalyptus (Elastic Utility Computing Architecture for Linking Your Programs To
Useful Systems) is an open-source software platform for building private and hybrid
Infrastructure-as-a-Service clouds that are compatible with the Amazon Web Services APIs.
Components:
Eucalyptus has various components that work together to provide efficient cloud
computing services.
The Node Controller manages the lifecycle of instances and interacts with the operating
system, the hypervisor, and the Cluster Controller. The Cluster Controller, in turn,
manages multiple Node Controllers, while the Cloud Controller acts as the front-end
for the entire architecture.
The Storage Controller provides persistent block storage and snapshots of volumes for
VM instances, while Walrus provides S3-compatible bucket-based storage.
Eucalyptus operates in different modes, each with its own set of features. In Managed
Mode, users are assigned security groups that are isolated by VLAN between the
Cluster Controller and Node Controller. In Managed (No VLAN) mode, however,
the root user on the virtual machine can snoop into other virtual machines running on
the same network layer. The System Mode is the simplest mode with the least number
of features, where a MAC address is assigned to a virtual machine instance and attached
to the Node Controller's bridge Ethernet device. Finally, the Static Mode is similar to
System Mode but provides more control over the assignment of IP addresses, as a MAC
address/IP address pair is mapped to a static entry within the DHCP server.
Features of Eucalyptus
Eucalyptus offers various components to manage and operate cloud infrastructure. The
Eucalyptus Machine Image is an example of an image, which is software packaged and
uploaded to the cloud, and when it is run, it becomes an instance.
The networking component can be divided into three modes: Static mode, which
allocates IP addresses to instances, System mode, which assigns a MAC address and
connects the instance's network interface to the physical network via NC, and Managed
mode, which creates a local network of instances. Access control is used to limit user
permissions. Elastic Block Storage provides block-level storage volumes that can be
attached to instances. Auto-scaling and load balancing are used to create or
remove instances or services.
Advantages of Eucalyptus
Eucalyptus is a versatile solution that can be used for both private and public cloud
computing.
Users can easily run Amazon or Eucalyptus machine images on either type of cloud.
Additionally, its API is compatible with the Amazon Web Services APIs (such as EC2
and S3), making it easy to integrate with other tools like Chef and Puppet for DevOps.
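Because of this API compatibility, standard AWS SDK clients can be pointed at a Eucalyptus cloud simply by overriding the endpoint; a hedged sketch with boto3, where the endpoint URL, region label, and credentials are all placeholders for your own Eucalyptus deployment:

    import boto3

    ec2 = boto3.client(
        'ec2',
        endpoint_url='https://ec2.mycloud.example.internal:8773/',  # placeholder EC2-compatible endpoint
        region_name='eucalyptus',
        aws_access_key_id='<access-key>',
        aws_secret_access_key='<secret-key>',
    )

    # The same describe_instances call used against AWS works against the Eucalyptus front end.
    for reservation in ec2.describe_instances().get('Reservations', []):
        for instance in reservation.get('Instances', []):
            print(instance.get('InstanceId'), instance.get('State', {}).get('Name'))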
Although it is not as widely known as other cloud computing solutions like OpenStack
and CloudStack, Eucalyptus has the potential to become a viable alternative. It enables
hybrid cloud computing, allowing users to combine public and private clouds for their
needs. With Eucalyptus, users can easily transform their data centers into private clouds
and extend their services to other organizations.
PART C
15 Marks
Advantages of AWS
AWS provides the convenience of easily adjusting resource usage based on your
changing needs, resulting in cost savings and ensuring that your application always has
sufficient resources.
With multiple data centers and a commitment to 99.99% availability for many of its
services, AWS offers a reliable and secure infrastructure.
Its flexible platform includes a variety of services and tools that can be combined to
build and deploy various applications. Additionally, AWS's pay-as-you-go pricing
model means users only pay for the resources they use, eliminating upfront costs and
long-term commitments.
Disadvantages:
AWS can present a challenge due to its vast array of services and functionalities,
which may be hard to comprehend and utilize, particularly for inexperienced
users. The cost of AWS can be high, particularly for high-traffic applications or when
operating multiple services. It can escalate over time, necessitating frequent expense monitoring.
AWS's management of various infrastructure elements may limit authority over certain
parts of your environment and application.
Global infrastructure
The AWS infrastructure spans across the globe and consists of geographical regions,
each with multiple availability zones that are physically isolated from each other. When
selecting a region, factors such as latency optimization, cost reduction, and government
regulations are considered. In case of a failure in one zone, the infrastructure in other
availability zones remains operational, ensuring business continuity. AWS's largest
region, Northern Virginia, has six availability zones that are connected by high-speed
fiber-optic networking.
To further optimize content delivery, AWS has over 100 edge locations worldwide that
support the CloudFront content delivery network. This network caches frequently
accessed content, such as images and videos, at these edge locations and distributes
them globally for faster delivery and lower latency for end users. Additionally,
CloudFront offers protection against DDoS attacks.
The controller node runs the Identity service, Image service, Placement service, the
management portions of the Compute and Networking services, various Networking
agents, and the Dashboard. It also includes supporting services such as an SQL
database, a message queue, and NTP.
Optionally, the controller node runs portions of the Block Storage, Object Storage,
Orchestration, and Telemetry services. The controller node requires a minimum of two
network interfaces. The compute node runs the hypervisor portion of Compute that
operates instances. By default, the compute node also runs a Networking service agent that
connects instances to virtual networks and provides firewalling services to instances via
security groups.
Administrators can deploy more than one compute node. Each node requires a
minimum of two network interfaces. The optional Block Storage node contains the
disks that the Block Storage and Shared File System services provision for instances.
For simplicity, service traffic between compute nodes and this node uses the
management network.
Production environments should implement a separate storage network to increase
performance and security. Administrators can deploy more than one block storage
node. Each node requires a minimum of one network interface. The optional Object
Storage node contains the disks that the Object Storage service uses for storing
accounts, containers, and objects. For simplicity, service traffic between compute nodes
and this node uses the management network. Production environments should
implement a separate storage network to increase performance and security. This
service requires two nodes. Each node requires a minimum of one network interface.
Administrators can deploy more than two object storage nodes. The provider networks
option deploys the OpenStack Networking service in the simplest way possible with
primarily layer 2 (bridging/switching) services and VLAN segmentation of networks.
Essentially, it bridges virtual networks to physical networks and relies on physical
network infrastructure for layer-3 (routing) services. Additionally, a DHCP service
provides IP address information to instances.
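Once such a deployment is running, clients typically talk to the controller's API endpoints; a minimal sketch with the openstacksdk Python library, where the auth URL, project, and credentials are placeholders for your own environment:

    import openstack

    # Connect to the Identity (Keystone) endpoint exposed by the controller node.
    conn = openstack.connect(
        auth_url='http://controller:5000/v3',   # placeholder controller endpoint
        project_name='demo',
        username='demo',
        password='example-password',
        user_domain_name='Default',
        project_domain_name='Default',
    )

    # List instances from the Compute service and networks from the Networking service.
    for server in conn.compute.servers():
        print(server.name, server.status)
    for network in conn.network.networks():
        print(network.name)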
UNIT V
CLOUD SECURITY
SYLLABUS: Virtualization System-Specific Attacks: Guest hopping – VM migration
attack – hyperjacking. Data Security and Storage; Identity and Access Management
(IAM) - IAM Challenges - IAM Architecture and Practice.
PART A
2 Marks
1. What is a virtualization attack?
One of the top cloud computing threats involves one of its core enabling technologies:
virtualization. In virtual environments, an attacker can take control of the virtual
machines installed on a host by compromising the lower-layer hypervisor.
2. What are the types of IAM roles?
IAM roles are of four types, primarily differentiated by who or what can assume the role:
Service Role, Service-Linked Role, Role for Cross-Account Access, and Role for
Identity Provider Access.
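For instance, a role for cross-account access is assumed at runtime through AWS STS; a hedged sketch with boto3, where the role ARN and session name are placeholders:

    import boto3

    sts = boto3.client('sts')

    # Assume a role defined in another AWS account (placeholder ARN) to get temporary credentials.
    resp = sts.assume_role(
        RoleArn='arn:aws:iam::111122223333:role/ReadOnlyAuditRole',
        RoleSessionName='audit-session',
    )
    creds = resp['Credentials']

    # Use the temporary credentials to call services in the trusting account.
    s3 = boto3.client(
        's3',
        aws_access_key_id=creds['AccessKeyId'],
        aws_secret_access_key=creds['SecretAccessKey'],
        aws_session_token=creds['SessionToken'],
    )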
PART B
13 Marks
1. What are VM migration and other virtualization-specific attacks, and how can they be mitigated?
(Definition:2 marks, Concept explanation:11 marks)
Guest-hopping attack: one possible mitigation of a guest-hopping attack is to use
forensics and VM debugging tools to observe any attempt to compromise a VM. Another
possible mitigation is using a High Assurance Platform (HAP), which provides a high
degree of isolation between virtual machines.
SQL injection: to mitigate SQL injection attacks, remove all stored procedures that are
rarely used and assign the least possible privileges to users who have permission to
access the database.
Side-channel attack: as a countermeasure, it might be preferable to ensure that no
legitimate user VM resides on the same hardware as other users' VMs. This completely
eliminates the risk of side-channel attacks in a virtualized cloud environment.
Malicious insider: strict privilege planning and security auditing can minimize this
security threat.
Data storage security: ensure data integrity and confidentiality and ensure limited
access to the users' data by CSP employees.
What Is Hyperjacking?
Hypervisors form the backbone of virtual machines. These are software programs that
are responsible for creating, running, and managing VMs. A single hypervisor can host
multiple virtual machines, or multiple guest operating systems, at one time, which also
gives it the alternative name of virtual machine manager (VMM).
There are two kinds of hypervisors. The first is known as a "bare metal" or "native"
hypervisor, and the second is a "hosted" hypervisor. What you should note is that it
is the hypervisors of virtual machines that are the targets of hyperjacking attacks (hence
the term "hyper-jacking").
Origins of Hyperjacking
In the mid-2000s, researchers found that hyperjacking was a possibility. At the time,
hyperjacking attacks were entirely theoretical, but the threat of one being carried out
was always there. As technology advances and cybercriminals become more inventive,
the risk of hyperjacking attacks increases by the year.
In fact, in September 2022, warnings of real hyperjacking attacks began to arise. Both
Mandiant and VMware published warnings stating that they had found malicious actors
using malware to conduct hyperjacking attacks in the wild via a harmful version of
VMware software. In this venture, the threat actors inserted their own malicious code
within victims' hypervisors while bypassing the target devices' security measures
(similarly to a rootkit).
Through this exploit, the hackers in question were able to run commands on the virtual
machines' host devices without detection.
Hypervisors are the key target of hyperjacking attacks. In a typical attack, the original
hypervisor will be replaced via the installation of a rogue, malicious hypervisor that the
threat actor has control of. By installing a rogue hypervisor under the original, the
attacker can therefore gain control of the legitimate hypervisor and exploit the VM.
By having control over the hypervisor of a virtual machine, the attacker can, in turn,
gain control of the entire VM server. This means that they can manipulate anything in
the virtual machine. In the aforementioned hyperjacking attack announced in September
2022, the malware was found to be spying on victims. Compared to other hugely popular
cybercrime tactics like phishing and ransomware, hyperjacking isn't very common at the
moment. But with the first confirmed use of this method, it's important that you know
how to keep your devices, and your data, safe.
3. Explain about cloud data security in detail?
(Definition:2 marks, Concept explanation:11 marks)
Cloud data security is the practice of protecting data and other digital information assets
from security threats, human error, and insider threats. It leverages technology, policies,
and processes to keep your data confidential and still accessible to those who need it in
cloud-based environments. Cloud computing delivers many benefits, allowing you to
access data from any device via an internet connection to reduce the chance of data loss
during outages or incidents and improve scalability and agility. At the same time, many
organizations remain hesitant to migrate sensitive data to the cloud as they struggle to
understand their security options and meet regulatory demands.
Understanding how to secure cloud data remains one of the biggest obstacles to
overcome as organizations transition from building and managing on-premises data
centers. So, what is data security in the cloud? How is your data protected? And what
cloud data security best practices should you follow to ensure cloud-based data assets
are secure and protected?
Cloud data security protects data that is stored (at rest) or moving in and out of the
cloud (in motion) from security threats, unauthorized access, theft, and corruption. It
relies on physical security, technology tools, access management and controls, and
organizational policies.
Why companies need cloud security
Today, we’re living in the era of big data, with companies generating, collecting, and
storing vast amounts of data by the second, ranging from highly confidential business
or personal customer data to less sensitive data like behavioral and marketing analytics.
Beyond the growing volumes of data that companies need to be able to access,
manage, and analyze, organizations are adopting cloud services to help them achieve
more agility and faster time to market, and to support increasingly remote or hybrid
workforces. The traditional
network perimeter is fast disappearing, and security teams are realizing that they need to
rethink current and past approaches when it comes to securing cloud data. With data and
applications no longer living inside your data center and more people than ever working
outside a physical office, companies must solve how to protect data and manage access
to that data as it moves across and through multiple environments.
As more data and applications move out of a central data center and away from
traditional security mechanisms and infrastructure, the higher the risk of exposure
becomes. While many of the foundational elements of on-premises data security
remain, they must be adapted to the cloud.
• Less control. Since data and apps are hosted on third-party infrastructure, companies
have less control over how data is accessed and shared.
• Confusion over shared responsibility. Companies and cloud providers share
cloud security responsibilities, which can lead to gaps in coverage if duties and tasks
are not well understood or defined.
• Inconsistent coverage. Many businesses are finding multicloud and hybrid
cloud to better suit their business needs, but different providers offer varying levels of
coverage and capabilities that can deliver inconsistent protection.
• Growing cybersecurity threats. Cloud databases and cloud data storage make
ideal targets for online criminals looking for a big payday, especially as companies are
still educating themselves about data handling and management in the cloud.
• Strict compliance requirements. Organizations are under pressure to comply
with stringent data protection and privacy regulations, which require enforcing security
policies across multiple environments and demonstrating strong data governance.
• Distributed data storage. Storing data on international servers can deliver
lower latency and more flexibility. Still, it can also raise data sovereignty issues that
would not arise if the data were held in the company's own data center.
Greater visibility
Strong cloud data security measures allow you to maintain visibility into the inner
workings of your cloud, namely what data assets you have and where they live, who is
using your cloud services, and the kind of data they are accessing.
Easy backups and recovery
Cloud data security can offer a number of solutions and features to help automate
and standardize backups, freeing your teams from monitoring manual backups and
troubleshooting problems. Cloud-based disaster recovery also lets you restore and
recover data and applications in minutes.
Cloud data compliance
Robust cloud data security programs are designed to meet compliance obligations,
including knowing where data is stored, who can access it, how it’s processed, and how
it’s protected. Cloud data loss prevention (DLP) can help you easily discover, classify,
and de-identify sensitive data to reduce the risk of violations.
Data encryption
Organizations need to be able to protect sensitive data whenever and wherever it goes.
Cloud service providers help you tackle secure cloud data transfer, storage, and sharing
by implementing several layers of advanced encryption for securing cloud data, both
in transit and at rest.
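As a concrete illustration of encrypting data before it leaves the client (one layer among several a provider may offer), a sketch using the Python cryptography library's Fernet symmetric scheme; key handling is deliberately simplified and the sample record is hypothetical:

    from cryptography.fernet import Fernet

    # In practice the key would live in a key management service, not alongside the data.
    key = Fernet.generate_key()
    f = Fernet(key)

    plaintext = b'customer record: alice, card ending 4242'
    ciphertext = f.encrypt(plaintext)     # safe to upload to cloud object storage

    # Later, after downloading the object, the holder of the key can recover the data.
    recovered = f.decrypt(ciphertext)
    assert recovered == plaintext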
Lower costs
Cloud data security reduces the total cost of ownership (TCO) and the administrative
and management burden of securing cloud data. In addition, cloud providers offer the latest
security features and tools, making it easier for security professionals to do their jobs
with automation, streamlined integration, and continuous alerting.
Advanced incident detection and response
An advantage of cloud data security is that providers invest in cutting-edge AI
technologies and built-in security analytics that help you automatically scan for
suspicious activity to identify and respond to security incidents quickly.
IAM Challenges
One critical challenge of IAM concerns managing access for diverse user populations
(employees, contractors, partners, etc.) accessing internal and externally hosted
services. IT is constantly challenged to rapidly provision appropriate access to the users
whose roles and responsibilities often change for business reasons.
Another issue is the turnover of users within the organization. Turnover varies by
industry and function (seasonal staffing fluctuations in finance departments, for
example) and can also arise from changes in the business, such as mergers and
acquisitions, new product and service releases, business process outsourcing, and
changing responsibilities. As a result, sustaining IAM processes can turn into a
persistent challenge.
Access policies for information are seldom centrally and consistently applied.
Organizations can contain disparate directories, creating complex webs of user
identities, access rights, and procedures. This has led to inefficiencies in user and access
management processes while exposing these organizations to significant security,
regulatory compliance, and reputation risks.
To address these challenges and risks, many companies have sought technology
solutions to enable centralized and automated user access management. Many of these
initiatives are entered into with high expectations, which is not surprising given that the
problem is often large and complex. Most often those initiatives to improve IAM can
span several years and incur considerable cost. Hence, organizations should approach
their IAM strategy and architecture with both business and IT drivers that address the
core inefficiency issues while preserving the controls' efficacy (related to access
control). Only then will the organizations have a higher likelihood of success and return
on investment.
PART C
15 Marks
1. Explain in detail about IAM architecture?
(Definition:2 marks, Diagram:4 marks, Concept explanation:9 marks)
Access Management
Monitoring and Auditing: Based on the defined policies, monitoring, auditing, and
reporting are performed on the users' access to resources within the organization.
Operational Activities of IAM: In provisioning, new users are onboarded onto the
organization's systems and applications and provided with the necessary access to
services and data. Deprovisioning works in the opposite direction: the identity of the
user is deleted or deactivated and all of the user's privileges are revoked.
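A hedged sketch of what provisioning and deprovisioning can look like when automated against one concrete provider (AWS IAM) with boto3; the user name and policy ARN are placeholders, and a real deprovisioning flow must also remove access keys and group memberships first:

    import boto3

    iam = boto3.client('iam')

    # Provisioning: create the identity and grant the access it needs (placeholder policy).
    iam.create_user(UserName='new.analyst')
    iam.attach_user_policy(
        UserName='new.analyst',
        PolicyArn='arn:aws:iam::aws:policy/ReadOnlyAccess',
    )

    # Deprovisioning: revoke privileges, then delete or deactivate the identity.
    iam.detach_user_policy(
        UserName='new.analyst',
        PolicyArn='arn:aws:iam::aws:policy/ReadOnlyAccess',
    )
    iam.delete_user(UserName='new.analyst')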
When compared to the traditional applications deployment model within the enterprise,
IAM practices in the cloud are still evolving. In the current state of IAM technology,
support for standards by CSPs (SaaS, PaaS, and IaaS) is not consistent across providers.
Although large providers such as Google, Microsoft, and Salesforce.com seem to
demonstrate basic IAM capabilities, our assessment is that they still fall short of
enterprise IAM requirements for managing regulatory, privacy, and data protection
requirements. Table 5-2 illustrates the current maturity model, based on the authors’
assessment, generalized across SPI service delivery models.
The maturity model takes into account the dynamic nature of IAM users, systems, and
applications in the cloud and addresses the four key components of the IAM automation
process:
• User Management, New Users
• User Management, User Modifications
• Authentication Management
• Authorization Management
Table 5-3 defines the maturity levels as they relate to the four key components.
By matching the model’s descriptions of various maturity levels with the cloud services
delivery model’s (SaaS, PaaS, IaaS) current state of IAM, a clear picture emerges of
IAM maturity across the four IAM components. If, for example, the service delivery
model (SPI) is “immature” in one area but “capable” or “aware” in all others, the IAM
maturity model can help focus attention on the area most in need of attention.
Although the principles and purported benefits of established enterprise IAM practices
and processes are applicable to cloud services, they need to be adjusted to the cloud
environment. Broadly speaking, user management functions in the cloud can be
categorized as follows:
• Cloud identity administration
• Federation or SSO
• Authorization management
• Compliance management
One common architecture keeps the identity provider (IdP) within the enterprise,
building federation on top of the organization's existing IAM infrastructure. The
specific pros and cons of that approach are:
Pros
• They are consistent with internal policies, processes, and access management frameworks.
• They have direct oversight of the service-level agreement (SLA) and security of the IdP.
• They require only an incremental investment in enhancing the existing identity
architecture to support federation.
Cons
By not changing the infrastructure to support federation, new inefficiencies can result
due to the addition of life cycle management for non-employees such as customers.
Most organizations will likely continue to manage employee and long-term contractor
identities using organically developed IAM infrastructures and practices. But they seem
to prefer to outsource the management of partner and consumer identities to a trusted
cloud-based identity provider as a service partner.
Identity management-as-a-service
In this architecture, cloud services can delegate authentication to an identity
management-as-a-service (IDaaS) provider. In this model, organizations outsource the
federated identity management technology and user management processes to a
third-party service provider, such as Ping Identity or Symplified. When federating
identities to the cloud, organizations may need to manage the identity life cycle using
their IAM system and processes. However, the organization might benefit from an
outsourced multiprotocol federation gateway (identity federation service) if it has to
interface with many different partners and cloud service federation schemes. For
example, as of this writing, Salesforce.com supports SAML
1.1 and Google Apps supports SAML 2.0. Enterprises accessing Google Apps and
Salesforce.com may benefit from a multiprotocol federation gateway hosted by an
identity management CSP such as Symplified or TriCipher. In cases where
credentialing is difficult and costly, an enterprise might also outsource credential
issuance (and background investigations) to a service provider, such as the GSA
Managed Service Organization (MSO) that issues personal identity verification (PIV)
cards and, optionally, the certificates on the cards. The GSA MSO is offering the
USAccess management end-to-end solution as a shared service to federal civilian
agencies. In essence, this is a SaaS model for identity management, where the SaaS IdP
stores identities in a “trusted identity store” and acts as a proxy for the organization’s
users accessing cloud services, as illustrated in Figure 5-8.
The identity store in the cloud is kept in sync with the corporate directory through a
provider proprietary scheme (e.g., agents running on the customer’s premises
synchronizing a subset of an organization’s identity store to the identity store in the
cloud using SSL VPNs). Once the IdP is established in the cloud, the organization
should work with the CSP to delegate authentication to the cloud identity service
provider. The cloud IdP will authenticate the cloud users prior to them accessing any
cloud services (this is done via
browser SSO techniques that involve standard HTTP redirection techniques). Here are
the specific pros and cons of this approach:
Pros
Delegating certain authentication use cases to the cloud identity management service
hides the complexity of integrating with various CSPs supporting different federation
standards. Case in point: Salesforce.com and Google support delegated authentication
using SAML. However, as of this writing, they support two different versions of
SAML: Google Apps supports only SAML 2.0, and Salesforce.com supports only
SAML 1.1. Cloud-based identity management services that support both SAML
standards (multiprotocol federation gateways) can hide this integration complexity
from organizations adopting cloud services. Another benefit is that there is little need
for architectural changes to support this model. Once identity synchronization between
the organization directory or trusted system of record and the identity service directory
in the cloud is set up, users can sign on to cloud services using corporate identity,
credentials (both static and dynamic), and authentication policies.
Cons
When you rely on a third party for an identity management service, you may have less
visibility into the service, including implementation and architecture details. Hence, the
availability and authentication performance of cloud applications hinges on the
identity management service provider’s SLA, performance management, and
availability. It is important to understand the provider’s service level, architecture,
service redundancy, and performance guarantees of the identity management service
provider. Another drawback to this approach is that it may not be able to generate
custom reports to meet internal compliance requirements. In addition, identity attribute
management can also become complex when identity attributes are not properly defined
and associated with identities (e.g., definitions of attributes, both mandatory and
optional). New governance processes may be required to authorize various operations
(add/modify/remove attributes) to govern user attributes that move outside the
organization’s trust boundary. Identity attributes will change through the life cycle of
the identity itself and may get out of sync. Although both approaches enable the
identification and authentication of users to cloud services, various features and
integration nuances are specific to the service delivery model (SaaS, PaaS, and IaaS),
as we will discuss in the next section.