
Internship Report

Department of Computer Science


Visakha Institute of Engineering and Technology

CERTIFICATE
Certified that the seminar work entitled "CLOUD COMPUTING" is a bonafide work carried out in the eighth semester by "NITTA SATISH (21NT1A0556)" in partial fulfilment of the requirements for the award of the Bachelor of Technology in "Computer Science" from Visakha Institute of Engineering and Technology during the academic year 2021-2025. The seminar work was carried out under guidance, and no part of this work has been submitted earlier for the award of any degree.

ACKNOWLEDGEMENT

I take this opportunity to express my deepest gratitude to my team leader and project guide Mr. Narendra Kirola (Sr. Network Engineer) for his able guidance and support in this phase of transition from an academic to a professional life. His support and valuable inputs helped me immensely in completing this project.

I would also like to show my deep sense of gratitude to my team members Mr. Nitish Singh, Ms.
Payal Sharma, Ms. Richa Mishra and Mr. Deepak Kumar at Eroads Technology, Noida who
helped me in ways of encouragement, suggestions and technical inputs, thus contributing either
directly or indirectly to the various stages of the project.

I am also grateful to Navin Kumar (NOC, Eroads Technology) for providing me this great opportunity of industrial training at Eroads Technology.

I extend my heartiest thanks to Er. Dev Kant Sharma (HOD, Computer Science and Engineering, SPCET) for providing me the necessary help to undergo this industrial/project training at Eroads Technology, Noida.

And last, but not the least, I would like to thank the staff at Eroads Technology for being so cordial
and cooperative throughout the period of my training.

AKANSHA TYAGI
COMPUTER SCIENCE

Contents

1. Introduction
2. Cloud Computing
   2.1 Characteristics of Cloud Computing
3. Need for Cloud Computing
4. History
5. Enabling Technologies
   5.1 Cloud Computing Application Architecture
   5.2 Server Architecture
   5.3 Map Reduce
   5.4 Google File System
   5.5 Hadoop
6. Cloud Computing Services
   6.1 Amazon Web Services
   6.2 Google App Engine
7. Cloud Computing in the Real World
   7.1 Time Machine
   7.2 IBM Google University Academic Initiative
   7.3 SmugMug
   7.4 Nasdaq
8. Conclusion
9. References

ABSTRACT

Cloud computing is the delivery of computing as a service rather than a product, whereby
shared resources, software, and information are provided to computers and other devices as a
metered service over a network (typically the Internet).

Cloud computing provides computation, software, data access, and storage resources without
requiring cloud users to know the location and other details of the computing infrastructure.

End users access cloud-based applications through a web browser or a lightweight desktop or mobile app, while the business software and data are stored on servers at a remote location.
Cloud application providers strive to give the same or better service and performance as if the
software programs were installed locally on end-user computers.

At the foundation of cloud computing is the broader concept of infrastructure convergence (or
Converged Infrastructure) and shared services. This type of data center environment allows
enterprises to get their applications up and running faster, with easier manageability and less
maintenance, and enables IT to more rapidly adjust IT resources (such as servers, storage, and
networking) to meet fluctuating and unpredictable business demand.

CLOUD COMPUTING
This overview gives the basic concept, defines the terms used in the industry, and outlines the general
architecture and applications of Cloud computing. It gives a summary of Cloud Computing and provides a
good foundation for understanding. Keywords: Grid, Cloud, Computing

1. INTRODUCTION
"Cloud Computing", to put it simply, means "Internet Computing". The Internet is commonly visualized as clouds; hence the term "cloud computing" for computation done through the Internet. With Cloud Computing, users can access database resources via the Internet from anywhere, for as long as they need, without worrying about any maintenance or management of actual resources. Besides, databases in the cloud are very dynamic and scalable. Cloud computing is unlike grid computing, utility computing, or autonomic computing. In fact, it is a very independent platform in terms of computing. The best example of cloud computing is Google Apps, where any application can be accessed using a browser and it can be deployed on thousands of computers through the Internet.

1.1. WHAT IS CLOUD COMPUTING?

Cloud computing provides the facility to access shared resources and common infrastructure, offering services on demand over the network to perform operations that meet changing business needs. The location of the physical resources and devices being accessed is typically not known to the end user. It also provides facilities for users to develop, deploy and manage their applications 'on the cloud', which entails virtualization of resources that maintains and manages itself.
Some generic examples include:
• Amazon's Elastic Compute Cloud (EC2), offering computational services that enable people to use CPU cycles without buying more computers
• Storage services such as those provided by Amazon's Simple Storage Service (S3)
• Companies like Nirvana offering hosted storage and related services on demand

1.2. SOFTWARE AS A SERVICE(SAAS)


SaaS is a model of software deployment where an application is hosted as a service provided to customers across the Internet. SaaS is generally used to refer to business software rather than consumer software, which falls under Web 2.0. By removing the need to install and run an application on a user's own computer, it is seen as a way for businesses to get the same benefits as commercial software with a smaller cost outlay.

1.3. CLOUD STORAGE


Over time many big Internet-based companies (Amazon, Google, and others) have come to realise that only a small amount of their data storage capacity is being used. This has led to the renting out of space and the storage of information on remote servers, or "clouds".

Data Cloud: Along with services, the cloud will host data. There has been some discussion of this being a potentially useful notion, possibly aligned with the Semantic Web, though it could result in data becoming undifferentiated.

1.4. CLOUD COMPUTING ARCHITECTURE

Cloud computing architecture, just like any other system, is categorized into two main sections: Front End and Back End.
The Front End is the end user, client or any application (e.g. a web browser) which is using cloud services. The Back End is the network of servers with the computer programs and data storage systems. It is usually assumed that the cloud contains effectively unlimited storage capacity for any software available in the market.
The cloud has different applications that are hosted on their own dedicated server farms, and it has a centralized server administration system. The centralized server administers the system, balances client supply, adjusts demands, monitors traffic and avoids congestion. This server follows protocols commonly known as middleware. Middleware controls the communication among the networked cloud servers.
Cloud architecture runs on a very important assumption, which is mostly true: the demand for resources from clients is not always consistent. Because of this, the servers of the cloud are unable to run at their full capacity. To avoid this scenario, the server virtualization technique is applied. In server virtualization, all physical servers are virtualized and they run multiple virtual servers with either the same or different applications.

1.5. CHARACTERISTICS OF CLOUD COMPUTING

Cloud computing typically entails:

• High scalability
Cloud environments enable the servicing of business requirements for larger audiences through high scalability.
• Agility
The cloud works in a 'distributed mode' environment. It shares resources among users and tasks, while improving efficiency and agility (responsiveness).
• High availability and reliability
Availability of servers is high and more reliable, as the chances of infrastructure failure are minimal.
• Multi-sharing
With the cloud working in a distributed and shared mode, multiple users and applications can share the same underlying infrastructure, which improves utilization and reduces cost.
2. Cloud Computing
A definition for cloud computing can be given as an emerging computer paradigm where data
and services reside in massively scalable data centres in the cloud and can be accessed from any connected
devices over the internet. Cloud computing is a way of providing various services on virtual machines
allocated on top of a large physical machine pool which resides in the cloud. Cloud computing comes into
focus only when we think about what IT has always wanted – a way to increase capacity or add different
capabilities to the current setting on the fly without investing in new infrastructure, training new personnel
or licensing new software. Here 'on the fly' and 'without investing or training' become the key phrases in the current situation. Cloud computing offers exactly such a solution. We have lots of compute power and storage capability residing in the distributed environment of the cloud. What cloud computing does is harness the capabilities of these resources and make them available as a single entity which can be changed to meet the current needs of the user. The basis of cloud computing is to create a set of virtual
servers on the available vast resource pool and give it to the clients. Any web enabled device can be used to
access the resource through the virtual servers. Based on the computing needs of the client, the infrastructure
allotted to the client can be scaled up or down.
From a business point of view, cloud computing is a method to address the scalability and availability
concerns for large scale applications which involves lesser overhead. Since the resource allocated to the
client can be varied based on the needs of the client and can be done without any fuss, the overhead is very
low.
One of the key concepts of cloud computing is that processing of 1000 times the data need
not be 1000 times harder. As and when the amount of data increases, the cloud computing
services can be used to manage the load effectively and make the
processing tasks easier. In the era of enterprise servers and personal computers, cloud computing is basically an Internet-based network made up of large numbers of servers, mostly based on open standards, modular and inexpensive. Clouds contain vast amounts of information and provide a variety of services to large numbers of users.
As a metaphor for the Internet, "the cloud" is a familiar cliché, but when combined with "computing", the
meaning gets bigger and fuzzier. Some analysts and vendors define cloud computing narrowly as an
updated version of utility computing: basically virtual servers available over the Internet. Others go very
broad, arguing anything you consume outside the firewall is "in the cloud", including conventional
outsourcing.
Cloud computing comes into focus only when you think about what we always need: a way to
increase capacity or add capabilities on the fly without investing in new infrastructure, training new
personnel, or licensing new software. Cloud computing encompasses any subscription-based or pay-per-use service that, in real time over the Internet, extends ICT's existing capabilities.
Cloud computing is at an early stage, with a motley crew of providers large and small delivering a slew
of cloud-based services, from full-blown applications to storage services to spam filtering. Yes, utility-style
infrastructure providers are part of the mix, but so are SaaS (software as a service) providers such as
Salesforce.com. Today, for the most part, IT must plug into cloud-based services individually, but cloud
computing aggregators and integrators are already emerging. The Internet is often represented as a cloud and
the term “cloud computing” arises from that analogy.
Accenture defines cloud computing as the dynamic provisioning of IT capabilities (hardware, software, or services) from third parties over a network. McKinsey says that clouds are hardware-based services offering compute, network and storage capacity where: hardware management is highly abstracted from the buyer; buyers incur infrastructure costs as variable OPEX [operating expenditures]; and infrastructure capacity is highly elastic (up or down).

The cloud model differs from traditional outsourcing in that customers do not hand over their own IT resources to be managed. Instead they plug into the cloud, treating it as they would an internal data center or computer providing the same functions.

Large companies can afford to build and expand their own data centers but small- to medium-size
enterprises often choose to house their IT infrastructure in someone else’s facility. A collocation center is a
type of data center where multiple customers locate network, server and storage assets, and interconnect to a
variety of telecommunications and other network service providers with a minimum of cost and complexity.
A selection of companies in the collocation and cloud arena is presented in Table 1.

Amazon has a head start but well known companies such as Microsoft, Google, and Apple have joined the
fray.
Although not all the companies selected for Table 1 would agree on the definitions given in this article, it is
generally supposed that there are three basic types of cloud computing: Infrastructure as a Service (IaaS),
Platform as a Service (PaaS) and Software as a Service (SaaS). In IaaS, cpu, grids or clusters, virtualized
servers, memory, networks, storage and systems software are delivered as a service. Perhaps the best known
example is Amazon’s Elastic Compute Cloud (EC2) and Simple Storage
Service (S3), but traditional IT vendors such as IBM, and telecoms providers such as AT&T and Verizon are
also offering solutions. Services are typically charged by usage and can be scaled dynamically, i.e. capacity
can be increased or decreased more or less on demand.
PaaS provides virtualized servers on which users can run applications, or develop new ones, without having
to worry about maintaining the operating systems, server hardware, load balancing or computing capacity.
Well known examples include Microsoft’s Azure and Salesforce’s Force.com. Microsoft Azure provides
database and platform services starting at $0.12 per hour for compute infrastructure; $0.15 per gigabyte for
storage; and $0.10 per 10,000 transactions. For SQL Azure, a cloud database, Microsoft is charging $9.99
for a Web Edition, which comprises up to a 1 gigabyte relational database; and $99.99 for a Business
Edition, which holds up to a 10 gigabyte relational database. For .NET Services, a set of Web based
developer tools for building cloud-based applications, Microsoft is charging $0.15 per 100,000 message
operations.
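
Using the list prices quoted above, a rough monthly bill can be estimated with a few lines of Python; the usage figures in the sketch below are hypothetical and are not taken from the report.

    # Rough monthly cost estimate for a small deployment, using the Azure list
    # prices quoted above. The usage figures below are hypothetical.
    compute_hours = 2 * 24 * 30            # two small instances running all month
    storage_gb = 50                        # average data stored for the month
    transactions = 1_000_000               # storage transactions in the month

    cost = (compute_hours * 0.12               # $0.12 per compute hour
            + storage_gb * 0.15                # $0.15 per GB of storage
            + (transactions / 10_000) * 0.10)  # $0.10 per 10,000 transactions

    print("Estimated monthly cost: $%.2f" % cost)   # about $190 for these figures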

SaaS is software that is developed and hosted by the SaaS vendor and which the end user accesses over the
Internet. Unlike traditional applications that users install on their computers or servers, SaaS software is
owned by the vendor and runs on computers in the vendor’s data center (or a collocation facility). Broadly
speaking, all customers of a SaaS vendor use the same software: these are one-size-fits-all solutions. Well
known examples are Salesforce.com, Google’s Gmail and Apps, instant messaging from AOL, Yahoo and
Google, and Voice-over Internet Protocol (VoIP) from Vonage and Skype.
Pros and Cons of Cloud Computing

The great advantage of cloud computing is "elasticity": the ability to
add capacity or applications almost at a moment’s notice. Companies buy exactly the amount of storage,
computing power, security and other IT functions that they need from specialists in data-center computing.
They get sophisticated data center services on demand, in only the amount they need and can pay for, at
service levels set with the vendor, with capabilities that can be added or subtracted at will.
The metered cost, pay-as-you-go approach appeals to small- and medium-sized enterprises; little or no
capital investment and maintenance cost is needed. IT is remotely managed and maintained, typically for a
monthly fee, and the company can let go of “plumbing concerns”. Since the vendor has many customers, it
can lower the per-unit cost to each customer. Larger companies may find it easier to manage collaborations
in the cloud, rather than having to make holes in their firewalls for contract research organizations. SaaS
deployments usually take less time than in-house ones, upgrades are easier, and users are always using the
most recent version of the application. There may be fewer bugs because having only one version of the
software reduces complexity.
This may all sound very appealing but there are downsides. In the cloud you may not have the kind of
control over your data or the performance of your applications that you need, or the ability to audit or
change the processes and policies under which users must work. Different parts of an application might be
in many places in the cloud. Complying with federal regulations such as Sarbanes-Oxley, or with an FDA audit, is
extremely difficult. Monitoring and maintenance tools are immature. It is hard to get metrics out of the cloud
and general management of the work is not simple.
There are systems management tools for the cloud environment but they may not integrate with existing
system management tools, so you are likely to need two systems. Nevertheless, cloud computing may
provide enough benefits to compensate for the inconvenience of two tools.

Cloud customers may risk losing data by having them locked into proprietary formats and may lose control
of data because tools to see who is using them or who can view them are inadequate. Data loss is a real risk.
In October 2009, one million US users of the T-Mobile Sidekick mobile phone and emailing device lost data as
a result of server failure at Danger, a company recently acquired by Microsoft. Bear in mind, though, that it
is easy to underestimate risks associated with the current environment while overestimating the risk of a
new one. Cloud computing is not risky for every system. Potential users need to evaluate security measures
such as firewalls, and encryption techniques and make sure that they will have access to data and the
software or source code if the service provider goes out of business.
It may not be easy to tailor service-level agreements (SLAs) to the specific needs of a business.
Compensation for downtime may be inadequate and SLAs are unlikely to cover concomitant damages, but
not all applications have stringent uptime requirements. It is sensible to balance the cost of guaranteeing
internal uptime against the advantages of opting for the cloud. It could be that your own IT organization is
not as sophisticated as it might seem.
Calculating cost savings is also not straightforward. Having little or no capital investment may actually have
tax disadvantages. SaaS deployments are cheaper initially than in-house installations and future costs are
predictable; after 3-5 years of monthly fees, however, SaaS may prove more expensive overall.
Large instances of EC2 are fairly expensive, but it is important to do the mathematics correctly and make a
fair estimate of the cost of an “on-premises” (i.e., in-house) operation.
Standards are immature and things change very rapidly in the cloud. All IaaS and SaaS providers use
different technologies and different standards. The storage infrastructure behind Amazon is different from
that of the typical data center (e.g., big Unix file systems). The Azure storage engine does not use a standard
relational database; Google’s App Engine does not support an SQL database. So you cannot just move
applications to the cloud and expect them to run. At least as much work is involved in moving an application
to the cloud as is involved in moving it from an existing server to a new one. There is also the issue of
employee skills: staff may need retraining and they may resent a change to the cloud and fear job losses.
Last but not least, there are latency and performance issues. The Internet connection may add to latency or
limit bandwidth. (Latency, in general, is the period of time that one component in a system is wasting time
waiting for another component. In networking, it is the amount of time it takes a packet to travel from source
to destination.) In future, programming models exploiting multithreading may hide latency.
Nevertheless, the service provider, not the scientist, controls the hardware, so unanticipated sharing and
reallocation of machines may affect run times. Interoperability is limited. In general, SaaS solutions work
best for non-strategic, non-mission-critical processes that are simple and standard and not highly integrated
with other business systems. Customized applications may demand an in-house solution, but SaaS makes
sense for applications that have become commoditized, such as reservation systems in the travel industry.
Virtualization of computers or operating systems hides the physical characteristics of a computing platform
from users; instead it shows another abstract computing platform. A hypervisor is a piece of virtualization
software that allows multiple operating systems to run on a host computer concurrently. Virtualization
providers include VMware, Microsoft, and Citrix Systems (see Table 1). Virtualization is an enabler of cloud
computing.
Recently some vendors have described solutions that emulate cloud computing on private networks referring
to these as “private” or “internal” clouds (where “public” or “external” cloud describes cloud computing in
the traditional mainstream sense). Private cloud products claim to deliver some of the benefits of cloud computing while connecting customer data centers to those of external cloud providers. It has been reported that Eli Lilly wants to benefit
from both internal and external clouds and that Amylin is looking at private cloud VMware as a
complement to EC2. Other experts, however, are skeptical: one has even gone as far as to describe private
clouds as absolute rubbish.

Platform Computing has recently launched a cloud management system, Platform ISF, enabling customers to manage workload across both virtual and physical environments and support multiple hypervisors and operating systems from a single interface. VMware, the market leader in virtualization technology, is
moving into cloud technologies in a big way, with vSphere 4. The company is building a huge partner
network of service providers and is also releasing a “vCloud API”. VMware wants customers to build a
series of “virtual data centers”, each tailored to meet different requirements, and then have the ability to
move workloads in the virtual data centers to the infrastructure provided by cloud vendors.
Cisco, EMC and VMware have formed a new venture called Acadia. Its strategy for private cloud computing
is based on Cisco’s servers and networking, VMware’s server virtualization and EMC’s storage. (Note, by
the way, that EMC owns nearly 85% of VMware.) Other vendors, such as Google, disagree with VMware's emphasis on private clouds; in return VMware says Google's online applications are not ready for the enterprise.

Applicability
Not everyone agrees, but McKinsey has concluded as follows. “Clouds already make sense for many small
and medium-size businesses, but technical, operational and financial hurdles will need to be overcome
before clouds will be used extensively by large public and private enterprises. Rather than create
unrealizable expectations for “internal clouds”, CIOs should focus now on the immediate benefits of
virtualizing server storage, network operations, and other critical building blocks”.
They recommend that users should develop an overall strategy based on solid
business cases, not "cloud for the sake of cloud"; use modular design in all new software to minimize costs when it comes time to migrate to the cloud; and set up a Cloud CIO Council to advise industry.

Applications in the Pharmaceutical Industry

In the pharmaceutical sector, where large amounts of sensitive data are currently kept behind protective firewalls, security is a real concern, as is policing individual researchers' access to the cloud.
Nevertheless, cheminformatics vendors are starting to look at cloud options, especially in terms of Software
as a Service (SaaS) and hosted informatics. In bioinformatics and number-crunching, the cloud has distinct
advantages. EC2 billing is typically hours times number of cpus, so, as an overgeneralization, the cost for 1
cpu for 1000 hours is the same as the cost of 1000 cpus for 1 hour. This makes cloud computing appealing
for speedy answers to complex calculations. Over the past two years, new DNA sequencing technology has
emerged allowing a much more comprehensive view of biological systems at the genetic level. This so-
called next-generation sequencing has increased by orders of magnitude the already daunting deluge of
laboratory data, resulting in an immense IT challenge. Could the cloud provide a solution?
An unnamed pharmaceutical company found that processing BLAST databases and query jobs was time
consuming on its internal grid and approached Cycle Computing about running BLAST and other
applications in the cloud. After the customer had approved Cycle’s security model, Cycle built a processing
pipeline for BLAST that provides more than 7000 public databases from the National Center for
Biotechnology Information (NCBI), Ensembl, and the Information Sciences Institute of the University of
Southern California (ISI) that are updated weekly. The CycleCloud BLAST service is now publicly
available to all users.

2.1. Characteristics of Cloud Computing


1. Self Healing
Any application or any service running in a cloud computing environment has the property of self
healing. In case of failure of the application, there is always a hot backup of the application ready to take
over without disruption. There are multiple copies of the same application, each copy updating itself regularly, so that at times of failure there is at least one copy of the application which can take over without even the slightest change in its running state.
2. Multi-tenancy
With cloud computing, any application supports multi-tenancy, that is, multiple tenants at the same instant of time. The system allows several customers to share the infrastructure allotted to them without any of them
being aware of the sharing. This is done by virtualizing the
servers on the available machine pool and then allotting the servers to multiple users. This is done in such a
way that the privacy of the users or the security of their data is not compromised.
3. Linearly Scalable
Cloud computing services are linearly scalable. The system is able to break down the workloads into pieces and service them across the infrastructure. An exact idea of linear scalability can be obtained from the fact that if one server is able to process, say, 1000 transactions per second, then two servers can process about 2000 transactions per second.
4. Service-oriented
Cloud computing systems are all service oriented - i.e. the systems are such that they are created out of other discrete services. Many such discrete services are combined together to form this service. This allows re-use of the different services that are available and that are being created. Using the services that were just created, other such services can be created.
5. SLA Driven
Usually businesses have agreements on the amount of services. Scalability and availability issues cause
clients to break these agreements. But cloud computing services are SLA driven such that when the system
experiences peaks of load, it will automatically adjust itself so as to comply with the service-level
agreements.
The services will create additional instances of the applications on more servers so that the load can be
easily managed.
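
The SLA-driven behaviour described above can be pictured as a simple feedback loop. The sketch below is illustrative only: the latency threshold, the monitoring metric and the provisioning call are hypothetical placeholders, not any particular provider's mechanism.

    # Illustrative autoscaling loop for an SLA-driven service. The latency
    # threshold, the metric source and the provisioning call are placeholders.
    import time

    TARGET_LATENCY_MS = 200                 # response-time bound promised in the SLA
    MIN_INSTANCES, MAX_INSTANCES = 2, 20

    def observed_latency_ms():
        """Placeholder for a real monitoring metric (e.g. average response time)."""
        return 180

    def set_instance_count(n):
        """Placeholder for a real provisioning API call."""
        print("scaling to", n, "instances")

    def autoscale(current):
        latency = observed_latency_ms()
        if latency > TARGET_LATENCY_MS and current < MAX_INSTANCES:
            current += 1                    # add an instance to stay within the SLA
        elif latency < TARGET_LATENCY_MS / 2 and current > MIN_INSTANCES:
            current -= 1                    # release capacity when load drops
        set_instance_count(current)
        return current

    instances = MIN_INSTANCES
    for _ in range(3):                      # in practice this loop runs continuously
        instances = autoscale(instances)
        time.sleep(1)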

6. Virtualized
The applications in cloud computing are fully decoupled from the underlying hardware. The cloud
computing environment is a fully virtualized environment.

7. Flexible
Another feature of the cloud computing services is that they are flexible. They can be used to serve a large
variety of workload types varying from small loads of a small consumer application to very heavy loads of a
commercial application.

3. Need for cloud computing


What is cloud computing?
Cloud computing is Internet-based ("cloud") development and use of computer technology ("computing").
Cloud computing is a general term for anything that involves delivering hosted services over the Internet.
It is used to describe both a platform and a type of application.
Cloud computing also describes applications that are extended to be accessible through the Internet.
These cloud applications use large data centers and powerful servers that host Web applications and
Web services.
Anyone with a suitable Internet connection and a standard browser can access a cloud application.

4. History

The Cloud is a metaphor for the Internet, derived from its common depiction in network diagrams (or, more generally, of components which are managed by others) as a cloud outline. The term has a long history in telephony and has, in the past decade, been adopted as a metaphor for Internet-based services. The underlying concept dates back to 1960, when John McCarthy opined that "computation may someday be organized as a public utility"; indeed, it shares characteristics with the service bureaus which date back to the 1960s. The term "cloud" had already come into commercial use in the early 1990s to refer to large ATM networks. By the turn of the 21st century, the term "cloud computing" had started to appear, although most of the focus at this time was on Software as a Service (SaaS).
In 1999, Salesforce.com was established by Marc Benioff, Parker Harris and their colleagues. They applied many technologies of consumer web sites like Google and Yahoo! to business applications. They also provided the concept of "on demand" and "SaaS" with their real business and successful customers. The key for SaaS is being customizable by the customer alone or with a small amount of help. The flexibility and speed of application development were warmly welcomed and accepted by business users.
IBM extended these concepts in 2001, as detailed in the Autonomic Computing Manifesto -- which
described advanced automation techniques such as self-monitoring, self-healing, self-configuring, and self-
optimizing in the management of complex IT systems with heterogeneous storage, servers, applications,
networks, security mechanisms, and other system elements that can be virtualized across an enterprise.
Amazon.com played a key role in the development of cloud computing by modernizing their data centers after the dot-com bubble and, having found that the new cloud architecture resulted in significant internal efficiency improvements, providing access to their systems by way of Amazon Web Services on a utility computing basis in 2006.
2007 saw increased activity, with Google, IBM and a number of universities embarking on a large scale cloud computing research project, around the time the term started gaining popularity in the mainstream press. By mid-2008 it was a hot topic, and numerous cloud computing events had been scheduled.

5. Enabling Technologies

5.1 Cloud computing application architecture


[Figure: Cloud computing application architecture]
Cloud architecture, the systems architecture of the software systems involved in the delivery of cloud
computing, comprises hardware and software designed by a cloud architect who typically works for a cloud
integrator. It typically involves multiple cloud components communicating with each other over application
programming interfaces, usually web services.
This closely resembles the UNIX philosophy of having multiple programs doing one thing well and working
together over universal interfaces. Complexity is controlled and the resulting systems are more manageable
than their monolithic counterparts.
Cloud architecture extends to the client, where web browsers and/or software applications access cloud
applications.
Cloud storage architecture is loosely coupled, where metadata operations are centralized, enabling the data nodes to scale into the hundreds, each independently delivering data to applications or users.

5.2. Server Architecture

Cloud computing makes use of a large physical resource pool in the cloud. As said above, cloud
computing services and applications make use of virtual server instances built upon this
resource pool. There are two applications which help in managing the server instances,
the resources and also the management of the resources by these virtual server instances.
One of these is the Xen hypervisor which provides an abstraction layer between the
hardware and the virtual OS so that the
distribution of the resources and the processing is well managed. Another application that is widely used is the Enomalism server management system, which is used for management of the infrastructure platform.
When Xen is used for virtualization of the servers over the infrastructure, a thin software
layer known as the Xen hypervisor is inserted between the server's hardware and the
operating system. This provides an abstraction layer that allows each physical server to run
one or more "virtual servers," effectively decoupling the operating system and its
applications from the underlying physical server. The Xen hypervisor is a unique open
source technology, developed collaboratively by the Xen community and engineers at over
20 of the most innovative data center solution vendors, including AMD, Cisco, Dell, HP,
IBM, Intel, Mellanox, Network Appliance, Novell, Red Hat, SGI, Sun, Unisys, VERITAS,
Voltaire, and Citrix. Xen is licensed under the GNU General Public License (GPL2) and is
available at no charge in both source and object format. The Xen hypervisor is also
exceptionally lean-- less than 50,000 lines of code. That translates to extremely low
overhead and near-native performance for guests. Xen re-uses existing device drivers (both
closed and open source) from Linux, making device management easy. Moreover Xen is
robust to device driver failure and protects both guests and the hypervisor from faulty or
malicious drivers.
The Enomalism virtualized server management system is a complete virtual server infrastructure platform. Enomalism helps in effective management of the resources.
Enomalism can be used to tap into the cloud just as you would into a remote server. It
brings together all the features such as deployment planning, load balancing, resource
monitoring, etc. Enomalism is an open source application. It has a very simple and easy to
use web-based user interface. It has a modular architecture which allows for the creation of
additional system add-ons and plugins. It supports
one click deployment of distributed or replicated applications on a global basis. It supports the management of various virtual environments including KVM/QEMU, Amazon EC2, Xen, OpenVZ, Linux Containers and VirtualBox. It has fine grained user
permissions and access privileges.
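
As an illustration of how a management layer such as Enomalism might enumerate the virtual servers running on a Xen host, the sketch below uses the libvirt Python bindings. These bindings are not mentioned in the report; the connection URI is an assumption and libvirt must be installed on the host.

    # Illustrative: listing the virtual servers on a Xen host through the libvirt
    # Python bindings. The connection URI is an assumption.
    import libvirt

    conn = libvirt.open("xen:///system")        # connect to the local Xen hypervisor
    try:
        for dom in conn.listAllDomains():
            state, maxmem, mem, vcpus, cputime = dom.info()
            print("%-20s active=%s vcpus=%d mem=%d MB"
                  % (dom.name(), dom.isActive(), vcpus, mem // 1024))
    finally:
        conn.close()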

5.3. Map Reduce


Map Reduce is a software framework developed at Google in 2003 to support parallel
computations over large (multiple petabyte) data sets on clusters of commodity computers.
This framework is largely taken from ‘map’ and ‘reduce’ functions commonly used in
functional programming, although the actual semantics of the framework are not the same.
It is a programming model and an associated implementation for processing and generating
large data sets. Many of the real world tasks are expressible in this model. MapReduce
implementations have been written in C++, Java and other languages.
Programs written in this functional style are automatically parallelized and executed on the
cloud. The run-time system takes care of the details of partitioning the input data,
scheduling the program’s execution across a set of machines, handling machine failures,
and managing the required inter-machine communication. This allows programmers
without any experience with parallel and distributed systems to easily utilize the resources
of a largely distributed system.
The computation takes a set of input key/value pairs, and produces a set of output
key/value pairs. The user of the MapReduce library expresses the computation as two
functions: Map and Reduce.
Map, written by the user, takes an input pair and produces a set of intermediate key/value pairs. The MapReduce library groups together all intermediate values associated with the same intermediate key I and passes them to the Reduce function.
The Reduce function, also written by the user, accepts an intermediate key I and a set of values for that key. It merges together these values to form a possibly smaller set of values. Typically just zero or one output value is produced per Reduce invocation. The intermediate values are supplied to the user's reduce function via an iterator. This allows lists of values that are too large to fit in memory to be handled.
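
To make the model concrete, the following single-machine Python sketch simulates the classic word-count example. It is illustrative only and omits the parallel execution, partitioning and fault handling that a real MapReduce framework provides.

    from collections import defaultdict

    # Single-machine simulation of MapReduce word counting. A real framework runs
    # the same two functions in parallel on many nodes and handles partitioning,
    # shuffling and fault tolerance.

    def map_fn(doc_id, text):
        # emit an intermediate (key, value) pair for every word
        for word in text.split():
            yield word.lower(), 1

    def reduce_fn(word, counts):
        # merge all intermediate values associated with the same key
        yield word, sum(counts)

    def run_mapreduce(inputs):
        intermediate = defaultdict(list)
        for doc_id, text in inputs:                      # map phase
            for key, value in map_fn(doc_id, text):
                intermediate[key].append(value)          # group values by key
        results = {}
        for key, values in intermediate.items():         # reduce phase
            for out_key, out_value in reduce_fn(key, values):
                results[out_key] = out_value
        return results

    print(run_mapreduce([(1, "the cloud runs the cloud")]))
    # {'the': 2, 'cloud': 2, 'runs': 1}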

Map Reduce achieves reliability by parcelling out a number of operations on the set of data to each node in the network; each node is expected to report back periodically with completed work and status updates. If a node falls silent for longer than that interval, the master node records the node as dead, and sends out the node's assigned work to other nodes. Individual operations use atomic operations for naming file outputs as a double check to ensure that there are no parallel conflicting threads running; when files are renamed, it is possible to also copy them to another name in addition to the name of the task (allowing for side-effects).

5.4. Google File System
Google File System (GFS) is a scalable distributed file system developed by Google for data intensive
applications. It is designed to provide efficient, reliable access to data using large clusters of commodity
hardware. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers
high aggregate performance to a large number of clients.
Files are divided into chunks of 64 megabytes, which are only extremely rarely overwritten, or shrunk; files
are usually appended to or read. It is also designed and optimized to run on computing clusters, the nodes of
which consist of cheap, "commodity" computers, which means precautions must be taken against the high
failure rate of individual nodes and the subsequent data loss. Other design decisions select for high data
throughputs, even when it comes at the cost of latency.
The nodes are divided into two types: one Master node and a large number of Chunkservers. Chunkservers
store the data files, with each individual file broken up into fixed size chunks (hence the name) of about 64
megabytes, similar to clusters or sectors in regular file systems. Each chunk is assigned a unique 64-bit
label, and logical mappings of files to constituent chunks are maintained. Each chunk is replicated several
times throughout the network, with the minimum being three, but even more for files that have high demand
or need more redundancy.
The Master server doesn't usually store the actual chunks, but rather all the metadata associated with the
chunks, such as the tables mapping the 64-bit labels to chunk locations and the files they make up, the
locations of the copies of the chunks, what processes are reading or writing to a particular chunk, or taking a
"snapshot" of the chunk pursuant to replicating it (usually at the instigation of the Master server, when, due
to node failures, the number of copies of a chunk has fallen beneath the set number). All this metadata is
kept current by the Master server periodically receiving updates from each chunk server ("Heart-beat
messages").
Permissions for modifications are handled by a system of time-limited, expiring "leases", where the Master
server grants permission to a process for a finite period of time during which no other process will be
granted permission by the Master server to modify the chunk. The modified chunkserver, which is always
the primary chunk holder, then propagates the changes to the chunkservers with the backup copies. The
changes are not saved until all chunkservers acknowledge, thus guaranteeing the completion and atomicity
of the operation. Programs access the chunks by first querying the Master server for the locations of the
desired chunks; if the chunks are not being operated on (if there are no outstanding leases), the Master
replies with the locations, and the program then contacts and receives the data from the chunkserver directly.
As opposed to many filesystems, it's not implemented in the kernel of an Operating System but accessed
through a library to avoid overhead.
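
To make the read path described above concrete, here is an illustrative, single-process Python sketch. The class and method names are hypothetical; a real GFS client uses RPCs, leases, replica selection and retries.

    # Illustrative sketch of the GFS read path: ask the master for chunk locations,
    # then read the data directly from a chunkserver.

    CHUNK_SIZE = 64 * 1024 * 1024                    # 64 MB chunks

    class Master:
        def __init__(self, chunk_table):
            # (filename, chunk_index) -> list of chunkserver names holding replicas
            self.chunk_table = chunk_table

        def lookup(self, filename, chunk_index):
            return self.chunk_table[(filename, chunk_index)]

    class Chunkserver:
        def __init__(self, chunks):
            self.chunks = chunks                     # (filename, chunk_index) -> bytes

        def read(self, handle, offset, length):
            return self.chunks[handle][offset:offset + length]

    def gfs_read(master, chunkservers, filename, offset, length):
        chunk_index = offset // CHUNK_SIZE
        replicas = master.lookup(filename, chunk_index)      # metadata only
        server = chunkservers[replicas[0]]                   # pick any replica
        handle = (filename, chunk_index)
        return server.read(handle, offset % CHUNK_SIZE, length)  # data flows directly

    servers = {"cs-1": Chunkserver({("log.txt", 0): b"hello gfs"})}
    master = Master({("log.txt", 0): ["cs-1"]})
    print(gfs_read(master, servers, "log.txt", 0, 5))        # b'hello'
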
5.5. Hadoop
Hadoop is a framework for running applications on large clusters built of commodity
hardware. The Hadoop framework transparently provides applications both reliability
and data motion. Hadoop implements the computation paradigm named MapReduce
which was explained above. The application is divided into many small fragments of
work, each of which may be executed or re-executed on any node in the cluster. In
addition, it provides a distributed file system that stores data on the compute nodes,
providing very high aggregate bandwidth across the cluster. Both MapReduce and the
distributed file system are designed so that the node failures are automatically handled
by the framework. Hadoop has been implemented making use of Java. In Hadoop, the
combination of all the JAR files and classes needed to run a MapReduce program is
called a job. All of these components are themselves collected into a JAR which is
usually referred to as the job file. To execute a job, it is submitted to a jobTracker and
then executed.
Tasks in each phase are executed in a fault-tolerant manner. If node(s) fail in the middle of
a computation the tasks assigned to them are re-distributed among the remaining nodes.
Since we are using MapReduce, having many map and reduce tasks enables good load
balancing and allows failed tasks to be re-run with smaller runtime overhead.
The Hadoop MapReduce framework has master/slave architecture. It has a single master
server or a jobTracker and several slave servers or taskTrackers, one per node in the cluster.
The jobTracker is the point of interaction between the users and the framework. Users
submit jobs to the jobTracker, which puts them in a queue of pending jobs and executes
them on a first-come first-serve basis. The jobTracker manages the assignment of
MapReduce jobs to the taskTrackers. The taskTrackers execute tasks upon instruction
from the jobTracker and also handle data motion between the ‘map’ and ‘reduce’ phases of
the MapReduce job.
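
The jobTracker/taskTracker flow above can also be tried out without writing Java by using Hadoop Streaming, which accepts any executables that read stdin and write stdout as the mapper and reducer. The two scripts below are an illustrative word-count pair; the HDFS paths and the location of the streaming JAR depend on the installation and are placeholders here.

    #!/usr/bin/env python
    # mapper.py -- emits "word<TAB>1" for every word read from stdin.
    import sys

    for line in sys.stdin:
        for word in line.split():
            print("%s\t1" % word.lower())

    #!/usr/bin/env python
    # reducer.py -- sums the counts for each word. Hadoop Streaming delivers the
    # mapper output sorted by key, so all counts for a word arrive consecutively.
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word != current_word:
            if current_word is not None:
                print("%s\t%d" % (current_word, current_count))
            current_word, current_count = word, 0
        current_count += int(count)
    if current_word is not None:
        print("%s\t%d" % (current_word, current_count))

Such a pair would typically be submitted with the hadoop-streaming JAR that ships with Hadoop, along the lines of: hadoop jar hadoop-streaming.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input <hdfs input> -output <hdfs output>; the exact JAR path and the HDFS paths depend on the installation.
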
Hadoop is a framework which has received wide industry adoption. Hadoop is used along with other cloud computing technologies like the Amazon services so as to
make better use of the resources. There are many instances where Hadoop has been used.
Amazon makes use of Hadoop for processing millions of sessions which it uses for analytics, on a cluster which has about 1 to 100 nodes. Facebook uses Hadoop to store copies of internal logs and dimension data sources and uses it as a source for reporting/analytics and machine learning. The New York Times made use of Hadoop for
large scale image conversions. Yahoo uses Hadoop to support research for advertisement
systems and web searching tools.

6. Cloud Computing Services


Even though cloud computing is a pretty new technology, there are many companies
offering cloud computing services. Different companies like Amazon, Google, Yahoo,
IBM and Microsoft are all players in the cloud computing services industry. But Amazon
is the pioneer in the cloud computing industry with services like EC2 (Elastic Compute
Cloud) and S3 (Simple Storage Service) dominating the industry. Amazon has an
expertise in this industry and has a small advantage over the others because of this.
Microsoft has good knowledge of the fundamentals of cloud science and is building
massive data centers. IBM, the king of business computing and traditional
supercomputers, teams up with Google to get a foothold in the clouds. Google is far and
away the leader in cloud computing with the company itself built from the ground up on
hardware.

6.1. Amazon Web Services


The ‘Amazon Web Services’ is the set of cloud computing services offered by
Amazon. It involves four different services. They are Elastic Compute Cloud (EC2), Simple
Storage Service (S3), Simple Queue Service (SQS) and Simple Database Service (SDB).
1. Elastic Compute Cloud (EC2)
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in
the cloud. It is designed to make web-scale computing easier for developers.
It provides on-demand processing power.
Amazon EC2's simple web service interface allows you to obtain and configure capacity with minimal
friction. It provides you with complete control of your computing resources and lets you run on Amazon's
proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server
instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing
requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for
capacity that you actually use. Amazon EC2 provides developers the tools to build failure resilient
applications and isolate themselves from common failure scenarios.
Amazon EC2 presents a true virtual computing environment, allowing you to use
web service interfaces to requisition machines for use, load them with your custom
application environment, manage your network's access permissions, and run your
image using as many or few systems as you desire. To set up an Amazon EC2
node we have to create an EC2 node configuration which consists of all our
applications, libraries, data and associated configuration settings. This
configuration is then saved as an AMI
(Amazon Machine Image). There are also several stock instances of Amazon AMIs
available which can be customized and used. We can then start, terminate and
monitor as many instances of the AMI as needed. Amazon EC2 enables you to
increase or decrease capacity within minutes. You can commission one, hundreds
or even thousands of server instances simultaneously. Thus applications can automatically scale themselves up and down depending on their needs. You have root
access to each one, and you can interact with them as you would any machine. You
have the choice of several instance types, allowing you to select a configuration of
memory, CPU, and instance storage that is optimal for your application. Amazon
EC2 offers a highly reliable environment where replacement instances can be
rapidly and reliably commissioned. Amazon EC2 provides web service interfaces
to configure firewall settings that control network access to and between groups of
instances. You will be charged at the end of each month for your EC2 resources
actually consumed. So charging will be based on the actual usage of the resources.
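
As an illustration of the web service interface described above (not part of the original report), the following sketch launches and then terminates a single instance using the boto3 Python SDK. The AMI ID, key pair name and region are placeholders, and configured AWS credentials are assumed.

    # Launching and terminating an EC2 instance with the boto3 SDK.
    import boto3

    ec2 = boto3.resource("ec2", region_name="us-east-1")

    instances = ec2.create_instances(
        ImageId="ami-0123456789abcdef0",     # placeholder Amazon Machine Image
        InstanceType="t2.micro",
        KeyName="my-key-pair",               # placeholder key pair for SSH access
        MinCount=1,
        MaxCount=1,
    )

    instance = instances[0]
    instance.wait_until_running()            # capacity becomes available in minutes
    instance.reload()                        # refresh attributes such as the DNS name
    print("Launched", instance.id, "at", instance.public_dns_name)

    # Pay only for what is used: terminate the instance when it is no longer needed.
    instance.terminate()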

2. Simple Storage Service (S3)


S3, or Simple Storage Service, offers a cloud storage service. It offers services for the storage of data in the cloud and provides a high-availability, large-scale object store designed for interactive online use. S3 is storage for the Internet. It is designed to make web-scale
computing easier for developers. S3 provides a simple web services interface that can be used to store and
retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the
same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own
global network of web sites.
Amazon S3 allows write, read and delete of objects containing from 1 byte to 5
gigabytes of data each. The number of objects that you can store is unlimited. Each
object is stored in a bucket and retrieved via a unique developer-assigned key. A
bucket can be located anywhere in Europe or the Americas but can be accessed
from anywhere. Authentication mechanisms are provided to ensure that the data is
kept secure from unauthorized access. Objects can be made private or public, and
rights can be granted to specific users for particular objects. The S3 service also works on a pay-only-for-what-you-use method of payment.
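
A minimal sketch of the bucket/key model described above, using the boto3 Python SDK; the bucket name is a placeholder (bucket names are globally unique) and configured AWS credentials are assumed.

    # Storing and retrieving an object in S3 with the boto3 SDK.
    import boto3

    s3 = boto3.client("s3")
    bucket = "example-report-bucket"

    # Each object lives in a bucket and is retrieved via a developer-assigned key.
    s3.put_object(Bucket=bucket, Key="reports/summary.txt",
                  Body=b"cloud storage example")

    obj = s3.get_object(Bucket=bucket, Key="reports/summary.txt")
    print(obj["Body"].read().decode())

    s3.delete_object(Bucket=bucket, Key="reports/summary.txt")
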
3. Simple Queue Service (SQS)
Amazon Simple Queue Service (SQS) offers a reliable, highly scalable, hosted
queue for storing messages as they travel between computers. By using SQS,
developers can simply move data between distributed components of their
applications that perform different tasks, without losing
messages or requiring each component to be always available. Messages can be retained in a queue for up
to 4 days. It is simple, reliable, secure and scalable.
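
A minimal sketch of passing a message between two decoupled components through SQS, using the boto3 Python SDK; the queue name is a placeholder and configured AWS credentials are assumed.

    # Passing a message between two decoupled components through SQS with boto3.
    import boto3

    sqs = boto3.client("sqs")
    queue_url = sqs.create_queue(QueueName="example-work-queue")["QueueUrl"]

    # Producer: one component drops a unit of work into the queue...
    sqs.send_message(QueueUrl=queue_url, MessageBody="resize image 42")

    # ...consumer: another component picks it up later, without both components
    # having to be available at the same time.
    resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1,
                               WaitTimeSeconds=5)
    for msg in resp.get("Messages", []):
        print("received:", msg["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])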

4. Simple Database Service (SDB)

Amazon SimpleDB is a web service for running queries on structured data in real
time. This service works in close conjunction with the Amazon S3 and EC2,
collectively providing the ability to store, process and query data sets in the cloud.
These services are designed to make web-scale computing easier and more cost-effective for developers. Traditionally, this type of functionality is accomplished with
a clustered relational database, which requires a sizable upfront investment and
often requires a DBA to maintain and administer it.
Amazon SDB provides all these without the operational complexity. It requires no
schema, automatically indexes your data and provides a simple API for storage and
access. Developers gain access to the different functionalities from within the
Amazon’s proven computing environment and are able to scale instantly and need
to pay only for what they use.

6.2. Google App Engine


Google App Engine lets you run your web applications on Google's infrastructure. App
Engine applications are easy to build, easy to maintain, and easy to scale as your traffic
and data storage needs grow. You can serve your app using a free domain name on the
appspot.com domain, or use Google Apps to serve it from your own domain. You can share
your application with the world, or limit access to members of your organization. App
Engine costs nothing to get started. Sign up for a free account, and you can develop and
publish your application at no charge and with no obligation. A free account can use up to
500MB of persistent storage and enough
CPU and bandwidth for about 5 million page views a month. Google App Engine makes it easy to build an application that runs reliably, even under heavy load and with large amounts of data. The environment includes the following features:
• dynamic web serving, with full support for common web technologies
• persistent storage with queries, sorting and transactions
• automatic scaling and load balancing
• APIs for authenticating users and sending email using Google Accounts
• a fully featured local development environment that simulates Google App Engine on your computer
Google App Engine applications are implemented using the Python programming language.
The runtime environment includes the full Python language and most of the Python
standard library. Applications run in a secure environment that provides limited access to
the underlying operating system. These limitations allow App Engine to distribute web

requests for the application across multiple servers, and start and stop servers to meet
traffic demands.
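
As an illustration (not part of the original report), a minimal request handler for the first-generation Python runtime, using the webapp2 framework bundled with that runtime, might look like the sketch below; the URL routing in app.yaml is assumed and not shown.

    # Minimal request handler for the first-generation App Engine Python runtime,
    # using the webapp2 framework bundled with that runtime.
    import webapp2

    class MainPage(webapp2.RequestHandler):
        def get(self):
            self.response.headers["Content-Type"] = "text/plain"
            self.response.write("Hello from Google App Engine")

    app = webapp2.WSGIApplication([("/", MainPage)], debug=True)
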
App Engine includes a service API for integrating with Google Accounts. Your
application can allow a user to sign in with a Google account, and access the email
address and displayable name associated with the account. Using Google Accounts lets
the user start using your application faster, because the user may not need to create a
new account. It also saves you the effort of implementing a user account system just for
your application.
App Engine provides a variety of services that enable you to perform common operations
when managing your application. The following APIs are provided to access these services:
Applications can access resources on the Internet, such as web services or other data, using
App Engine's URL fetch service. Applications can send email messages using App Engine's
mail service. The mail service uses Google infrastructure to send email messages. The
Image service lets your application manipulate images. With this API, you can resize, crop,
rotate and flip images in JPEG and PNG formats.
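
A small sketch of two of these service APIs as exposed in the first-generation Python runtime; the sender and recipient addresses are placeholders, and the code only runs inside a deployed App Engine application.

    # Two of the service APIs mentioned above: URL fetch and mail.
    from google.appengine.api import mail, urlfetch

    def fetch_and_notify():
        # URL fetch service: access a resource elsewhere on the Internet.
        result = urlfetch.fetch("http://example.com/status")
        if result.status_code == 200:
            # Mail service: send email through Google infrastructure.
            mail.send_mail(sender="reports@example-app.appspotmail.com",
                           to="admin@example.com",
                           subject="Status check",
                           body=result.content[:500])
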
In theory, Google claims App Engine can scale nicely. But Google currently places a limit
of 5 million hits per month on each application. This limit nullifies App Engine's
scalability, because any small, dedicated server can have this performance. Google will
eventually allow webmasters to go beyond this limit (if they pay).

7. Cloud Computing in the Real World


7.1. Time Machine
TimesMachine is a New York Times project in which one can read any issue from Volume 1, Number 1 of The New York Daily Times, on September 18, 1851, through to The New York Times of December 30, 1922. They made it such that one can choose a date in history and flip electronically through the pages, displayed with their original look and feel.
Here's what they did. They scanned all their public domain articles from 1851 to 1922 into TIFF files, converted them into PDF files and put them online. Using 100 Linux computers, the job took about 24 hours. Then a coding error was discovered that required the job to be rerun. That's when their software team decided that the job of maintaining this much data was too much to do in-house.
So they made use of cloud computing services to do the work. All the content was put in the cloud, on Amazon. They used 100 Amazon EC2 instances and completed the whole job in less than 24 hours. They uploaded all the TIFF files into the cloud and wrote a Hadoop program that carried out the whole job. Using Amazon's EC2 computing platform, the Times ran a PDF conversion app that converted the 4 TB of TIFF data into 1.5 TB of PDF files, and the resulting PDFs were fully searchable. Both the image manipulation and the search capability were built on cloud computing services.
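The report does not include the Times' actual conversion code; purely as an illustration, a Hadoop Streaming mapper for such a TIFF-to-PDF job could look roughly like the sketch below, assuming each input line names one TIFF file reachable on the worker node and that a converter such as libtiff's tiff2pdf is installed there.

    #!/usr/bin/env python
    # Hypothetical Hadoop Streaming mapper: each input line names one scanned
    # TIFF file; the mapper shells out to libtiff's tiff2pdf and reports the
    # outcome as "input<TAB>output<TAB>status" for the job logs.
    import os
    import subprocess
    import sys

    for line in sys.stdin:
        tiff_path = line.strip()
        if not tiff_path:
            continue
        pdf_path = os.path.splitext(tiff_path)[0] + '.pdf'
        returncode = subprocess.call(['tiff2pdf', '-o', pdf_path, tiff_path])
        status = 'ok' if returncode == 0 else 'failed'
        sys.stdout.write('%s\t%s\t%s\n' % (tiff_path, pdf_path, status))

Because each page converts independently, Hadoop can spread such a mapper across as many EC2 instances as the budget allows, which is what makes the under-24-hours turnaround plausible.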
7.2. IBM Google University Academic Initiative
Google and IBM came up with an initiative to advance large-scale distributed computing by providing hardware, software, and services to universities. The idea was to prepare students "to harness the potential of modern computing systems": the companies provide universities with hardware, software, and services to advance training in large-scale distributed computing. The two companies aim to reduce the cost of distributed computing research, thereby enabling academic institutions and their students to more easily contribute to this emerging computing paradigm. "In order to most effectively serve the long-term interests of our users, it is imperative that students are adequately equipped to harness the potential of modern computing systems and for researchers to be able to innovate ways to address emerging problems," Eric Schmidt, CEO of Google, said in a statement.
The first university to join the initiative is the University of Washington.
Carnegie-Mellon University, MIT, Stanford University, the University of California at
Berkeley, and the University of Maryland are also participating in the program. As part of
the initiative, Google and IBM are providing a cluster of several hundred computers --
Google's custom servers and IBM BladeCenter and System x servers. Over time, the
companies expect the cluster to surpass 1,600 processors. The Linux-based servers will run
open source software including Xen's virtualization system and Hadoop, an open source
implementation of Google's distributed file system that's managed by the Apache Software
Foundation.
Students working with the cluster will have access to Creative Commons-licensed course material developed with the University of Washington.

7.3. SmugMug
SmugMug is an online photo hosting application which is fully based on cloud computing services. They don't own any hard drives; all of their storage sits in Amazon S3.

7.4. Nasdaq
NASDAQ, which has large amounts of stock and fund data, wanted to make extra revenue by selling historical data for those stocks and funds. But for this offering, called Market Replay, the company didn't want to worry about optimizing its databases and servers to handle the new load. So it turned to Amazon's S3 service to host the data, and created a lightweight reader app that lets users pull in the required data. The traditional approach wouldn't have gotten off the ground economically. NASDAQ took its market data and created flat files for every entity, each holding enough data for a 10-minute replay of the stock's or fund's price changes, on a second-by-second basis. It adds 100,000 files per day to the several million it started with.
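As a rough sketch of the reader side, the snippet below pulls one such flat file from S3 using the boto library; the bucket and key names are invented for illustration, since the real Market Replay naming scheme is not described here.

    import boto

    # Bucket and key names are hypothetical; credentials are read from the
    # AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables.
    conn = boto.connect_s3()
    bucket = conn.get_bucket('market-replay-data')

    # One flat file = one 10-minute, second-by-second replay window
    key = bucket.get_key('AAPL/2008-06-12/14-30.csv')
    key.get_contents_to_filename('/tmp/replay.csv')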
Key vendors at the different layers of the cloud stack include:
Infrastructure (Cisco Systems)
Computer software (3tera, Hadoop, IBM, RightScale)
Operating systems (Solaris, AIX, Linux including Red Hat)
Platform virtualization (Citrix, Microsoft, VMware, Sun xVM, IBM)
Types of services:

These services are broadly divided into three categories:


Infrastructure-as-a-Service (IaaS)
Platform-as-a-Service (PaaS)
Software-as-a-Service (SaaS)

Infrastructure-as-a-Service (IaaS):
Infrastructure-as-a-Service(IaaS) like Amazon Web Services provides virtual servers with unique IP
addresses and blocks of storage on demand. Customers benefit from an API from which they can control
their servers. Because customers can pay for exactly the amount of service they use, like for electricity
or water, this service is also called utility computing.
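To show what "controlling servers through an API" can look like in practice, here is a minimal sketch using the boto library against Amazon EC2; the region, AMI id and key pair are placeholders, and the point is simply that capacity is acquired and released by API call rather than by purchase order.

    import boto.ec2

    # Region, AMI id and key pair are placeholders; the instance is billed
    # only for the time it actually runs.
    conn = boto.ec2.connect_to_region('us-east-1')
    reservation = conn.run_instances('ami-12345678',
                                     instance_type='m1.small',
                                     key_name='my-keypair')
    server = reservation.instances[0]

    # ... use the server, paying only while it runs ...
    conn.terminate_instances(instance_ids=[server.id])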
Platform-as-a-Service (PaaS):
Platform-as-a-Service(PaaS) is a set of software and development tools hosted on the provider's servers.
Developers can create applications using the provider's APIs. Google Apps is one of the most famous
Platform-as-a-Service providers. Developers should take notice that there aren't any interoperability
standards (yet), so some providers may not allow you to take your application and put it on another
platform.
Software-as-a-Service (SaaS):
Software-as-a-Service (SaaS) is the broadest market. In this case the provider allows the customer only to use its applications. The software interacts with the user through a user interface. These applications can be anything from web-based email to applications like Twitter.

Types by visibility:

Public cloud:
Public cloud or external cloud describes cloud computing in the traditional mainstream sense,
whereby resources are dynamically provisioned on a fine-grained, self-service basis over the
Internet, via web applications/web services, from an off-site third-party provider who shares
resources and bills on a fine-grained utility computing basis.
Hybrid cloud:
A hybrid cloud environment consisting of multiple internal and/or external providers "will be typical for most enterprises". A hybrid cloud can describe a configuration combining a local device, such as a Plug computer, with cloud services. It can also describe configurations combining virtual and physical, colocated assets - for example, a mostly virtualized environment that requires physical servers, routers, or other hardware such as a network appliance acting as a firewall or spam filter.
Private cloud:
Private cloud and internal cloud are neologisms that some vendors have recently used to describe offerings that emulate cloud computing on private networks. These (typically virtualisation automation) products claim to "deliver some benefits of cloud computing without the pitfalls", capitalising on data security, corporate governance, and reliability concerns. They have been criticized on the basis that users "still have to buy, build, and manage them" and as such do not benefit from lower up-front capital costs and less hands-on management, essentially "[lacking] the economic model that makes cloud computing such an intriguing concept".
While an analyst predicted in 2008 that private cloud networks would be the future of corporate IT,
there is some uncertainty whether they are a reality even within the same firm. Analysts also claim that
within five years a "huge percentage" of small and medium enterprises will get most of their computing
resources from external cloud computing providers as they "will not have economies of scale to make it
worth staying in the IT business" or be able to afford private clouds. Analysts have reported on
Platform's view that private clouds are a stepping stone to external clouds, particularly for the financial
services, and that future datacenters will look like internal clouds.
The term has also been used in the logical rather than physical sense, for example in reference to
platform as a service offerings, though such offerings including Microsoft's Azure
Services Platform are not available for on-premises deployment.

How does cloud computing work?


Supercomputers today are used mainly by the military, government intelligence agencies, universities
and research labs, and large companies to tackle enormously complex calculations for such tasks as
simulating nuclear explosions, predicting climate change, designing airplanes, and analyzing which
proteins in the body are likely to bind with potential new drugs. Cloud computing aims to apply that
kind of power—measured in the tens of trillions of computations per second—to problems like
analyzing risk in financial portfolios, delivering personalized medical information, even powering
immersive computer games, in a way that users can tap through the Web. It does that by networking
large groups of servers that often use low-cost consumer PC technology, with specialized connections
to spread data-processing chores across them. By contrast, the newest and most powerful desktop PCs
process only about 3 billion computations a second. Let's say you're an executive at a large corporation. Your particular responsibilities include making sure that all of your employees have the right hardware and software they need to do their jobs. Buying computers for everyone isn't enough -- you also have to purchase software or software licenses to give employees the tools they require. Whenever you have a new hire, you have to buy more software or make sure your current software license allows another user. Keeping up with all of this quickly becomes stressful and expensive.
A typical cloud computing system
Soon, there may be an alternative for executives like you. Instead of installing a suite of
software for each computer, you'd only have to load one application. That application would allow
workers to log into a Web-based service which hosts all the programs the user would need for his or
her job. Remote machines owned by another company would run everything from e-mail to word
processing to complex data analysis programs. It's called cloud computing, and it could change the
entire computer industry.
In a cloud computing system, there's a significant workload shift. Local computers no longer have to do
all the heavy lifting when it comes to running applications. The network of computers that make up the
cloud handles them instead. Hardware and software demands on the user's side decrease. The only thing
the user's computer needs to be able to run is the cloud computing system's interface software, which can
be as simple as a Web browser, and the cloud's network takes care of the rest.
There's a good chance you've already used some form of cloud computing. If you have an e-mail
account with a Web-based e-mail service like Hotmail, Yahoo! Mail or Gmail, then you've had some
experience with cloud computing. Instead of running an e-mail program on your computer, you log in
to a Web e-mail account remotely. The software and storage for your account doesn't exist on your
computer -- it's on the service's computer cloud.
SPECIFIC CHARACTERISTICS / CAPABILITIES OF CLOUDS
Since “clouds” do not refer to a specific technology, but to a general provisioning paradigm with enhanced capabilities, it is necessary to elaborate on these aspects. There is currently a strong tendency to regard clouds as “just a new name for an old idea”, which is mostly due to confusion between the cloud concept and the strongly related P/I/SaaS paradigms (see also II.A.2), but also due to the fact that similar aspects have already been addressed without the dedicated term “cloud” attached to them (see also II).
This section specifies the concrete capabilities associated with clouds that are considered essential
(required in any cloud environment) and relevant (ideally supported, but may be restricted to specific use
cases). We can thereby distinguish non-functional, economic and technological capabilities addressed,
respectively to be addressed by cloud systems.
Non-functional aspects represent qualities or properties of a system, rather than specific technological requirements. Implicitly, they can be realised in multiple fashions and interpreted in different ways, which typically leads to strong compatibility and interoperability issues between individual providers, as each pursues its own approach to realising its respective requirements. Non-functional aspects are one of the key reasons why “clouds” differ so strongly in their
interpretation (see also II.B).
Economic considerations are one of the key reasons to introduce cloud systems in a business environment
in the first instance. The particular interest typically lies in the reduction of cost and effort through
outsourcing and / or automation of essential resource management. As has been noted in the first section,
relevant aspects thereby to consider relate to the cut-off between loss of control and reduction of effort. With
respect to hosting private clouds, the gain through cost reduction has to be carefully balanced with the
increased effort to build and run such a system. Obviously, technological challenges implicitly arise from
the non-functional and economical aspects, when trying to realize them. As opposed to these aspects,
technological challenges typically imply a specific realization – even though there may be no standard
approach as yet and deviations may hence arise. In addition to these implicit challenges, one can identify
additional technological aspects to be addressed by cloud system, partially as a pre-condition to realize some
of the high level features, but partially also as they directly relate to specific characteristics of cloud systems.
27
1. NON-FUNCTIONAL ASPECTS
The most important non-functional aspects are:
 Elasticity is an essential core feature of cloud systems and describes the capability of the underlying infrastructure to adapt to changing, potentially non-functional requirements, for example the amount and size of data supported by an application, the number of concurrent users, etc. One can distinguish between horizontal and vertical scalability, whereby horizontal scalability refers to the number of instances needed to satisfy, for example, a changing amount of requests, and vertical scalability refers to the size of the instances themselves and thus implicitly to the amount of resources required to maintain that size. Cloud scalability involves both (rapid) up- and down-scaling.
Elasticity goes one step further, though, and also allows the dynamic integration and extraction of physical resources to and from the infrastructure. Whilst from the application perspective this is identical to scaling, from the middleware management perspective it poses additional requirements, in particular regarding reliability. In general, it is assumed that changes in the resource infrastructure are announced first to the middleware manager, but with large-scale systems it is vital that such changes can be handled automatically.
 Reliability is essential for all cloud systems – in order to support today’s data centre-type applications in a
cloud, reliability is considered one of the main features to exploit cloud capabilities.
Reliability denotes the capability to ensure constant operation of the system without disruption, i.e. no
loss of data, no code reset during execution etc. Reliability is typically achieved through redundant
resource utilisation. Interestingly, many of the reliability aspects move from a hardware to a software-
based solution. (Redundancy in the file systems vs. RAID controllers, stateless front end servers vs. UPS,
etc.).
Notably, there is a strong relationship between availability (see below) and reliability – however, reliability
focuses in particular on prevention of loss (of data or execution progress).
 Quality of Service support is a relevant capability that is essential in many use cases where specific
requirements have to be met by the outsourced services and / or resources. In business cases, basic QoS
metrics like response time, throughput etc. must be guaranteed at least, so as to ensure that the quality
guarantees of the cloud user are met. Reliability is a particular QoS aspect which forms a specific quality
requirement.
 Agility and adaptability are essential features of cloud systems that strongly relate to the elastic
capabilities. It includes on-time reaction to changes in the amount of requests and size of resources, but
also adaptation to changes in the environmental conditions that e.g. require different types of resources,
different quality or different routes, etc. Implicitly, agility and adaptability require resources (or at least
their management) to be autonomic and have to enable them to provide self-* capabilities.

 Availability of services and data is an essential capability of cloud systems and was actually one of the
core aspects to give rise to clouds in the first instance. It lies in the ability to introduce redundancy for
services and data so failures can be masked transparently. Fault tolerance also requires the ability to
introduce new redundancy (e.g. previously failed or fresh nodes) in an online manner non-intrusively
(without a significant performance penalty).
With increasing concurrent access, availability is particularly achieved through replication of data / services
and distributing them across different resources to achieve load-balancing. This can be regarded as the
original essence of scalability in cloud systems.
2. ECONOMIC ASPECTS
In order to allow for economic considerations, cloud systems should help in realising the following aspects:
 Cost reduction is one of the first concerns when building up a cloud system that can adapt to changing consumer behaviour and reduce the cost of infrastructure maintenance and acquisition. Scalability and pay per use are essential aspects of this issue. Notably, setting up a cloud system typically entails additional costs - be it by adapting the business logic to the cloud host's specific interfaces or by enhancing the local infrastructure to be "cloud-ready". See also return of investment below.
 Pay per use. The capability to bill according to the actual consumption of resources is a relevant feature of cloud systems. Pay per use strongly relates to quality of service support, where specific requirements to be met by the system, and hence to be paid for, can be specified. One of the key economic drivers for the current level of interest in cloud computing is the structural change in this domain: by moving from the usual upfront capital investment model to an operational expense, cloud computing promises to enable especially SMEs and entrepreneurs to accelerate the development and adoption of innovative solutions.
 Improved time to market is essential in particular for small and medium enterprises that want to sell their services quickly and easily, with little delay caused by acquiring and setting up the infrastructure, in particular at a scope compatible and competitive with larger industries. Larger enterprises need to be able to publish new capabilities with little overhead to remain competitive. Clouds can support this by providing infrastructures, potentially dedicated to specific use cases, that take over essential capabilities to support easy provisioning and thus reduce time to market.
 Return of investment (ROI) is essential for all investors and cannot always be guaranteed - in fact some cloud systems currently fail this aspect. Employing a cloud system must ensure that the cost and effort invested into it is outweighed by its benefits to be commercially viable - this may entail direct (e.g. more customers) and indirect (e.g. benefits from advertisements) ROI. Outsourcing resources versus increasing the local infrastructure and employing (private) cloud technologies therefore need to be weighed against each other and critical cut-off points identified.
 Turning CAPEX into OPEX is an implicit, and much argued, characteristic of cloud systems, as the actual cost benefit (cf. ROI) is not always clear (see e.g. [9]). Capital expenditure (CAPEX) is required to build up a local infrastructure; by outsourcing computational resources to cloud systems and paying according to operational need, this spending shifts to operational expenditure (OPEX).

 “Going Green” is relevant not only to reduce the additional costs of energy consumption, but also to reduce the carbon footprint. Whilst carbon emission by individual machines can be estimated quite well, this information is rarely taken into consideration when scaling systems up.

Clouds principally allow reducing the consumption of unused resources (down-scaling). In addition, up-
scaling should be carefully balanced not only with cost, but also carbon emission issues. Note that
beyond software stack aspects, plenty of Green IT issues are subject to development on the hardware
level.
3. TECHNOLOGICAL ASPECTS
The main technological challenges that can be identified and that are commonly associated with cloud
systems are:
 Virtualisation is an essential technological characteristic of clouds which hides the technological
complexity from the user and enables enhanced flexibility (through aggregation, routing and translation).
More concretely, virtualisation supports the following features:
Ease of use: through hiding the complexity of the infrastructure (including management, configuration
etc.) virtualisation can make it easier for the user to develop new applications, as well as reduces the
overhead for controlling the system.
Infrastructure independency: in principle, virtualisation allows for higher interoperability by making the
code platform independent.
Flexibility and Adaptability: by exposing a virtual execution environment, the underlying infrastructure can change more flexibly according to different conditions and requirements (assigning more resources, etc.).
Location independence: services can be accessed independent of the physical location of the user and the
resource.
 Multi-tenancy is a highly essential issue in cloud systems, where the location of code and / or data is principally unknown and the same resource may be assigned to multiple users (potentially at the same time). This affects infrastructure resources as well as data / applications / services that are hosted on shared resources but need to be made available in multiple isolated instances. Classically, all information is maintained in separate databases or tables, yet in more complicated cases information may be concurrently altered, even though it is maintained for isolated tenants. Multi-tenancy implies a range of potential issues, from data protection to legislative concerns (see section III).
 Security, Privacy and Compliance is obviously essential in all systems dealing with potentially sensitive
data and code.
 Data Management is an essential aspect in particular for storage clouds, where data is flexibly distributed across multiple resources. Implicitly, data consistency needs to be maintained over a wide distribution of replicated data sources. At the same time, the system always needs to be aware of the data location (when replicating across data centres), taking latencies and particularly workload into consideration. As the size of data may change at any time, data management addresses both horizontal and vertical aspects of scalability. Another crucial aspect of data management is the consistency guarantees provided (eventual vs. strong consistency, transactional isolation vs. no isolation, atomic operations over individual data items vs. multiple data items, etc.).
 APIs and / or programming enhancements are essential to exploit the cloud features: common programming models require that the developer takes care of the scalability and autonomic capabilities him- / herself, whilst a cloud environment provides these features in a fashion that allows the user to leave such management to the system.
 Metering of any kind of resource and service consumption is essential in order to offer elastic pricing, charging and billing. It is therefore a pre-condition for the elasticity of clouds.
 Tools are generally necessary to support development, adaptation and usage of cloud services.
C. RELATED AREAS
It has been noted that the cloud concept is strongly related to many other initiatives in the area of the "Future Internet", such as Software as a Service and Service Oriented Architecture. New concepts and terminologies often bear the risk that they seemingly supersede preceding work and thus require a "fresh start", in which much of the existing results is lost and essential work is repeated unnecessarily. In order to reduce this risk, this section provides a quick summary of the main related areas and their potential impact on further cloud developments.
1. INTERNET OF SERVICES
Service based application provisioning is part of the Future Internet as such and therefore a similar statement
applies to cloud and Internet of Services as to cloud and Future Internet. Whilst the cloud concept foresees
essential support for service provisioning (making them scalable, providing a simple API for development
etc.), its main focus does not primarily rest on service provisioning. As detailed in section II.A.1 cloud
systems are particularly concerned with providing an infrastructure on which any type of service can be
executed with enhanced features.
Clouds can therefore be regarded as an enabler for enhanced features of large scale service provisioning.
Much research was vested into providing base capabilities for service provisioning – accordingly,
capabilities that overlap with cloud system features can be easily exploited for cloud infrastructures.

2. INTERNET OF THINGS


It is up for debate whether the Internet of Things is related to cloud systems at all: whilst the Internet of Things will certainly have to deal with issues related to elasticity, reliability, data management, etc., there is an implicit assumption that resources in cloud computing are of a type that can host and / or process data - in particular storage and processors that can form a computational unit (a virtual processing platform).
However, specialised clouds may e.g. integrate dedicated sensors to provide enhanced capabilities and
the issues related to reliability of data streams etc. are principally independent of the type of data source.
Though sensors as yet do not pose essential scalability issues, metering of resources will already require
some degree of sensor information integration into the cloud.
Clouds may furthermore offer vital support to the internet of things, in order to deal with a flexible amount
of data originating from the diversity of sensors and “things”. Similarly, cloud concepts for scalability and
elasticity may be of interest for the internet of things in order to better cope with dynamically scaling data
streams.
Overall, the Internet of Things may profit from cloud systems, but there is no direct relationship between the two areas. There are, however, contact points that should not be disregarded: data management and the interfaces between sensors and cloud systems show commonalities.
3. THE GRID
There is an on-going confusion about the relationship between Grids and Clouds [17], sometimes seeing Grids as "on top of" Clouds, vice versa, or even as identical. More surprisingly, even elaborate comparisons (such as [18][19][20]) still have different views on what "the Grid" is in the first instance, thus making the comparison cumbersome. Indeed, most ambiguities can be quickly resolved if the underlying concept of Grids is examined first: just like Clouds, Grid is primarily a concept rather than a technology, thus leading to many potential misunderstandings between individual communities.
With respect to research carried out in the Grid over the last years, it is therefore advisable to distinguish (at least) between (1) "Resource Grids", including in particular Grid Computing, and (2) "eBusiness Grids", which centre mainly on distributed Virtual Organizations and are more closely related to Service Oriented Architectures (see below). Note that there may be combinations of the two, e.g. when capabilities of eBusiness Grids are applied to commercial resource provisioning, but this has little impact on the assessment below.
Resource Grids try to make resources - such as computational devices and storage - locally available in a fashion that is transparent to the user. The main focus thereby lies on availability rather than scalability, in particular rather than dynamic scalability. In this context we may have to distinguish between HPC Grids, such as EGEE, which select and provide access to (single) HPC resources, and distributed computing Grids (cf. Service Oriented Architecture below), which also include P2P-like scalability - in other words, the more resources are available, the more code instances are deployed and executed. Replication capabilities may be applied to ensure reliability, though this is not an intrinsic capability of computational Grids in particular. Even though such Grid middleware offers manageability interfaces, it typically acts on a layer on top of the actual resources and thus rarely virtualises the hardware itself, but rather the computing resource as a whole (i.e. not on the IaaS level).
Overall, Resource Grids do address similar issues to cloud systems, yet typically on a different layer and with a different focus - Grids, for example, generally do not cater for horizontal and vertical elasticity. More important, though, is the strong conceptual overlap between the issues addressed by Grids and Clouds, which allows re-use of concepts and architectures, but also of parts of the technology (see also SOA below).
Specific shared concepts:
• Virtualisation of computation resources, respectively of hardware
• Scalability of amount of resources versus of hardware, code and data
• Reliability through replication and check-pointing
• Interoperability
• Security and Authentication
eBusiness Grids share the essential goals with Service Oriented Architecture, though the specific focus
rests on integration of existing services so as to build up new functionalities, and to enhance these
services with business specific capabilities. The eBusiness (or here “Virtual Organization”) approach
derives in particular from the distributed computing aspect of Grids, where parts of the overall logic are located at different sites. The typical Grid middleware thereby focuses mostly on achieving reliability in the overall execution through on-the-fly replacement and (re)integration. But eBusiness Grids also
explore the specific requirements for commercial employment of service consumption and provisioning -
even though this is generally considered an aspect more related to Service Oriented Architectures than to
Grids.

Again, eBusiness Grids and Cloud Systems share common concepts and thus basic technological
approaches. In particular with the underlying SOA based structure, capabilities may be exposed and
integrated as stand-alone services, thus supporting the re-use aspect.
Specific shared concepts:
• Pay-per-use / Payment models
• Quality of Service
• Metering
• Availability through self-management
It is worth noting that the comparison here is with deployed Grids. The original Grids concept had a
vision of elasticity, virtualization and accessibility [48] [49] not unlike that claimed for the Clouds vision.

4. SERVICE ORIENTED ARCHITECTURES


There is a strong relationship between the “Grid” and Service Oriented Architectures, often leading to
confusions where the two terms either are used indistinguishably, or the one as building on top of the
other. This arises mostly from the fact that both concepts tend to cover a comparatively wide scope of
issues, i.e. the term being used a bit ambiguously.
Service Oriented Architecture, however, typically focuses predominantly on ways of developing, publishing and integrating application logic and / or resources as services. Aspects related to enhancing the provisioning model, e.g. through secure communication channels or QoS-guaranteed maintenance of services, are secondary in this definition. Again it must be stressed that the aspects of eBusiness Grids and SOA are used almost interchangeably - in particular since the advent of Web Service technologies such
and SOA are used almost interchangeably - in particular since the advent of Web Service technologies such
as the .NET Framework and Globus Toolkit 4, where GT4 is typically regarded as Grid related and .NET as
a Web Service / SOA framework (even though they share the same main capabilities).
Though providing cloud-hosted applications as a service is an implicit aspect of Cloud SaaS provisioning, the cloud concept is principally technology agnostic; it is, however, generally recommended to build on service-
oriented principles. However, in particular with the resource virtualization aspect of cloud systems, most
technological aspects will have to be addressed at a lower level than the service layer.
Service Oriented Architectures are therefore of primary interest for the type of applications and services provided on top of cloud systems.
SEVEN TECHNICAL SECURITY BENEFITS OF THE CLOUD:
1. CENTRALIZED DATA:
Reduced Data Leakage: this is the benefit I hear most from Cloud providers - and in my view
they are right. How many laptops do we need to lose before we get this? How many backup
tapes? The data “landmines” of today could be greatly reduced by the Cloud as thin client
technology becomes prevalent. Small, temporary caches on handheld devices or Netbook
computers pose less risk than transporting data buckets in the form of laptops. Ask the CISO of
any large company if all laptops have company ‘mandated’ controls consistently applied; e.g. full
disk encryption. You’ll see the answer by looking at the whites of their eyes. Despite best efforts
around asset management and endpoint security we continue to see embarrassing and disturbing
misses. And what about SMBs? How many use encryption for sensitive data, or even have a data
classification policy in place?
Monitoring benefits: central storage is easier to control and
monitor. The flipside is the nightmare scenario of comprehensive data theft. However, I would
rather spend my time as a security professional figuring out smart ways to protect and monitor
access to data stored in one place (with the benefit of situational advantage) than trying to figure
out all the places where the company data resides across a myriad of thick clients! You can get
the benefits of Thin Clients today but Cloud Storage provides a way to centralize the data faster
and potentially cheaper. The logistical challenge today is getting Terabytes of data to the Cloud in
the first place.
2. INCIDENT RESPONSE / FORENSICS:
Forensic readiness: with Infrastructure as a Service (IaaS) providers, I can build a dedicated
forensic server in the same Cloud as my company and place it offline, ready for use when
needed. I would only need to pay for storage until an incident happens and I need to bring it online.
I don’t need to call someone to bring it online or install some kind of remote boot software - I
just click a button in the Cloud Providers web interface. If I have multiple incident responders, I
can give them a copy of the VM so we can distribute the forensic workload based on the job at
hand or as new sources of evidence arise and need analysis. To fully realise this benefit,
commercial forensic software vendors would need to move away from archaic, physical dongle
based licensing schemes to a network licensing model.
Decrease evidence acquisition time: if a server in the Cloud gets compromised (i.e. broken
into), I can now clone that server at the click of a mouse and make the cloned disks instantly
available to my Cloud Forensics server. I didn't need to “find” storage or have it “ready, waiting and unused” - it's just there.
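A hedged boto sketch of that acquisition step: snapshot the compromised server's EBS volume and attach a copy to a waiting forensics instance. All resource ids below are hypothetical.

    import boto.ec2

    # Hypothetical ids: vol-... belongs to the compromised server and i-... is
    # the dedicated forensics instance kept ready in the same region.
    conn = boto.ec2.connect_to_region('us-east-1')

    snapshot = conn.create_snapshot('vol-0abc1234',
                                    description='evidence: web01 compromise')

    # Once the snapshot reports 'completed', turn it into a fresh volume and
    # attach it to the forensics server for analysis.
    evidence = conn.create_volume(size=100, zone='us-east-1a',
                                  snapshot=snapshot.id)
    conn.attach_volume(evidence.id, 'i-0abcd123', '/dev/sdf')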
Eliminate or reduce service downtime: note that in the above scenario I didn't have to go tell the COO that the system needs to be taken offline for hours whilst I dig around in the RAID array hoping that my physical acquisition toolkit is compatible (and that the version of RAID firmware is supported by my forensic software). Abstracting the hardware removes a barrier to even doing forensics in some situations.
Decrease evidence transfer time: in the same Cloud, bit-for-bit copies are super fast - made faster by that replicated, distributed file system my Cloud provider engineered for me. From a network traffic perspective, it may even be free to make the copy in the same Cloud. Without the Cloud, I would have to do a lot of time-consuming and expensive provisioning of physical devices. I only pay for the storage as long as I need the evidence.
Eliminate forensic image verification time: Some Cloud Storage implementations expose a cryptographic
checksum or hash. For example, Amazon S3 generates an MD5 hash automagically when you store an
object. In theory you no longer need to generate time-consuming MD5 checksums using external tools - it’s
already there.
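For example, with the boto library one might compare a local file's MD5 with the checksum S3 recorded at upload time, exposed as the key's ETag; the bucket and object names are hypothetical, and note that the ETag only equals the MD5 for non-multipart uploads.

    import hashlib
    import boto

    # Hypothetical bucket and object names; for non-multipart uploads the ETag
    # that S3 stores is the MD5 of the object.
    conn = boto.connect_s3()
    key = conn.get_bucket('evidence-bucket').get_key('disk-images/web01.dd')

    local_md5 = hashlib.md5(open('web01.dd', 'rb').read()).hexdigest()
    remote_md5 = key.etag.strip('"')

    print('match' if local_md5 == remote_md5 else 'MISMATCH')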
Decrease time to access protected documents: immense CPU power opens some doors. Did the suspect password-protect a document that is relevant to the investigation? You can now test a wider range of candidate passwords in less time to speed up investigations.
3. PASSWORD ASSURANCE TESTING (AKA CRACKING):
Decrease password cracking time: if your organization regularly tests password strength by running
password crackers you can use Cloud Compute to decrease crack time and you only pay for what you use.
Ironically, your cracking costs go up as people choose better passwords ;-).
Keep cracking activities to dedicated machines: if today you use a distributed password cracker to spread the
load across non-production machines, you can now put those agents in dedicated Compute instances - and
thus stop mixing sensitive credentials with other workloads

4. LOGGING:
“Unlimited”, pay-per-drink storage: logging is often an afterthought; consequently, insufficient disk space is allocated and logging is either non-existent or minimal. Cloud Storage changes all this - no more ‘guessing’ how much storage you need for standard logs.
Improve log indexing and search: with your logs in the Cloud you can leverage Cloud Compute to index
those logs in real-time and get the benefit of instant search results. What is different here? The Compute
instances can be plumbed in and scale as needed based on the logging load - meaning a true real-time view.
Getting compliant with Extended logging: most modern operating systems offer extended logging in the
form of a C2 audit trail. This is rarely enabled for fear of performance degradation and log size.
Now you can ‘opt-in’ easily - if you are willing to pay for the enhanced logging, you can do so. Granular
logging makes compliance and investigations easier.
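As a small illustration of the storage point above, a log-rotation job could push each archived log to cloud storage with a couple of boto calls; the bucket and paths are made up for the example.

    import boto

    # Hypothetical bucket and key names; each rotated log goes to central cloud
    # storage so retention is no longer limited by local disk space.
    conn = boto.connect_s3()
    bucket = conn.get_bucket('central-audit-logs')

    key = bucket.new_key('web01/2009/05/01/syslog.gz')
    key.set_contents_from_filename('/var/log/archive/syslog.gz')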

5. IMPROVE THE STATE OF SECURITY SOFTWARE (PERFORMANCE):


Drive vendors to create more efficient security software: Billable CPU cycles get noticed. More attention
will be paid to inefficient processes; e.g. poorly tuned security agents. Process accounting will make a
comeback as customers target ‘expensive’ processes. Security vendors that understand how to squeeze the
most performance from their software will win.

6. SECURE BUILDS:
Pre-hardened, change control builds: this is primarily a benefit of virtualization based Cloud
Computing. Now you get a chance to start ’secure’ (by your own definition) - you create your Gold Image
VM and clone away. There are ways to do this today with bare-metal OS installs but frequently these require
additional 3rd party tools, are time consuming to clone or add yet another agent to each endpoint.
Reduce exposure through patching offline: gold images can be securely kept up to date. Offline VMs can be conveniently patched “off” the network.
Easier to test impact of security changes: this is a big one. Spin up a copy of your production environment,
implement a security change and test the impact at low cost, with minimal startup time.
This is a big deal and removes a major barrier to ‘doing’ security in production environments.
7. SECURITY TESTING:
Reduce cost of testing security: a SaaS provider only passes on a portion of their security testing costs. By
sharing the same application as a service, you don’t foot the expensive security code review and/or
penetration test. Even with Platform as a Service (PaaS) where your developers get to write code, there are
potential cost economies of scale (particularly around use of code scanning tools that sweep source code for
security weaknesses).
Adoption fears and strategic innovation opportunities
Adoption fears:
Security: Many IT executives make decisions based on the perceived security risk instead of the real
security risk. IT has traditionally feared the loss of control for SaaS deployments based on an assumption
that if you cannot control something it must be insecure. I recall the anxiety around early web services deployments, where people got really worked up about the security of web services because users could invoke an internal business process from outside the firewall.
IT will have to get used to the idea of software being delivered from outside the firewall and mashed up with on-premise software before it reaches the end user. The intranet, extranet, DMZ, and internet boundaries have started to blur, and this does impose some serious security challenges, such as relying on a cloud vendor for the physical and logical security of the data, or authenticating users across firewalls by relying on the vendor's authentication schemes. But treating these challenges as fears is not a smart strategy.

Latency: Just because something runs on a cloud does not mean it suffers from high latency. My opinion is quite the opposite: cloud computing, if done properly, has opportunities to reduce latency thanks to architectural advantages such as massively parallel processing capabilities and distributed computing. Web-based applications went through the same perception issues in their early days, and now people don't worry about latency while shopping at Amazon.com or editing a document on Google Docs served to them over a cloud. The cloud is going to get better and better, and IT has no strategic advantage in owning and maintaining data centers. In fact, data centers are easy to shut down but the applications are not, and CIOs should take any and all opportunities they get to move away from running their own data centers if they can.

SLA: The recent Amazon EC2 meltdown and RIM's network outage created a debate around the availability of highly centralized infrastructure and its SLAs. The real problem is not a bad SLA but the lack of one. IT needs a phone number to call in an unexpected event and an up-front estimate of the downtime in order to manage expectations. Maybe I am simplifying it too much, but this is the crux of the situation. The fear is not so much about 24x7 availability, since an on-premise system hardly promises that; what bothers IT the most is the inability to quantify the impact on the business in the event of non-availability of a system, and to set and manage expectations upstream and downstream. The non-existent SLA is a real issue, and I believe there is a great service innovation opportunity for ISVs and partners to help CIOs with the adoption of cloud computing by providing a rock-solid SLA and transparency into the defect resolution process.
Strategic innovation opportunities
Seamless infrastructure virtualization:
If you have ever attempted to connect to Second Life from behind a firewall, you would know that it requires punching a few holes into the firewall to let certain unique transports pass through, and that's not a viable option in many cases. This is an intra-infrastructure communication challenge. I am glad to see IBM's attempt to create a virtual cloud inside the firewall to deploy some of the regions of Second Life with seamless navigation in and out of the firewall. This is a great example of single sign-on that extends beyond network and hardware virtualization to form infrastructure virtualization with seamless security.
Hybrid systems: The IBM example also illustrates the potential of a hybrid system that combines an on-
premise system with remote infrastructure to support seamless cloud computing. This could be a great start
for many organizations that are on the bottom of the S curve of cloud computing adoption. Organizations
should consider pushing non-critical applications on a cloud with loose integration with on-premise systems
to begin the cloud computing journey and as the cloud infrastructure matures and some concerns are
alleviated, IT could consider pushing more and more applications onto the cloud. Google App Engine is a good example for starting to create applications on-premise that can eventually run on Google's cloud, and Amazon's AMI catalogue is expanding day by day to let people push their applications onto Amazon's cloud. Elastra's solution to deploy EnterpriseDB on the cloud is also a good example of how organizations can outsource IT to the cloud.
BENEFITS:
Cloud computing infrastructures can allow enterprises to achieve more efficient use of their IT hardware and software investments. They do this by breaking down the physical barriers inherent in isolated systems and automating the management of the group of systems as a single entity. Cloud computing is an example of an ultimately virtualized system, and a natural evolution for data centers that employ automated systems management, workload balancing, and virtualization technologies. A cloud infrastructure can be a cost-efficient model for delivering information services.

Application:
A cloud application leverages cloud computing in its software architecture, often eliminating the need to install and run the application on the customer's own computer, thus alleviating the burden of software maintenance, ongoing operation, and support. For example:
Peer-to-peer / volunteer computing (BOINC, Skype)
Web applications (webmail, Facebook, Twitter, YouTube, Yammer)
Security as a service (MessageLabs, Purewire, ScanSafe, Zscaler)
Software as a service (Google Apps, Salesforce, Nivio, Learn.com, Zoho, BigGyan.com)
Software plus services (Microsoft Online Services)
Distributed storage
Content distribution (BitTorrent, Amazon CloudFront)
8. CONCLUSION:
In my view, there are some strong technical security arguments in favour of cloud computing, assuming we can find ways to manage the risks. With this new paradigm come challenges and opportunities. The challenges are getting plenty of attention; let's not lose sight of the potential upside.
Some benefits depend on the cloud service used and therefore do not apply across the board. For example, I see no solid forensic benefits with SaaS. For space reasons, the 'flip side' of each of these benefits is not covered here.
We believe the cloud offers small and medium businesses major potential security benefits. Frequently SMBs struggle with limited or non-existent in-house INFOSEC resources and budgets. The caveat is that the cloud market is still very new - security offerings are somewhat foggy - making selection tricky. Clearly, not all cloud providers will offer the same security.
Increases business responsiveness
Accelerates creation of new services via rapid prototyping capabilities
Reduces acquisition complexity via a service-oriented approach
Uses IT resources efficiently via sharing and higher system utilization
Reduces energy consumption
Handles new and emerging workloads
Scales to extreme workloads quickly and easily
Simplifies IT management
Provides a platform for collaboration and innovation
Cultivates skills for the next generation workforce