Cloud Computing Chapter3

The chapter discusses existing cloud infrastructure providers including Amazon Web Services (AWS), Google, and Microsoft. It focuses on AWS, describing its regions and availability zones, instance types and attributes, and services like S3 storage and EC2 compute. Other topics covered include private cloud platforms, cloud storage diversity, and vendor lock-in.

Chapter 3 – Cloud Infrastructure

Cloud Computing: Theory and Practice.


Dan C. Marinescu Chapter 3 1
Contents
IaaS services from Amazon.
Regions and availability zones for Amazon Web Services.
Instances – attributes and cost.
A repertoire of Amazon Web Services.

SaaS and PaaS services from Google.

SaaS and PaaS services from Microsoft.

Open-source platforms for private clouds.

Cloud storage diversity and vendor lock-in.


Existing cloud infrastructure
■ The cloud computing infrastructure at Amazon, Google, and Microsoft
(as of mid 2012).
■ Amazon is a pioneer in Infrastructure-as-a-Service (IaaS).
■ Google's efforts are focused on Software-as-a-Service (SaaS) and
Platform-as-a-Service (PaaS).
■ Microsoft is involved in PaaS.
■ Private clouds are an alternative to public clouds. Open-source cloud
computing platforms such as:
■ Eucalyptus,
■ OpenNebula,
■ Nimbus,
■ OpenStack
can be used as a control infrastructure for a private cloud.

Amazon Web Services (AWS)

■ AWS → IaaS cloud computing services launched in 2006.
■ Businesses in 200 countries used AWS in 2012.
■ The infrastructure consists of compute and storage servers
interconnected by high-speed networks and supports a set of
services.

■ An application developer:
■ Installs applications on a platform of his/her choice.
■ Manages resources allocated by Amazon.

AWS regions and availability zones
■ Amazon offers cloud services through a network of data centers on
several continents.
■ In each region there are several availability zones interconnected by
high-speed networks.
■ An availability zone is a data center consisting of a large number of
servers.

■ Regions do not share resources and communicate through the Internet.

AWS instances
■ An instance is a virtual server with a well specified set of resources
including: CPU cycles, main memory, secondary storage,
communication and I/O bandwidth.
■ The user chooses:
■ The region and the availability zone where this virtual server
should be placed.
■ An instance type from a limited menu of instance types.
■ When launched, an instance is provided with a DNS name; this
name maps to a
■ private IP address → for internal communication within the
internal EC2 communication network.
■ public IP address → for communication outside the internal
Amazon network, e.g., for communication with the user that
launched the instance.

AWS instances (cont’d)

■ Network Address Translation (NAT) maps external IP addresses to
internal ones.
■ The public IP address is assigned for the lifetime of an instance.
■ An instance can request an elastic IP address, rather than a public IP
address. The elastic IP address is a static public IP address allocated
to an instance from the available pool of the availability zone.
■ An elastic IP address is not released when the instance is stopped or
terminated and must be released when no longer needed.
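
The address mapping described above can be pictured with a toy model. This is only a sketch in plain Python, not the EC2 API: the `NatTable` class and the addresses are hypothetical, and the remapping step assumes the elastic address is re-associated with a replacement instance after the original one fails.

```python
class NatTable:
    """Toy model of NAT: maps public (external) IPs to private (internal) IPs."""

    def __init__(self):
        self._public_to_private = {}

    def associate(self, public_ip, private_ip):
        # Point a public or elastic address at an instance's private address.
        self._public_to_private[public_ip] = private_ip

    def disassociate(self, public_ip):
        # Release the mapping; an elastic IP must be released explicitly.
        self._public_to_private.pop(public_ip, None)

    def translate(self, public_ip):
        # Return the private address, or None if the address is unmapped.
        return self._public_to_private.get(public_ip)


nat = NatTable()
elastic_ip = "203.0.113.10"            # hypothetical, documentation-range address
nat.associate(elastic_ip, "10.0.0.5")  # original instance
nat.associate(elastic_ip, "10.0.0.9")  # replacement instance takes over the address
remapped = nat.translate(elastic_ip)
```

Clients keep using the same elastic address while the private address behind it changes, which is the property that distinguishes it from the per-instance public IP address.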

[Figure: labels include Elastic IP, Watchdog, instance, New Instance, and HW.]

Steps to run an application

■ Retrieve the user input from the front-end.
■ Retrieve the disk image of a VM (Virtual Machine) from a repository.
■ Locate a system and request the VMM (Virtual Machine Monitor)
running on that system to set up a VM.
■ Invoke the Dynamic Host Configuration Protocol (DHCP) and the IP
bridging software to set up MAC and IP addresses for the VM.

User interactions with AWS

■ The AWS Management Console. The easiest way to access all
services, but not all options may be available.
■ AWS SDK libraries and toolkits are provided for several
programming languages including Java, PHP, C#, and Objective-C.

■ Raw REST requests.

Examples of Amazon Web Services
■ AWS Management Console - allows users to access the services
offered by AWS.
■ Elastic Compute Cloud (EC2) - allows a user to launch a variety
of operating systems.
■ Simple Queue Service (SQS) - allows multiple EC2 instances to
communicate with one another.
■ Simple Storage Service (S3), SimpleDB, and Elastic Block Store
(EBS) - storage services.
■ CloudWatch - supports performance monitoring.
■ Auto Scaling - supports elastic resource management.
■ Virtual Private Cloud - allows direct migration of parallel
applications.

[Figure: services reached through the AWS Management Console: EC2
instances (Linux, Debian, Fedora, OpenSolaris, openSUSE, Red Hat,
Ubuntu, Windows, SUSE Linux), S3, EBS, SimpleDB, SQS (Simple Queue
Service), CloudWatch, Auto Scaling, and Virtual Private Cloud.]

EC2 – Elastic Compute Cloud

■ EC2 - web service for launching instances of an application under
several operating systems, such as:
■ Several Linux distributions.
■ Microsoft Windows Server 2003 and 2008.
■ OpenSolaris.
■ FreeBSD.
■ NetBSD.
■ A user can
■ Load an EC2 instance with a custom application environment.
■ Manage the network's access permissions.
■ Run the image using as many or as few systems as desired.

EC2 (cont’d)
■ Import virtual machine (VM) images from the user environment to an
instance through VM import.
■ EC2 instances boot from an AMI (Amazon Machine Image) digitally
signed and stored in S3.
■ Users can:
■ Use images provided by Amazon.
■ Customize an image and store it in S3.
■ An EC2 instance is characterized by the resources it provides:
■ VC (Virtual Computers) – virtual systems running the instance.
■ CU (Compute Units) – measure computing power of each system.
■ Memory.
■ I/O capabilities.

Instance types
■ Standard instances: micro (StdM), small (StdS), large (StdL), extra
large (StdXL); small is the default.
■ High memory instances: high-memory extra large (HmXL), high-memory
double extra large (Hm2XL), and high-memory quadruple extra large
(Hm4XL).
■ High CPU instances: high-CPU extra large (HcpuXL).
■ Cluster computing: cluster computing quadruple extra large (Cl4XL).

Instance cost
■ A main attraction of Amazon cloud computing is its low cost.

S3 – Simple Storage Service

■ Service designed to store large objects; an application can handle
an unlimited number of objects ranging in size from 1 byte to 5 TB.
■ An object is stored in a bucket and retrieved via a unique,
developer-assigned key; a bucket can be stored in a Region
selected by the user.
■ Supports a minimal set of functions: write, read, and delete; it does
not support primitives to copy, to rename, or to move an object from
one bucket to another.
■ The object names are global.
■ S3 maintains for each object: the name, modification time, an
access control list, and up to 4 KB of user-defined metadata.
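
With only write, read, and delete available, an application that wants to move an object has to compose those three operations itself. The following is a minimal in-memory sketch of that idea; `ToyObjectStore`, `move`, and the bucket and key names are hypothetical stand-ins and do not reflect the actual S3 interface.

```python
class ToyObjectStore:
    """In-memory stand-in for a bucket/key object store (not the S3 API)."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {key: data}

    def write(self, bucket, key, data):
        self._buckets.setdefault(bucket, {})[key] = bytes(data)

    def read(self, bucket, key):
        return self._buckets[bucket][key]

    def delete(self, bucket, key):
        del self._buckets[bucket][key]


def move(store, src_bucket, dst_bucket, key):
    """Emulate a 'move' using only the three supported primitives."""
    data = store.read(src_bucket, key)   # 1. read the object
    store.write(dst_bucket, key, data)   # 2. write it to the new bucket
    store.delete(src_bucket, key)        # 3. delete the original


store = ToyObjectStore()
store.write("bucket-a", "report.txt", b"hello")
move(store, "bucket-a", "bucket-b", "report.txt")
moved = store.read("bucket-b", "report.txt")
```

The three steps are not atomic, so a failure between the write and the delete would leave two copies; a real client would need to tolerate that.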

S3 (cont’d)

■ Authentication mechanisms ensure that data is kept secure.


■ Objects can be made public, and rights can be granted to other
users.
■ S3 computes the MD5 of every object written and returns it in a
field called ETag.
■ A user is expected to compute the MD5 of an object stored or
written and compare this with the ETag; if the two values do not
match, then the object was corrupted during transmission or
storage.
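
The integrity check described above can be sketched with Python's standard `hashlib` module. The helper names `md5_hex` and `is_intact` are illustrative, and the ETag here is computed locally to stand in for the value the service would return; the sketch assumes the ETag is the plain MD5 of the object, which may not hold for every upload method.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex-encoded MD5 digest of the object's bytes."""
    return hashlib.md5(data).hexdigest()


def is_intact(local_data: bytes, etag: str) -> bool:
    # ETag values are often wrapped in double quotes; strip them first.
    return md5_hex(local_data) == etag.strip('"').lower()


payload = b"hello world"
etag = md5_hex(payload)                      # stands in for the returned ETag
ok = is_intact(payload, etag)                # same bytes: digests match
corrupted = is_intact(b"hello w0rld", etag)  # one byte changed: mismatch
```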

Elastic Block Store (EBS)
■ Provides persistent block level storage volumes for use with EC2
instances; suitable for database applications, file systems, and
applications using raw data devices.
■ A volume appears to an application as a raw, unformatted and reliable
physical disk; volume sizes range from 1 GB to 1 TB.
■ An EC2 instance may mount multiple volumes, but a volume cannot
be shared among multiple instances.
■ EBS supports the creation of snapshots of the volumes attached to an
instance and then uses them to restart the instance.
■ The volumes are grouped together in Availability Zones and are
automatically replicated in each zone.

SimpleDB
■ Non-relational data store. Supports store and query functions
traditionally provided only by relational databases.

■ Supports high performance Web applications; users can store and
query data items via Web services requests.

■ Creates multiple geographically distributed copies of each data item.

■ It manages automatically:
■ The infrastructure provisioning.
■ Hardware and software maintenance.
■ Replication and indexing of data items.
■ Performance tuning.

SQS - Simple Queue Service

■ Hosted message queues are accessed through standard SOAP
and Query interfaces.
■ Supports automated workflows - EC2 instances can coordinate by
sending and receiving SQS messages.
■ Applications using SQS can run independently and asynchronously,
and do not need to be developed with the same technologies.
■ A received message is "locked" during processing; if processing
fails, the lock expires and the message is available again.

■ Queue sharing can be restricted by IP address and time-of-day.
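
The "locked" behaviour of a received message can be illustrated with a small in-memory queue. This is a sketch under stated assumptions, not the SQS interface: `ToyQueue`, its methods, and the 30-second lock are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class ToyQueue:
    """In-memory queue whose received messages stay locked for a while."""

    def __init__(self, lock_seconds):
        self.lock_seconds = lock_seconds
        self._messages = {}      # message id -> body
        self._locked_until = {}  # message id -> time at which the lock expires
        self._next_id = 0

    def send(self, body):
        self._messages[self._next_id] = body
        self._locked_until[self._next_id] = 0.0
        self._next_id += 1

    def receive(self, now):
        # Hand out the first message whose lock has expired, and lock it again.
        for msg_id, body in self._messages.items():
            if self._locked_until[msg_id] <= now:
                self._locked_until[msg_id] = now + self.lock_seconds
                return msg_id, body
        return None

    def delete(self, msg_id):
        # Called by a consumer once processing has succeeded.
        self._messages.pop(msg_id, None)
        self._locked_until.pop(msg_id, None)


queue = ToyQueue(lock_seconds=30)
queue.send("resize image 42")
first = queue.receive(now=0)    # consumer A receives and locks the message
hidden = queue.receive(now=10)  # lock still held: nothing is available
retry = queue.receive(now=31)   # A never deleted it, so it is available again
```

Because a message is removed only by an explicit delete, a consumer that crashes mid-processing does not lose work; the cost is that a message may be processed more than once.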

CloudWatch

■ Monitoring infrastructure used by application developers, users,
and system administrators to collect and track metrics important
for optimizing the performance of applications and for increasing
the efficiency of resource utilization.
■ Without installing any software, a user can monitor either seven
or eight pre-selected metrics and then view graphs and statistics
for these metrics.
■ When launching an Amazon Machine Image (AMI), the user can
start CloudWatch and specify the type of monitoring:
■ Basic Monitoring - free of charge; collects data at five-minute
intervals for up to seven metrics.
■ Detailed Monitoring - subject to charge; collects data at one-minute
intervals.
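
To get a feel for the difference between the two monitoring modes, the number of samples collected per day follows directly from the sampling interval. The short calculation below is only arithmetic on the intervals quoted above; the function name is illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def samples_per_day(interval_minutes):
    """Number of data points one metric produces in a day."""
    return MINUTES_PER_DAY // interval_minutes


basic = samples_per_day(5)      # five-minute intervals
detailed = samples_per_day(1)   # one-minute intervals
basic_total = basic * 7         # Basic Monitoring with seven metrics
```

Basic Monitoring yields 288 samples per metric per day and Detailed Monitoring 1440, a fivefold increase in resolution.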

AWS services introduced in 2012
■ Route 53 - low-latency DNS service used to manage a user's public
DNS records.
■ Elastic MapReduce (EMR) - supports processing of large amounts of
data using a hosted Hadoop running on EC2.
■ Simple Workflow Service (SWF) - supports workflow management;
allows scheduling, management of dependencies, and coordination of
multiple EC2 instances.
■ ElastiCache - enables web applications to retrieve data from a
managed in-memory caching system rather than a much slower disk-
based database.
■ DynamoDB - scalable and low-latency fully managed NoSQL database
service.

AWS services introduced in 2012 (cont’d)

■ CloudFront - web service for content delivery.
■ Elastic Load Balancer - automatically distributes the incoming
requests across multiple instances of the application.
■ Elastic Beanstalk - automatically handles deployment, capacity
provisioning, load balancing, auto-scaling, and application monitoring
functions.
■ CloudFormation - allows the creation of a stack describing the
infrastructure for an application.

Elastic Beanstalk

■ Automatically handles deployment, capacity provisioning, load
balancing, auto-scaling, and monitoring functions.
■ Interacts with other services including EC2, S3, SNS, Elastic Load
Balancer, and Auto Scaling.
■ The management functions provided by the service are:
■ Deploy a new application version (or roll back to a previous version).
■ Access to the results reported by the CloudWatch monitoring service.
■ Email notifications when application status changes or application
servers are added or removed.
■ Access to server log files without needing to log in to the application
servers.
■ The service is available using: a Java platform, the PHP server-side
scripting language, or the .NET framework.

SaaS services offered by Google

■ Gmail - hosts email on Google servers and provides a web
interface to access the email.
■ Google Docs - web-based software for building text documents,
spreadsheets and presentations.
■ Google Calendar - a browser-based scheduler; supports multiple
user calendars, calendar sharing, event search, display of daily/
weekly/monthly views, and so on.
■ Google Groups - allows users to host discussion forums and to create
messages online or via email.
■ Picasa - a tool to upload, share, and edit images.
■ Google Maps - web mapping service; offers street maps, a route
planner, and an urban business locator for numerous countries
around the world.

PaaS services offered by Google

■ AppEngine - a developer platform hosted on the cloud.
■ Initially supported Python; Java was added later.
■ The database for code development can be accessed with GQL
(Google Query Language), which has a SQL-like syntax.
■ Google Co-op - allows users to create customized search engines
based on a set of facets/categories.
For example, the facets for a search engine for the database research
community available at
http://data.cs.washington.edu/coop/dbresearch/index.html
are professor, project, publication, jobs.

■ Google Drive - an online service for data storage.


■ Google Base - allows users to load structured data from different
sources to a central repository, a very large, self-describing, semi-
structured, heterogeneous database.

PaaS and SaaS services from Microsoft

■ Windows Azure - an operating system; has three components:
■ Compute - provides a computation environment.
■ Storage - for scalable storage.
■ Fabric Controller - deploys, manages, and monitors applications.
■ SQL Azure - a cloud-based version of the SQL Server.
■ Azure AppFabric, formerly .NET Services - a collection of services for
cloud applications.

[Figure labels: Live Services, SQL Azure, AppFabric, SharePoint, and
Dynamics CRM.]

■ Azure Content Delivery Network (CDN) - maintains cache copies of data
to speed up computations.

[Figure: Windows Azure components: Connect, Applications and Data, CDN,
Compute, Storage (blobs, tables, queues), and the Fabric Controller,
which deploys, manages, and monitors applications.]

Storage Concepts in Azure


■ Blobs, tables, queues, and drives are used as scalable storage.
■ Blob - contains binary data; a container consists of one or
more blobs. Blobs can be up to a terabyte and may have
associated metadata (e.g., information about where a JPEG
photograph was taken). Blobs allow a Windows Azure role
instance to interact with persistent storage as though it were a
local NTFS file system.
■ Queues - enable Web role instances to communicate
asynchronously with Worker role instances.

Open-source platforms for private clouds


■ Eucalyptus - can be regarded as an open-source counterpart of
Amazon's EC2.

■ OpenNebula - a private cloud with users actually logging into the
head node to access cloud functions. The system is centralized and
its default configuration uses the NFS file system.

■ Nimbus - a cloud solution for scientific applications based on Globus
software; inherits from Globus:
■ The image storage.
■ The credentials for user authentication.
■ The requirement that a running Nimbus process can ssh into all
compute nodes.

Eucalyptus

■ Virtual Machines - run under several VMMs (Virtual Machine
Monitors) including Xen, KVM, and VMware.
■ Node Controller - runs on server nodes hosting a VM and controls
the activities of the node.
■ Cluster Controller - controls a number of servers.
■ Cloud Controller - provides the cloud access to end-users,
developers, and administrators.
■ Storage Controller - provides persistent virtual hard drives to
applications. It is the counterpart of EBS.
■ Storage Service (Walrus) - provides persistent storage; similar to
S3, it allows users to store objects in buckets.

Cloud storage diversity and vendor lock-in
■ Risks when a large organization relies on a single cloud service
provider:
■ Cloud services may be unavailable for a short or an extended
period of time.
■ Permanent data loss in case of a catastrophic system failure.
■ The provider may increase the prices for service.

■ Switching to another provider could be very costly due to the large
volume of data to be transferred from the old to the new provider.
■ A solution is to replicate the data to multiple cloud service providers,
similar to data replication in RAID.
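
A rough calculation shows why the volume of data makes switching providers costly. The figures below are assumptions chosen only for illustration (one petabyte of data, a sustained 1 Gbps link, no protocol overhead); they are not taken from the chapter.

```python
def transfer_days(data_bytes, link_bits_per_second):
    """Days needed to move data_bytes over a link at a sustained rate."""
    seconds = data_bytes * 8 / link_bits_per_second
    return seconds / 86_400  # seconds per day


PETABYTE = 10**15  # assumed data volume, in bytes
GBPS = 10**9       # assumed sustained link rate, in bits per second

days = transfer_days(PETABYTE, GBPS)
```

Under these assumptions the transfer takes roughly three months, before counting any per-gigabyte egress charges the old provider may apply.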

(a) RAID 5 controller: data blocks and parity blocks (aP, bP, cP, dP)
striped across four disks.

Disk 1   Disk 2   Disk 3   Disk 4
a1       a2       a3       aP
b1       b2       bP       b3
c1       cP       c2       c3
dP       d1       d2       d3

(b) [Figure: a client proxy distributes data and parity blocks across
four clouds, Cloud 1 to Cloud 4.]

Recovery after a single failure

Parity     = A1 XOR A2 XOR A3
Recover A1 = A2 XOR A3 XOR Parity
Recover A2 = A1 XOR A3 XOR Parity
Recover A3 = A1 XOR A2 XOR Parity

A1  A2  A3  Parity  Recover A1  Recover A2  Recover A3
0   0   0   0       0           0           0
0   0   1   1       0           0           1
0   1   0   1       0           1           0
0   1   1   0       0           1           1
1   0   0   1       1           0           0
1   0   1   0       1           0           1
1   1   0   0       1           1           0
1   1   1   1       1           1           1
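
The same XOR recovery works bytewise on whole blocks, which is what a proxy striping data over several providers would rely on. The sketch below uses only the Python standard library; the function name `xor_blocks` and the two-byte sample blocks are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks, as if stored with three different providers.
a1, a2, a3 = b"\x0f\x10", b"\xf0\x01", b"\xaa\x55"

# The parity block is stored with a fourth provider.
parity = xor_blocks([a1, a2, a3])

# The provider holding a2 becomes unavailable: rebuild a2 from the rest.
recovered_a2 = xor_blocks([a1, a3, parity])
```

Any single missing block, data or parity, can be rebuilt this way; losing two blocks at once cannot be repaired with a single parity block.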

Rest of the slides not in syllabus
