CloudComputingCS6T016 PDF
SS ZG527 AWS Storage and DB Services Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 2
Amazon Web Services
Amazon Web Services Cloud
• Provides highly reliable and scalable infrastructure for
deploying web-scale solutions
• With minimal support and administration costs
• More flexibility than your own infrastructure, whether on premises
or at a data centre facility
Infrastructure Services
• Elastic IP addresses: allocate a static IP address and assign it to
an instance.
• CloudWatch: monitors Amazon EC2 instances, giving visibility into
resource utilization, operational performance, and overall demand
patterns (including metrics such as CPU utilization, disk reads and
writes, and network traffic).
• Auto Scaling: automatically scales capacity up or down when
conditions you define on Amazon CloudWatch metrics are met.
• Elastic Load Balancing: distributes incoming traffic across
instances through an elastic load balancer.
• Amazon Elastic Block Store (EBS): volumes provide network-attached
persistent storage to Amazon EC2 instances.
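The way a monitored metric can drive a scaling decision can be sketched as follows. This is a minimal illustration that does not call any AWS API: the class name, the CPU thresholds, the one-instance step and the capacity bounds are all assumptions chosen for the example.

```java
/** Hypothetical threshold policy for illustration; this is not an AWS API. */
class ScalingPolicySketch {
    private final double scaleOutAboveCpu; // add an instance above this average CPU %
    private final double scaleInBelowCpu;  // remove an instance below this average CPU %
    private final int minInstances;
    private final int maxInstances;

    ScalingPolicySketch(double scaleOutAboveCpu, double scaleInBelowCpu,
                        int minInstances, int maxInstances) {
        this.scaleOutAboveCpu = scaleOutAboveCpu;
        this.scaleInBelowCpu = scaleInBelowCpu;
        this.minInstances = minInstances;
        this.maxInstances = maxInstances;
    }

    /** Returns the desired instance count for the given CPU utilization samples. */
    int desiredCapacity(int currentInstances, double[] cpuSamples) {
        int desired = currentInstances;
        if (cpuSamples.length > 0) {
            double sum = 0.0;
            for (double sample : cpuSamples) {
                sum += sample;
            }
            double average = sum / cpuSamples.length;
            if (average > scaleOutAboveCpu) {
                desired = currentInstances + 1; // scale out by one instance
            } else if (average < scaleInBelowCpu) {
                desired = currentInstances - 1; // scale in by one instance
            }
        }
        // Never leave the configured capacity bounds.
        return Math.max(minInstances, Math.min(maxInstances, desired));
    }
}
```

A real policy would typically also apply a cooldown period between scaling actions; that is left out here to keep the sketch short.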
Infrastructure Services
• Amazon S3: a highly durable, distributed data store. Through a
simple web services interface, you can store and retrieve large
amounts of data as objects in buckets (containers) at any time,
using standard HTTP.
• Amazon SimpleDB: provides the core functionality of a database:
real-time lookup and simple querying of structured data.
• Amazon Relational Database Service (RDS): provides an easy way to
set up, operate, and scale a relational database in the cloud.
• Amazon Elastic MapReduce: provides a hosted Hadoop framework.
• AWS Identity and Access Management (IAM): lets you create multiple
users with unique security credentials and manage the permissions
for each of them.
Amazon Elastic Compute Cloud (Amazon EC2)
Features of Amazon EC2
Amazon EC2
Amazon Machine Image (AMI)
Types of AMI
Amazon EC2 Choices
Amazon EC2 Instance Types
Elastic IP Address
Auto Scaling
Elastic Load Balancing
Amazon VPC
Amazon Route 53
Security Groups
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html
Inbound
Source | Protocol | Port Range | Comments
0.0.0.0/0 | TCP | 80 | Allow inbound HTTP access from all IPv4 addresses
::/0 | TCP | 443 | Allow inbound HTTPS access from all IPv6 addresses
Your network's public IPv4 address range | TCP | 22 | Allow inbound SSH access to Linux instances from IPv4 IP addresses in your network (over the Internet gateway)
Your network's public IPv4 address range | TCP | 3389 | Allow inbound RDP access to Windows instances from IPv4 IP addresses in your network (over the Internet gateway)
Outbound
Destination | Protocol | Port Range | Comments
The ID of the security group for your Microsoft SQL Server database servers | TCP | 1433 | Allow outbound Microsoft SQL Server access to instances in the specified security group
The ID of the security group for your MySQL database servers | TCP | 3306 | Allow outbound MySQL access to instances in the specified security group
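The inbound rules above can be read as an allow-list: traffic is admitted only if at least one rule matches its protocol, port and source address. The sketch below illustrates that matching for IPv4 sources only. It is not AWS code; the class and method names are hypothetical, and 203.0.113.0/24 in the usage note is a documentation address range standing in for "your network's public IPv4 address range".

```java
/** Hypothetical allow-list matcher for illustration; IPv4 sources only. */
class SecurityGroupSketch {

    /** One allow rule: protocol, a single port, and an IPv4 CIDR source. */
    static final class Rule {
        final String protocol;
        final int port;
        final int network; // network address as a 32-bit value
        final int mask;    // netmask as a 32-bit value

        Rule(String protocol, int port, String cidr) {
            String[] parts = cidr.split("/");
            int prefix = Integer.parseInt(parts[1]);
            // A /0 prefix needs special handling: shifting an int by 32 is a no-op in Java.
            this.mask = (prefix == 0) ? 0 : (-1 << (32 - prefix));
            this.network = toInt(parts[0]) & this.mask;
            this.protocol = protocol;
            this.port = port;
        }

        boolean allows(String proto, int dstPort, String sourceIp) {
            return protocol.equals(proto)
                    && port == dstPort
                    && (toInt(sourceIp) & mask) == network;
        }
    }

    /** Converts dotted-quad IPv4 text to a 32-bit value. */
    static int toInt(String ip) {
        String[] octets = ip.split("\\.");
        return (Integer.parseInt(octets[0]) << 24)
                | (Integer.parseInt(octets[1]) << 16)
                | (Integer.parseInt(octets[2]) << 8)
                | Integer.parseInt(octets[3]);
    }

    /** Traffic is allowed if any rule matches; with no match it is dropped. */
    static boolean isAllowed(Rule[] rules, String proto, int dstPort, String sourceIp) {
        for (Rule rule : rules) {
            if (rule.allows(proto, dstPort, sourceIp)) {
                return true;
            }
        }
        return false;
    }
}
```

For example, with the rules `TCP 80 from 0.0.0.0/0` and `TCP 22 from 203.0.113.0/24`, an HTTP request from any IPv4 address is admitted, while an SSH connection is admitted only from inside the /24 range.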
Region versus Availability Zones
Regions
Amazon S3
Organization of Data in S3
Amazon S3 pricing
S3 Standard Storage
S3 Glacier Storage
Billions of Objects Stored
S3 Namespace
S3 API
Storage Resources
Elastic Block Storage
In the diagram, Volume 1 is shown at three points in time. A snapshot is
taken of each of these three volume states.
• In State 1, the volume has 10 GiB of data. Because Snap A is the first
snapshot taken of the volume, the entire 10 GiB of data must be copied.
• In State 2, the volume still contains 10 GiB of data, but 4 GiB have
changed. Snap B needs to copy and store only the 4 GiB that changed after
Snap A was taken. The other 6 GiB of unchanged data, which are already
copied and stored in Snap A, are referenced by Snap B rather than (again)
copied. This is indicated by the dashed arrow.
• In State 3, 2 GiB of data have been added to the volume, for a total of
12 GiB. Snap C needs to copy the 2 GiB that were added after Snap B was
taken. As shown by the dashed arrows, Snap C also references 4 GiB of data
stored in Snap B, and 6 GiB of data stored in Snap A.
• The total storage required for the three snapshots is 16 GiB.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html
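The 16 GiB figure follows from adding up only what each snapshot newly copies: 10 + 4 + 2. The sketch below reproduces that arithmetic with a toy model in which the volume is an array of 1 GiB blocks and each entry is a content version number. The 1 GiB block size, the version numbers and the class name are assumptions made for illustration, not how EBS is implemented.

```java
/** Toy model of incremental snapshots; one array entry stands for a 1 GiB block. */
class IncrementalSnapshotSketch {
    private long[] lastSnapshot = new long[0]; // block versions seen by the previous snapshot
    private int totalStoredGiB = 0;

    /** Takes a snapshot and returns how many GiB had to be copied for it. */
    int snapshot(long[] volume) {
        int copied = 0;
        for (int i = 0; i < volume.length; i++) {
            boolean isNewBlock = i >= lastSnapshot.length;
            if (isNewBlock || volume[i] != lastSnapshot[i]) {
                copied++; // changed or added block: copy it
            }
            // Unchanged blocks are only referenced from the earlier snapshot.
        }
        lastSnapshot = volume.clone();
        totalStoredGiB += copied;
        return copied;
    }

    int totalStoredGiB() {
        return totalStoredGiB;
    }
}
```

Feeding it the three states from the text (10 blocks, then 4 of them changed, then 2 blocks added) yields copies of 10, 4 and 2 GiB, for a total of 16 GiB.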
Instance Storage vs EBS Storage
Amazon Glacier
https://aws.amazon.com/glacier/
Amazon DynamoDB
Amazon CloudFront
How CloudFront delivers Content?
Cloud Computing
BITS Pilani
Hyderabad Campus
What is OpenStack?
- OpenStack.org
Source: http://docs.openstack.org/developer/nova/architecture.html
Source: http://docs.openstack.org/developer/glance/architecture.html
SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 2
Virtual Machine Provisioning and
Manageability
Source: http://www.slideshare.net/mhajibaba/cloud-computing-principles-and-paradigms-5-virtual-machines-provisioning-and-migration-services
VM Provisioning Process
Provisioning a virtual machine or server can be explained and
illustrated as follows:
Source: http://docplayer.net/15384567-Cloud-computing-virtual-machines-provisioning-and-migration-services-mohamed-el-refaey.html
Steps to Provision VM
Provisioning of VM (contd..)
After creating a virtual machine by virtualizing a
physical server, or by building a new virtual server in
the virtual environment, a template can be created
from it.
Most virtualization management vendors (VMware,
XenServer, etc.) give data center administrators
the ability to perform such tasks easily.
Provisioning from a template is an invaluable
feature, because it reduces the time required to
create a new virtual machine.
Administrators can create different templates for different
purposes.
For example, you can create a Windows Server 2003
template for the finance department, or a Red Hat Linux
template for the engineering department.
This enables the administrator to quickly provision a
correctly configured virtual server/virtual machine on
demand.
Examples of template-driven tools:
• Vagrant, a provisioning tool driven by a Vagrantfile (template file).
• Heat, the orchestration tool of OpenStack (Heat templates are written
in YAML).
VM Provisioning Using Vagrant
"Create and configure lightweight, reproducible and
portable environments."
What is Vagrant?
• A tool to build development environments based on
virtual machines.
• Focused on creating environments that are as similar as
possible to, or identical with, production servers.
• Created by Mitchell Hashimoto and written in Ruby.
• Initially built on top of the VirtualBox API; today it also offers
VMware Fusion support (at $79 per license).
Use Cases of Vagrant
Exercise:
Create a virtual test lab on your machine using
Vagrant.
VM Migration Techniques
Migration service, in the context of virtual machines, is the
process of moving a virtual machine from one host server
or storage location to another.
Techniques of VM migration:
• Hot/Live Migration (real-time migration)
• Cold/Regular Migration
• Live Storage Migration
All key machine components, such as CPU, storage
disks, networking, and memory, are completely
virtualized, so the entire state of a virtual
machine can be captured in a set of easily moved data
files.
Live Migration
Live Migration contd…
Assumptions:
• All storage resources are separated from computing resources.
• Storage devices of VMs are attached over the network:
– NAS: NFS, CIFS
– SAN: Fibre Channel
– iSCSI, network block device
– DRBD network RAID
• A high-quality network connection is required:
– Common L2 network (LAN)
– L3 re-routing
Live Migration (Xen Hypervisor)
Stage 0: Pre-Migration. An active virtual machine exists on the
physical host A
Stage 1: Reservation. A request is issued to migrate an OS
from host A to host B (a precondition is that the necessary
resources are available on B and that a VM container of that size is reserved).
Stage 2: Iterative Pre-Copy. During the first iteration, all pages
are transferred from A to B. Subsequent iterations copy only
those pages dirtied during the previous transfer phase.
Stage 3: Stop-and-Copy. Running OS instance at A is
suspended, and its network traffic is redirected to B. CPU state
and any remaining inconsistent memory pages are then
transferred. At the end of this stage, there is a consistent
suspended copy of the VM at both A and B. The copy at A is
considered primary and is resumed in case of failure.
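Stages 2 and 3 can be sketched as a small simulation: each pre-copy round sends the pages dirtied during the previous round, and the loop ends with a stop-and-copy round once the remaining set is small enough or a round limit is hit. This is an illustrative model, not Xen's implementation. The fixed `dirtyRatio` (pages dirtied per page transferred), the stop threshold and the round limit are assumptions introduced for the example.

```java
/** Toy simulation of iterative pre-copy followed by stop-and-copy. */
class PreCopySketch {

    /**
     * Returns the number of pages sent in each round.
     * The last entry is the stop-and-copy round, sent while the VM is suspended.
     */
    static int[] simulate(int totalPages, double dirtyRatio, int stopThreshold, int maxRounds) {
        int[] sent = new int[maxRounds + 1];
        int rounds = 0;
        int toSend = totalPages; // first iteration: all pages go from A to B

        while (rounds < maxRounds && toSend > stopThreshold) {
            sent[rounds++] = toSend;
            // Pages dirtied while this round was in flight must be re-sent next round.
            toSend = (int) Math.min(totalPages, Math.floor(toSend * dirtyRatio));
        }

        // Stop-and-copy: suspend the VM and transfer whatever is still inconsistent.
        sent[rounds++] = toSend;
        return java.util.Arrays.copyOf(sent, rounds);
    }
}
```

The round limit matters: if the guest dirties pages as fast as they are sent (a ratio of 1.0 in this model), pre-copy never converges and the migration has to fall back to stop-and-copy with a large remaining set, which means longer downtime.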
Live Migration timeline
Regular/Cold Migration
Cold migration is the migration of a powered-off
virtual machine.
With cold migration, you have the option of
moving the associated disks from one data store
to another.
The virtual machine is not required to be on
shared storage.
Cold Migration Process:
1. The configuration files, including the NVRAM
file (BIOS settings), log files, as well as the disks of
the virtual machine, are moved from the source
host to the destination host’s associated storage
area.
2. The virtual machine is registered with the new
host.
3. After the migration is completed, the old version of
the virtual machine is deleted from the source
host.
Example: VMware vSphere
Live migration Vs Cold migration
Live Storage Migration of Virtual
Machine
Moves the virtual disks or configuration file
of a running virtual machine to a new data store without
any interruption in the availability of the virtual machine’s
service. Example: VMware Storage vMotion.
Migration of VM disk files within and across storage arrays
with no down time or disruption in service.
Relocates VM disk files from one shared storage location
to another shared storage location with zero downtime,
continuous service availability and complete transaction
integrity.
Benefits:
Simplify storage array migration and storage upgrades.
Dynamically optimize storage I/O performance.
How does it work?
Completely transparent to the virtual machine or the end
user.
Moves the “home directory” (configuration, swap and log
files) of the VM to the new location.
Copies the contents of the entire VM storage disk file to
the destination storage host, leveraging “changed block
tracking” to maintain data integrity during the migration
process.
The VM is quickly suspended and resumed so that it can
begin using the virtual machine home directory and disk
file on the destination data store location.
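The role of changed block tracking can be illustrated with a toy disk copy: blocks are copied while the guest keeps writing, every write sets a flag for its block, and flagged blocks are re-copied before the switch-over. This is a sketch of the general idea only, not VMware's implementation; the class, the method names and the one-integer-per-block disk model are hypothetical.

```java
/** Toy model of a live disk copy that uses a changed-block bitmap. */
class StorageMigrationSketch {
    private final int[] source;
    private final int[] destination;
    private final boolean[] changed; // one flag per block written since it was last copied

    StorageMigrationSketch(int[] sourceDisk) {
        this.source = sourceDisk.clone();
        this.destination = new int[sourceDisk.length];
        this.changed = new boolean[sourceDisk.length];
    }

    /** A guest write that lands on the source disk while the migration runs. */
    void write(int block, int value) {
        source[block] = value;
        changed[block] = true;
    }

    /** Bulk-copies blocks in [from, to) and clears their changed flags. */
    void copyRange(int from, int to) {
        for (int i = from; i < to; i++) {
            destination[i] = source[i];
            changed[i] = false;
        }
    }

    /** Re-copies every flagged block; returns how many blocks were re-sent. */
    int syncChangedBlocks() {
        int resent = 0;
        for (int i = 0; i < source.length; i++) {
            if (changed[i]) {
                destination[i] = source[i];
                changed[i] = false;
                resent++;
            }
        }
        return resent;
    }

    /** True when the destination holds the same data as the source. */
    boolean inSync() {
        return java.util.Arrays.equals(source, destination);
    }
}
```

Only once `inSync()` holds would the brief suspend and resume described above switch the VM over to the destination data store.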
Source: http://www.suredatum.com/blog/oracle-licensing-the-vmotion-trap/
Migration of VMs to Alternate
Platforms
One of the main advantages of virtualization in the
data center is the ability to migrate
virtual machines from one platform to another.
VMware Converter handles migrations between ESX
hosts, VMware Server, and VMware Workstation.
VMware Converter can also import from other
virtualization platforms, such as Microsoft Virtual Server
machines.
Example: VMware vCenter Converter
VM Provisioning and Migration in
Action
ConVirt is an open source framework for managing
open source virtualization platforms such as Xen and KVM.
With it you can create and provision images, diagnose
performance problems, and balance load across the data
center.
It can manage the lifecycle of a virtual machine, provision it, and
migrate it.
https://www.convirture.com/products_opensource.php
Summary
Virtual Machine Provisioning and Manageability
– VM Provisioning Process
Virtual Machine Migration Services
– Migration techniques
Cloud Computing
PaaS
BITS Pilani
Hyderabad Campus
Agenda
o Introduction to PaaS
o Building blocks of PaaS
o Characteristics of PaaS
o Advantages and Risks
o PaaS Example – Windows Azure
• Web based user interface creation tools help to create, modify, test
and deploy different UI scenarios
• Queues are used by Web roles and Worker roles for inter-
application communication, and by applications to communicate
with each other.
o Introduction to GAE
o Why Google App Engine?
o Scalability
o Development Life Cycle of GAE
o GAE Services
o Programming Languages Supported
o GAE Example in Java using Eclipse
SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 2
Google App Engine
GAE is part of Google Cloud and is a Platform as a Service
(PaaS) offering.
It lets you use Google's infrastructure to build and host your web
applications.
Introduction to Google App Engine
• Google App Engine is a PaaS solution that enables users to
host their own applications in Google's data centers, on the same
kind of infrastructure that runs Google Docs, Google Maps and other
popular Google services.
Why Google App Engine?
Scalability
Life of Request
Application Life Cycle
Development Life Cycle
What does GAE Provide?
GAE Services
Limitations with free account on GAE
Programming Language Support
Java:
• App Engine runs Java apps on a Java 7 virtual
machine (Java 6 is currently supported as well).
Python:
• Uses the WSGI (Web Server Gateway Interface) standard.
• Python applications can be written using:
• webapp2 framework
• Django framework
• Any Python code that uses the CGI (Common
Gateway Interface) standard.
• Getting started:
– https://developers.google.com/appengine/docs/python/gettingstartedpython27/
Google’s Go:
• Go is Google’s open source programming
environment.
• Tightly coupled with Google App Engine.
• Applications can be written using App Engine’s Go SDK.
• Getting started:
https://developers.google.com/appengine/docs/go/overview
GAE Java Example using Eclipse
• Develop code
Tools used :
• JDK 1.6
• Eclipse 3.7 + Google Plugin for Eclipse
• Google App Engine Java SDK 1.6.3.1
2. Configure
• The created project will have this structure.
3. Code
• In this example we will return Hello world from GAE.
• Open the HelloWorldGAEServlet Class and add the
following code
import java.io.IOException;
import javax.servlet.http.*;

// Minimal servlet: answers every GET request with a plain-text greeting.
@SuppressWarnings("serial")
public class HelloWorldGAEServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("text/plain");
        resp.getWriter().println("Hello world from GAE");
    }
}
</appengine-web-app>
THANK YOU !
Software as a Service (SaaS)
BITS Pilani
Hyderabad Campus
SaaS (Software as a Service)
- No Worries - It's a Service
SS ZG527 Intro to SaaS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
Objectives
Introduction to SaaS
When do you opt for SaaS?
For better understanding
• Imagine a system
- where you don't have to buy new hardware or update
software
- where you pay nothing, or pay only for what you use
- where everything is done as a service: infrastructure,
computing, storage and usage
- where you don't have to spend your own resources on
infrastructure security and operational security
- where you cut your IT spending
- where you have freedom of usage from anywhere with
internet connectivity
- which is eco-friendly
Dependency on IaaS and PaaS
SaaS - Definition
The most complete cloud computing service model is one in
which the computing hardware and software, as well as the
solution itself, are provided by a vendor as a complete service
offering.
SaaS is a model where an application is hosted on a remote
data center and provided as a service to customers across the
internet.
In short, in the SaaS model software is deployed as a hosted
service and accessed over the Internet, as opposed to “On
Premise.”
In this model, the provider takes care of all software
development, maintenance and upgrades.
Salesforce.com is a common and popular example of a CRM
SaaS application.
The applications are accessible from various client
devices through a web browser.
http://cloudcomputingwire.com
Is it customizable?
• Many people believe that SaaS software is not
customizable, and for many SaaS applications this is indeed
the case
- for example, user-centric applications such as an office suite.
• Many other SaaS solutions expose Application Programming
Interfaces (API) to developers to allow them to create
custom composite applications
- Salesforce.com, Quicken.com, etc.
• So, SaaS does not necessarily mean that the software is
static or monolithic. Customers can configure user-specific
application parameters and settings.
SaaS – How is it delivered
SaaS – How is it delivered (1)
Source: wiki
An Analogy
Traditional On-Demand Utility
SaaS characteristics
The software is available over the Internet globally through a
browser on demand.
SaaS - Pros
No large upfront costs - usually free trials
Anywhere, anytime, anyone - mobility
Stay focused on business processes
Change software to an Operating Expense instead of a
Capital Purchase, making better accounting and budgeting
sense.
Create a consistent application environment for all users
No concerns for cross platform support
Easy Access
Reduced piracy of your software
Lower Cost
For an affordable monthly subscription
Implementation fees are significantly lower
Continuous Technology Enhancements
SaaS - Cons
Initial time needed for licensing and agreements
Trust, or the lack thereof, is the number one factor blocking the
adoption of software as a service (SaaS).
Centralized control
Possible erosion of customer privacy
Absence of disconnected use
Not suited to high volume data entry
Broadband risk
SaaS Advantages
SaaS User Benefits
• Access Anywhere
- Users can use their applications and access their data
anywhere with an Internet connection and a computing device.
- This enhances the customer experience of the software and
makes it easier for users to get work done fast.
• Freedom to Choose (or Better Software)
- The pay-as-you-go (PAYG) nature of SaaS enables users to
select applications they wish to use and to stop using those
that no longer meet their needs.
- Ultimately, this freedom leads to better software applications
because vendors must be receptive to customer needs and
wants.
SaaS Vendor Benefits
• Increased Total Available Market
- Lower upfront costs and reduced infrastructure capital translate into
a much larger available market for the software vendor,
- because users that previously could not afford the software license or
lacked the skill to support the necessary infrastructure are potential
customers.
- A related benefit is that the decision maker for the purchase of a
SaaS application will be at a department level rather than the
enterprise level that is typical for the perpetual license model.
- This results in shorter sales cycles.
• Enhanced Competitive Differentiation
- The ability to deliver applications via the SaaS model enhances a
software company’s competitive differentiation.
- It also creates opportunities for new companies to compete effectively
with larger vendors.
- On the other hand, software companies will face ever-increasing
pressure from their competitors to move to the SaaS model.
- Those who lag behind will find it difficult to catch up as the software
industry continues to rapidly evolve.
Summary:
Introduction to SaaS
Pros and Cons of SaaS model
SaaS Architecture
BITS Pilani, Hyderabad Campus
Objectives
SaaS Architecture
Applications of SaaS
Traditional packaged Software Vs SaaS
Examples of SaaS
Case study
SS ZG527 SaaS Architecture Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
SaaS – Architecture
(Introduction)
Run by:
• Bandwidth technologies
- The cost of a PC has been reduced
significantly with more powerful computing.
- But the cost of application Software has not
followed.
• A conventional deployment requires time-consuming and
expensive setup and maintenance.
• Licensing issues for businesses contribute
significantly to the use of illegal software and
piracy.
SaaS Application Architecture (1)
- Scalable
- Multitenant efficient
- Configurable
• Scaling the application means:
- Maximizing concurrency and using application
resources more efficiently,
- e.g. by optimizing locking duration, keeping components stateless,
sharing pooled resources such as threads and
network connections, caching reference data, and
partitioning large databases.
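One of the techniques listed above, caching reference data, can be sketched as follows. This is a minimal illustration, not part of the original slides: the class name `ReferenceDataCache` and the loader function are hypothetical, and the "slow lookup" stands in for something like a database query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Caches slow-changing reference data so concurrent requests share one lookup. */
class ReferenceDataCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // hypothetical slow lookup
    private final AtomicInteger loads = new AtomicInteger();

    ReferenceDataCache(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, calling the loader only on the first request for a key. */
    String get(String key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the backing loader was actually invoked. */
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        ReferenceDataCache countries = new ReferenceDataCache(code -> "Country for " + code);
        countries.get("IN");
        countries.get("IN"); // served from the cache
        countries.get("US");
        System.out.println("Backend lookups: " + countries.loadCount()); // prints 2
    }
}
```

Because `ConcurrentHashMap` is thread-safe, many request threads can share one cache instance without extra locking.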
SaaS Application Architecture (2)
Multi-tenancy:
An important architectural shift away from designing
isolated, single-tenant applications.
One application instance must be able to
accommodate users from multiple
companies at the same time,
all transparently to the users.
This requires an architecture that maximizes the
sharing of resources across tenants
yet is still able to differentiate data belonging to
different customers.
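The idea of one shared instance that still keeps each customer's data apart can be sketched as below. This is an illustrative assumption, not the slides' design: the class `TenantStore` and the tenant names are hypothetical, and a real system would apply the same scoping in its database layer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** One shared store; every read and write is scoped by a tenant id. */
class TenantStore {
    // tenantId -> (record key -> record value)
    private final Map<String, Map<String, String>> data = new HashMap<>();

    void put(String tenantId, String key, String value) {
        data.computeIfAbsent(tenantId, t -> new HashMap<>()).put(key, value);
    }

    /** Returns the value only if it belongs to the given tenant, otherwise null. */
    String get(String tenantId, String key) {
        return data.getOrDefault(tenantId, Collections.emptyMap()).get(key);
    }

    public static void main(String[] args) {
        TenantStore store = new TenantStore();
        store.put("acme", "invoice-1", "100 USD");
        store.put("globex", "invoice-1", "250 EUR");
        // The same key resolves to different data depending on the tenant.
        System.out.println(store.get("acme", "invoice-1"));
        System.out.println(store.get("globex", "invoice-1"));
    }
}
```

Both tenants share one object and one code path, yet neither can read the other's records because every lookup carries the tenant id.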
SaaS Application Architecture (3)
Configurable:
• A single application instance on a single server has to
accommodate users from several different companies at
once.
• In a shared instance, customizing the application code for one customer would change
the application for all other customers as well.
• Traditionally, customizing an application meant code
changes for each individual customer.
• Instead, each customer uses metadata to configure the way the
application appears and behaves for its users.
• Configuring the application must be simple and
easy without incurring extra development or operation costs.
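Metadata-driven configuration, as described above, might look like the following sketch. The class `TenantConfig`, the setting names and the default values are all hypothetical; the point is only that behaviour varies per tenant through data, while the code stays the same.

```java
import java.util.HashMap;
import java.util.Map;

/** Per-tenant settings layered over shared defaults; the code is identical for all tenants. */
class TenantConfig {
    private final Map<String, String> defaults = new HashMap<>();
    private final Map<String, Map<String, String>> overrides = new HashMap<>();

    TenantConfig() {
        // Illustrative defaults shared by every tenant.
        defaults.put("theme", "light");
        defaults.put("currency", "USD");
    }

    /** Records a tenant-specific override as metadata; no code change is needed. */
    void set(String tenantId, String setting, String value) {
        overrides.computeIfAbsent(tenantId, t -> new HashMap<>()).put(setting, value);
    }

    /** Returns the tenant's own value if present, otherwise the shared default. */
    String get(String tenantId, String setting) {
        Map<String, String> own = overrides.get(tenantId);
        if (own != null && own.containsKey(setting)) {
            return own.get(setting);
        }
        return defaults.get(setting);
    }

    public static void main(String[] args) {
        TenantConfig config = new TenantConfig();
        config.set("acme", "theme", "dark");
        System.out.println(config.get("acme", "theme"));   // dark
        System.out.println(config.get("globex", "theme")); // light
    }
}
```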
SaaS Models
Example SaaS applications
Salesforce.com
Google Apps
Gmail, Google Groups, Google Calendar, Talk, Docs, etc.
Google Apps Marketplace (Google apps, both free and for a fee)
SaaS examples
Which applications are suitable?
Myths
SaaS is still relatively new and untested.
Web Application Hosting on AWS
Multi-Tenancy in SaaS
3-Tier auto-scalable Web Application Architecture
ASP vs SaaS
Myths (contd..)
Case study
Cloud computing for education
Applicability – Scenario 2
Infrastructure Software
– Serves as the foundation for most other
enterprise software applications.
– Inapplicable to SaaS model
• Installation locally is required
• Forms the basis to run other applications
– Example: Windows XP, Oracle database
Applicability – Scenario 3
Embedded Software
• Software components for embedded systems.
• Supports the functionality of the hardware device
• Inapplicable to SaaS model
- Embedded software and hardware is combined
together and is inseparable.
- Example: software embedded in ATM machines,
cell phones, routers, medical equipment, etc.
Applicability – Scenario 4
Enterprise Software Application
– Performs business functions
– Organize internal and external information
– Share data among internal and external users
– The most standard type of software applicable to
SaaS model
– Example: Salesforce.com CRM application, Siebel
On-demand application.
SaaS Example- Zoho Doc Writer
Case study- DOCUMENT SERVICES: GOOGLE DOCS
Google Docs Portal
Google Docs APIs
Example:
Google Docs packages that one needs to import
are the following:
import com.google.common.*;
import com.google.gdata.util.*;
import com.google.gdata.client.uploader.*;
import com.google.gdata.data.docs.*;
import com.google.gdata.data.media.*;
import com.google.gdata.data.acl.*;
Java code that uploads a file to Google Docs
Embedding Google Docs in Other HTML Pages
Consider a scenario where you have your own web page
and would like to embed Google Docs to use Google
Docs as a back-end store.
Clicking on a link would display a document that is actually stored in
Google Docs.
Requirements:
Google Docs APIs for upload
The unique URL of the document to be inserted is
needed.
To get this unique URL, the file needs to be published as
a web page.
HTML code similar to the following has to be inserted into your web page:
<iframe
src="https://docs.google.com/document/d/1swzqklOR0jcVphTe0DBQ3NwNI8MDI17eB50aBYap3Kk/pub?embedded=true"></iframe>
Case study- Salesforce CRM
Salesforce benefits
Costs:
Receive a very reasonable non-profit rate ($300 per year per user license)
Marketing : Services
• Track all data related to a campaign (date, location, costs).
• Track email blasts internally and view an aggregated record of all
emails sent to a potential student
- Prevents excessive emails
• Track all potential lead data including survey data
• Calculate a Return On Investment
- Some marketing campaigns can be quite costly (print, radio etc.)
- Given the current times of economic hardship, it is crucial to
measure the effectiveness of marketing campaigns.
• Increase rate of conversion by capturing web-based inquiry data.
• Help keep our data clean with the integrated de-duplication tools.
Email Blasts
Vertical Response
• Email Service Provider sends out mass emails to customers.
- This email provided basic contact information and a series of links
including one back to a web-based inquiry form.
- Auto Response Rules:
• Example- creating an effective message back to the student
within minutes of clicking submit.
Survey Tools
Poltzer
• Integrated survey tool: well suited to creating basic surveys and having the
data automatically tracked in Salesforce.
- It can send a survey link out in your mass email, combining the two
services.
• You can use your own external survey tool for more advanced surveys and
reporting, and import the data into Salesforce.
Marketing via Salesforce
Google Docs:
Refer to the book "Moving to the Cloud" for:
Printing the details of the uploaded file
Handling errors while uploading
Sharing the document with a mailing list
Salesforce CRM:
Refer:
Salesforce User Guide Site: http://tinyurl.com/2ajcpgs
Salesforce Blog: http://salesforceatrutgers-sci.blogspot.com
Summary: SaaS
• Software/Interface
- SaaS provides the users a complete software application or the
user interface to the application itself.
• Outsourced Management
- The cloud service provider manages the underlying cloud
infrastructure including servers, network, operating systems, storage
and application software,
- and the user is unaware of the underlying architecture of the cloud.
• Thin client interfaces
- Applications are provided to the user through a thin client interface (e.g.
a browser).
- SaaS applications are platform independent and can be accessed from
various client devices such as workstations, laptops, tablets and
smartphones, running different operating systems.
• Ubiquitous Access
- Since the cloud service provider manages both the application and data,
the users are able to access the applications from anywhere.
• Benefits
- Lower costs
- No infrastructure required
- Seamless upgrades
- Guaranteed performance
- Automated backups
- Easy data recovery
- Secure
- High adoption
- On-the-move access
• Characteristics
- Multi-tenancy
- On-demand software
- Open integration protocols
- Social network integration
• Adoption
- Individual users: High
- Small & medium enterprises: High
- Large organizations: High
- Government: Medium
• Examples
- Google Apps
- Salesforce.com
- Facebook
- Zoho
- Dropbox
- Taleo
- Microsoft Office 365
- LinkedIn
- SlideShare
- CareCloud
Bibliography
Cloud Computing
Orchestration and Dockers
BITS Pilani, Hyderabad Campus
• Docker
• Nova (compute)
• Cinder (block storage)
• Glance (image library)
• Swift (object storage)
• Neutron (network)
• Keystone (identity)
• Heat (orchestration tool)
• CoreOS https://coreos.com/
• OpenShift https://www.openshift.com
• Docker https://www.docker.com/
• Kubernetes http://kubernetes.io/
3)
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable"
We can also push our own applications as images to Docker Hub
• https://www.ibm.com/developerworks/cloud/library/cl-cloud-orchestration-technologies-trs/index.html
• https://docs.docker.com/engine/installation/windows/
SS ZG527
Cloud Computing
CS 8.1
DFS
SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
Objectives
File system
Distributed File System (DFS)?
NAS versus SAN
Another term for DFS is network attached storage (NAS),
referring to attaching storage to network servers that provide
file systems
A similar sounding term that refers to a very different
approach is storage area network (SAN)
SAN makes storage devices (not file systems) available over a network
Benefits of DFSs
DFSs provide:
1. File sharing over a network: without a DFS, we
would have to exchange files by e-mail or use
applications such as the Internet’s FTP.
DFS Architectures
Cluster-Based Distributed File Systems
Google File System (GFS)
The Google File System (GFS) is a scalable DFS
for large, distributed, data-intensive
applications.
GFS divides large files into multiple pieces called chunks
or blocks (64 MB by default) and stores them on different
data servers.
This design is referred to as a block-based design.
Each GFS chunk has a unique 64-bit identifier and
is stored as a file in the lower-layer local file system on
the data server.
GFS distributes chunks across cluster data servers using
a random distribution policy.
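The chunking arithmetic and the random placement described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class `ChunkPlacement` and its methods are hypothetical, one server is picked per chunk with no replication, and real GFS placement also weighs factors such as disk utilization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Splits a file into fixed-size chunks and assigns each chunk to a random server. */
class ChunkPlacement {
    static final long CHUNK_SIZE = 64L * 1024 * 1024; // 64 MB default, as stated in the text

    /** Number of chunks needed for a file; the last chunk may be only partly filled. */
    static long chunkCount(long fileSizeBytes) {
        return (fileSizeBytes + CHUNK_SIZE - 1) / CHUNK_SIZE;
    }

    /** Picks a server index uniformly at random for each chunk. */
    static List<Integer> placeRandomly(long chunks, int serverCount, Random rng) {
        List<Integer> placement = new ArrayList<>();
        for (long i = 0; i < chunks; i++) {
            placement.add(rng.nextInt(serverCount));
        }
        return placement;
    }

    public static void main(String[] args) {
        long fileSize = 200L * 1024 * 1024;  // a 200 MB file
        long chunks = chunkCount(fileSize);  // 4 chunks: 64 + 64 + 64 + 8 MB
        List<Integer> placement = placeRandomly(chunks, 5, new Random());
        System.out.println(chunks + " chunks placed on servers " + placement);
    }
}
```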
GFS Random Distribution Policy
How to replicate?
Google File System
GFS stores a huge number of files, totaling
many terabytes of data.
Individual file characteristics:
– Very large, multiple gigabytes per file
– Files are updated by appending new
entries to the end (faster than overwriting
existing data)
– Files are virtually never modified (other
than by appends) and virtually never
deleted.
– Files are mostly read-only
GFS Architecture
Master and Chunk Server Responsibilities
GFS
Chunks are replicated within a cluster for fault
tolerance, using a primary/backup scheme.
Periodically the master polls all its chunk
servers to find out which chunks each
one stores
– This means the master does not need to be notified
each time a new server joins the cluster, when
servers crash, etc.
Polling occurs often enough to guarantee that
master’s information is “good enough”.
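The polling idea can be illustrated with a small sketch in which the master rebuilds its chunk-location map purely from what the reachable chunk servers report. The class `MasterPoll` and the server names are hypothetical; this models only the bookkeeping, not the network protocol.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Rebuilds the master's view of chunk locations from chunk server reports. */
class MasterPoll {
    /**
     * @param reports server name -> ids of the chunks that server currently stores
     * @return chunk id -> names of the servers holding a replica
     */
    static Map<Long, Set<String>> rebuildLocations(Map<String, List<Long>> reports) {
        Map<Long, Set<String>> locations = new HashMap<>();
        for (Map.Entry<String, List<Long>> report : reports.entrySet()) {
            for (Long chunkId : report.getValue()) {
                locations.computeIfAbsent(chunkId, id -> new HashSet<>()).add(report.getKey());
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> reports = new HashMap<>();
        reports.put("cs1", Arrays.asList(1L, 2L));
        reports.put("cs2", Arrays.asList(2L, 3L));
        // A crashed server simply sends no report, so its chunks drop out of the map.
        System.out.println(rebuildLocations(reports));
    }
}
```

Since the map is rebuilt from reports, a server that joins or crashes needs no special handling; the next poll reflects the change.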
HDFS
Hadoop's Distributed File System is designed to
reliably store very large files across machines in a
large cluster.
It is inspired by the Google File System.
Hadoop DFS stores each file as a sequence of
blocks; all blocks in a file except the last block are
the same size.
Blocks belonging to a file are replicated for fault
tolerance. The block size and replication factor are
configurable per file. Files in HDFS are "write
once" and have strictly one writer at any time.
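The block layout and the storage cost of replication can be worked through with a short sketch. The 128 MB block size and replication factor of 3 used in `main` are assumptions chosen for illustration (the text only says both are configurable per file), and the class `HdfsBlocks` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Computes how a file is split into blocks and how much raw storage replication consumes. */
class HdfsBlocks {
    /** Block lengths for a file: all equal to blockSize except possibly the last. */
    static List<Long> blockLengths(long fileSize, long blockSize) {
        List<Long> lengths = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += blockSize) {
            lengths.add(Math.min(blockSize, fileSize - offset));
        }
        return lengths;
    }

    /** Raw cluster storage used when every block is stored `replication` times. */
    static long rawStorage(long fileSize, int replication) {
        return fileSize * replication;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long fileSize = 300 * mb;   // example file
        long blockSize = 128 * mb;  // assumed block size
        int replication = 3;        // assumed replication factor
        // Blocks of 128 MB, 128 MB and 44 MB; 900 MB of raw storage in total.
        System.out.println("Block lengths (bytes): " + blockLengths(fileSize, blockSize));
        System.out.println("Raw storage (MB): " + rawStorage(fileSize, replication) / mb);
    }
}
```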
Hadoop File system
Hadoop Distributed File System – Goals:
• Store large data sets
• Cope with hardware failure
• Emphasize streaming data access
From GFS to HDFS
Terminology differences:
– GFS master = Hadoop namenode
– GFS chunkservers = Hadoop datanodes
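HDFS Architecture
[Figure: HDFS architecture. The application's HDFS client sends (file name, block id) to the HDFS namenode, which holds the file namespace (e.g. /foo/bar maps to block 3df2) and returns (block id, block location). The namenode sends instructions to the datanodes and receives datanode state. The client sends (block id, byte range) to the HDFS datanodes, which store blocks on their local Linux file systems and return block data.]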
Hadoop Server Roles
Source: http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network
Namenode Responsibilities
Source: http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network
The Name Node is not in the data path. The Name Node only provides the map
of where data is and where data should go in the cluster
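The statement that the Name Node only hands out the map can be sketched as a metadata-only lookup. This is a toy model, not the Hadoop API: the class `NameNodeSketch`, its methods and the datanode names are hypothetical. The path `/foo/bar` and block id `3df2` follow the architecture figure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy name node: it stores only metadata and never touches block contents. */
class NameNodeSketch {
    private final Map<String, List<String>> fileBlocks = new HashMap<>();     // path -> ordered block ids
    private final Map<String, List<String>> blockLocations = new HashMap<>(); // block id -> datanodes

    void register(String path, String blockId, List<String> dataNodes) {
        fileBlocks.computeIfAbsent(path, p -> new ArrayList<>()).add(blockId);
        blockLocations.put(blockId, dataNodes);
    }

    /** Returns, for each block of the file in order, the datanodes holding a replica. */
    List<List<String>> locate(String path) {
        List<List<String>> result = new ArrayList<>();
        for (String blockId : fileBlocks.getOrDefault(path, Collections.emptyList())) {
            result.add(blockLocations.get(blockId));
        }
        return result;
    }

    public static void main(String[] args) {
        NameNodeSketch nameNode = new NameNodeSketch();
        nameNode.register("/foo/bar", "3df2", Arrays.asList("dn1", "dn2", "dn3"));
        // The client would now contact one of these datanodes directly for the bytes.
        System.out.println(nameNode.locate("/foo/bar"));
    }
}
```

The lookup returns locations only, so the bulk data transfer happens between the client and the datanodes and the name node does not become a bandwidth bottleneck.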
Preparing HDFS writes
Pipelined write
Pipelined write (contd..)
The Client is ready to start the pipeline process again for the next block of data
Multi-block replication pipeline
Note: The initial node in the pipeline will vary for each block
Client reading file from HDFS
Data Node reading file from HDFS
Name node, Data node (heart beat)
Single point of failure (name node)
Data Recovery
Overall effect?
Scaling the cluster
Wide scaling:
Cluster size grows by increasing the number of nodes
The network needs to scale accordingly
Deep scaling:
Instead of increasing the number of machines, you can
increase the density of each machine,
i.e., increase each node's capacity with more
CPUs, disk drives and RAM
Fewer machines, but higher network I/O requirements per machine
What does it have to do with cloud computing?
Cloud???
Reference
Summary
Cloud Computing
Multi-Tenancy
BITS Pilani, Hyderabad Campus
Objective
Multi-Tenancy
SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 2
Multitenancy
• Multi-tenancy refers to a principle in software architecture
where a single instance of the software runs on a server,
serving multiple client organizations (tenants).
• Multi-tenancy is different from multi-instance architecture
where separate software instances (or hardware systems) are
set up for different client organizations.
• Multi-tenancy is a critical technology that allows
one instance of an application to serve multiple
customers by sharing resources.
• Multi: multiple, independent customers are served.
• Tenant: any legal entity responsible for its data, served on
a contractual basis. The tenant is the contract signee.
• Applies to: IaaS, PaaS, SaaS
• Single tenant applications: lots of waste
[Figure: four separate App and Db stacks, one per tenant]
• Multi-tenant applications:
[Figure: a single App and Db shared by all tenants]
Monitor Multiple Customers Using Typical Infrastructure
[Figure: a separate DB and WS pair for each of three customers]
Multi-Tenant Network Monitoring Infrastructure
[Figure: one shared DB and management workstation]
Goals of Multi-tenancy
Scale
– Server is distributed and it can handle larger load by
adding more nodes.
Multi-tenants Deployment Modes for Application Server
Multi-tenants Deployment Modes in Data Centers
4 levels of multi tenancy
1. Ad-hoc/customizable instances
2. Configurable instances
3. Configurable multi–tenant efficient instances
4. Scalable, configurable, multi-tenant efficient
instances
Ad-hoc / Customizable Instances
Adv:
- Easy Management: Single copy of the software.
Configurable Multi-Tenant Efficient Instances
All customers share the same version of the
software (only single copy among all
customers).
Scalable, Configurable Multi-Tenant Efficient Instances
All customers share the same version of the
software (only single copy among all customers).
Software is hosted on a cluster of computers.
Hence, the capacity of the system can
scale almost limitlessly,
so the number of customers can grow along with
capacity.
Ex: Gmail, Yahoo Mail, etc.
Disadv: Shared storage problem
Multi-tenant models for cloud services
[Figure: tenants T1 and T2 sharing applications (AP) across the IaaS, PaaS and SaaS layers]
Multi-tenancy Issues in the Cloud
• Conflict between tenants’ opposing goals
– Tenants share a pool of resources and have opposing
goals.
• How does multi-tenancy deal with conflict of interest?
– Can tenants get along together and ‘play nicely’ ?
– If they can’t, can we isolate them?
• How to provide separation between tenants?
• Cloud Computing brings new threats
• Multiple independent users share the same physical infrastructure.
- Thus an attacker can legitimately be in the same physical machine as the target.
Cloud Computing
Cloud Security
BITS Pilani, Hyderabad Campus
Objectives
• Cloud Security
• Who is responsible for Managing Security
• Service Level Agreements: Lifecycle and Management
• Traditional approaches to SLO management
• Automated Policy based management
Cloud Security
Different Aspects of Information
Security
Security Attacks:
Any action that compromises the security of information
owned by an organization.
Security Mechanisms:
A mechanism that is designed to detect, prevent or recover
from a security attack.
Security Services:
A service that enhances the security of the data processing
systems and the information transfers of an organization.
Some General Terms
Security attacks
1. Interruption:
This is an attack on availability
2. Interception:
This is an attack on confidentiality
3. Modification:
This is an attack on integrity
4. Fabrication:
This is an attack on authenticity
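As one illustration of a security mechanism against modification and fabrication, the sketch below attaches an HMAC tag to a message so the receiver can detect tampering. It is not from the slides: the class `IntegrityCheck`, the key and the messages are hypothetical, and it does not address interception, which would require encryption.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Detects message modification or fabrication using an HMAC-SHA256 tag. */
class IntegrityCheck {
    /** Computes the authentication tag for a message under a shared secret key. */
    static byte[] tag(byte[] key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    /** True only if the tag matches the message; uses a constant-time comparison. */
    static boolean verify(byte[] key, String message, byte[] expectedTag) {
        return MessageDigest.isEqual(tag(key, message), expectedTag);
    }

    public static void main(String[] args) {
        byte[] key = "demo-shared-secret".getBytes(StandardCharsets.UTF_8); // illustrative key only
        byte[] sentTag = tag(key, "transfer 10 to Bob");
        System.out.println(verify(key, "transfer 10 to Bob", sentTag));   // true: unmodified
        System.out.println(verify(key, "transfer 1000 to Bob", sentTag)); // false: modified in transit
    }
}
```

An attacker without the key cannot produce a valid tag, so both altered messages (integrity) and forged ones (authenticity) fail verification.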
Passive Vs Active attacks
Passive:
Do not involve any modification
to the contents of an original
message.
E.g. an unauthorized party gains
access to an asset
(unauthorized copying of files or
programs).
[Figure: passive attacks divide into release of contents and traffic analysis]
Active:
Contents of the original
messages are modified in some
way or a false message is
created.
Security Attacks in Practice
Security is the key inhibitor to cloud adoption
Companies are still afraid to use clouds
Recent Cloud attacks
• Infrastructure Security
• Privacy
Infrastructure Security
• Network Level
• Host Level
• Application Level
The Network Level
The Network Level -
Mitigation
Note that network-level risks exist regardless of
what aspects of “cloud computing” services are
being used.
The primary determination of risk level is
therefore not which *aaS is being used.
But rather whether your organization intends to
use or is using a public, private, or hybrid cloud.
The Host Level
SaaS/PaaS
–Both the PaaS and SaaS platforms abstract and
hide the host OS from end users.
–Host security responsibilities are transferred to the
CSP (Cloud Service Provider).
• You do not have to worry about
protecting hosts.
–However, as a customer, you still own the risk of
managing information hosted in the cloud services.
More on attacks…
Who is responsible for managing security
Cloud Security Issues
Loss of Control in the Cloud
Consumer’s loss of control
– Data, applications, resources are located with provider.
– User identity management is handled by the cloud provider
– User access control rules, security policies, and
enforcement are managed by the cloud provider
– Consumer relies on provider to ensure
• Data security and privacy
• Resource availability
• Monitoring and repairing of services/resources
Minimize Loss of Control in the Cloud
Monitoring
Utilizing different clouds
Access control management
Lack of Trust in the Cloud
Trusting a third party requires taking risk
Defining trust and risk
– Opposite sides of the same coin (J. Camp)
– People only trust when it pays (Economist’s view)
– Need for trust arises only in risky situations
Lack of trust here stems mostly from a lack of accountability and
verifiability
Multi-tenancy Issues in the Cloud
Conflict between tenants’ opposing goals
– Tenants share a pool of resources and have opposing
goals
How does multi-tenancy deal with conflict of
interest?
– Can tenants get along together and ‘play nicely’ ?
– If they can’t, can we isolate them?
How to provide separation between tenants?
Who are my neighbors? What is their objective?
Minimize Multi-tenancy in the Cloud
Can’t really force the provider to accept fewer
tenants
– Can try to increase isolation between tenants
• Strong isolation techniques (VPC to some
degree)
• VM Side channel attacks (T. Ristenpart et
al.)
– Can try to increase trust in the tenants
• Who’s the insider, where’s the security
boundary? Who can I trust?
• Use SLAs to enforce trusted behavior
Threat Model
A threat model helps in analyzing a security
problem, designing mitigation strategies, and evaluating
solutions.
Steps:
– Identification: Identify attackers, assets, threats and other
components
– Ranking: Rank the threats
– Mitigation: Choose mitigation strategies
– Solution: Build solutions based on the strategies
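The ranking step can be sketched with a simple scoring rule. The rule used here, score = likelihood × impact on 1 to 5 scales, is a common convention assumed for illustration and is not prescribed by the slides; the class `ThreatRanking` and the sample scores are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Ranks identified threats so that mitigation effort goes to the highest risks first. */
class ThreatRanking {
    static final class Threat {
        final String name;
        final int likelihood; // 1 (rare) to 5 (frequent), illustrative scale
        final int impact;     // 1 (minor) to 5 (severe), illustrative scale

        Threat(String name, int likelihood, int impact) {
            this.name = name;
            this.likelihood = likelihood;
            this.impact = impact;
        }

        /** Assumed risk score: likelihood multiplied by impact. */
        int score() {
            return likelihood * impact;
        }
    }

    /** Returns a new list ordered from highest to lowest score. */
    static List<Threat> rank(List<Threat> threats) {
        List<Threat> sorted = new ArrayList<>(threats);
        sorted.sort(Comparator.comparingInt(Threat::score).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Threat> identified = Arrays.asList(
                new Threat("Data loss", 2, 4),
                new Threat("Account hijacking", 4, 5),
                new Threat("Insider threat", 3, 5));
        for (Threat t : rank(identified)) {
            System.out.println(t.score() + "  " + t.name);
        }
    }
}
```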
Top 5 cloud security threats
1. Account Hijacking
3. Data Loss
4. Data Breach
5. Insider Threat
Account hijacking
• Multi-factor authentication
• Protect the global admin account
Data loss
• Accidental deletion
• Archiving service
• User / Admin level recycle bin
• Can recover data for a period of time
• Redundancy for natural disasters (geo-redundancy)
• Data breach
• Media breach
• Physical security
• Finding the data is like finding a needle in a haystack
• Encryption at rest
• Man in the middle
• Encryption in transit within and outside datacenters
• End-to-end encryption
• Message encryption
BITS Pilani, Hyderabad Campus
SS ZG527 Cloud Computing
CS 8.2: HADOOP
Dr. Subhrakanta Panda
SS ZG527 HADOOP Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
Objectives
Introduction to Hadoop
MapReduce
Understanding the logical steps of MapReduce
Exploring the word count Java program in detail
Summary of MapReduce facts
Hadoop
Big data?
Wow, that is so much data to process!
According to research reported in 2012, Hadoop held the
world record for the fastest system to sort large data
(500 GB in 59 seconds and 100 terabytes in
173 minutes). It is designed to answer the question: “How to
process big data with reasonable cost and time?”
Hadoop components and importance of
MapReduce
Hadoop is optimized for batch-processing
applications, and scales with the number of
CPUs available in the cluster
MapReduce is a fundamental building block
of Hadoop
It provides a framework for massively parallel
processing
It provides scalability
Programmers can focus on their program
while the framework takes care of the
details of parallelization, fault tolerance,
locality optimization, and load balancing
Paradigm shift: in the MapReduce
programming model, computation goes to
the data rather than data coming to the program.
Processing takes place where the data is.
Hadoop components
Hive: Provides a database query interface to Apache Hadoop.
Pig: A high-level data-flow language and execution framework for parallel computation.
ZooKeeper: An open-source server that enables highly reliable distributed coordination.
HBase: A scalable, distributed database that supports structured data storage for large tables.
Mahout: A scalable machine learning and data mining library.
Sqoop: A tool designed to transfer data between Hadoop and relational database servers.
Hadoop Framework
MapReduce (Data Processing Framework)
MapReduce is a software framework for easily running applications that
process large amounts of data in parallel on large clusters (thousands
of nodes) of commodity hardware in a reliable, fault-tolerant
manner
MapReduce?
Originated at Google [OSDI ’04]
A simple programming model and distributed programming
framework (works on divide and conquer)
Used for processing and generating large data sets
Functional model
For large-scale data processing
– Exploits a large set of commodity computers
– Executes processes in a distributed manner
– Offers high availability
Motivation
High demand for very large-scale data processing
Common themes across these demands
– Lots of machines needed (scaling)
– Two basic operations on the input
• Map
• Reduce
Architecture overview
blog.raremile.com
The JobTracker:
Central authority for the complete MapReduce cluster,
responsible for scheduling and monitoring
MapReduce jobs.
Responds to client requests for job submission and status.
The TaskTracker:
Worker that accepts map and reduce tasks from the
JobTracker, launches them, keeps track of their
progress, and reports it to the JobTracker.
Keeps track of the resource usage of tasks and kills
tasks that overshoot their memory limits.
Ref: Jeffrey Dean and Sanjay Ghemawat
Distributed Grep
Very big data -> Map -> Reduce -> Result
Map:
– Accepts input key/value pair
– Emits intermediate key/value pair
Reduce:
– Accepts intermediate key/value pair
– Emits output key/value pair
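Distributed grep can be simulated on a single machine as a rough sketch. As described by Dean and Ghemawat, the map function emits a line if it matches the pattern and the reduce function is an identity function. The file names, contents, and helper names below are hypothetical, and the in-memory grouping merely stands in for the shuffle that a real cluster performs.

```python
import re
from collections import defaultdict


def grep_map(filename, line, pattern):
    """Map: emit (filename, line) when the line matches the pattern."""
    if re.search(pattern, line):
        yield filename, line


def grep_reduce(filename, lines):
    """Reduce: identity function, copies the matching lines to the output."""
    return filename, list(lines)


def distributed_grep(files, pattern):
    # Shuffle: group intermediate values by key (here the file name).
    groups = defaultdict(list)
    for filename, text in files.items():
        for line in text.splitlines():
            for key, value in grep_map(filename, line, pattern):
                groups[key].append(value)
    return dict(grep_reduce(k, v) for k, v in sorted(groups.items()))


files = {  # hypothetical input
    "a.log": "ok start\nerror: disk full\nok end",
    "b.log": "ok\nerror: timeout",
}
result = distributed_grep(files, r"error")
print(result)
```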
Flow of MapReduce
1. Define Inputs
5. Define output
MapReduce Programming Model
Map function:
(K_in, V_in) -> list(K_inter, V_inter)
Reduce function:
(K_inter, list(V_inter)) -> list(K_out, V_out)
Examples
let map(k, v) = emit(k.toUpper(), v.toUpper())
– (“foo”, “bar”) -> (“FOO”, “BAR”)
– (“key2”, “data”) -> (“KEY2”, “DATA”)
Example: Word Count
def mapper(line):
    for word in line.split():
        output(word, 1)
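A complete word count needs a reducer as well as the mapper. The sketch below simulates the whole flow on one machine, including an optional combiner that pre-aggregates each mapper's output. The `combiner`, `reducer`, and `word_count` names and the sample lines are assumptions for illustration, and the in-memory grouping stands in for the framework's shuffle.

```python
from collections import defaultdict


def mapper(line):
    """Map: emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word, 1


def combiner(pairs):
    """Combiner: pre-aggregate one mapper's output before the shuffle."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return local.items()


def reducer(word, counts):
    """Reduce: sum all partial counts for one word."""
    return word, sum(counts)


def word_count(lines):
    # Shuffle: group the intermediate (word, count) pairs by word.
    groups = defaultdict(list)
    for line in lines:
        for word, count in combiner(mapper(line)):
            groups[word].append(count)
    # Reduce each group, visiting the keys in sorted order.
    return dict(reducer(w, c) for w, c in sorted(groups.items()))


counts = word_count(["to be or not to be", "be quick"])
print(counts)
```

With the combiner, the first line sends ("to", 2) and ("be", 2) to the shuffle instead of four separate pairs, which reduces the data moved between the map and reduce phases.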
Word Count Execution
Word Count with Combiner
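import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
public class WordCount {
  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable one = new IntWritable(1);
    private final Text word = new Text();
    // Emit (word, 1) for every token of the input line.
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        context.write(word, one);
      }
    }
  }
  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Sum the counts collected for one word.
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable value : values) {
        sum += value.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "wordcount");
    job.setJarByClass(WordCount.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setCombinerClass(Reduce.class); // the reducer doubles as the combiner
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
} // WordCount class end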
Running the application
Step 1: Compile your program.java and create a jar
Step 2: Place the input files in the appropriate HDFS directory
– /user/CSE/wordcount/input/file01 (Hello WILP students)
– /user/CSE/wordcount/input/file02 (How are you! Bye for now)
Step 3: Run the jar with the hadoop jar command, giving the input and output directories as arguments
Output:
hadoop fs -cat /user/CSE/wordcount/output/part-r-00000
Bye 1
Hello 1
How 1
WILP 1
are 1
for 1
now 1
students 1
you! 1
Word Count example code (Java)
http://hadoop.apache.org/docs/stable/mapred_tutorial.html
http://wiki.apache.org/hadoop/WordCount
Challenges of Cloud Environment
Cheap nodes fail, especially when you have many
– Mean time between failures for 1 node = 3 years
– MTBF for 1000 nodes = 1 day
– Solution: Build fault tolerance into system
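The jump from 3 years to 1 day follows from simple arithmetic. The sketch below assumes independent nodes with a constant failure rate, so that failure rates add up and the cluster MTBF is the node MTBF divided by the node count; this model is an assumption, not something stated in the slides.

```python
def cluster_mtbf_days(node_mtbf_years, nodes):
    """Expected time between failures anywhere in the cluster, in days.

    Assumes independent nodes with a constant failure rate, so the
    cluster MTBF is the node MTBF divided by the number of nodes.
    """
    return node_mtbf_years * 365 / nodes


# 3 years per node spread over 1000 nodes: roughly one failure per day.
print(round(cluster_mtbf_days(3, 1000), 1))
```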
MapReduce Execution Details
Fault Tolerance in MapReduce
1. If a task crashes:
– Retry on another node
• OK for a map because it had no dependencies
• OK for reduce because map outputs are on disk
– If the same task repeatedly fails, fail the job or
ignore that input block
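The retry rule for a crashed task can be sketched as follows. This is a simplified illustration, not Hadoop's scheduler: the `TaskFailed` exception, the node names, and the `max_attempts` limit of 4 are assumptions made for the example.

```python
class TaskFailed(Exception):
    """Raised by a task attempt that crashes (hypothetical)."""


def run_with_retries(task, nodes, max_attempts=4):
    """Try a task on successive nodes; fail the job after max_attempts.

    `task` is any callable taking a node name; `max_attempts` is an
    assumed, configurable limit.
    """
    errors = []
    for attempt in range(max_attempts):
        node = nodes[attempt % len(nodes)]  # move on to another node
        try:
            return node, task(node)
        except TaskFailed as exc:
            errors.append((node, str(exc)))
    raise RuntimeError(f"job failed after {max_attempts} attempts: {errors}")


def flaky_map_task(node):
    """Hypothetical map task that crashes on node1 and succeeds elsewhere."""
    if node == "node1":
        raise TaskFailed("crashed")
    return f"map output written on {node}"


node, output = run_with_retries(flaky_map_task, ["node1", "node2", "node3"])
print(node, output)
```

Re-running a map task is safe because it depends only on its input block, which is why a retry on another node produces the same result.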
Fault Tolerance in MapReduce
2. If a node crashes:
– Relaunch its current tasks on other nodes
– Relaunch any maps the node previously ran
• Necessary because their output files were lost
along with the crashed node
MapReduce (Map task)
What if data is not local?
MapReduce (Reduce task)
Examples
Inverted Index
• Map:
def mapper(filename, text):
    for word in text.split():
        output(word, filename)
• Reduce:
def reduce(word, filenames):
    output(word, sorted(set(filenames)))
Inverted Index Example
Input:
hamlet.txt: to be or not to be
12th.txt: be not afraid of greatness
Map output:
to, hamlet.txt
be, hamlet.txt
or, hamlet.txt
not, hamlet.txt
be, 12th.txt
not, 12th.txt
afraid, 12th.txt
of, 12th.txt
greatness, 12th.txt
Reduce output:
afraid, (12th.txt)
be, (12th.txt, hamlet.txt)
greatness, (12th.txt)
not, (12th.txt, hamlet.txt)
of, (12th.txt)
or, (hamlet.txt)
to, (hamlet.txt)
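The inverted index example can be reproduced with a small single-machine simulation. The two documents are the ones from the slide; the `inverted_index` driver and its in-memory grouping are assumptions that stand in for the framework's shuffle, and duplicates are removed in the reducer so that each file is listed once per word.

```python
from collections import defaultdict


def mapper(filename, text):
    """Map: emit (word, filename) for every word in the document."""
    for word in text.split():
        yield word, filename


def reducer(word, filenames):
    """Reduce: emit the sorted, de-duplicated list of files for a word."""
    return word, sorted(set(filenames))


def inverted_index(documents):
    # Shuffle: group file names by word.
    groups = defaultdict(list)
    for filename, text in documents.items():
        for word, name in mapper(filename, text):
            groups[word].append(name)
    return dict(reducer(w, f) for w, f in sorted(groups.items()))


documents = {
    "hamlet.txt": "to be or not to be",
    "12th.txt": "be not afraid of greatness",
}
index = inverted_index(documents)
for word, files in index.items():
    print(word, files)
```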
Summary of MapReduce facts
Summary of MapReduce facts (contd..)
Principal philosophies:
Make it scale, so you can throw hardware at problems
Make it cheap, saving hardware, programmer and
administration costs (but necessitating fault tolerance)
Available solutions with cloud platforms
AWS (EMR)
Microsoft Azure (Azure HDInsight)
OpenStack (Sahara)
Summary
Hadoop components and importance of
MapReduce
Understanding the logical steps of MapReduce
Exploring the word count Java program in detail
A few examples
Summary of MapReduce facts
Cloud Computing: SLA
SLA – Service Level Agreement
Enterprises enter into a legal agreement – SLA (Service
Level Agreement) with the infrastructure service providers
to guarantee a minimum quality of service (QoS).
General QoS Parameters:
– System CPU
– Data storage
– Network bandwidth
SLA rules:
The application’s server machine will be available for 99.9% of the key
business hours of the application’s end users, also called core time, and
85% of the non-core time.
The service provider will respond to a reported issue in less than 10
minutes during core time, and within one hour during non-core time.
These SLAs are termed infrastructure SLAs, and the
providers are called ASPs (Application Service Providers).
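The availability percentages in the SLA rules translate directly into a downtime budget. The sketch below works this out for a hypothetical month of 30 days with 10 core hours and 14 non-core hours per day; the split of the day and the function name are assumptions chosen only for illustration.

```python
def allowed_downtime_minutes(availability_percent, period_hours):
    """Maximum downtime, in minutes, that still meets the availability target."""
    return (100 - availability_percent) / 100 * period_hours * 60


# Hypothetical month: 30 days, 10 core hours and 14 non-core hours per day.
core = allowed_downtime_minutes(99.9, 30 * 10)
non_core = allowed_downtime_minutes(85, 30 * 14)
print(f"core time: {core:.0f} minutes, non-core time: {non_core / 60:.0f} hours")
```

Under these assumptions, 99.9% availability leaves about 18 minutes of downtime per month in core time, while 85% leaves about 63 hours in non-core time.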
SLA – Service level Agreement
An implementation of an SLA should specify:
• Purpose: Objectives to achieve by using an SLA.
• Restrictions: Necessary steps or actions that need
to be taken to ensure that the requested level of
service is delivered.
• Validity Period: Period of time during which the
SLA is valid.
• Scope: Services that will be delivered to the
consumer and services that are outside the SLA.
• Parties: Any involved organizations or individuals
and their roles (e.g. provider, consumer).
• Service-Level Objectives (SLOs): Levels of
service on which both parties agree. These are
expressed by means of service-level indicators such
as availability, performance, and reliability.
• Penalties: The penalties that apply if the
delivered service does not achieve the defined
SLOs.
• Optional Services: Services that are not
mandatory but might be required.
• Administration: Processes that are used to
guarantee that SLOs are achieved, and the
organization responsible for controlling these
processes.
Example: Amazon EC2 SLA
Amazon S3 SLA
Key components of SLA:
Service Level Parameter: Describes an
observable property of a service whose value is
measurable (reasonable, attainable, enforceable,
measurable).
Metrics: These are definitions of values of service
properties that are measured from a service-
providing system (e.g., server uptime is 98% over a
period of 10 weeks).
Function: A function specifies how to compute a
metric’s value from the values of other metrics and
constants.
Measurement directives: These specify how to
measure a metric.
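A function in this sense can be as simple as deriving uptime from two measured values. The sketch below uses the 10-week period from the metrics example; the downtime figure of 33.6 hours and the function name are assumptions chosen so that the result comes out at 98%.

```python
def uptime_percent(total_hours, downtime_hours):
    """Function: compute the uptime metric from two measured metrics."""
    return (total_hours - downtime_hours) / total_hours * 100


total = 10 * 7 * 24  # 10-week measurement period, in hours
measured = uptime_percent(total, downtime_hours=33.6)  # hypothetical downtime
print(f"uptime: {measured:.1f}%")
```

A measurement directive would additionally state how `downtime_hours` is obtained, for example by probing the server at a fixed interval.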
Types of SLA
Infrastructure SLA
Life Cycle of SLA
Five phases in SLA life cycle:
1. Contract definition
2. Publishing and discovery
3. Negotiation
4. Operationalization: SLA operation consists of
– SLA Monitoring
– SLA Enforcement
– SLA Accounting
5. De-commissioning
1. Contract definition : Define a set of service offerings and
corresponding SLAs using standard templates.
3. Negotiation:
- For a standard packaged application offered as service, this
phase is automated.
- For customized applications hosted on cloud platforms, this phase
is manual.
- The service provider analyzes the application’s behavior with
respect to scalability and performance before agreeing on the
specification of the SLA.
- At the end of this phase, the SLA is mutually agreed upon by the
service provider and the customer.
4. Operationalization: SLA operation consists of:
• SLA monitoring – measure parameter values, calculate the metrics
defined as part of the SLA, and determine the deviation.
• SLA accounting – capture and archive SLA adherence for
compliance. The application’s actual performance and the performance
guaranteed as part of the SLA are reported, along with the penalties paid
for each SLA violation.
• SLA enforcement – take appropriate action when runtime monitoring
detects an SLA violation: notify the concerned parties and charge the
penalties, among other things.
5. De-commissioning :
- Termination of all activities performed under a particular SLA
when the hosting relationship between the service provider and
the service consumer has ended.
- SLA specifies the terms and conditions of contract termination
and specifies situations under which the relationship between a
service provider and a service consumer can be considered to be
legally ended.
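The three activities of the operationalization phase (monitoring, enforcement, accounting) can be sketched together. All SLO names, targets, measurements, and the flat penalty per violation below are hypothetical; real SLAs define their own metrics and penalty schedules.

```python
def monitor(measured, slo):
    """Monitoring: deviation of each measured metric from its SLO target."""
    return {name: measured[name] - target for name, target in slo.items()}


def enforce(deviations, penalty_per_violation):
    """Enforcement: list the violated SLOs and the total penalty to charge."""
    violations = [name for name, dev in deviations.items() if dev < 0]
    return violations, len(violations) * penalty_per_violation


# Hypothetical SLOs (higher is better) and one month of measurements.
slo = {"availability_percent": 99.9, "requests_within_1s_percent": 95.0}
measured = {"availability_percent": 99.5, "requests_within_1s_percent": 97.2}

deviations = monitor(measured, slo)
violations, penalty = enforce(deviations, penalty_per_violation=100)

# Accounting: archive what was guaranteed, what was delivered, and what was charged.
report = {"slo": slo, "measured": measured, "violations": violations, "penalty": penalty}
print(report)
```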
SLA Management in Cloud
SLA management of applications hosted on cloud platforms
involves five phases:
• Feasibility Analysis
– Technical feasibility
– Infrastructure feasibility
– Financial feasibility
• On-Boarding of Application
Moving an application to the hosting platform of the managed service provider (MSP) is called on-boarding.
• Preproduction
The application is hosted in a simulated production environment.
• Production
The application is made accessible to its end users under the agreed SLA.
• Termination
- When the customer wishes to withdraw the hosted application and does not wish to
continue to avail the services of the MSP for managing the hosting of its application, the
termination activity is initiated.
- On initiation of termination, all data related to the application are transferred to the
customer and only the essential information is retained for legal compliance.
TRADITIONAL APPROACHES TO SLO MANAGEMENT
Load Balancing Algorithms
Class-agnostic
The front-end node is aware of neither the type of client from
which the request originates nor the category (e.g., browsing, selling,
payment) to which the request belongs.
Class-aware
With class-aware load balancing and request distribution, the front-end node
must additionally inspect the type of client making the request and/or the type of
service requested before deciding which back-end node should service the
request.
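The difference between the two approaches can be shown with a small sketch. Round-robin is used here as the distribution policy purely as an assumption; the class names, node names, and request categories are hypothetical.

```python
from itertools import cycle


class ClassAgnosticBalancer:
    """Round-robin: ignores who sent the request and what it asks for."""

    def __init__(self, backends):
        self._next = cycle(backends)

    def route(self, request):
        return next(self._next)


class ClassAwareBalancer:
    """Inspects the request category before choosing a back-end pool."""

    def __init__(self, pools, default):
        self._pools = {cat: cycle(nodes) for cat, nodes in pools.items()}
        self._default = cycle(default)

    def route(self, request):
        pool = self._pools.get(request["category"], self._default)
        return next(pool)


requests = [{"category": "browsing"}, {"category": "payment"}, {"category": "browsing"}]
agnostic = ClassAgnosticBalancer(["node1", "node2"])
aware = ClassAwareBalancer({"payment": ["secure1"]}, default=["node1", "node2"])
print([agnostic.route(r) for r in requests])
print([aware.route(r) for r in requests])
```

The class-agnostic balancer sends the payment request to whichever node is next in line, while the class-aware balancer always sends it to the dedicated pool.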
Admission Control Mechanisms
QoS Aware: Requests from low priority users & requests that are likely to
consume more system resources can be rejected.
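A QoS-aware admission decision might look like the sketch below. The policy is hypothetical: low-priority requests may not use the last 20 units of a 100-unit capacity, and no request may push the load past the capacity. The field names `priority` and `cost` are assumptions as well.

```python
def admit(request, current_load, capacity=100, reserve_for_high=20):
    """Return True if the request should be accepted.

    Hypothetical policy: low-priority requests may not use the last
    `reserve_for_high` units of capacity; no request may exceed capacity.
    """
    if request["priority"] == "high":
        limit = capacity
    else:
        limit = capacity - reserve_for_high
    return current_load + request["cost"] <= limit


load = 70
low_ok = admit({"priority": "low", "cost": 15}, load)       # 85 exceeds the low-priority limit of 80
high_ok = admit({"priority": "high", "cost": 15}, load)     # 85 is within the capacity of 100
heavy_ok = admit({"priority": "high", "cost": 40}, load)    # 110 exceeds the capacity
print(low_ok, high_ok, heavy_ok)
```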
Automated Policy-based Management
Policy-based Management System (example)
Consider a policy-based management system used by a cloud
computing environment in which 80% load is optimal for physical servers.
There are 3 physical machines (servers), A, B, and C, each with a CPU and memory
capacity of 100 units. The data center in which A, B, and C are hosted follows a
green computing methodology for conserving power, meaning that physical
machines are not turned on until absolutely required. The following figure shows
the resource allocation to different virtual machines in the data center.
Resource allocation to VMs in a data center
None of the physical machines A, B, and C has the resources to
provision a new VM (VM8). So, a new physical machine, D, needs to be
switched on to host it.
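The placement decision can be sketched as a first-fit search under the 80% policy. The capacity of 100 units and the 80% limit come from the example; the per-machine usage figures, the demand of the new VM, and the first-fit strategy itself are assumptions, since the original figure is not reproduced here.

```python
CAPACITY = 100      # CPU and memory units per physical machine (from the example)
OPTIMAL_LOAD = 0.8  # policy: keep each machine at or below 80% load


def fits(used, demand):
    """A VM fits if both CPU and memory stay within the 80% limit."""
    limit = CAPACITY * OPTIMAL_LOAD
    return all(u + d <= limit for u, d in zip(used, demand))


def place_vm(machines, demand, standby):
    """First-fit placement; power on a standby machine only if needed."""
    for name, used in machines.items():
        if fits(used, demand):
            machines[name] = tuple(u + d for u, d in zip(used, demand))
            return name
    name = standby.pop(0)  # green computing: switching on is the last resort
    machines[name] = tuple(demand)
    return name


# Hypothetical (cpu, memory) usage of the running machines.
machines = {"A": (70, 60), "B": (75, 50), "C": (65, 75)}
host = place_vm(machines, demand=(20, 15), standby=["D"])
print(host)
```

With these numbers, the new VM would push every running machine above 80% CPU load, so machine D is switched on, which mirrors the conclusion of the example.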
Bibliography
• Buyya K. R., Broberg J., Goscinski A., Cloud Computing:
Principles and Paradigms. Wiley; 2013.
• Recorded Lectures.
• Dinkar Sitaram, Geetha Manjunath, Moving to the Cloud:
Developing Apps in the New World of Cloud Computing; 2012.
• Internet Sources.
Suggested Reading:
• Cloud Computing Black Book (Chapters 10, 11, and 18)