Cloud Computing

The document discusses Amazon S3, an object storage service that offers scalability, availability, security, and performance. It allows storing and protecting unlimited amounts of data for various use cases. The document then describes the design requirements, design principles, and how S3 works, including buckets, objects, metadata, and security features.


Cloud Storage Providers

• Amazon S3 (very important)
• Google Bigtable Datastore
• MobileMe
• Live Mesh
Amazon S3
• Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
• Store and protect any amount of data for a range of use cases, such as data lakes, websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics.
Design Requirements in Amazon S3

• Scalable
Amazon S3 can scale in terms of storage, request rate, and users to support an
unlimited number of web-scale applications.
• Reliable
Store data durably, with 99.99 percent availability. Failures must be tolerated or repaired by the system without any downtime.
• Fast
Amazon S3 was designed to be fast enough to support high-performance
applications. Server-side latency must be insignificant relative to Internet latency.
Any performance bottlenecks can be fixed by simply adding nodes to the system.
• Inexpensive
Amazon S3 is built from inexpensive commodity hardware components.
• Simple
Building highly scalable, reliable, fast, and inexpensive storage is difficult. Doing
so in a way that makes it easy to use for any application anywhere is more
difficult. Amazon S3 must do both.
Design Principles

• Decentralization: It uses fully decentralized techniques to remove scaling bottlenecks and single points of failure.
• Autonomy: The system is designed such that individual components can make decisions based on local information.
• Local responsibility: Each individual component is responsible for achieving its own consistency.
• Controlled concurrency: Operations are designed such that no or limited concurrency control is required.
• Failure toleration: The system considers the failure of components to be a normal mode of operation and continues operation with no or minimal interruption.
• Controlled parallelism: Parallelism can be used to improve performance and the robustness of recovery or the introduction of new nodes.
• Small, well-understood building blocks: Do not try to provide a single service that does everything for everyone; instead, build small components that can be used as building blocks for other services.
• Symmetry: Nodes in the system are identical in terms of functionality and require no or minimal node-specific configuration to function.
• Simplicity: The system should be made as simple as possible.
How S3 Works

• Amazon S3's design aims to provide scalability, high availability, and low latency at commodity costs.
• To store your data in Amazon S3, you work with resources known as buckets and objects.
• A bucket is a container for objects.
• An object is a file and any metadata that describes that file.
• S3 originally stored arbitrary objects of up to 5 GB in size, each accompanied by up to 2 KB of metadata; today each object can be up to 5 TB.
• Objects are organized into buckets.
• Each bucket is owned by an AWS account and is identified by a unique, user-assigned name.
• To store an object in Amazon S3, we first create a bucket and then upload the object to that bucket.
• Once the object is in the bucket, we can open it, download it, and move it.
• When we no longer need an object or a bucket, we can clean up these resources.
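The bucket-and-object workflow above can be illustrated with the AWS SDK for Python (boto3). This is only a minimal sketch: the Region, bucket name, and key are placeholders, not values from the slides, and AWS credentials are assumed to be configured.

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")  # assumes credentials are configured

bucket = "example-demo-bucket-12345"  # placeholder; bucket names must be globally unique

# 1. Create a bucket (the container for objects).
s3.create_bucket(Bucket=bucket)

# 2. Upload an object: the key names it, the body is its value.
s3.put_object(Bucket=bucket, Key="docs/report.txt", Body=b"hello S3")

# 3. Open (download) the object again.
obj = s3.get_object(Bucket=bucket, Key="docs/report.txt")
print(obj["Body"].read())  # b'hello S3'

# 4. Clean up the resources when they are no longer needed.
s3.delete_object(Bucket=bucket, Key="docs/report.txt")
s3.delete_bucket(Bucket=bucket)
```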
S3 - Working
• Buckets and objects are created, listed, and
retrieved using either a REST-style or SOAP
interface.
• Objects can also be retrieved using the HTTP GET interface or via BitTorrent (see the sketch after this list).
• An access control list restricts who can access the
data in each bucket.
• To upload your data (photos, videos, documents
etc.) to Amazon S3, you must first create an S3
bucket in one of the AWS Regions.
• You can then upload any number of objects to the
bucket.
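Because every object is addressable over HTTP, a plain GET request can retrieve it. One common pattern for private objects is a presigned URL; the sketch below assumes boto3 plus the requests library, and the bucket and key names are placeholders.

```python
import boto3
import requests

s3 = boto3.client("s3", region_name="us-east-1")

# Generate a time-limited URL for a private object (placeholder bucket/key).
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-demo-bucket-12345", "Key": "docs/report.txt"},
    ExpiresIn=300,  # URL is valid for 5 minutes
)

# Any HTTP client can now fetch the object with a simple GET.
response = requests.get(url)
print(response.status_code, len(response.content))
```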
S3 - Working
• An object is a file and any metadata that describes that file.
• Amazon S3 is an object store that uses unique key-values to store as many objects
as you want.
• You store these objects in one or more buckets, and each object can be up to 5 TB
in size.
• An object consists of the following:
• Key - The name that you assign to an object. You use the object key to retrieve the
object.
• Version ID - Within a bucket, a key and version ID uniquely identify an object. The
version ID is a string that Amazon S3 generates when you add an object to a
bucket.
• Value - The content that you are storing. An object value can be any sequence of
bytes. Objects can range in size from zero to 5 TB.
• Metadata - A set of name-value pairs with which you can store information regarding the object.
• You can assign metadata, referred to as user-defined metadata, to your objects in
Amazon S3.
• Amazon S3 also assigns system-metadata to these objects, which it uses for
managing objects.
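The object parts listed above (key, value, version ID, and user-defined metadata) map directly onto SDK calls. A minimal boto3 sketch with placeholder names; versioning is enabled first so that S3 generates a version ID for each object added.

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-demo-bucket-12345"  # placeholder

# Enable versioning so every object added to the bucket gets a version ID.
s3.put_bucket_versioning(Bucket=bucket, VersioningConfiguration={"Status": "Enabled"})

# Key = "images/cat.jpg", Value = the bytes, Metadata = user-defined name-value pairs.
put_resp = s3.put_object(
    Bucket=bucket,
    Key="images/cat.jpg",
    Body=b"example bytes",             # object value: any sequence of bytes
    Metadata={"camera": "pixel-7"},    # stored as the x-amz-meta-camera header
)
print(put_resp["VersionId"])           # version ID generated by Amazon S3

# HEAD returns system and user-defined metadata without downloading the body.
head = s3.head_object(Bucket=bucket, Key="images/cat.jpg")
print(head["Metadata"])                # {'camera': 'pixel-7'}
print(head["ContentLength"])           # system metadata managed by S3
```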
S3 - Working
• Working with metadata:
• Subresources - Amazon S3 uses the
subresource mechanism to store object-
specific additional information.
• Access control information -You can control
access to the objects you store in Amazon S3.
Security
• Amazon S3 supports both server-side encryption, with three key management options (SSE-KMS, SSE-C, SSE-S3), and client-side encryption for data uploads.
• Amazon S3 offers flexible security features to block
unauthorized users from accessing your data.
• Amazon S3 provides management features so that
you can optimize, organize, and configure access to
your data to meet your specific business,
organizational, and compliance requirements.
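Server-side encryption can be requested per object (or set as a bucket default). A hedged boto3 sketch; the bucket name is a placeholder and the KMS key alias is hypothetical.

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-demo-bucket-12345"  # placeholder

# SSE-S3: Amazon S3 manages the encryption keys (AES-256).
s3.put_object(
    Bucket=bucket, Key="sse-s3.txt", Body=b"secret",
    ServerSideEncryption="AES256",
)

# SSE-KMS: keys are managed in AWS KMS (the key alias below is hypothetical).
s3.put_object(
    Bucket=bucket, Key="sse-kms.txt", Body=b"secret",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/my-app-key",
)
```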
Features of Amazon S3

• Storage classes
• Amazon S3 offers a range of storage classes designed for
different use cases.
• For example, one can store mission-critical production data in S3 Standard for frequent access, and save costs by storing infrequently accessed data in S3 Standard-IA or S3 One Zone-IA.
• We can store data with changing or unknown access patterns
in S3 Intelligent-Tiering, which optimizes storage costs by
automatically moving your data between four access tiers
when your access patterns change.
• These four access tiers include two low-latency access tiers
optimized for frequent and infrequent access, and two opt-in
archive access tiers designed for asynchronous access for
rarely accessed data.
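A storage class is simply chosen per object at upload time (it can also be changed later by a lifecycle rule). A minimal sketch with placeholder names:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-demo-bucket-12345"  # placeholder

# Frequently accessed, mission-critical data: S3 Standard (the default class).
s3.put_object(Bucket=bucket, Key="hot/data.csv", Body=b"...", StorageClass="STANDARD")

# Changing or unknown access patterns: S3 Intelligent-Tiering.
s3.put_object(
    Bucket=bucket, Key="unknown/data.csv", Body=b"...",
    StorageClass="INTELLIGENT_TIERING",
)

# Infrequently accessed data: S3 Standard-IA.
s3.put_object(Bucket=bucket, Key="cold/data.csv", Body=b"...", StorageClass="STANDARD_IA")
```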
Features of Amazon S3

• Storage management

• Amazon S3 has storage management features that you can use to manage costs,
meet regulatory requirements, reduce latency, and save multiple distinct copies of
your data for compliance requirements.
• S3 Lifecycle – Configure a lifecycle policy to manage your objects and store them
cost effectively throughout their lifecycle. You can transition objects to other S3
storage classes or expire objects that reach the end of their lifetimes.
• S3 Object Lock – Prevent Amazon S3 objects from being deleted or overwritten for
a fixed amount of time or indefinitely. You can use Object Lock to help meet
regulatory requirements that require write-once-read-many (WORM) storage or to
simply add another layer of protection against object changes and deletions.
• S3 Replication – Replicate objects and their respective metadata and object tags to
one or more destination buckets in the same or different AWS Regions for reduced
latency, compliance, security, and other use cases.
• S3 Batch Operations – Manage billions of objects at scale with a single S3 API
request or a few clicks in the Amazon S3 console. You can use Batch Operations to
perform operations such as Copy, Invoke AWS Lambda function, and Restore on
millions or billions of objects.
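An S3 Lifecycle policy like the one described above is just bucket configuration. A sketch assuming boto3; the bucket name, prefix, and the 30/365-day periods are illustrative placeholders.

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-demo-bucket-12345",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                # Transition objects to a cheaper storage class after 30 days...
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                # ...and expire them at the end of their lifetime (365 days).
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```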
Features of Amazon S3
• Access management
• Amazon S3 provides features for auditing and managing access to your
buckets and objects.
• By default, S3 buckets and the objects in them are private.
• You have access only to the S3 resources that you create.
• To grant granular resource permissions that support your specific use case
or to audit the permissions of your Amazon S3 resources, you can use the
following features.
– S3 Block Public Access
– AWS Identity and Access Management (IAM)
– Bucket policies
– Amazon S3 access points
– Access control lists (ACLs)
– S3 Object Ownership
– Access Analyzer for S3
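Buckets are private by default, and the access-management features listed above are configured through the same API. A small sketch that turns on S3 Block Public Access for a placeholder bucket:

```python
import boto3

s3 = boto3.client("s3")

# Reject public ACLs and public bucket policies for this bucket.
s3.put_public_access_block(
    Bucket="example-demo-bucket-12345",  # placeholder
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```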
Features of Amazon S3
• Data processing
• To transform data and trigger workflows to
automate a variety of other processing
activities at scale, you can use the following
features.
– S3 Object Lambda
– Event notifications
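Event notifications are enabled by attaching a notification configuration to the bucket. The sketch below assumes a hypothetical Lambda function ARN and a placeholder bucket; it asks S3 to invoke the function whenever a new object is created under uploads/.

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="example-demo-bucket-12345",  # placeholder
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                # Hypothetical ARN; S3 must also be granted permission to invoke it.
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": "uploads/"}]}
                },
            }
        ]
    },
)
```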
Features of Amazon S3
• Storage logging and monitoring
• Amazon S3 provides logging and monitoring
tools that you can use to monitor and control
how your Amazon S3 resources are being
used.
• Automated monitoring tools:
– Amazon CloudWatch metrics for Amazon S3
– AWS CloudTrail
• Manual monitoring tools:
– Server access logging
– AWS Trusted Advisor
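Server access logging, one of the manual tools above, is switched on per bucket. A sketch with placeholder bucket names, assuming the target log bucket already grants the S3 logging service permission to write.

```python
import boto3

s3 = boto3.client("s3")

# Deliver access logs for the source bucket into a separate log bucket.
s3.put_bucket_logging(
    Bucket="example-demo-bucket-12345",  # placeholder source bucket
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "example-demo-logs-12345",  # placeholder log bucket
            "TargetPrefix": "access-logs/",
        }
    },
)
```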
Features of Amazon S3
• Analytics and insights
• Amazon S3 offers features to help you gain
visibility into your storage usage, which
empowers you to better understand, analyze,
and optimize your storage at scale.
– Amazon S3 Storage Lens
– Storage Class Analysis
– S3 Inventory with Inventory reports
Features of Amazon S3
• Strong consistency
• Amazon S3 provides strong read-after-write
consistency for PUT and DELETE requests of
objects in your Amazon S3 bucket in all AWS
Regions.
• This behavior applies to both writes of new
objects as well as PUT requests that overwrite
existing objects and DELETE requests.
• In addition, read operations on Amazon S3 Select, Amazon S3 access control lists (ACLs), Amazon S3 Object Tags, and object metadata (for example, the HEAD object) are strongly consistent.
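Strong read-after-write consistency means a successful write is immediately visible to subsequent reads and list operations. A small sketch with placeholder names that relies on this behavior:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-demo-bucket-12345", "reports/latest.json"  # placeholders

# Write (or overwrite) the object...
s3.put_object(Bucket=bucket, Key=key, Body=b'{"status": "ok"}')

# ...and the very next read and list are guaranteed to see that write.
print(s3.get_object(Bucket=bucket, Key=key)["Body"].read())
keys = [o["Key"] for o in s3.list_objects_v2(Bucket=bucket, Prefix="reports/")["Contents"]]
assert key in keys
```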
Google Bigtable Datastore
Google Bigtable Datastore
• Cloud Bigtable is a sparsely populated table that can scale to billions of rows and thousands of columns.
• It enables you to store terabytes or even petabytes of data.
• A single value in each row is indexed; this value is known as the row key.
• The row key is the only indexed value and appears in every row.
Google Bigtable Datastore
• Low-latency storage for massive amounts of single-keyed data is made
possible by Google Cloud Bigtable.
• It is an ideal data source for MapReduce operations since it enables high read and write throughput at low latency.
• A MapReduce program executes in three stages: a map stage, a shuffle stage, and a reduce stage.
• Applications can access Google Cloud Bigtable through a variety of client libraries, including a supported extension to the Apache HBase library for Java.
• Because of this, it is compatible with the current Apache ecosystem of
open-source big data software.
• Applications that require high throughput and scalability for key/value
data, where each value is typically no more than 10 MB, should use
Google Cloud BigTable.
• Additionally, Google Cloud Bigtable excels as a storage engine for
machine learning, stream processing, and batch MapReduce operations.
BigTable Storage Concept

• Each massively scalable table in Google Cloud Bigtable is a sorted key/value map
that holds the data.
• The table is composed of rows, which typically describe a single entity, and columns, which contain individual values for each row.
• A single row key is used to index each row, and a column family is often formed
out of related columns.
• The column family and a column qualifier, a distinctive name within the column
family, are combined to identify each column.
• Multiple cells may be present at each row/column intersection.
• A distinct timestamped copy of the data for that row and column is present in
each cell.
• When many cells are put in a column, a history of the recorded data for that row
and column is preserved.
• Google Cloud Bigtable tables are sparse; if a column is not used in a particular row, it takes up no space.
• A point to remember: rows and columns may be left empty, and empty cells consume no storage.
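The row key / column family / qualifier / timestamped-cell model above maps directly onto the Cloud Bigtable client library for Python. A minimal sketch; the project, instance, table, and column-family names are placeholders, and the table and family are assumed to exist already.

```python
from google.cloud import bigtable

client = bigtable.Client(project="my-project")            # placeholder project
table = client.instance("my-instance").table("metrics")   # placeholder instance/table

# Row key: the single indexed value; a common pattern is entity#timestamp.
row_key = b"server-001#2024-01-01T00:00"

# Write one cell: column family "stats", qualifier "cpu", value as bytes.
row = table.direct_row(row_key)
row.set_cell("stats", "cpu", b"0.73")
row.commit()

# Read the row back by its key and pick the newest cell for stats:cpu.
result = table.read_row(row_key)
cell = result.cells["stats"][b"cpu"][0]   # cells are ordered newest first
print(cell.value, cell.timestamp)
```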


Google Bigtable Datastore

• All of the following forms of data can be stored in and searched using Google Cloud Bigtable:
• Time-series information, such as CPU and memory
utilization patterns across various servers.
• Marketing information, such as consumer preferences
and purchase history.
• Financial information, including stock prices, currency
exchange rates, and transaction histories.
• Internet of Things data, such as consumption statistics
from home appliances and energy meters.
• Graph data, which includes details on the connections
between users.
Google Bigtable Datastore
• Google describes Bigtable as a fast and extremely scalable DBMS.
• This design allows Bigtable to scale across thousands of commodity servers that can collectively store petabytes of data.
• Each table in Bigtable is a multidimensional sparse map.
• That is, the table is made up of rows and columns, and each
cell has a timestamp.
• Multiple versions of a cell can exist, each with a different
timestamp.
• With these timestamps, you can select specific versions of a web page or delete cells that are older than a given date and time.
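Version retention and age-based deletion are expressed as garbage-collection rules on a column family, and reads can be filtered to the newest version. A sketch using the Python client with placeholder names; it assumes an admin-enabled client and an existing table.

```python
import datetime
from google.cloud import bigtable
from google.cloud.bigtable import column_family, row_filters

client = bigtable.Client(project="my-project", admin=True)   # placeholder project
table = client.instance("my-instance").table("pages")        # placeholder table

# Delete cells beyond the 3 newest versions, or older than 30 days.
gc_rule = column_family.GCRuleUnion(rules=[
    column_family.MaxVersionsGCRule(3),
    column_family.MaxAgeGCRule(datetime.timedelta(days=30)),
])
table.column_family("content", gc_rule).create()

# When reading, keep only the newest timestamped cell per column.
latest_only = row_filters.CellsColumnLimitFilter(1)
row = table.read_row(b"com.example/index.html", filter_=latest_only)
```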
Google Bigtable Datastore
• A specific row and column contain cells with
individual timestamps (t).
• All client queries made through the Google
Cloud Bigtable architecture are sent through a
frontend server before being forwarded to a
Google Cloud Bigtable node.
• The nodes are organized into a Google Cloud Bigtable cluster, which belongs to a Google Cloud Bigtable instance, a container for the cluster.
Google Bigtable Datastore
• Because the tables are so large, Bigtable splits them at row boundaries and saves them
as tablets.
• Each tablet is about 200MB, and each server houses 100 tablets.
• Given this, data from a database is likely to be stored on many different servers, which need not be in the same geographic location. This architecture also allows for load balancing.
• If one tablet is getting a lot of queries, the system can shed other tablets or move the busy tablet to another machine that is not as busy.
• Also, if a machine fails, since tablets are spread across different machines, users may not even notice the outage.
• When a machine fills up, it compresses some tablets using a Google-proprietary
technique.
• On a minor scale, only a few tablets are compressed.
• On a large scale, entire tablets are compressed, freeing more drive space.
• Bigtable tablet locations are stored in cells, and looking them up is a three-tiered
system.
• Clients point to the META0 table.
• META0 then keeps track of many tables on META1 that contain the locations of the
tablets.
• Both META0 and META1 make use of prefetching and caching to minimize system
bottlenecks.
Google Bigtable Datastore : Issues
• While Bigtable is a robust tool, developers have been
cautious about using it.
• Because it is a proprietary system, they get locked into
Google.
• That is also the case with Amazon’s Web Services and other
cloud providers.
• On the other hand, Google App Engine and Bigtable are
affordable, costing about the same as Amazon’s S3.
• Costs are as follows:
– $0.10–$0.12 per CPU core-hour
– $0.15–$0.18 per GB-month of storage
– $0.11–$0.13 per GB of outgoing bandwidth
– $0.09–$0.11 per GB of incoming bandwidth
Mobile Me
Mobile Me
• MobileMe is a set of cloud services and solutions provided by Apple Inc.
• Designed for use with proprietary Apple devices, such as the iPhone.
• MobileMe provides several solutions that are entirely hosted, provisioned
and managed via a subscription based billing model from Apple’s remote
cloud infrastructure.
• Previously known as .Mac and iTools, MobileMe was replaced with iCloud in
mid-2011.
• MobileMe is Apple's solution that delivers push email, push contacts, and push calendars from the MobileMe service in the cloud to native applications on iPhone, iPod touch, Macs, and PCs.
• MobileMe also provides a suite of ad-free web applications that deliver a desktop-like experience through any modern browser.
• MobileMe applications (www.me.com) include Mail, Contacts, and Calendar,
as well as Gallery for viewing and sharing photos and iDisk for storing and
exchanging documents online.
MobileMe Features
• With a MobileMe email account, all folders, messages, and status indicators look identical
whether checking email on iPhone, iPod touch, a Mac, or a PC.
• New email messages are pushed instantly to the iPhone over Wi-Fi, removing the need to manually check email and wait for downloads.
• Push also keeps contacts and calendars continuously up to date.
• Push works with the native applications on iPhone and iPod touch, Microsoft Outlook for the PC,
and Mac OS X applications, Mail, Address Book, and iCal, as well as the MobileMe web
application suite.
• MobileMe web applications provide a desktop-like experience that allows users to drag and
drop, click and drag, and even use keyboard shortcuts.
• MobileMe provides anywhere access to Mail, Contacts, and Calendar, with a unified interface
• Gallery users can upload, rearrange, rotate, and title photos from any browser; post photos
directly from an iPhone; allow visitors to download print-quality images; and contribute photos
to an album.
• MobileMe iDisk lets users store and manage files online with drag-and-drop filing and makes it
easy to share documents too large to email by automatically sending an email with a link for
downloading the file.
• MobileMe includes 20GB of online storage that can be used for email, contacts, calendar,
photos, movies, and documents
MobileMe
• MobileMe included cloud productivity and synchronization
tools, communication services and remote storage.
• MobileMe applications and services included:
– Find My iPhone: An online tool for iPhone tracking and
management
– Cloud storage: Up to 40 GB
– Address Book and calendar (iCal): An online contacts and
scheduling directory created by synching the iPhone
– Gallery: Online photo and video storage

– MobileMe also provided a PC synchronization application, AOL Instant Messenger (AIM), and iWeb for publishing and deploying hosted websites.
Live Mesh
• Live Mesh is Microsoft’s “software-plus-services” platform
• It lets users manage, access, and share files and applications on the web and across their world of devices.
• Live Mesh has the following components:
– A platform that defines and models a user’s digital relationships
among devices, data, applications, and people—made available to
developers through an open data model and protocols.
– A cloud service providing an implementation of the platform
hosted in Microsoft datacenters.
– Software, a client implementation of the platform that enables
local applications to run offline and interact with the cloud.
– A platform experience that exposes the key benefits of the
platform for bringing together a user’s devices, files and
applications, and social graph, with news feeds across all of these.
Live Mesh
• The Live Mesh software, called Mesh Operating Environment (MOE), is
available for
– Windows XP
– Windows Vista
– Windows Mobile
– Mac OS X
• The software is used to create and manage the synchronization
relationships between devices and data.
• Live Mesh also incorporates a cloud component, called Live Desktop.
• This is an online storage service that allows synchronized folders to be
accessible via a web site.
• It also includes remote desktop software called Live Mesh Remote
Desktop, which can be used to remotely connect and manage any of the
devices in the synchronization relationship.
• Live Mesh Remote Desktop allows you to control your devices from the
Live Mesh application, as well as from any other PC connected to the
Internet.
Live Framework
• For developers, there is a development component consisting of a protocol and
APIs known as Live Framework.
• Live Framework is a REST-based API for accessing the Live Mesh services over HTTP.
• Live Framework differs from MOE in that MOE simply lets folders be shared.
• The Live Framework APIs can be used to share any data item between devices that
recognize the data.
• The API encapsulates the data into a Mesh Object, which is the synchronization
unit of Live Mesh.
• It is then tracked for changes and synchronization.
• A Mesh Object consists of data feeds, which can be represented in Atom, RSS,
JSON, or XML.
• The MOE software also creates Mesh Objects for each Live Mesh folder so they can
be synchronized.
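Live Mesh and the Live Framework were discontinued, so no current endpoint exists; the sketch below is purely illustrative of the REST-over-HTTP, feed-based style the slides describe. The URL, resource names, headers, and feed shape are all hypothetical, and JSON is just one of the feed formats mentioned (Atom, RSS, JSON, XML).

```python
import requests

# Hypothetical Live Framework-style endpoint and token (the real service is retired).
BASE_URL = "https://user-data.example.net/Mesh"
headers = {"Authorization": "WLID1.0 t=<delegation-token>", "Accept": "application/json"}

# A Mesh Object would be exposed as a resource whose entries form a data feed.
resp = requests.get(f"{BASE_URL}/MeshObjects", headers=headers)
resp.raise_for_status()

for entry in resp.json().get("Entries", []):   # hypothetical feed shape
    print(entry.get("Title"), entry.get("SelfLink"))
```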
