
Lecture 04 - Cloud Storage

storage

Uploaded by

idc.cupons

Cloud Computing
Lecture 4

Storage – CAP
RDBMS

Dan Amiga
Stateless Instances

http://yourapp.cloudapp.net
Putting It All Together

[Diagram: LB, three Web role / Worker role pairs, Storage]
Stateless compute
+ Durable storage
-----------------------------
= Scalable application
Scale-up And Scale-out

[Figure: Scale Up vs. Scale Out. Axes: Volume, # Machines. Labels: WWW, DNS, one $10,000 machine, one $1000 machine, five $500 machines.]
Scale Up vs Scale Out
• Scale Up
– Easier (?)
– Bounded
– Expensive and proprietary
– Sometimes a must (?)
• Scale Out
– Harder (?)
– Slower when you start…
– Maintain Session (sticky vs regular)
– Unbounded, Cheaper, Always a must
• Storage is key for scaling out
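The "sticky vs regular" session point can be sketched as two routing policies. This is a minimal illustration, not any particular load balancer's algorithm; the server names, `sticky_route`, and `RoundRobin` are hypothetical.

```python
import hashlib

SERVERS = ["web-0", "web-1", "web-2"]  # hypothetical instance names


def sticky_route(session_id, servers=SERVERS):
    """Sticky routing: hash the session id so the same session
    always lands on the same server (which may hold its state in memory)."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


class RoundRobin:
    """Regular routing: each request goes to the next server in turn,
    so session state has to live in external durable storage."""

    def __init__(self, servers):
        self.servers = servers
        self.position = 0

    def route(self):
        server = self.servers[self.position % len(self.servers)]
        self.position += 1
        return server


balancer = RoundRobin(SERVERS)
round_robin_targets = [balancer.route() for _ in range(4)]
```

With sticky routing, losing a server loses the sessions pinned to it; with round robin any server can handle any request, which is why the deck pairs stateless compute with durable storage.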
On Premise / Traditional Storage Choices

• SAN, NAS, DAS


• Databases
• Offline Archival
• RAID Architecture on top of the above
Application Design Patterns

• Scale out for capacity


• Scale out for redundancy
• Asynchronous communication
• Short time outs with retries
• Idempotent operations
• Stateless with durable external storage
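A minimal sketch of how "short time outs with retries" and "idempotent operations" work together. The `Store` class, the request id scheme, and `TransientError` are assumptions made for illustration; they do not come from the lecture or from any real storage API.

```python
import time


class TransientError(Exception):
    """Stands in for a timeout or dropped reply (hypothetical)."""


def call_with_retries(operation, attempts=3, base_delay=0.01):
    """Run operation(); on TransientError retry with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))


class Store:
    """Toy store whose deposit() is idempotent per request id."""

    def __init__(self):
        self.balance = 0
        self.applied = set()
        self.failures_left = 1  # the first reply is "lost" on purpose

    def deposit(self, request_id, amount):
        # Apply the change only once per request id.
        if request_id not in self.applied:
            self.applied.add(request_id)
            self.balance += amount
        if self.failures_left > 0:
            self.failures_left -= 1
            # The work was done, but the caller cannot tell.
            raise TransientError("reply lost")
        return self.balance


store = Store()
result = call_with_retries(lambda: store.deposit("req-1", 100))
```

The first call applies the deposit and then fails; the retry sees the same request id and does not apply it again, so the balance ends at 100 rather than 200.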
RAID (redundant array of independent disks)

• Storage technology that combines multiple disk drive components into a logical unit.
• Data is distributed across the drives in one of several ways called "RAID levels", depending on the level of redundancy and performance required.
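One way redundancy can be obtained is XOR parity, the idea behind parity-based levels such as RAID 5. The slide does not name specific levels, so treat this as an illustrative assumption; the block contents and the `xor_blocks` helper are made up for the sketch.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Three data "disks" holding one block each, plus one parity block.
data_disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_disks)

# Simulate losing disk 1 and rebuilding it from the survivors and parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
```

XOR-ing the surviving blocks with the parity block cancels everything except the missing block, so any single lost disk can be reconstructed.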
Storage

• Simple, essential storage abstractions:


– Large items of data: Blobs, file streams, …
– Service state: Simple tables, caches, …
– Service communication: Queues, locks, …
• With an emphasis on:
– Massive scale, availability and durability
– Geo-location and geo-replication
• This is not a relational database in the cloud
Durable Storage

Blobs Tables Queues


• Three replicas of everything


• REST API
Storage
Blobs
Queues
AWS Storage Options

• Ephemeral Storage
• Elastic Block Storage (EBS)
• S3
• SQS
• NoSQL – Simple / Dynamo
• Relational Database Storage
• Storage Gateway
http://www.slideshare.net/AmazonWebServices/aws-storage-options
Amazon S3

• https://www.dropbox.com/help/7
• http://aws.amazon.com/s3/
• Objects from 1 byte to 5 TB each; unlimited number of objects
• You can choose a Region to optimize for
latency, minimize costs, or address
regulatory requirements.
• http://aws.amazon.com/s3-sla/
CAP Theorem
• Consistency (Atomic data objects)
– any read operation that begins after a write operation
completes must return that value, or the result of a
later write operation.
– E.g. if A writes 1 then 2 to location X, client B cannot
read 2 followed by 1.
• Availability (Available data objects)
– even when severe (network? storage?) failures occur,
every request must terminate, with minimal latency.
– Put simply: all operations return successfully
• Partition Tolerance
– No set of failures less than total network failure is
allowed to cause the system to respond incorrectly.
– Put simply: if the network stops delivering messages
between two sets of servers, the system will still
continue to work.
Simplified Proof
CAP Transactional Analysis

• You want consistency?
– Give up availability
– Or give up partition tolerance
Tradeoff

• Give up consistency
– e.g. DNS; accept inconsistency
• Give up availability
– Bad idea… use retries
• Give up partition tolerance
– VLDB/clusters; synchronous 2-phase commit
CAP In the real world

• AP: You are guaranteed to get back responses
promptly (even with network partitions), but
you aren’t guaranteed anything about the
value/contents of the response.
• CP: You are guaranteed that any response you
get (even with network partitions) has a
consistent result. But you might not get any
responses whatsoever.
• CA: If the network never fails (and nodes never
crash, as they postulated earlier), then,
unsurprisingly, life is good. But if messages
could be dropped, all guarantees are off.
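The AP and CP behaviours above can be sketched with a toy two-replica cluster. This is a simplified model under stated assumptions, not a real replication protocol: `Cluster`, `Replica`, `Unavailable`, and the rule that a CP cluster rejects writes while partitioned are all hypothetical.

```python
class Unavailable(Exception):
    """Raised when a CP cluster refuses a request during a partition."""


class Replica:
    def __init__(self):
        self.value = None


class Cluster:
    def __init__(self, mode):
        self.mode = mode  # "CP" or "AP"
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Cannot replicate, so refuse rather than diverge.
                raise Unavailable("cannot reach peer replica")
            replica.value = value  # AP: accept locally, peer goes stale
            return
        replica.value = value
        other.value = value

    def read(self, replica):
        return replica.value


# AP: always answers, but the answer may be stale.
ap = Cluster("AP")
ap.write(ap.a, 1)
ap.partitioned = True
ap.write(ap.a, 2)
ap_read_from_b = ap.read(ap.b)

# CP: answers stay consistent, but writes are refused during the partition.
cp = Cluster("CP")
cp.write(cp.a, 1)
cp.partitioned = True
try:
    cp.write(cp.a, 2)
    cp_write_refused = False
except Unavailable:
    cp_write_refused = True
cp_read_from_b = cp.read(cp.b)
```

In the AP run, replica b still returns 1 after a has accepted 2. In the CP run, the write is rejected, so both replicas keep agreeing on 1.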
Consistency Models

• In an ideal world there would only be one consistency model:
when an update is made all observers will see that update

• Tradeoff to get a consistent update:
– Time
– Partition Tolerance or Availability

• An important observation is that in larger-scale distributed
systems, network partitions are a given, and as such consistency
and availability cannot both be achieved at the same time. This
means that one has two choices on what to drop: relaxing
consistency will allow the system to remain highly available
under the partition conditions, and prioritizing consistency
means that under certain conditions the system will not be
available.

• http://www.allthingsdistributed.com/2007/12/eventually_consistent.html
Eventual Consistency

• Different nodes keep replicas and each update is
“eventually” propagated to each replica
– And eventually, there is agreement on which
update is the latest
• As the consistency achieved is eventual, the
system has to resolve conflicts.
– Read repair: The correction is done when a read
finds an inconsistency. This slows down the read
operation.
– Write repair: The correction takes place during a
write operation, if an inconsistency has been
found, slowing down the write operation.
– Asynchronous repair: The correction is not part of
a read or write operation.
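Read repair, as described above, can be sketched with versioned replicas. This is a minimal illustration that assumes a simple integer version per key and last-version-wins conflict resolution; real systems use richer schemes, and the `Replica` class and `read_with_repair` function are hypothetical.

```python
class Replica:
    """Toy replica storing key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def put(self, key, version, value):
        current = self.data.get(key)
        # Keep only the newest version (assumed conflict rule).
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key)


def read_with_repair(replicas, key):
    """Read from every replica, return the newest value,
    and push it to any replica that was behind."""
    responses = [replica.get(key) for replica in replicas]
    known = [response for response in responses if response is not None]
    if not known:
        return None
    latest = max(known, key=lambda pair: pair[0])
    for replica, response in zip(replicas, responses):
        if response != latest:
            replica.put(key, latest[0], latest[1])  # the repair step
    return latest[1]


replicas = [Replica() for _ in range(3)]
for replica in replicas:
    replica.put("x", 1, "old")
replicas[0].put("x", 2, "new")  # update has reached only one replica so far

value = read_with_repair(replicas, "x")
```

The read has to contact every replica and write to the stale ones, which is the extra read latency the slide mentions.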
AWS S3 and Azure Storage Consistency
• Amazon S3 buckets in the US West (Oregon), US West
(Northern California), EU (Ireland), Asia Pacific (Singapore), Asia
Pacific (Tokyo), Asia Pacific (Sydney) and South America (Sao
Paulo) Regions provide read-after-write consistency for PUTS of
new objects and eventual consistency for overwrite PUTS and
DELETES. Amazon S3 buckets in the US Standard Region
provide eventual consistency.
• Azure storage is consistent, available, and partition tolerant.
How?
Storage Usage Comparison in Azure
