Module 6 - Simple Storage Service (S3)

Amazon S3 (Simple Storage Service)

- Amazon Simple Storage Service is storage for the Internet.
- Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web.

Advantages of S3

Amazon S3 is intentionally built with a minimal feature set that focuses on simplicity and robustness:

- Creating buckets – Create and name a bucket that stores data. Buckets are the fundamental
container in Amazon S3 for data storage.
- Storing data – Store a virtually unlimited amount of data in a bucket. Upload as many objects as you like into an Amazon S3 bucket. Each object can contain up to 5 TB of data.
- Downloading data – Download your data anytime you like, or enable others to do the same.
- Permissions – Grant or deny access to others who want to upload data to or download data from your Amazon S3 bucket.
Amazon S3 concepts

Buckets

- To upload your data (photos, videos, documents etc.) to Amazon S3, you must first create an S3
bucket in one of the AWS Regions.
- A bucket is Region-specific: it is created in, and tied to, one AWS Region.
- A bucket is a container for objects stored in Amazon S3.
- Every object is contained in a bucket.
- By default, you can create up to 100 buckets in each of your AWS accounts. If you need more buckets,
you can increase your account bucket limit to a maximum of 1,000 buckets by submitting a service
limit increase.
- For example, if the object named photos/puppy.jpg is stored in the john bucket in the US West (Oregon) Region, then it is addressable using the URL https://john.s3.us-west-2.amazonaws.com/photos/puppy.jpg
Region

- You can choose the geographical AWS Region where Amazon S3 will store the buckets that you
create.
- You might choose a Region to optimize latency, minimize costs, or address regulatory requirements.
- Objects stored in a Region never leave the Region unless you explicitly transfer them to another
Region.
- For example, objects stored in the Europe (Ireland) Region never leave it.

Object

- Amazon S3 is a simple key-value store designed to store as many objects as you want.
- You store these objects in one or more buckets.
- S3 provides object-level storage, i.e., it stores each file as a whole and does not split it into blocks.
- An object can be between 0 bytes and 5 TB in size.
- When you upload an object to a bucket, Amazon S3 replicates it across multiple Availability Zones in the same Region.
An object consists of the following:

- Key – The name that you assign to an object.


- Version ID – Within a bucket, a key and version ID uniquely identify an object.
- Value – The content that you are storing.
- Metadata – A set of name-value pairs with which you can store information regarding the object.
Object Versioning

- When you upload an object using a key that already exists in a bucket, the new object replaces the old one entirely.
- You can use versioning to keep multiple versions of an object in one bucket.
- For example, you could store my-image.jpg (version 111111) and my-image.jpg (version 222222) in a
single bucket.
- Versioning protects you from the consequences of unintended overwrites and deletions.
- You must explicitly enable versioning on your bucket. By default, versioning is disabled.
- Regardless of whether you have enabled versioning, each object in your bucket has a version ID.
- If you have not enabled versioning, Amazon S3 sets the value of the version ID to null. If you have
enabled versioning, Amazon S3 assigns a unique version ID value for the object.
- When you enable versioning on a bucket, objects already stored in the bucket are unchanged. The
version IDs (null), contents, and permissions remain the same.
- Enabling and suspending versioning is done at the bucket level.
- When you enable versioning for a bucket, all objects added to it will have a unique version ID. Unique version IDs are randomly generated.
- This functionality prevents you from accidentally overwriting or deleting objects and affords you the opportunity to retrieve a previous version of an object.
- When you DELETE an object, all versions remain in the bucket and Amazon S3 inserts a delete marker, as shown in the following figure.
- The delete marker becomes the current version of the object. You can, however, GET a noncurrent version of an object by specifying its version ID.

- You can permanently delete an object by specifying the version you want to delete. Only the owner of an Amazon S3 bucket can permanently delete a version.
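The versioning and delete-marker behaviour described above can be modelled with a small in-memory sketch. This is an illustration of the semantics only, not the S3 API; in real S3, version IDs are assigned by the service:

```python
import uuid

class VersionedBucket:
    """Minimal in-memory model of S3 versioning semantics (illustrative only)."""

    def __init__(self):
        self.versions = {}  # key -> list of (version_id, body), newest last

    def put(self, key, body):
        vid = uuid.uuid4().hex
        self.versions.setdefault(key, []).append((vid, body))
        return vid

    def delete(self, key):
        # A simple DELETE does not remove versions; it adds a delete marker.
        marker = uuid.uuid4().hex
        self.versions.setdefault(key, []).append((marker, None))
        return marker

    def get(self, key, version_id=None):
        history = self.versions.get(key, [])
        if version_id is None:
            # GET without a version ID returns the current version,
            # which fails if the current version is a delete marker.
            if not history or history[-1][1] is None:
                raise KeyError(key)
            return history[-1][1]
        for vid, body in history:
            if vid == version_id:
                return body
        raise KeyError(version_id)

b = VersionedBucket()
v1 = b.put("my-image.jpg", b"version 1")
b.put("my-image.jpg", b"version 2")
b.delete("my-image.jpg")
```

After the DELETE, a plain GET fails because the delete marker is now the current version, but the noncurrent version is still retrievable by its version ID (`b.get("my-image.jpg", version_id=v1)`).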
Server Access Logging

- Server access logging provides detailed records for the requests that are made to a bucket. Server access logs are useful for many applications.
- For example, access log information can be useful in security and access audits.
- Each access log record provides details about a single access request, such as the requester, bucket name, request time, request action, response status, and an error code, if relevant.
- Both the source and target S3 buckets must be owned by the same AWS account, and the S3 buckets must both be in the same Region.
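Enabling server access logging amounts to pointing a source bucket at a target bucket and key prefix. A sketch of the configuration payload, in the shape accepted by boto3's `put_bucket_logging` (the bucket name and prefix are placeholders):

```python
import json

# Logging configuration payload; the target bucket must be in the
# same account and Region as the source bucket (see above).
logging_config = {
    "LoggingEnabled": {
        "TargetBucket": "my-log-bucket",        # placeholder name
        "TargetPrefix": "source-bucket-logs/",  # groups log objects under one prefix
    }
}
print(json.dumps(logging_config, indent=2))
```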
Multi-part Upload

- The multipart upload feature is designed to improve the upload experience for larger objects.
- You can upload an object in parts.
- These object parts can be uploaded independently, in any order, and in parallel.
- Multipart uploads cannot be initiated manually through the AWS Management Console; use the AWS CLI, SDKs, or REST API instead.
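The part-splitting arithmetic behind a multipart upload can be sketched as follows. This is a simplified model: S3 requires every part except the last to be at least 5 MiB, and allows at most 10,000 parts per upload.

```python
import math

MIN_PART = 5 * 1024**2   # 5 MiB minimum for all parts except the last
MAX_PARTS = 10_000       # S3 limit on parts per multipart upload

def plan_parts(object_size: int, part_size: int):
    """Split an object into (part_number, offset, length) tuples."""
    if part_size < MIN_PART:
        raise ValueError("part size below the 5 MiB minimum")
    n = math.ceil(object_size / part_size)
    if n > MAX_PARTS:
        raise ValueError("too many parts; increase the part size")
    return [(i + 1, i * part_size, min(part_size, object_size - i * part_size))
            for i in range(n)]

# A 1000 MiB object uploaded in 100 MiB parts -> 10 parts,
# each uploadable independently and in parallel.
parts = plan_parts(1_000 * 1024**2, 100 * 1024**2)
print(len(parts))  # 10
```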
Object Lock

- With Amazon S3 Object Lock, you can store objects using a write-once-read-many (WORM) model. You can use it to prevent an object from being deleted or overwritten for a fixed amount of time or indefinitely.
- You can only enable Amazon S3 Object Lock for new buckets. If you want to turn on Amazon S3 Object Lock for an existing bucket, contact AWS Support.
- When you create a bucket with Amazon S3 Object Lock enabled, Amazon S3 automatically enables versioning for the bucket.
- Once you create a bucket with Amazon S3 Object Lock enabled, you can't disable Object Lock or suspend versioning for the bucket.
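A default retention rule for an Object Lock-enabled bucket looks like this, in the shape accepted by boto3's `put_object_lock_configuration` (the mode and duration here are illustrative, not recommendations):

```python
# Default retention applied to new objects in the bucket.
# COMPLIANCE mode cannot be shortened or removed by any user;
# GOVERNANCE mode can be overridden with special permissions.
object_lock_config = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {
        "DefaultRetention": {
            "Mode": "COMPLIANCE",  # or "GOVERNANCE"
            "Days": 30,            # illustrative retention period
        }
    },
}
```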
Protecting data using encryption

Data protection refers to protecting data while in-transit (as it travels to and from Amazon S3) and at
rest (while it is stored on disks in Amazon S3 data centers)

Server-side encryption – Amazon S3 encrypts your objects before saving them on disks in AWS data centers and then decrypts the objects when you download them.
Client-side encryption – You encrypt your data client-side and upload the encrypted data to Amazon S3.
In this case, you manage the encryption process, encryption keys, and related tools.
Static Website Hosting

- You configure an Amazon S3 bucket for website hosting and then upload your website content to the bucket.
- This bucket must have public read access. It is intentional that everyone in the world will have read access to this bucket.
- Depending on your Region, Amazon S3 website endpoints follow one of these two formats:
- http://bucket-name.s3-website.Region.amazonaws.com
- http://bucket-name.s3-website-Region.amazonaws.com
- To request a specific object that is stored at the root level in the bucket, use the following URL structure:
- http://bucket-name.s3-website.Region.amazonaws.com/object-name
- For example:
- http://example-bucket.s3-website.us-west-2.amazonaws.com/photo.jpg
Storage Classes

- Each object in Amazon S3 has a storage class associated with it.
- Amazon S3 offers a range of storage classes for the objects that you store.
- You choose a class depending on your use case scenario and performance access requirements. All of these storage classes offer high durability.
Storage classes for frequently accessed objects

- For performance-sensitive use cases (those that require millisecond access time) and frequently
accessed data, Amazon S3 provides the following storage class:

S3 Standard – The default storage class. If you don't specify the storage class when you upload an
object, Amazon S3 assigns the S3 Standard storage class

Storage classes for infrequently accessed objects

- The S3 Standard-IA and S3 One Zone-IA storage classes are designed for long-lived and infrequently accessed data (IA stands for Infrequent Access).
- S3 Standard-IA and S3 One Zone-IA objects are available for millisecond access (similar to the S3
Standard storage class)
- Amazon S3 charges a retrieval fee for these objects, so they are most suitable for infrequently
accessed data.
Storage classes for archiving objects

The S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive storage
classes are designed for low-cost data archiving. These storage classes offer the same durability and
resiliency as the S3 Standard and S3 Standard-IA storage classes.

- S3 Glacier Instant Retrieval – Use for archiving data that is rarely accessed and requires millisecond retrieval. S3 Glacier Instant Retrieval has higher data access costs than S3 Standard-IA.
- S3 Glacier Flexible Retrieval – Use for archives where portions of the data might need to be retrieved in minutes. Data stored in the S3 Glacier Flexible Retrieval storage class has a minimum storage duration period of 90 days and can be accessed in as little as 1-5 minutes by using an expedited retrieval. The retrieval time is flexible, and you can request free bulk retrievals in up to 5-12 hours.
- S3 Glacier Deep Archive – Use for archiving data that rarely needs to be accessed. Data stored in
the S3 Glacier Deep Archive storage class has a minimum storage duration period of 180 days and a
default retrieval time of 12 hours.
AWS S3 Glacier

- Amazon S3 Glacier (S3 Glacier) is a secure and durable service for low-cost data archiving and long-term backup.
- You can get started with S3 Glacier by working with vaults and archives.
- A vault is a container for storing archives, and an archive is any object, such as a photo, video, or document, that you store in a vault.
- An archive is the base unit of storage in S3 Glacier.
Storage class for automatically optimizing data with changing or unknown access patterns

S3 Intelligent-Tiering is an Amazon S3 storage class that's designed to optimize storage costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead.

S3 Intelligent-Tiering automatically stores objects in three access tiers:

- Frequent Access – Objects that are uploaded or transitioned to S3 Intelligent-Tiering are automatically stored in the Frequent Access tier.
- Infrequent Access – S3 Intelligent-Tiering moves objects that have not been accessed in 30
consecutive days to the Infrequent Access tier
- Archive Instant Access – With S3 Intelligent-Tiering, any existing objects that have not been accessed
for 90 consecutive days are automatically moved to the Archive Instant Access tier
In addition to these three tiers, S3 Intelligent-Tiering offers two optional archive access tiers:

- Archive Access – S3 Intelligent-Tiering provides you with the option to activate the Archive Access
tier for data that can be accessed asynchronously. After activation, the Archive Access tier
automatically archives objects that have not been accessed for a minimum of 90 consecutive days
- Deep Archive Access – S3 Intelligent-Tiering provides you with the option to activate the Deep
Archive Access tier for data that can be accessed asynchronously. After activation, the Deep Archive
Access tier automatically archives objects that have not been accessed for a minimum of 180
consecutive days.
Object Life Cycle Management

- To manage your objects so that they are stored cost-effectively throughout their lifecycle, configure their lifecycle.
- A lifecycle configuration is a set of rules that define actions that Amazon S3 applies to a group of objects.

There are two types of actions:

- Transition actions – Define when objects transition to another storage class.
- Expiration actions – Define when objects expire. Amazon S3 deletes expired objects on your behalf.
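Both action types combine in a single lifecycle configuration. A sketch of such a configuration, in the shape accepted by boto3's `put_bucket_lifecycle_configuration` (the rule name, prefix, and day counts are illustrative):

```python
# One rule that transitions objects under logs/ to cheaper classes
# over time, then expires (deletes) them after a year.
lifecycle = {
    "Rules": [{
        "ID": "archive-then-expire",        # illustrative rule name
        "Filter": {"Prefix": "logs/"},      # which objects the rule covers
        "Status": "Enabled",
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},  # transition action
            {"Days": 90, "StorageClass": "GLACIER"},      # transition action
        ],
        "Expiration": {"Days": 365},        # expiration action
    }]
}
```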
Cross Region Replication and Same Region Replication

- Replication enables automatic, asynchronous copying of objects across Amazon S3 buckets.


- Buckets that are configured for object replication can be owned by the same AWS account or by
different accounts.
- You can copy objects between different AWS Regions or within the same Region.
- S3 Replication Time Control (S3 RTC) is designed to replicate 99.99% of objects within 15 minutes after upload, with the majority of those new objects replicated in seconds. S3 RTC is backed by an SLA with a commitment to replicate 99.9% of objects within 15 minutes, for both Cross-Region and Same-Region Replication.

Types of Object Replication

You can replicate objects between different AWS Regions or within the same AWS Region.

1. Cross-Region replication (CRR) is used to copy objects across Amazon S3 buckets in different AWS
Regions.
2. Same-Region replication (SRR) is used to copy objects across Amazon S3 buckets in the same AWS
Region.
How it works
Bucket Policies

- A bucket policy is a resource-based policy that you can use to grant access permissions to your Amazon S3 bucket and the objects in it.
- You can use bucket policies to add or deny permissions for the objects in a bucket.
- Bucket policies use JSON format.
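A minimal bucket policy in JSON form. This is a common public-read policy of the kind a static-website bucket needs; the bucket name is a placeholder:

```python
import json

# Resource-based policy: allow anyone ("Principal": "*") to GET
# any object in the bucket. Attach with put_bucket_policy.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",  # placeholder bucket
    }],
}
policy_json = json.dumps(policy)  # bucket policies are passed as a JSON string
```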
Amazon S3 Access Points

- Easily manage access for shared datasets on Amazon S3.
- With S3 Access Points, customers can create unique access control policies for each access point to easily control access to shared datasets.
- Amazon S3 Access Points support AWS Identity and Access Management (IAM) resource policies that allow you to control the use of the access point by resource, user, or other conditions.

Example: The following access point policy grants IAM user Jane in account 123456789012 permission to GET and PUT objects with the prefix Jane/ through the access point my-access-point in account 123456789012.
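The Jane example above could be expressed as the following policy document (a sketch; the Region us-west-2 in the access point ARN is an assumption, since the slide does not name one):

```python
# Access point policies address objects through the access point ARN,
# using the arn:...:accesspoint/NAME/object/PREFIX form.
access_point_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:user/Jane"},
        "Action": ["s3:GetObject", "s3:PutObject"],
        # us-west-2 is an assumed Region for illustration:
        "Resource": "arn:aws:s3:us-west-2:123456789012:accesspoint/my-access-point/object/Jane/*",
    }],
}
```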
Amazon S3 Transfer Acceleration

- Amazon S3 Transfer Acceleration (S3TA) can speed up content transfers to and from Amazon S3 by as much as 50-500% for long-distance transfer of larger objects.
- S3TA improves transfer performance by routing traffic through Amazon CloudFront's globally distributed edge locations and over AWS backbone networks.
AWS S3 Event Notification

- The Amazon S3 Event Notifications feature can be used to receive notifications when certain events happen in your S3 bucket.
- You can enable certain Amazon S3 bucket events to send a notification message to a destination whenever those events occur.
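A sketch of a notification configuration that sends object-created events to an SQS queue, in the shape accepted by boto3's `put_bucket_notification_configuration` (the queue ARN and suffix filter are placeholders):

```python
# Fire a message to the queue whenever a .jpg object is created
# (uploaded, copied, or completed via multipart upload).
notification = {
    "QueueConfigurations": [{
        "QueueArn": "arn:aws:sqs:us-east-1:123456789012:uploads-queue",  # placeholder
        "Events": ["s3:ObjectCreated:*"],
        "Filter": {"Key": {"FilterRules": [
            {"Name": "suffix", "Value": ".jpg"},  # only notify for .jpg keys
        ]}},
    }]
}
```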
Delete a bucket

- You can delete the objects individually. Or you can empty a bucket, which deletes all the objects in the
bucket without deleting the bucket.
- You can also delete a bucket and all the objects contained in the bucket.
- If you want to continue using the same bucket, don't delete it; instead, empty the bucket and keep it.
- After you delete a bucket, its name becomes available for reuse, but the name might not be available for you to reuse for various reasons. For example, it might take some time before the name can be reused, and some other account could create a bucket with that name before you do.
CloudFront
Introduction to CloudFront distributions and CloudFront edge locations

CloudFront is a web service that speeds up distribution of your static and dynamic web content, such as .html, .css, .js, and image files, to your users.

- CloudFront delivers your content through a worldwide network of data centers called edge locations.
- When a user requests content that you're serving with CloudFront, the user is routed to the edge location that provides the lowest latency (time delay), so that content is delivered with the best possible performance.
- If the content is already in the edge location with the lowest latency, CloudFront delivers it immediately.
- If the content is not in that edge location, CloudFront retrieves it from an origin that you've defined, such as an Amazon S3 bucket, a MediaPackage channel, or an HTTP server (for example, a web server).
- You also get increased reliability and availability because copies of your files (also known as objects) are now held (or cached) in multiple edge locations around the world.
How CloudFront works
AWS Data Migration Services
AWS Storage Gateway

- Hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage
- Provides a standard set of storage protocols such as iSCSI, SMB, and NFS, which allow you to use AWS
storage without rewriting your existing applications
- Provides low-latency performance by caching frequently accessed data on premises, while storing
data securely and durably

Storage Gateway supports four key hybrid cloud use cases

- File Gateway – The Amazon S3 File Gateway enables you to store and retrieve objects in Amazon Simple Storage Service (S3) using file protocols such as Network File System (NFS) and Server Message Block (SMB). Objects written through S3 File Gateway can be directly accessed in S3.
- Amazon FSx File Gateway – The Amazon FSx File Gateway enables you to store and retrieve files in Amazon FSx for Windows File Server using the SMB protocol. Files written through Amazon FSx File Gateway are directly accessible in Amazon FSx for Windows File Server.
- Volume Gateway – The Volume Gateway provides block storage to your on-premises applications using iSCSI connectivity. Data on the volumes is stored in Amazon S3, and you can take point-in-time copies of volumes that are stored in AWS as Amazon EBS snapshots.
- Tape Gateway – The Tape Gateway provides your backup application with an iSCSI virtual tape library (VTL) interface, consisting of a virtual media changer, virtual tape drives, and virtual tapes. Virtual tapes are stored in Amazon S3 and can be archived to Amazon S3 Glacier or Amazon S3 Glacier Deep Archive.
Fundamental blueprint
AWS DataSync

- Online data movement and discovery service that simplifies and accelerates data migrations to AWS, as well as moving data between on-premises storage, edge locations, other clouds, and AWS Storage.
- Reduces the complexity and cost of online data transfer.

Use cases:
- Transfer data between on-premises and AWS
- Transfer data between AWS storage services
- Transfer data between AWS and other locations
AWS Snow Family

- Move petabytes of data to and from AWS, or process data at the edge
- AWS Snow Family devices are physical devices

The AWS Snow Family includes three device types:

- AWS Snowcone
- AWS Snowball
- AWS Snowmobile


AWS Snowcone

It is a small, rugged, and secure device offering edge computing, data storage, and data transfer on-the-go, in austere environments with little or no connectivity.

How it works
AWS Snowball

Migrate petabyte-scale data to AWS with Snowball.

How it works
AWS Snowmobile

Migrate or transport exabyte-scale datasets into and out of AWS

How it works
Feature comparison matrix
Review questions

1. What are the bucket naming guidelines?
2. What is a bucket and an object?
3. What are the minimum and maximum sizes of an object?
4. What is the maximum size of an S3 bucket?
5. What is the maximum number of buckets per account (soft and hard limits)?
6. What is multipart upload?
7. What is versioning in S3?
8. What are the different storage classes and their uses?
9. What is lifecycle management?
10. What are Cross-Region and Same-Region Replication?
11. What are the different Storage Gateway options available?
12. What is DataSync?
13. What are the different types of data migration services available?
