AWS_Course
network
Client Server
• Storage: Data
Home or Garage
Office Data center
Problems with traditional IT approach
• Pay rent for the data center
• Pay for power supply, cooling, maintenance
• Adding and replacing hardware takes time
• Scaling is limited
• Hire 24/7 team to monitor the infrastructure
• How to deal with disasters? (earthquake, power shutdown, fire…)
2003: Amazon infrastructure is one of their core strengths. Idea to market
2006: Re-launched publicly with SQS, S3 & EC2
AWS Cloud Number Facts
• In 2019, AWS had $35.02
billion in annual revenue
• AWS accounts for 47% of the
market in 2019 (Microsoft is
2nd with 22%)
• Pioneer and Leader of the
AWS Cloud Market for the
9th consecutive year
• Over 1,000,000 active users
• https://infrastructure.aws/
IAM Section
IAM: Users & Groups
• IAM = Identity and Access Management, Global service
• Root account created by default, shouldn’t be used or shared
• Users are people within your organization, and can be grouped
• Groups only contain users, not other groups
• Users don’t have to belong to a group, and a user can belong to multiple groups
"Resource": "*"
}
]
}
Multi Factor Authentication - MFA
• Users have access to your account and can possibly change
configurations or delete resources in your AWS account
• You want to protect your Root Accounts and IAM users
• MFA = password you know + security device you own
Alice
m5.2xlarge
• m: instance class
• 5: generation (AWS improves them over time)
• 2xlarge: size within the instance class
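The naming convention above can be sketched as a small parser. This is a minimal illustration, not an AWS API: the function name `parse_instance_type` and the regular expression are assumptions, and the pattern only covers common names such as `m5.2xlarge` or `t2.micro`.

```python
import re

# Letters (class), digits (generation), optional suffix, a dot, then the size.
_INSTANCE_TYPE = re.compile(r"^([a-z]+)(\d+)([a-z0-9-]*)\.([a-z0-9]+)$")


def parse_instance_type(name):
    """Split an EC2 instance type name into (class, generation, size)."""
    match = _INSTANCE_TYPE.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    instance_class, generation, _suffix, size = match.groups()
    return instance_class, int(generation), size


print(parse_instance_type("m5.2xlarge"))
```

Names that do not fit this simple pattern raise `ValueError` rather than being guessed at.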
EC2 Instance Types – General Purpose
• Great for a diversity of workloads such as web servers or code repositories
• Balance between:
• Compute
• Memory
• Networking
• In the course, we will be using the t2.micro which is a General Purpose EC2
instance
* this list will evolve over time, please check the AWS website for the latest information
EC2 Instance Types – Compute Optimized
• Great for compute-intensive tasks that require high performance
processors:
• Batch processing workloads
• Media transcoding
• High performance web servers
• High performance computing (HPC)
• Scientific modeling & machine learning
• Dedicated gaming servers
EC2 Instance Types – Memory Optimized
• Fast performance for workloads that process large data sets in memory
• Use cases:
• High performance, relational/non-relational databases
• Distributed web scale cache stores
• In-memory databases optimized for BI (business intelligence)
• Applications performing real-time processing of big unstructured data
EC2 Instance Types – Storage Optimized
• Great for storage-intensive tasks that require high, sequential read and write
access to large data sets on local storage
• Use cases:
• High frequency online transaction processing (OLTP) systems
• Relational & NoSQL databases
• Cache for in-memory databases (for example, Redis)
• Data warehousing applications
• Distributed file systems
EC2 Instance Types: example
t2.micro is part of the AWS free tier (up to 750 hours per month)
Diagram: WWW, Security Group (inbound traffic, outbound traffic), EC2 Instance
US-EAST-1A US-EAST-1B
EBS Snapshot
Custom AMI
US-EAST-1A US-EAST-1B
Launch
Create AMI from AMI
EFS – Elastic File System
• Managed NFS (network file system) that can be mounted on hundreds of EC2 instances
• EFS works with Linux EC2 instances in multi-AZ
• Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
us-east-1a us-east-1b us-east-1c
Security Group
EFS FileSystem
EBS vs EFS
Diagram: EBS in Availability Zone 1 and Availability Zone 2, linked by an EBS Snapshot (snapshot, restore); EFS reached from Availability Zone 1 and Availability Zone 2 through EFS Mount Targets
Elastic Load Balancing & Auto Scaling Groups Section
Scalability & High Availability
• Scalability means that an application / system can handle greater loads
by adapting.
• There are two kinds of scalability:
• Vertical Scalability
• Horizontal Scalability (= elasticity)
• Scalability is linked to, but different from, High Availability
• Let’s deep dive into the distinction, using a call center as an example
Vertical Scalability
• Vertical Scalability means increasing the size
of the instance
• For example, your application runs on a
t2.micro
• Scaling that application vertically means
running it on a t2.large
• Vertical scalability is very common for non-distributed systems, such as a database.
• There’s usually a limit to how much you can
vertically scale (hardware limit)
(call center analogy: junior operator vs. senior operator)
Horizontal Scalability
(call center analogy: operator + operator + operator)
• High Availability: Run instances for the same application across multiple AZs
• Auto Scaling Group multi-AZ
• Load Balancer multi-AZ
Scalability vs Elasticity (vs Agility)
• Scalability: ability to accommodate a larger load by making the
hardware stronger (scale up), or by adding nodes (scale out)
Load Balancer
User 1
User 2
User 3
Why use a load balancer?
• Spread load across multiple downstream instances
• Expose a single point of access (DNS) to your application
• Seamlessly handle failures of downstream instances
• Do regular health checks to your instances
• Provide SSL termination (HTTPS) for your websites
• High availability across zones
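To make the first few points concrete, here is a minimal sketch of a round-robin balancer that skips instances marked unhealthy. It is a toy model under stated assumptions, not how ELB is implemented: the class name, the method names and the instance IDs are all hypothetical, and a real health check would probe the instance over the network.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: rotates over instances and skips unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._rotation = cycle(self.instances)

    def mark_unhealthy(self, instance):
        # Stands in for a failed health check.
        self.healthy.discard(instance)

    def mark_healthy(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def route(self):
        """Return the next healthy instance, or raise if none is healthy."""
        for _ in range(len(self.instances)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")
```

Clients only ever call `route()`, which plays the role of the single point of access; a failed instance is skipped without the caller noticing.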
Why use an Elastic Load Balancer?
• An ELB (Elastic Load Balancer) is a managed load balancer
• AWS guarantees that it will be working
• AWS takes care of upgrades, maintenance, high availability
• AWS provides only a few configuration knobs
• It costs less to set up your own load balancer, but it will be a lot more
effort on your end (maintenance, integrations)
• 4 kinds of load balancers offered by AWS:
• Application Load Balancer (HTTP / HTTPS only) – Layer 7
• Network Load Balancer (ultra-high performance, allows for TCP) – Layer 4
• Gateway Load Balancer – Layer 3
• Classic Load Balancer (retired in 2023) – Layer 4 & 7
What’s an Auto Scaling Group?
• In real life, the load on your websites and applications can change
• In the cloud, you can create and get rid of servers very quickly
• The goal of an Auto Scaling Group (ASG) is to:
• Scale out (add EC2 instances) to match an increased load
• Scale in (remove EC2 instances) to match a decreased load
• Ensure we have a minimum and a maximum number of machines running
• Automatically register new instances to a load balancer
• Replace unhealthy instances
• Cost Savings: only run at an optimal capacity (principle of the cloud)
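The scale out / scale in behaviour and the minimum / maximum bounds can be sketched in a few lines. This is only an illustration of the idea, assuming load is measured in some unit that each instance can serve a fixed amount of; the function names and parameters are hypothetical and do not correspond to the Auto Scaling API.

```python
import math


def desired_capacity(total_load, capacity_per_instance, minimum, maximum):
    """Instances needed for total_load, clamped to [minimum, maximum]."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    needed = math.ceil(total_load / capacity_per_instance)
    return max(minimum, min(maximum, needed))


def replace_unhealthy(instances, is_healthy, launch):
    """Return a fleet of the same size with unhealthy instances replaced."""
    return [inst if is_healthy(inst) else launch() for inst in instances]
```

With a minimum of 2 and a maximum of 8, a low load still keeps 2 instances running, and a load that would need 10 instances is capped at 8.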
Auto Scaling Group in AWS
Maximum size
Load Balancer
• http://bucket-name.s3-website.aws-region.amazonaws.com
S3 Bucket
• If you get a 403 Forbidden error, make sure the bucket (demo-bucket)
• Availability:
• Measures how readily available a service is
• Varies depending on storage class
• Example: S3 standard has 99.99% availability = not available 53 minutes a year
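The 53-minute figure follows from simple arithmetic, sketched below. The function name is an assumption for illustration, and the calculation uses a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def yearly_downtime_minutes(availability_percent):
    """Maximum expected unavailability per year for a given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


# 99.99% availability leaves 0.01% of the year, about 52.56 minutes.
print(round(yearly_downtime_minutes(99.99), 2))
```

The result is about 52.56 minutes, which the example above rounds up to 53.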
S3 Standard – General Purpose
• 99.99% Availability
• Used for frequently accessed data
• Low latency and high throughput
• Sustain 2 concurrent facility failures
• Use Cases: Big Data analytics, mobile & gaming applications, content
distribution…
S3 Storage Classes – Infrequent Access
• For data that is less frequently accessed, but requires rapid access when needed
• Lower cost than S3 Standard
Availability Zones: >= 3 | >= 3 | >= 3 | 1 | >= 3 | >= 3 | >= 3
Min. Storage Duration Charge: None | None | 30 Days | 30 Days | 90 Days | 90 Days | 180 Days
Min. Billable Object Size: None | None | 128 KB | 128 KB | 128 KB | 40 KB | 40 KB
Retrieval Fee: None | None | Per GB retrieved | Per GB retrieved | Per GB retrieved | Per GB retrieved | Per GB retrieved
https://aws.amazon.com/s3/storage-classes/
AWS Storage Cloud Native Options
• Note: many database technologies could be run on EC2, but you must
handle the resiliency, backup, patching, high availability, fault
tolerance, scaling… yourself
Amazon RDS Overview
• RDS stands for Relational Database Service
• It’s a managed database service for databases that use SQL as a query language.
• It allows you to create databases in the cloud that are managed by AWS
• Postgres
• MySQL
• MariaDB
• Oracle
• Microsoft SQL Server
• IBM DB2
• Aurora (AWS Proprietary database)
Advantages of using RDS versus deploying a DB on EC2
• RDS is a managed service:
• Automated provisioning, OS patching
• Continuous backups and restore to specific timestamp (Point in Time Restore)!
• Monitoring dashboards
• Read replicas for improved read performance
• Multi AZ setup for DR (Disaster Recovery)
• Maintenance windows for upgrades
• Scaling capability (vertical and horizontal)
• Storage backed by EBS
• BUT you can’t SSH into your instances
RDS Solution Architecture
Read/write
EC2 Instances
Possibly in an ASG
Amazon Aurora
• Aurora is a proprietary technology from AWS (not open sourced)
• PostgreSQL and MySQL are both supported as Aurora DB
• Aurora is “AWS cloud optimized” and claims 5x performance improvement
over MySQL on RDS, over 3x the performance of Postgres on RDS
• Aurora storage automatically grows in increments of 10GB, up to 128 TB
• Aurora costs more than RDS (20% more) – but is more efficient
• Not in the free tier
Amazon Aurora Serverless
• Automated database instantiation and auto-scaling based on actual usage
• PostgreSQL and MySQL are both supported as Aurora Serverless DB
• No capacity planning needed
• Least management overhead
• Pay per second, can be more cost-effective
• Use cases: good for infrequent, intermittent or unpredictable workloads…
Diagram: Client, Proxy Fleet (managed by Aurora), Shared storage Volume
RDS Deployments: Read Replicas, Multi-AZ
• Read Replicas:
• Scale the read workload of your DB
• Can create up to 15 Read Replicas
• Data is only written to the main DB
• Multi-AZ:
• Failover in case of AZ outage (high availability)
• Data is only read/written to the main database
• Can only have 1 other AZ as failover
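The read replica idea (writes go to the main DB, reads are spread over copies) can be sketched with plain dictionaries. This is a toy model, not the RDS API: the class and method names are hypothetical, and the explicit `replicate()` call stands in for replication that RDS performs on its own.

```python
import random


class ReplicatedDatabase:
    """Toy model: one writable main DB and read-only replicas."""

    MAX_READ_REPLICAS = 15  # limit stated in the slide

    def __init__(self, replica_count):
        if not 0 <= replica_count <= self.MAX_READ_REPLICAS:
            raise ValueError("replica_count must be between 0 and 15")
        self.main = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # Writes only ever go to the main DB.
        self.main[key] = value

    def replicate(self):
        # Stands in for replication from the main DB to every replica.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.main)

    def read(self, key):
        # Reads are spread over replicas; fall back to main if there are none.
        source = random.choice(self.replicas) if self.replicas else self.main
        return source.get(key)
```

A read issued before `replicate()` runs returns `None` for a freshly written key, which is one way to picture a replica lagging behind the main DB.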
SQL (relational)
Database
DynamoDB
• Fully managed, highly available with replication across 3 AZs
• NoSQL database - not a relational database
• Scales to massive workloads, distributed “serverless” database
• Millions of requests per second, trillions of rows, 100s of TB of storage
• Fast and consistent in performance
• Single-digit millisecond latency – low latency retrieval
• Integrated with IAM for security, authorization and administration
• Low cost and auto scaling capabilities
• Standard & Infrequent Access (IA) Table Class
DynamoDB – type of data
• DynamoDB is a key/value database
https://aws.amazon.com/nosql/key-value/
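A key/value database looks items up by their key, and items in the same table do not need to share the same attributes. The sketch below models that with an in-memory dictionary. It is not the DynamoDB API: the class `KeyValueTable`, its method names and the sample attributes are assumptions made for illustration.

```python
class KeyValueTable:
    """Toy key/value table: items are found by primary key only."""

    def __init__(self, key_name):
        self.key_name = key_name
        self._items = {}

    def put_item(self, item):
        if self.key_name not in item:
            raise KeyError(f"item is missing key attribute {self.key_name!r}")
        # Store a copy so later changes to the caller's dict do not leak in.
        self._items[item[self.key_name]] = dict(item)

    def get_item(self, key):
        item = self._items.get(key)
        return dict(item) if item is not None else None


users = KeyValueTable(key_name="user_id")
users.put_item({"user_id": "u1", "name": "Alice"})
users.put_item({"user_id": "u2", "name": "Bob", "plan": "pro"})  # extra attribute is fine
print(users.get_item("u2"))
```

Unlike a relational table, there is no fixed column list: the second item carries a `plan` attribute that the first one lacks.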
DocumentDB
• Aurora is an “AWS-implementation” of PostgreSQL / MySQL …
• DocumentDB is the same for MongoDB (which is a NoSQL database)
Region
Availability Zone 1 Availability Zone 2
VPC
VPC CIDR Range:
10.0.0.0/16
Public subnet Public subnet
Public Subnet
subnet
• NACL (Network ACL)
• Can have ALLOW and DENY rules
• Are attached at the Subnet level
• Rules only include IP addresses
• Security Groups
• A firewall that controls traffic to and from an ENI / an EC2 Instance
• Can have only ALLOW rules
• Rules include IP addresses and other security groups
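The difference between the two rule models can be sketched with Python's `ipaddress` module. This is a simplified illustration under stated assumptions: the function names and the `(number, action, cidr)` rule tuples are hypothetical, only source IP addresses are checked (no ports or protocols), and security group rules that reference other security groups are left out.

```python
from ipaddress import ip_address, ip_network


def security_group_allows(allow_cidrs, source_ip):
    """Security group: ALLOW rules only; anything not allowed is dropped."""
    ip = ip_address(source_ip)
    return any(ip in ip_network(cidr) for cidr in allow_cidrs)


def nacl_allows(rules, source_ip):
    """NACL: numbered ALLOW/DENY rules; the lowest matching number decides."""
    ip = ip_address(source_ip)
    for _number, action, cidr in sorted(rules):
        if ip in ip_network(cidr):
            return action == "ALLOW"
    return False  # no rule matched: deny by default


nacl_rules = [
    (200, "ALLOW", "0.0.0.0/0"),
    (100, "DENY", "203.0.113.0/24"),  # evaluated first because 100 < 200
]
```

With these rules, an address inside `203.0.113.0/24` is denied even though a later rule allows everything, which a security group cannot express because it has no DENY rules.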
Network ACLs vs Security Groups
https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Security.html#VPC_Security_Comparison
VPC Flow Logs
• Capture information about IP traffic going into your interfaces:
• VPC Flow Logs
• Subnet Flow Logs
• Elastic Network Interface Flow Logs
• Helps to monitor & troubleshoot connectivity issues. Example:
• Subnets to internet
• Subnets to subnets
• Internet to subnets
• Captures network information from AWS managed interfaces too: Elastic Load
Balancers, ElastiCache, RDS, Aurora, etc…
• VPC Flow logs data can go to S3, CloudWatch Logs, and Kinesis Data Firehose
VPC Peering
Diagram: VPC A, VPC B, VPC C with VPC peering connections A ↔ B, A ↔ C, B ↔ C
• Connect two VPCs privately, using AWS’ network
• Make them behave as if they were in the same network
• Must not have overlapping CIDR (IP address range)
• VPC Peering connection is not transitive (must be established for
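Both peering constraints can be checked with the standard `ipaddress` module. This is a minimal sketch, not an AWS API: the function names and the CIDR ranges used below are assumptions chosen for illustration.

```python
from ipaddress import ip_network
from itertools import combinations


def can_peer(cidr_a, cidr_b):
    """Peering requires non-overlapping CIDR ranges."""
    return not ip_network(cidr_a).overlaps(ip_network(cidr_b))


def required_peerings(vpc_names):
    """Peering is not transitive, so full connectivity needs one link per pair."""
    return list(combinations(vpc_names, 2))


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))
print(required_peerings(["A", "B", "C"]))
```

For three VPCs that all need to talk to each other, this yields the three pairs A-B, A-C and B-C: peering A with B and B with C does not connect A to C.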
VPC Endpoint
• VPC Endpoint Gateway: S3 & DynamoDB
• VPC Endpoint Interface: the rest
AWS Training
• AWS Digital (online) and Classroom Training (in-person or virtual)
• AWS Private Training (for your organization)
• Training and Certification for the U.S. Government
• Training and Certification for the Enterprise
https://repost.aws/knowledge-center
AWS Certification Paths – Architecture
Solutions Architect: Design, develop, and manage cloud infrastructure and assets, work with DevOps to migrate applications to the cloud
Application Architect: Design significant aspects of application architecture including user interface, middleware, and infrastructure, and ensure enterprise-wide scalable, reliable, and manageable systems
https://d1.awsstatic.com/training-and-certification/docs/AWS_certification_paths.pdf