AWS Training (003) MV
Concepts
Billing principles
Pay as you go: pay for what you use, remain agile, responsive, meet scale
demands
Save when you reserve: minimize risks, predictably manage budgets, comply
with long-term requirements
Pay less by using more: volume-based discounts
Pay less as AWS grows
Abuse Team - report AWS resources used for abusive or illegal purposes
Security team – assist with security of services offered by AWS
Concierge team - assist with billing and account management
Customer Service team – assist with technology questions
Compute
Description
Billing options
Testing
Lambda
Storage
Block storage for EC2 instances, for data that must be quickly accessible and
requires long-term persistence
Network drives attached to one EC2 instance at a time
Mapped to a single Availability Zone
EBS snapshots - backup of EBS volume & transfer across AZ
File storage for use with Amazon EC2 (like a shared folder)
Highly scalable file storage system designed to provide flexible storage for
multiple EC2 instances
Network file system attached on several EC2 instances in a region
EFS-IA – Infrequent Access: cost-optimized storage class for infrequently
accessed files
Object storage to store and retrieve data from anywhere (websites, mobile
apps, corporate applications, and data from IoT sensors or devices)
Concepts: Buckets (folders) and Objects (files) tied to a region
Features:
o Security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
o Websites: host a static website on Amazon S3
o Versioning: multiple versions for files to roll-back
o Access logs: log requests made within your S3 bucket
o Replication: same-region or cross-region replication
o Object Lock: Block an object version deletion
o Glacier Vault Lock: Lock policy of object deletion for future edits
o Lifecycle rules: move objects across different storage classes
S3 Storage classes (for real-time data access):
o S3 Standard General Purpose - low latency and high throughput
o S3 Standard Infrequent Access (IA) - data that is less frequently
accessed
o S3 One Zone-Infrequent Access – same as above but stored in only
one AZ
o S3 Intelligent Tiering - Cost-optimized by automatically moving objects
between two access tiers – better for unpredictable access patterns
S3 Glacier (for archive & backup)
o Glacier & Glacier Deep Archive - Low cost object storage, long retrieval
times
Database
Aurora
Set up, operate and scale a relational database based on MySQL and
PostgreSQL
Aurora is a proprietary DB technology from AWS
5x performance improvement over MySQL on RDS and 3x over Postgres
Aurora costs more than RDS (20% more) – but is more efficient
DynamoDB
Amazon ElastiCache
Web service that makes it easy to deploy, operate, and scale an in-memory
cache in the cloud
Provide ultrafast and inexpensive access to copies of data
Analytics
Redshift
EMR
Athena
Objective:
Types of devices:
Snowcone
o Small briefcase, small storage (< 8 TB)
o Terabyte-scale data transport solution
Snowball
o Large suitcase, large storage (up to 80 TB)
o Petabyte-scale data transport solution
o Transfer large amounts of data into and out of AWS
Snowball Edge
o Data migration and edge computing device
o Two types of solutions: Storage Optimized (100 TB) and Compute
Optimized (52 vCPUs)
o To be used in environments with limited connectivity
Snowmobile
o Truck, huge storage (exabytes)
o Exabyte-scale data transfer service
o Move extremely large amounts of data to AWS
Networking
Direct Connect
CloudFront
Route 53
AWS CloudWatch
Monitoring and management service that provides metrics for all AWS
services
Use CloudWatch for:
o Metrics: monitor the performance of AWS services and billing metrics
o Alarms: automate notifications based on metric
o Logs: collect log files from AWS services
o Events: react to events or trigger a rule on a schedule
AWS CloudTrail
Trusted Advisor
AWS CloudFormation
AWS Config
Security
Amazon Inspector
AWS Shield
AWS Organizations
AWS WAF
Firewall that helps protect your web applications from common web exploits
AWS Artifact
Application integration
Cost management
AWS Cost and Usage Report
Contains the most comprehensive set of AWS cost and usage data available
Lists AWS usage for each service used by an account and its IAM users
Cost Explorer
Visualize, understand, and manage your AWS costs and usage over time
Create custom reports that analyze cost and usage data
View current usage (detailed) and forecast usage
Choose an optimal Savings Plan (to lower prices)
AWS Budgets
AWS Database
CLOUD (Madalena)
Five characteristics:
Architecture
Uses middleware
SaaS allows access to software through a subscription model hosted by a provider and is consumed
by customers over the internet in an “as-you-go” model, which usually means lower upfront costs,
regular updates, and easy maintenance. With SaaS, the positives are ease of adaptation,
predictable expenses, and higher speeds and benefits. Software as a Service is simple,
straightforward, and keeps unnecessary costs down.
Public cloud – suited for less confidential information and is hosted at the provider’s location.
The service may be free or offered as a pay-per-usage model. Examples: Dropbox, Google
Drive, Facebook Moments
Private cloud – dedicated to a single organization with high levels of confidential information.
Serves as a data center
Hybrid cloud – a mix of both. Example: Office 365 (users can do some work through the
public cloud and the rest privately on the private cloud using SharePoint)
On demand self service – Users can provision resources and use them without
interaction from the service provider
Broad network access – Resources available over the network, and can be accessed by
diverse client platforms
Multi-tenancy and resource pooling – Multiple customers can share the same
infrastructure and applications with security and privacy; multiple customers are serviced
from the same physical resources
Rapid elasticity and scalability (most important) – Automatically and quickly acquire and
dispose of resources when needed; quickly and easily scale based on demand
Measured service: Usage is measured, users pay correctly for what they have used
6 Advantages:
1. Trade capital expense (CAPEX) for operational expense (OPEX)
Pay on demand: don’t own hardware
Reduced total cost of Ownership (TCO) & Operational Expense (OPEX)
2. Benefit from massive economies of scale
Prices reduced as AWS is more efficient due to large scale
3. Stop guessing capacity
Scale based on actual measured usage
4. Increase speed and agility
5. Stop spending money running and maintaining data centers
6. Go global in minutes (leverage the AWS global infrastructure)
Problems solved by the Cloud:
Flexibility: change resources type when needed
Cost-Effectiveness: pay as you go, for what you use
Scalability: accommodate larger loads by making hardware stronger or adding additional
nodes
Elasticity: ability to scale out and scale in when needed
High availability and fault-tolerance: build across data centers
Agility: rapidly develop, test and launch software applications
Types of Cloud Computing
IaaS: provides building blocks for cloud IT; provides networking, computers, data storage space;
highest level of flexibility; easy parallel with traditional on-premises IT
PaaS: Removes the need for your organization to manage the underlying infrastructure; focus
on the development and management of your applications
SaaS: completed product that is going to be run and managed by the service provider
Pricing of the Cloud: pay for Compute, Storage, and Data transfer OUT of the cloud (only for the
compute time used and the amount of data stored). Data transfer IN to the Cloud is free
Choosing an AWS Region: compliance with data governance and legal requirements, proximity
to customers, available services and features within a Region, and pricing. An AWS Region is
composed of multiple, isolated and separated Availability Zones. IAM encompasses all regions
An IAM policy is an entity that, when attached to an identity or resource, defines their
permissions.
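For instance, a minimal identity-based policy can be expressed as a JSON document. The bucket name and action below are hypothetical, chosen only to show the standard Version/Statement shape:

```python
import json

# Minimal IAM policy: allow read-only object access on a hypothetical bucket.
policy = {
    "Version": "2012-10-17",       # standard IAM policy language version
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",  # hypothetical bucket
    }],
}
document = json.dumps(policy, indent=2)
print(document)
```

Attached to a user or role, such a document defines exactly which API calls are permitted on which resources.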
EC2
Eg.: m5.2xlarge
m – instance class
5 – generation of hardware (AWS improves the hardware over time)
2xlarge – size within the instance class (more size, more memory, more CPU in the instance)
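The naming scheme above can be decomposed mechanically. A small sketch in plain Python (pure string handling, no AWS API involved):

```python
import re

def parse_instance_type(name):
    """Split an EC2 instance type like 'm5.2xlarge' into its parts:
    (instance class, hardware generation, extra attributes, size)."""
    family, size = name.split(".")
    # class = leading letters, generation = digits, attributes = trailing letters
    m = re.fullmatch(r"([a-z]+)(\d+)([a-z]*)", family)
    return m.group(1), int(m.group(2)), m.group(3), size

print(parse_instance_type("m5.2xlarge"))   # ('m', 5, '', '2xlarge')
print(parse_instance_type("c6g.medium"))   # ('c', 6, 'g', 'medium')
```

The trailing-letters group captures suffixes like the "g" in c6g (Graviton-based hardware).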
EC2 Instance types
General Purpose: great for a diversity of workloads such as web servers or code
repositories. Balance between compute, memory and networking. Eg: t2
(t2.micro is a General Purpose EC2 instance)
Compute Optimized: great for compute-intensive tasks that require high performance
processors: batch processing workloads, media transcoding, high performance web
servers, high performance computing (HPC), scientific modeling & machine learning,
dedicated gaming servers. Eg: C6g. All of these are tasks that require a very good CPU
Memory Optimized: fast performance for workloads that process large data sets in
memory. Use cases: high performance relational/non-relational databases, distributed
web scale cache stores, in-memory databases optimized for BI (business intelligence),
applications performing real-time processing of big unstructured data. Eg: R6g
Storage Optimized: great for storage-intensive tasks that require high, sequential read
and write access to large data sets on local storage. Use cases: high frequency online
transaction processing (OLTP) systems, relational & NoSQL databases, cache for in-
memory databases (Redis), data warehousing applications, distributed file systems. Eg: I3
Security Groups are the fundamental layer of network security in AWS; they control how traffic is
allowed into or out of our EC2 Instances
Security groups are easy because they only contain allow rules, so we can say what is allowed
to go in and to go out. Their rules can reference IP addresses or other security groups
Security groups can be attached to multiple instances and an instance can have multiple
security groups too. Security groups are locked down to a region (switch to another region,
create a new security group)
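An individual inbound rule is typically expressed like the fragment below. The shape mirrors the `IpPermissions` structure used by the EC2 `authorize-security-group-ingress` API; the CIDR is a documentation-reserved example range:

```json
{
  "IpProtocol": "tcp",
  "FromPort": 22,
  "ToPort": 22,
  "IpRanges": [
    { "CidrIp": "203.0.113.0/24", "Description": "SSH from office network" }
  ]
}
```

There is no deny counterpart: anything not matched by an allow rule is simply blocked.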
Ports to know:
o 22 = SSH (Secure Shell) – log into a Linux instance
o 21 = FTP (File Transfer Protocol) – uploads files into a file share
o 22 = SFTP (Secure File Transfer Protocol) – upload files using SSH
o 80 = HTTP – access unsecured websites
o 443 = HTTPS – access secured websites
o 3389 = RDP (Remote Desktop Protocol) – log into a Windows instance
Spot Instances: short workloads, cheap, can lose instances (less reliable)
Provide you the highest discount in AWS – 90%
But you can lose them at any point in time, if the price you are willing to pay for them
(max price) is less than the current spot price. Spot prices change over time
The most cost efficient instances in AWS
Just use for workloads that are resilient to failure: batch jobs, data analysis, image
processing, any distributed workloads, workloads with flexible start and end time
Not suitable for critical jobs or databases!! (you would lose all the work)
EBS Snapshots – we transfer what we have into an EBS snapshot and from it restore into another AZ
Make a backup (snapshot) of your EBS volume at a point in time
Not necessary to detach volume to do snapshot, but recommended
Can copy snapshots across AZ or Region!! When we need to have the info in another region
EFS Infrequent Access (EFS-IA) – for files we don't need to access every day
Storage class that is cost-optimized for files not accessed every day
Up to 92% lower cost compared to EFS Standard – COST SAVING
EFS will automatically move your files to EFS-IA based on the last time they were accessed
Enable EFS-IA with a lifecycle policy
Eg.: move files that are not accessed for 60 days to EFS-IA
Transparent to the applications accessing EFS
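The 60-day example above maps to a lifecycle policy like the fragment below. The field names follow the shape of the EFS `PutLifecycleConfiguration` API; treat the exact spelling as an assumption to verify against the current documentation:

```json
{
  "LifecyclePolicies": [
    { "TransitionToIA": "AFTER_60_DAYS" }
  ]
}
```

Because the move is transparent, applications keep reading files through the same EFS mount regardless of the storage class they landed in.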
Amazon FSx
1) Amazon FSx for Windows File Server
A fully managed, highly reliable and scalable Windows native shared file system
Built on Windows File Server
Supports SMB protocol & Windows NTFS
Integrated with Microsoft Active Directory
Can be accessed from AWS or your on-premises infrastructure
SECTION 8 - S3
Amazon S3 is one of the main building blocks of AWS
It's advertised as "infinitely scaling" storage
Many websites use Amazon S3 as a backbone. Many AWS services use S3 as an integration
point as well
S3 Use Cases
1. Backup and storage
2. Disaster Recovery
3. Archive
4. Hybrid Cloud Storage
5. Application hosting
6. Media hosting
7. Data lakes & big data analytics
8. Software delivery
9. Static website
Amazon S3 allows people to store objects (files) in buckets (directories)
Buckets must have a globally unique name (across all regions all accounts)
Buckets are defined at the regional level
S3 looks like a global service but buckets are created in a region
Naming convention – globally unique
S3 Security
User based IAM policies – which API calls should be allowed for a specific user from
IAM console
Resource Based Bucket Policies – bucket wide rules from the S3 console – allows
cross account
Object Access Control List (ACL) – finer grain
Bucket ACL – less common !!
Note: An IAM principal can access an S3 object if:
the user's IAM permissions allow it OR the resource policy allows it, AND there's no explicit
DENY
Encryption: encrypt objects in Amazon S3 using encryption keys
S3 Bucket Policies
JSON based policies
Resources: buckets and objects
Actions Set of API to Allow or Deny
Effect: Allow/Deny
Principal: The account or user to apply the policy to
Use S3 bucket for policy to: grant public access to the bucket, force objects to be encrypted at
upload, grant access to another account (cross account)
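A typical public-read bucket policy looks like the sketch below; the bucket name is hypothetical. Building it in code makes the Principal/Resource fields explicit:

```python
import json

BUCKET = "example-bucket"   # hypothetical bucket name

# Bucket policy granting public read (GetObject) on every object in the bucket.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",                        # anyone: this is public access
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{BUCKET}/*",  # objects, not the bucket itself
    }],
}
print(json.dumps(bucket_policy, indent=2))
```

The trailing `/*` matters: `s3:GetObject` applies to objects, so the ARN must cover the objects inside the bucket rather than the bucket ARN alone.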
S3 Websites
S3 can host static websites and have them accessible on the www
The website url will be <bucket-name>.s3-website-<AWS-region>.amazonaws.com
If we don’t make the S3 bucket public in the first place, we’re going to get a 403 Error
(Forbidden)!!
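The URL pattern above can be produced with simple string formatting. Note that some Regions historically use a dot instead of a dash before the Region name, so verify the endpoint style for your Region:

```python
def s3_website_url(bucket, region):
    """Build the static-website endpoint for an S3 bucket,
    using the dash-style endpoint shown in the notes."""
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"

print(s3_website_url("example-bucket", "us-east-1"))
# http://example-bucket.s3-website-us-east-1.amazonaws.com
```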
Amazon S3 – Versioning
You can version your files in Amazon S3. It is enabled at the bucket level. Same key overwrite
will increment the “version”: 1,2,3,… It is best practice to version your buckets (protect against
unintended deletes, easy roll back to previous version)
Notes: any file that is not versioned prior to enabling versioning will have version “null”,
suspending versioning does not delete the previous versions
S3 Access Logs for audit purpose, you may want to log all access to S3 buckets
S3 Replication (CRR & SRR)
• Must enable versioning in source and destination
• Cross Region Replication (CRR)
• Same Region Replication (SRR)
• Buckets can be in different accounts
• Copying is asynchronous
• Must give proper IAM permissions to S3
• CRR - Use cases: compliance, lower latency access, replication across accounts
• SRR – Use cases: log aggregation, live replication between production and test accounts
Availability is how readily available a service is. S3 Standard has 99.99% availability. It varies
depending on the storage class
Shared Responsibility Model for S3
AWS Infrastructure (global security, durability, sustain concurrent loss of data in two facilities),
Configuration and vulnerability analysis, Compliance validation
User S3 Versioning, Bucket policies, replication, Logging and Monitoring, S3 storage classes,
data encryption at rest and in transit
Snow family are offline devices to perform data migrations (receive via the post)! If it takes more
than 1 week to transfer over the network, use Snowball devices!!
AWS Snowcone
Small, portable, computing, anywhere, rugged & secure, withstands harsh environments
Light (4.5 pounds, 2.1 kg)
Device used for edge computing, storage, and data transfer
8TBs of usable storage
Use Snowcone where Snowball does not fit (space-constrained environment)
Must provide your own battery/cables
Can be sent to AWS offline, or connect it to internet and use AWS DataSync to send data
AWS Snowmobile
Transfer exabytes of data (1 EB = 1,000,000 TB)
Each Snowmobile has 100 PB of capacity (use multiple in parallel)
High security: temperature controlled, GPS, 24/7 video surveillance
Better than Snowball if you transfer more than 10 PB
Edge Computing
Process data while it’s being created on an edge location (eg: a truck on the road, ship on the
sea). These locations may have: limited / no internet access ; Limited / no easy access to
computing power. So, we setup a Snowball Edge / Snowball device to do edge computing. Use
cases of Edge Computing: Preprocess data, Machine learning at the edge, Transcoding media
streams. Eventually (if need be) we can ship the device back to AWS (e.g., to transfer the data)
Hybrid Cloud for Storage (storage gateway)
AWS is pushing for “hybrid cloud”(part of your infrastructure is on-premises and the other part is
on the cloud)
This can be due to: Long cloud migrations, security requirements, compliance requirements, IT
strategy
S3 is a proprietary storage technology, so how do you expose the S3 data on premise? AWS
Storage Gateway – allows you to bridge whatever happens on-premises directly into the AWS
cloud
Relational Databases look like Excel spreadsheets, with links between them. You can use the
SQL language to perform queries/lookups
NoSQL (non-relational databases) Databases are purpose built for specific data models and
have flexible schemas for building modern application
Benefits: - Flexibility: easy to evolve data model, - Scalability: designed to scale-out (add) by
using distributed clusters, - High performance: optimized for a specific data model, - Highly
functional: types optimized for the data model
Eg: key-value, document, graph, in-memory, search databases
Data is often in JSON format (the same format as IAM policies)
AWS offers many managed databases.
Benefits: Quick provisioning, High Availability, Vertical and Horizontal Scaling; Automated
Backup and Restore, Operations, Upgrades; Operating System Patching is handled by AWS,
Monitoring, alerting
Note: we can use our own databases but it is our responsibility (so a managed database is a
lifesaver in many cases)
Amazon Aurora
Aurora is a proprietary technology from AWS (not open sourced)
PostgreSQL and MySQL are both supported as Aurora DB
Aurora is “AWS cloud optimized” (better performance)
Aurora storage automatically grows in increments of 10GB, up to 64TB
Aurora costs more than RDS but is more efficient
Not in the free tier
RDS Deployments
1) Read Replicas
Scale the read workload of your DB
Can create up to 5 Read Replicas
Data is only written to the main DB
2) Multi AZ
Failover in case of AZ outage (high availability)
Data is only read/written to the main database
Can only have 1 AZ as a failover AZ
3) Multi Region (Read Replicas)
Disaster recovery in case of region issue
Local performance for global reads
Replication cost
ElastiCache Overview: the way to get managed Redis or Memcached (caches that sit in front of your databases)
Caches are in-memory databases with high performance, low latency
Helps reduce load off databases for read intensive workloads
AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring,
failure recovery and backups
DynamoDB – NoSQL database (not relational)
Fully managed, highly available with replication across 3 AZs
Scales to massive workloads, distributed “serverless” database
It scales to millions of requests per seconds
Fast and consistent in performance
Single digit millisecond latency – low latency retrieval
Integrated with IAM for security, authorization and administration
Low cost and auto scaling capabilities
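A conceptual sketch of DynamoDB's key-value model in plain Python (no AWS API involved; the table and attribute names are made up for illustration):

```python
# Toy model of a DynamoDB-style table: items are schemaless dicts
# addressed by a partition key ("user_id" here, chosen for illustration).
table = {}

def put_item(item):
    table[item["user_id"]] = item   # upsert keyed by the partition key

def get_item(user_id):
    return table.get(user_id)       # real DynamoDB serves such lookups in single-digit ms

put_item({"user_id": "u1", "name": "Ada", "score": 42})
put_item({"user_id": "u2", "name": "Linus"})   # flexible schema: no 'score' attribute
print(get_item("u1")["name"])   # Ada
```

The flexible schema shows up in the second item: attributes other than the key can differ per item, unlike a relational table.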
Redshift
Is based on PostgreSQL but it’s used for OLAP - online analytical processing – (analytics and
data warehousing)
Load data once every hour, not every second
10x better performance than other data warehouses
Columnar storage of data (instead of row based)
Massively Parallel Query Execution (MPP), highly available
Pay as you go based on the instances provisioned
Has a SQL interface for performing the queries
BI tools such as Amazon QuickSight or Tableau integrate with it
EMR (Elastic MapReduce) helps creating Hadoop clusters (Big Data) to analyze and process
vast amount of data
The clusters can be made of hundreds of EC2 instances
EMR takes care of all the provisioning and configuration
Auto-scaling and integrated with Spot instances
Use cases: data processing, machine learning, web indexing, big data
Amazon Athena
Serverless query service to perform analytics against S3 objects
Uses SQL language to query the Files
Use cases: Business intelligence/analytics/reporting, analyze & query VPC Flow Logs, ELB
Logs, CloudTrail logs,…
Analyze data in S3 using serverless SQL
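Queries are plain SQL over files sitting in S3. A hypothetical example over access-log data (the table name, columns, and S3 path are made up for illustration):

```sql
-- Hypothetical Athena query over access logs stored in S3.
-- 'access_logs' would be an external table pointing at s3://example-bucket/logs/
SELECT status, COUNT(*) AS requests
FROM access_logs
WHERE year = '2024'
GROUP BY status
ORDER BY requests DESC;
```

Being serverless, there is no cluster to size: Athena scans the files and bills per query.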
Amazon QuickSight
Serverless machine learning-powered business intelligence service to create interactive
dashboards
Fast, automatically scalable, embeddable, with per-session pricing
Use cases: business analytics, building visualizations
Integrated with RDS, Aurora, Athena, Redshift, S3, …
Amazon Neptune
Fully managed graph database. A popular graph would be a social network
Highly available across 3 AZ, with up to 15 read replicas
Build and run applications working with highly connected datasets
Can store up to billions of relations and query the graph with milliseconds latency
Highly available with replications across multiple AZs
AWS Glue
Managed extract, transform, and load (ETL) service – load the data for analytics
Useful to prepare and transform data for analytics
Fully serverless service
Glue Catalog
Section 10: ECS, Lambda, Batch, Lightsail
What is Docker? Docker is a software development platform to deploy apps. Apps are
packaged in containers that can be run on any OS, and they run the same regardless of where
they’re run. Containers can be scaled up and down very quickly (seconds). (No need to know
Docker in depth.)
Serverless Introduction
Is a new paradigm in which the developers don’t have to manage servers anymore… They just
deploy code and functions. Serverless does not mean there are no servers… it means you just
don’t manage / provision/ see them (eg: Amazon S3, Dynamo DB, Fargate, Lambda)
AWS Lambda
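Lambda runs functions on demand with no servers to manage: you supply a handler, AWS invokes it per event and bills per execution. A minimal handler can be exercised locally; the event field below is hypothetical:

```python
# Minimal Lambda-style handler: AWS invokes handler(event, context)
# per event and bills only for the execution time.
def handler(event, context=None):
    name = event.get("name", "world")   # hypothetical event field
    return {"statusCode": 200, "body": f"Hello, {name}!"}

# Local invocation, shaped the same way Lambda would call it:
print(handler({"name": "AWS"}))   # {'statusCode': 200, 'body': 'Hello, AWS!'}
```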
AWS Batch
Fully managed batch processing at any scale
Efficiently run 100000s of computing batch jobs on AWS
A “batch” job is a job with a start and an end (opposed to continuous)
Batch will dynamically launch EC2 instances or Spot instances
AWS Batch provisions the right amount/memory
Batch jobs are defined as Docker images and run on ECS
Helpful for cost optimization and focusing less on the infrastructure (automatically scales)
AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of
thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal
quantity and type of compute resources (e.g., CPU or memory-optimized instances) based on
the volume and specific resource requirements of the batch jobs submitted
Lambda vs. Batch:
o Lambda: time limit | Batch: no time limit
o Lambda: limited runtimes | Batch: any runtime (packaged as a Docker image)
o Lambda: limited temporary disk space | Batch: relies on EBS / instance store
o Lambda: serverless | Batch: relies on EC2 (can be managed by AWS)
CloudFormation is a declarative way of outlining your AWS Infrastructure, for any resources
(most of them are supported)
Is going to be used when we have infrastructure as code, templates when we need to repeat an
architecture in different environments, different regions, or even AWS accounts
For example: I want a security group, I want 2 instances, I want an ELB…
Then, Cloud Formation creates those for you, in the right order, with the exact configuration
that you specify
Benefits: Infrastructure as code (no resources are manually created, which is excellent for
control; changes to the infrastructure are reviewed through code).
Cost (savings strategy)
Productivity (ability to destroy and re-create infrastructure on the cloud, automated
generation of Diagram for your templates)
Don’t reinvent the wheel (leverage existing templates on the web and documentation)
Supports (almost) all AWS resources
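A minimal template illustrating the declarative style, in CloudFormation's YAML form. The AMI ID is a placeholder you would replace; the open CIDR is just for the example:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal sketch - one security group and one EC2 instance
Resources:
  WebSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow HTTP from anywhere
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      ImageId: ami-12345678        # placeholder AMI ID for your Region
      SecurityGroups:
        - !Ref WebSecurityGroup    # CloudFormation wires the dependency
```

The `!Ref` is what lets CloudFormation create the resources "in the right order": the instance is only created after the security group it references.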
AWS Cloud Development Kit (CDK) define your cloud infrastructure using a familiar language:
JavaScript, Python, … The code is compiled into a CloudFormation template (JSON/YAML).
You can therefore deploy infrastructure and application runtime code together (great for Lambda
functions and Docker containers)
Elastic Beanstalk
(When we are a developer on AWS, we don’t want to be managing infrastructure and
configuring all the databases, load balancers, … We just want to deploy code! And ensure it
scales!
Most web infrastructures have the same architecture: load balancer + auto scaling group)
So Elastic Beanstalk is the answer! Platform as a Service (PaaS) – only manage data and apps
It is a developer-centric way of deploying an application on AWS. It’s all in ONE view and we still
have full control over the configuration. It is free but we pay for the underlying instances. It is a
managed service. Just the application code is the responsibility of the developer! Very
developer-friendly service.
Three architecture models: Single instance (good for dev), LB+ASG (great for production web
applications), ASG only (great for non-web apps in production)
Support for many platforms!
Health Monitoring – a health agent pushes metrics to CloudWatch, checks for app health, and
publishes health events
CodeCommit
Before pushing the application code to servers, it needs to be stored somewhere. Developers
usually store code in a repository, using GIT technology
A famous public offering is GitHub, AWS’ competing product is CodeCommit (makes it easy to
collaborate with others on code. The code changes are automatically versioned). Benefits:
Fully managed, Scalable & highly available, Private, Secured, Integrated with AWS
CodePipeline
Orchestrate the different steps to have the code automatically pushed to production (basis for
CI/CD – continuous integration & continuous delivery) – orchestration of the pipeline
Benefits: fully managed, compatible with different services, fast delivery and rapid updates
CodeArtifact
Software packages depend on each other to be built – dependencies
Storing and retrieving these dependencies is called artifact management
Traditionally, you need to set up your own artifact management system. CodeArtifact is a
secure, scalable, and cost-effective artifact management service for software development – a
place to store your code dependencies
CodeStar
Unified UI to easily manage software development activities in one place
Central service that allows developers to quickly start with development while using best
CI/CD practices
Can edit the code “in the cloud” using AWS Cloud9
AWS Cloud9
Is a cloud IDE (Integrated Development Environment) for writing, running and debugging
code
Classic IDE are downloaded on a computer before being used
A cloud IDE can be used within a web browser, meaning you can work from anywhere without
any setup - Allows code collaboration
DEPLOYMENT
DEVELOPER
SERVICES
SECTION 12 – Global Infrastructure section
Why Global Application? A global application is an application deployed in multiple
geographies
On AWS: deploy your application onto different AWS Regions or Edge Locations
Decreased Latency (latency is the time it takes for a network packet to reach a server)
Disaster Recovery (A DR plan is important to reach high availability)
Attack protection: distributed global infrastructure is harder to attack (hackers online)
Regions: for deploying applications and infrastructure. They’re made of multiple data centers
(AZ). Edge Locations (Points of Presence): for content delivery as close as possible to users
AWS CloudFront Content Delivery Network (CDN) – you cache content at the edge
Improves read performance, content is cached at the edge location – content distributed all
around the world
Improves users experience (since content is cached all around the world)
216 Points of Presence (number of edge locations)
DDoS protection (because worldwide), integration with Shield, AWS Web Application Firewall
What can CloudFront cache from? It can cache from S3 buckets (enhanced security with
CloudFront Origin Access Identity (OAI)) and Custom Origins over HTTP (EC2 Instance,
Application Load Balancer, …)
You can use AWS WAF web access control lists (web ACLs) to help minimize the effects of a
distributed denial of service (DDoS) attack. For additional protection against DDoS attacks, AWS
also provides AWS Shield Standard and AWS Shield Advanced.
S3 Transfer Acceleration
S3 Buckets are linked to only one Region, and
sometimes we need to transfer files from all around the world into one specific S3 bucket
Increase transfer speed by transferring the file to an AWS edge location which will forward the data
to the S3 bucket in the target region.
We can test using a URL link
AWS Global Accelerator to make your request go faster and go through the internal AWS
network globally
Improve global application availability and performance using the AWS global network
Leverage the AWS internal network to optimize the route to your application (60% improvement)
You only access your application through 2 Anycast IPs that are created, and traffic is sent
through Edge Locations
The Edge Locations send the traffic to your application
AWS Outposts
Hybrid Cloud: businesses that keep an on-premises infrastructure alongside a cloud
infrastructure
Therefore 2 ways of dealing with IT systems: One for AWS cloud, one for their on-premises
infrastructure
AWS Outposts are “server racks” that offers the same AWS infrastructure, services, APIs &
tools to build your own applications on-premises just as in the cloud
AWS will set up and manage “outposts Racks” within your on-premises infrastructure and you
can start leveraging AWS services on-premises
You are responsible for the Outposts Rack physical security!!
Benefits: Low-latency access to on premises systems; local data processing, data residency,
easier migration from on premises to the cloud, fully managed service.
Amazon MQ
SQS and SNS are cloud native services, they are using proprietary protocols from AWS. But
applications may use open protocols such as MQTT, AMQP, WSS, Openwire
When migrating to the cloud, instead of re-engineering the application, we can use Amazon MQ
(does not scale as much as the others and is not serverless)
Amazon CloudWatch Metrics provides metrics for every service in AWS (a metric is a variable
to monitor).
Important metrics:
EC2 instances: CPU utilization, status checks, network
EBS Volumes: Disk Reads/Writes
S3 buckets: BucketSizeBytes, Number of objects
Billing: Total Estimated Charge
Service Limits: how much you’ve been using a service API
Custom metrics: push your own metrics
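Pushing a custom metric amounts to sending a payload shaped like the one below (the namespace and metric name are made up; the boto3 call is shown commented out since it needs AWS credentials):

```python
import datetime

# Payload for a CloudWatch custom metric (shape follows PutMetricData).
metric = {
    "MetricName": "OrdersProcessed",          # hypothetical business metric
    "Timestamp": datetime.datetime.utcnow(),
    "Value": 42.0,
    "Unit": "Count",
}
# With credentials configured, you would push it roughly like this:
# import boto3
# boto3.client("cloudwatch").put_metric_data(
#     Namespace="MyApp", MetricData=[metric])
print(metric["MetricName"], metric["Value"])
```

Once pushed, the metric appears in the chosen namespace and can drive alarms like any built-in metric.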
AWS X-Ray
Do tracing and get visual analysis of your application – get a full picture
Distributed tracing, troubleshooting, you want to have a service graph
AWS X-Ray helps developers analyze and debug production, distributed applications, such as
those built using a microservices architecture
CodeGuru
Do automated code reviews (Reviewer) and application performance recommendations
(profiler)
Internet Gateways help our VPC instances connect to the internet – public
Nat Gateways & NAT Instances – private
Public internet
On-premises: must use a Customer Gateway (CGW)
AWS: must use a Virtual Private Gateway (VGW)
Vs. Direct Connect
Penetration Testing
When you are trying to attack your own infrastructure to test your security. We can do pen
testing on the Cloud. Remember that some are authorized, but anything that looks like an attack
such as DDoS attack or DNS zone walking is not authorized because for AWS it would seem
like you’re trying to attack their infrastructure
AWS Certificate Manager (ACM) – service that can help us do in flight encryption for websites
(HTTPS) and generates SSL/TLS certificates
AWS Secrets Manager – store secrets (e.g., RDS database credentials) and have them rotated
every X days
AWS Artifact – portal that provides customers with on demand access to AWS compliance
documentation and AWS agreements
Amazon GuardDuty protect your accounts against attacks from the outside and inside. Uses
Machine Learning to detect anomalies and malicious activities and can match with a third party
data sets – CloudWatchEvents
Amazon Inspector – automated security assessments for EC2 instances. After the
assessment, you get a report with a list of vulnerabilities
AWS Config helps audit and record the compliance of your AWS resources. Records
configurations and their changes over time. Configuration data can be stored in S3. Per-
region service
Amazon Macie is a fully managed data security and data privacy service that uses machine
learning and pattern matching to discover and protect your sensitive data in AWS. Helps
identify and alert you to sensitive data, such as personally identifiable information (PII)
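To make the idea of pattern-based discovery concrete, here is a minimal sketch in Python. It is not Macie's actual detection logic (which combines ML with managed data identifiers); the regexes and the sample text are illustrative assumptions:

```python
import re

# Hypothetical, simplified patterns for two PII kinds. Real services
# use far more robust detection than these regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(text):
    """Return (kind, matched_string) tuples for PII-like substrings."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((kind, match))
    return hits

sample = "Contact alice@example.com, SSN 123-45-6789."
hits = find_pii(sample)
print(hits)
```

A tool like this would then raise an alert per hit, which is roughly what Macie's findings model does at scale over S3 objects.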
Security Hub – a centralized security place: an integrated view that makes it simpler to find
security issues and remediate them.
Central security tool to manage security across several AWS accounts and automate security
checks. Must first enable the AWS Config service
Amazon Detective analyses, investigates and quickly identifies the root cause of security
issues or suspicious activities
AWS Abuse Report suspected AWS resources used for abusive or illegal purposes (Spam,
DDoS attacks). Contact the AWS abuse team (through a form or an e-mail)
Amazon Rekognition – find objects, people, text and scenes in images and videos using ML.
Face detection, labeling, celebrity recognition
Amazon Transcribe automatically convert speech to text (audio to text – subtitles)
Amazon Polly (opposite of transcribe) turn text into speech using deep learning (text to audio)
Amazon Translate natural and accurate language translation
Amazon Lex (like Siri: build conversational chatbots) & Amazon Connect (virtual contact
center)
Amazon Comprehend – Natural Language Processing – NLP, fully managed and serverless
service
Amazon SageMaker fully managed service for developers/data scientists to build ML models
Amazon Forecast fully managed service that uses ML to deliver highly accurate forecasts
Amazon Kendra fully managed document search service powered by ML
Amazon Personalize ML service to build apps with real time personalized recommendations
AWS Control Tower – easy way to set up and govern a secure and compliant multi-account
AWS environment based on best practices. Benefits: automate set-up in a few clicks, detect
policy violations, monitor compliance through an interactive dashboard. It also implements
SCPs (Service Control Policies)
Pricing Models
Pay as you go: pay for what you use, remain agile, responsive, meet scale demands
Save when you reserve: minimize risks, predictably manage budgets
Pay less by using more: volume based discounts
Pay less as AWS grows
Savings Plans – EC2 Savings Plans, Compute Savings Plans (EC2, Lambda, Fargate).
Commit a certain amount of $ per hour for 1 or 3 years – long-term commitments on AWS
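The economics of a Savings Plan come down to comparing a committed hourly rate against the on-demand rate. A small sketch, using made-up rates (these are illustrative numbers, not real AWS pricing):

```python
# Hypothetical hourly rates for one instance running all year.
on_demand_rate = 0.10      # $/hour, pay as you go
savings_plan_rate = 0.062  # $/hour, with a 1-year commitment
hours_per_year = 24 * 365

on_demand_cost = on_demand_rate * hours_per_year
committed_cost = savings_plan_rate * hours_per_year
savings_pct = 100 * (1 - committed_cost / on_demand_cost)

print(f"On-demand: ${on_demand_cost:.2f}/yr, "
      f"committed: ${committed_cost:.2f}/yr, "
      f"saving {savings_pct:.0f}%")
```

The trade-off is the one the notes describe: you pay the committed rate whether or not you use the capacity, so the plan only pays off for steady-state workloads.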
Estimating Costs
TCO (Total Cost of Ownership) Calculator
helps you understand how much it will cost, and the cost savings involved, when you migrate
from on-premises to the cloud, and creates an executive report.
Tracking costs
AWS Billing Dashboard – shows all costs incurred since the beginning of the month, the
forecast, and the month-to-date spend
Cost Allocation Tags – use tags to track your AWS costs at a detailed level. Tags start with a
prefix: aws: (AWS-generated) or user: (user-defined)
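The two prefixes make tags easy to partition programmatically. A minimal sketch, assuming a tag dictionary shaped like the one below (the tag keys and values are illustrative):

```python
def split_cost_tags(tags):
    """Separate AWS-generated (aws:) from user-defined (user:)
    cost allocation tags, following the prefix convention above."""
    generated = {k: v for k, v in tags.items() if k.startswith("aws:")}
    user = {k: v for k, v in tags.items() if k.startswith("user:")}
    return generated, user

tags = {
    "aws:createdBy": "IAMUser:alice",   # illustrative values
    "user:CostCenter": "marketing",
    "user:Environment": "prod",
}
generated, user = split_cost_tags(tags)
```

Grouping spend by a user-defined tag such as CostCenter is the typical way these tags show up in cost reports.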
Cost and usage reports – contains the most comprehensive/ granular set of AWS cost and
usage data available, including metadata about AWS services, pricing, and reservations. We
can analyse this report by using Athena, QuickSight and Redshift.
Cost Explorer – tool that will allow you to forecast your bills up to 12 months based on previous
usage. It also allows you to choose an optimal Savings Plan (to lower prices on your bill)
Monitoring costs
CloudWatch Billing Alarms: the billing data metric is only stored in CloudWatch us-east-1. It's
for actual costs, not projected costs. Intended as a simple alarm (not as powerful as AWS
Budgets)
Budgets – create budgets and send alarms when costs exceed the defined budget. 3 types of
budgets: Usage, Cost and Reservation.
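The threshold-crossing behaviour of a budget alert can be sketched in a few lines. This is a simplified model, not the AWS Budgets API; the 80%/100% thresholds and dollar amounts are illustrative:

```python
def budget_alerts(budget_limit, actual_spend, thresholds=(0.8, 1.0)):
    """Return the alert thresholds (fractions of the budget) that
    actual spend has crossed, mimicking budget notifications."""
    used = actual_spend / budget_limit
    return [t for t in thresholds if used >= t]

# A $1,000 monthly cost budget with $850 spent so far:
alerts = budget_alerts(1000, 850)
print(alerts)  # only the 80% threshold has been crossed
```

In the real service each crossed threshold would fire a notification (e.g., to SNS or email) rather than just returning a value.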
Trusted Advisor analyzes your AWS account and provides recommendations in 5 categories:
Cost Optimization, Performance, Security, Fault Tolerance and Service Limits (best practices)
7 CORE CHECKS (Basic & Developer support plans): S3 Bucket Permissions, Security Groups,
IAM Use, MFA on Root Account, EBS Public Snapshots, RDS Public Snapshots, Service Limits
FULL CHECKS (Business & Enterprise Support plan): Full checks available on 5 categories,
Ability to set CloudWatch alarms when reaching limits, Programmatic Access using AWS
Support API
3) Business Support
Intended to be used if you have production workloads. Trusted Advisor – full set of
checks + API access. 24x7 phone, email and chat access to Cloud Support Engineers.
Unlimited cases/unlimited contacts. Access to Infrastructure Event Management for an
additional fee. Case severity/response times: <24 business hours (general guidance),
<12 business hours (system impaired), <4 hours (production system impaired),
<1 hour (production system down)
4) Enterprise Support
Intended if you have mission-critical workloads
All of Business Support + access to a Technical Account Manager (TAM) and the
Concierge Support Team (for billing and account best practices), Infrastructure Event
Management, Well-Architected & Operations Reviews. Case severity/response times:
<24 business hours (general guidance), <12 business hours (system impaired),
<4 hours (production system impaired), <1 hour (production system down),
<15 minutes (business-critical system down)
AWS STS (Security Token Service) – enables you to create temporary, limited-privilege
credentials to access your AWS resources
We configure the expiration period
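The key idea is that STS credentials carry an explicit expiration. A minimal sketch of that shape in Python, with fake placeholder values (the real credentials come from an STS API call such as boto3's `sts.get_session_token`):

```python
from datetime import datetime, timedelta, timezone

def issue_temporary_credentials(duration_seconds=3600):
    """Sketch of what STS-style output looks like: short-lived
    credentials with an explicit expiration. All key values here
    are fake placeholders, not real AWS credentials."""
    now = datetime.now(timezone.utc)
    return {
        "AccessKeyId": "ASIAEXAMPLE",          # fake
        "SecretAccessKey": "fake-secret",      # fake
        "SessionToken": "fake-session-token",  # fake
        "Expiration": now + timedelta(seconds=duration_seconds),
    }

def is_expired(creds):
    return datetime.now(timezone.utc) >= creds["Expiration"]

creds = issue_temporary_credentials(900)  # a 15-minute session
```

Callers are expected to check the expiration and request fresh credentials when needed, which is what keeps the blast radius of a leaked key small.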
Amazon Cognito – Identity for your Web and Mobile applications users (potentially millions).
Instead of creating them an IAM user, you create a user in Cognito
Amazon Cognito lets you add user sign-up, sign-in, and access control to your web and mobile
apps quickly and easily.
Microsoft Active Directory (AD) – database of objects: user accounts, computers, printers, file
shares, security groups. AWS Directory Service provides managed Active Directory in AWS
AWS Single Sign-On (SSO) – centrally manage single sign-on access to multiple AWS
accounts and 3rd-party business applications. Integrated with AWS Organizations. Supports
SAML 2.0. Integrates with on-premises Active Directory
AWS Elastic Disaster Recovery (DRS) – quickly and easily recover your physical, virtual and
cloud-based servers into AWS as part of a Disaster Recovery strategy
AWS DataSync – move large amounts of data from on-premises to AWS. Replication tasks
are incremental after the first full load
1) Operational Excellence
Perform operations as code; Annotate documentation; Make frequent, small and
reversible changes; Refine operations procedures frequently; Anticipate failure; Learn
from all operational failures. PREPARE, OPERATE, EVOLVE
2) Security
Implement strong identity foundation; Enable traceability; Apply security at all layers;
Automate security best practices; protect data, Prepare for security events
3) Reliability
Recover from infrastructure or service disruptions
Test recovery procedures; Automatically recover from failure; Scale horizontally to
increase availability; Stop guessing capacity; Manage change in automation
4) Performance Efficiency
Democratize advanced technologies; Go global in minutes; Use serverless; Experiment
more often; Mechanical sympathy
5) Cost Optimization
Adopt a consumption model – pay for what you use; Measure overall efficiency
(CloudWatch); Stop spending money on data center operations; Analyse and attribute
expenditure; Use managed and application-level services to reduce cost of ownership
Right Size – EC2 has many instance types, but choosing the most powerful one isn't the best
choice, because the cloud is elastic. The idea is to match your workload's performance and
capacity requirements at the lowest possible cost! Scaling up is easy, so always start small!
Right size before a cloud migration and continuously after the cloud onboarding process
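The payoff of right-sizing is easy to quantify. A sketch using two instance sizes with illustrative hourly rates (not real AWS pricing; 730 is the usual approximation of hours per month):

```python
# Illustrative on-demand hourly rates for two instance sizes.
rates = {"m5.4xlarge": 0.768, "m5.xlarge": 0.192}

def monthly_cost(instance_type, hours=730):
    """Approximate monthly cost of running one instance 24/7."""
    return rates[instance_type] * hours

oversized = monthly_cost("m5.4xlarge")   # over-provisioned choice
right_sized = monthly_cost("m5.xlarge")  # matches actual workload
print(f"Right-sizing saves ${oversized - right_sized:.2f}/month")
```

Because scaling up later is easy, starting with the smaller instance and growing only when metrics demand it is the lower-risk path.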