Googlecloudplatform 1151921572881138355

This document provides an overview of Google Cloud Platform concepts including the resource hierarchy, organization structure, projects, folders, resources, labels, compute choices like VM instances, machine types, base images, storage options, networking components, and availability policies. It describes the key building blocks and configuration options for deploying and managing workloads on GCP.
GCP Professional Cloud Architect

Certification Prep
Google Cloud Certifications
Associate Certifications
• Associate Cloud Engineer
Professional Certifications
• Professional Cloud Architect
• Professional Data Engineer
• Professional Cloud Developer
• Professional Cloud Network Engineer
• Professional Cloud Security Engineer
• Professional Collaboration Engineer
Professional Cloud Architect
• Test duration: 2 hours
• Registration fee: $200 + taxes
• Languages: English, Japanese, Spanish, Portuguese
• Recommended: 3+ years GCP experience
Professional Cloud Architect
• Vast array of services for a wide variety of use cases
• Extensive labs for hands-on practice:
• https://codelabs.developers.google.com/?cat=Cloud
• Case studies link here:
• https://cloud.google.com/certification/guides/professional-cloud-architect/
Google Cloud Platform Basics

Resource Hierarchy Components
• Organization
• Folders
• Billing account
• Project
• Resources
• Labels
Organization
• Top of resource hierarchy
• Contains projects and folders
• Identities come from G Suite or a Cloud Identity account
• IAM policies are inherited down into projects and resources
• Central control for all resources
• Projects belong to the organization, not employees
• Can grant organization level roles
Folders
• Grouping mechanism within an organization
• Logical group of projects
• Can set IAM policies to administer multiple projects
• Model legal entities, departments, and teams
Projects
• Container for billable resources
• Some resources can be used for free
• For all others, billing account needs to be linked
• Required resource for using GCP services
Resources
• Any component that incurs billing
• Must exist within project
• Can set resource-level IAM
• Inherits policies from organization, folder, project
• Lowest level of the hierarchy
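The inheritance described above can be sketched as a walk from a resource up to the organization, collecting bindings on the way. This is a minimal illustration, not how GCP evaluates policies internally; the `Node` class, the node names and the member addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One level of the hierarchy: organization, folder, project, or resource."""
    name: str
    parent: Optional["Node"] = None
    # role -> set of members granted that role directly on this node
    bindings: dict = field(default_factory=dict)


def effective_bindings(node):
    """Union of the bindings set on the node and on every ancestor."""
    merged = {}
    current = node
    while current is not None:
        for role, members in current.bindings.items():
            merged.setdefault(role, set()).update(members)
        current = current.parent
    return merged


# Hypothetical hierarchy: organization -> folder -> project -> resource
org = Node("example-org", bindings={"roles/viewer": {"group:auditors@example.com"}})
folder = Node("engineering", parent=org)
project = Node("web-prod", parent=folder,
               bindings={"roles/editor": {"user:dev@example.com"}})
vm = Node("vm-1", parent=project)

policy = effective_bindings(vm)
print(policy)
```

The VM has no bindings of its own, yet it ends up with both the organization-level viewer grant and the project-level editor grant.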
Labels
• Key-value pairs
• Resource metadata
• Can use to organize billing
• Can break down billing by label
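Breaking billing down by label amounts to grouping cost line items by the value of one label key. The sketch below assumes a made-up list of line items; the resource names, labels and amounts are hypothetical and do not reflect the format of a real billing export.

```python
from collections import defaultdict

# Hypothetical billing line items: (resource name, labels, cost in USD).
line_items = [
    ("vm-1", {"team": "web", "env": "prod"}, 120.0),
    ("vm-2", {"team": "web", "env": "dev"}, 30.0),
    ("bucket-1", {"team": "data", "env": "prod"}, 45.5),
    ("disk-9", {}, 4.5),
]


def cost_by_label(items, key):
    """Sum cost per value of the given label key; unlabeled items get their own group."""
    totals = defaultdict(float)
    for _name, labels, cost in items:
        totals[labels.get(key, "(unlabeled)")] += cost
    return dict(totals)


by_team = cost_by_label(line_items, "team")
print(by_team)
```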
Using GCP Resources
• Cloud Console
• Cloud Shell and command-line tools: gsutil and gcloud (bq and kubectl). Cloud Shell is a great terminal utility.
• Client APIs: Programmatic access via API calls. Under the hood, making HTTP calls to GCP endpoints.
Choices in Computing
• Compute: Where is code executed and how?
• Storage: Where is data stored?
Networking, hosting, and logging are choices made after this fundamental decision
Compute Choices
• Bare metal
• VM Instances
• Container Clusters
• Hosted Apps
• Serverless functions
GCP Compute Choices (ordered from IaaS to PaaS)
• Google Compute Engine
• Google Kubernetes Engine
• Google App Engine
• Google Cloud Functions
Google Compute Engine (GCE)
Bare Metal vs. IaaS
Bare Metal
• Apps run on OS which runs on hardware
• Less portable
• CPUs
• Full burden of ops and admin
IaaS
• Hypervisor between apps and hardware
• More portable
• vCPUs
• Much of ops burden managed by service provider
GCP Internals
• Zone: Availability zone (similar to a datacenter)
• Region: Set of zones with high-speed network links
• Network: User-controlled IP addresses, subnets and firewalls
Global, Regional and Zonal
Compute Resources
• Global:
• Static external IP addresses
• Images and Snapshots
• Networks, firewalls, routes
• Regional
• Subnets
• Regional persistent disks
• Zonal
• Instances
• Persistent disks
Configuration Choices
• Machine Type: Memory size, virtual CPU (vCPU) count, and maximum persistent disk capability
• Base Image: Public (free or premium), custom, snapshots from boot disks
Machine Type
• Predefined: General-purpose, Compute-optimized, Memory-optimized
• Custom
General Purpose Machines
• Day to day computing for known workloads
• Best price-performance ratio
• N1 first generation: 6.5GB of memory per vCPU
• N2 second generation: 8GB of memory per vCPU
• Heavier-duty workloads such as web serving, databases, and applications use N2
• Can customize machine types
• Come in high-memory and high-cpu variants
Compute-optimized Machines
• Compute intensive workloads
• Offer the highest performance per core
• C2 machine types
• Gaming, single-threaded applications, electronic design
automation
• Custom machine types not supported
Memory-optimized Machines
• Memory-intensive workloads
• Offer the highest memory per core
• Custom machine types not supported
Shared-core Machines
• Cost-effective for running non-resource intensive operations
• A single vCPU is allowed to run for a portion of the time on a single hardware hyper-thread
• Offer micro-bursting capabilities for spikes
• Instance will use additional physical CPUs during spikes
Base Images
Public
• Provided and maintained by Google, open-source communities, and third-party vendors
• All projects have access to these images and can use them to create instances
• Linux, Windows, Container-optimized OS, SQL Server
• Many images come with Shielded VM support
Custom
• Available only to your project
• First, create a custom image from boot disks and other images; then, use the custom image to create an instance
Shielded VM
• Verifiable integrity of your compute instances
• Ensure they haven’t been compromised by boot or kernel-level
malware
• Secure Boot: Verifies digital signature of software during boot
• Virtual Trusted Platform Module (vTPM): Virtualized version of a specialized
computer chip used to protect keys and certificates
• Measured Boot: Hashes boot components to verify load order
and components loaded
• Integrity monitoring of VM instances
Preemptible VM Instances
An instance that you can create and run at a much lower price
than normal instances. However, GCE might terminate (preempt)
these instances if it requires access to those resources for other
tasks.
• May not always be available
• Not covered by SLAs
Sole-tenant Nodes
A sole-tenant node is a physical Compute Engine server that is
dedicated to hosting VM instances only for your specific project
• Keeps your instances physically separated from instances in other projects
• Group instances together on the same hardware
VM Instances as Building Blocks
• GCE VM Instances
• Managed Instance Groups
• Load Balancers
Accessing Storage from VMs
• Block Storage: Persistent Disks (HDD or SSD), Local SSD, RAM disk
• Object Storage: GCS Buckets
• File Storage: Cloud Filestore
Persistent Disks vs. Buckets
Persistent Disks
• Block storage
• Max 64TB in size
• Pay what you allocate
• Tied to GCE VMs
• Zonal (or regional) access
Buckets
• Object storage
• Infinitely scalable
• Pay what you use
• Independent of GCE VMs
• Global access
Persistent Disks
• Resize on the fly
• Move across zones
• Create images and snapshots
• Encrypted at rest
• can use custom keys
Boot Disk
• Each GCE VM needs a persistent boot disk
• This disk contains boot loader, OS etc.
• Bootable
• Durable
• can delete VM but keep disk
Persistent Disks vs. Local SSDs
Persistent Disks
• Network-attached storage
• Data redundancy built-in
• Bootable
• Durable
• HDD or SSD
• 64TB max
• Create snapshots or images
• Relatively slow
Local SSDs
• Physically attached to instance
• No data redundancy built-in
• Not bootable
• Not durable
• SSD for better performance
• 3TB max
• Cannot create snapshots or images
• Very fast, especially for random access
Availability Policies
A VM instance's availability policy determines how it behaves
when an event occurs that requires Google to move your VM to a
different host machine
• Live Migrate (Default): Instance remains running during the migration
• Terminate (and Restart): GCE shuts down instance, terminates it and restarts it elsewhere
• Automatic Restart: If instance crashes, GCE automatically restarts it
Labels
Key-value pairs that can be associated with any GCP resource; a
lightweight way to group related resources
Network Tags
Text attributes applied to VM instances (and instance templates)
as a way of applying firewall rules and routes to specific instances
Metadata Server
• Labels and tags are forms of metadata
• Reside outside an instance on a metadata server
• Can be programmatically queried
• Instance itself can query without authorization
Metadata Server
• Use with startup and shutdown scripts
• Commonly used to find
• instance host name
• instance ID
• startup and shutdown scripts
• service account
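A minimal sketch of querying the metadata server from inside an instance, using only the standard library. The endpoint and the `Metadata-Flavor: Google` header follow the Compute Engine metadata conventions; the helper names `metadata_request` and `read_metadata` are hypothetical, and `read_metadata` only succeeds when run on a GCE VM.

```python
import urllib.request

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1/"


def metadata_request(path):
    """Build a request for one metadata entry, e.g. 'instance/hostname'."""
    return urllib.request.Request(
        METADATA_ROOT + path,
        headers={"Metadata-Flavor": "Google"},  # required header, no credentials needed
    )


def read_metadata(path, timeout=2.0):
    """Fetch the entry as text; only works from inside a GCE instance."""
    with urllib.request.urlopen(metadata_request(path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request is safe anywhere; sending it requires a GCE instance.
req = metadata_request("instance/hostname")
print(req.full_url)
```

On an instance, `read_metadata("instance/id")` would return the instance ID in the same way.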
Image
• Binary file used to instantiate VM root disk
• Usually based off OS image
• Also contains boot loader
• Can also contain customizations
• Managed by GCP image service
Snapshot
• Binary file with exact contents of persistent disk
• “Point-in-time” snapshot
• Managed by GCP snapshot service
• Incremental backups possible too
• Used to back up data from persistent disks
Images and Snapshots
Persistent Disk Images
• Create an image to use disk as basis for new instances
• Not incremental
• Relatively expensive
• Can be directly used to instantiate new instance or managed instance group
• Supports families and versioning
• Share across projects
Persistent Disk Snapshots
• Create a snapshot to back up data present in a disk
• Incremental
• Relatively cheap
• Must first be used to create a disk before instances can be created from it
• No support for families
• Specific to project
Google App Engine
Google App Engine
Web framework and platform for hosting web applications on the
Google Cloud Platform

Support for Go, PHP, Java, Python, Node.js, .NET, Ruby and
other languages
App Engine Environments
Standard Environment
• App runs in a proprietary sandbox
• Code in few languages/versions only
• No other runtimes possible
• Apps cannot access Compute Engine resources
• No installation of third-party binaries
• Instance startup in seconds
• No background processes
• No SSH debugging
• Scale to zero
• Apps that experience traffic spikes
• Usually stateless HTTP web apps
• All instances in same zone (moved in case of zone outage)
Flexible Environment
• Runs in Docker container on GCE VM
• Code in far more languages/versions
• Custom runtimes possible
• Apps can access Compute Engine resources, some OS packages
• Can install and access third-party binaries
• Instance startup in minutes
• Background processes supported
• SSH debugging supported
• No scaling to zero (minimum 1 instance)
• Apps that experience consistent traffic
• General purpose apps
• Instances in the same region (regional Managed Instance Group)
App Engine App
Single regional application resource consisting of hierarchy of
services, versions and instances
Components of an Application
[Figure: an Application contains Services; each Service contains Versions; each Version runs on Instances]
Google Cloud Functions

Cloud Functions
Event-driven serverless compute platform
Event-driven Serverless Compute
• Event occurs
• Platform triggers execution
• Cloud Function code runs
• Invokes other GCP services
Types of Events
• HTTP
• Background: Cloud Storage, Pub/Sub, Firebase, Stackdriver Logging
Concurrency and Scale
• Spin up function instances based on current load
• Functions do not share memory or variables
• An instance processes a single request
• Functions should be stateless
Session 3: Storage
Storage Technologies
• Unstructured Data
• Structured Data
• OLTP: Cloud SQL, Cloud Spanner
• OLAP: BigQuery, BigTable
Unstructured Data
• Block Storage: Physically addressable storage accessed from compute
• Object Storage: Logically addressable storage accessed from compute or by human users
GCS Storage Classes
How often is a data item accessed?
• Cold Data (“Very rarely”, less than once a year): Coldline storage
• Cool Data (“Not that often”, once a month): Nearline storage
• Hot Data (“All the time”, many times a month): Standard storage
All Storage Classes
Where is the data item accessed from?
• “A specific region”: Region
• “Geographically separate locations”: Dual-region
• “Accessed from anywhere in the world”: Multi-region
Coldline has about the same speed of access as other
storage classes (different from AWS Glacier and S3)
Different storage classes represent different trade-offs. Several parameters along which to compare: availability, storage costs, retrieval costs, durability, access frequency, and use cases.
Availability
• Standard storage (dual and multi-regional): 99.95%
• Standard storage (regional): 99.9%
• Nearline (regional): 99.0%
• Coldline (regional): 99.0%
Storage Costs (cents/GB/month)
• Standard (multi-region): 2.6
• Nearline (multi-region): 1.0
• Coldline (multi-region): 0.7
Retrieval Costs (cents/GB)
• Standard: None
• Nearline: 1.0
• Coldline: 5.0
Minimum Commitment
• Standard: None
• Nearline: 30 days*
• Coldline: 90 days*
*Early deletion will incur charges
Durability (“11 nines”)
• Standard: 99.999999999%
• Nearline: 99.999999999%
• Coldline: 99.999999999%
Access Frequency
• Standard: Daily
• Nearline: Monthly or less
• Coldline: Monthly or less
Use Cases
• Standard storage (dual and multi-regional): Serving websites, interactive workloads, mobile and gaming applications
• Standard storage (regional): Access from Compute Engine VMs or Dataproc cluster
• Nearline: Data backup, disaster recovery, archival storage
• Coldline: Legal or regulatory needs; also disaster recovery where recovery time is important
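The trade-off between storage cost and retrieval cost can be made concrete with a small calculation. The sketch below uses only the per-GB prices quoted in the slides (multi-region storage prices, in cents); it ignores minimum-commitment and early-deletion charges, and the function names are hypothetical.

```python
# Prices copied from the slides, in cents.
STORAGE_CENTS_PER_GB_MONTH = {"standard": 2.6, "nearline": 1.0, "coldline": 0.7}
RETRIEVAL_CENTS_PER_GB = {"standard": 0.0, "nearline": 1.0, "coldline": 5.0}


def monthly_cost_usd(storage_class, stored_gb, retrieved_gb):
    """Storage plus retrieval cost for one month, in US dollars."""
    cents = (stored_gb * STORAGE_CENTS_PER_GB_MONTH[storage_class]
             + retrieved_gb * RETRIEVAL_CENTS_PER_GB[storage_class])
    return round(cents / 100, 2)


def cheapest_class(stored_gb, retrieved_gb):
    """Storage class with the lowest monthly cost for this access pattern."""
    return min(STORAGE_CENTS_PER_GB_MONTH,
               key=lambda c: monthly_cost_usd(c, stored_gb, retrieved_gb))


# 1000 GB stored, with increasing amounts read back each month
for retrieved in (10, 1000, 2000):
    print(retrieved, cheapest_class(1000, retrieved))
```

With these prices, rarely read data is cheapest in Coldline, while data that is read back in full every month or more is cheaper in Nearline or Standard.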
GCS for Object Storage
File Storage
• Hierarchical structure
• Support for nesting and directories
• File-level locks
• File and directory headers
Object Storage
• Flat, non-nested structure
• Nested structure merely simulated
• No distributed lock - last write wins
• Unstructured series of bytes
Object Storage Class
• Every bucket has an associated storage class
• Every object also has an associated storage class
• On creation, object inherits storage class of bucket
Changing Storage Class of a Bucket
• When storage class of a bucket is changed, new objects subsequently added to bucket pick up change
• But existing objects in bucket keep their storage class
Changing Storage Class of an Object
• Can change storage class of object
• without changing storage class of bucket
• without moving object to different bucket
• without affecting URL of object
Object Versioning
• Needs to be enabled for bucket
• Once enabled, bucket creates archived versions of each object
• Whenever live object is overwritten or deleted
• Version with unique generation number is created
• Each copy charged separately
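The versioning behaviour above can be imitated with a toy in-memory bucket. This is a simplified model for illustration only: the `VersionedBucket` class is hypothetical, and it uses a simple counter where GCS assigns its own generation numbers.

```python
class VersionedBucket:
    """Toy model of a bucket with object versioning enabled."""

    def __init__(self):
        self._next_generation = 1
        self.live = {}       # name -> (generation, data)
        self.archived = []   # list of (name, generation, data)

    def _archive(self, name):
        # Move the current live version, if any, to the archived list.
        if name in self.live:
            generation, data = self.live.pop(name)
            self.archived.append((name, generation, data))

    def put(self, name, data):
        """Overwrite: the previous live version is archived first."""
        self._archive(name)
        self.live[name] = (self._next_generation, data)
        self._next_generation += 1

    def delete(self, name):
        """Delete: the live version is archived rather than destroyed."""
        self._archive(name)

    def stored_bytes(self):
        """Every copy, live or archived, counts towards storage."""
        live = sum(len(data) for _gen, data in self.live.values())
        old = sum(len(data) for _name, _gen, data in self.archived)
        return live + old


bucket = VersionedBucket()
bucket.put("report.txt", b"v1")
bucket.put("report.txt", b"v22")
bucket.delete("report.txt")
print(bucket.archived, bucket.stored_bytes())
```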
Object Lifecycle Management
• Can automatically specify changes to object storage class
• “Change from regional to nearline after 30 days”
• “Delete all data created before 1/8/2018”
• “Delete all but 2 most recent versions”
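The three example rules above could be written as a lifecycle configuration along the following lines. This is a sketch built as a Python dictionary and printed as JSON; the action types and condition keys follow the Cloud Storage lifecycle configuration format as I understand it, and the date 1/8/2018 is assumed here to mean 2018-01-08.

```python
import json

lifecycle = {
    "lifecycle": {
        "rule": [
            {   # "Change from regional to nearline after 30 days"
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": 30, "matchesStorageClass": ["REGIONAL"]},
            },
            {   # "Delete all data created before 1/8/2018" (date order assumed)
                "action": {"type": "Delete"},
                "condition": {"createdBefore": "2018-01-08"},
            },
            {   # "Delete all but 2 most recent versions": remove archived
                # versions that already have 2 or more newer versions
                "action": {"type": "Delete"},
                "condition": {"numNewerVersions": 2, "isLive": False},
            },
        ]
    }
}

print(json.dumps(lifecycle, indent=2))
```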
Encryption
• Encrypted even at rest
• Default: Google generates keys
• Can use CSEK
• Customer Supplied Encryption Key
GCS and Load Balancers
• Load balancers to distribute incoming traffic
• Usually, compute engine VMs as backend instances
• GCS bucket can be backend instances too
• Great for serving static data
Cloud IAM
• Identity and Access Management
• Used for all GCP resources
• Role-based Access Control (RBAC)
• Preferred to ACL-based access control
Access Control Lists
• IAM is preferred method for restricting control
• GCS is the only service where ACLs can be used too
Access Control Lists
• Each entry in an ACL includes
• Permission: What action can be performed
• Scope: Who can perform the specified action
Restricting Access
• Cloud IAM
• project-level
• bucket-level
• Access Control Lists
• for individual objects
• e.g. PII (Personally Identifiable Information)
Signed URLs
• Time-limited, signed URL
• Provides access without further authentication
• Valet key
• Specific operations can be specified
• GET, PUT, DELETE (not POST)
Object Change Notifications
• Can respond to changes in specific object
• Trigger web hook
• Use in combination with Pub/Sub
• GCP’s reliable messaging middleware
Storage Use Cases
Use Case: Appropriate GCP Service (Non-GCP Equivalents)
• Block storage: Persistent disks or local SSDs (AWS EBS, Azure Disk)
• Object/blob storage: Cloud Storage (GCS) buckets (AWS S3, Azure Blob Storage)
• Relational data - small, regional payloads: Cloud SQL (AWS RDS, Azure SQL Database)
• Relational data - large, global payloads: Cloud Spanner (AWS RDS, Azure SQL Database)
• HTML/XML documents with NoSQL access: Datastore/Firestore (AWS DynamoDB, Azure Cosmos DB)
• Large, naturally ordered data with NoSQL access: BigTable (AWS DynamoDB, Azure Cosmos DB)
• Analytics and complex queries with SQL access: BigQuery (AWS Redshift, Azure Data Warehouse)
Cloud SQL
Cloud SQL is the fully-managed MySQL and PostgreSQL
database service on the Google Cloud Platform

SQL Server currently available in beta

Transactional support, ACID support

Easiest migration path for on-premises RDBMS

High availability using failover replicas in different zones


Google Cloud Spanner
A global, horizontally scaling, strongly consistent relational
database service built on proprietary technology

Scales horizontally by adding nodes

ACID++ support at scale

Relatively expensive and Google proprietary


Cloud Firestore
Flexible, scalable, NoSQL database for keeping data in sync
across client apps.

Mobile and web server development as a part of GCP’s Firebase platform

Realtime listeners and offline support


BigQuery Features
• Serverless: No cluster, no provisioning
• Structured data with fields
• Can ingest streaming data at scale
• Autoscaling
• Automatic high availability
• Simple SQL queries
Redis
Very popular in-memory key-value NoSQL database
Cloud Memorystore
Google managed service for Redis that offers scaling, high
availability and a convenient migration path
Google Cloud Bigtable
NoSQL database technology ideal for very large, sparse datasets
with sequential ordering in key column; provides very fast writes
as well as reads
Choose Bigtable For
• Time series data: Naturally ordered
• Internet of Things data: Constant stream of writes
• Financial data: Often efficiently represented as time series data
• Large datasets > 1 TB with each row < 10 MB
Session 4: Networking
Networking Requirements
Objective: GCP Solution
• Resources within a project need to communicate: Internal IP addresses
• Resources on GCP need to communicate with outside world: External IP addresses
• Traffic sent to an IP address needs to reach that address: Routes
• Platform users need to be able to restrict traffic flows: Firewall rules
IP addresses, routes and firewall rules all exist
inside a GCP resource called a VPC Network
Google Virtual Private Cloud
A VPC network, often just called a network, is a global, private,
isolated virtual network partition that provides managed network
functionality on the GCP
Multiple VPCs in a Project
[Figure: one project containing five VPC networks, VPC Network 1 through VPC Network 5]
Projects and VPCs
• VPCs are global resources on the GCP
• Each VPC must exist inside a project
• Default VPC pre-created in each project
• Can add additional VPCs
• Auto Mode
• Custom Mode
VPCs Are Global
[Figure: a project containing the Default VPC, VPC1 and VPC2, each spanning the asia-south1 and us-east1 regions]
Subnets in Each Region
[Figure: the same project and VPCs, with subnets in asia-south1 and us-east1]
Resources Provisioned on Subnets
[Figure: the same project and VPCs, with resources placed on the subnets in each region]
Subnets
• IP range partitions within global VPCs
• VPCs have no IP ranges
• Subnets are regional - can span zones inside a region
• Network has to have at least one subnet before you can use it
Subnets
• Auto Mode VPCs have pre-created subnets
• One in each GCP region
• Custom Mode VPCs start with no subnets
• Full control over which regions have subnets
• Can create multiple subnets in a region
Subnets and IP Ranges
• Each subnet must have primary address range
• Valid RFC 1918 CIDR block
• Subnet ranges in same network cannot overlap
• Subnet ranges in different networks can overlap
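The range rules above can be checked with the standard `ipaddress` module. A minimal sketch: `overlapping_pairs` flags clashing subnets within one network and `is_rfc1918` tests whether a range sits inside the RFC 1918 private blocks. Both function names and the sample ranges are illustrative assumptions.

```python
import ipaddress
from itertools import combinations

RFC1918_BLOCKS = [ipaddress.ip_network(b)
                  for b in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def overlapping_pairs(cidrs):
    """Return every pair of ranges in the same network that overlap."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)]


def is_rfc1918(cidr):
    """True if the whole range lies inside one of the RFC 1918 private blocks."""
    net = ipaddress.ip_network(cidr)
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)


# Hypothetical subnets proposed for a single VPC network
proposed = ["10.128.0.0/20", "10.132.0.0/20", "10.128.8.0/24"]
print(overlapping_pairs(proposed))
print([is_rfc1918(c) for c in proposed])
```

The same check is useful before peering or connecting networks, where ranges from two different networks must not clash either.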
Communication on VPCs
• Resources within a VPC communicate using private IP addresses, wherever they are located in the world - irrespective of physical location
• Resources on different VPCs communicate over the internet using external IPs, even though they are in the same region - they may even be in the same zone on the same physical hardware
Default VPC
Default VPC
• Pre-created on every project
• Includes subnet for each GCP region
• New subnets added when new regions are created
• Resources created here by default
Default VPC
• Includes routes for all resources
• All VMs on the default VPC can talk to each other
• Default gateway to internet
• Includes several firewall rules
Firewall Rules
• Every VPC is a distributed firewall
• Firewall rules defined in VPC
• Are applied on per-instance basis
• Can also regulate internal traffic
Firewall Rules
• Every VPC has two permanent rules
• Implied allow egress
• Implied deny ingress
• Can be overridden by more specific rules
• In addition, default VPC has several rules
Additional Rules in Default VPC
• default-allow-internal
• default-allow-ssh
• default-allow-rdp
• default-allow-icmp
VPCs on the Google Cloud
• Auto Mode: Subnets automatically created in each region, default firewall rules preconfigured
• Custom Mode: Manually create subnets in regions, no defaults
Changing VPC Mode
• Auto -> Custom: Possible
• Custom -> Auto: Not possible
Choosing Auto Mode
• Easy to use, GCP does all the work
• Automatically defined ranges for all regions
• Pre-defined IP ranges
Choosing Custom Mode
• More control over network configuration
• No need for subnets in each region
• Predefined IP ranges might clash with peer network
• Preferably use custom networks with
• VPC peering
• Cloud VPN
GCE VM IP Addresses
• Internal: Primary (Static or Ephemeral), Secondary
• External: Static (Global or Regional), Ephemeral
Firewall Rules
Firewall Rules
Restrict and regulate network traffic flows in a VPC
Firewall Rules
• Every VPC has two permanent rules
• Implied allow egress
• Implied deny ingress
• Can be overridden by more specific rules
Firewall Rules
• Every firewall rule has several components
• Priority (0 highest, 65535 lowest)
• Direction (ingress/egress)
• Action (allow/deny)
• Target
• Source or destination
• Protocol and port
• Enforcement status (enabled/disabled)
Direction and Action
• Direction always defined from perspective of target
• Ingress: Traffic coming into target from some source
• Egress: Traffic sent out by target to some destination

• Action to be taken when match found


• Allow: Permit connection
• Deny: Block connection
• Rule can only specify one action
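How priority, direction and action combine can be sketched with a small evaluator. It is a simplified model, not the GCP implementation: targets (tags, service accounts) and protocols are left out, the `Rule` class and rule names are hypothetical, and the tie-break (deny wins over allow at equal priority) is an assumption about GCP behaviour.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    name: str
    priority: int                 # 0 is highest, 65535 is lowest
    direction: str                # "ingress" or "egress"
    action: str                   # "allow" or "deny"
    ip_range: str                 # source (ingress) or destination (egress) range
    port: Optional[int] = None    # None means all ports


def decide(rules, direction, peer_ip, port):
    """Return (action, rule name) for a connection, seen from the target VM."""
    peer = ipaddress.ip_address(peer_ip)
    matches = [
        r for r in rules
        if r.direction == direction
        and peer in ipaddress.ip_network(r.ip_range)
        and (r.port is None or r.port == port)
    ]
    if matches:
        # Lowest priority number wins; at equal priority, deny is assumed to win.
        best = min(matches, key=lambda r: (r.priority, r.action != "deny"))
        return best.action, best.name
    # No user-defined rule matched: fall back to the two implied rules.
    if direction == "egress":
        return "allow", "implied-allow-egress"
    return "deny", "implied-deny-ingress"


rules = [
    Rule("allow-ssh", 1000, "ingress", "allow", "0.0.0.0/0", 22),
    Rule("deny-bad-range", 900, "ingress", "deny", "203.0.113.0/24"),
]
print(decide(rules, "ingress", "198.51.100.7", 22))
print(decide(rules, "ingress", "203.0.113.5", 22))
```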
Target
• Three possible specifications
• All instances in network
• Instances by target tag
• Instances by target service account
Source or Destination Filter
• Can specify exactly one (not both)
• For ingress rules: specify source
• For egress rules: specify destination
Source and Destination
Sources
• Any IP (0.0.0.0/0)
• Source IP ranges
• Source tags
• Source service accounts
• Some combinations

Destinations
• Any IP (0.0.0.0/0)
• Destination IP ranges
Protocol and Port
• If both omitted - rule applies to all traffic
• Protocol can be name or decimal number
• If port omitted, applies to all ports
• Can specify combinations
• tcp:80
• tcp:20-22
• tcp:80; tcp:443
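The protocol and port notation in the examples above is easy to parse. The sketch below handles the three forms shown (single port, port range, several entries separated by semicolons); `parse_spec` and `matches` are hypothetical helpers, and treating an omitted port as the range 0-65535 is a modelling choice.

```python
def parse_spec(spec):
    """Parse e.g. 'tcp:80; tcp:20-22' into (protocol, first_port, last_port) tuples."""
    parsed = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        protocol, _, ports = part.partition(":")
        if not ports:
            parsed.append((protocol, 0, 65535))       # port omitted: all ports
        elif "-" in ports:
            first, last = ports.split("-")
            parsed.append((protocol, int(first), int(last)))
        else:
            parsed.append((protocol, int(ports), int(ports)))
    return parsed


def matches(spec, protocol, port):
    """True if the protocol and port fall inside any entry of the spec."""
    return any(p == protocol and low <= port <= high
               for p, low, high in parse_spec(spec))


print(parse_spec("tcp:80; tcp:443"))
print(matches("tcp:20-22", "tcp", 21))
```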
Connecting Networks

Shared VPC
• Share VPC across projects on GCP
• Projects must be in the same organization
• Host project owns the VPC; service projects use it
• Shared VPC admin to administer the shared VPC
VPC Peering
• Connect VPC networks across projects on GCP
• Projects need not be in the same organization
• Allows resources on different VPC networks to communicate
using private IP addresses
• Reduced latency, higher security and lower cost as compared
with using external IPs
Shared VPCs vs. Network Peering
Shared VPCs
• Only within same organization
• One VPC used across projects
• Host and service projects are not peers
• Only single level of sharing possible
Network Peering
• Across organization boundaries
• Multiple VPCs share resources
• Connected VPCs are peers
• Multiple levels of peering possible
Interconnecting Networks
• GCP-to-GCP: VPC Network Peering
• Enterprise connectivity: Peering and interconnect options
Enterprise Connectivity
• Interconnect (VPN, Partner, Dedicated): Internal IP addresses in RFC 1918 address space, with SLA
• Peering (Direct, Carrier): Public IP addresses, no SLA
VPN Tunnel
Configuration Property: Choice
• Connection: Encrypted tunnel to VPC networks through the public Internet
• Access Type: Internal IP addresses in RFC 1918 address space
• Capacity: 1.5-3 Gbps for each tunnel
• Other Considerations: Requires a VPN device on your on-premises network
Elements of VPN
• Cloud VPN gateway
• On-premises VPN gateway
• VPN tunnel
Cloud VPN
Mechanism for secure connection between on-premises networks and a GCP
VPC; secure tunnel using two VPN gateways, one at each end
Cloud VPN
• Two VPN gateways
• Traffic encrypted at one gateway
• Decrypted at other gateway
• Keys need to be exchanged
Cloud Router
Fully distributed and managed GCP service (not a
physical device) that dynamically exchanges routes
between GCP and on-premises networks
Cloud Router
• Dynamic routing
• No route configuration needed
• Quickly picks up network changes
• Supports graceful restart
Dedicated Interconnect
Configuration Property: Choice
• Connection: Dedicated, direct connection to VPC networks
• Access Type: Internal IP addresses in RFC 1918 address space
• Capacity: 10 Gbps for each link
• Other Considerations: Must have connection in a Google supported colocation facility, either directly or through a carrier
Partner Interconnect
Configuration Property: Choice
• Connection: Dedicated bandwidth, connection to VPC network through a service provider
• Access Type: Internal IP addresses in RFC 1918 address space
• Capacity: 50Mbps - 10Gbps per connection
• Other Considerations: Service providers might have specific restrictions or requirements
Direct Peering
Configuration Property: Choice
• Connection: Dedicated, direct connection to Google’s network
• Access Type: Public IP addresses
• Capacity: 10 Gbps for each link
• Other Considerations: Must have connection in a Google supported colocation facility, either directly or through a carrier
Carrier Peering
Configuration Property: Choice
• Connection: Peering through service provider to Google’s public network
• Access Type: Public IP addresses
• Capacity: Varies based on partner offering
• Other Considerations: Requirements vary by partner


Session 5: IAM and Security
Cloud IAM
Manage identity and access control by defining who (identity)
has what access (role) for which resource.
Cloud IAM
Permission to access a resource is not granted directly to the end
user. Instead, permissions are grouped into roles, and roles are
granted to authenticated members.

• Member: GCP identity - user, group, service account


• Role: Collection of permissions
• Policy: Binding members to a role
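The member, role and policy relationship can be sketched as a lookup: a policy is a list of bindings, each binding ties members to a role, and a role expands to permissions. The policy layout mirrors the `bindings` / `role` / `members` structure of IAM policies; the member addresses are hypothetical and the permission set is only a subset of the BigQuery Data Viewer permissions listed later in this deck.

```python
# Role -> permissions. Only a subset of the real role is modelled here.
ROLE_PERMISSIONS = {
    "roles/bigquery.dataViewer": {
        "bigquery.datasets.get",
        "bigquery.tables.get",
        "bigquery.tables.getData",
        "bigquery.tables.list",
    },
}

# Policy: bindings of members to a role (hypothetical members).
policy = {
    "bindings": [
        {
            "role": "roles/bigquery.dataViewer",
            "members": ["user:analyst@example.com", "group:bi-team@example.com"],
        },
    ]
}


def is_allowed(policy, member, permission):
    """True if some binding grants the member a role containing the permission."""
    for binding in policy["bindings"]:
        role_permissions = ROLE_PERMISSIONS.get(binding["role"], set())
        if member in binding["members"] and permission in role_permissions:
            return True
    return False


print(is_allowed(policy, "user:analyst@example.com", "bigquery.tables.getData"))
print(is_allowed(policy, "user:analyst@example.com", "bigquery.tables.delete"))
```

Note that the permission is never attached to the user directly; it is reached only through the role named in a binding.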
Identity and Access Management (IAM)
• Identities: Individual Users, Groups, Service Accounts
• Access
GCP Identities
• Member types:
• Google accounts
• Service accounts
• Google groups
• G Suite domains
• Cloud Identity domains
Google account
A Google account represents a developer, an administrator, or
any other person who interacts with GCP.
Service account
A service account is an account that belongs to your application
instead of to an individual end user.
Google Group
A Google Group is a named collection of Google accounts and
service accounts. Every group has a unique email address that is
associated with the group.
G Suite domain
A G Suite domain represents a virtual group of all the Google
accounts that have been created in an organization’s G
Suite account. G Suite domains represent your organization's
Internet domain name.
Cloud Identity domain
A Cloud Identity domain is like a G Suite domain because it
represents a virtual group of all Google accounts in an
organization. However, Cloud Identity domain users don't have
access to G Suite applications and features.
allAuthenticatedUsers
Special identifier that represents anyone who is authenticated with
a Google account or a service account.
allUsers
Special identifier that represents anyone who is on the internet,
including authenticated and unauthenticated users.
Service account
A service account is an account that belongs to your application
instead of to an individual end user.

• Service account is both an identity and a resource


• Can have IAM policies attached to it to determine who can use
the service account
Identity and Access Management (IAM)
• Identities: Individual Users, Groups, Service Accounts
• Access: RBAC (Primitive, Predefined, Custom roles), ACLs
Primitive Roles
Three concentric roles that existed prior to the introduction of
Cloud IAM: Owner, Editor, and Viewer.
Primitive Roles
Role Name (Role Title): Permissions
• roles/viewer (Viewer): Permissions for read-only actions that do not affect state, such as viewing existing resources or data
• roles/editor (Editor): All viewer permissions, plus permissions for actions that modify state, such as changing existing resources
• roles/owner (Owner): All editor permissions and permissions for the following actions: manage roles and permissions for a project and all resources in project; set up billing for a project
Predefined Roles
• Project Roles
• App Engine Roles
• BigQuery Roles
• Cloud Bigtable Roles
• Cloud Billing Roles
Predefined Roles
roles/bigquery.dataViewer
bigquery.datasets.get
bigquery.datasets.getIamPolicy
bigquery.models.getData
bigquery.models.getMetadata
bigquery.models.list
bigquery.routines.get
bigquery.routines.list
bigquery.tables.export
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
resourcemanager.projects.get
resourcemanager.projects.list
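A predefined role is a named bundle of permissions, so checking whether it covers a task is a set comparison. The sketch below copies the permission list from the slide; the helper name `missing_permissions` is an assumption made for illustration and is not part of any Google library.

```python
# Permissions copied from the roles/bigquery.dataViewer listing above.
BIGQUERY_DATA_VIEWER = frozenset({
    "bigquery.datasets.get",
    "bigquery.datasets.getIamPolicy",
    "bigquery.models.getData",
    "bigquery.models.getMetadata",
    "bigquery.models.list",
    "bigquery.routines.get",
    "bigquery.routines.list",
    "bigquery.tables.export",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "resourcemanager.projects.get",
    "resourcemanager.projects.list",
})


def missing_permissions(role_permissions, required):
    """Return the required permissions that the role does not grant."""
    return sorted(set(required) - set(role_permissions))


# Reading table data is covered by the role. "bigquery.tables.update" is
# used here only as an example of a permission absent from the listing.
needed = ["bigquery.tables.getData", "bigquery.tables.update"]
print(missing_permissions(BIGQUERY_DATA_VIEWER, needed))
```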
Custom Roles
User-defined roles that bundle one or more supported
permissions to meet your specific needs.

Not maintained by Google; when new permissions, features, or
services are added to GCP, your custom roles will not be updated
automatically.
Cloud IAP
Use Identity-Aware Proxy to enable role-based access control
for web applications.

Cloud IAP works by verifying user identity and context of the
request to determine if a user should be allowed to access an
application or a VM.
Cloud IAP
Application-level access control instead of using network
firewalls to regulate access

Establishes a central authorization layer for applications
accessed by HTTPS
BeyondCorp
Google’s implementation of the zero-trust security model

Shifts access control from network perimeter to individual users
and devices

Allows employees, contractors and users to work from any
location without using VPN
Cloud Armor
Security policies and IP allow and deny lists that work with
HTTP(S) load balancing on the GCP

• Works with HTTP(S) load balancer
• Provides built-in defense against DDoS
• Used by Google Search, Gmail, YouTube
Cloud Security Scanner

Identifies security vulnerabilities in App Engine and Compute
Engine web applications
• Cross-site scripting
• Flash injection
• Mixed content
• Clear-text password
• Invalid headers
• Outdated libraries
Data Loss Prevention
Data loss prevention (DLP) is a strategy for making sure that end
users do not send sensitive or critical information outside the
corporate network

• Classification of sensitive content, whether text or images
• Redaction removes sensitive matches
• De-identification removes identifying features from data
Three Options for Encryption

• Encryption by default: completely managed by the GCP
• Customer-managed Encryption Keys (CMEK): keys reside on cloud
• Customer-supplied Encryption Keys (CSEK): keys reside on premises
Encryption by Default
• Simplest, with the least administrative overhead
• Automatically encrypted when written
• Keys and encryption managed by Google
• Using the same keystore as Google’s production services
• Most data on the GCP protected this way
CMEK Using Cloud Key
Management Service
• Keys stored in the cloud used directly by services
• Create, manage, rotate, destroy keys easily
• Used for application layer encryption in all GCP products
CSEK
• Keys on premises, used to encrypt cloud services
• Google keeps key in memory, does not write out to disk
• Provide key as a part of API calls
CSEK
• Support available for Cloud Storage and Compute Engine
• Compliance or sensitivity issues require own key managed on
premises
Google Cloud KMS
Cloud-hosted key management service for generating, using,
rotating and destroying cryptographic keys. Easiest way to
implement CMEK on GCP.
Two Categories of Keys

• Symmetric: Both encryption and decryption are performed using the
same key
• Asymmetric: Have a public/private key pair, one key for encryption
and one for decryption
Three Purposes of Keys

• Symmetric encryption: Encrypt and decrypt messages using the same key
• Asymmetric signing: Private key to sign data, public key to verify
the signature
• Asymmetric encryption: Public key to encrypt text, private key to
decrypt text
Session 6: Containers
Drawbacks of VMs
• Contain guest OS
• Introduces platform dependency
• Bloats image size to GB (apps far smaller)
• Heavyweight
• Slow to boot up
• Not trivial to migrate
• VM migration tools needed
Container
A container image is a lightweight, stand-alone, executable
package of a piece of software that includes everything needed to
run it: code, runtime, system tools, system libraries, settings
Container
• Contains applications
• And all of the application’s dependencies
• Platform independent
• Run on layer of abstraction
• Docker Runtime
Attractions of Containers
• No guest OS
• Platform independent
• Considerably smaller than VM images
• Lightweight
• Small and fast
• Quick to start
• Speeds up autoscaling
• Hybrid, multi-cloud
• Hybrid: Work on-premise and on cloud
• Multi-cloud: Not tied to any specific cloud platform
Standalone Container Limitations
• No autohealing
• Crashed containers won’t restart automatically
• Need higher level orchestration
• No scaling or autoscaling
• Overloaded containers don’t spawn more automatically
• Need higher level orchestration
• No load balancing
• Containers can’t share load automatically
• Need higher level orchestration
• No isolation
• Crashing containers can take each other down
• Need sandbox to separate them
Kubernetes
Orchestration technology for containers - convert isolated
containers running on different hardware into a cluster
IaaS vs. PaaS
Infrastructure-as-a-Service
• Heavy operational burden
• Migration is hard
Platform-as-a-Service
• Provider lock-in
• Migration is very hard
Kubernetes as Orchestrator
• Fault-tolerance
• Autohealing
• Isolation
• Scaling
• Autoscaling
• Load balancing
Google Kubernetes Engine (GKE)
• Google Kubernetes Engine
• Service for working with Kubernetes clusters on GCP
• Runs Kubernetes on GCE VM instances
Kubernetes Clusters

• Worker nodes
• Master node
Master
• One or more nodes designated as master nodes
• Unified endpoint for your cluster
• Managed by GKE, not visible directly to user
• Multi-master for high-availability
• Pulls container images from the GCR for cluster nodes
• Kubernetes Control Plane directed from here
Kubernetes Clusters

• Containers run on worker nodes
• Users interact with the master node
Nodes

Nodes are on-premises or cloud VMs on which containers are run
Node Pools

A subset of node instances that share the same configuration is
called a node pool
Node Images

Special operating system images are available on the Google Cloud
to run on Kubernetes nodes
Kubernetes does not interact directly with containers

Instead it uses a number of higher-level entities referred to
as objects
YAML Specification Files for Objects
• Current State: The current state of the object
• Desired State: The end state of the object
YAML Specification Files for Objects
Controllers in the Kubernetes cluster run reconciliation
loops to get the actual state to match the desired state
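A reconciliation loop compares the observed state with the desired state and acts on the difference. The sketch below is a toy model under stated assumptions: pods are plain strings, `reconcile` and `_pod_ids` are hypothetical names, and real Kubernetes controllers work through the API server rather than a local set.

```python
import itertools

_pod_ids = itertools.count(1)  # hypothetical source of unique pod names


def reconcile(desired_replicas, running_pods):
    """One pass of a reconciliation loop.

    Compares the observed state (running_pods) with the desired replica
    count and returns a new set of pods that matches the desired state.
    """
    pods = set(running_pods)
    while len(pods) < desired_replicas:   # too few: start replacements
        pods.add(f"pod-{next(_pod_ids)}")
    while len(pods) > desired_replicas:   # too many: stop the surplus
        pods.pop()
    return pods


pods = reconcile(3, set())        # initial rollout: three pods started
pods.discard(next(iter(pods)))    # simulate one pod crashing
pods = reconcile(3, pods)         # the next pass restores the third replica
print(len(pods))
```

The same pass handles scale-up, scale-down and healing, which is why a single desired-state declaration is enough to drive all three.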
Using the Google Kubernetes Engine almost completely
eliminates the need to explicitly configure YAML files

Simply use the web console or the gcloud command line utility
Pods on Kubernetes Nodes
• Smallest and most basic deployable object in Kubernetes
• Cannot run a container without an enclosing pod
• Pods provide isolation between containers
• Pod acts as sandbox for enclosed containers
• Multi-container pods are possible
• tightly-coupled
• not usually recommended
Higher-level Abstractions
• ReplicaSet
• Scaling and healing
• Deployment
• Versioning and rollback
• Service
• Static (non-ephemeral) IP addresses
• Stable networking
• Persistent volumes
• Non-ephemeral storage
ReplicaSet
• If pod crashes, ReplicaSet will start a new one
• Key to scaling and healing
• All pods are replicas of each other
Deployment Objects
• Easy to push out new version of container
• Triggers creation of new ReplicaSet and new containers
• Pods in old ReplicaSet gradually reduced to zero
• Every change to a Deployment object triggers creation of a new
revision
• Trivial to roll back to previous revision
• Offers versioning support
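The revision history behind versioning and rollback can be pictured as an append-only list. This is a toy model, not the Kubernetes API: the `Deployment` class here and the image tags are hypothetical, and the sketch assumes a rollback re-applies the previous image as a new revision.

```python
class Deployment:
    """Toy model of a Deployment's revision history."""

    def __init__(self, image):
        self.revisions = [image]          # revision 1

    @property
    def current_image(self):
        return self.revisions[-1]

    def update(self, image):
        """Every change records a new revision."""
        self.revisions.append(image)

    def rollback(self):
        """Re-apply the previous revision's image as a new revision."""
        if len(self.revisions) < 2:
            raise ValueError("no earlier revision to roll back to")
        self.revisions.append(self.revisions[-2])


web = Deployment("web:1.0")   # hypothetical image tags
web.update("web:1.1")
web.rollback()
print(web.current_image, len(web.revisions))
```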
Ephemeral IP Addresses
• Containers expose ports in pod spec
• Pod IP addresses are ephemeral
• Where should clients send requests?
Service Objects
• Provides stable (non-ephemeral) IP address
• Connects to set of back-end pods
• Set of pods changes dynamically
• Basic load balancing too
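The idea of a stable address in front of a changing set of pods can be sketched as follows. This is an illustration only: the `Service` class, its method names and the IP addresses are assumptions, and round-robin is used as one simple form of load balancing.

```python
class Service:
    """Toy model: one stable address in front of a changing set of pods."""

    def __init__(self, stable_ip):
        self.stable_ip = stable_ip    # stays fixed while the Service exists
        self._pod_ips = []
        self._next = 0

    def set_backends(self, pod_ips):
        """Called whenever back-end pods are created or destroyed."""
        self._pod_ips = list(pod_ips)

    def route(self):
        """Pick a back-end pod in round-robin order."""
        if not self._pod_ips:
            raise RuntimeError("no back-end pods available")
        pod_ip = self._pod_ips[self._next % len(self._pod_ips)]
        self._next += 1
        return pod_ip


svc = Service("10.0.0.10")                      # illustrative addresses
svc.set_backends(["10.4.0.5", "10.4.1.7"])
first, second = svc.route(), svc.route()
svc.set_backends(["10.4.2.9"])                  # pods were replaced
third = svc.route()
```

Clients only ever see `stable_ip`; the pod addresses behind it can change without clients noticing.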
Storage with Containers
• On disk files within a container
• Only accessible to the container itself
• Ephemeral: is lost when the container stops or crashes
• Volume abstractions
• A directory accessible to all containers in a pod
• Have the same lifetime as the enclosing pod
For durable storage use
persistent volumes
The volume is preserved even when the pod is
removed and can be handed off to another pod
Workloads on Kubernetes
To deploy and manage containerized applications on the GKE, the
Kubernetes system creates controller objects
Workloads on Kubernetes
• Stateless applications
• Do not preserve state, save no data to persistent disk
• Deployed using the Deployment object
• Stateful applications
• State is saved or persisted, uses persistent volumes
• Deployed using the StatefulSet object
• Batch jobs
• Finite, independent, parallel jobs
• Deployed using the Job object
• Daemons
• Ongoing, background tasks, without intervention
• Deployed using a DaemonSet
Session 7: Load Balancing
Scalable Compute
• User Traffic: Incoming requests from users during sale day
• Backend Service: Group of instances to service those requests
Scalable Compute
• User Traffic: Incoming requests from users during sale day
• Backend Service: Managed Instance Group
Managed Instance Groups are a horizontally scaled
IaaS offering with autohealing and autoscaling
Managed Instance Group
Group of identical GCE VM instances, created from the same
instance template that are managed by the platform
Scalable Compute with MIGs
• User Traffic: Incoming requests from users during sale day
• Backend Service: Managed Instance Group to serve those incoming
requests
Load Balancers
• Which specific instance? Individual VMs may be terminated,
restarted, overloaded
• What IP address? Need a stable IP address to send traffic to - not
ephemeral
Load Balancers
User Request -> Load Balancer -> Backend Service
• What IP address to access?
• Which specific instance services the request?
Load Balancers
• Complex service
• Many moving parts
• Basic idea
• Stable front-end IP
• Forwarding rules to funnel traffic
• Connect to backend service
• Distribute load intelligently
• Health checks to avoid unhealthy instances
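The combination of health checks and load distribution can be sketched as a selection function. This is a simplified model, not the GCP implementation: the `Instance` fields and the least-loaded rule are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool = True       # result of the most recent health check
    active_requests: int = 0   # current load on this instance


def choose_instance(instances):
    """Send the request to the least-loaded instance that passed its check."""
    candidates = [i for i in instances if i.healthy]
    if not candidates:
        raise RuntimeError("no healthy instances available")
    return min(candidates, key=lambda i: i.active_requests)


backend = [
    Instance("vm-a", healthy=True, active_requests=7),
    Instance("vm-b", healthy=False, active_requests=0),  # failed health check
    Instance("vm-c", healthy=True, active_requests=2),
]
print(choose_instance(backend).name)
```

Here `vm-b` is idle but unhealthy, so it is skipped even though it has the lowest load.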
Load Balancers on the GCP
• Fully managed, software-defined, redundant and
highly available
• Supports > 1 million queries per second with high
performance and low latency
• Put resources behind a single IP address
• Autoscaling to meet increased traffic
• Route traffic to closest VM
Load balancers on the GCP can also work with
unmanaged instance groups which offer no
autoscaling and autohealing properties
Global Load Balancing
Use when your users and instances are globally distributed.
Provides IPv4 and IPv6 termination
Regional Load Balancing
Use when instances and users are concentrated in one region
and only IPv4 termination is needed
External Load Balancing
Distributes traffic from the internet to a GCP network
Internal Load Balancing
Distributes traffic only within a GCP network
Load Balancing
• External
  • Global: HTTP/HTTPS, SSL Proxy, TCP Proxy
  • Regional: Network
• Internal
  • Regional: TCP/UDP
OSI Network Stack
The different load balancers operate at different layers of the
OSI network stack (listed from the user downwards)
• Application Layer: HTTP/HTTPS
• Presentation Layer
• Session Layer: SSL Proxy
• Transport Layer: TCP Proxy
• Network Layer: Network
• Data Link Layer
• Physical Layer
HTTP(S) Load Balancing
• Distributes HTTP(S) traffic among groups of instances
based on:
• Proximity to the user
• Requested URL
• Or both.
SSL Proxy Load Balancing
• Use only for non-HTTP(S) SSL traffic
• For HTTP(S), just use HTTP(S) load balancing
• SSL connections are terminated at the global layer
• Then proxied to the closest available instance group
TCP Proxy Load Balancing
• Allows you to use a single IP address for all users around the
world
• TCP connection terminated at the load balancing layer
• Automatically routes traffic to the instances that are closest to
the user
• More intelligent routing than network load balancing
• Better security, TCP vulnerabilities patched at the load
balancer
Network Load Balancing
• Based on incoming IP protocol data, such as address,
port, and protocol type
• Pass-through, regional load balancer - does not proxy
connections from clients
• Use it to load balance UDP traffic, and TCP and SSL
traffic
• Load balances traffic on ports that are not supported by
the SSL proxy and TCP proxy load balancers
Internal Load Balancing
• Private load balancing IP address that only your VPC
instances can access
• VPC traffic stays internal - less latency, more security
• No public IP address needed
• Internal HTTP(S) and TCP/UDP load balancing
• Useful to balance requests from your frontend to your
backend instances
HTTP(S) Load Balancing
A global, external load balancing service offered on the GCP
Internet -> Global Forwarding Rule -> Target Proxy -> URL Map ->
Backend Service -> Backend Instance Groups
• Health Check
Session Affinity
• Session affinity: All requests from same client to same server
based on either
• Client IP
• Cookie
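Client-IP affinity can be sketched by hashing the client address onto the list of backends. The source does not describe how GCP computes the mapping, so the hash choice, the function name `backend_for` and the addresses below are assumptions.

```python
import hashlib


def backend_for(client_ip, backends):
    """Map a client IP to a backend so repeat requests land on the same one."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["vm-a", "vm-b", "vm-c"]          # hypothetical instance names
first = backend_for("203.0.113.7", backends)
again = backend_for("203.0.113.7", backends)
print(first == again)
```

The mapping only holds while the backend list is unchanged, which is one reason affinity is described as an attempt rather than a guarantee.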
Backend Service Components
• Health Check: Polls instances to determine which one
can receive requests
• Backends: Instance group of VMs which may be
automatically scaled
• Session Affinity: Attempts to send requests from the
same client to the same VM
• Timeout: Time the backend service will wait for a
backend to respond
Backends
• Instance group: Can be a managed or unmanaged instance
group
• Balancing mode: Determines when the backend is at full usage
• CPU utilization, Requests per second
• Capacity setting: A % of the balancing mode which determines
the capacity of the backend
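How the balancing mode and the capacity setting combine can be shown with simple arithmetic. This sketch assumes a requests-per-second balancing mode and treats the capacity setting as a straight percentage of it; the function and parameter names are hypothetical.

```python
def backend_capacity_rps(max_rps_per_instance, instance_count, capacity_percent):
    """Requests per second the backend accepts before it counts as full."""
    if not 0 <= capacity_percent <= 100:
        raise ValueError("capacity_percent must be between 0 and 100")
    return max_rps_per_instance * instance_count * capacity_percent / 100


def is_full(current_rps, capacity_rps):
    """A backend at or above its capacity should receive no extra traffic."""
    return current_rps >= capacity_rps


# Four instances at 100 requests/second each, with a 50% capacity setting.
capacity = backend_capacity_rps(100, 4, 50)
print(capacity, is_full(150, capacity), is_full(220, capacity))
```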
Backend Buckets
• Allow you to use Cloud Storage buckets with HTTP(S) load
balancing
• Traffic is directed to the bucket instead of a backend
• Useful in load balancing requests to static content
Cloud CDN
Works with HTTP(S) load balancing to deliver content to users
from numerous worldwide caches located close to users and at
the edge of Google’s network
Cache Content Using Cloud CDN
End user -> Cloud CDN -> HTTP(S) Load Balancer -> Backend Instance

The Cloud CDN will try to deliver content from the
cache if content is present in the cache

Content will be cached on cache misses
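The hit-or-miss behaviour can be sketched with a dictionary in front of an origin fetch. This is a toy model: `EdgeCache` and `fetch_from_origin` are hypothetical names, and expiry and invalidation are left out.

```python
class EdgeCache:
    """Toy cache: serve from the cache on a hit, fill it on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # stands in for the load balancer path
        self._store = {}

    def get(self, url):
        if url in self._store:
            return self._store[url], "hit"
        body = self._fetch(url)           # cache miss: go to the backend
        self._store[url] = body           # content is cached on the miss
        return body, "miss"


origin_calls = []


def fetch_from_origin(url):
    origin_calls.append(url)
    return f"content of {url}"


cache = EdgeCache(fetch_from_origin)
first = cache.get("/logo.png")
second = cache.get("/logo.png")
print(first[1], second[1], len(origin_calls))
```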


Session 8a: Managed Instance Groups

Cloud VM Instances
• The easiest compute option to begin with
• “Lift-and-shift” migration from on-premise data center
• However, two significant drawbacks
Individual VMs do not provide autoscaling
and autohealing
Cloud VM Instances
• Individual VM instances do not provide either advantage
• Some higher level abstraction is needed to do so
Managed Instance Groups are a horizontally scaled
IaaS offering with autohealing and autoscaling
Managed Instance Group
Group of identical GCE VM instances, created from the same
instance template that are managed by the platform
Instance Template
A specification of machine type, boot disk (or container
image), zone, labels and other instance properties that can
be used to instantiate either individual VM instances or a
Managed Instance Group
Features of MIGs
• Autoscaling policies
• Load balancing
• Identification and recreation of unhealthy instances
• Rolling updates
Unmanaged Instance Group
Dissimilar VM instances that are arbitrarily grouped together after-
the-fact, usually for load balancing
Unmanaged Instance Groups
• Do not support
• Autoscaling
• Rolling updates
• Do support
• Load balancing (primary use case)
Health Checks
Components: Instance Template, Managed Instance Group, Health Check
Health Checks
• Instances that do not respond within the time period are unhealthy
• Unhealthy instances are replaced with new ones
Autoscaling Policies
Components: Instance Template, Managed Instance Group
Policy metrics: CPU Utilization, Requests/Second
Autoscaling Policies
• Check whether policy is being satisfied
• If more instances needed, add instances
• If fewer instances needed, remove instances
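The policy check can be sketched as a target-utilization calculation. This is a simplified model, not the GCP autoscaler: the proportional formula, the integer percentages and the default limits are assumptions made for illustration.

```python
def desired_instances(current_instances, current_cpu_percent,
                      target_cpu_percent, min_instances=1, max_instances=10):
    """Return how many instances keep average CPU at or below the target."""
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    total_load = current_instances * current_cpu_percent
    # Integer ceiling division avoids floating-point rounding surprises.
    needed = -(-total_load // target_cpu_percent)
    return max(min_instances, min(max_instances, needed))


print(desired_instances(4, 90, 60))   # above target: add instances
print(desired_instances(4, 30, 60))   # below target: remove instances
print(desired_instances(4, 60, 60))   # policy satisfied: no change
```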
Session 8b: Stackdriver,
Deployment Manager, Apigee,
Dataproc, Pub/Sub

Google Stackdriver
Suite of ops services providing monitoring, logging, debugging,
error reporting, tracing, alerting and profiling. Integrates with
several third-party tools
Stackdriver Suite

• Monitoring
• Logging
• Trace
• Error Reporting
• Debugging
• Profiling
Cloud Deployment Manager
Infrastructure deployment service that automates the creation and
management of Google Cloud Platform resources
Deployment Manager
• Infrastructure-as-Code (IaC)
• Declarative format (YAML) for provisioning infrastructure
• Configuration as code
• Repeatable and scalable deployments
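Besides YAML configurations, Deployment Manager also accepts templates written in Python, which makes the "configuration as code" idea concrete. The sketch below follows the template convention of a `GenerateConfig(context)` function returning a `resources` list; the property values, the image family and the `FakeContext` stand-in used to call it locally are assumptions for illustration.

```python
def GenerateConfig(context):
    """Template entry point: describe one Compute Engine VM declaratively."""
    zone = context.properties["zone"]
    machine_type = context.properties["machineType"]
    instance = {
        "name": context.env["name"] + "-vm",
        "type": "compute.v1.instance",
        "properties": {
            "zone": zone,
            "machineType": "zones/{}/machineTypes/{}".format(zone, machine_type),
            "disks": [{
                "boot": True,
                "autoDelete": True,
                "initializeParams": {
                    # Illustrative image; any public image family would do.
                    "sourceImage": "projects/debian-cloud/global/images/family/debian-11",
                },
            }],
            "networkInterfaces": [{"network": "global/networks/default"}],
        },
    }
    return {"resources": [instance]}


class FakeContext:
    """Local stand-in for the context object the service passes in."""
    env = {"name": "demo"}
    properties = {"zone": "us-central1-a", "machineType": "e2-small"}


config = GenerateConfig(FakeContext())
print(config["resources"][0]["name"])
```

The template only describes the end state; creating, updating or deleting the VM to match it is left to the service, which is what makes deployments repeatable.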
API
A way for one application to invoke code or consume data from
another application.
APIs
• Ease of access
• Code re-use
• Specialization
Apigee
A company that built a full lifecycle API management platform;
acquired by Google in 2016.
Apigee Edge
The API management platform built by Apigee, which allows
developers to build and manage API proxies.
API Proxy
A program that sits in front of your API and proxies incoming user
requests to the API and provides various value-added features.
Consuming APIs Directly
• Client Apps: Mobile, PoS Devices, Partners, Web
• Backends: Backend databases, Backend services, App servers
Apigee Edge
• Client Apps: Mobile, PoS Devices, Partners, Web
• Apigee Edge sits between the client apps and the backends
• Backends: Backend databases, Backend services, App servers
Decouple apps from APIs, control access to APIs
Apigee Use Cases
• Build API proxies easily
• Secure API calls
• Secure data
• Manage and throttle traffic
• Monetize smartly
• Set and enforce policies
Hadoop on the GCP
• HDFS for storage
• MapReduce for compute on local machines
• YARN for co-ordination
Cloud Dataproc
• Google Cloud Storage for storage
• MapReduce for compute on GCE VMs
• Dataproc service = Dataproc + YARN for co-ordination
Dataproc clusters should be created on the fly for
compute - don’t use them for storage

Store data in Cloud Storage buckets, which are cheap
and do not need VMs provisioned and running
Hadoop vs. Dataproc
Hadoop
• Clusters always provisioned and running
• HDFS runs on cluster node
• Store data on cluster nodes
• Cluster is stateful
Dataproc
• Create clusters on the fly for compute requirements
• HDFS runs on persistent disks
• Store data in Cloud Storage
• Cluster is stateless
Pub/Sub
• Many-to-many asynchronous messages
• Decouples senders and receivers
• Publishers publish messages to a topic
• Subscribers listen or subscribe to topics
• Reliable, scalable delivery
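The topic and subscription model can be sketched in memory. This is not the Cloud Pub/Sub client library: the `Broker` class and its methods are hypothetical, and acknowledgements and retries are left out.

```python
from collections import defaultdict, deque


class Broker:
    """Toy many-to-many messaging: each subscription gets its own queue."""

    def __init__(self):
        self._topics = defaultdict(dict)   # topic -> {subscription: queue}

    def subscribe(self, topic, subscription):
        self._topics[topic][subscription] = deque()

    def publish(self, topic, message):
        """Publishers never address receivers directly, only the topic."""
        for queue in self._topics[topic].values():
            queue.append(message)

    def pull(self, topic, subscription):
        """Subscribers read at their own pace (asynchronous delivery)."""
        queue = self._topics[topic][subscription]
        return queue.popleft() if queue else None


broker = Broker()
broker.subscribe("orders", "billing")
broker.subscribe("orders", "shipping")
broker.publish("orders", "order-1")
print(broker.pull("orders", "billing"), broker.pull("orders", "shipping"))
```

Both subscriptions receive the same message, and the publisher needs no knowledge of how many subscribers exist.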
Session 8c: Anthos

Anthos
A single open application platform to manage and run your
applications across on-premises and cloud environments
Anthos
Modernize applications, migrate workloads, apply policies
and security at scale with a consistent experience across
on-premises and cloud
Computing Environment
• Google Kubernetes Engine (GKE) and GKE On-Prem to
manage installations
• Common orchestration layer no matter where your clusters and
applications are located
• Manages application deployment, configuration, upgrade and
scaling
Networking Environment
• Interconnect GCP and on-premises networks
• VPN tunnels using Cloud VPN on the GCP
• Dedicated and Partner interconnects for lower latency and high
throughput
Microservices Architecture
• Monolithic applications hard to scale and not robust
• Microservices architecture involves many services
communicating over the network
• Uses the service mesh model using the open-source
implementation Istio
• Manages network inconsistencies by abstracting
communication into a separate container in the same pod as
the application
Other Components
• Anthos Service Mesh to manage Istio + additional features
• Communication between services
• Centralized configuration management using configuration as
code
• Consolidated logging and monitoring using Stackdriver
• Unified user interface for GCP and on-prem
