
Secure Terraform Delivery Pipeline

GFT white paper
Shaping the future of digital business
Partners: Microsoft Azure
With the beginning of the cloud era, the need for automation of cloud infrastructure has become essential. Although still very young (version 0.12), Terraform has already become the leading solution in the field of Infrastructure as Code. A completely new tool in an emerging area, working in a new programming model - this brings a lot of questions and doubts, especially when handling business-essential cloud infrastructure.

01.
Why is a secure Terraform pipeline needed?

At GFT, we face the challenges of delivering Terraform deployments at scale: on top of all major cloud providers, supporting large organizations in the highly regulated environment of financial services, with multiple teams working in environments in multiple regions around the world. Automation of Terraform delivery, whilst ensuring proper security and mitigation of common risks and errors, is one of the main topics across our DevOps teams.

The goal is to create a process that allows a user to introduce changes into a cloud environment without having explicit permissions for manual actions. The process is as follows:

■ A change is reviewed and merged with a pull request after a review by the required reviewers. There is no other way to introduce the change.
■ The change is deployed to a test environment. Before that, the Terraform plan is reviewed manually and approved.
■ The change needs to be tested/approved in the test environment.
■ The Terraform plan is approved for the staging environment. The change is exactly the same as in the test environment (e.g. the same revision).
■ Terraform changes are applied to staging using a designated Terraform system account. There is no other way to use this Terraform account than in this step of the process.
■ The same procedure is followed to promote changes from staging to the production environment.

[Diagram: delivery flow across Development, Test, Stage and Prod environments. A pull-request CI build runs "terraform validate" for all environments, a Terraform plan in the dev environment and Terraform policy checks (e.g. Sentinel). After the merge into "master", the master CI build assigns a unique version number and repeats the validation and policy checks. For each of test, stage and prod, the CD pipeline runs a plan, waits for plan review and approval within a time span (e.g. 1 hour), applies the changes with that environment's designated Terraform system account and ends with an environment sign-off; stage and prod additionally run automated checks and policies (prod-like controls). Non-prod environments use non-prod deployment agents and a non-prod Terraform backend.]

01.1
Non-functional requirements

Environments
Environments (dev/uat/stage/prod) have a proper level of separation ensured:

■ Different system accounts are used for Terraform in these environments. Each Terraform system account has permissions only for its own environment.
■ Network connectivity is limited between resources across different environments.
■ (Optional) Only a designated agent or set of agents configured in a special virtual network is permitted to modify the infrastructure (i.e. run Terraform) and access sensitive resources (e.g. the Terraform backend, key vaults etc.). It is not possible to release to e.g. prod using a non-prod build agent.

System accounts in higher environments have permissions limited to only what is required in order to perform actions. Limit permissions to only the types of resources that are used. Remove permissions for deleting critical resources (e.g. databases, storage) to avoid automated re-creation of these resources and losing data. On such occasions, special permissions should be granted "just-in-time".

Process
A change to a higher environment (e.g. STAGE) can be deployed only if it was previously tested in a lower environment. There is a method to ensure that it is exactly the same Git revision that was tested. The change can only be introduced with a pull request with a required review process.

There is a way to ensure that the Terraform configuration is as similar as possible between environments (i.e. I cannot forget about a whole module in PROD as compared to UAT).

An option to apply Terraform changes can only be allowed after a manual terraform plan review and approval in each environment.

The Terraform backend in higher environments (e.g. UAT) is not accessible from local machines (network + RBAC limitation). It can be accessed only from build machines and optionally from designated bastion hosts.

System accounts for Terraform
Terraform runs with a system account rather than a user account when possible. Different system accounts are used for:

■ Terraform (a system user that modifies the infrastructure),
■ Kubernetes (a system user that is used by Kubernetes to create required resources, e.g. load balancers, or to download Docker images from the registry),
■ runtime application components (as compared to build-time or release-time).

System accounts that are permitted to apply Terraform changes can be used only in designated CD pipelines, i.e. it is not possible to use e.g. the production Terraform system account in a newly created pipeline without special permission. (Optional) Access to the Terraform system account is granted "just-in-time" for the release. Alternatively, the system account is granted permissions only for the time of deployment.
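
As a minimal sketch of keeping such system account credentials out of the code: on Azure, the azurerm provider block can be left credential-free, and the CD pipeline injects the per-environment Terraform system account through the provider's standard ARM_* environment variables (the one-service-principal-per-environment split follows the process above; the file name is hypothetical):

  # providers.tf - no credentials in code. The pipeline supplies the Terraform
  # system account of the target environment as environment variables:
  #   ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID, ARM_SUBSCRIPTION_ID
  provider "azurerm" {
    features {}
  }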

02.
Implementing a secure Terraform pipeline

TIP! Take a look at this external documentation to start setting up Terraform in CI/CD pipelines:

■ Running Terraform in Automation
■ Terraform Cloud
■ Terraform Enterprise
■ How to Move from Semi-Automation to Infrastructure as Code
■ How to Move from Infrastructure as Code to Collaborative Infrastructure as Code

02.1
Terraform backends

Having a shared Terraform backend is the first step to build the pipeline. A Terraform backend is a key component that handles shared state storage, management and locking, in order to prevent infrastructure modification by multiple concurrent Terraform processes.

Some initial documentation:
■ Terraform Backend Configuration
■ backend providers list
■ AWS S3
■ GCP Cloud Storage
■ Azure Storage Account
■ Remote backend for Terraform Cloud/Enterprise

Make sure that the backend infrastructure has enough protection. State files will contain all sensitive information that goes through Terraform (keys, secrets, generated passwords etc.).

■ The backend will most likely be AWS S3 + DynamoDB, Google Cloud Storage or an Azure Storage Account.
■ Separate the infrastructure (network + RBAC) of production and non-prod backends.
■ Plan to disable access to state files (network access and RBAC) from outside of a designated network (e.g. the deployment agent pool).
■ Do not keep the Terraform backend infrastructure with the run-time environment. Use a separate account/project/subscription etc.

Enable object versioning/soft delete options on your Terraform backends to avoid losing changes and state-files, and in order to maintain Terraform state history.

In some special cases manual access to Terraform state files will be required. Things like refactoring, breaking changes or fixing defects will require running Terraform state operations by operations personnel. For such occasions, plan extraordinary, controlled access to the Terraform state using a bastion host, VPN etc.

When using Terraform Cloud/Enterprise with a remote backend, the tool will handle the requirements for state storage.

02.2
Divide into multiple projects

Naturally, Terraform allows you to divide the structure into modules. However, you should consider dividing your entire infrastructure into separate projects. A "Terraform project" in this description is a single piece of infrastructure that can be introduced in many environments, usually with a single pipeline. Terraform projects will usually match cloud architectural patterns like Shared VPC, Landing Zone (Azure and AWS) or a hub-and-spoke network topology. There are many patterns in the AWS Well-Architected Framework, the Azure Cloud Adoption Framework and Architecture Center, or Google Cloud Solutions.

Here are some samples:

Terraform Bootstrap
This is needed when Terraform remote state-files are stored in the cloud. This is going to be a simple project that will create the infrastructure required for the backends of other projects. In general, avoid stateless projects - but this will be one of them (the old chicken-and-egg problem).
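
A minimal sketch of such a bootstrap project on Azure (all names hypothetical), creating a versioned storage account to hold the state of other projects:

  resource "azurerm_resource_group" "tfstate" {
    name     = "rg-terraform-backend"  # hypothetical
    location = "westeurope"
  }

  resource "azurerm_storage_account" "tfstate" {
    name                     = "sttfstatecompany"  # hypothetical, must be globally unique
    resource_group_name      = azurerm_resource_group.tfstate.name
    location                 = azurerm_resource_group.tfstate.location
    account_tier             = "Standard"
    account_replication_type = "GRS"

    blob_properties {
      # Blob versioning guards against losing state history
      versioning_enabled = true
    }
  }

  resource "azurerm_storage_container" "tfstate" {
    name                 = "tfstate"
    storage_account_name = azurerm_storage_account.tfstate.name
  }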
Landing Zone
Have a separate project (or projects) to set up the presence in the cloud - a network or a VPN connection, core resources, a security baseline. Building a landing zone is a separate topic. See for example https://www.tranquilitybase.io/

Shared build-time infrastructure
A piece of company infrastructure, usually global, that handles the build-time operations. For example:
■ build agent pools,
■ container registry (or registries),
■ core key vaults storing company certificates,
■ global DNS configurations etc.

Host runtime infrastructure
Usually, runtime environments have some prerequisites and pieces of infrastructure that might be shared between prod and non-prod environments, such as bastion hosts, DNS, key vaults. This is also a good place to configure deployment agent pools, separate for production and non-prod environments (you may not need a separate pool for each of dev1, dev2, uat1, uat2 etc.).

Runtime environments
Naturally, this is the infrastructure under the applications and services serving the business. Be sure that there is an environment to test Terraform scripts - not necessarily the same one in which the application is tested - in order to avoid interrupting the QA team's work when applying potentially imperfect Terraform configurations. Moreover, be prepared to divide runtime environments across teams, services and departments. It might be impossible to have a single project with the whole "company production" environment.

Some general situations that suggest dividing infrastructure into projects:

Use different system accounts for different security levels
Have a separate project when you need to use a different Terraform system account for a piece of infrastructure. Otherwise, you'll have to give very wide permissions to a single system account. Examples:
■ pieces of infrastructure across multiple projects or organizational units,
■ build-time infrastructure vs runtime infrastructure,
■ separate systems,
■ shared infrastructure vs a single system/service,
■ serving different regions.

Have a different set of environments
When you identify that for one piece of infrastructure you only need "prod" and "non-prod", while for another part you will have "dev", "uat", "stage" and "prod", then this is a sign that these pieces of infrastructure should be separate.

Build layers and overlays
If the Terraform configuration in one project grows too big, it might become challenging to handle: Terraform plans will become slower and refactoring might be very risky. It might be a good idea to divide the entire infrastructure into layers. Overlays may initially look like optional modules required only in certain environments. Here are some theoretical examples:
■ Shared networking layer - virtual networks, firewalls, VPNs,
■ Core infrastructure - compute, storage, Kubernetes,
■ Application layer - messaging services, databases, key vaults, log aggregation,
■ Monitoring overlay - custom metrics, health checks, alerting rules.

Serve different departments/systems
Pieces of infrastructure that have different change sources, or serve different business areas or departments, usually have different security levels - in such cases, access control may be treated separately, with different Terraform pipelines, release cycles etc. Try to avoid huge monolithic configurations.

TIP! When resources of multiple projects need to interact with each other and rely on each other, you can use Terraform data sources to "reach" resources from different projects. Terraform allows having multiple providers of the same type that, for example, access different projects/accounts/subscriptions. Example: use a separate provider and data source to "find" the company's global VPN gateway subnet when setting up network connectivity for the runtime environment; a sketch follows below.
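
A minimal sketch of that TIP, assuming Azure and hypothetical names: an aliased provider points at the shared subscription of another project, and a data source "reaches" the VPN gateway subnet managed there:

  # Aliased provider for the shared "host" subscription (hypothetical ID)
  provider "azurerm" {
    alias           = "shared"
    subscription_id = "SHARED-SUBSCRIPTION-ID-GOES-HERE"
    features {}
  }

  # Look up the subnet managed by the shared networking project (hypothetical names)
  data "azurerm_subnet" "vpn_gateway" {
    provider             = azurerm.shared
    name                 = "GatewaySubnet"
    virtual_network_name = "vnet-company-hub"
    resource_group_name  = "rg-shared-networking"
  }

  # The subnet id can now be used to wire up the runtime environment's connectivity
  output "vpn_gateway_subnet_id" {
    value = data.azurerm_subnet.vpn_gateway.id
  }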

An example of how to divide infrastructure into Terraform projects:

[Diagram: build-time infrastructure holds the TF backends (global, non-prod, prod), the organization Landing Zone, shared CI/CD infrastructure and tools, build agents and the container registry. The non-prod runtime infrastructure has a host project with a deployment agent pool, bastion, audit logging and shared networking, plus per-environment monitoring overlays and System A/System B instances for dev and stage. The prod runtime infrastructure mirrors this with its own host project, monitoring overlays and runtime applications for prod and DR.]

02.3
Handle environments separately

Prepare to handle multiple environments on day 1. This will be a very complicated change later and may require heavy refactoring.

■ Make sure that each environment of each project has its own state-file. Don't keep multiple environments (dev/stage/prod) in a single Terraform state-file.
■ Use different Terraform system accounts for environments. Make sure early on that the system accounts have limited permissions and cannot access each other's infrastructure.
■ Lock down access to e.g. the staging state file early. It will force you to think about building an automated and secure pipeline quickly.
■ Prepare non-prod and prod deployment agent pools early. Lock down network access to storage, key vaults, the Kubernetes API etc. to the specific virtual networks of the deployment agent pools.

Keep in mind that Terraform does not allow using variables in the provider and backend sections. A simple approach with multiple '.tfvars' files may be challenging in the long run. Some options for handling environments:

■ Use Terraform workspaces and the terraform init --backend-config option to switch backends and environments (see the sketch below),
■ Use a module and directory layout (e.g. the Terraform main-module approach) that allows handling multiple environments by making use of the module structure.
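
Since variables cannot be used in the backend block, one common option is Terraform's partial backend configuration: the block stays empty and each environment supplies its own settings at init time. A minimal sketch (the file names are hypothetical):

  terraform {
    # Intentionally empty; per-environment settings are supplied at init time, e.g.:
    #   terraform init -backend-config=env/dev.backend.hcl
    #   terraform init -backend-config=env/stage.backend.hcl
    backend "azurerm" {}
  }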
When building multiple environments, make sure that you handle the differences properly. Different environments will have different pricing tiers, VM sizes, or even some resources totally disabled (WAF, DDoS protection).

■ Use parameters and "feature flags" to enable/disable optional features, combined with the "count" and "for_each" constructs in Terraform (a sketch follows this list),
■ If you use defaults, make sure that the default is the production setting and you override the non-prod environment settings,
■ Make sure you can easily point out the differences per environment and you know how a feature changes across environments,
■ Have an explicit prod-like environment to test with 1:1 production settings, and an explicit test for that in the release procedure.
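
A minimal sketch of such a feature flag, assuming a hypothetical DDoS protection toggle whose default is the production setting:

  variable "ddos_protection_enabled" {
    description = "Feature flag; the default is the production setting, non-prod overrides it"
    type        = bool
    default     = true
  }

  # "count" works as an on/off switch for the optional resource
  resource "azurerm_network_ddos_protection_plan" "this" {
    count               = var.ddos_protection_enabled ? 1 : 0
    name                = "ddos-protection-plan"  # hypothetical
    location            = "westeurope"
    resource_group_name = "rg-network"            # hypothetical
  }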

TIP! In the pipeline’s build step always run “terraform validate” for all environments.
Make sure that this step fails if you forget to set a property. Make sure that it is not
possible to forget about an entire module in your environment.

02.4
Organize into modules

Terraform modularization is a wide topic, with the primary purpose of building a catalogue of reviewed, maintained and reusable infrastructure components. Besides the global, public Terraform Registry, companies build their own module libraries. This can be done either with the use of Terraform Cloud/Enterprise or with the use of Git repositories.

Nevertheless, even if a piece of a Terraform project does not look like a shared module, it might be worth encapsulating it into a submodule. There are major benefits:

■ large infrastructure code decomposition,
■ focusing on a single responsibility per module,
■ code readability improvement,
■ tracking the dependencies (variables and outputs) between logical pieces of infrastructure.

Like with any other programming language, modularization brings value even if the modules are an integral part of a single repository and are used just once.
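
A minimal sketch of consuming a module from a company Git-based library (the repository URL, tag and inputs are hypothetical):

  module "network" {
    # Pin the module to an immutable tag rather than a branch
    source = "git::https://git.example.com/company/terraform-modules.git//network?ref=v1.2.0"

    # Dependencies are tracked explicitly as module variables
    environment = "dev"
    vnet_cidr   = "10.0.0.0/16"
  }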

03.
Build a Terraform Pipeline with CI/CD

This will depend a lot on the tool that is available. Each CI/CD tool will have different features and approaches, more or less Terraform-specific. There are two general options:

Use a built-in mechanism of the CD tool for release control
■ Built-in package versioning
■ Secured pipeline/build artifacts
■ Release pipelines with the environment promotion process
■ Manual checks and approvals by entitled people
■ Release plans, rollback plans
■ Handling of system accounts and secrets

Follow a GitOps approach
■ Rely on Git repositories, branches and tags to control the process
■ Rely on Git access control, permissions and pull request enforcement
■ Use Git webhooks
■ Have a separate "code" repository and "delivery" repository for GitOps control

Besides Terraform Cloud/Enterprise, a popular free tool that supports GitOps Terraform control is Atlantis:
> https://www.runatlantis.io/

NOTE: Terraform Cloud/Terraform Enterprise is an opinionated solution addressing some of the aspects of implementing a secure Terraform pipeline, such as:
■ Remote backend
■ Module registry
■ Terraform plan review and approval
■ Remote Terraform execution
■ Secure handling of system account credentials

Since it can be used through its UI as well as its API and CLI interfaces, it can be used together with a CI/CD tool; however, Terraform Cloud/Enterprise is not a CI/CD tool itself.

General steps to implement a secure Terraform pipeline

01. Protect the "master" branch
Configure your Git in such a way that a direct "push" to the "master" branch is forbidden and changes are allowed only with a pull request (or merge request). Set the required steps (e.g. the Terraform "build" must pass) and the required number of approvers from a list.
> Ask: Can I put an arbitrary change into a branch that is the source of a release?

02. Build a multi-stage pipeline
If possible, build a pipeline that will visualize that a certain Terraform configuration version is promoted from one environment to another.
> Ask: Can I put something in production without testing it in staging? Can I deploy something from an arbitrary branch other than "master"?

03. Rely on versions rather than branches
When you are building your pipeline, make sure that you promote an immutable snapshot of your Terraform configuration. Avoid using branches for this. Promote a build artifact or a Git tag instead.
> Ask: Am I 100% sure that I'm deploying exactly the same version as tested before?

04. Have a Terraform "build" step
It is not obvious that Terraform code can be built. However, at least validating Terraform against the configurations of all environments and running a plan on a designated environment will allow catching obvious errors. The terraform validate action can be executed without connecting to a remote backend. It is worth doing it on the master branch as well as on a pull request. This step can also set a unique version number or a tag on the master branch version.
> Ask: How do I ensure that the Terraform code has at least proper syntax and a complete configuration?

05. Have a manual approval step for the Terraform plan
This will be one of the hardest requirements to implement in most of the tools; start from here. It ensures that the plan is reviewed and approved by a person, and that exactly this plan is applied.
> Ask: How do I review what is going to change in each environment?

06. Ensure that there is only one pending plan per environment
When a plan is waiting for approval and some other plan is applied in the meantime, the first plan is not true anymore. Prevent concurrent plans pending on a single environment.
> Ask: How do I ensure that no other changes are applied between the "plan" and "apply" steps?

07. Protect access to separate system accounts per environment
Multiple system accounts need to be prepared for different environments. The goal of the pipeline is to ensure that access to a system account is provided securely (i.e. credentials/keys are hidden/secret, write-only and not logged in the console). Only an approved pipeline should be able to use a given system account.
> Ask: Am I able to use the production Terraform system account in a non-prod pipeline or a non-prod stage?

08. Protect access to separate agent pools per environment
Agent pools for non-prod and prod deployments should be separate. Network access to services like the Kubernetes API, key vaults, sensitive storage etc. should be limited, including for the deployment build agents, and non-prod build agents must not have access to the production runtime environment. Only approved pipelines should be able to execute on the production deployment agent pool.
> Ask: Is it possible to run an arbitrary pipeline on the production deployment agent pools?

09. Have a rollback plan
Rolling changes back usually means running a release for a previous version of the Terraform configuration.
> Ask: How do I run a previous version?

10. Control user permissions to environments
Make sure that only certain people can deploy changes to certain environments. Implement a four-eye check (approval by at least 2 people) for production releases. Have control over initiating a release and over Terraform plan approval.
> Ask: Can I approve the Terraform plan in production if I am not permitted to?

11. Have just-in-time access control for Terraform
Introduce checks into the process to ensure that the production Terraform system account is available only during the time of a planned release. Alternatively, it can be granted production-level RBAC permissions only for the time of the release. This makes sure that the proper personnel have access to both the CI/CD pipeline and the cloud provider during the release. It is another method of a four-eye check. Think: multi-factor authentication for CI/CD.
> Ask: Can I perform a production release with access to the CI/CD pipeline only?

12. Run tests as part of the pipeline
Like any other piece of software, virtual infrastructure can and should be tested. The Continuous Compliance section below gives some testing guidelines. Make sure that after deployment to at least the prod-like and prod environments, there is a step that verifies compliance and security policies and runs some tests, even smoke tests. The purpose is to have quick feedback.
> Ask: Will I immediately know if my changed infrastructure is not compliant with non-functional requirements and policies?

Example of a release pipeline with the use of Azure DevOps

■ The code version is stored as an artifact and promoted from environment to environment.
■ All "Plan" and "Apply" stages are controlled by pre-stage and post-stage approvals.
■ The tool maintains the system credentials per environment and ensures access to deployment agents.

Example GitOps process with Terraform Cloud (from terraform.io)

■ There is a "delivery" repository (or repositories) to control the actual deployment to environments.
■ The "delivery" repository imports modules from "code" repositories or a module registry.
■ The release procedure starts with a pull request with a change to a given repository.
■ An automated tool prepares a Terraform plan.
■ The approval of the plan is implemented as the pull request approval; the merge triggers the actual release.

04.
Continuous compliance

With the ease and speed of introducing changes in the configuration of cloud resources comes a great risk of introducing issues, as well as breaking compliance rules or company standards. Terraform can introduce changes very quickly and have a huge impact on the infrastructure. This is why automated testing and policy verification is just as important as in any other programming platform.

The goal of infrastructure testing and continuous compliance is to ensure automated verification of infrastructure rules. For example:

■ Limit allowed resource types and locations,
■ Verify machine types and sizes,
■ Verify resource configuration (e.g. parameters, naming, tags, tiers, encryption configuration),
■ Check software versions and extensions installed on VMs,
■ Check the audit configuration applied to resources,
■ Verify IAM roles and assignments (e.g. min/max count of administrators),
■ Verify the configuration of Kubernetes deployments, like allowed images, ports, limits, naming conventions.

In general, there are two layers of Infrastructure as Code verification:

■ Pre-deployment - verification of the Terraform code or Terraform plan, more akin to static code analysis,
■ Post-deployment - verification of the resources created in the cloud environment after the Terraform configuration is applied.

Verification can mean:

■ Requirements verification
Automated verification of non-functional requirements and assumptions for the project that need to be verified continuously before the sign-off. This is similar to unit/integration testing in software development. Usually, the team that creates the Terraform scripts provides the tests as well.

■ Compliance as Code
Automated verification of company, organization or regulatory compliance policies using a set of rules. Rules can be related to resource types (e.g. forbidden services), resource locations, machine types, OS types and versions, replication options, pricing tiers/SLAs, tagging or naming conventions etc., required across a whole organization. A separate team might provide company-wide policies and compliance standards.

■ Security as Code
Automated verification of the security policies of the introduced infrastructure. Rules can be related to RBAC control, network, firewalls, cloud access control, key vaults, keys, secrets and certificates, encryption etc., required across the entire organization. Security rules may be built with the IT Security team as well as with third-party tooling.

[Diagram: company standards for Terraform, company and regulatory standards and policies, and security checks feed into requirements verification. Development with pull request code review is followed by pre-deployment policy verification, post-deployment verification and scheduled verification of the Terraform-managed infrastructure.]

04.1
Pre-deployment verification (build-time)

Terraform validate is a built-in tool. It will check the correctness of syntax, variables etc. A good idea is to also run at least a Terraform plan as a validation step, to check that it does not fail with an error. Keep in mind that running the plan against "empty" infrastructure may have different results than a plan against the previous version of the infrastructure.

There are certain tools that allow verifying Terraform code or a Terraform plan before it gets applied, with more rules than just syntax and correctness:

■ Terraform Sentinel can be used with Terraform Cloud/Terraform Enterprise. This tool allows creating a set of company-wide policies and applying them to all Terraform projects across multiple teams, to ensure that each project adheres to the rules. It verifies the actual plan before the deployment to live infrastructure and looks for disallowed resource types, configuration options etc. Think about it as static code analysis for Terraform, like SonarQube.
■ Terraform Compliance is an open-source tool using a Python BDD framework.
■ A simple tool, tflint, will also check if configuration parameters are correct in a given cloud (for example, a non-existent VM instance type). It is currently available for AWS only.
■ Forseti Terraform Validator can run Forseti rules verification against a Terraform plan file. This one is for Google Cloud only.
■ Terraform files can also be checked with a general-purpose language such as Python (HCL can simply be parsed and verified with a programming language).

04.2
Post-deployment verification (runtime)

Verification of Terraform code before it is applied brings just partial value. It will not verify items applied with custom scripts (when Terraform does not support some options), nor will it find changes introduced manually or due to an error. Therefore, it is also valuable to verify the running infrastructure.

Testing infrastructure
Each cloud provider exposes the whole infrastructure as a plain REST/JSON API (or gRPC), as well as SDKs for common programming languages:

■ AWS API Reference and AWS SDKs,
■ Azure API Reference, SDKs and the resource explorer,
■ Google Cloud API Reference and API SDKs.

It is very easy to use a language of choice and a favourite testing framework to create tests with the use of the language SDK, or even the pure REST API and tools like REST Assured or JSON Assert.

NOTE: To address continuous compliance, run verification on production-like environments and in production, always after deployment but not only then (e.g. daily). Always use a system account with read-only permissions.

Use a testing framework that will provide a nice and readable test report that can serve as a document (BDD rather than pure JUnit). The tests can be executed:

■ in a live environment (including production) to apply all requirements checks as well as security or compliance policies,
■ in a deploy → test → undeploy flow to verify the correctness of the whole Terraform configuration.

Here is an example using Kotlin and the Azure SDK (imports added for completeness; Kotest naming is assumed, and "tokenCredentials" is assumed to be prepared elsewhere):

import com.microsoft.azure.management.Azure
import io.kotest.core.spec.style.FunSpec
import io.kotest.inspectors.forAll
import io.kotest.matchers.collections.shouldContainAll
import io.restassured.RestAssured
import org.hamcrest.Matchers

class SamplePolicy : FunSpec({

    val requiredTags = listOf("system", "environment", "managed_by")

    /**
     * This is using the Azure SDK
     * ("tokenCredentials" is assumed to be prepared elsewhere)
     */
    test("All resource groups have the required tags: $requiredTags") {
        val azure = Azure.authenticate(tokenCredentials)
            .withSubscription("SUBSCRIPTION-ID-GOES-HERE")

        azure.resourceGroups().list()
            .forAll { rg ->
                rg.tags().keys.shouldContainAll(requiredTags)
            }
    }

    /**
     * This is using the pure API and RestAssured
     */
    test("Soft delete is enabled on an important Key Vault") {
        val path = "https://management.azure.com/subscriptions/SUBSCRIPTION-ID-GOES-HERE" +
            "/resourceGroups/resource-group-x" +
            "/providers/Microsoft.KeyVault/vaults/important-key-vault?api-version=2018-02-14"
        RestAssured.given()
            .header("Authorization", "TOKEN-GOES-HERE")
            .get(path)
            .then()
            .log().body(true)
            .statusCode(200)
            .body("properties.enableSoftDelete", Matchers.equalTo(true))
    }

})

Using regular programming skills, it is very easy to build a shared, parameterized set of tests with some effort. Bash scripting with a CLI might not be the best choice, because the tests will become complex. A scripting language (Python, PowerShell) should be good. Strongly-typed languages (like Kotlin) help a lot when using a cloud provider SDK, due to the IDE support.

NOTE: In this approach, a native API or SDK is used, which is usually the "source of truth". This is important because new cloud features are added to 3rd-party tools (like Terraform or InSpec) with a delay and with potential bugs, so relying on 3rd-party tools for testing may cause problems. Sometimes even the cloud CLI (bash or PowerShell) is delayed. The API is always implemented first.

04.3
Built-in cloud policy tools

Each cloud provider has a native tool to address company-wide governance policies. These are:

■ AWS Config,
■ Azure Policy + Azure Security Center,
■ Forseti Config Validator for GCP.

Cloud compliance services are sometimes provided with a set of rules mapped to industry standards such as HIPAA, ISO 27001 or CSA Benchmarks. Creating custom rules is not always easy. These tools can scope policy verification over a set of company projects/accounts/subscriptions, but not always "on-demand" during the Terraform pipeline run. One of the approaches observed in large organizations is that separate teams maintain company-wide compliance rules and infrastructure as code. This means that the infrastructure team needs to adhere to standards and policies but is not always the author of new rules. Continuous policy tools should be used in addition to infrastructure testing.

AWS Config and Control Tower

AWS Config is a service that continuously monitors and records AWS resource configurations and allows verifying overall compliance against the configurations specified in internal company guidelines. It comes with a set of around 150 pre-built managed rules, as well as an SDK for creating and testing custom AWS Config rules. Since AWS Config rules are in fact AWS Lambda functions defined in NodeJS or Python, there is a large library of rules available on GitHub. A sample fragment of rule code in Python:

if not configuration_item['configuration']['distributionConfig']['logging']['enabled']:
    return build_evaluation_from_config_item(
        configuration_item, 'NON_COMPLIANT',
        annotation='Distribution is not configured to store logs.')

AWS Config verification can be woven into the Terraform deployment pipeline as a post-release check. The results are, however, not immediate, and some coding will be required to gather an end-to-end compliance report. Therefore, this is better suited to ensuring that an implemented change still adheres to company standards than to use as a testing step.

AWS Config allows grouping the rules, together with remediation actions, into Conformance Packs (also "as code", using YAML templates) to be easily deployable into multiple accounts and regions. Sample conformance packs:

■ Operational Best Practices for Amazon S3,
■ Operational Best Practices for Amazon DynamoDB,
■ Operational Best Practices for PCI-DSS,
■ Operational Best Practices for AWS Identity and Access Management.

With the use of AWS Control Tower, it is possible to:

■ integrate AWS Config rules into an end-to-end compliance and conformance overview dashboard over a multi-account organization,
■ have an Account Factory for creating new AWS Accounts with predefined rules and settings.

In general, AWS Config is a versatile solution to handle company-wide standards compliance as code and security as code. It might not be the fastest way to implement the verification of individual solution requirements, where simple tests may be easier to maintain. Its use in a Terraform pipeline is possible as an addition to the built-in AWS continuous compliance solution, but is not necessary.

Pros:
■ Flexibility and extensibility, since the rules are actual code in Python or JavaScript,
■ An SDK for rules development and a wide set of open-source rules in addition to the built-in ones,
■ Open-source tools for a whole multi-account Compliance Engine are available, as well as integration with Control Tower.

Cons:
■ The rules code can become complex and constitute a whole programming project,
■ A rule cannot prevent the creation of a non-compliant resource (detective mode only),
■ Including asynchronous rules verification in a Terraform CD pipeline requires a complex solution.

Azure Policy and Security Center

Azure Policy is a system built on declaratively defined rules applied to all resources in the scope of the assigned policy. Azure Policies can be assigned on the Azure Subscription level as well as on the Management Group level (a group of subscriptions, e.g. the whole organization, all production subscriptions etc.). Azure provides over 1000 predefined, parameterized policies. Custom policies are defined in JSON code, and each policy consists of 3 parts:

■ Parameters - defined during the assignment,
■ Policy Rule - the "if" part of the rule,
■ Effect - the policy can either raise an alert (audit) or prevent creating a resource (deny).

A sample fragment of code of a policy:

"policyRule": {
  "if": {
    "field": "[concat('tags[', parameters('tagName'), ']')]",
    "exists": "false"
  },
  "then": {
    "effect": "modify",
    "details": {
      "roleDefinitionIds": [
        "/providers/microsoft.authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
      ],
      "operations": [
        {
          "operation": "add",
          "field": "[concat('tags[', parameters('tagName'), ']')]",
          "value": "[parameters('tagValue')]"
        }
      ]
    }
  }
}


Policies in "deny" mode work like additional validation rules, which means that a resource that does not pass the verification will not be created. In addition to that, policies can remediate some threats - e.g. automatically install required VM extensions or modify the configuration.

An Azure Policy check can be included as a post-deployment correctness check in the Azure DevOps release pipeline.

Azure Security Center is a single place to govern the results of all policy checks across the organization, as well as to group the results of different threat detection systems (network, Active Directory, VMs etc.). Since policies are verified periodically, the Security Center can address continuous compliance in Azure, providing an alerting mechanism and a verification history.

Besides individual policies, there are several predefined Policy Initiatives in Azure, for example:

■ Audit ISO 27001:2013 controls and deploy specific VM Extensions to support audit requirements (56 policy checks),
■ Audit PCI v3.2.1:2018 controls and deploy specific VM Extensions to support audit requirements (37 policy checks),
■ Audit CIS Microsoft Azure Foundations Benchmark 1.1.0 recommendations and deploy specific supporting VM Extensions (83 policy checks).

Policy Initiatives are parameterized groups of policies, assigned on the Subscription or Management Group level, and can be custom-created for company standards. There is a default Security Center initiative containing over 90 configurable policies, assigned by default to every Azure subscription.

To simplify the process of managing corporate-wide compliance, companies can also maintain Azure Blueprints. A blueprint is a combination of policies and initiatives together with default resource groups and IAM access configuration.

Azure Policies are very powerful, but the tool does not provide a developer-friendly interface for creating custom rules, especially since JSON is used as the language. Even without custom policies, the set of predefined policies is impressive and can address a wide range of compliance requirements. A complete Compliance as Code solution may combine infrastructure tests and Azure Policies.

Pros:
■ A large set of predefined policies and initiatives for industry-standard compliance requirements,
■ Built-in integration with Azure DevOps and Azure Security Center,
■ The policy can work in "deny" mode,
■ Policy management with initiatives and blueprints.

Cons:
■ Developing custom policies in JSON is hard,
■ Executing policies "on-demand" is not possible.

Forseti and Google Cloud Security Command Center

Google has open-sourced the Forseti Security project to address security rules validation and policy enforcement in Google Cloud. It is a policy-as-code system that consists of multiple modules working together:

■ the Forseti Security service, which runs in Google Cloud and takes configuration snapshots for policy monitoring,
■ Forseti Config Validator, which evaluates GCP resources against Forseti rules,
■ Forseti Terraform Validator, which verifies a Terraform plan against Forseti rules.

The Forseti Rule Library is open source and uses Rego files (the Open Policy Agent framework) to define policy rule templates. Here is a sample policy that forbids public IPs for Cloud SQL databases:

deny[{
    "msg": message,
    "details": metadata,
}] {
    asset := input.asset
    asset.asset_type == "sqladmin.googleapis.com/Instance"

    ip_config := lib.get_default(asset.resource.data.settings, "ipConfiguration", {})
    ipv4 := lib.get_default(ip_config, "ipv4Enabled", true)
    ipv4 == true

    message := sprintf("%v is not allowed to have a Public IP.", [asset.name])
    metadata := {"resource": asset.name}
}

Using policy templates only requires defining constraints in YAML code. If a policy template for the actual use case is missing, Forseti provides tools and guidelines for authoring and testing custom policies.

Setting up Forseti Security requires running dedicated infrastructure in GCP, consisting of Cloud SQL, compute and Cloud Storage. Google recommends creating a separate project to serve as a policy monitoring environment. Forseti provides a Terraform configuration for installation. The server picks up the policy configuration deployed to Google Cloud Storage and then:

■ constantly monitors policies,
■ can enforce security rules,
■ stores cloud configuration snapshots in Cloud SQL.

To increase the overall policy and security status visibility, Google offers the Security Command Center dashboard. Besides Forseti, GCSCC can use other threat detection systems as a source of vulnerability alerts, like:

■ Cloud Data Loss Prevention Discovery,
■ Anomaly Detection,
■ Event Threat Detection,
■ 3rd-party cloud security tools from Acalvio, Capsule8, Cavirin, Chef, Check Point CloudGuard Dome9, Cloudflare, CloudQuest, McAfee, Qualys, Redblaze, Redlock, StackRox, Tenable.io and Twistlock.

Forseti Security offers complete compliance-as-code tooling that can be used as part of a Terraform CD pipeline in the pre-deployment step (with Forseti Terraform Validator) as well as post-deployment (with Forseti Config Validator). Continuous compliance support with the Security Command Center and other plug-in systems adds up to a complete solution. Being a security-oriented solution, it may require significant effort to implement additional custom non-functional requirements checks; thus, it might be a good idea to combine it with the infrastructure testing approach.

Pros:
■ Support for GCP config validation as well as for Terraform validation,
■ Declarative rules language with tooling support,
■ Integration with the Security Command Center.

Cons:
■ Requires installation and maintenance of infrastructure and setup of open-source components,
■ A small library of predefined rules (around 70 templates) and a lack of industry-standard policy sets (e.g. CIS Benchmarks, HIPAA etc.),
■ Lack of preventive mode; only reactive/detective mode.

04.4
3rd-party Policy as Code tools

NOTE: Continuous compliance and security management for public cloud solutions is an emerging market for enterprise-grade solutions. Several products are available, such as Check Point CloudGuard Dome9, CloudQuest Guardian, Cavirin or Qualys.

Here are some examples of emerging open-source tools addressing the policy-as-code topic.

InSpec
Chef InSpec is an open Compliance as Code tool that can run a set of rules on top of running infrastructure. Policies are defined in a DSL that is quite descriptive and readable. Many resource definitions are available for AWS, Azure and Google Cloud. The drawbacks of this solution are that resource definitions are added to InSpec with some delay compared to the cloud provider API (just as with Terraform, or even the cloud provider's own CLI and SDK), and that some resources may be very hard to test.

Pulumi CrossGuard
Another solution that allows policy as code is Pulumi CrossGuard. It allows a more programmatic approach (JavaScript/TypeScript) over an SDK supporting AWS, Azure and GCP as well as Kubernetes. It is currently in beta and has the same dependency on a 3rd-party provider for resources.

Azure Secure DevOps Kit
The Azure Secure DevOps Kit is an open-source set of policies and rules implemented in a PowerShell-based framework, ready to be executed automatically in a pipeline or using e.g. Azure Automation. It supports Azure only and was implemented in Microsoft, but it is not an official Microsoft product. Some of the policies overlap with the built-in Azure Security Center.

Summary

The last few years have been the time of "-as-code" approaches to infrastructure, compliance, security, configuration management etc. Using software to address hardware or process problems is the most effective approach, and it became possible with hardware virtualization and the cloud.

When infrastructure configuration and scale changes are introduced in minutes rather than hours or days, and data or workloads can be placed on the wrong continent just "by accident", a totally new approach to tooling and practices is required. This will bring many more changes in the near future and, hopefully, the industry will start to standardize around well-known tools.

Stay in touch

Piotr Gwiazda
Senior Solutions Architect & Cloud Specialist, GFT Poland
E.: [email protected]

Technical Lead and Cloud Solution Architect with 12 years of experience, certified at the professional level on Google Cloud and MS Azure. Piotr leads the GFT Poland Cloud Practice. He is highly experienced in the design of multi-tier solutions for the financial sector, including global investment banks, and often acts as an Agile mentor, helping organizations build their Agile mindset.

blog.gft.com | twitter.com/gft_en | linkedin.com/company/gft-group | facebook.com/GFTGroup | gft.com

GFT Technologies SE, Schelmenwasenstr. 34, 70567 Stuttgart, Germany. T. +49 711 620 420, www.gft.com, [email protected] © GFT 2020