Microservices on AWS
© 2021 Amazon Web Services, Inc. or its affiliates. All rights reserved.
Contents
Introduction
Microservices architecture on AWS
  User interface
  Microservices
  Data store
Reducing operational complexity
  API implementation
  Serverless microservices
  Disaster recovery
  Deploying Lambda-based applications
Distributed systems components
  Service discovery
  Distributed data management
  Configuration management
  Asynchronous communication and lightweight messaging
  Distributed monitoring
  Chattiness
  Auditing
Resources
Conclusion
Document Revisions
Contributors
Abstract
Microservices are an architectural and organizational approach to software
development created to speed up deployment cycles, foster innovation and ownership,
improve maintainability and scalability of software applications, and scale organizations
delivering software and services by using an agile approach that helps teams work
independently. With a microservices approach, software is composed of small services
that communicate over well-defined application programming interfaces (APIs) that can
be deployed independently. These services are owned by small autonomous teams.
This agile approach is key to successfully scaling your organization.
Three common patterns have been observed when AWS customers build
microservices: API driven, event driven, and data streaming. This whitepaper introduces
all three approaches and summarizes the common characteristics of microservices,
discusses the main challenges of building microservices, and describes how product
teams can use Amazon Web Services (AWS) to overcome these challenges.
Due to the rather involved nature of various topics discussed in this whitepaper,
including data stores, asynchronous communication, and service discovery, the reader is
encouraged to consider specific requirements and use cases of their applications, in
addition to the provided guidance, prior to making architectural choices.
Introduction
Microservices architectures are not a completely new approach to software engineering,
but rather a combination of various successful and proven concepts such as:
• Service-oriented architectures
• API-first design
User interface
Modern web applications often use JavaScript frameworks to implement a single-page
application that communicates with a representational state transfer (REST) or RESTful
API. Static web content can be served using Amazon Simple Storage Service (Amazon
S3) and Amazon CloudFront.
Because clients of a microservice are served from the closest edge location and get
responses either from a cache or a proxy server with optimized connections to the
origin, latencies can be significantly reduced. However, microservices running close to
each other don't benefit from a content delivery network. In some cases, this approach
might actually add latency. A best practice is to implement other caching
mechanisms to reduce chattiness and minimize latencies. For more information, refer to
the Chattiness topic.
Microservices
APIs are the front door of microservices, which means that APIs serve as the entry point
for application logic behind a set of programmatic interfaces, typically a RESTful web
services API. This API accepts and processes calls from clients, and might implement
functionality such as traffic management, request filtering, routing, caching,
authentication, and authorization.
Microservices implementation
AWS has integrated building blocks that support the development of microservices. Two
popular approaches are using AWS Lambda and Docker containers with AWS Fargate.
With AWS Lambda, you upload your code and let Lambda take care of everything
required to run and scale the implementation to meet your actual demand curve with
high availability. No administration of infrastructure is needed. Lambda supports several
programming languages and can be invoked from other AWS services or be called
directly from any web or mobile application. One of the biggest advantages of AWS
Lambda is that you can move quickly: you can focus on your business logic because
security and scaling are managed by AWS. This opinionated approach is what enables the
Lambda platform to scale.
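Because Lambda manages the runtime, scaling, and availability, the deployable unit is just the handler code. The following minimal sketch illustrates this; the handler name, event shape, and field names are hypothetical:

```python
def lambda_handler(event, context):
    # Illustrative business logic only: infrastructure concerns
    # (scaling, patching, availability) are handled by Lambda.
    order_id = event.get("order_id", "unknown")
    return {"message": f"processed order {order_id}"}
```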
Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service
(Amazon EKS) eliminate the need to install, operate, and scale your own cluster
management infrastructure. With API calls, you can launch and stop Docker-enabled
applications, query the complete state of your cluster, and access many familiar features
like security groups, Elastic Load Balancing, Amazon Elastic Block Store (Amazon EBS)
volumes, and AWS Identity and Access Management (IAM) roles.
AWS Fargate is a serverless compute engine for containers that works with both
Amazon ECS and Amazon EKS. With Fargate, you no longer have to worry about
provisioning enough compute resources for your container applications. Fargate can
launch tens of thousands of containers and easily scale to run your most mission-critical
applications.
Amazon EKS runs up-to-date versions of the open-source Kubernetes software, so you
can use all the existing plugins and tooling from the Kubernetes community.
Applications running on Amazon EKS are fully compatible with applications running on
any standard Kubernetes environment, whether running in on-premises data centers or
public clouds. Amazon EKS integrates IAM with Kubernetes, enabling you to register
IAM entities with the native authentication system in Kubernetes. There is no need to
manually set up credentials for authenticating with the Kubernetes control plane. The
IAM integration enables you to use IAM to directly authenticate with the control plane
itself and provide fine-grained access to the public endpoint of your Kubernetes control
plane.
Docker images used in Amazon ECS and Amazon EKS can be stored in Amazon
Elastic Container Registry (Amazon ECR). Amazon ECR eliminates the need to operate
and scale the infrastructure required to power your container registry.
Continuous integration and continuous delivery (CI/CD) are best practices and a vital
part of a DevOps initiative that enables rapid software changes while maintaining
system stability and security. However, CI/CD is out of scope for this whitepaper. For
more information, refer to the Practicing Continuous Integration and Continuous Delivery
on AWS whitepaper.
Private links
AWS PrivateLink is a highly available, scalable technology that enables you to privately
connect your virtual private cloud (VPC) to supported AWS services, services hosted by
other AWS accounts (VPC endpoint services), and supported AWS Marketplace partner
services. You do not require an internet gateway, network address translation device,
public IP address, AWS Direct Connect connection, or VPN connection to communicate
with the service. Traffic between your VPC and the service does not leave the Amazon
network.
Private links are a great way to increase the isolation and security of microservices
architecture. A microservice, for example, could be deployed in a totally separate VPC,
fronted by a load balancer, and exposed to other microservices through an AWS
PrivateLink endpoint. With this setup, using AWS PrivateLink, the network traffic to and
from the microservice never traverses the public internet. One use case for such
isolation includes regulatory compliance for services handling sensitive data such as
PCI, HIPAA, and EU/US Privacy Shield. Additionally, AWS PrivateLink allows
connecting microservices across different accounts and Amazon VPCs, with no need
for firewall rules, path definitions, or route tables, simplifying network management.
Utilizing PrivateLink, software as a service (SaaS) providers and independent software
vendors (ISVs) can offer their microservices-based solutions with complete operational
isolation and secure access as well.
Data store
The data store is used to persist data needed by the microservices. Popular stores for
session data are in-memory caches such as Memcached or Redis. AWS offers both
technologies as part of the managed Amazon ElastiCache service.
Relational databases are still very popular to store structured data and business
objects. AWS offers six database engines (Microsoft SQL Server, Oracle, MySQL,
MariaDB, PostgreSQL, and Amazon Aurora) as managed services through Amazon
Relational Database Service (Amazon RDS).
Relational databases, however, are not designed for endless scale, which can make it
difficult and time intensive to apply techniques to support a high number of queries.
NoSQL databases have been designed to favor scalability, performance, and availability
over the consistency of relational databases. One important element of NoSQL
databases is that they typically don’t enforce a strict schema. Data is distributed over
partitions that can be scaled horizontally and is retrieved using partition keys.
Because individual microservices are designed to do one thing well, they typically have
a simplified data model that might be well suited to NoSQL persistence. It is important to
understand that NoSQL databases have different access patterns than relational
databases. For example, it is not possible to join tables. If this is necessary, the logic
has to be implemented in the application. You can use Amazon DynamoDB to create a
database table that can store and retrieve any amount of data and serve any level of
request traffic. DynamoDB delivers single-digit millisecond performance; however,
certain use cases require response times in microseconds. For those cases, Amazon
DynamoDB Accelerator (DAX) provides caching capabilities for accessing data.
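As an illustrative sketch of this access pattern, the following boto3 snippet writes and reads an item by its partition key; the table and attribute names are hypothetical:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("CustomerOrders")  # hypothetical table keyed on "order_id"

# Items need no predefined schema beyond the key attributes.
table.put_item(Item={"order_id": "o-1001", "status": "PLACED", "total": 42})

# Retrieval is by partition key; joins must happen in application code.
response = table.get_item(Key={"order_id": "o-1001"})
print(response.get("Item"))
```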
API implementation
Architecting, deploying, monitoring, continuously improving, and maintaining an API can
be a time-consuming task. Sometimes different versions of APIs need to be run to
assure backward compatibility for all clients. The different stages of the development
cycle (for example, development, testing, and production) further multiply operational
efforts.
Authorization is a critical feature for all APIs, but it is usually complex to build and
involves repetitive work. When an API is published and becomes successful, the next
challenge is to manage, monitor, and monetize the ecosystem of third-party developers
utilizing the APIs.
Amazon API Gateway addresses those challenges and reduces the operational
complexity of creating and maintaining RESTful APIs. API Gateway allows you to create
your APIs programmatically by importing Swagger definitions, using either the AWS API
or the AWS Management Console. API Gateway serves as a front door to any web
application running on Amazon EC2, Amazon ECS, AWS Lambda, or in any on-
premises environment. Basically, API Gateway allows you to run APIs without having to
manage servers.
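As a sketch of the programmatic option, the following boto3 snippet imports a REST API from a Swagger/OpenAPI definition; the file name is a placeholder:

```python
import boto3

apigateway = boto3.client("apigateway")

# Create a REST API from an existing Swagger/OpenAPI definition.
with open("orders-api-swagger.json", "rb") as f:
    api = apigateway.import_rest_api(body=f.read(), failOnWarnings=True)

print("Created API with id:", api["id"])
```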
The following figure illustrates how API Gateway handles API calls and interacts with
other components. Requests from mobile devices, websites, or other backend services
are routed to the closest CloudFront Point of Presence to minimize latency and provide
optimum user experience.
Serverless microservices
“No server is easier to manage than no server.” — AWS re:Invent
Lambda is tightly integrated with API Gateway. The ability to make synchronous calls
from API Gateway to Lambda enables the creation of fully serverless applications and is
described in detail in the Amazon API Gateway Developer Guide.
The following figure shows the architecture of a serverless microservice with AWS
Lambda where the complete service is built out of managed services, which eliminates
the architectural burden to design for scale and high availability, and eliminates the
operational efforts of running and monitoring the microservice’s underlying
infrastructure.
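With Lambda proxy integration, API Gateway passes the full HTTP request to the function and expects an HTTP-shaped response. A minimal sketch, with hypothetical route and payload:

```python
import json

def lambda_handler(event, context):
    # API Gateway (proxy integration) supplies path parameters, headers,
    # and body in "event" and expects statusCode/body in the response.
    customer_id = (event.get("pathParameters") or {}).get("id", "unknown")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"customer": customer_id}),
    }
```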
Disaster recovery
As previously mentioned in the introduction of this whitepaper, typical microservices
applications are implemented using the Twelve-Factor Application patterns. The
Processes section states that “Twelve-factor processes are stateless and share-
nothing. Any data that needs to persist must be stored in a stateful backing service,
typically a database.”
For a typical microservices architecture, this means that the main focus for disaster
recovery should be on the downstream services that maintain the state of the
application. These can be file systems, databases, or queues, for example.
When creating a disaster recovery strategy, organizations most commonly plan for the
recovery time objective and recovery point objective.
Recovery time objective is the maximum acceptable delay between the interruption of
service and restoration of service. This objective determines what is considered an
acceptable time window when service is unavailable and is defined by the organization.
Recovery point objective is the maximum acceptable amount of time since the last
data recovery point. This objective determines what is considered an acceptable loss of
data between the last recovery point and the interruption of service and is defined by
the organization.
For more information, refer to the Disaster Recovery of Workloads on AWS: Recovery
in the Cloud whitepaper.
High availability
This section takes a closer look at high availability for different compute options.
Amazon EKS runs Kubernetes control and data plane instances across multiple
Availability Zones to ensure high availability. Amazon EKS automatically detects and
replaces unhealthy control plane instances, and it provides automated version upgrades
and patching for them. This control plane consists of at least two API server nodes and
three etcd nodes that run across three Availability Zones within a region. Amazon EKS
uses the architecture of AWS Regions to maintain high availability.
Amazon ECS is a regional service that simplifies running containers in a highly available
manner across multiple Availability Zones within an AWS Region. Amazon ECS
includes multiple scheduling strategies that place containers across your clusters based
on your resource needs (for example, CPU or RAM) and availability requirements.
AWS Lambda runs your function in multiple Availability Zones to ensure that it is
available to process events in case of a service interruption in a single zone. If you
configure your function to connect to a virtual private cloud (VPC) in your account,
specify subnets in multiple Availability Zones to ensure high availability.
Deploying Lambda-based applications
The AWS Serverless Application Model (AWS SAM) is a convenient way to define
serverless applications. AWS SAM is natively supported by CloudFormation and defines
a simplified syntax for expressing serverless resources. To deploy your application,
specify the resources you need as part of your application, along with their associated
permissions policies in a CloudFormation template, package your deployment artifacts,
and deploy the template. Based on AWS SAM, SAM Local is an AWS Command Line
Interface tool that provides an environment for you to develop, test, and analyze your
serverless applications locally before uploading them to the Lambda runtime. You can
use SAM Local to create a local testing environment that simulates the AWS runtime
environment.
Service discovery
One of the primary challenges with microservice architectures is enabling services to
discover and interact with each other. The distributed characteristics of microservice
architectures not only make it harder for services to communicate, but also present
other challenges, such as checking the health of those systems and announcing when
new applications become available. You also must decide how and where to store meta
information, such as configuration data, that can be used by applications. This section
explores several techniques for performing service discovery on AWS for
microservices-based architectures.
Previously, to ensure that services were able to discover and connect with each other,
you had to configure and run your own service discovery system based on Amazon
Route 53, AWS Lambda, and ECS event streams, or connect every service to a load
balancer.
Amazon ECS creates and manages a registry of service names using the Route 53
Auto Naming API. Names are automatically mapped to a set of DNS records so that you
can refer to a service by name in your code and write DNS queries to have the name
resolve to the service’s endpoint at runtime. You can specify health check conditions in
a service's task definition and Amazon ECS ensures that only healthy service endpoints
are returned by a service lookup.
In addition, you can also use unified service discovery for services managed by
Kubernetes. To enable this integration, AWS contributed to the External DNS project, a
Kubernetes incubator project.
Another option is to use the capabilities of AWS Cloud Map. AWS Cloud Map extends
the capabilities of the Auto Naming APIs by providing a service registry for resources,
such as Internet Protocols (IPs), Uniform Resource Locators (URLs), and Amazon
Resource Names (ARNs), and offering an API-based service discovery mechanism with
a faster change propagation and the ability to use attributes to narrow down the set of
discovered resources. Existing Route 53 Auto Naming resources are upgraded
automatically to AWS Cloud Map.
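A consumer can resolve service instances through the AWS Cloud Map API rather than DNS; the following boto3 sketch uses hypothetical namespace, service, and attribute names:

```python
import boto3

cloudmap = boto3.client("servicediscovery")

# Discover healthy instances, narrowed down by a custom attribute.
response = cloudmap.discover_instances(
    NamespaceName="example.local",   # hypothetical namespace
    ServiceName="payments",          # hypothetical service
    QueryParameters={"stage": "prod"},
    HealthStatus="HEALTHY",
)

for instance in response["Instances"]:
    print(instance["InstanceId"], instance["Attributes"].get("AWS_INSTANCE_IPV4"))
```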
Third-party software
A different approach to implementing service discovery is using third-party software such
as HashiCorp Consul, etcd, or Netflix Eureka. All three examples are distributed, reliable
key-value stores. For HashiCorp Consul, there is an AWS Quick Start that sets up a
flexible, scalable AWS Cloud environment and launches HashiCorp Consul
automatically into a configuration of your choice.
Service meshes
In an advanced microservices architecture, the actual application can be composed of
hundreds, or even thousands, of services. Often the most complex part of the
application is not the actual services themselves, but the communication between those
services. Service meshes are an additional layer for handling interservice
communication, which is responsible for monitoring and controlling traffic in
microservices architectures. This enables tasks, like service discovery, to be completely
handled by this layer.
Typically, a service mesh is split into a data plane and a control plane. The data plane
consists of a set of intelligent proxies that are deployed with the application code as a
special sidecar proxy that intercepts all network communication between microservices.
The control plane is responsible for communicating with the proxies.
Service meshes are transparent, which means that application developers don’t have to
be aware of this additional layer and don’t have to make changes to existing application
code. AWS App Mesh is a service mesh that provides application-level networking to
enable your services to communicate with each other across multiple types of compute
infrastructure. App Mesh standardizes how your services communicate, giving you
complete visibility and ensuring high availability for your applications.
You can use App Mesh with existing or new microservices running on Amazon EC2,
Fargate, Amazon ECS, Amazon EKS, and self-managed Kubernetes on AWS. App
Mesh can monitor and control communications for microservices running across
clusters, orchestration systems, or VPCs as a single application without any code
changes.
Distributed data management
Building a centralized store of critical reference data that is curated by core data
management tools and procedures provides a means for microservices to synchronize
their critical data and possibly roll back state. Using AWS Lambda with scheduled
Amazon CloudWatch Events, you can build a simple cleanup and deduplication
mechanism.
It’s very common for state changes to affect more than a single microservice. In such
cases, event sourcing has proven to be a useful pattern. The core idea behind event
sourcing is to represent and persist every application change as an event record.
Instead of persisting application state, data is stored as a stream of events. Database
transaction logging and version control systems are two well-known examples of event
sourcing. Event sourcing has several benefits: state can be determined and
reconstructed for any point in time. It naturally produces a persistent audit trail and also
facilitates debugging.
The following figure shows how the event sourcing pattern can be implemented on
AWS. Amazon Kinesis Data Streams serves as the main component of the central
event store, which captures application changes as events and persists them on
Amazon S3. The figure depicts three different microservices, composed of API
Gateway, AWS Lambda, and DynamoDB. The arrows indicate the flow of the events:
when Microservice 1 experiences an event state change, it publishes an event by
writing a message into Kinesis Data Streams. Each microservice runs its own Kinesis
Data Streams application in AWS Lambda, which reads a copy of the message, filters it
based on relevancy for the microservice, and possibly forwards it for further processing.
If your function returns an error, Lambda retries the batch until processing succeeds or
the data expires. To avoid stalled shards, you can configure the event source mapping
to retry with a smaller batch size, limit the number of retries, or discard records that are
too old. To retain discarded events, you can configure the event source mapping to
send details about failed batches to an Amazon Simple Queue Service (SQS) queue or
Amazon Simple Notification Service (SNS) topic.
Amazon S3 durably stores all events across all microservices and is the single source of
truth when it comes to debugging, recovering application state, or auditing application
changes. There are two primary reasons why records may be delivered more than one
time to your Kinesis Data Streams application: producer retries and consumer retries.
Your application must anticipate and appropriately handle processing individual records
multiple times.
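One common way to handle duplicate deliveries is to make consumers idempotent, for example with a conditional write keyed on a unique event ID. A sketch of such a Lambda consumer for Kinesis, with hypothetical table and field names:

```python
import base64
import json

import boto3
from botocore.exceptions import ClientError

processed = boto3.resource("dynamodb").Table("ProcessedEvents")  # hypothetical dedup table

def lambda_handler(event, context):
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        try:
            # The conditional put fails if this event ID was seen before,
            # so reprocessing a duplicate record becomes a no-op.
            processed.put_item(
                Item={"event_id": payload["event_id"]},
                ConditionExpression="attribute_not_exists(event_id)",
            )
        except ClientError as err:
            if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
                continue  # duplicate delivery; skip
            raise
        handle_event(payload)  # application-specific processing (hypothetical)
```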
Configuration management
In a typical microservices architecture with dozens of different services, each service
needs access to several downstream services and infrastructure components that
expose data to the service. Examples could be message queues, databases, and other
microservices. One of the key challenges is to configure each service in a consistent
way to provide information about the connection to downstream services and
infrastructure. In addition, the configuration should also contain information about the
environment in which the service is operating, and restarting the application to use new
configuration data shouldn’t be necessary.
The third principle of the Twelve-Factor App patterns covers this topic: “The twelve-
factor app stores config in environment variables (often shortened to env vars or env).”
For Amazon ECS, environment variables can be passed to the container by using the
environment container definition parameter which maps to the --env option to docker
run. Environment variables can be passed to your containers in bulk by using the
environmentFiles container definition parameter to list one or more files containing
the environment variables. The file must be hosted in Amazon S3. In AWS Lambda, the
runtime makes environment variables available to your code and sets additional
environment variables that contain information about the function and invocation
request. For Amazon EKS, you can define environment variables in the env field of the
configuration manifest of the corresponding pod. Another way to provide environment
variables is through a ConfigMap.
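Regardless of whether Amazon ECS, AWS Lambda, or a Kubernetes ConfigMap injects the values, application code then reads its configuration from the environment at startup; a minimal sketch with illustrative variable names:

```python
import os

# Configuration injected by the platform; variable names are illustrative.
QUEUE_URL = os.environ["ORDER_QUEUE_URL"]                   # required setting
TABLE_NAME = os.environ.get("ORDER_TABLE_NAME", "Orders")   # with default
STAGE = os.environ.get("STAGE", "dev")

print(f"stage={STAGE} queue={QUEUE_URL} table={TABLE_NAME}")
```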
Asynchronous communication and lightweight messaging

REST-based communication
The HTTP/S protocol is the most popular way to implement synchronous
communication between microservices. In most cases, RESTful APIs use HTTP as a
transport layer. The REST architectural style relies on stateless communication, uniform
interfaces, and standard methods.
With API Gateway, you can create an API that acts as a “front door” for applications to
access data, business logic, or functionality from your backend services. API
developers can create APIs that access AWS or other web services, as well as data
stored in the AWS Cloud. An API object defined with the API Gateway service is a
group of resources and methods.
A resource is a typed object within the domain of an API and may have an associated
data model or relationships to other resources. Each resource can be configured to
respond to one or more methods, that is, standard HTTP verbs such as GET, POST, or
PUT. REST APIs can be deployed to different stages, and versioned as well as cloned
to new versions.
API Gateway handles all the tasks involved in accepting and processing up to hundreds
of thousands of concurrent API calls, including traffic management, authorization and
access control, monitoring, and API version management.
Depending on specific requirements, such as protocols, AWS offers different services
that help implement the asynchronous messaging pattern. One possible implementation
uses a combination of
Amazon Simple Queue Service (Amazon SQS) and Amazon Simple Notification Service
(Amazon SNS).
Both services work closely together. Amazon SNS enables applications to send
messages to multiple subscribers through a push mechanism. By using Amazon SNS
and Amazon SQS together, one message can be delivered to multiple consumers. The
following figure demonstrates the integration of Amazon SNS and Amazon SQS.
When you subscribe an SQS queue to an SNS topic, you can publish a message to the
topic, and Amazon SNS sends a message to the subscribed SQS queue. The message
contains the subject and body published to the topic, along with metadata, in JSON
format.
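A queue consumer therefore unwraps the SNS envelope before processing the payload; a boto3 sketch with a placeholder queue URL:

```python
import json

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"  # placeholder

messages = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
)

for msg in messages.get("Messages", []):
    envelope = json.loads(msg["Body"])  # SNS notification envelope in JSON
    payload = envelope["Message"]       # the message published to the topic
    print(envelope.get("Subject"), payload)
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```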
Another option for building event-driven architectures with event sources spanning
internal applications, third-party SaaS applications, and AWS services, at scale, is
Amazon EventBridge. A fully managed event bus service, EventBridge receives events
from disparate sources, identifies a target based on a routing rule, and delivers near
real-time data to that target, including AWS Lambda, Amazon SNS, and Amazon
Kinesis Streams, among others. An inbound event can also be customized, by input
transformer, prior to delivery.
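Publishing a custom event to EventBridge is a single API call; a boto3 sketch with hypothetical source, detail type, and payload:

```python
import json

import boto3

events = boto3.client("events")

# EventBridge rules match on fields such as source and detail-type
# and route the event to configured targets.
events.put_events(
    Entries=[
        {
            "Source": "com.example.orders",  # hypothetical source
            "DetailType": "OrderPlaced",
            "Detail": json.dumps({"order_id": "o-1001", "total": 42}),
            "EventBusName": "default",
        }
    ]
)
```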
Amazon SQS and Amazon SNS expose their own APIs, which means that if you have an
existing application that you want to migrate—for example, from an on-premises
environment to AWS—code changes are necessary. With Amazon MQ, which supports
industry-standard messaging APIs and protocols, this is not necessary in many cases.
You can use AWS Step Functions to build applications from individual components that
each perform a discrete function. Step Functions provides a state machine that hides
the complexities of service orchestration, such as error handling, serialization, and
parallelization. This lets you scale and change applications quickly while avoiding
additional coordination code inside services.
Step Functions is a reliable way to coordinate components and step through the
functions of your application. Step Functions provides a graphical console to arrange
and visualize the components of your application as a series of steps. This makes it
easier to build and run distributed services.
Step Functions automatically starts and tracks each step and retries when there are
errors, so your application executes in order and as expected. Step Functions logs the
state of each step so when something goes wrong, you can diagnose and debug
problems quickly. You can change and add steps without even writing code to evolve
your application and innovate faster.
Step Functions is part of the AWS serverless platform and supports orchestration of
Lambda functions as well as applications based on compute resources, such as
Amazon EC2, Amazon EKS, and Amazon ECS, and additional services like Amazon
SageMaker and AWS Glue. Step Functions manages the operations and underlying
infrastructure for you to help ensure that your application is available at any scale.
To build workflows, Step Functions uses the Amazon States Language. Workflows can
contain sequential or parallel steps as well as branching steps.
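A workflow is defined as a JSON document in the Amazon States Language. The following boto3 sketch creates a two-step sequential workflow with a retry policy; all ARNs and names are placeholders:

```python
import json

import boto3

sfn = boto3.client("stepfunctions")

# Two sequential Task states invoking Lambda functions.
definition = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessOrder",
            "Retry": [{"ErrorEquals": ["States.TaskFailed"], "MaxAttempts": 3}],
            "Next": "NotifyCustomer",
        },
        "NotifyCustomer": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:NotifyCustomer",
            "End": True,
        },
    },
}

sfn.create_state_machine(
    name="OrderWorkflow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsRole",  # placeholder
)
```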
Distributed monitoring
A microservices architecture consists of many different distributed parts that have to be
monitored. You can use Amazon CloudWatch to collect and track metrics, centralize
and monitor log files, set alarms, and automatically react to changes in your AWS
environment. CloudWatch can monitor AWS resources such as Amazon EC2 instances,
DynamoDB tables, and Amazon RDS DB instances, as well as custom metrics
generated by your applications and services, and any log files your applications
generate.
Monitoring
You can use CloudWatch to gain system-wide visibility into resource utilization,
application performance, and operational health. CloudWatch provides a reliable,
scalable, and flexible monitoring solution that you can start using within minutes. You no
longer need to set up, manage, and scale your own monitoring systems and
infrastructure. In a microservices architecture, the capability of monitoring custom
metrics using CloudWatch is an additional benefit because developers can decide which
metrics should be collected for each service. In addition, dynamic scaling can be
implemented based on custom metrics.
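Publishing a custom metric is a single API call; a boto3 sketch with hypothetical namespace, metric, and dimension names:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# A business-level metric that dashboards, alarms, or scaling policies
# can consume; all names are illustrative.
cloudwatch.put_metric_data(
    Namespace="Example/OrderService",
    MetricData=[
        {
            "MetricName": "OrdersPlaced",
            "Dimensions": [{"Name": "Stage", "Value": "prod"}],
            "Value": 1,
            "Unit": "Count",
        }
    ],
)
```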
In addition to Amazon CloudWatch, you can also use CloudWatch Container Insights to
collect, aggregate, and summarize metrics and logs from your containerized
applications and microservices. CloudWatch Container Insights automatically collects
metrics for many resources, such as CPU, memory, disk, and network, and aggregates
them as CloudWatch metrics at the cluster, node, pod, task, and service level, which are
surfaced in Container Insights dashboards. It also provides diagnostic information, such
as container restart failures, to help you isolate issues and resolve them quickly. You
can also set
CloudWatch alarms on metrics that Container Insights collects.
Container Insights is available for Amazon ECS, Amazon EKS, and Kubernetes
platforms on Amazon EC2. Amazon ECS support includes support for Fargate.
Another popular option, especially for Amazon EKS, is to use Prometheus. Prometheus
is an open-source monitoring and alerting toolkit that is often used in combination with
Grafana to visualize the collected metrics. Many Kubernetes components store metrics
at /metrics and Prometheus can scrape these metrics at a regular interval.
Amazon Managed Service for Prometheus (AMP) is often used in combination with
Amazon Managed Service for Grafana (AMG). AMG makes it easy to query, visualize,
alert on, and understand your metrics no matter where they are stored. With AMG, you
can analyze your metrics, logs, and traces
without having to provision servers, configure and update software, or do the heavy
lifting involved in securing and scaling Grafana in production.
Centralizing logs
Consistent logging is critical for troubleshooting and identifying issues. Microservices
enable teams to ship many more releases than ever before and encourage engineering
teams to run experiments on new features in production. Understanding customer
impact is crucial to gradually improving an application.
By default, most AWS services centralize their log files. The primary destinations for log
files on AWS are Amazon S3 and Amazon CloudWatch Logs. For applications running
on Amazon EC2 instances, a daemon is available to send log files to CloudWatch Logs.
Lambda functions natively send their log output to CloudWatch Logs, and Amazon ECS
includes support for the awslogs log driver that enables the centralization of container
logs to CloudWatch Logs. For Amazon EKS, either Fluent Bit or Fluentd can forward
logs from the individual instances in the cluster to CloudWatch Logs, where they are
centralized and combined for higher-level reporting using Amazon OpenSearch
Service and Kibana. Because of its smaller footprint and performance advantages,
Fluent Bit is recommended over Fluentd.
The following figure illustrates the logging capabilities of some of the services. Teams
are then able to search and analyze these logs using tools like Amazon OpenSearch
Service and Kibana. Amazon Athena can be used to run a one-time query against
centralized log files in Amazon S3.
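Such a one-time query can be issued through the Athena API; a boto3 sketch that assumes a table has already been defined over the log files (database, table, and output location are placeholders):

```python
import boto3

athena = boto3.client("athena")

execution = athena.start_query_execution(
    QueryString=(
        "SELECT status, COUNT(*) AS requests "
        "FROM access_logs WHERE status >= 500 GROUP BY status"
    ),
    QueryExecutionContext={"Database": "logs"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print("Query execution id:", execution["QueryExecutionId"])
```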
Distributed tracing
In many cases, a set of microservices works together to handle a request. Imagine a
complex system consisting of tens of microservices in which an error occurs in one of
the services in the call chain. Even if every microservice is logging properly and logs are
consolidated in a central system, it can be difficult to find all relevant log messages.
The central idea of AWS X-Ray is the use of correlation IDs, which are unique identifiers
attached to all requests and messages related to a specific event chain. The trace ID is
added to HTTP requests in specific tracing headers named X-Amzn-Trace-Id when
the request hits the first X-Ray integrated service (for example, Application Load
Balancer or API Gateway) and included in the response. Through the X-Ray SDK, any
microservice can read, but can also add or update this header.
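With the X-Ray SDK for Python, propagating the trace header through downstream calls can be as simple as patching supported libraries; a brief sketch (bucket and key are placeholders):

```python
import boto3
from aws_xray_sdk.core import patch_all, xray_recorder

# Patch supported libraries (for example, boto3 and requests) so outgoing
# calls carry the X-Amzn-Trace-Id header and appear as subsegments.
patch_all()

def lambda_handler(event, context):
    # Optional custom subsegment around a unit of application logic.
    with xray_recorder.in_subsegment("load-order"):
        s3 = boto3.client("s3")
        s3.get_object(Bucket="example-bucket", Key="orders/o-1001.json")  # placeholder
    return {"statusCode": 200}
```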
X-Ray works with Amazon EC2, Amazon ECS, AWS Lambda, and AWS Elastic
Beanstalk. You can use X-Ray with applications written in Java, Node.js, and .NET that
are deployed on these services.
Epsagon is a fully managed SaaS offering that includes tracing for all AWS services, third-party
APIs (through HTTP calls), and other common services such as Redis, Kafka, and
Elastic. The Epsagon service includes monitoring capabilities, alerting to the most
common services, and payload visibility into each and every call your code is making.
Amazon OpenSearch Service can be used for full-text search, structured search,
analytics, and all three in combination. Kibana is an open-source data visualization
plugin that seamlessly integrates with the Amazon OpenSearch Service.
The following figure demonstrates log analysis with Amazon OpenSearch Service and
Kibana. CloudWatch Logs can be configured to stream log entries to Amazon
OpenSearch Service in near real time through a CloudWatch Logs subscription. Kibana
visualizes the data and exposes a convenient search interface to data stores in Amazon
OpenSearch Service. This solution can be used in combination with software like
ElastAlert to implement an alerting system to send SNS notifications and emails, create
JIRA tickets, and so forth, if anomalies, spikes, or other patterns of interest are detected
in the data.
Another option for analyzing log files is to use Amazon Redshift with Amazon
QuickSight.
QuickSight can be easily connected to AWS data services, including Redshift, Amazon
RDS, Aurora, Amazon EMR, DynamoDB, Amazon S3, and Amazon Kinesis.
CloudWatch Logs can act as a centralized store for log data, and, in addition to only
storing the data, it is possible to stream log entries to Amazon Kinesis Data Firehose.
The following figure depicts a scenario where log entries are streamed from different
sources to Redshift using CloudWatch Logs and Kinesis Data Firehose. QuickSight
uses the data stored in Redshift for analysis, reporting, and visualization.
The following figure depicts a scenario of log analysis on Amazon S3. When the logs
are stored in Amazon S3 buckets, the log data can be loaded in different AWS data
services, such as Redshift or Amazon EMR, to analyze the data stored in the log stream
and find anomalies.
Chattiness
By breaking monolithic applications into small microservices, the communication
overhead increases because microservices have to talk to each other. In many
implementations, REST over HTTP is used because it is a lightweight communication
protocol, but high message volumes can cause issues. In some cases, you might
consider consolidating services that send many messages back and forth. If you find
yourself in a situation where you consolidate an increased number of services just to
reduce chattiness, you should review your problem domains and your domain model.
Protocols
Earlier in this whitepaper, in the section Asynchronous communication and lightweight
messaging, different possible protocols are discussed. For microservices it is common
to use protocols like HTTP. Messages exchanged by services can be encoded in
different ways, such as human-readable formats like JSON or YAML, or efficient binary
formats such as Avro or Protocol Buffers.
Caching
Caches are a great way to reduce latency and chattiness of microservices architectures.
Several caching layers are possible, depending on the actual use case and bottlenecks.
Many microservice applications running on AWS use ElastiCache to reduce the volume
of calls to other microservices by caching results locally. API Gateway provides a built-
in caching layer to reduce the load on the backend servers. In addition, caching is also
useful to reduce load on the data persistence layer. The challenge for any caching
mechanism is to find the right balance between a good cache hit rate, and the
timeliness and consistency of data.
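A common shape for such a cache-aside lookup against ElastiCache for Redis, using the redis-py client, is sketched below; the endpoint, key, and downstream call are hypothetical, and the TTL is the knob that trades hit rate against data timeliness:

```python
import json

import redis

cache = redis.Redis(host="my-cache.example.cache.amazonaws.com", port=6379)  # placeholder

def get_customer(customer_id):
    key = f"customer:{customer_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit avoids a downstream call
    customer = fetch_customer_from_service(customer_id)  # hypothetical downstream call
    cache.setex(key, 300, json.dumps(customer))  # short TTL bounds staleness
    return customer
```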
Auditing
Another challenge to address in microservices architectures, which can potentially have
hundreds of distributed services, is ensuring visibility of user actions on each service
and being able to get a good overall view across all services at an organizational level.
To help enforce security policies, it is important to audit both resource access and
activities that lead to system changes.
Changes must be tracked at the individual service level as well as across services
running on the wider system. Typically, changes occur frequently in microservices
architectures, which makes auditing changes even more important. This section
examines the key services and features within AWS that can help you audit your
microservices architecture.
Audit trail
AWS CloudTrail is a useful tool for tracking changes in microservices because it
enables all API calls made in the AWS Cloud to be logged and sent to either
CloudWatch Logs in real time, or to Amazon S3 within several minutes.
All user and automated system actions become searchable and can be analyzed for
unexpected behavior, company policy violations, or debugging. Information recorded
includes a timestamp, user and account information, the service that was called, the
service action that was requested, the IP address of the caller, as well as request
parameters and response elements.
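Recorded events can also be queried programmatically; a boto3 sketch that looks up recent API calls by event name ("UpdateStage", an API Gateway change, is illustrative):

```python
from datetime import datetime, timedelta

import boto3

cloudtrail = boto3.client("cloudtrail")

response = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "UpdateStage"}],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
)

for event in response["Events"]:
    print(event["EventTime"], event.get("Username"), event["EventName"])
```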
CloudTrail allows the definition of multiple trails for the same account, which enables
different stakeholders, such as security administrators, software developers, or IT
auditors, to create and manage their own trail. If microservice teams have different AWS
accounts, it is possible to aggregate trails into a single S3 bucket.
The advantages of storing the audit trails in CloudWatch are that audit trail data is
captured in real time, and it is easy to reroute information to Amazon OpenSearch
Service for search and visualization. You can configure CloudTrail to log to both
Amazon S3 and CloudWatch Logs.
When an event fires and matches a defined CloudWatch Events rule, a predefined
group of people in your organization can be notified immediately, so that they can take
the appropriate
action. If the required action can be automated, the rule can automatically trigger a built-
in workflow or invoke a Lambda function to resolve the issue.
The following figure shows an environment where CloudTrail and CloudWatch Events
work together to address auditing and remediation requirements within a microservices
architecture. All microservices are being tracked by CloudTrail and the audit trail is
stored in an Amazon S3 bucket. CloudWatch Events becomes aware of operational
changes as they occur. CloudWatch Events responds to these operational changes and
takes corrective action as necessary, by sending messages to respond to the
environment, activating functions, making changes, and capturing state information.
CloudWatch Events sits on top of CloudTrail and triggers alerts when a specific change
is made to your architecture.
Although CloudTrail and CloudWatch Events are important building blocks to track and
respond to infrastructure changes across microservices, AWS Config rules enable a
company to define security policies with specific rules to automatically detect, track, and
alert you to policy violations.
The next example demonstrates how it is possible to detect, inform, and automatically
react to non-compliant configuration changes within your microservices architecture. A
member of the development team has made a change to the API Gateway for a
microservice to allow the endpoint to accept inbound HTTP traffic, rather than only
allowing HTTPS requests.
Because this situation has been previously identified as a security compliance concern
by the organization, an AWS Config rule is already monitoring for this condition.
The rule identifies the change as a security violation, and performs two actions: it
creates a log of the detected change in an Amazon S3 bucket for auditing, and it
creates an SNS notification. Amazon SNS is used for two purposes in our scenario: to
send an email to a specified group to inform about the security violation, and to add a
message to an SQS queue. Next, the message is picked up, and the compliant state is
restored by changing the API Gateway configuration.
Resources
• AWS Architecture Center
• AWS Whitepapers
• AWS Architecture Monthly
• AWS Architecture Blog
• This Is My Architecture videos
• AWS Answers
• AWS Documentation
Conclusion
Microservices architecture is a distributed design approach intended to overcome the
limitations of traditional monolithic architectures. Microservices help to scale
applications and organizations while improving cycle times. However, they also come
with a couple of challenges that might add additional architectural complexity and
operational burden.
AWS offers a large portfolio of managed services that can help product teams build
microservices architectures and minimize architectural and operational complexity. This
whitepaper guided you through the relevant AWS services and how to implement typical
patterns, such as service discovery or event sourcing, natively with AWS services.
Document Revisions
Contributors
The following individuals contributed to this document: