
Infrastructure options for serving advertising workloads (part 1)
This article explains the components that are shared across different ad tech platforms, including ad servers and bidders. The article offers you options for implementing those components.

Ad servers and bidders are often complementary platforms with overlapping technology. To avoid duplicating content, this article and its companion, part 2 (/solutions/infrastructure-options-for-data-pipelines-in-advertising), provide the shared context for the series:

Building advertising platforms (overview) (/solutions/infrastructure-options-for-building-advertising-platforms)

Infrastructure options for serving advertising workloads (part 1, this article)

Infrastructure options for data pipelines in advertising (part 2) (/solutions/infrastructure-options-for-data-pipelines-in-advertising)

Infrastructure options for ad servers (part 3) (/solutions/infrastructure-options-for-ad-servers)

Infrastructure options for RTB bidders (part 4) (/solutions/infrastructure-options-for-rtb-bidders)

For a high-level overview of the entire series and the terminology (/solutions/infrastructure-options-for-building-advertising-platforms#terminology) used throughout, see Building advertising platforms (overview) (/solutions/infrastructure-options-for-building-advertising-platforms).

Platform considerations

When dealing with either the buy or sell side of the platforms, consider the following:

Compute platform: Programmatic advertising platforms comprise several services, where each service offers one or more functions. Decide early if you can containerize some or all of these functions, or if the service must run directly on virtual machine (VM) instances.

Geographic locations: Deploying your infrastructure close to your customers and providers helps reduce networking latency.


Reproducibility: When you replicate a system in different regions across the globe, the ability to deploy the same infrastructure everywhere keeps the platform consistent.

Load balancing: A single machine cannot handle ad-tech loads. Distribute both
internal and external requests across multiple servers.

Autoscaling: Ad-request loads fluctuate over the course of the day. You can reduce costs and increase availability by automatically scaling your system up and down.

Network communication: A distributed system raises communication questions. For example, suppose that you are bidding in Oregon but your campaign management database is in Europe. Even if the communication consists of offline synchronization, you probably don't want to communicate over the public internet.

Compute platform

Google Cloud offers several options for running your computational workloads. Consider
the following options:

App Engine (/appengine) for running a web user interface (UI) minus most of the
operational overhead.

Compute Engine (/compute) for installing and managing some relational databases or custom code not supported by App Engine.

Google Kubernetes Engine (GKE) (/kubernetes-engine) for setting up stateless frontends or for running containerized applications.

These options are recommendations and are often interchangeable. Your requirements are
ultimately the deciding factor, whether that factor is cost, operational overhead, or
performance.

Compute Engine and GKE both support preemptible VMs (/preemptible-vms), which are often
used in ad-tech workloads to save on cost. Preemptible VMs can be preempted with only a
one-minute warning, however, so you might want to do the following:

If you use Compute Engine, two different managed instance groups (one preemptible
and the other with standard VMs) can reside behind the same load balancer. By
making sure that one group consists of standard VMs, you ensure that your frontend
is always able to handle incoming requests. The following diagram shows this
approach.


Diagram: a standard managed instance group and a preemptible managed instance group, both on Compute Engine, behind a single Cloud Load Balancing anycast IP.

If you use GKE, you can balance cost and availability by creating both non-preemptible and preemptible node pools (/kubernetes-engine/docs/how-to/preemptible-vms) in your GKE cluster.

Geographic locations

Advertisers might want to target customers in all regions around the globe. Adding a few extra milliseconds to one of the platform's UI frontends won't degrade your advertisers' experience when, for example, they are visualizing performance data. Be careful, however, if additional networking distance adds a few extra milliseconds to the bidding response. Those few milliseconds might prove to be the difference in whether the advertiser's bid gets accepted and their ad gets served to the customer.

Google Cloud has a presence in several regions (/about/locations), including us-east, us-west, europe-west, asia-southeast, and asia-east. Each region includes multiple zones (/compute/docs/regions-zones) to offer both high availability and scale.

If latency is critical, you might want to distribute some of your services across those zones in different regions. You can customize your setup based on your needs. For example, you could decide to have some frontend servers in us-east4 and us-west1, but have data stored in a database in us-central. In some cases, you could replicate some of the database data locally onto the frontend servers; alternatively, you might consider a multi-regional Cloud Spanner instance (/spanner/docs/instances#multi-region_configurations).

Reproducibility

Reproducibility offers simpler maintenance and deployment, and having the platform run in all relevant geographic locations is key to returning bids before the critical deadline. To ensure reproducibility, every region must perform similar work. The main difference between regions is the workload, and how many machines and zones are required to scale to meet the regional demand.

With Compute Engine, instance templates (/compute/docs/instance-templates) serve as the basis for setting up similar regional managed instance groups (/compute/docs/instance-groups). These groups can be located in different regions for proximity to the SSPs, and they can span multiple zones for high availability. The following diagram shows what this process looks like.

Diagram: a single instance template used to create bidder managed instance groups on Compute Engine in both us-west1 and us-east1.

Containers offer a higher level of abstraction than machine virtualization. Kubernetes facilitates application reproducibility natively through YAML configuration files that can define pods (https://kubernetes.io/docs/concepts/workloads/pods/pod/), services (https://kubernetes.io/docs/concepts/services-networking/service/), and deployments (https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) with consistency across the different clusters.

Load balancing

Two main scenarios require load balancing:

For external tra c, such as ad or bid requests, you can use Cloud Load Balancing
 (/load-balancing), which provides HTTP(S) and Network load balancing capabilities.
With no pre-warming and its global load balancing and single anycast IP
 (/load-balancing/docs/load-balancing-overview#types-of-cloud-load-balancing), your system
can receive millions of requests per second from anywhere in the world.

For internal tra c between services, Google Cloud provides Internal load balancing
 (/load-balancing/docs/internal).

If you decide to use Kubernetes for some parts of the infrastructure, we recommend that you use GKE. Some Kubernetes features might need some extra implementation if your provider does not natively support them. With GKE, Kubernetes can use native Google Cloud features:

Creating an Ingress (https://kubernetes.io/docs/concepts/services-networking/ingress/) or Istio Ingress Gateway (https://istio.io/docs/tasks/traffic-management/ingress/ingress-control/) uses HTTP load balancing.

Creating a LoadBalancer-type service (https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer) uses Network load balancing (/load-balancing/docs/network).

GKE also supports container-native load balancing (/kubernetes-engine/docs/how-to/container-native-load-balancing) to minimize networking time and possible extra networking hops. At a high level, the load balancer prevents a request from being routed to an instance that does not host a pod of the requested service.

Scaling

Because your platform must be able to parse and process billions of ad requests per day, load balancing is a must; no single machine could handle that task alone. The number of requests also tends to change throughout the day, meaning that your infrastructure must be able to scale up and down based on demand.

If you decide to use Compute Engine, you can create autoscaling managed instance groups (/compute/docs/autoscaler) from instance templates (/compute/docs/instance-templates). You can then scale those groups based on metrics such as HTTP load, CPU utilization, or Cloud Monitoring custom metrics (for example, application latency), or on a combination of these metrics.

Autoscaling decisions are based on the metric average over the last ten minutes and are
made every minute using a sliding window. Every instance group can have its own set of
scaling rules.

If you decide to use GKE, you can use the Kubernetes Cluster Autoscaler (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md), which GKE implements natively as the GKE cluster autoscaler (/kubernetes-engine/docs/concepts/cluster-autoscaler). The GKE cluster autoscaler behaves differently than the Compute Engine autoscaler: it spins up new nodes when new pods can no longer be scheduled on the existing nodes due to a shortage of CPU or memory on the underlying nodes. Scaling down works automatically when CPU or memory is released again.


Network communication

Virtual Private Clouds (VPCs) can span multiple regions. In other words, if you have
database read replicas in us-east and a master database in asia-southeast within the
same VPC, they can securely communicate using their private IPs or hostnames without
ever leaving the Google network.

In the following diagram, all instances are in the same VPC and can communicate directly without the need for a VPN, even if they are in different regions.

Diagram: a single VPC (10.0.0.0/16) that spans us-central1 (database on Compute Engine), us-west1, and us-east1 (read replicas and bidders on Compute Engine), with Cloud Load Balancing in front of the bidders.

GKE clusters are assigned a VPC when they are created and can use many of those existing
networking features.

Google also offers two types of networking infrastructure:


Premium (/network-tiers/docs/overview#premium_tier): Uses the Google global private network. Prioritize this option for critical workloads such as cross-region database replication.

Standard (/network-tiers/docs/overview#standard_tier): Prioritize this option if you are price conscious and can use the public internet.

When you use managed products such as Cloud Bigtable, Cloud Storage, or BigQuery, Google Cloud provides Private access (/vpc/docs/configure-private-google-access) to those products through the VPC.

User frontend

Your user frontend is important, but it requires the least technical overhead because it handles much smaller workloads. The user frontend offers platform users the ability to administer advertising resources such as campaigns, creatives, billing, or bids. The frontend also offers the ability to interact with reporting tools—for example, to monitor campaigns or ad performance.

Both of these features require web serving to provide a UI to the platform user, and
datastores to store either transactional or analytical data.

Web serving

Your advertising frontend likely needs to:

Offer high availability.

Handle hundreds of requests per second.

Be globally available at an acceptable latency to offer a good user experience.

Provide a UI.

Your interface likely includes functionality such as a dashboard and pages to set up
advertisers, campaigns, and their related components. The UI design itself is a separate
discipline and beyond the scope of this article.

To minimize technical overhead, we recommend using App Engine (/appengine) as a frontend platform. That choice reduces the time you spend managing your website infrastructure. If you require a custom runtime, consider Custom Runtimes (/appengine/docs/flexible/custom-runtimes). Alternatively, you can use GKE or Compute Engine if your preferred application stack has other requirements.

Datastores

There are two datastore types in user frontends:

Administration datastores that require online transaction processing (OLTP) databases. Options are detailed in the metadata management store (#metadata_management_store) section.

Reporting datastores that require online analytical processing (OLAP) databases. Options are detailed in the reporting/dashboarding store (#reportingdashboarding_store) section.

Handling requests

Frontends

Requests are sent for processing to an HTTP(S) endpoint that your platform provides. The key components are as follows:

A load balancer able to process several hundreds of thousands of queries per second (QPS).

A pool of instances that can scale up and down quickly based on various KPIs.

Optionally, an API layer that throttles or authenticates requests to the endpoint.

Both Compute Engine and GKE are good options as computing platforms:

Compute Engine uses Cloud Load Balancing (#load_balancing) and managed instance groups that are mentioned in the scaling (#scaling) section.

GKE uses Cloud Load Balancing and Ingress (or Istio Ingress Gateway), Horizontal
Pod Autoscaler, and Cluster Autoscaler.

Because pod scaling is faster than node scaling, GKE might offer faster autoscaling on a per-service level. GKE also supports container-native load balancing (/kubernetes-engine/docs/how-to/container-native-load-balancing) to optimize request routing directly to an instance that hosts a pod of the relevant service.

Throttling and authentication can be managed with technologies like Apigee (https://apigee.com/) or Service Infrastructure (/service-infrastructure/docs/overview).


Parsing

Ad requests are commonly formatted in JSON or protobuf format with information such as IP address, user agent, or site category. It's critical to extract this data, which might also contain details on the (unique) user, to then retrieve segments to select and filter ads.
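To make the extraction step concrete, the following sketch parses a simplified JSON ad request in Python. The field names (ip, user_agent, site_category, user_id) are illustrative assumptions, not a real schema such as OpenRTB.

```python
import json

def parse_ad_request(raw_body: bytes) -> dict:
    """Extract the targeting attributes that downstream ad selection needs.

    The field names below are assumptions for illustration; real payloads
    (for example, OpenRTB bid requests) define their own schema.
    """
    request = json.loads(raw_body)
    return {
        "ip": request.get("ip"),
        "user_agent": request.get("user_agent"),
        "site_category": request.get("site_category"),
        # A (unique) user identifier, used later to retrieve segments.
        "user_id": request.get("user_id"),
    }

if __name__ == "__main__":
    body = b'{"ip": "203.0.113.7", "user_agent": "Mozilla/5.0", "site_category": "news", "user_id": "u-123"}'
    print(parse_ad_request(body))
```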

Static filtering

Some requests, typically received on the buyer side, can be discarded by making use of static rules. Such early filtering can reduce the amount of data and the complex processing that is required downstream.

Rules might be publisher blacklists or content-type exclusions. During initialization, the workers can pull and load these rules from a file hosted on Cloud Storage (/storage).
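As a sketch of that initialization step, the following Python snippet loads a JSON rules file from Cloud Storage with the google-cloud-storage client and applies it to incoming requests. The bucket name, object name, and rule format are assumptions for illustration.

```python
import json
from google.cloud import storage  # pip install google-cloud-storage

def load_static_rules(bucket_name: str, blob_name: str) -> dict:
    """Pull the static filtering rules once, at worker initialization."""
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    return json.loads(blob.download_as_bytes())

def passes_static_filters(request: dict, rules: dict) -> bool:
    """Discard requests from blocked publishers or excluded content types."""
    if request.get("publisher_id") in rules.get("blocked_publishers", []):
        return False
    if request.get("content_type") in rules.get("excluded_content_types", []):
        return False
    return True

# Hypothetical bucket and object names; rules are kept in memory afterwards.
RULES = load_static_rules("my-adtech-config", "filters/static-rules.json")
```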

Ad selection

Ad selection can be performed in various services or platforms, including the publisher ad server, the advertiser ad server, or the DSP. There are different levels of complexity when selecting an ad:

Some selections can be as simple as selecting an ad for a specific category of the publisher's website or page. In this case, prices do not differ on a per-ad basis.

More advanced selections incorporate user attributes and segments and potentially
involve machine learning–based ad-recommendation systems.

RTB systems usually make the most complex decisions. Ads are selected based on
attributes such as (unique) user segments and previous bid prices. The selection also
includes a bid calculation to optimize the bid price on a per-request basis.

Choosing the relevant ad is the core function of your system. You have many factors to consider, including advanced rules-based or ML-based selection algorithms. This article, however, continues to focus on the high-level process and the interactions with the different datastores.

The ad selection process consists of the following steps (a simplified sketch follows the list):

1. Retrieve the segments associated with the targeted user from the (unique) user profile store (#(unique)_user_profile_store).

2. Select the campaigns or ads that are a good match with the user's segments. This selection requires reading metadata from the metadata management store (#metadata_management_store), which is why this store requires you to implement one of the heavy-read storing patterns (#heavy-read-storing_patterns).

3. Filter the selected campaigns or ads based on metrics, for example the remaining budget, stored in one of the context stores (#context_stores).

4. Select the ad.
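The following Python sketch walks through these four steps with small in-memory stand-ins for the stores described later in this article; in a real deployment each lookup would hit the (unique) user profile store, the metadata management store, and a context store.

```python
from typing import Optional

# In-memory stand-ins for the stores described in this article.
USER_SEGMENTS = {"u-123": {"sports", "travel"}}
ADS = [
    {"id": "ad-1", "campaign_id": "c-1", "segments": {"sports"}, "priority": 10},
    {"id": "ad-2", "campaign_id": "c-2", "segments": {"finance"}, "priority": 20},
]
REMAINING_BUDGET = {"c-1": 12.50, "c-2": 0.0}

def select_ad(user_id: str) -> Optional[dict]:
    # 1. Retrieve the segments associated with the targeted user
    #    ((unique) user profile store).
    segments = USER_SEGMENTS.get(user_id, set())

    # 2. Select the campaigns or ads that match those segments
    #    (metadata management store, served through a heavy-read pattern).
    candidates = [ad for ad in ADS if ad["segments"] & segments]

    # 3. Filter the candidates against counters such as remaining budget
    #    (context store).
    eligible = [ad for ad in candidates
                if REMAINING_BUDGET.get(ad["campaign_id"], 0) > 0]

    # 4. Select the ad. A bidder would also compute a bid price at this step.
    return max(eligible, key=lambda ad: ad["priority"], default=None)

print(select_ad("u-123"))  # -> ad-1
```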

Bidders have more steps related to bids and auctions, and they have stricter latency requirements. For more details about bidder requirements during ad selection, see Infrastructure options for RTB bidders (part 4) (/solutions/infrastructure-options-for-rtb-bidders).

Heavy-read storing patterns

Most of the decisions made when selecting an ad require intelligent data that:

Is read in milliseconds, sometimes sub-milliseconds.

Is written as soon as possible, especially for time-sensitive counters.

Is written often as part of an offline process that uses background analytical or machine learning tasks.

How you choose your datastore depends on how you prioritize the following requirements:

Minimizing read or write latencies: If latency is critical, you need a store that is close
to your servers and that can also handle fast reads or writes at scale.

Minimizing operational overhead: If you have a small engineering team, you might
need a fully managed database.

Scaling: To support either millions of targeted users or billions of events per day, the
datastore must scale horizontally.

Adapting querying style: Some queries can use a specific key, whereas others might need to retrieve records that meet a different set of conditions. In some cases, query data can be encoded in the key. In other cases, the queries need SQL-like capabilities.

Maximizing data freshness: Some counters, such as budget, must be updated as quickly as possible. Other data, such as the audience segment or counters (for example, daily caps), can be updated at a later time.

Minimizing costs: It might not always be economical or practical to handle billions of reads and writes every day with global strong consistency in a single database, even if doing so minimizes operational overhead.


There are different options to address heavy-read requirements. These include read
replicas, caching systems, in-memory NoSQL databases, and managed wide-column
NoSQL databases.

RDBMS read replicas

When using Cloud SQL (/sql) (or an equivalent RDBMS installed and managed on Compute Engine), you can offload the reads from the master instance. Many databases natively support this feature. Workers could query needed information by:

Using read replicas that match the number of workers.

Using a pooling proxy.

The following diagram shows what this process looks like.

Diagram: a rule database on Cloud SQL replicated to read replicas on Compute Engine, with one read replica serving each frontend worker.

Read replicas are designed to serve heavy read traffic, but scalability is not linear and performance can suffer with a larger number of replicas. If you need either reads or writes that can scale, with global consistency and minimum operational overhead, then consider using Cloud Spanner (/spanner).

Local caching layer

You can use a caching layer such as Redis on Compute Engine with optional local replicas on the workers. This layer can greatly reduce both read and networking latency. The following diagram shows what this layer looks like.


Diagram: a rule database on Cloud SQL feeding a NoSQL master on Compute Engine, with local NoSQL replicas colocated with the frontends.
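A minimal cache-aside sketch of this layer in Python, assuming a local or regional Redis instance (the redis-py client) in front of the authoritative rule database; the key layout, TTL, and database call are assumptions for illustration.

```python
import json
import redis  # pip install redis

# Local or regional Redis instance acting as the caching layer.
cache = redis.Redis(host="localhost", port=6379)

def fetch_rules_from_database(campaign_id: str) -> dict:
    # Placeholder for the authoritative read (for example, a Cloud SQL query).
    return {"campaign_id": campaign_id, "blocked_categories": ["adult"]}

def get_campaign_rules(campaign_id: str, ttl_seconds: int = 60) -> dict:
    """Cache-aside read: serve from Redis, fall back to the rule database."""
    key = f"rules:{campaign_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    rules = fetch_rules_from_database(campaign_id)
    cache.set(key, json.dumps(rules), ex=ttl_seconds)
    return rules
```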

If you decide to use Kubernetes in this case, look at DaemonSet (https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) and affinity settings to make sure that:

You limit the amount of replicated data.

Data remains close to a serving pod.

In-memory key-value NoSQL

Deploy an in-memory database such as Aerospike or Redis to provide fast reads at scale. This solution can be useful for regional data and counters. If you are concerned about the size of the stored data structures, you can also use in-memory databases that can write to SSD disks. The following diagram shows what this solution looks like.


Diagram: a rule database on Cloud SQL feeding an Aerospike cluster on Compute Engine, read by frontends running on Google Kubernetes Engine.

Managed wide-column NoSQL

Wide-column datastores are key-value stores that can provide fast reads and writes at scale. You can install a common open source database such as Cassandra or HBase.

If you decide to use such a store, we recommend using Cloud Bigtable (/bigtable) to minimize operational overhead. These stores enable you to scale your input/output operations (IOPS) linearly with the number of nodes. With a proper key design, ad selectors can read, and the data pipeline can write, at single-digit-millisecond speeds to the first byte on petabytes of data. The following diagram shows what this process looks like.

Diagram: intelligence tables stored in Cloud Bigtable, read by frontends running on Google Kubernetes Engine.
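As a sketch of such a point read with the google-cloud-bigtable Python client, the snippet below looks up the segments for one (unique) user. The project, instance, table, column family, and row-key design are assumptions for illustration.

```python
from google.cloud import bigtable  # pip install google-cloud-bigtable

# Hypothetical project, instance, and table names.
client = bigtable.Client(project="my-project")
table = client.instance("serving-instance").table("user-segments")

def read_user_segments(user_id: str) -> list:
    """Single-row point read keyed on the (unique) user ID."""
    row = table.read_row(user_id.encode("utf-8"))
    if row is None:
        return []
    # Assumed layout: column family "profile", qualifier "segments",
    # storing a comma-separated list of segment names.
    cells = row.cells.get("profile", {}).get(b"segments", [])
    return cells[0].value.decode("utf-8").split(",") if cells else []
```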


Static object storage

For static data that can be saved in protobuf, Avro, or JSON format, workers can load it from Cloud Storage (/storage) during initialization and keep the content in RAM. The following diagram shows what this process looks like.

Diagram: frontends on Compute Engine loading static objects from Cloud Storage at initialization.

There is no one-size-fits-all solution. Choose between the solutions based on your priorities, and balance latency, cost, operations, read/write performance, and data size.

Solution | Latency | Cost | Operational overhead | Read/write performance | Data size
RDBMS read replicas | Milliseconds | Service- or compute-based | High | Limited | Limited
Cloud Spanner | Milliseconds | Service-based | Low | Linear scale with number of nodes | Petabytes
In-memory stores | Sub-millisecond | Compute-based | High | Scales with number of nodes | Scales with number of nodes
Cloud Bigtable | Single-digit milliseconds | Service-based | Low | Linear scale with number of nodes | Petabytes

Advertising datastores

This section covers data storage options that address one of three different scenarios:

Ad-serving stores are used by services concerned with ad selection. Serving workloads requires low latency and the ability to handle billions of reads per day. Data size depends on the type of data.


Analytical stores are used offline through ad hoc queries or batch data pipelines. They support hundreds of terabytes of data stored daily.

Reporting/dashboard stores can be used for pre-made dashboards, time series, or custom reporting. These options differentiate your frontend, enabling your platform users to quickly gain insights and visualize how their business is performing.

Ad-serving stores can be further broken down, as follows:

The metadata management store (#metadata_management_store) is used when selecting relevant campaigns and ads. The user frontend (#user_frontend) creates and updates data. This data requires persistence.

The (unique) user profile store (#unique_user_profile_store) is used when profiling the (unique) user to match the user with an ad. Data is updated mostly using events (/solutions/infrastructure-options-for-data-pipelines-in-advertising#event_lifecycle) (in part 2). This data requires persistence.

The serving context store (#serving_context_store) is used to filter ads and campaigns based on multiple counters such as budget or frequency caps. Data is updated using events. This data is frequently overwritten.

Metadata management store

The metadata management store contains the reference data to which you apply rules when making the ad selection. Some resources stored here are specific to a platform, but others might overlap:

For a sell-side ad server, publishers manage data about campaigns, creatives, advertisers, ad slots, and pricing. Some frontends might also grant access to their buyers.

For a buy-side ad server, buyers manage data about campaigns, creatives, advertisers,
and pricing. Advertisers can often update this data themselves through a UI.

For a DSP, buyers manage data about campaigns, creatives, advertisers, and bid
prices. Advertisers can often update the data themselves through a UI.

The metadata store contains relational or hierarchical semi-static data:

Writes are the result of platform user edits through the frontend and happen
infrequently.

Data is read billions of times per day by the ad-selection servers.


Focusing on the user frontend, the campaign metadata database must be able to manage
resource relationships and hierarchies, and store megabytes to a few gigabytes of data.
The database must also provide reads and writes in the range of hundreds of QPS. To
address these requirements, Google Cloud offers several database options, both managed
and unmanaged:

Cloud SQL (/sql): A fully managed database service that can run MySQL or PostgreSQL.

Datastore (/datastore/docs): A highly scalable, managed, and distributed NoSQL database service. It supports ACID transactions, provides a SQL-like query language, and has strong and eventual consistency levels (/datastore/docs/concepts/structuring_for_strong_consistency).

Spanner (/spanner): A horizontally scalable relational database providing strongly consistent reads, global transactions, and cross-region replication. It can handle heavy reads and writes.

Custom: You can also install and manage many of the open source or proprietary databases (such as MySQL, PostgreSQL, MongoDB, or Couchbase) on Compute Engine or GKE.

Your requirements can help narrow down options, but at a high level you could use Cloud
SQL due to its support for relational data. Cloud SQL is also managed and provides read
replica options.
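To illustrate the kind of resource hierarchy such a relational store would hold, here is a hypothetical SQLAlchemy schema with an advertiser → campaign → creative relationship. The table and column names are assumptions, not a prescribed data model; against Cloud SQL you would point the engine at your MySQL or PostgreSQL instance instead of SQLite.

```python
from sqlalchemy import Column, ForeignKey, Integer, Numeric, String, create_engine
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Advertiser(Base):
    __tablename__ = "advertisers"
    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False)
    campaigns = relationship("Campaign", back_populates="advertiser")

class Campaign(Base):
    __tablename__ = "campaigns"
    id = Column(Integer, primary_key=True)
    advertiser_id = Column(Integer, ForeignKey("advertisers.id"), nullable=False)
    name = Column(String(255), nullable=False)
    daily_budget = Column(Numeric(12, 2), nullable=False)
    advertiser = relationship("Advertiser", back_populates="campaigns")
    creatives = relationship("Creative", back_populates="campaign")

class Creative(Base):
    __tablename__ = "creatives"
    id = Column(Integer, primary_key=True)
    campaign_id = Column(Integer, ForeignKey("campaigns.id"), nullable=False)
    asset_url = Column(String(1024), nullable=False)
    campaign = relationship("Campaign", back_populates="creatives")

# In-memory SQLite keeps the example self-contained.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
```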

As mentioned previously, the metadata store is not only used by platform users for reporting or administration but also by the servers that select ads. Those reads happen billions of times a day. There are two main ways to approach that heavy-read requirement:

Using a database that can handle consistent writes globally and billions of reads per
day like Spanner.

Decoupling reads and writes. This approach is possible because metadata is not
changed often. You can read more about this approach in exports
 (/solutions/infrastructure-options-for-data-pipelines-in-advertising#exports) (in part 2).

(Unique) user profile store

This store contains (unique) users and their associated information, which provides key insights for selecting a campaign or ad on request. Information can include the (unique) user's attributes, your own segments, or segments imported from third parties. In RTB, imported segments often include bid price recommendations.


This datastore must be able to store hundreds of gigabytes, possibly terabytes of data. The
datastore must also be able to retrieve single records in, at most, single-digit-millisecond
speeds. How much data you store depends on how detailed your (unique) user information
is. At a minimum, you should be able to retrieve a list of segments for the targeted user.

The store is updated frequently based on the (unique) user's interaction with ads, sites they visit, or actions they take. The more information, the better the targeting. You might also want to use third-party data management platforms (DMPs) to enrich your first-party data.

Cloud Bigtable or Datastore are common databases to use for (unique) user data. Both
databases are well suited to random reads and writes of single records. Consider using
Cloud Bigtable only if you have at least several hundreds of gigabytes of data.

Other common databases such as MongoDB, Couchbase, Cassandra, or Aerospike can also be installed on Compute Engine or GKE. Although these databases often require more management, some might provide more flexibility, possibly lower latency, and in some cases, cross-region replication.

For more details, see user matching (/solutions/infrastructure-options-for-rtb-bidders#user_matching) (in part 4).

Context stores

The context store is often used to store counters, for example, frequency caps and
remaining budget. How often the data is refreshed in the context store varies. For example,
a daily cap can be propagated daily, whereas the remaining campaign budget requires
recalculating and propagating as soon as possible.

Depending on the storage pattern (#heavy-read_storage_patterns) that you choose, the counters that you update, and the capabilities of your chosen database, you could write directly to the database. Or you might want to decouple the implementation by using a publish-subscribe pattern (https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern) with a messaging system, such as Pub/Sub (/pubsub), to update the store after the calculation.
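A sketch of that decoupled write path with the google-cloud-pubsub Python client: instead of updating the context store directly, a worker publishes a spend event and a separate pipeline applies it to the store. The project, topic name, and message schema are assumptions for illustration.

```python
import json
from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

publisher = pubsub_v1.PublisherClient()
# Hypothetical project and topic names.
topic_path = publisher.topic_path("my-project", "context-store-updates")

def publish_spend_event(campaign_id: str, amount: float) -> None:
    """Publish a counter update instead of writing to the context store directly."""
    message = json.dumps({"campaign_id": campaign_id, "spend": amount}).encode("utf-8")
    future = publisher.publish(topic_path, message)
    future.result(timeout=30)  # Block until Pub/Sub acknowledges the message.

publish_spend_event("c-1", 0.002)  # e.g. one impression billed at a $2.00 CPM
```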

Good candidates for context stores are:

Cloud Bigtable

The regional in-memory key-value NoSQL pattern

The regional caching pattern

By using horizontal scaling, these stores can handle writes and reads at scale. Infrastructure options for ad servers (part 3) (/solutions/infrastructure-options-for-ad-servers) and Infrastructure options for RTB bidders (part 4) (/solutions/infrastructure-options-for-rtb-bidders) discuss some of these options in more detail.

Example of how to manage a budget counter in a distributed environment

You set up budgets in the campaign management tool. You don't want your campaigns to overspend because, most of the time, advertisers will not pay for those extra impressions. But it can be challenging in a distributed environment to aggregate counters such as remaining budget, especially when the system can receive hundreds of thousands of ad requests per second. Campaigns can overspend within a few seconds if the global remaining budget is not consolidated quickly.

By default, a worker spends slices of the budget without knowing how much its sibling workers have spent. That lack of communication can lead to a situation where a worker spends money that is no longer available.

There are different ways to handle this problem. Both of the following options implement a
global budget manager, but they behave differently.

1. Notifying workers about budget exhaustion: The budget manager tracks spending and pushes notifications to each of the workers when the campaign budget has been exhausted. Due to the likelihood of high levels of QPS, notifications should happen within a second in order to quickly limit overspend. The following diagram shows how this process works.

Diagram: events (impressions and clicks) update the budget store; the budget manager refreshes its view of the store and sends a stop notification to the workers when the campaign budget is exhausted. All components run on Compute Engine.


2. Recurrently allocating slices of budget to each worker: The budget manager breaks the overall remaining budget into smaller amounts that are allocated to each worker individually. Workers spend their own local budget, and when it is exhausted, they can request more. This option has a couple of advantages:

a. Before being able to spend again, workers need to wait for the budget manager to allocate them a new amount. This approach prevents overspending even if some workers remain idle for a while.

b. The budget manager can adapt the allocated amount sent to each worker based on the worker's spending behavior at each cycle. The following diagram shows this process, and a sketch of one allocation cycle follows the diagram.

Diagram: events (impressions and clicks) update the budget store; the budget manager refreshes its view of the store and allocates individual budget slices (for example, $1.50, $2, and $1) to each worker. All components run on Compute Engine.
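The following Python sketch shows one possible allocation cycle for this second option: the budget manager splits the remaining campaign budget proportionally to what each worker spent during the previous cycle. It is a simplification (no minimum slice, no transport layer), not the actual implementation.

```python
def allocate_budget_slices(remaining_budget: float, recent_spend_per_worker: dict) -> dict:
    """Split the remaining campaign budget into per-worker slices,
    proportionally to each worker's spend during the previous cycle."""
    total_recent = sum(recent_spend_per_worker.values())
    allocations = {}
    for worker_id, spent in recent_spend_per_worker.items():
        if total_recent > 0:
            share = spent / total_recent
        else:
            share = 1 / len(recent_spend_per_worker)  # No history yet: split evenly.
        allocations[worker_id] = round(remaining_budget * share, 2)
    return allocations

# One allocation cycle, matching the diagram above: workers that spent more
# during the previous cycle receive a larger slice of what remains.
print(allocate_budget_slices(4.5, {"worker-a": 1.5, "worker-b": 2.0, "worker-c": 1.0}))
# -> {'worker-a': 1.5, 'worker-b': 2.0, 'worker-c': 1.0}
```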

Whichever option you choose, budget counters are calculated based on the pricing model agreed upon by the publisher and advertiser; a sketch of this calculation follows the list. For example, if the model is:

CPM based, a billable impression sends an event to the system that decreases the
remaining budget based on the price set per thousand impressions.

CPC based, a user click sends an event to the system that decreases the remaining
budget based on the price set per click.

CPA based, a tracking pixel placed on the advertiser property sends an event to the
system that decreases the budget based on the price per action.
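As a sketch, the following Python function maps an incoming event to the amount by which the remaining budget decreases under each pricing model. The event and pricing field names are assumptions for illustration.

```python
def budget_decrement(event: dict, pricing: dict) -> float:
    """Return how much an event decreases the remaining budget for a campaign,
    depending on the pricing model agreed between publisher and advertiser."""
    model = pricing["model"]
    if model == "CPM" and event["type"] == "impression":
        return pricing["cpm_price"] / 1000.0  # price is per thousand impressions
    if model == "CPC" and event["type"] == "click":
        return pricing["cpc_price"]
    if model == "CPA" and event["type"] == "action":
        return pricing["cpa_price"]
    return 0.0  # The event is not billable under this pricing model.

# A billable impression on a campaign priced at a $2.00 CPM costs $0.002.
print(budget_decrement({"type": "impression"}, {"model": "CPM", "cpm_price": 2.0}))
```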

The number of impressions is often a few orders of magnitude higher than the number of clicks, and the number of clicks is often a few orders of magnitude higher than the number of conversions. Ingesting these events requires a scalable event-processing infrastructure. This approach is discussed in more detail in the data pipeline (/solutions/infrastructure-options-for-data-pipelines-in-advertising) article.

Analytical store

The analytical (/solutions/infrastructure-options-for-data-pipelines-in-advertising#analytics) store is a database that is used to store and process terabytes of daily ingested data offline; in other words, not during any real-time processing. For example:

(Unique) user data is processed offline to determine the associated segments, which in turn are copied to faster databases, such as the user profile database, for serving. This process is explained in exports (/solutions/infrastructure-options-for-data-pipelines-in-advertising#exports).

Joining requests with impressions and (unique) user actions in order to aggregate offline counters used in a context store.

Reporting/dashboarding store

The reporting/dashboarding store is used in the user frontend and provides insight on how
well campaigns or inventories perform. There are different reporting types. You might want
to offer some or all of them, including custom analytics capabilities and semi-static
dashboards updated every few seconds or in real time.

You can use BigQuery for its analytics capabilities. If you leverage views (/bigquery/docs/views-intro) to limit data access and share them (/bigquery/docs/view-access-controls) accordingly with your customers, you can give your platform users ad hoc analytical capabilities through your own UI or their own visualization tools. Not every company offers this option, but it is a valuable capability to offer your customers. Consider using flat-rate pricing (/bigquery/pricing#flat_rate_pricing) for this use case.

If you want to offer OLAP capabilities to your customers with millisecond-to-second latency,
consider using a database in front of BigQuery. You can aggregate data for reporting and
export it from BigQuery to your choice of database. Relational databases, such as those
supported by Cloud SQL, are often used for this purpose.

Because App Engine, or any other frontend that acts on behalf of the user, uses a service account (/iam/docs/understanding-service-accounts), BigQuery sees queries as coming from a single user. The outcome is that BigQuery can cache (/bigquery/docs/cached-results) some queries and return previously calculated results faster.
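A sketch of such a query issued by the frontend's service account with the google-cloud-bigquery Python client; the dataset, table, and column names are assumptions for illustration. BigQuery's query cache is used by default, so identical report queries can return previously computed results.

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

# The client authenticates as the frontend's service account, so BigQuery
# sees all platform users' queries as coming from a single identity.
client = bigquery.Client(project="my-project")

def campaign_daily_report(campaign_id: str):
    query = """
        SELECT day, SUM(impressions) AS impressions, SUM(clicks) AS clicks
        FROM `my-project.reporting.daily_campaign_stats`
        WHERE campaign_id = @campaign_id
        GROUP BY day
        ORDER BY day
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("campaign_id", "STRING", campaign_id)
        ]
    )
    return list(client.query(query, job_config=job_config).result())
```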


Semi-static dashboards are also commonly used. These dashboards use pre-aggregated KPIs written by a data pipeline process. Stores are most likely NoSQL-based, such as Firestore (/firestore) for easier real-time updates (https://firebase.google.com/docs/firestore/query-data/listen), or caching layers such as Memorystore (/memorystore). The freshness of the data typically depends on the frequency of the updates and the duration of the window used to aggregate your data.
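A sketch of this pattern with the google-cloud-firestore Python client: the data pipeline writes pre-aggregated KPIs to a document, and the dashboard frontend listens for changes to refresh in near real time. The collection, document, and field names are assumptions for illustration.

```python
from google.cloud import firestore  # pip install google-cloud-firestore

db = firestore.Client(project="my-project")

def write_kpis(campaign_id: str, kpis: dict) -> None:
    """Called by the data pipeline whenever new aggregates are available."""
    db.collection("campaign_kpis").document(campaign_id).set(kpis, merge=True)

def on_kpi_change(doc_snapshots, changes, read_time):
    """Called by the dashboard frontend whenever the document changes."""
    for doc in doc_snapshots:
        print(doc.id, doc.to_dict())

# Listen for real-time updates on one campaign's KPIs, then write new values.
watch = db.collection("campaign_kpis").document("c-1").on_snapshot(on_kpi_change)
write_kpis("c-1", {"impressions": 120000, "clicks": 340, "spend": 240.0})
```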

What's next

For information about managing data in ad tech, see Infrastructure options for data
pipelines in advertising (part 2)
 (/solutions/infrastructure-options-for-data-pipelines-in-advertising).

For information about serving ad requests, see Infrastructure options for ad servers
(part 3) (/solutions/infrastructure-options-for-ad-servers).

For information about serving bid requests or building your own DSP, see
Infrastructure options for RTB bidders (part 4)
 (/solutions/infrastructure-options-for-rtb-bidders).

Learn how to deploy your own serverless pixel tracking system (/solutions/serverless-pixel-tracking-tutorial).

Read this solution about using Google Cloud services from GKE
 (/solutions/using-gcp-services-from-gke).

Try out other Google Cloud features for yourself. Have a look at our tutorials
 (/docs/tutorials).
