Serverless Computing:
Design, Implementation, and Performance
Garrett McGrath
Dept. of Computer Science and Engineering
University of Notre Dame
Notre Dame, Indiana
Email: [email protected]

Paul R. Brenner
Dept. of Computer Science and Engineering
University of Notre Dame
Notre Dame, Indiana
Email: [email protected]
Abstract—We present the design of a novel performance-oriented serverless computing platform implemented in .NET, deployed in Microsoft Azure, and utilizing Windows containers as function execution environments. Implementation challenges such as function scaling and container discovery, lifecycle, and reuse are discussed in detail. We propose metrics to evaluate the execution performance of serverless platforms and conduct tests on our prototype as well as AWS Lambda, Azure Functions, Google Cloud Functions, and IBM's deployment of Apache OpenWhisk. Our measurements show the prototype achieving greater throughput than other platforms at most concurrency levels, and we examine the scaling and instance expiration trends in the implementations. Additionally, we discuss the gaps and limitations in our current design, propose possible solutions, and highlight future research.

I. INTRODUCTION

Following the lead of AWS Lambda [1], services such as Apache OpenWhisk [2], Azure Functions [3], Google Cloud Functions [4], Iron.io IronFunctions [5], and OpenLambda [6] have emerged and introduced serverless computing, a cloud offering where application logic is split into functions and executed in response to events. These events can be triggered from sources external to the cloud platform but also commonly occur internally between the cloud platform's service offerings, allowing developers to easily compose applications distributed across many services within a cloud.

Serverless computing is a partial realization of an event-driven ideal, in which applications are defined by actions and the events that trigger them. This language is reminiscent of active database systems, and the event-driven literature has theorized for some time about general computing systems in which actions are processed reactively to event streams [7]. Serverless function platforms fully embrace these ideas, defining actions through simple function abstractions and building out event processing logic across their clouds. IBM strongly echoes these concepts in their OpenWhisk platform (now Apache OpenWhisk), in which functions are explicitly defined in terms of event, trigger, and action [8].

Beyond the event-driven foundation, design discussions shift toward container management and software development strategies used to leverage function-centric infrastructure. Iron.io uses Docker to store function containers in private registries, pulling and running the containers when execution is required [9]. Peer work on the OpenLambda platform presents an analysis of the scaling advantages of serverless computing, as well as a performance analysis of various container transitions [10]. Other performance analyses have studied the effect of language runtime and VPC impact on AWS Lambda start times [11], and measured the potential of AWS Lambda for embarrassingly parallel high performance scientific computing [12].

Serverless computing has proved a good fit for IoT applications, intersecting with the edge/fog computing infrastructure conversation. There are ongoing efforts to integrate serverless computing into a "hierarchy of datacenters" to empower the foreseen proliferation of IoT devices [13]. AWS has recently joined this field with their Lambda@Edge [14] product, which allows application developers to place limited Lambda functions in edge nodes. AWS has been pursuing other expansions of serverless computing as well, including Greengrass [15], which provides a single programming model across IoT and Lambda functions. Serverless computing allows application developers to decompose large applications into small functions, allowing application components to scale individually, but this presents a new problem in the coherent management of a large array of functions. AWS recently introduced Step Functions [16], which allows for easier organization and visualization of function interaction.

The application of serverless computing is an active area of development. Our previous work on serverless computing studied serverless programming paradigms such as function cascades, and experimented with deploying monolithic applications on serverless platforms [17]. Other work has studied the architecture of scalable chatbots in serverless platforms [18]. There are multiple projects aimed at extending the functionality of existing serverless platforms. Lambdash [19] is a shim allowing the easy execution of shell commands in AWS Lambda containers, enabling developers to explore the Lambda runtime environment. Other efforts such as Apex [20] and Sparta [21] allow users to deploy functions to AWS Lambda in languages not supported natively, such as Go.

Serverless computing is often championed as a cost-saving tool, and there are multiple works which report cost saving op-
406
Authorized licensed use limited to: Teesside University. Downloaded on April 04,2025 at 17:49:22 UTC from IEEE Xplore. Restrictions apply.
Fig. 1. Overview of prototype components, showing the organization of the web and worker services, as well as code,
metadata, and messaging entities in Azure Storage.
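To make the message flow in Fig. 1 concrete, the sketch below models the web service's warm-queue dispatch path and its handling of expired containers in plain JavaScript. This is an illustrative stand-in only: the in-memory queue, the message shape, and all names are assumptions, not the prototype's actual .NET code or the Azure Storage queue API.

```javascript
// Illustrative in-memory stand-in for an Azure Storage FIFO queue.
class WarmQueue {
  constructor() { this.messages = []; }
  enqueue(msg) { this.messages.push(msg); }
  dequeue() { return this.messages.shift(); }
}

// Web-service dispatch: dequeue a container allocation message and send
// the execution request to the worker URI embedded in the message.
// An HTTP 404 means the container expired; drop the message and retry.
function dispatch(queue, invokeWorker, inputs) {
  for (;;) {
    const msg = queue.dequeue();
    if (msg === undefined) return { status: 'cold' }; // no warm container
    const result = invokeWorker(msg.workerUri, inputs);
    if (result.status === 404) continue; // expired reservation, retry
    return { status: 'warm', outputs: result.outputs };
  }
}
```

In the prototype, the dequeue targets the specific function's warm queue and the cold path consumes a message from the shared cold queue; both details are simplified away here.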
Once a container allocation message is found in a queue, the web service sends an HTTP request to a worker service using the URI contained in the message. The worker then executes the function and returns the function outputs to the web service, which in turn responds to the invocation call.

C. Container Allocation

Each worker manages a pool of unallocated memory which it can assign to function containers. When memory is reserved, a container name is generated, which uniquely identifies a container and its memory reservation, and is embedded in the URI sent in container allocation messages. Therefore, each message in the queues is uniquely identifiable and can be associated with a specific memory reservation within a worker service instance. Memory is allocated conservatively, and worker services assume all functions will consume their allocated memory size.

When container allocations are sent to the cold queue, they have not yet been assigned to a function. To ensure workers do not over-provision their memory pool, it is assumed the assigned function will have the maximum function memory size. Then, when a worker service receives an execution request for an unassigned allocation, it reclaims memory if the assigned function requires less than the maximum size. After the container is created and its function executed for the first time, the container allocation message is placed in that function's warm queue.

D. Container Removal

There are two ways a container can be removed. Firstly, when a function is deleted, the web service deletes the function's warm queue, which is periodically monitored for existence by the worker service instances holding containers of that function. If a worker service detects a deleted function queue, it removes that function's running containers and reclaims their memory reservations. Secondly, in our implementation a container can be removed if it is idle for an arbitrarily set period of 15 minutes, after which it is removed and its memory reclaimed. Whenever memory is reclaimed, worker services send new container allocations to the cold queue if their unused memory exceeds the maximum function memory size.

Container expiration has implications for the web service because it is possible to dequeue an expired container from a function's warm queue. In this case, when the web service sends the execution request, the worker service will return HTTP 404 Not Found. The web service will then delete the expired message from the queue and retry.

E. Container Image

The platform uses Docker to run Windows Nano Server containers and communicates with the Docker service through the Docker Engine API. The container image is built to include the function runtime (currently only Node.js v6.9.5) and an execution handler. Notably absent from the image is any function code. Custom containers are not built for each function in the platform; instead, we attach a read-only volume containing function code when starting the container. A single-image design was chosen for multiple reasons: it is simpler to only manage a single image, attaching volumes is a fast operation, and Windows Nano Server container images are significantly larger than lightweight Linux images such as Alpine Linux, affecting both storage costs and start-up times. In addition to the read-only volume, the memory size and CPU percentage of the container are proportionally set based upon the function's memory size.

The container's execution handler is a simple Node.js server which receives function inputs from the worker service. The worker service sends function inputs to the handler in the request body of an HTTP request, the handler calls the function with the specified inputs, and responds to the worker service with the function outputs. The container is addressable on the worker service's LAN because containers are added to
the default "nat" network, which is the Windows equivalent of the Linux container "bridge" network.
III. PERFORMANCE RESULTS
We designed two tests to measure the execution performance of our implementation, AWS Lambda, Azure Functions, Google Cloud Functions, and Apache OpenWhisk. We developed a performance tool2 to conduct these experiments, which deploys a Node.js test function to the different services using the Serverless Framework [28]. We also built a Serverless Plugin3 to enable Serverless Framework support for our platform.
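The deployed Node.js test function is trivial by design. A hypothetical handler of this shape (the callback-style signature is an assumption; the paper does not show the function's code) returns immediately, so that measured end-to-end latency is dominated by platform overhead:

```javascript
// Hypothetical no-op test function: completes immediately, performing no
// work, so that measured latency reflects platform overhead rather than
// function execution time.
const handler = (event, context, callback) => {
  callback(null, {});
};

module.exports = { handler };
```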
This tool is designed to measure the overhead introduced by the platforms using a simple test function which immediately completes execution and returns. This function is invoked synchronously with HTTP events/triggers as supported by the various platforms, and through the function's invocation route on our platform. Manual invocation calls were not used on the other services as they are typically viewed as development and testing routes, and we believed a popular production event/trigger such as an HTTP endpoint would better reflect existing platform performance. A 512MB function memory size was used in all platforms except Microsoft Azure, which dynamically discovers the memory requirements of functions.

The prototype was deployed in Microsoft Azure, where the web service was an API App in Azure App Service, and the worker service was two DS2 v2 virtual machines running Windows Server 2016. All platform tables, queues, and blobs resided in a single Azure storage account.

Network latencies were not accounted for in our tests, but to reduce their effects we performed our experiments from virtual machines inside the same region as our target function, except in the case of OpenWhisk, which we measured from Azure's South Central US region, and from which we observed single-digit millisecond network latencies to our function endpoint in IBM's US South region.

2 Available: https://github.com/mgarrettm/serverless-performance
3 Available: https://github.com/mgarrettm/serverless-prototype-plugin

Fig. 2. Concurrency test results, plotting the average number of executions completed per second versus the number of concurrent execution requests to the function.

A. Concurrency Test

Figure 2 shows the results of the concurrency test, which is designed to measure the ability of serverless platforms to performantly invoke a function at scale. Our tool maintains invocation calls to the test function by reissuing each request immediately after receiving the response from the previous call. The test begins by maintaining a single invocation call in this way, and every 10 seconds adds an additional concurrent call, up to a maximum of 15 concurrent requests to the test function. The tool measures the number of responses received per second, which should increase with the level of concurrency. This test was repeated 10 times on each of the platforms.

The prototype demonstrates near-linear scaling between concurrency levels 1 and 14, but sees a significant performance drop at 15 concurrent requests. This drop is due to increased latencies observed from the warm queue, indicating that the load is approaching the scalability targets of a single Azure Storage queue [29]. AWS Lambda appears to scale linearly and exhibits the highest throughput of the commercial platforms at 15 concurrent requests. Google Cloud Functions exhibits sub-linear scaling and appears to taper off as the number of concurrent requests approaches 15. The performance of Azure Functions is extremely variable, although the throughput reported is quite high in places, outperforming the other platforms at lower concurrency levels. This variability is intriguing, especially because it persists across test iterations. OpenWhisk's performance is curious, and shows low throughput until eight concurrent requests, at which point the function begins to sub-linearly scale. This behavior may be caused by OpenWhisk's container pool starting multiple containers before beginning reuse, but this behavior is dependent on the configuration of IBM's deployment.

B. Backoff Test

Figure 3 shows the results of the backoff test, which is designed to study the cold start times and expiration behaviors of function instances in the various platforms. The backoff test sends single execution requests to the test function at increasing intervals, ranging from one to thirty minutes.

As described in the prototype design, function containers expire after 15 minutes of unuse. Figure 3 shows this behavior, and the execution latencies after 15 minutes show the cold start performance of our prototype. It appears Azure Functions also expires function resources after a few minutes, and exhibits similar cold start times as our prototype. It is important to note that although both our prototype and Azure Functions are Windows implementations, their function execution environments are very different, as our prototype uses Windows containers and Azure Functions runs in Azure App Service. OpenWhisk also appears to deallocate containers after about 10 minutes and has much lower cold start times than Azure Functions or our prototype. Most notably, AWS Lambda and Google
Cloud Functions appear largely unaffected by function idling. Possible explanations for this behavior could be extremely fast container start times or preallocation of containers as considered below in the discussion of Windows containers.

Fig. 3. Backoff test results, plotting the average execution latency of the function versus the time since the function's previous execution.

IV. LIMITATIONS AND FUTURE WORK

A. Warm Queues

The warm queue is a FIFO queue, which is problematic for container expiration. Imagine a function under heavy load has 10 containers allocated for execution, and then load drops such that a single container could handle all of the function's executions. Ideally, the extra 9 containers would expire after a short time, but because of the FIFO queue, so long as there are 10 executions of the function per container expiration period, all containers will remain allocated to the function. Of course, the solution is to use "warm stacks" instead of "warm queues", but Azure Storage does not currently support LIFO queues. This is perhaps the largest issue with our current implementation; however, other warm stack storage options such as a Redis cache [30] or a consistent hashing [31] implementation are promising, and may offer improved performance as well.

B. Asynchronous Executions

Currently the prototype only supports synchronous invocations. In other words, a request to execute a function will return the result of that function execution; it will not simply start the function and return. Asynchronous executions by themselves are simple to support: the web service can simply respond to the invocation call and then process the execution request normally. The difficulty in asynchronous execution is in guaranteeing at-least-once execution rather than best-effort execution. It is important to understand that synchronous or asynchronous execution is only guaranteed once an invocation request returns with a successful status code. Therefore, no further work is needed for synchronous execution requests (as in our implementation), because a successful status code is only returned once execution has completed. However, asynchronous executions respond to clients before function execution, so it is necessary to have additional logic to ensure these executions complete successfully.

We believe the prototype can support this requirement by storing active executions in a set of queues and introducing a third service responsible for monitoring the status of these queue messages. Worker services would continually update message visibility delays during function execution, and the monitoring service would detect failures by looking for visible messages. Failed messages could then be re-executed. Note that this is about handling platform execution failures and not exceptions thrown by the function during execution, for which retry may also be desired.

C. Worker Utilization

A large area for improvement in our implementation is worker utilization. Realistic designs would require an over-allocation of worker resources, with the observation that not all functions on a worker are constantly executing, or using all of their memory reservation. Utilization in a serverless context presents competing tradeoffs between execution performance and operating costs; however, the evaluation of utilization strategies is difficult without representative datasets of execution loads on serverless platforms. Future research would benefit from increased transparency from existing platforms, and from methods of synthesizing serverless computing loads.

D. Windows Containers

Windows containers have some limitations compared to Linux containers, largely because Linux containers were designed around Linux cgroups, which support useful operations not available on Windows. Most notable in the context of serverless computing is the support of container resource updating and container pausing. A common pattern in serverless platform implementations is pausing containers when idle to prevent resource consumption, and then unpausing them before execution resumes [10], [32].

Another potentially useful operation is container resource updating. Because we reserve resources for containers before executions begin, it would be beneficial for cold start performance if we were able to start containers before they are assigned to a function, and then resize the container once an execution request is received. Future work can study how to support these semantics in Windows containers, perhaps by limiting or updating the resources of the function process itself rather than the container as a whole. Alternatively, the prototype could experiment with Linux containers to compare start-up performances and test the viability of container resizing during cold starts.

E. Security

Security of serverless systems is also an open research question. Hosting arbitrary user code in containers on multi-tenant systems is a dangerous proposition, and care must be taken when constructing and running function containers to
prevent vulnerabilities. This intersection of remote procedure calls (RPC) and container security represents a significant real-world test of general container security. Therefore, although serverless platforms are able to carefully craft the function containers and restrict function permissions arbitrarily, increasing the chances of secure execution, further study is needed to assess the attack surface within function execution environments.

F. Performance Measures

There are significant opportunities to expand understanding of serverless platform performance by defining performance measures and tests thereof. This work focused on the overhead introduced by the platforms during single-function execution, but the quality of these measurements can be improved by better handling of network latencies and clocking considerations. Other aspects of platform performance such as latency variations between language runtimes and function code size, system-wide performance of serverless platforms, performance differences between event types, and CPU allocation scaling also warrant study.

V. CONCLUSION

Serverless computing offers powerful, event-driven integrations with numerous cloud services, simple programming and deployment models, and fine-grained scaling and cost management. Driven by these benefits, the growing adoption of serverless applications warrants the evaluation of serverless platform quality, and the development of new techniques to maximize the technology's potential. The performance results of our platform are encouraging, and our analysis of the current implementation presents many opportunities for continued development and study. We hope to see increased interest in serverless computing by academia and increased openness by the industry leaders for the wider benefit of serverless technologies.

REFERENCES

[1] Amazon Web Services, "AWS Lambda," Available: https://aws.amazon.com/lambda/, 2017.
[2] The Apache Software Foundation, "Apache OpenWhisk," Available: http://openwhisk.org/, 2017.
[3] Microsoft, "Azure Functions," Available: https://azure.microsoft.com/en-us/services/functions/, 2017.
[4] Google, "Google Cloud Functions," Available: https://cloud.google.com/functions/, 2017.
[5] Iron.io, "Iron.io IronFunctions," Available: http://open.iron.io/, 2017.
[6] OpenLambda, "OpenLambda," Available: https://open-lambda.org/, 2017.
[7] K.-U. Schmidt, D. Anicic, and R. Stühmer, "Event-driven reactivity: A survey and requirements analysis," in 3rd International Workshop on Semantic Business Process Management, 2008, pp. 72–86.
[8] I. Baldini, P. Castro, P. Cheng, S. Fink, V. Ishakian, N. Mitchell, V. Muthusamy, R. Rabbah, and P. Suter, "Cloud-native, event-based programming for mobile applications," in Proceedings of the International Conference on Mobile Software Engineering and Systems, ser. MOBILESoft '16. New York, NY, USA: ACM, 2016, pp. 287–288.
[9] I. Dwyer, "Serverless computing: Developer empowerment reaches new heights," Available: http://cdn2.hubspot.net/hubfs/553779/PDFs/Whitepaper Serverless Screen Final V2.pdf, 2016.
[10] S. Hendrickson, S. Sturdevant, T. Harter, V. Venkataramani, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, "Serverless computation with OpenLambda," in Proceedings of the 8th USENIX Conference on Hot Topics in Cloud Computing, ser. HotCloud'16. Berkeley, CA, USA: USENIX Association, 2016, pp. 33–39.
[11] R. Vojta, "AWS journey: API Gateway & Lambda & VPC performance," Available: https://robertvojta.com/aws-journey-api-gateway-lambda-vpc-performance-452c6932093b, 2016.
[12] E. Jonas, "Microservices and Teraflops," Available: http://ericjonas.com/pywren.html, 2016.
[13] E. de Lara, C. S. Gomes, S. Langridge, S. H. Mortazavi, and M. Roodi, "Poster abstract: Hierarchical serverless computing for the mobile edge," in 2016 IEEE/ACM Symposium on Edge Computing (SEC), Oct 2016, pp. 109–110.
[14] Amazon Web Services, "AWS Lambda@Edge," Available: http://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html, 2017.
[15] Amazon Web Services, "AWS Greengrass," Available: https://aws.amazon.com/greengrass/, 2017.
[16] Amazon Web Services, "AWS Step Functions," Available: https://aws.amazon.com/step-functions/, 2017.
[17] G. McGrath, J. Short, S. Ennis, B. Judson, and P. Brenner, "Cloud event programming paradigms: Applications and analysis," in 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), June 2016, pp. 400–406.
[18] M. Yan, P. Castro, P. Cheng, and V. Ishakian, "Building a chatbot with serverless computing," in Proceedings of the 1st International Workshop on Mashups of Things and APIs, ser. MOTA '16. New York, NY, USA: ACM, 2016, pp. 5:1–5:4.
[19] E. Hammond, "Lambdash: Run sh commands inside AWS Lambda environment," Available: https://github.com/alestic/lambdash, 2017.
[20] Apex, "Apex: Serverless Architecture," Available: http://apex.run/, 2017.
[21] Sparta, "Sparta: A Go framework for AWS Lambda microservices," Available: http://gosparta.io/, 2017.
[22] M. Villamizar, O. Garcés, L. Ochoa, H. Castro, L. Salamanca, M. Verano, R. Casallas, S. Gil, C. Valencia, A. Zambrano, and M. Lang, "Infrastructure cost comparison of running web applications in the cloud using AWS Lambda and monolithic and microservice architectures," in 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 2016, pp. 179–182.
[23] B. Wagner and A. Sood, "Economics of Resilient Cloud Services," ArXiv e-prints, Jul. 2016.
[24] A. Warzon, "AWS Lambda pricing in context: A comparison to EC2," Available: https://www.trek10.com/blog/lambda-cost/, 2016.
[25] C. Lowery, "Emerging Technology Analysis: Serverless Computing and Function Platform as a Service," Gartner, Tech. Rep., September 2016.
[26] J. S. Hammond, J. R. Rymer, C. Mines, R. Heffner, D. Bartoletti, C. Tajima, and R. Birrell, "How To Capture The Benefits Of Microservice Design," Forrester Research, Tech. Rep., May 2016.
[27] B. Calder, J. Wang, A. Ogus, N. Nilakantan, A. Skjolsvold, S. McKelvie, Y. Xu, S. Srivastav, J. Wu, H. Simitci, J. Haridas, C. Uddaraju, H. Khatri, A. Edwards, V. Bedekar, S. Mainali, R. Abbasi, A. Agarwal, M. F. u. Haq, M. I. u. Haq, D. Bhardwaj, S. Dayanand, A. Adusumilli, M. McNett, S. Sankaran, K. Manivannan, and L. Rigas, "Windows Azure Storage: A highly available cloud storage service with strong consistency," in Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, ser. SOSP '11. New York, NY, USA: ACM, 2011, pp. 143–157.
[28] Serverless, Inc., "Serverless Framework," Available: https://serverless.com/, 2017.
[29] Microsoft, "Azure Storage Scalability and Performance Targets," Available: https://docs.microsoft.com/en-us/azure/storage/storage-scalability-targets, March 2017.
[30] Redis, "Redis," Available: https://redis.io/, 2017.
[31] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup service for internet applications," in Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, ser. SIGCOMM '01. New York, NY, USA: ACM, 2001, pp. 149–160.
[32] T. Wagner, "Understanding container reuse in AWS Lambda," Available: https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/, 2014.