Akamai CloudTest Sample Report for Production Load Testing
INTRODUCTION
This document presents the results of tests performed for “Customer” against the current production
environment. Domains or details that might help identify “Customer” have been anonymized. These tests
were conducted on the Akamai CloudTest platform.
OBJECTIVES/GOALS
The test was designed to measure system performance and scalability while monitoring response time
metrics as user concurrency increases over time. Specifically, the objectives were to:
• Determine system scalability and performance while increasing user concurrency over time
• Determine system response times while increasing user concurrency over time
• Measure the impact of commercial load spikes on system performance while operating at peak load
TEST SCENARIOS
Additional scenarios were added to this composition to model a secondary commercial spike: the
composition initiates the first commercial spike, hits peak, and then initiates another commercial spike
while the first is ramping down. This new scenario sought to mimic behavior seen during the “event.”
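For illustration, the sketch below models this profile as a base ramp plus two overlapping commercial spikes, the second beginning while the first is still ramping down. This is not CloudTest composition syntax, and all ramp timings and user counts are hypothetical.

    # Hypothetical load profile: base ramp plus two overlapping spikes.
    # Not CloudTest syntax; timings (minutes) and user counts are invented.

    def ramp(t, start, ramp_up, hold, ramp_down, peak):
        """Piecewise-linear user count for one ramp at time t (minutes)."""
        if t < start:
            return 0
        if t < start + ramp_up:
            return peak * (t - start) / ramp_up
        if t < start + ramp_up + hold:
            return peak
        if t < start + ramp_up + hold + ramp_down:
            return peak * (1 - (t - start - ramp_up - hold) / ramp_down)
        return 0

    def total_users(t):
        base   = ramp(t, start=0,  ramp_up=30, hold=60, ramp_down=10, peak=50_000)
        spike1 = ramp(t, start=40, ramp_up=5,  hold=10, ramp_down=10, peak=10_000)
        # The second spike begins at t=60, while spike1 (ramping down
        # between t=55 and t=65) is still unwinding.
        spike2 = ramp(t, start=60, ramp_up=5,  hold=10, ramp_down=10, peak=10_000)
        return base + spike1 + spike2

    if __name__ == "__main__":
        for t in range(0, 101, 10):
            print(f"t={t:3d} min  users={total_users(t):8.0f}")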
RESULTS
WHAT WE LEARNED
• The system was unable to scale effectively to peak base load without performance degradation
• The majority of issues occurred on the xxxxxx domain along with two xxxxxx domains
• The majority of the error types were HTTP 504, 502, 404, and 400
• “Transaction X” had by far the largest 90th percentile measure, at 49.194 seconds
• Transaction rates and response times degraded consistently as the test execution progressed
RECOMMENDATIONS
• Decide whether the development xxxxxx domain and its resources should be included in the
test execution
• Reduce the system annotations to increase the performance of multiple dashboards; this is only
necessary when placing servers in an idle state
• Determine root cause for transaction rate reductions (specifically, xxxxxx) while executing
commercial spikes
CONCLUSIONS
Response time degraded continuously beyond xx,xxx users as concurrency increased, but leveled off
once it reached the initial ramp of xx,xxx users. Through the remainder of the test, average response time
increased each time a commercial spike was introduced into the system, but recovered as the commercial
users were ramped back down to average concurrency.
Significant increases in errors coincided with the commercial spikes. These primarily consisted of HTTP
504 (Gateway Timeout), HTTP 502 (Bad Gateway), HTTP 404 (Not Found), and HTTP 400 (Bad Request).
The majority of these errors occurred in the xxxxxx and xxxxxx domains. These errors can be attributed to
differences in the hardware on the seven in-play web servers. Three of the servers had older hardware,
with less load capacity (in terms of CPU and memory) than the other four. These differences were not
recognized by the load balancer, which continued to distribute the load equally, overloading the older servers.
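As a hypothetical illustration of this imbalance (server names, capacities, and request rates below are invented, not measurements from this test), the following Python sketch contrasts the equal distribution observed with a capacity-weighted alternative:

    # Seven web servers, three of which can sustain roughly half the request
    # rate of the other four. All capacities and rates are hypothetical.

    servers = (
        [{"name": f"newer-{i}", "capacity_rps": 1200} for i in range(1, 5)] +
        [{"name": f"older-{i}", "capacity_rps": 600} for i in range(1, 4)]
    )
    incoming_rps = 6_000  # total offered load (hypothetical)

    # Equal distribution, as the load balancer behaved during the test:
    equal = incoming_rps / len(servers)
    for s in servers:
        util = equal / s["capacity_rps"]
        print(f'{s["name"]:>8}  {equal:7.0f} rps  {util:6.0%} utilized')

    # Capacity-weighted distribution, one possible remediation:
    total_capacity = sum(s["capacity_rps"] for s in servers)
    for s in servers:
        share = incoming_rps * s["capacity_rps"] / total_capacity
        util = share / s["capacity_rps"]
        print(f'{s["name"]:>8}  {share:7.0f} rps  {util:6.0%} utilized')

Under equal distribution the older servers run at roughly 143% of capacity while the newer ones sit near 71%; weighting by capacity evens utilization across all seven.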
Two domains (XXXX and XXXX), each serving single-resource requests, were the slowest in terms of
response times. While these domains serve a very small percentage of the overall requests, this highlights
the possibility that third-party resource requests might have a detrimental impact on the performance of
certain pages.
The purpose of the test was to identify how the existing production site reacts while a large user
concurrency base is running on the system and specific types of users are added to the configuration
quickly. Various levels of commercial spikes were used to determine the stability and scalability of the
configuration. Transaction rates appear to degrade over time, whether due to error rates, increased
concurrency, database contention, or transaction X.
RESULTS SUMMARY
The virtual users for this test were generated with the following geographic distribution:
One of the primary metrics in a load test is the average response time. Average response times
provide a general idea of how the application is performing under load. The flatter the average
response time line (as the load/number of virtual users increases), the better.
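As a minimal sketch of how such a series is derived, average response time can be bucketed per interval from a raw request log. The (timestamp, response time) tuple format below is an assumption, not the CloudTest schema.

    # Bucket raw requests into fixed intervals and average each bucket.
    from collections import defaultdict

    def average_response_times(requests, interval_s=60):
        """requests: iterable of (timestamp_s, response_time_ms) tuples."""
        buckets = defaultdict(list)
        for ts, rt_ms in requests:
            buckets[int(ts // interval_s)].append(rt_ms)
        return {b * interval_s: sum(v) / len(v)
                for b, v in sorted(buckets.items())}

    # Example: three requests in the first minute, one in the second.
    log = [(3, 120.0), (15, 340.0), (42, 200.0), (75, 900.0)]
    print(average_response_times(log))  # {0: 220.0, 60: 900.0}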
The average response time chart shown below is an average of all the HTTP requests made during
the test. Average response time started to degrade at nearly xx,xxx users and continued to degrade until
xx,xxx concurrent users were reached. After that, response times increased only when subjected to
commercial spikes, and recovered once the commercial load was ramped back down.
When variances in response times are observed, it is important to identify the associated domains.
The chart below shows the average response times separated by domain over time. Both the XXXX and
XXXX domains account for the larger response times seen below, which correspond mainly with the
commercial spikes.
Transactions represent a grouping of clip elements (HTTP requests, scripts, think times, etc.). Most tests
utilize transactions to closely represent the actual page load times that an end user would experience.
Ideally, the transaction completion time for each test clip should remain flat throughout the test.
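A rough Python analogue of this concept is shown below: a transaction is a named sequence of steps whose total elapsed time approximates a page load. The step functions and timings are stand-ins, not CloudTest clip elements.

    import time

    def run_transaction(name, steps):
        """Execute callables in order and report total completion time."""
        start = time.monotonic()
        for step in steps:
            step()
        elapsed = time.monotonic() - start
        print(f"{name}: {elapsed:.3f} s")
        return elapsed

    # Example: a "page load" made of two stubbed requests and a think time.
    run_transaction("Transaction X", [
        lambda: time.sleep(0.05),   # GET /page (stubbed)
        lambda: time.sleep(0.02),   # GET /page/resources (stubbed)
        lambda: time.sleep(0.10),   # think time
    ])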
Response times were very stable throughout the test execution and any notable increase appears to
coincide with the increased user concurrency spikes that were introduced at various intervals.
The chart below shows the 95th percentile measures throughout the duration of the test.
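For reference, percentile measures such as the 90th and 95th percentiles can be computed with the nearest-rank method. The sample values below are illustrative; the final one echoes the 49.194-second Transaction X measure, expressed in milliseconds.

    import math

    def percentile(samples, p):
        """Nearest-rank p-th percentile (p in 0..100) of a non-empty list."""
        ordered = sorted(samples)
        rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
        return ordered[max(rank, 1) - 1]

    times_ms = [110, 140, 150, 180, 220, 260, 300, 420, 800, 49_194]
    print(percentile(times_ms, 90))  # 800
    print(percentile(times_ms, 95))  # 49194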
The collection analysis widget on the next pages identifies each of the transactions used in the test
composition and the statistics associated with each transaction.
Error Count
Another important metric in any load test is the throughput (i.e., send rate or hit rate) of the test.
This measures how many raw HTTP requests can be processed in a given interval (e.g., per second
or per minute) by the target application. In a perfectly scalable application, the send rate increases
in a straight line with the virtual user ramp. Send rate increased linearly as virtual user concurrency
increased, indicating that throughput is not a concern at this point in time. (Usually, there are legends
for the following charts, but they have been removed from this report in the interest of anonymity).
The XXXX domain accounted for the majority of the send rate data below.
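A minimal sketch of how a send-rate series can be derived from raw request timestamps follows, along with a per-user ratio that should stay roughly constant when scaling is linear. The log format is an assumption.

    # Count raw HTTP requests per interval and normalize by concurrency.
    from collections import Counter

    def send_rate(request_timestamps, interval_s=60):
        """Requests completed per interval, keyed by interval start time."""
        counts = Counter(int(ts // interval_s) for ts in request_timestamps)
        return {b * interval_s: n for b, n in sorted(counts.items())}

    def per_user_rate(rates, users_by_interval):
        """Requests per interval divided by concurrent users in that interval."""
        return {t: rates[t] / users_by_interval[t]
                for t in rates if users_by_interval.get(t)}

    print(send_rate([1, 2, 5, 61, 62, 63, 64]))  # {0: 3, 60: 4}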
Bandwidth is a potential bottleneck that can become maxed out very quickly. This is especially
true if an application has large downloads, lots of page resources, or does not properly utilize a
CDN. Bandwidth does not appear to be an issue at this point in time. This metric should always be
revisited after each test in order to verify there are no changes that may have caused a bottleneck or
degradation. (Usually, there are legends for the following charts, but they have been removed from this
report in the interest of anonymity).
Bandwidth Usage
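For illustration, a bandwidth-usage series like the one charted above can be derived from response sizes; the (timestamp, bytes) log format is an assumption.

    # Sum bytes transferred per interval and convert to megabits per second.
    from collections import defaultdict

    def bandwidth_mbps(responses, interval_s=60):
        """responses: iterable of (timestamp_s, size_bytes) tuples."""
        totals = defaultdict(int)
        for ts, size in responses:
            totals[int(ts // interval_s)] += size
        return {b * interval_s: total * 8 / interval_s / 1e6
                for b, total in sorted(totals.items())}

    log = [(5, 1_500_000), (30, 2_500_000), (70, 4_000_000)]
    print(bandwidth_mbps(log))  # {0: ~0.53 Mbps, 60: ~0.53 Mbps}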
The chart below shows the completion time for each of the user scenarios (test cases) over the course
of the test. Ideally, the completion time for each test clip should remain flat throughout the test.
(Usually, there are legends for the following charts, but they have been removed from this report in the
interest of anonymity).
The chart below shows how many clips were completed through the course of the test. Note that the
test execution itself incorporated spikes in traffic, which you can see accurately reflected in the chart at
the corresponding times.
Clip Completed
The clip analysis widget on the next page identifies each of the clips used in the test composition and
statistics associated with each clip.
The table below shows a detailed breakdown, by domain, of the error types and messages encountered.
Domain / Error Message                                                         Requests      Errors
Total                                                                       193,968,161     337,620

DomainX                                                                         973,271      64,291
  Connection reset (java.net.SocketException)                                                   271
  Connection timeout of 60,000 ms exceeded.                                                  62,147
  Send was completed, but no response was received within the Socket Read
  Timeout limit of 120,000 ms.

DomainX                                                                       1,049,386          23
  Connection timeout of 60,000 ms exceeded.                                                      15
  HTTP 400 - Bad Request                                                                          5

DomainX                                                                         295,788          37
  Connection timeout of 60,000 ms exceeded.                                                      37

DomainX                                                                      60,762,484      31,124
  Connection reset (java.net.SocketException)                                                     1
  Connection timeout of 60,000 ms exceeded.                                                      89
  Failed to Process Transaction X                                                                14
  HTTP 400 - Bad Request                                                                          1
  HTTP 403 - Forbidden                                                                        3,135
  HTTP 500 - Internal Server Error                                                              986
  HTTP 502 - Bad Gateway                                                                        331
  HTTP 503 - Service Unavailable                                                                 17
  HTTP 504 - Gateway Timeout                                                                 10,672
  Send was completed, but no response was received within the Socket Read
  Timeout limit of 120,000 ms.                                                                   67
  Send was completed, but the connection was broken before the response
  was received.                                                                                   4
  The SSL server certificate for domainx cannot be verified.
  (com.soasta.common.exceptions.CommonException)                                                  8
  Unable to set value of Property - check events for specific properties                     10,130
  Unable to set value of the Property "account" - JSON response was invalid.                  5,185

DomainX                                                                           1,980          32
  HTTP 400 - Bad Request                                                                          2
  HTTP 500 - Internal Server Error                                                                6
  HTTP 503 - Service Unavailable                                                                 24

DomainX                                                                         743,308           8
  Connection timeout of 60,000 ms exceeded.                                                       8

DomainX                                                                          38,695       1,649
  Connection reset (java.net.SocketException)                                                     4
  Connection timeout of 60,000 ms exceeded.                                                   1,429

DomainX                                                                       5,317,556      93,871
  Connection reset (java.net.SocketException)                                                   377
  Connection timeout of 60,000 ms exceeded.                                                  92,016
  Send was completed, but no response was received within the Socket Read
  Timeout limit of 120,000 ms.                                                                    1
  Send was completed, but the connection was broken before the response
  was received.                                                                                   7

DomainX                                                                     124,785,689     146,581
  Custom Validation Error Message 1                                                              24
  Custom Validation Error Message 2                                                           3,752
  Custom Validation Error Message 3                                                          14,016
  Custom Validation Error Message 4                                                               3
  Connection timeout of 60,000 ms exceeded.                                                   1,102
  Custom Validation Error Message 5                                                               6
  HTTP 400 - Bad Request                                                                     11,596
  HTTP 401 - Unauthorized                                                                         5
  HTTP 417 - Expectation Failed                                                                   1
  HTTP 500 - Internal Server Error                                                            3,420
  HTTP 502 - Bad Gateway                                                                     10,319
  HTTP 503 - Service Unavailable                                                              5,494
  HTTP 504 - Gateway Timeout                                                                 49,335
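For reference, a breakdown like the table above can be aggregated from a raw result log. The (domain, error message) record format below is an assumption, not the CloudTest export format.

    # Tally requests per domain and error counts per (domain, message).
    from collections import Counter, defaultdict

    def error_breakdown(records):
        """records: iterable of (domain, error_message_or_None) per request."""
        requests = Counter()
        errors = defaultdict(Counter)
        for domain, message in records:
            requests[domain] += 1
            if message is not None:
                errors[domain][message] += 1
        return requests, errors

    log = [
        ("domain-a", None),
        ("domain-a", "HTTP 504 - Gateway Timeout"),
        ("domain-a", "HTTP 504 - Gateway Timeout"),
        ("domain-b", "HTTP 400 - Bad Request"),
    ]
    requests, errors = error_breakdown(log)
    for domain in requests:
        print(domain, requests[domain], sum(errors[domain].values()))
        for message, count in errors[domain].most_common():
            print(" ", message, count)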
Learn More
Akamai secures and delivers digital experiences for the world’s largest companies. Akamai’s intelligent edge platform
surrounds everything, from the enterprise to the cloud, so customers and their businesses can be fast, smart, and secure.
Top brands globally rely on Akamai to help them realize competitive advantage through agile solutions that extend the power
of their multi-cloud architectures. Akamai keeps decisions, apps, and experiences closer to users than anyone — and attacks
and threats far away. Akamai’s portfolio of edge security, web and mobile performance, enterprise access, and video delivery
solutions is supported by unmatched customer service, analytics, and 24/7/365 monitoring. To learn why the world’s top brands
trust Akamai, visit www.akamai.com, blogs.akamai.com, or @Akamai on Twitter. You can find our global contact information
at www.akamai.com/locations. Published 05/19.