This application is designed to allow like-for-like performance comparisons between Spring Boot and Quarkus.
Designing good benchmarks is hard! There are lots of trade-offs, and lots of "right" answers.
Here are the principles we used when making implementation choices:
- Parity
  - The application code in the Spring and Quarkus versions of the application should be as equivalent as possible while performing the same function. This means the domain models should be identical, and the underlying persistence mechanisms should be identical (i.e. JPA with Hibernate).
  - Performance differences should come from architecture differences and library-integration optimisations in the frameworks themselves.
  - If a change alters the architecture of an application (e.g. moving from blocking to reactive, or using virtual threads), it should be applied to all versions of the application.
- Normal-ness
  - Realism is more important than squeezing out every last bit of performance.
- High quality
  - Applications should model best practices.
  - Although we want the application to represent typical usage, someone who copies it should never be copying 'wrong' or bad code.
- Easy to try at home
  - Running measurements should be easy for a non-expert to do with a minimum of infrastructure setup, and it should also be rigorous in terms of performance best practices.
  - These two goals are contradictory, unfortunately! To try to achieve both, we have two versions of the scripts: one optimised for simplicity, and one for methodological soundness.
Initially, we wanted to measure the "out of the box" performance experience, meaning the use of tuning knobs is kept to a minimum. This is different from the goals of a benchmark like TechEmpower, where the aim is to tune things to get the highest absolute performance numbers. While having an out-of-the-box baseline is important, not all frameworks perform equally well out of the box. A more typical production scenario would involve tuning, so we wanted to be as fair as possible and capture that too.
To that end, we use different branches within this repository to separate the strategies. The scenario is recorded in the raw output data and visualisations, so an out-of-the-box strategy run can be recorded independently of a tuned strategy run.
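For example, to switch strategies before building and measuring:

```bash
# out-of-the-box strategy
git checkout ootb

# tuned strategy (the default branch)
git checkout main
```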
> [!IMPORTANT]
> While the strategies and outcomes may be different, each strategy should still represent the same set of guiding principles when comparing applications within the strategy.
| Strategy | Goals | Constraints | Branch |
|---|---|---|---|
| OOTB (Out of the box) | - Simplicity<br>- Measure the performance characteristics each framework provides out of the box<br>- Does one framework provide a more "production ready" experience? | - No tuning allowed, even to fix load-related errors<br>- May not achieve parity of pool sizes between the Quarkus and Spring applications (or even between Spring 3/Spring 4) | `ootb` |
| Tuned | - Performance | - Reasonable improvements to help performance without changing the architecture of the application<br>- Code and architectural equivalence are still important<br>**Acceptable**<br>- Adjustments to HTTP/database thread/connection pool sizes<br>- Removal of the open session in view pattern<br>**Unacceptable**<br>- Changes specific to a fixed number of CPU cores or memory | `main` (the default repository branch) |
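To make the "acceptable" tuning concrete: pool sizes and open session in view can be adjusted through standard configuration switches rather than code changes, which preserves code equivalence. A minimal sketch with illustrative values (these are standard Quarkus and Spring Boot configuration keys, not settings taken from the tuned branch):

```bash
# Quarkus: cap the JDBC connection pool via a system property
java -Dquarkus.datasource.jdbc.max-size=32 \
  -jar quarkus3/target/quarkus-app/quarkus-run.jar

# Spring Boot: match the pool size and remove open session in view
java -jar springboot3/target/springboot3.jar \
  --spring.datasource.hikari.maximum-pool-size=32 \
  --spring.jpa.open-in-view=false
```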
This project contains the following modules:
- `springboot3`
  - A Spring Boot 3.x version of the application
- `springboot4`
  - A Spring Boot 4.x version of the application
- `quarkus3`
  - A Quarkus 3.x version of the application
- `quarkus3-virtual`
  - A Quarkus 3.x version of the application using Virtual Threads
- `quarkus3-spring-compatibility`
  - A Quarkus 3.x version of the application using the Spring compatibility layer. You can also recreate this application from the Spring application using a few manual steps.
Each module can be built using

```bash
./mvnw clean verify
```

You can also run `./mvnw clean verify` at the project root to build all modules.
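If you only want one module, Maven's `-pl` flag can scope the build from the root (a sketch assuming the standard multi-module layout; any module name from the list above works):

```bash
./mvnw clean verify -pl quarkus3
```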
- (macOS) You need to have a `timeout`-compatible command (see the check below). Options:
  - Via `coreutils` (installed via Homebrew): `brew install coreutils`, but note that this will install lots of GNU utils that duplicate native commands, prefixed with `g` (e.g. `gdate`)
  - Via this Homebrew implementation: `brew install aisk/homebrew-tap/timeout`
  - More options at https://stackoverflow.com/questions/3504945/timeout-command-on-mac-os-x
- Base JVM Version: 21
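To confirm a compatible command is on your PATH (a quick sketch; `gtimeout` is the `g`-prefixed name that `coreutils` installs):

```bash
command -v timeout || command -v gtimeout
```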
The application expects a PostgreSQL database to be running on `localhost:5432`. You can use Docker or Podman to start a PostgreSQL container:
```bash
cd scripts
./infra.sh -s
```

This will start the database, create the required tables and populate them with some data.
To stop the database:
```bash
cd scripts
./infra.sh -d
```

There are some scripts available to help you run the application:
- `run-requests.sh` - Runs a set of requests against a running application (see the sketch below)
- `infra.sh` - Starts/stops required infrastructure
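A typical flow is to start an application and then point the request script at it; a sketch, assuming `run-requests.sh` targets the locally running application by default:

```bash
# terminal 1: start one of the applications
java -jar springboot3/target/springboot3.jar

# terminal 2: run the canned request set against it
cd scripts
./run-requests.sh
```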
Of course, you want to start generating some numbers and doing some comparisons; that's why you're here! There are lots of wrong ways to run benchmarks, and running them reliably requires a controlled environment, strong automation, and multiple machines. Realistically, that kind of setup isn't always possible.
Here's a range of options, from easiest to best practice. Remember that the easy setup will not be particularly accurate, but it does sidestep some of the worst pitfalls of casual benchmarking.
Before we go any further, know that this kind of test is not going to be reliable. Laptops usually have a number of other processes running on them, and modern laptop CPUs are subject to power management which can wildly skew results. Often, some cores are 'fast' and some are 'slow', and without extra care, you don't know which core your test is running on. Thermal management also means 'fast' jobs get throttled, while 'slow' jobs might run at their normal speed.
Load shouldn't be generated on the same machine as the one running the workload, because the work of load generation can interfere with what's being measured.
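On Linux, one partial mitigation is pinning processes to known cores; a hedged sketch (core numbers are illustrative, `taskset` is Linux-only, and ideally the load generator lives on another machine entirely):

```bash
# pin the application under test to a fixed set of cores, so you at
# least know which cores (fast or slow) it is running on
taskset -c 0-3 java -jar springboot3/target/springboot3.jar
```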
But if you accept all that, and know these results should be treated with caution, here's our recommendation for the least-worst way of running a quick and dirty test. We use Hyperfoil instead of wrk, to avoid coordinated omission issues. For simplicity, we use the wrk2 Hyperfoil bindings.
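The wrk2-style invocation underneath looks roughly like the following (a sketch: the wrapper name, flag set, and all numbers are illustrative rather than taken from stress.sh):

```bash
# -R fixes a constant request rate, which is what avoids coordinated
# omission; -t/-c/-d are threads, connections, and duration
wrk2 -t 4 -c 100 -d 30s -R 10000 http://localhost:8080/
```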
You can run these in any order.
The `stress.sh` script starts the infrastructure, and uses a load generator to measure
how many requests the applications can handle over a short period of time.
```bash
scripts/stress.sh quarkus3/target/quarkus-app/quarkus-run.jar
scripts/stress.sh quarkus3-spring-compatibility/target/quarkus-app/quarkus-run.jar
scripts/stress.sh springboot3/target/springboot3.jar
```

For each test, you should see output like
```
Thread Stats   Avg       Stdev       Max     +/- Stdev
  Latency     9.58ms     6.03ms     94.90ms    85.57%
  Req/Sec    9936.90    2222.61    10593.00    95.24
```

The `1strequest.sh` script starts the infrastructure and runs an application X times; it computes the time to first request and the RSS for each iteration, as well as an average over the X iterations.
For example,

```bash
scripts/1strequest.sh "java -XX:ActiveProcessorCount=8 -Xms512m -Xmx512m -jar quarkus3/target/quarkus-app/quarkus-run.jar" 5
scripts/1strequest.sh "java -XX:ActiveProcessorCount=8 -Xms512m -Xmx512m -jar quarkus3-spring-compatibility/target/quarkus-app/quarkus-run.jar" 5
scripts/1strequest.sh "java -XX:ActiveProcessorCount=8 -Xms512m -Xmx512m -jar springboot3/target/springboot3.jar" 5
```

You should see output like
```
-------------------------------------------------
AVG RSS (after 1st request): 35.2 MB
AVG time to first request: 0.150 sec
-------------------------------------------------
```

These scripts are being developed.
To produce charts from the output, you can use the scripts at https://github.com/quarkusio/benchmarks.
These tests are run on a regular schedule in Red Hat/IBM performance labs. The scripts are viewable in this repository. The controlled environment and the scripts ensure workloads are properly isolated across CPUs and memory, without any contention between components (application under test, load generator, database, etc.).
The results are published to https://github.com/quarkusio/benchmarks/tree/main/results/spring-quarkus-perf-comparison, and also available in an internal Horreum instance. Charts of the results are also available. The latest results are shown below.
- Why Quarkus is Fast: https://quarkus.io/performance/
- How the Quarkus team measure performance (and some anti-patterns to be aware of): https://quarkus.io/guides/performance-measure