SQL Server 2025 Public Preview is not even a week old, but I’m already impressed with another new capability that was released: a new backup compression algorithm, ZSTD. This one came as a surprise to me, even though I was part of the Private Preview, because it was only released with the Public Preview.
TL;DR – ZSTD Can Be Fast!
Today, I’m only going to do a quick blog post to share some initial test results.
These tests ran against a new database on a VMware Workstation VM, hosted on my laptop, with storage on a Samsung 990 Pro. I’m backing up my new RecipesDemoDB, which is 25.13 GB in size.
| DISK = NUL (x4 files) | Speed |
| --- | --- |
| COMPRESSION | 3057442 pages in 49.966 seconds (478.050 MB/sec) |
| ZSTD – Level = Low | 3057442 pages in 44.711 seconds (534.236 MB/sec) |
| ZSTD – Level = Medium | 3057442 pages in 33.428 seconds (714.558 MB/sec) |
| ZSTD – Level = High | 3057442 pages in 73.147 seconds (326.551 MB/sec) |

| DISK = 4 Backup Files | Speed | Total Backup Size | % size of original |
| --- | --- | --- | --- |
| COMPRESSION | 3057434 pages in 80.761 seconds (295.764 MB/sec) | 16.52 GB | 65.74% |
| ZSTD – Level = Low | 3057434 pages in 39.920 seconds (598.351 MB/sec) | 17.37 GB | 69.13% |
| ZSTD – Level = Medium | 3057434 pages in 56.676 seconds (421.451 MB/sec) | 15.93 GB | 63.38% |
| ZSTD – Level = High | 3057434 pages in 94.440 seconds (252.924 MB/sec) | 15.86 GB | 63.10% |
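As a quick sanity check on the figures above, here is a minimal Python sketch of how the MB/sec and "% size of original" columns can be reproduced. It assumes the standard 8 KB SQL Server page size; the function names are my own helpers for illustration, not anything SQL Server exposes.

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB


def throughput_mb_per_sec(pages: int, seconds: float) -> float:
    """Convert a 'N pages in S seconds' backup message into MB/sec."""
    total_mb = pages * PAGE_SIZE_KB / 1024
    return total_mb / seconds


def percent_of_original(backup_gb: float, database_gb: float) -> float:
    """Backup size expressed as a percentage of the database size."""
    return backup_gb / database_gb * 100


# Legacy COMPRESSION run against NUL, then the same option against 4 files.
print(f"{throughput_mb_per_sec(3057442, 49.966):.2f} MB/sec")
print(f"{throughput_mb_per_sec(3057434, 80.761):.2f} MB/sec")
print(f"{percent_of_original(16.52, 25.13):.2f}% of original")
```

The results should land within rounding distance of the values reported in the backup output.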
Observations and Thoughts
First, note that ZSTD now has three sub-options for the compression level: Low, Medium, and High. Low is the default. It’s interesting to see that in this initial (SINGLE) test against backup files, ZSTD-Low was the fastest, but its final output was slightly larger than that of MSXpress (the legacy compression algorithm). Against NUL, ZSTD-Medium was the fastest instead.
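To put the trade-off in relative terms, the following Python sketch compares each ZSTD level with the legacy algorithm, using the elapsed times and backup sizes from the 4-file test. The dictionary layout and the `compare_to_baseline` helper are hypothetical, made up for this illustration, and the numbers come from a single run, so treat the output as indicative only.

```python
# (elapsed seconds, total backup size in GB) from the 4-backup-file test
results = {
    "COMPRESSION (MSXpress)": (80.761, 16.52),
    "ZSTD Low": (39.920, 17.37),
    "ZSTD Medium": (56.676, 15.93),
    "ZSTD High": (94.440, 15.86),
}


def compare_to_baseline(results, baseline="COMPRESSION (MSXpress)"):
    """Return {option: (speedup factor, size change in %)} versus the baseline."""
    base_seconds, base_gb = results[baseline]
    comparison = {}
    for option, (seconds, size_gb) in results.items():
        speedup = base_seconds / seconds  # > 1 means faster than baseline
        size_change = (size_gb - base_gb) / base_gb * 100  # > 0 means larger
        comparison[option] = (speedup, size_change)
    return comparison


for option, (speedup, size_change) in compare_to_baseline(results).items():
    print(f"{option}: {speedup:.2f}x speed, {size_change:+.2f}% size")
```

In this run, ZSTD Low finishes in roughly half the time of the legacy algorithm at the cost of a backup that is a few percent larger, while ZSTD High is slower and only modestly smaller.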
And a note about data composition: ~3-4 GB consists of mostly text data (source), and the remaining ~20-21 GB consists of vector embeddings plus their corresponding text chunks. Vector embeddings are dense numeric data that tends to compress poorly, so that will affect the compressibility seen here. I’ll be releasing this new database very soon with additional supporting documentation.
I’ll be writing much more on this topic later, but I wanted to share these initial findings. I find them extremely compelling and am looking forward to testing this with larger databases on higher-end hardware.
Thanks for reading!
