Split DWARF - Explanation

Uploaded by

akshat.jain30

3/20/22, 8:46 PM Improving C++ Builds with Split DWARF – Productive C++


PRODUCTIVE C++
Discussing the state of the art in C++ projects


Improving C++ Builds with Split DWARF

October 8, 2018 Martin 3 Comments

Large- and medium-sized C++ projects often suffer from long build times. We can distinguish these two scenarios:

Scratch build
After pulling in the latest changes from upstream, or after a major refactoring that affects central headers, a lot of source files need to be rebuilt. This can take a long time. To a large extent this is caused by the insufficient module concept of C++: Each source includes a lot of headers, so after preprocessing, there can be thousands or even millions of lines of C++ code the compiler has to process. Therefore, each source file will take seconds to compile, and a large application can have thousands of source files. Compile clusters can speed things up by distributing the compile jobs across multiple machines.

Incremental build
Most builds are incremental. You have already done a full build, then you make a small change to one file to fix a bug or do an enhancement, and you build and test it. Such an incremental build compiles only a handful of source files and then links the libraries and the application. This is much less work than a scratch build, but since you are doing it many times a day, it is even more critical to make it fast. Compared to scripting languages, the turnaround time for making a change and testing it is quite long for C++.

In this article we will be discussing a great way of speeding up incremental builds which also benefits scratch builds. The goal is to increase developer productivity: Shorter turnaround times allow for quicker iterations. You don’t have to switch to other tasks while the build runs, but can keep your focus on the problem at hand.

Prerequisites
Your incremental build should be “minimal”: No unneeded steps should be performed when invoking the build. A repeated build without any changes should not perform any actions at all. Sometimes this is not easy to achieve, and if the redundant actions only take a short time it is tolerable. But a large overhead here means your developers are already wasting time.

You should have a somewhat recent toolchain supporting split DWARF (introduced below) across the board. The versions below are the absolute minimum. More recent versions are recommended, especially for gdb.

gcc >= 4.8
clang >= 3.3
gdb >= 7.7
binutils >= 2.24 (including gold)

What Makes Incremental Builds Slow?

For an incremental build, only a few source files have to be recompiled. Most of the time is spent linking the application.
And here we can find lots of overhead:

https://www.productive-cpp.com/improving-cpp-builds-with-split-dwarf/ 1/6
The libraries or executables containing the changed source files need to be rebuilt. This means creating them from scratch. All the contained objects need to be read again, even if unchanged, then processed by the linker, and the new binary must be written to disk.¹
All other binaries which depend on binaries that were rebuilt must also be relinked. Although a smarter approach seems possible, in most build systems this means recreating these binaries from scratch.

Note that the linker also needs to process all debug information contained in the object files. Duplicate information gets
removed, and the merged debug information is written to the generated binary. It gets duplicated on disk, since it is
already contained in the object files. And debug information tends to be very large:

“In a large C++ application compiled with -O2 and -g, the debug information accounts for 87% of the total size of the object files sent as inputs to the link step, and 84% of the total size of the output binary.”
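
To get a feeling for this share in your own project, you can sum up the debug sections of an object file. The following is only a rough sketch: the helper name debug_share is made up for this illustration, and it assumes the System V output format of the binutils size tool (size -A), where each line holds a section name and its size and a final "Total" line sums them up. Relocation sections that belong to the debug info (such as .rela.debug_info) are not counted, so the result is a lower bound.

```shell
#!/bin/sh
# Hypothetical helper: read `size -A` (System V format) output on stdin and
# print the share of the total size taken by sections named .debug*.
debug_share() {
    LC_ALL=C awk '
        $1 ~ /^\.debug/ { debug += $2 }    # accumulate debug section sizes
        $1 == "Total"   { total = $2 }     # summary line printed by size -A
        END { if (total > 0) printf "%.0f%%\n", debug / total * 100 }
    '
}
```

A possible invocation would be size -A some_object.o | debug_share, run on an object file compiled with -g.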

So a large bottleneck for an incremental build is processing of debug info. Ironically, debug info is most important when
analyzing and fixing bugs, during which you are doing lots of incremental builds! For release builds without debug info,
linking can be surprisingly fast, and sometimes developers working on large projects use them as a last resort.

Introducing Split DWARF

Linking, and therefore incremental builds, could be much faster if the linker didn’t have to process all the debug
information. Split DWARF² makes this possible: It generates a separate file for the debug info which the linker can ignore.
This file has the suffix .dwo (DWARF object file). DWARF is a debugging file format generally used on Unix. It is the default on most Linux distributions; the only special thing here is that the DWARF info is split from the code.

The binaries generated by the linker will not contain debug information, but references to the .dwo files that are already
on disk. Let’s examine how this works in detail:

main.cpp
#include <iostream>

int main()
{
    int a = 1;
    std::cout << "Split DWARF test" << std::endl;

    return 0;
}

We compile this simple program in two ways, with and without split DWARF. First, compiling with debug information only (-g):

Shell
$ g++ -c -g main.cpp -o main.o
$ g++ main.o -o app

Now we also enable split DWARF by adding -gsplit-dwarf to the compiler invocation:

Shell
$ g++ -c -g -gsplit-dwarf main.cpp -o main_splitdwarf.o
$ g++ main_splitdwarf.o -o app_splitdwarf

The program is not interesting here, but let’s take a look at the files generated:

-rwxrwxr-x 1 prodcpp prodcpp 20256 Oct 7 23:39 app*
-rwxrwxr-x 1 prodcpp prodcpp 12728 Oct 7 23:39 app_splitdwarf*
-rw-r--r-- 1 prodcpp prodcpp 110 Oct 7 22:36 main.cpp
-rw-rw-r-- 1 prodcpp prodcpp 22112 Oct 7 23:39 main.o
-rw-rw-r-- 1 prodcpp prodcpp 12296 Oct 7 23:39 main_splitdwarf.dwo
-rw-rw-r-- 1 prodcpp prodcpp 6968 Oct 7 23:39 main_splitdwarf.o

No surprises for the regular build, which produces main.o and app. The split DWARF compilation creates two files, main_splitdwarf.o and main_splitdwarf.dwo. app_splitdwarf takes up only 12728 bytes, in contrast to app, which is 20256 bytes. The reason is that it references the debug info, instead of containing it:

Shell
$ readelf -wi app_splitdwarf | grep dwo
<20> DW_AT_GNU_dwo_name: (indirect string, offset: 0x0): main_splitdwarf.dwo

That reference is already present in the object file, so all the linker had to do with regard to debugging information was to copy that reference:

Shell
$ readelf -wi main_splitdwarf.o | grep dwo
<20> DW_AT_GNU_dwo_name: (indirect string, offset: 0x0): main_splitdwarf.dwo
<2c> DW_AT_GNU_dwo_id : 0xae0d75cbd6671bc1

This also means you need to keep the .dwo files as long as you want to debug your application.
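
One way to notice a missing .dwo file before starting a debug session is to extract the referenced names from the readelf output shown above and test each one for existence. This is a minimal sketch under some assumptions: the helper names dwo_names and check_dwo are invented for this illustration, the readelf -wi output is assumed to look like the lines above, and the .dwo names are assumed to be relative to the current directory.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -wi` output on stdin and print the
# referenced .dwo file names, one per line. Matches both the GNU extension
# attribute (DW_AT_GNU_dwo_name) and the DWARF 5 spelling (DW_AT_dwo_name).
dwo_names() {
    sed -n -E 's/^.*DW_AT_(GNU_)?dwo_name.*[: ]([^ :]+\.dwo) *$/\2/p' | sort -u
}

# Hypothetical helper: report every .dwo file referenced by binary "$1"
# that does not exist relative to the current directory.
check_dwo() {
    readelf -wi "$1" | dwo_names | while read -r f; do
        [ -e "$f" ] || echo "missing: $f"
    done
}
```

Calling check_dwo app_splitdwarf in the build directory should print nothing as long as all referenced .dwo files are still in place.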

Although I couldn’t get gdb to trace loading of .dwo files, you can see via strace that it pulls them in:

Shell
$ strace -o log gdb --batch-silent --eval-command=quit app_splitdwarf
$ grep dwo log
stat("/projects/prodcpp/splitdwarf/main_splitdwarf.dwo", {st_mode=S_IFREG|0664, st_size=12256, ...}) = 0
open("/projects/prodcpp/splitdwarf/main_splitdwarf.dwo", O_RDONLY|O_CLOEXEC) = 8
lstat("/projects/prodcpp/splitdwarf/main_splitdwarf.dwo", {st_mode=S_IFREG|0664, st_size=12256, ...}) = 0

A Real-Life Example: llvm

In the previous toy example, gains are minimal and speedup for incremental builds would be non-existent since we only
have one source file. So let’s take a look at a real application and perform some measurements to gauge the benefits of
the split DWARF approach.

We will be building llvm 7.0.0 with and without split DWARF. llvm in its latest incarnation is a rather large C++ project,
clocking in at 22838 C/C++ files. On top of that, the clang compiler is linked statically against the llvm libraries, so a lot of
work has to be redone even if only one file changes.

First, let’s do a scratch build. I’m using a clone of the git monorepo with the tag RELEASE_700/final checked out. The
root of the cmake project is in the llvm directory. To also build all the other projects, I have symlinked them to the root as
follows:

Shell
$ pwd
/h/sources/llvm-project-20170507
$ cd llvm/tools
$ ln -s ../../lld lld
$ ln -s ../../lldb lldb
$ ln -s ../../clang clang
$ cd ../projects
$ ln -s ../../compiler-rt compiler-rt

First, let’s use the defaults, which is a Debug build without split DWARF.

Shell
$ mkdir llvm
$ cd llvm
$ cmake /h/sources/llvm-project-20170507/llvm/
$ /usr/bin/time -v make -j 80
Percent of CPU this job got: 5072%
Elapsed (wall clock) time (h:mm:ss or m:ss): 13:00.54
...
Maximum resident set size (kbytes): 11222228
$ du -shL llvm
55G

Now, with split DWARF:

Shell
$ mkdir llvm_sd
$ cd llvm_sd
$ cmake /h/sources/llvm-project-20170507/llvm/ -DLLVM_USE_SPLIT_DWARF=ON
$ /usr/bin/time -v make -j 80
Percent of CPU this job got: 5939%
Elapsed (wall clock) time (h:mm:ss or m:ss): 11:01.42
...
Maximum resident set size (kbytes): 4940236
$ du -shL .
36G

Let’s look at the numbers:

Elapsed time goes down from 13min 01sec to 11min 01sec (about 15%). More CPU is used by the second build on average, which probably means that the linker completes some blocking links faster and parallelism during the build can increase.
Maximum resident set size more than halves (10.7GB to 4.7GB). The linker does not have to process debug info, therefore it needs much less memory. This is significant on constrained machines.
Disk consumption goes down by about 35% (55GB to 36GB). There is no duplication of debug info in the binaries, only a reference to the .dwo file is stored. For you as a developer this means you can keep more builds around.
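
The relative savings can be recomputed from the raw time and du outputs of the two scratch builds above. The snippet below is just a small worked calculation; the function name reduction is made up for this illustration, and the wall clock times are converted to seconds by hand (13:00.54 is 780.54 s, 11:01.42 is 661.42 s).

```shell
#!/bin/sh
# Percentage reduction from a "before" value ($1) to an "after" value ($2).
reduction() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f%%\n", (before - after) / before * 100 }'
}

reduction 780.54 661.42       # wall clock time in seconds
reduction 11222228 4940236    # maximum resident set size in kbytes
reduction 55 36               # disk usage in GB
```

This prints 15.3%, 56.0% and 34.5% for elapsed time, peak memory and disk usage respectively.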

These improvements are nice, considering the low effort needed to obtain them. But what about an incremental build?
Let’s change one file, and then rebuild clang. First, without split DWARF:

Shell
$ echo "//" >>./llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
$ /usr/bin/time -v make -j 80 clang
Percent of CPU this job got: 150%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:16.90
...
Maximum resident set size (kbytes): 11222232

With split DWARF:

Shell
$ echo "//" >>./llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
$ /usr/bin/time -v make -j 80 clang
Percent of CPU this job got: 195%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:42.74
...
Maximum resident set size (kbytes): 4940236

Elapsed time nearly halves from 3min 17 sec to 1min 43 sec. Resident set size shows the same behavior as above,
which makes sense considering that clang is probably the largest executable in llvm, linking in all needed llvm libraries
statically.

All in all, split DWARF is a huge win for development workflows. At the cost of adding a flag, you get significant
improvements for everybody building the code base.

Packaging a Release from a Split DWARF Build

While split DWARF is great for developers, it doesn’t come in so handy for building a release that needs to work on
another machine. The debug info is spread over many files, and the dwo references stored in the binaries will expose all
your source file names and hierarchy. To solve this, a new tool called dwp was added to binutils . It operates on an
executable or shared library and produces a .dwp file with all relevant info to debug that file. gdb in turn will look for dwp
files and load debug info from them.

Continuing our example:

Shell
$ dwp -e app_splitdwarf
$ ll app_*
-rwxrwxr-x 1 prodcpp prodcpp 20256 Oct 7 23:39 app*
-rwxrwxr-x 1 prodcpp prodcpp 12728 Oct 7 23:39 app_splitdwarf*
-rw-rw-r-- 1 prodcpp prodcpp 12440 Oct 7 23:41 app_splitdwarf.dwp

We now have a new file app_splitdwarf.dwp containing all debug info we need. We can now delete the .dwo file. Let’s
verify that debugging still works afterwards:

Shell
$ rm *dwo
$ gdb app_splitdwarf
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
...
Reading symbols from app_splitdwarf...done.
(gdb) b main
Breakpoint 1 at 0x40084e: file main.cpp, line 5.
(gdb) r
Starting program: app_splitdwarf

Breakpoint 1, main () at main.cpp:5
5 int a = 1;
(gdb) p a
$1 = 0

The variable can be printed, so debug information is available. Without the .dwp file you will get a warning as follows:

Shell
$ rm *dwp
$ gdb app_splitdwarf
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
...
Reading symbols from app_splitdwarf...
warning: Could not find DWO CU main_splitdwarf.dwo(0xae0d75cbd6671bc1) referenced by CU at offset 0x0 [in module app_splitdwarf
done.

You will get the same warning when removing the .dwo file (provided there is no .dwp file either).

When there is a .dwp file present for a shared library or executable, gdb will not look for .dwo files. When your binaries
are more recent than your .dwp file, you will not be able to debug the changed files until you remove or update the .dwp
file.
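
Since a stale .dwp file silently gets in the way, a small check before debugging can help. The following is a sketch only: the helper name dwp_status is made up here, and it assumes the .dwp file sits next to the binary and is named like the binary plus the suffix .dwp, as in the example above. It compares modification times, which is a heuristic and not a guarantee that the debug info matches.

```shell
#!/bin/sh
# Hypothetical helper: classify the .dwp package belonging to binary "$1".
# Uses `find -newer`, which compares modification times.
dwp_status() {
    bin=$1
    dwp=$1.dwp
    if [ ! -e "$dwp" ]; then
        echo "no-dwp"    # gdb will look for the individual .dwo files
    elif [ -n "$(find "$bin" -newer "$dwp")" ]; then
        echo "stale"     # binary was modified after the .dwp was created
    else
        echo "fresh"
    fi
}
```

For example, dwp_status app_splitdwarf printing "stale" would be a hint to rerun dwp -e app_splitdwarf or to remove the .dwp file.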

That wraps up our discussion of split DWARF. Hopefully you can make use of it in your projects and reduce your build
times!

Remarks
Since you want to speed up your link as much as possible, you should also use the fastest linker. gold is faster than the
default ld.bfd linker, lld is even faster. lld is still under development, so may have more issues. gold is more mature.
So add -fuse-ld=linker during linking. Example:

-fuse-ld=gold
-fuse-ld=/path/to/lld

Also, you may want to use -Wl,--gdb-index . This creates the .gdb_index section in binaries, which speeds up
debugging a bit.
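
Put together, a link step using both options could look like the sketch below. The wrapper name link_fast and the DRY_RUN switch are made up for this illustration. It assumes g++ as the compiler driver and a linker that understands --gdb-index (to my knowledge gold and lld do).

```shell
#!/bin/sh
# Hypothetical wrapper: link objects with a chosen linker and a .gdb_index.
# Usage: link_fast <linker> <output> <objects...>
# With DRY_RUN=1 the command is only printed, not executed.
link_fast() {
    linker=$1
    out=$2
    shift 2
    # Rebuild the argument list as the full link command.
    set -- g++ "-fuse-ld=$linker" -Wl,--gdb-index "$@" -o "$out"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For the toy example, link_fast gold app_splitdwarf main_splitdwarf.o would run g++ -fuse-ld=gold -Wl,--gdb-index main_splitdwarf.o -o app_splitdwarf.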

Limitations
Split DWARF is not used by that many projects, so some friction with tooling is possible. If you encounter any problems,
please let me know in the comments.

icecream supports split DWARF, distcc doesn’t. I have tested neither. In a build cluster you need to ship two files as a result of the compilation. Other than that, there is only one piece of information that needs to be adjusted: The reference to the .dwo file encoded in the object file. To make it fit the node running the build, the compiler options -fdebug-prefix-map (gcc) or -fdebug-compilation-dir (clang) can be used.
clang 7.0.0 recently implemented partial support for DWARF5, and that implementation does not support split DWARF yet. DWARF5 is not the default, however.

Notes
¹ Incremental linking is another solution to this problem, but not discussed here. In my experience it does not work as reliably as the split DWARF approach.

² Debug Fission is another name for this technique.

References
DebugFission – DWARF Extensions for Separate Debug Information Files

DWP tool

3 Replies to “Improving C++ Builds with Split DWARF”

Thomas Sondergaard says:

October 11, 2018 at 7:58 pm

Absolutely love this article! Implemented it in the build system at work and got these results:

* A full build “ninja install” with clean ccache is 1% slower (10m12s to 10m19s).
* A full build “ninja install” with fully populated ccache is 25% faster (35s to 26s).
* build folder size is reduced 25% (from 5.6 GiB to 4.2 GiB)
* Finally and most important, an incremental build where a single source file in a core shared library is the only change is 41% faster (from 17s to 10s).

For the incremental build a significant portion of time is now spent by CMake automoc (our software uses Qt), so if it wasn’t for that the improvement would be even more significant.

Gabor Horvath says:

October 24, 2018 at 4:33 am

Does split-dwarf help when we use shared-lib build of LLVM?

Martin says:
October 24, 2018 at 8:34 pm

It should also help. The effect won’t be as dramatic, but it’s worth the effort.
