Updated in 2022-10.
In January I wrote Compressed debug sections. The venerable zlib shows its age, and there are replacements which are better in every metric except adoption and memory footprint. The obvious choice was Zstandard, but I was not so confident about adopting it and solving the ecosystem issue. At any rate, I slowly removed some legacy .zdebug support from llvm-project so that a new format could be more easily introduced.
In June, Cole Kissane posted [RFC] Zstandard as a second compression method to LLVM on the LLVM Discourse forums. I learned that other folks were investigating a better compression format for ELF compressed debug sections and told myself: it's high time to propose ELFCOMPRESS_ZSTD to the generic System V Application Binary Interface (generic ABI).
ELF is an elegant format which has stood the test of time. Many things created by its forefathers 30 years ago carry over and are still used today. Every new feature, even a small addition like a new constant, has to clear a significantly high bar for acceptance. There were many discussions on Add new ch_type value: ELFCOMPRESS_ZSTD.
Personally I think a selected format needs to have these properties:
- It has an open compression algorithm and implementation.
- It provides significant benefits (compression speed, decompression speed, compression ratio) with a decent memory footprint and complexity.
- It has full backward compatibility. In 20 years I want to be able to decompress a debug section created today.
- It has a wide range of active use cases. When the format value is standardized, consumers are willing to add support.
- It has good documentation.
- It's easy to use.
A compression format satisfying all these properties is rare. ELF does not like introducing a lot of options for one feature. It's not a testing ground for every fancy new compression format. We are wary of platform fragmentation, and consumers don't like supporting a number of formats, each claiming to be a good choice from a slightly different angle. See the appendix for my recent test of many compression utilities.
I made many arguments in the proposal thread. It took about one month and ELFCOMPRESS_ZSTD was accepted in 2022-07.
Toolchain support
The next step is to add toolchain support. The most important pieces are assemblers, linkers, and debuggers. Many other pieces are needed as well.
Toolchain components:
- binutils: all implemented as of 2022-11
  - addr2line: symbolization needs to decompress debug sections
  - gas: compress debug sections
  - ld, gold: decompress compressed input sections and compress output debug sections. Implemented
  - dwp: decompress compressed .dwo. dwp uses gold's code
  - nm: --line-numbers uses debug information
  - objcopy: --decompress-debug-sections and --compress-debug-sections=zstd
  - objdump: --dwarf decompresses compressed debug sections
  - readelf: --debug-dump and --decompress decompress compressed sections. Feature request
- gdb: implemented
  - decompress compressed debug sections in executables, shared objects, separate debug files, and .dwo files. Feature request
  - MiniDebugInfo section .gnu_debugdata is compressed with xz. zstd feature request
- GCC: 13.0 will support -gz=zstd
- llvm-project: all implemented as of 2022-09 (milestone: 16.0.0). The default LLVM_ENABLE_ZSTD=on needs a CMake config file to take effect.
  - Clang: compress .o and (if split DWARF is enabled) .dwo with level 5
  - llvm-objcopy: --decompress-debug-sections and --compress-debug-sections=zstd (level 5). Implemented in D130458 (ELFCLASS64) and D134385 (ELFCLASS32)
  - ld.lld: decompress ELFCOMPRESS_ZSTD input sections (D129406) and compress output debug sections with level 3 (D133548, D133679)
  - llvm-dwarfdump: use LLVMObject API to decompress ELFCOMPRESS_ZSTD input sections (D134116)
  - llvm-dwp: use LLVMObject API
  - llvm-symbolizer: use LLVMObject API
  - lldb: use LLVMObject API
- elfutils: implemented in 2022-12
- mold: implemented in 2022-09
Other languages:
Other utilities:
- bloaty: its -d compileunits parses DWARF. Feature request
- dwz
llvm-project support
On the llvm-project side, there was a lot of debate on what the API should look like. In the week of 2022-09-09 we (Cole Kissane, David Blaikie, and I) reached an agreement that the free function style compression API was acceptable. I have pushed some changes, and llvm-objcopy --compress-debug-sections=zstd, clang -gz=zstd, and ld.lld --compress-debug-sections=zstd are available now.
Note that I chose to implement llvm-objcopy support before others so
that I could test other components with llvm-objcopy.
% cat a.cc
ELFCOMPRESS_ZSTD (2) can be identified by the first 4 bytes of a compressed section, which hold the ch_type field of the compression header. In a little-endian object file, it displays as 02000000.
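As a rough illustration of how a consumer might read that field, here is a minimal Python sketch that unpacks the compression header at the start of a SHF_COMPRESSED section. The field layout follows the generic ABI's Elf32_Chdr and Elf64_Chdr; the function names and the example bytes are made up for illustration and do not come from any real tool.

```python
import struct

# ch_type values defined by the generic ABI.
ELFCOMPRESS_ZLIB = 1
ELFCOMPRESS_ZSTD = 2


def parse_chdr(section: bytes, is_64bit: bool = True, little_endian: bool = True):
    """Return (ch_type, ch_size, ch_addralign) from the start of a compressed section."""
    endian = "<" if little_endian else ">"
    if is_64bit:
        # Elf64_Chdr: ch_type (4), ch_reserved (4), ch_size (8), ch_addralign (8)
        ch_type, _reserved, ch_size, ch_addralign = struct.unpack_from(endian + "IIQQ", section)
    else:
        # Elf32_Chdr: ch_type (4), ch_size (4), ch_addralign (4)
        ch_type, ch_size, ch_addralign = struct.unpack_from(endian + "III", section)
    return ch_type, ch_size, ch_addralign


def compression_name(ch_type: int) -> str:
    """Map a ch_type value to a readable name (hypothetical helper)."""
    names = {ELFCOMPRESS_ZLIB: "zlib", ELFCOMPRESS_ZSTD: "zstd"}
    return names.get(ch_type, f"unknown ({ch_type})")


# Hypothetical ELFCLASS64 little-endian header: type 2, uncompressed size 0x1234, alignment 1.
example = bytes.fromhex("02000000" "00000000" "3412000000000000" "0100000000000000")
ch_type, ch_size, ch_addralign = parse_chdr(example)
print(compression_name(ch_type), ch_size, ch_addralign)
```

The compressed payload starts right after the header, so a reader needs to know the ELF class and byte order before it can tell where that header ends.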
If llvm-objcopy is built with zstd support, use --decompress-debug-sections to decompress an object file:
% llvm-objcopy --decompress-debug-sections a.o a.o.decompressed
% readelf -x .debug_info a.o.decompressed
Hex dump of section '.debug_info':
NOTE: This section has relocations against it, but these have NOT been applied to this dump.
0x00000000 16180000 05000108 00000000 01002100 ..............!.
0x00000010 01000000 00000000 00020000 00000000 ................
...
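Conceptually, decompressing a section means stripping the compression header and inflating the rest. The following Python sketch shows that idea under simplifying assumptions: it only handles ELFCLASS64 little-endian input, the function name is hypothetical, and the zstd branch assumes the third-party zstandard package is installed. It is not how llvm-objcopy is implemented.

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1
ELFCOMPRESS_ZSTD = 2

CHDR64_LE = "<IIQQ"  # ch_type, ch_reserved, ch_size, ch_addralign


def decompress_section(section: bytes) -> bytes:
    """Decompress a SHF_COMPRESSED section (ELFCLASS64, little-endian only)."""
    ch_type, _reserved, ch_size, _addralign = struct.unpack_from(CHDR64_LE, section)
    payload = section[struct.calcsize(CHDR64_LE):]
    if ch_type == ELFCOMPRESS_ZLIB:
        data = zlib.decompress(payload)
    elif ch_type == ELFCOMPRESS_ZSTD:
        # Assumption: the third-party "zstandard" package is available.
        import zstandard
        data = zstandard.ZstdDecompressor().decompress(payload, max_output_size=ch_size)
    else:
        raise ValueError(f"unsupported ch_type {ch_type}")
    # ch_size records the uncompressed size, so it doubles as a sanity check.
    if len(data) != ch_size:
        raise ValueError("uncompressed size does not match ch_size")
    return data
```

A real tool also has to clear SHF_COMPRESSED and restore the section alignment from ch_addralign, which this sketch leaves out.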
On the llvm-project side we reached full feature readiness in 2022-10.
It would be nice if someone picked up the work items on the GNU side so that many Linux distributions could start investigating the adoption of zstd compressed debug sections.
GNU toolchain support
The main changes were for the binutils-gdb repository. This work turned out to be much more challenging than my work for llvm-project.
The entry points of zstd compression features were in binutils, gas, and ld. binutils and ld use bfd, so we needed to update bfd.
I created config/zstd.m4 by following config/zlib.m4. AC_ZSTD in config/zstd.m4 defines ZSTD_CFLAGS and ZSTD_LIBS. After plumbing it into bfd/configure.ac and bfd/Makefile.am, I needed to add AC_ZSTD to every top-level project which uses bfd, as bfd is linked as an archive and there is no good transitive dependency support.
Here was the change for bfd/Makefile.am. The pattern needed to be repeated in many other directories.
--- a/bfd/Makefile.am
+++ b/bfd/Makefile.am
@@ -60 +60 @@ NO_WERROR = @NO_WERROR@
-AM_CFLAGS = $(WARN_CFLAGS) $(ZLIBINC)
+AM_CFLAGS = $(WARN_CFLAGS) $(ZLIBINC) $(ZSTD_CFLAGS)
@@ -779 +779 @@ libbfd_la_DEPENDENCIES = $(OFILES) ofiles
-libbfd_la_LIBADD = `cat ofiles` @SHARED_LIBADD@ $(LIBDL) $(ZLIB)
+libbfd_la_LIBADD = `cat ofiles` @SHARED_LIBADD@ $(LIBDL) $(ZLIB) $(ZSTD_LIBS)
% rg -l --sort=path ZSTD_LIBS
Remember to update auto-generated files with the appropriate versions of autoconf and automake:
PATH=~/projects/automake-1.15.1/bin:$PATH ~/projects/autoconf-2.69/bin/autoreconf -vf bfd binutils gas ld libctf sim
make -C bfd headers
Some bfd/ file changes require updating bfd/bfd-in2.h with make -C $build/bfd headers.
/* DO NOT EDIT! -*- buffer-read-only: t -*- This file is automatically
generated from "bfd-in.h", "init.c", "opncls.c", "libbfd.c",
"bfdio.c", "bfdwin.c", "section.c", "archures.c", "reloc.c",
"syms.c", "bfd.c", "archive.c", "corefile.c", "targets.c", "format.c",
"linker.c", "simple.c" and "compress.c".
Run "make headers" in your build bfd/ to regenerate. */
/* Main header file for the bfd library -- portable access to object files.
Appendix
(Conducted the experiment in 2022-10.) I have a -DCMAKE_BUILD_TYPE=Debug -DLLVM_TARGETS_TO_BUILD=all build of trunk clang. The 3 largest DWARF v5 debug sections are .debug_info, .debug_str, and .debug_line.
% bloaty clang-16
FILE SIZE VM SIZE
-------------- --------------
32.3% 451Mi 0.0% 0 .debug_info
16.4% 229Mi 0.0% 0 .debug_str
11.4% 159Mi 0.0% 0 .debug_line
11.3% 157Mi 55.0% 157Mi .text
8.0% 112Mi 0.0% 0 .strtab
5.7% 80.2Mi 0.0% 0 .debug_str_offsets
3.6% 50.4Mi 17.6% 50.4Mi .rodata
2.4% 33.6Mi 0.0% 0 .debug_addr
2.2% 30.8Mi 10.7% 30.8Mi .eh_frame
1.7% 24.0Mi 0.0% 0 .symtab
1.0% 13.6Mi 4.7% 13.6Mi .rela.dyn
1.0% 13.4Mi 0.0% 0 .debug_rnglists
0.8% 11.0Mi 3.8% 11.0Mi .dynstr
0.7% 10.5Mi 3.7% 10.5Mi .data.rel.ro
0.6% 8.05Mi 0.0% 0 .debug_abbrev
0.5% 7.69Mi 2.7% 7.69Mi .eh_frame_hdr
0.2% 2.79Mi 1.0% 2.79Mi .dynsym
0.1% 848Ki 0.3% 848Ki .gnu.hash
0.1% 827Ki 0.1% 263Ki [21 Others]
0.0% 0 0.2% 544Ki .bss
0.0% 497Ki 0.2% 497Ki .data
100.0% 1.37Gi 100.0% 286Mi TOTAL
ninja -t commands bin/clang dumps the compiler driver command which links the executable. Invoke the command with -fuse-ld=lld -Wl,--reproduce=/tmp/clang-debug.tar to get a tarball. Use llvm-objcopy --dump-section to extract a section.
cd /tmp
I have tried brotli, bzip2, gzip, lz4, lzo, pigz, xz, and zstd, and manually verified that zstd is the best considering compression speed, decompression speed, and compression ratio. Figuring out the API of every one of these libraries would be inconvenient, so I took a shortcut: install the compression utilities with the package manager and hope that they are built with similar compiler driver options, so that the comparison is relatively fair.
Here are some results:
% numactl -C 20 ./bench.sh debug_info
When compressing debug sections, zstd and brotli are significantly better than the other choices. zstd slightly outperforms brotli in compression speed and compression ratio while being much faster at decompression.
xz -3 has a great compression ratio (higher levels are too slow). zstd and brotli with higher levels are extremely slow and can hardly achieve the xz compression ratio, but their decompression speed may compensate for that.
zlib (used by pigz) and bzip2 look pretty bad.
For zstd, the built-in parallel compression support is a plus.
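For readers who want to try a similar comparison without installing the command-line utilities, here is a minimal Python sketch that times the codecs shipped in the standard library and reports their compression ratio. It is not the script I used: it calls the library bindings instead of the utilities, the codec labels and levels are arbitrary picks, and zstd, brotli, lz4, and lzo are left out because they need third-party packages.

```python
import bz2
import lzma
import time
import zlib

# Standard-library codecs only. Labels and levels are illustrative choices.
CODECS = {
    "zlib -6": (lambda d: zlib.compress(d, 6), zlib.decompress),
    "bzip2 -9": (lambda d: bz2.compress(d, 9), bz2.decompress),
    "xz -3": (lambda d: lzma.compress(d, preset=3), lzma.decompress),
}


def bench(data: bytes):
    """Return {name: (ratio, compress_seconds, decompress_seconds)}."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:
            raise RuntimeError(f"{name}: round trip failed")
        results[name] = (len(data) / len(packed), t1 - t0, t2 - t1)
    return results


# Synthetic stand-in for a section extracted with llvm-objcopy --dump-section.
sample = b"DW_AT_name DW_AT_decl_file DW_AT_decl_line " * 2000
for name, (ratio, c_sec, d_sec) in bench(sample).items():
    print(f"{name:10} ratio {ratio:8.2f}  compress {c_sec:.4f}s  decompress {d_sec:.4f}s")
```

Feeding it a real section file instead of the synthetic sample gives numbers that are more comparable to the ones discussed above, although single-run timings like these are noisy.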