2022 summary

As always, I mainly worked in the toolchain area. By contributing to these high-profile OSS projects, I hope to change the world from this small angle.

Highlights

  • RELR relative relocation format (glibc, musl, DynamoRIO)
  • zstd compressed debug sections (binutils, gdb, clang, lld/ELF, lldb)
  • lld/ELF (huge performance improvement, RISC-V linker relaxation, SHT_RISCV_ATTRIBUTES)
  • Clang built glibc (get the ball rolling)
  • Make protected symbols work in binutils/glibc
  • Involved in sanitizers, ThinLTO, AArch64/x86 hardening features, AArch64 Memtag ABI, RISC-V psABI, etc

RELR relative relocation format

  • (In 2021-10, upstreamed DT_RELR patch to FreeBSD rtld-elf)
  • In April, upstreamed DT_RELR patch to glibc (highlighted feature for the 2.36 release)
  • In August, upstreamed DT_RELR patch to musl (milestone: 1.2.4)
  • Upstreamed DT_RELR patch to DynamoRIO
  • Contributed an unmerged gold patch

See also: Relative relocations and RELR

Carlos O'Donell said:

It's exactly that, .rela.dyn is 30x larger than .rela.plt in glibc. I applaud Fangrui Song's efforts here to move DT_RELR forward. If you're going to do one thing that has high impact and move multiple communities forward then picking .rela.dyn is the section to pick.
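
For readers unfamiliar with the format, the size win comes from how RELR packs relative relocations: an even word is a plain address, and an odd word is a bitmap describing which of the following words also need a relative relocation. The Python sketch below is illustrative only and assumes 64-bit words; the function name `decode_relr` is hypothetical, and the sketch only lists the addresses instead of applying relocations as a real dynamic loader would.

```python
WORD_BITS = 64              # assumption: 64-bit target (ELFCLASS64)
WORD_SIZE = WORD_BITS // 8

def decode_relr(entries):
    """Expand RELR words into the addresses that need a relative relocation."""
    addrs = []
    where = 0  # next address covered by a bitmap entry
    for entry in entries:
        if entry & 1 == 0:
            # Even entry: an address that needs a relocation.
            addrs.append(entry)
            where = entry + WORD_SIZE
        else:
            # Odd entry: bit i+1 set means the word at where + i*WORD_SIZE
            # needs a relocation. The lowest bit is only the bitmap marker.
            bits = entry >> 1
            i = 0
            while bits:
                if bits & 1:
                    addrs.append(where + i * WORD_SIZE)
                bits >>= 1
                i += 1
            # Each bitmap covers WORD_BITS - 1 words.
            where += (WORD_BITS - 1) * WORD_SIZE
    return addrs

# One address word plus one bitmap word describe three relocations.
print([hex(a) for a in decode_relr([0x1000, 0b111])])
```

Here two 8-byte words stand in for three relocations, and a full bitmap word can cover up to 63 more, which is why a section dominated by relative relocations shrinks so much.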

zstd compressed debug sections

  • Added zstd support to gas, ld.bfd, gold, gdb, objcopy, readelf, objdump, addr2line, etc
  • Added zstd support to clang, ld.lld, lldb, llvm-objcopy, llvm-symbolizer, llvm-dwarfdump, etc

See also: zstd compressed debug sections

lld/ELF

RISC-V

陈枝懋 added initial RISC-V support for non-PIC in 2018. I added PIC and TLS support in 2019 (acknowledgements by lowRISC). The port was mature, but linker relaxation was the last major piece needed for feature parity with GNU ld. This year I:

  • Implemented RISC-V linker relaxation (acknowledgement)
  • Implemented SHT_RISCV_ATTRIBUTES merge support, which has niche value
  • Implemented DT_RISCV_VARIANT_CC

See also: RISC-V linker relaxation in lld

Performance

I spent some weekends improving the performance of lld/ELF this year. Let's compare lld 13 built with the latest Clang (/tmp/out/custom0/bin/lld) against the latest lld, also built with the latest Clang (/tmp/out/custom2/bin/lld).

Link a -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=on build of clang 16:

lld 13:     Time (mean ± σ):     687.1 ms ±   7.1 ms    [User: 642.6 ms, System: 431.7 ms]
latest lld: Time (mean ± σ): 422.9 ms ± 5.3 ms [User: 579.8 ms, System: 470.7 ms]

Summary
'numactl -C 32-39 /tmp/out/custom2/bin/lld -flavor gnu @response.txt --threads=8' ran
1.62 ± 0.03 times faster than 'numactl -C 32-39 /tmp/out/custom0/bin/lld -flavor gnu @response.txt --threads=8'

Link a -DCMAKE_BUILD_TYPE=Debug build of clang 16:

lld 13:     Time (mean ± σ):      4.494 s ±  0.039 s    [User: 7.516 s, System: 2.909 s]
latest lld: Time (mean ± σ): 3.174 s ± 0.037 s [User: 7.361 s, System: 3.202 s]

Summary
'numactl -C 32-39 /tmp/out/custom2/bin/lld -flavor gnu @response.txt --threads=8' ran
1.42 ± 0.02 times faster than 'numactl -C 32-39 /tmp/out/custom0/bin/lld -flavor gnu @response.txt --threads=8'
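
As a sanity check, the ratios in the two summaries can be reproduced from the means and standard deviations reported above. The Python sketch below is illustrative only: the helper name `speedup` is hypothetical, and the uncertainty uses first-order error propagation under the assumption that the two measurements are independent.

```python
import math

def speedup(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, uncertainty) for slow/fast via first-order error propagation."""
    ratio = slow_mean / fast_mean
    # Relative errors of independent measurements add in quadrature.
    rel = math.hypot(slow_sd / slow_mean, fast_sd / fast_mean)
    return ratio, ratio * rel

release = speedup(687.1, 7.1, 422.9, 5.3)      # Release+assertions link, in ms
debug = speedup(4.494, 0.039, 3.174, 0.037)    # Debug link, in s

print("Release: %.2f ± %.2f" % release)
print("Debug:   %.2f ± %.2f" % debug)
```

The results agree with the 1.62 ± 0.03 and 1.42 ± 0.02 figures in the summaries.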

This great speedup is achieved by

See lld 14 ELF changes and lld 15 ELF changes for detail. As usual, I wrote the ELF port's release notes for the two releases.

Clang built glibc (get the ball rolling)

glibc is probably the most prominent OSS project that cannot be built with Clang yet. I sent some patches last year and a few more this year. See my notes from last year: When can glibc be built with Clang?

This year Adhemerval Zanella from Linaro maintained a local branch to fix aarch64/i386/x86_64 builds. I reviewed some of his patches.

It seems that such work will benefit some research projects. For example, Intel FineIBT used a GRTE branch of glibc.

llvm-project

  • C++/ObjC++: switch to gnu++17 as the default standard (fixed many tests)
  • --gcc-install-dir=: run clang++ --gcc-install-dir=/usr/lib/gcc/x86_64-linux-gnu/12 to select a specific GCC installation directory. Gentoo uses this (/etc/clang/gentoo-gcc-install.cfg) to select the configured GCC installation
  • Defaulted to -fsanitize-address-use-odr-indicator
  • Fixed a long-standing bug related to local linkage GlobalValue in non-prevailing COMDAT, exposed in (Thin)LTO+PGO
  • Fixed some -fdebug-prefix-map= issues for debug information for assembly sources
  • Supported SOURCE_DATE_EPOCH in Clang
  • Helped with the opaque pointers migration
  • Helped with the legacy pass manager deprecation

Reviewed many commits. A lot of people don't add a Reviewed By: tag, so counting commits with the tag gives an underestimate.

% git shortlog -sn bfc8f76e60a8efd920dbd6efc4467ffb6de15919.. --grep 'Reviewed .*MaskRay' | awk '{s+=$1}END{print s}'
386

My number of commits exceeded 4000 this year. Many are clean-up commits or fixups for others' work. I hope that I can do more useful work next year.

binutils

Reported many bugs and feature requests:

My commits:

  • ar: Add --thin for creating thin archives
  • ld: Support customized output section type
  • objcopy --weaken-symbol: apply to STB_GNU_UNIQUE symbols
  • gas: copy st_size only if unset
  • gas: Port "copy st_size only if unset" to aarch64 and riscv
  • aarch64: Disallow copy relocations on protected data
  • aarch64: Define elf_backend_extern_protected_data to 0 [PR 18705]
  • aarch64: Allow PC-relative relocations against protected STT_FUNC for -shared
  • arm: Define elf_backend_extern_protected_data to 0 [PR 18705]
  • x86: Make protected symbols local for -shared
  • RISC-V: Remove R_RISCV_GNU_VTINHERIT/R_RISCV_GNU_VTENTRY
  • binutils, gdb: support zstd compressed debug sections
  • libctf: Add ZSTD_LIBS to LIBS so that ac_cv_libctf_bfd_elf can be true
  • sim: Link ZSTD_LIBS
  • ld: Add --undefined-version
  • readelf: support zstd compressed debug sections [PR 29640]
  • gold, dwp: support zstd compressed input debug sections [PR 29641]
  • gold: add --compress-debug-sections=zstd [PR 29641]

glibc

27 commits. Some work on the dynamic loader. Notable commits:

For a long time glibc did wacky things with copy relocations and canonical PLT entries on protected data/function symbols; these are now correctly unsupported. I hope that future GCC releases can stop using indirect access for external protected symbols. See Copy relocations, canonical PLT entries and protected visibility for details.

Linux kernel

4 commits. Fixed linux-perf when unwinding ld.lld linked objects. Consulted on a number of toolchain questions.

Blog

Wrote 25 blog posts (including this one, mainly about toolchains) and revised many posts initially written in 2020 and 2021.

Misc

I was added as a collaborator of riscv-non-isa/riscv-elf-psabi-doc.

Learned some of the Jai programming language. It has many exciting ideas. Solved some algorithm challenges with Nim. Unfortunately, my workflow for transpiling Nim to standalone C somehow broke later this year.

Trips: Tucson, Greater Los Angeles, Cambridge and Boston, Denver, San Diego, Bellevue and Seattle, Kalispell, Washington, Honolulu.

Mastodon: https://hachyderm.io/@meowray