Linker notes on AArch64

This article describes target-specific details about AArch64 in ELF linkers. AArch64 is the 64-bit execution state for the Arm architecture. The AArch64 execution state runs the A64 instruction set. The AArch32 and AArch64 execution states use very different instruction sets, so many pieces of software use two ports for the two execution states of the Arm architecture.

There were the "ARM architecture" and the "ARM instruction set", leading to many software projects using "ARM" or "arm" as their port names. In 2011, ARMv8 introduced two execution states, AArch32 and AArch64. The previous instruction sets "ARM" and "Thumb" were renamed to "A32" and "T32", respectively. In 2017, the architecture was renamed to the "Arm architecture" to reflect the rebranding of the company name. So, the "ARMv8-A" architecture profile is now named "Armv8-A".

For the AArch64 execution state, while many projects use "AArch64" as their port name, for legacy reasons, macOS, Windows, the Linux kernel, and some BSD operating systems unfortunately use "arm64". (Support for AArch64 was added to the Linux kernel in version 3.7. Initially, the patch set was named "aarch64", but it was later changed at the request of kernel developers.)

Read More

Linker notes on Power ISA

This article describes target-specific details about Power ISA in ELF linkers. Initially there was IBM POWER. The 1991 Apple–IBM–Motorola alliance created PowerPC. In 2006, the architecture was rebranded as Power ISA. According to the ISA manual, "In 2006, Freescale and IBM collaborated on the creation of the Power ISA Version 2.03, which represented the reunification of the architecture by combining Book E content with the more general purpose PowerPC Version 2.02."

The terms "PowerPC" and "powerpc" remain popular in numerous places, including the powerpc-*-*-* and powerpc64-*-*-* in official target triple names. The abbreviation "PPC" ("ppc") is used in numerous places as well. For simplicity, I will refer to the 32-bit architecture as "PPC32" and the 64-bit architecture as "PPC64".

We will see how the lack of PC-relative addressing before Power10 has caused great complexity to the ABI and linkers.

Read More

All about LeakSanitizer

Clang and GCC 4.9 implemented LeakSanitizer in 2013. LeakSanitizer (LSan) is a memory leak detector. It intercepts memory allocation functions and by default detects memory leaks at atexit time. The implementation is purely in the runtime (compiler-rt/lib/lsan) and no instrumentation is needed.

LSan has very little architecture-specific code and supports many 64-bit targets. Some 32-bit targets (e.g. Linux arm/x86-32) are supported as well, but there may be high false negatives because pointers with fewer bits are more easily confused with integers/floating points/other data of a similar pattern. Every supported operating system needs to provide some way to "stop the world".

Read More

Function multi-versioning

Updated in 2023-12.

GCC supports some function attributes for function multi-versioning: a way for a function to have multiple implementations, each using a different set of ISA extensions. A function attribute specifies different requirements of ISA extensions. The generated program decodes the CPU model and features at run-time, and picks the most restrictive implementation that is satisfied by the CPU, assuming that the most restrictive implementation has the best performance.

Read More

All about UndefinedBehaviorSanitizer

Updated in 2024-08.

UndefinedBehaviorSanitizer (UBSan) is an undefined behavior detector for C/C++. It consists of code instrumentation and a runtime. Both components have multiple independent implementations.

Clang implemented the first few checks in 2009-12, initially named -fcatch-undefined-behavior. In 2012 -fsanitize=undefined was added and -fcatch-undefined-behavior was removed. GCC 4.9 implemented -fsanitize=undefined in 2013-08.

The runtime used by Clang lives in llvm-project/compiler-rt/lib/ubsan. GCC from time to time syncs its downstream fork of the sanitizers part of compiler-rt (libsanitizer). The end of the article lists some alternative runtime implementations.

Read More

All about sanitizer interceptors

Many sanitizers want to know every function in the program. User functions are instrumented and therefore known by the sanitizer runtime. For library functions, some (e.g. mmap, munmap, memory allocation/deallocation functions, longjmp, vfork) need special treatment. Sanitizers leverage symbol interposition to redirect such function calls to its own implementation: interceptors. Other library functions can be treated as normal user code. Either instrumenting the function or providing an interceptor is fine.

In some cases instrumenting is infeasible:

  • Assembly source files usually do not (or are inconvenient to) call sanitizer callbacks
  • Many libc implementations cannot be instrumented. When can glibc be built with Clang?
  • Some functions have performance issues if instrumented instead of intercepted (mostly mem* and str*)

And interceptors may be the practical choice.

This article talks about how interceptors work and the requirements of sanitizer interceptors.

Read More

2022年总结

一如既往,主要在工具链领域耕耘。给这些high-profile OSS贡献的时候,希望透过这个微小的角度改变世界。

Highlights

  • RELR relative relocation format (glibc, musl, DynamoRIO)
  • zstd compressed debug sections (binutils, gdb, clang, lld/ELF, lldb)
  • lld/ELF (huge performance improvement, RISC-V linker relaxation, SHT_RISCV_ATTRIBUTES)
  • Clang built glibc (get the ball rolling)
  • Make protected symbols work in binutils/glibc
  • Involved in sanitizers, ThinLTO, AArch64/x86 hardening features, AArch64 Memtag ABI, RISC-V psABI, etc

Read More

kth element in a subarray

本文总结经典的区间第k小值数据结构题。 给定一个长为n的数组,元素为范围为[0,σ)的整数。有m个询问:求区间[l,r)中第k小的元素。

一些方法支持扩展问题:有m个操作,或者修改某个位置上的元素,或者询问区间[l,r)中第k小的元素。

Read More