Updated in 2024-01.
This article describes target-specific details about x86 in ELF linkers. I will use "x86" to refer to both x86-32 and x86-64.
Clang and GCC 4.9 implemented LeakSanitizer in 2013. LeakSanitizer (LSan) is a memory leak detector. It intercepts memory allocation functions and by default detects memory leaks at atexit time. The implementation is purely in the runtime (compiler-rt/lib/lsan) and no instrumentation is needed.
LSan has very little architecture-specific code and supports many 64-bit targets. Some 32-bit targets (e.g. Linux arm/x86-32) are supported as well, but the false negative rate may be high because pointers with fewer bits are more easily confused with integers, floating-point values, or other data with a similar bit pattern. Every supported operating system needs to provide some way to "stop the world".
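To illustrate why narrow pointers raise the false negative rate, here is a minimal sketch of a conservative reachability test. The `Chunk` type and both function names are hypothetical and are not taken from the LSan runtime. The sketch only models the idea that any scanned word whose value falls inside a live chunk keeps that chunk "reachable", whether or not the word is a real pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a heap chunk tracked by a leak detector.
struct Chunk {
  std::uintptr_t begin;
  std::size_t size;
};

// A word "looks like" a pointer into the chunk if its value lies in
// [begin, begin + size). A conservative scanner cannot tell a real pointer
// from an integer that happens to have the same bit pattern.
bool points_into(const Chunk &c, std::uintptr_t word) {
  return word >= c.begin && word - c.begin < c.size;
}

// Returns true if any scanned word keeps the chunk "reachable".
bool is_reachable(const Chunk &c, const std::vector<std::uintptr_t> &words) {
  for (std::uintptr_t w : words)
    if (points_into(c, w))
      return true;
  return false;
}
```

With a 32-bit address space, a random integer is far more likely to land inside some live chunk than with 64-bit addresses, so a leaked chunk is more likely to be treated as still referenced.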
Updated in 2023-12.
GCC supports some function attributes for function multi-versioning: a way for a function to have multiple implementations, each using a different set of ISA extensions. The attribute on each version specifies the ISA extensions that version requires. The generated program detects the CPU model and features at run-time and picks the most restrictive implementation that the CPU satisfies, assuming that the most restrictive implementation has the best performance.
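The selection rule can be sketched as follows. This is a simplified model, not GCC's actual resolver: the `Feature` bits, the `Version` struct, and `select_version` are all hypothetical names, and the CPU feature set is passed in as a bitmask instead of being queried from the hardware.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical feature bits; a real resolver queries the CPU at run-time.
enum Feature : std::uint32_t {
  kSSE42 = 1u << 0,
  kAVX2 = 1u << 1,
  kAVX512F = 1u << 2,
};

struct Version {
  const char *name;
  std::uint32_t required;  // bitmask of features this version needs
};

// `versions` is assumed to be ordered from most to least restrictive and to
// end with a default version that requires nothing. Returns the first version
// whose requirements are all satisfied by `cpu`, or nullptr if none match.
const Version *select_version(const std::vector<Version> &versions,
                              std::uint32_t cpu) {
  for (const Version &v : versions)
    if ((v.required & cpu) == v.required)
      return &v;
  return nullptr;
}
```

Given versions for AVX-512F, AVX2, and a default, a CPU reporting only SSE4.2 and AVX2 would get the AVX2 version, and a CPU reporting none of the features would fall back to the default.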
Updated in 2025-01.
UndefinedBehaviorSanitizer (UBSan) is an undefined behavior detector for C/C++. It consists of code instrumentation and a runtime. Both components have multiple independent implementations.
Clang implemented the first few checks in 2009-12, initially named -fcatch-undefined-behavior. In 2012 -fsanitize=undefined was added and -fcatch-undefined-behavior was removed.
GCC 4.9 implemented -fsanitize=undefined in 2013-08.
The runtime used by Clang lives in llvm-project/compiler-rt/lib/ubsan. GCC from time to time syncs its downstream fork of the sanitizers part of compiler-rt (libsanitizer). The end of the article lists some alternative runtime implementations.
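The split between instrumentation and runtime can be pictured with a hand-written check. This is only a conceptual sketch: `report_add_overflow` and `overflow_reports` are hypothetical stand-ins for a runtime handler, and the code the compiler actually emits and the real handler entry points differ. It relies on the GCC/Clang builtin `__builtin_add_overflow`.

```cpp
#include <cstdio>

// Hypothetical stand-in for a runtime handler. It counts reports and prints
// a diagnostic; a real runtime has its own entry points and message format.
int overflow_reports = 0;
void report_add_overflow(int a, int b) {
  ++overflow_reports;
  std::fprintf(stderr, "signed integer overflow: %d + %d\n", a, b);
}

// Roughly what an instrumented `a + b` does: perform the operation with an
// overflow check and call into the runtime when the check fails.
int checked_add(int a, int b) {
  int result;
  if (__builtin_add_overflow(a, b, &result))
    report_add_overflow(a, b);
  return result;  // holds the wrapped value when overflow occurred
}
```

In this picture the check corresponds to the instrumentation, and the handler corresponds to the runtime, which is why the two parts can be implemented independently.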
Updated in 2025-08.
Many sanitizers want to know every function in the program. User functions are instrumented and therefore known by the sanitizer runtime. For library functions, some (e.g. mmap, munmap, memory allocation/deallocation functions, longjmp, vfork) need special treatment. Sanitizers leverage symbol interposition to redirect such function calls to their own implementations: interceptors. Other library functions can be treated as normal user code. Either instrumenting the function or providing an interceptor is fine.
In some cases instrumenting is infeasible (e.g. for functions such as mem* and str*), and interceptors may be the practical choice.
This article talks about how interceptors work and the requirements of sanitizer interceptors.
As always, I mainly worked in the toolchain area. When contributing to these high-profile OSS projects, I hope to change the world from this tiny angle.
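This article summarizes classic data structure problems on the k-th smallest value in a range.
Given an array of length n whose elements are integers in the range [0,σ), there are m queries: find the k-th smallest element in the interval [l,r).
Some methods support an extended problem: there are m operations, each of which either modifies the element at some position or queries the k-th smallest element in the interval [l,r).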
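To pin down the query semantics, here is a naive baseline that answers a single query by copying the interval and selecting within the copy. It is not one of the data structures the article surveys. The function name `kth_smallest` and the choice of a 1-based k are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the k-th smallest element of a[l, r), with k counted from 1
// (an assumption; the problem statement does not fix the base).
// Precondition: l < r <= a.size() and 1 <= k <= r - l.
int kth_smallest(const std::vector<int> &a, std::size_t l, std::size_t r,
                 std::size_t k) {
  // Copy the half-open interval so the original array is left untouched.
  std::vector<int> window(a.begin() + static_cast<std::ptrdiff_t>(l),
                          a.begin() + static_cast<std::ptrdiff_t>(r));
  auto nth = window.begin() + static_cast<std::ptrdiff_t>(k - 1);
  // nth_element places the k-th smallest value at position k - 1.
  std::nth_element(window.begin(), nth, window.end());
  return *nth;
}
```

Each query costs time linear in r - l on average, which is what more specialized structures aim to beat when m is large.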
Updated in 2025-05.
A control-flow graph (CFG) is a graph representation of all paths that might be traversed through a program during its execution. Control-flow integrity (CFI) refers to security policy dictating that program execution must follow a control-flow graph. This article describes some features that compilers and hardware can use to enforce CFI, with a focus on llvm-project implementations.
CFI schemes are typically divided into forward-edge (e.g. indirect calls) and backward-edge (mainly function returns). As far as I understand, exception handling and symbol interposition are not included in these categories.
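A forward-edge check can be pictured as a test performed just before an indirect call. The sketch below is a toy model and does not reflect how any particular llvm-project scheme is implemented: the `Handler` type, the `kAllowed` set, and `checked_call` are hypothetical, and a real scheme would typically trap rather than return false.

```cpp
#include <array>

using Handler = int (*)(int);

int twice(int x) { return 2 * x; }
int flip_sign(int x) { return -x; }
int not_allowed(int x) { return x; }  // same type, but outside the allowed set

// Hypothetical set of valid targets for one indirect call site.
constexpr std::array<Handler, 2> kAllowed = {twice, flip_sign};

// Calls `target` only if it is in the allowed set. Returns true and stores
// the result in *out on success; returns false without calling otherwise.
bool checked_call(Handler target, int arg, int *out) {
  for (Handler h : kAllowed) {
    if (h == target) {
      *out = target(arg);
      return true;
    }
  }
  return false;
}
```

The point of the model is that the set of permitted targets is decided ahead of time from the control-flow graph, so a corrupted function pointer that leaves the set is rejected before control transfers.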
Updated in 2025-02.
In GNU ld, -r produces a relocatable object file. This is known as relocatable linking or partial linking. This mode suppresses many passes done for an executable or shared object output (in -no-pie/-pie/-shared modes). -r, -no-pie, -pie, and -shared specify 4 different modes. The 4 options are mutually exclusive.
The relocatable output can be used for analysis and binary manipulation. Then, the output can be used to link the final executable or shared object.
clang -pie a.o b.o
Let's go through various linker passes and see how relocatable linking changes the operation.
This article describes how to detect C++ One Definition Rule (ODR) violations. There are many good resources on the Internet about how ODR violations can introduce subtle bugs, so I will not repeat that here.