Updated in 2024-01.
This article describes target-specific details about x86 in ELF linkers. I will use "x86" to refer to both x86-32 and x86-64.
Clang and GCC 4.9 implemented LeakSanitizer in 2013. LeakSanitizer (LSan) is a memory leak detector. It intercepts memory allocation functions and by default detects memory leaks at atexit time. The implementation is purely in the runtime (compiler-rt/lib/lsan) and no instrumentation is needed.
LSan has very little architecture-specific code and supports many 64-bit targets. Some 32-bit targets (e.g. Linux arm/x86-32) are supported as well, but false negatives may be frequent because narrower pointers are more easily confused with integers, floating-point values, or other data with a similar bit pattern. Every supported operating system needs to provide some way to "stop the world".
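To illustrate why pointer-like data causes false negatives, here is a minimal sketch of a conservative scan. It is not the LSan implementation: the `Chunk` struct and the `scanWords` and `countUnreachable` functions are hypothetical names, and the real runtime also follows pointers stored inside reachable chunks, which this sketch omits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one heap allocation tracked by a leak detector.
struct Chunk {
  std::uintptr_t begin;
  std::size_t size;
  bool reachable;
};

// Conservative scan: any word whose value falls inside a chunk marks that
// chunk as reachable, whether the word is a real pointer or just an integer
// that happens to have the same bit pattern.
void scanWords(const std::vector<std::uintptr_t> &words,
               std::vector<Chunk> &chunks) {
  for (std::uintptr_t w : words)
    for (Chunk &c : chunks)
      if (w >= c.begin && w - c.begin < c.size)
        c.reachable = true;
}

// Chunks that no scanned word refers to are reported as leaks.
std::size_t countUnreachable(const std::vector<Chunk> &chunks) {
  std::size_t n = 0;
  for (const Chunk &c : chunks)
    if (!c.reachable)
      ++n;
  return n;
}
```

In this model, an unrelated integer equal to a leaked chunk's address hides the leak. With a 32-bit address space, such collisions are more likely than with 64-bit addresses.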
Updated in 2023-12.
GCC supports some function attributes for function multi-versioning: a way for a function to have multiple implementations, each using a different set of ISA extensions. The function attributes specify which ISA extensions each implementation requires. The generated program decodes the CPU model and features at run-time, and picks the most restrictive implementation that is satisfied by the CPU, assuming that the most restrictive implementation has the best performance.
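As a small sketch of one such attribute, the function below uses `target_clones` to request an AVX2 version and a default version. The `MULTIVERSION` macro is a hypothetical helper introduced here so that the code still compiles on targets where I am not assuming the attribute or the underlying dispatch mechanism is available.

```cpp
#include <cstddef>

// Hypothetical helper macro: only request clones on x86-64 GNU/Linux,
// where this sketch assumes target_clones is supported.
#if defined(__x86_64__) && defined(__gnu_linux__)
#define MULTIVERSION __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSION
#endif

// The compiler emits one copy of the body per listed target plus a
// dispatcher that picks a copy at run-time based on CPU features.
MULTIVERSION
int sum(const int *a, std::size_t n) {
  int s = 0;
  for (std::size_t i = 0; i < n; ++i)
    s += a[i];
  return s;
}
```

Callers simply call `sum`. The result is the same whichever version is selected; only the instructions used differ.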
Updated in 2024-08.
UndefinedBehaviorSanitizer (UBSan) is an undefined behavior detector for C/C++. It consists of code instrumentation and a runtime. Both components have multiple independent implementations.
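The split between instrumentation and runtime can be sketched by hand. The code below is an illustration only: `checkedAdd` stands in for what the compiler might emit around a signed addition, and `reportAddOverflow` and `g_reports` are hypothetical stand-ins for a runtime handler, not actual UBSan entry points.

```cpp
#include <climits>
#include <cstdio>

// Hypothetical runtime side: counts and prints diagnostics.
static int g_reports = 0;

void reportAddOverflow(int a, int b) {
  ++g_reports;
  std::fprintf(stderr, "signed integer overflow: %d + %d\n", a, b);
}

// Hypothetical instrumented form of `a + b`: check first, call into the
// runtime if the check fails, then continue with the wrapped result.
int checkedAdd(int a, int b) {
  int result;
  if (__builtin_add_overflow(a, b, &result))
    reportAddOverflow(a, b);
  return result;
}
```

Calling `checkedAdd(INT_MAX, 1)` triggers one report, while `checkedAdd(1, 2)` returns 3 silently. `__builtin_add_overflow` is a GCC and Clang builtin, so this sketch assumes one of those compilers.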
Clang implemented the first few checks in 2009-12, initially named -fcatch-undefined-behavior. In 2012 -fsanitize=undefined was added and -fcatch-undefined-behavior was removed. GCC 4.9 implemented -fsanitize=undefined in 2013-08.
The runtime used by Clang lives in llvm-project/compiler-rt/lib/ubsan. GCC from time to time syncs its downstream fork of the sanitizers part of compiler-rt (libsanitizer). The end of the article lists some alternative runtime implementations.
Many sanitizers want to know every function in the program. User functions are instrumented and therefore known by the sanitizer runtime. For library functions, some (e.g. mmap, munmap, memory allocation/deallocation functions, longjmp, vfork) need special treatment. Sanitizers leverage symbol interposition to redirect such function calls to their own implementations: interceptors. Other library functions can be treated as normal user code: either instrumenting the function or providing an interceptor is fine.
In some cases instrumenting is infeasible (e.g. mem* and str*) and interceptors may be the practical choice.
This article talks about how interceptors work and the requirements of sanitizer interceptors.
As always, I mainly worked in the toolchain area. When contributing to these high-profile OSS projects, I hope to change the world from this tiny angle.
SHT_RISCV_ATTRIBUTES
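This article summarizes the classic data structure problem of finding the k-th smallest value in a range.
Given an array of length n whose elements are integers in the range [0,σ), there are m queries: find the k-th smallest element in the interval [l,r).
Some methods support an extended problem: there are m operations, each of which either modifies the element at some position or queries the k-th smallest element in the interval [l,r).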
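To make the query concrete, here is a naive baseline that answers a single query by copying the range and selecting. It only pins down the problem statement; it is not one of the data structures the article covers. The function name `kthSmallest` and the 1-based `k` are my assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the k-th smallest (k is 1-based) element of a[l..r).
// Takes O(r - l) expected time per query, so it is a reference answer
// rather than an efficient solution.
int kthSmallest(const std::vector<int> &a, std::size_t l, std::size_t r,
                std::size_t k) {
  assert(l < r && r <= a.size() && k >= 1 && k <= r - l);
  std::vector<int> tmp(a.begin() + l, a.begin() + r);
  std::nth_element(tmp.begin(), tmp.begin() + (k - 1), tmp.end());
  return tmp[k - 1];
}
```

A baseline like this is handy for checking a faster structure against random inputs. In the extended problem, a point update is simply an assignment to `a[i]` before the next query.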
Updated in 2023-05.
A control-flow graph (CFG) is a graph representation of all paths that might be traversed through a program during its execution. Control-flow integrity (CFI) refers to a security policy dictating that program execution must follow a control-flow graph. This article describes some features that compilers and hardware can use to enforce CFI, with a focus on llvm-project implementations.
CFI schemes are typically divided into forward-edge (e.g. indirect calls) and backward-edge (mainly function returns). It should be noted that exception handling and symbol interposition are not included in these categories, as far as my understanding goes.
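The idea behind a forward-edge check can be modeled by hand: before an indirect call, verify that the target belongs to a set of allowed functions. This is a toy model under my own assumptions, not how any compiler or hardware scheme is implemented. All names below (`Handler`, `kAllowedTargets`, `isAllowedTarget`, `checkedCall`) are hypothetical.

```cpp
#include <cstdlib>

using Handler = int (*)(int);

int increment(int x) { return x + 1; }
int twice(int x) { return 2 * x; }
int notAllowed(int x) { return -x; }

// Hypothetical allow-list: the only valid targets for this call site.
static const Handler kAllowedTargets[] = {increment, twice};

bool isAllowedTarget(Handler f) {
  for (Handler t : kAllowedTargets)
    if (t == f)
      return true;
  return false;
}

// Toy forward-edge check: abort instead of jumping to an unexpected target.
int checkedCall(Handler f, int x) {
  if (!isAllowedTarget(f))
    std::abort();
  return f(x);
}
```

Real schemes avoid a linear search and derive the allowed set automatically, for example from function types. The sketch only shows where the check sits relative to the call.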
Updated in 2023-09.
In GNU ld, -r produces a relocatable object file. This is known as relocatable linking or partial linking. This mode suppresses many passes done for an executable or shared object output (in -no-pie/-pie/-shared modes). -r, -no-pie, -pie, and -shared specify 4 different modes. The 4 options are mutually exclusive.
The relocatable output can be used for analysis and binary manipulation. Then, the output can be used to link the final executable or shared object.
clang -pie a.o b.o
Let's go through various linker passes and see how relocatable linking changes the operation.
This article describes how to detect C++ One Definition Rule (ODR) violations. There are many good resources on the Internet about how ODR violations can introduce subtle bugs, so I will not repeat that here.