Static initialization order fiasco and ELF
Table of Contents
- "Static initialization order fiasco" bugs
- ELF initialization and termination functions
- glibc behavior
- ld.lld -Wl,--shuffle-sections=.init_array=-1,--shuffle-sections=.fini_array=-1
Initialization
- Constant initialization and zero initialization
- Dynamic initialization
- main
- Deferred dynamic initialization (e.g. optimized out, on-demand shared library)
Dynamic initialization
- Unordered dynamic initialization (class template static data members and variable templates that are not explicitly specialized)
- Partially-ordered dynamic initialization (inline variables that are not an implicitly or explicitly instantiated specialization)
- Ordered dynamic initialization (other non-local variables)
struct A { A(); };
[basic.start.dynamic]: If V and W have ordered initialization and the definition of V is appearance-ordered before the definition of W, or if V has partially-ordered initialization, W does not have unordered initialization, and for every definition E of W there exists a definition D of V such that D is appearance-ordered before E, then ... V is sequenced before the initialization of W ... otherwise the initializations of V and W are indeterminately sequenced.
If there is no appearance-ordered relationship, the two initializations are indeterminately sequenced. The order of two initializations in different TUs is unspecified.
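A minimal single-TU sketch of the three categories. The helpers `init_log`, `record`, `position`, and `check_ordered_pair` are hypothetical names introduced for illustration. Only the relative order of the two ordinary variables is relied upon here; where the inline variable and the template specialization land in the log is up to the implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: a log that is constructed on first use, so it is
// safe to call from any dynamic initializer.
inline std::vector<int>& init_log() {
  static std::vector<int> log;
  return log;
}

// Hypothetical helper: appends `id` to the log and returns it.
inline int record(int id) {
  init_log().push_back(id);
  return id;
}

// Hypothetical helper: index of `id` in the log, or the log size if absent.
inline long position(int id) {
  const std::vector<int>& log = init_log();
  return std::find(log.begin(), log.end(), id) - log.begin();
}

// Unordered: an implicitly instantiated variable template specialization.
template <class T>
int per_type = record(30);

// Partially-ordered: an inline variable (C++17).
inline int shared_flag = record(20);

// Ordered: ordinary non-local variables, initialized in definition order
// within this TU.
int first = record(1);
int second = record(2);

// Odr-use per_type<int> so that the specialization is instantiated.
int* touch_per_type() { return &per_type<int>; }

// Self-check, meant to be called from main.
void check_ordered_pair() {
  assert(position(1) < position(2));
}
```

Moving `first` and `second` into two different source files removes the only guarantee this sketch checks.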
Static initialization order fiasco
// a.cc
// registry.cc
- SIGSEGV
- unpredictable order of the elements in a collection
- Use an unconstructed object (initial value is usually zero) to construct another object
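The last failure mode can be reproduced deterministically inside a single TU, where definition order fixes the sequence. This is only a simulation: in the real fiasco the two definitions live in different TUs and the outcome depends on link order. `load_setting`, `user`, `dep`, and `expect_stale_read` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical helper. The volatile read keeps the compiler from folding
// the call into a constant, so `dep` needs dynamic initialization.
static int load_setting() {
  volatile int raw = 42;
  return raw;
}

extern int dep;  // defined further down

// Initialized first (definition order). At this point `dep` has only been
// zero-initialized, so `user` becomes 0 + 1.
int user = dep + 1;

// Initialized second, too late for `user`.
int dep = load_setting();

// Self-check, meant to be called from main.
void expect_stale_read() {
  assert(user == 1);  // built from the zero value, not from 42
  assert(dep == 42);
}
```

Nothing crashes here, which is what makes this variant hard to notice: `user` silently holds a value derived from zero.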
Solutions
- constant initialization
- lazy initialization (dynamic initialization of function-local static, llvm::ManagedStatic, etc)
- manual initialization
- constexpr
- constinit (constexpr - const)
clang -Wglobal-constructors: warning: declaration requires a global constructor [-Wglobal-constructors]
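A minimal sketch of two of the solutions listed above, lazy initialization through a function-local static and constant initialization through a constexpr constructor. `registry`, `Registrar`, `Counter`, and `check_registry` are hypothetical names, not taken from the original registry.cc.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Lazy initialization: the vector is constructed the first time registry()
// is called, so a caller in any TU always sees a constructed object.
std::vector<std::string>& registry() {
  static std::vector<std::string> names;
  return names;
}

// Hypothetical client type that registers itself during dynamic
// initialization.
struct Registrar {
  explicit Registrar(const char* name) { registry().push_back(name); }
};
Registrar reg_a("a");
Registrar reg_b("b");

// Constant initialization: a constexpr constructor with constant arguments
// needs no code at startup. In C++20, writing `constinit Counter counter{0};`
// would make the compiler reject the definition if that stopped being true.
struct Counter {
  int value;
  constexpr explicit Counter(int v) : value(v) {}
};
Counter counter{0};

// Self-check, meant to be called from main.
void check_registry() {
  assert(registry().size() == 2);
  assert(registry()[0] == "a" && registry()[1] == "b");
  assert(counter.value == 0);
}
```

The function-local static trades the initialization-order problem for a first-use check on every call, and its destructor still runs at exit in an order that other destructors may depend on.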
GCC extension
GCC supports __attribute__((constructor)), which makes
an arbitrary function run before main.
In addition, a constructor can have an optional priority. Priorities
from 0 to 100 are reserved for the implementation
(-Wprio-ctor-dtor).
A constructor with a smaller priority value runs before one with a larger value.
For example, gcov uses
__attribute__((destructor(100))).
Applications can use 101 to 65535. 65535 (.init_array or
.ctors, without a suffix) has the same priority as a
dynamic initialization in C++.
GCC supports A a __attribute__((init_priority(2000)));
which can control the priority of a C++ dynamic initialization.
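A minimal sketch of how the priorities interact, assuming GCC or Clang targeting an ELF platform. The log helper `order`, the function names, and `check_priorities` are hypothetical; the attributes are the ones named above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical log, constructed on first use so every constructor can
// append to it.
static std::vector<int>& order() {
  static std::vector<int> log;
  return log;
}

// Declared in the "wrong" order on purpose: the priority decides, not the
// position in the file.
__attribute__((constructor(102))) static void later() { order().push_back(102); }
__attribute__((constructor(101))) static void earlier() { order().push_back(101); }

// No priority: same level as an ordinary C++ dynamic initialization (65535).
__attribute__((constructor)) static void plain() { order().push_back(65535); }

// A C++ dynamic initialization moved ahead of the default level.
struct A {
  A() { order().push_back(2000); }
};
A a __attribute__((init_priority(2000)));

// Self-check, meant to be called from main.
void check_priorities() {
  const std::vector<int>& log = order();
  assert(log.size() == 4);
  assert(log[0] == 101 && log[1] == 102 && log[2] == 2000 && log[3] == 65535);
}
```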
.ctors and .init_array
- C++ dynamic initialization
- GNU function attribute __attribute__((constructor))
- Assembly (rare), e.g. .section .init_array,"aw",@init_array
.dtors and .fini_array are for
finalization/termination/destruction/cleanup.
- System V Release 4: The implementation processes the DT_INIT (_init) function, which executes .ctors backwards
- HP-UX: The implementation processes the DT_INIT_ARRAY array forwards
For GCC, the scheme is fixed at configure time. Since GCC 4.7,
.init_array is the default on Linux.
Clang is a natural cross compiler. -fno-use-init-array
(default on PS4 (FreeBSD 9) and MinGW) selects the
.ctors/.dtors scheme.
-fuse-init-array (default elsewhere) selects the
.init_array/.fini_array scheme.
Example
// Google C++ Style Guide: Dynamic initialization of nonlocal
.section .text.startup,"ax",@progbits
GNU ld
GNU ld can merge input .init_array* and
.ctors* into the output .init_array. The
sorting may not be sound if both are used, though...
/* Fragment of GNU ld's internal linker script */
The next slide uses ld.lld, which does not do the smart merging.
Linker behavior
a.o:(.init_array.101) b.o:(.init_array.101)
## ctors_priority = 65535-init_array_priority
Note the leading zero digits. crtbegin/crtend are sentinels. ld uses
strcmp to compare two non-sentinel sections.
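A small sketch of why the leading zeros matter under a plain strcmp comparison. It assumes a five-digit, zero-padded suffix, which is what the note about leading zero digits suggests; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Section name for a constructor priority under the .init_array scheme.
std::string init_array_section(unsigned priority) {
  assert(priority <= 65535);
  char buf[32];
  std::snprintf(buf, sizeof buf, ".init_array.%05u", priority);
  return buf;
}

// Section name under the .ctors scheme: the number is inverted
// (ctors_priority = 65535 - init_array_priority).
std::string ctors_section(unsigned priority) {
  assert(priority <= 65535);
  char buf[32];
  std::snprintf(buf, sizeof buf, ".ctors.%05u", 65535 - priority);
  return buf;
}

// True if `a` sorts before `b` under a plain strcmp comparison.
bool sorts_before(const std::string& a, const std::string& b) {
  return std::strcmp(a.c_str(), b.c_str()) < 0;
}
```

With padding, `.init_array.00101` sorts before `.init_array.01000`. Without it, `.init_array.101` would sort after `.init_array.1000`, because the third digit `1` compares greater than `0`. Under the .ctors scheme the inverted number makes priority 101 sort last, which fits a table that is executed backwards.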
The .ctors approach uses magic sentinel values.
_init is assembled from function fragments spread across object files, which is
bad practice.
The .ctors elements are run in reverse order. This is a clever trick to make static linking similar to dynamic linking.
Dedicated section types (SHT_INIT_ARRAY and
SHT_FINI_ARRAY) are a good practice.
The elements are run in the forward order.
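The two traversal directions can be sketched with a plain table of function pointers. This is a simplified model: a real .ctors table is delimited by sentinel values, and glibc passes argc, argv, and envp to DT_INIT_ARRAY entries. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

using InitFn = void (*)();

// .init_array style: call every entry from first to last.
void run_forward(const InitFn* begin, const InitFn* end) {
  assert(begin <= end);
  for (const InitFn* p = begin; p != end; ++p) (*p)();
}

// .ctors style: call every entry from last to first.
void run_backward(const InitFn* begin, const InitFn* end) {
  assert(begin <= end);
  for (const InitFn* p = end; p != begin;) (*--p)();
}

// Hypothetical trace buffer, constructed on first use.
std::string& trace() {
  static std::string text;
  return text;
}

// Hypothetical constructors that record their name.
void ctor_a() { trace() += 'a'; }
void ctor_b() { trace() += 'b'; }
void ctor_c() { trace() += 'c'; }

// Runs the same table in both directions and returns the combined trace.
std::string demo() {
  const InitFn table[] = {ctor_a, ctor_b, ctor_c};
  const std::size_t n = sizeof table / sizeof table[0];
  trace().clear();
  run_forward(table, table + n);
  trace() += '|';
  run_backward(table, table + n);
  return trace();
}
```

`demo()` returns `"abc|cba"`: the same link order yields opposite run orders under the two schemes.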
C++ dynamic initialization does not promise an order between two TUs. However, an implementation has to define an order for determinism. (read "reproducible builds")
Determinism became promise. Promise became bugs.
glibc ld.so and libc behavior
- ld.so runs DT_INIT and DT_INIT_ARRAY in shared objects. If a.so depends on b.so, a.so's ctors run after b.so's
- libc_nonshared.a runs DT_INIT and DT_INIT_ARRAY in the executable
- crtbegin.o defines a _init fragment which calls .ctors constructors
Order:
- ld.so runs c.so:DT_INIT. The crtbegin.o fragment of _init calls .ctors
- ld.so runs c.so:DT_INIT_ARRAY
- ld.so runs b.so:DT_INIT. The crtbegin.o fragment of _init calls .ctors
- ld.so runs b.so:DT_INIT_ARRAY
- libc_nonshared.a runs a:DT_INIT. The crtbegin.o fragment of _init calls .ctors
- libc_nonshared.a runs a:DT_INIT_ARRAY
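The sequence above can be modelled as a post-order walk over the dependency graph. This is a simplified sketch, not glibc's actual sorting algorithm, and it assumes a plain chain in which a depends on b.so and b.so depends on c.so. `Graph`, `visit`, and `init_sequence` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: module name -> names of the modules it depends on.
using Graph = std::map<std::string, std::vector<std::string>>;

// Depth-first post-order walk: dependencies are emitted before dependents.
static void visit(const Graph& g, const std::string& name,
                  std::set<std::string>& done, std::vector<std::string>& out) {
  if (!done.insert(name).second) return;  // already handled (also stops cycles)
  Graph::const_iterator it = g.find(name);
  assert(it != g.end());  // every module must be listed in the graph
  for (const std::string& dep : it->second) visit(g, dep, done, out);
  out.push_back(name + ":DT_INIT");
  out.push_back(name + ":DT_INIT_ARRAY");
}

// Returns the initialization steps for `root` and everything it needs.
std::vector<std::string> init_sequence(const Graph& g, const std::string& root) {
  std::set<std::string> done;
  std::vector<std::string> out;
  visit(g, root, done, out);
  return out;
}
```

For the chain a, b.so, c.so this yields c.so's two steps, then b.so's, then a's, matching the list.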
On modern Linux/*BSD distributions, crtbegin.o no longer calls
.ctors. glibc's RISC-V port doesn't even define
ELF_INIT_FINI, i.e. DT_INIT is gone.
Example
// a.cc -> a
% clang -fpic -shared b.cc -o b.so
A Bazel example
- --dynamic_mode=off uses *.a (or --start-lib which has similar semantics)
- --dynamic_mode=fully uses *.so
cat > ./a0.cc <<eof
sed 's/X/b0/g' x.cc > b0.cc
Re-think .ctors vs .init_array
- b.a + c.a with .ctors: c1, c0, b1, b0, a1, a0
- b.so + c.so with .ctors: c1, c0, b1, b0, a1, a0
- b.a + c.a with .init_array: a0, a1, b0, b1, c0, c1
- b.so + c.so with .init_array: c0, c1, b0, b1, a0, a1
## .init_array
Re-think .ctors vs .init_array (cont)
With .ctors, --dynamic_mode=off and
--dynamic_mode=fully have similar orders. The orders may be
different when the glibc topological order does not agree with the build
system dependency order.
Corollary: We don't get sufficient coverage. Re-organizing libraries may expose lurking bugs.
With .init_array, --dynamic_mode=off and
--dynamic_mode=fully have very different orders.
cc_test+cc_binary give good coverage. However, the order within a
cc_library is identical.
With .init_array and --dynamic_mode=off,
testing both the regular order and the reversed order gives good
coverage.
If both a/b/c and c/b/a work, we have confidence that a, b, and c are unlikely to have an order dependency.
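The "regular plus reversed" idea can be sketched as a toy model in which each TU's initializer reports whether everything it needs is already initialized. This is only an illustration of the reasoning, not how a linker or test runner works; `Init`, `all_ok`, and `survives_both_orders` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical model of one TU's initializer: it sees the ids of the TUs
// that are already initialized and reports whether it ran safely.
struct Init {
  int id;
  std::function<bool(const std::vector<int>&)> run;
};

// Runs all initializers in the given order; true if every one succeeded.
bool all_ok(const std::vector<Init>& order) {
  std::vector<int> ready;
  for (const Init& init : order) {
    assert(init.run && "initializer must be callable");
    if (!init.run(ready)) return false;
    ready.push_back(init.id);
  }
  return true;
}

// Mirrors the test described above: regular order plus reversed order.
bool survives_both_orders(std::vector<Init> inits) {
  const bool forward = all_ok(inits);
  std::reverse(inits.begin(), inits.end());
  return forward && all_ok(inits);
}
```

An initializer that requires TU 0 to be ready passes in the regular order 0, 1 and fails once the order is reversed, so the hidden dependency surfaces without having to try every permutation.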
The ELF specification's guarantee that "if a.so depends on b.so, then b.so's constructors run first" can hardly be leveraged.
--shuffle-sections=.init_array=-1
--shuffle-sections=<section-glob>=<seed>:
Shuffle matched input sections using the given seed before mapping them
to the output sections.
If -1, reverse the section order.
--shuffle-sections=.init_array=-1: reverse the order of
.init_array input sections.
.init_array.priority sections are unaffected.
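A toy model of the `-1` behavior described above: sections named exactly `.init_array` swap places among themselves, while `.init_array.priority` sections keep their slots. This is a sketch of the semantics only, not lld's implementation; `InputSection`, `reverse_plain_init_array`, and `check_reverse` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct InputSection {
  std::string name;   // e.g. ".init_array" or ".init_array.00101"
  std::string owner;  // hypothetical label for the object file
};

// Reverses the relative order of the sections whose name is exactly
// ".init_array" and leaves every other section in its slot.
std::vector<InputSection> reverse_plain_init_array(
    std::vector<InputSection> sections) {
  std::vector<std::size_t> slots;
  for (std::size_t i = 0; i < sections.size(); ++i)
    if (sections[i].name == ".init_array") slots.push_back(i);
  for (std::size_t lo = 0, hi = slots.size(); lo + 1 < hi; ++lo, --hi)
    std::swap(sections[slots[lo]], sections[slots[hi - 1]]);
  return sections;
}

// Self-check, meant to be called from main.
void check_reverse() {
  std::vector<InputSection> out = reverse_plain_init_array({
      {".init_array.00101", "p.o"},
      {".init_array", "a.o"},
      {".init_array", "b.o"},
      {".init_array", "c.o"}});
  assert(out.size() == 4);
  assert(out[0].owner == "p.o");  // priority section untouched
  assert(out[1].owner == "c.o" && out[2].owner == "b.o" && out[3].owner == "a.o");
}
```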
AddressSanitizer check_initialization_order=true
Enabled by default due to strict_init_order=true
This checks that a dynamic initialization does not touch the memory regions of other global variables.
My feeling: it catches fewer than 15% of the bugs.