Updated in 2023-03.
In C++, dynamic initializations for non-local variables happen before
the first statement of the main
function. All (most?)
implementations just ensure such dynamic initializations happen before
main
.
As an extension, GCC supports
__attribute__((constructor))
which can make an arbitrary
function run before main
. A constructor function can have
an optional priority (__attribute__((constructor(N)))
).
Priorities from 0 to 100 are reserved for the implementation
(-Wprio-ctor-dtor
catches violations), e.g. gcov uses
__attribute__((destructor(100)))
. Applications can use 101
to 65535. 65535 (.init_array
or .ctors
,
without a suffix) has the same priority as a non-local variable's
dynamic initialization in C++.
struct S { S(); };
Under the hood, on ELF platforms, the initialization functions or
constructors are implemented in two schemes. The legacy one uses
.init
/.ctors
while the new one uses
.init_array
.
.section .text.startup,"ax",@progbits
.init
and .fini
System V release 4 introduced the dynamic tags DT_INIT
and DT_FINI
to implement ELF initialization and termination
functions. Today it is difficult to figure out what it actually did, but
it was likely similar to the GCC scheme described below.
On a GCC+glibc system, traditionally the section .init
in an executable/shared object consisted of four fragments:
glibc crti.o:(.init) _init
The linker combines .init
input sections and places the
fragments into the .init
output section. _init
is defined at offset 0 in the first input section, so its address equals
the address of the .init
output section. The linker defines
DT_INIT
according to the value of _init
(which
can be changed by the -init
linker option). In the absence
of .dynamic
, DT_INIT
does not exist. The
runtime references the symbol _init
.
.fini
is similar:
glibc crti.o:(.fini) _fini
GCC crtbegin.o:(.fini) # not existent on modern systems
GCC crtend.o:(.fini) # not existent on modern systems
glibc crtn.o:(.fini)
The linker defines DT_FINI
according to the value of
_fini
(which can be changed by the -fini
linker option).
In glibc x86-64, sysdeps/x86_64/crti.S
and
sysdeps/x86_64/crtn.S
provide the definitions for
crti.o
and crtn.o
:
# crti.o
crti.o
calls __gmon_start__
(gmon profiling
system) if defined. This is used by gcc -pg
.
musl just provides empty crti.o
and
crtn.o
.
.ctors
and
.dtors
In GCC libgcc/crtstuff.c
, when
__LIBGCC_INIT_ARRAY_SECTION_ASM_OP__
is not defined and
__LIBGCC_INIT_SECTION_ASM_OP__
is defined
(HAVE_INITFINI_ARRAY_SUPPORT
is 0 in
$builddir/gcc/auto-host.h
), the following scheme is used.
Note: the condition is not satisfied on modern systems.
C++ dynamic initializations and
__attribute__((constructor))
do not use _init
directly. They are implemented as ELF functions. The addresses are
collected in the .ctors
section which will be called by the
runtime. Assume that we have two object files a.o
and
b.o
with .ctors
sections of different
priorities. The layout of the .ctors
output section is:
crtbegin.o:(.ctors) __CTOR_LIST__
.dtors
is similar:
crtbegin.o:(.dtors) __DTOR_LIST__
a.o:(.dtors) b.o:(.dtors)
a.o:(.dtors.00001) b.o:(.dtors.00001)
a.o:(.dtors.00002) b.o:(.dtors.00002)
...
a.o:(.dtors.65533) b.o:(.dtors.65533)
a.o:(.dtors.65534) b.o:(.dtors.65534)
...
crtend.o:(.dtors) __DTOR_LIST_END__
- crtbegin.o defines .ctors and .dtors with one element, -1 (0xffffffff on 32-bit platforms and 0xffffffffffffffff on 64-bit platforms).
- crtend.o defines .ctors and .dtors with one element, 0.
- crtend.o defines a .init section which calls __do_global_ctors_aux. __do_global_ctors_aux calls the static constructors in the .ctors section. The -1 and 0 sentinels are skipped.
- crtbegin.o defines a .fini section which calls __do_global_dtors_aux. __do_global_dtors_aux calls the static destructors in the .dtors section. The -1 and 0 sentinels are skipped.
Reversed execution order
Here is an interesting property: .ctors
elements are run
in the reversed order and .dtors
elements are run in the
regular order. E.g. for a.o:(.ctors) b.o:(.ctors)
, b.o's
constructor runs before a.o's.
This is to make dynamic linking similar to static linking for
.ctors
sections without a suffix (having the lowest
priority).
The origin may be related to a generic ABI promise: if a.so depends
on b.so, then b.so's constructors run first. If we only look at
.ctors
sections without a suffix, the behavior of
ld main.o a.so b.so
may be quite similar to the static
linking ld main.o a.a b.a
.
.dtors
can be seen as undoing .ctors
, so
its order is the reverse of .ctors
, which is the regular
order.
.init_array
and
.fini_array
HP-UX developers noticed that the
.init
/.ctors
scheme has multiple
problems:
- A fragmented _init function is ugly and error-prone.
- Sentinel values in .ctors are ugly.
- .init and .ctors use magic names instead of dedicated section types.
They invented DT_INIT_ARRAY
as an alternative. glibc
implemented the scheme in 1999.
The GCC and binutils implementations were also quite old.
FreeBSD added support in 2012-03.
OpenBSD added support in 2016-08.
NetBSD made DT_INIT_ARRAY
available for all ports in 2018-12.
glibc and BSD implementations call the constructors with
argc, argv, environ
while musl calls the constructors
with no argument.
In this scheme, .init_array
and
.init_array.N
sections have a dedicated type
SHT_INIT_ARRAY
. crtbegin.o
and
crtend.o
do not provide fragments.
Below is a layout.
a.o:(.init_array.1) b.o:(.init_array.1)
Note: ctors_priority = 65535-init_array_priority
The linker defines DT_INIT_ARRAY
and
DT_INIT_ARRAYSZ
according to the address and size of
.init_array
. The linker also defines
__init_array_start
and __init_array_end
if
referenced. The pair of symbols can be used by a statically linked
position dependent executable which may not have
.dynamic
.
Unlike .ctors
, the execution order of
.init_array
is forward (follows .init
).
a.o:(.init_array) b.o:(.init_array)
has a different order
from a.o:(.ctors) b.o:(.ctors)
. In a future section we will
discuss that this difference can expose a type of very subtle bugs
called "static initialization order fiasco".
In GCC, newer ABI implementations like AArch64 and RISC-V only use
.init_array
and don't provide .ctors
.
.preinit_array
The linker defines DT_PREINIT_ARRAY
and
DT_PREINIT_ARRAYSZ
according to the address and size of
.preinit_array
. The linker also defines
__preinit_array_start
and __preinit_array_end
if referenced.
The generic ABI says:
DT_PREINIT_ARRAY: This element holds the address of the array of pointers to pre-initialization functions, discussed in ``Initialization and Termination Functions'' below. The DT_PREINIT_ARRAY table is processed only in an executable file; it is ignored if contained in a shared object.
DT_PREINIT_ARRAY
only applies to the executable. This
feature gives the executable a way to run initialization functions
before shared object dependencies.
There is no .postfini_array
.
Most ld.so implementations support DT_PREINIT_ARRAY
.
musl does not support the feature. See add
preinit_array support.
Runtime behavior
The generic ABI says:
If an object contains both DT_INIT and DT_INIT_ARRAY entries, the function referenced by the DT_INIT entry is processed before those referenced by the DT_INIT_ARRAY entry for that object. If an object contains both DT_FINI and DT_FINI_ARRAY entries, the functions referenced by the DT_FINI_ARRAY entry are processed before the one referenced by the DT_FINI entry for that object.
If the executable a
depends on b.so
and
c.so
(in order), the glibc ld.so and libc behavior is:
- ld.so runs c.so:DT_INIT. The crtend.o fragment of _init calls .ctors
- ld.so runs c.so:DT_INIT_ARRAY
- ld.so runs b.so:DT_INIT. The crtend.o fragment of _init calls .ctors
- ld.so runs b.so:DT_INIT_ARRAY
- libc_nonshared.a runs a:DT_INIT. The crtend.o fragment of _init calls .ctors
- libc_nonshared.a runs a:DT_INIT_ARRAY
As a new ABI, glibc's RISC-V port doesn't define
ELF_INIT_FINI
, so DT_INIT
does not run.
Here is a test for the execution order of atexit
and
DT_FINI_ARRAY
.
printf > a.c %s '
#include <stdio.h>
#include <stdlib.h>
void hook() { puts("atexit"); }
__attribute__((used,retain)) void fini() { puts("fini"); }
asm(".pushsection .fini_array,\"aw\",@fini_array; .quad fini; .popsection");
__attribute__((constructor)) void ctor() { atexit(hook); }
int main() {}
'
printf > b.c %s '
#include <stdio.h>
#include <stdlib.h>
void hook_b() { puts("atexit b"); }
__attribute__((constructor)) void ctor_b() { atexit(hook_b); }
__attribute__((used,retain)) void fini_b() { puts("fini b"); }
asm(".pushsection .fini_array,\"aw\",@fini_array; .quad fini_b; .popsection");
'
printf %s '
.MAKE.MODE := meta curdirOk=1
a: a.c b.so
${CC} -Wl,--no-as-needed -Wl,-rpath=. $> -o $@
b.so: b.c
${CC} -fpic -shared $> -o $@
' | sed 's/ /\t/' > ./Makefile
musl ensures that atexit registered hooks run before
DT_FINI_ARRAY
.
# musl
% ./a
atexit
atexit b
fini
fini b
# glibc, FreeBSD
% ./a
atexit
fini
fini b
atexit b
.ctors
to
.init_array
transition
In 2010-12, Mike Hommey filed Replace .ctors/.dtors with .init_array/.fini_array on targets supporting them which I believe was related to his ELF hack work for Firefox.
Switching sections needed to consider backward compatibility: how to
handle old object files using .ctors
sections. H.J. Lu
proposed that the internal linker script of GNU ld could be changed to
place .ctors .ctors.N .init_array .init_array.N
input
sections into the .init_array
output section in RFC: Support mixing .init_array.* and .ctors.* input sections
.
With this GNU ld support, GCC 4.7 made the switch.
// a.cc -> a
% clang -fpic -shared b.cc -o b.so
gold doesn't have the concept of an internal linker script. Ian Lance
Taylor added the enabled-by-default linker option --ctors-in-init-array
to emulate the GNU ld behavior.
Since .ctors
is rare, ld.lld does not implement
converting .ctors
into .init_array
.
You can link the output with a linker script fragment:
OVERWRITE_SECTIONS {
.init_array : { *(SORT_BY_INIT_PRIORITY(.init_array.*)) *(.init_array) *(SORT_BY_INIT_PRIORITY(.ctors.*)) *(.ctors) }
}
GCC vs Clang
GCC's .ctors
/.init_array
choice is a
configure option --enable-initfini-array
.
Clang uses a CC1 option -fno-use-init-array
. This makes
cross compilation and testing multiple targets in one build
convenient.
In the llvm-project supported toolchains, only MinGW and PlayStation
4 still use .ctors
for the latest version. For MinGW, this
is related to the fact that PE/COFF does not have section types and the
MinGW runtime doesn't have the .init
pain, so there isn't
motivation for a switch. For PlayStation 4, it is presumably related to
the fact that PlayStation 4 uses a modified FreeBSD 9 image. I saw
.ctors
patches to llvm-project in 2021.
Linux remnant of
.ctors
in 2021
If you don't use prebuilt object files from GCC<4.7, it is
difficult to see .ctors
on Linux in 2021. However, I found
two exceptions.
First, a libgcc file for the split stack implementation had
.ctors.65535
assembly code. I filed morestack.S should support .init_array.0 besides .ctors.65535
which was fixed in 2021-10.
Second, GCC cross compilers targeting Linux did not enable
--enable-initfini-array
. H.J. Lu reported --enable-initfini-array should be enabled for cross compiler to Linux
and fixed it for GCC 12. This affected GCC 11 builds by
scripts/build-many-glibcs.py
.
Manually convert
.ctors
to .init_array
Run
objcopy --rename-section .ctors=.init_array --rename-section .dtors=.fini_array $file
.
Rarely, .ctors.$x
may be present. Convert such a section
to .init_array.$((65535-$x))
.
C++ dynamic initialization
In a typical C++ object, most .init_array
elements are
dynamic initializations, so I will spend some paragraphs describing
it.
The standard defines the order for various initializations.
- Constant initialization and zero initialization
- Dynamic initialization
- main
- Deferred dynamic initialization (e.g. optimized out, on-demand shared library)
Dynamic initialization has three types with different degrees of order guarantee:
- Unordered dynamic initialization (static data members and variable templates not explicitly specialized)
- Partially-ordered dynamic initialization (inline variables that are not an implicitly or explicitly instantiated specialization)
- Ordered dynamic initialization (other non-local variables)
Basically, in one translation unit, the order of dynamic
initializations usually matches the intuition, e.g. a
's
initializations happen before b
's below.
struct A { A(); };
C++ static initialization order fiasco
If no appearance-ordered relationship is defined, we say that two initializations are indeterminately sequenced. Relying on a particular order is called "static initialization order fiasco". (I don't know what "static" refers to. Perhaps it refers to static variables or static storage duration.)
Below is a registry example. The order that a, b, and c are registered depends on the link order. If somehow only one order works, then the program may be brittle.
// registry.cc
Fixing such bugs requires thinking about the initialization order. Basically, one needs to do one of the following:
- constant initialization
- lazy initialization (dynamic initialization of function-local static, llvm::ManagedStatic, etc)
- manual initialization
- Nifty Counter idiom
Some ways to prevent static initialization order fiasco:
- constexpr
- constinit (constexpr - const)
- clang -Wglobal-constructors: warning: declaration requires a global constructor [-Wglobal-constructors]
AddressSanitizer check_initialization_order
is enabled
by default due to strict_init_order
. It enforces that a
dynamic initialization does not touch memory regions of other global
variables. Unfortunately, in practice it misses many cases.
ld.lld --shuffle-sections=.init_array=-1
In ld.lld, Rafael Espindola added --shuffle-sections
with the motivation of making tests more robust. I changed the option to apply to
.init_array
/.fini_array
as well and later
changed it to the current form:
--shuffle-sections=<section-glob>=<seed>
:
shuffle matched input sections using the given seed before mapping them
to the output sections. I specialized the seed value -1 to mean the
deterministic reversed order.
You can specify
--shuffle-sections=.init_array=-1 --shuffle-sections=.fini_array=-1
to reverse the input section order. It's unclear whether
.fini_array
needs to be reversed as well, but it is safe to
do so. This does not change .init_array.N
and
.fini_array.N
, but in practice static initialization order
fiascos caused by prioritized sections are rare.
In a linker script, you can use the REVERSE
keyword (https://reviews.llvm.org/D145381) in ld.lld.
In practice, most static initialization order fiasco bugs are due to the order between two translation units. For a mostly statically linked executable, testing the regular order and the reversed order is sufficient to catch such bugs.
However, for a program with many shared objects, the checks may be
insufficient. An executable and all its DT_NEEDED
transitive dependencies form a dependency graph. The generic ABI
requirement "if a.so depends on b.so, then b.so's initialization
functions run first" imposes some orders in the graph.
Suppose the executable a
depends on b.so
and
c.so
, and b.so
and c.so
are
unrelated. We may consider it a bug if c.so
's
initialization functions need to run before b.so
, but the
linker cannot do anything to improve the checks. There is also no ld.so
feature altering the order. A manual way is to change the link order of
b.so
and c.so
, but it may be difficult to do
so in a build system.