In ISO C++ standards, [basic.start.term] specifies that:
Constructed objects ([dcl.init]) with static storage duration are destroyed and functions registered with std::atexit are called as part of a call to std::exit ([support.start.term]). The call to std::exit is sequenced before the destructions and the registered functions. [Note 1: Returning from main invokes std::exit ([basic.start.main]). — end note]
For example, consider the following code:
struct A { ~A(); } a;
The destructor for object a will be registered for execution at program termination.
__cxa_atexit
The Itanium C++ ABI employs __cxa_atexit rather than atexit for object destructor registration for two primary reasons:
- Limited atexit guarantee: ISO C (up to C23) only guarantees support for at least 32 registered functions, although most implementations support many more.
- Dynamic library unloading: __cxa_atexit provides a mechanism for handling destructors when dynamic libraries are unloaded via dlclose before program termination.
Several standard libraries, including glibc, musl, and FreeBSD libc, implement atexit using __cxa_atexit.
- In glibc, atexit returns __cxa_atexit ((void (*) (void *)) func, NULL, __dso_handle), where __dso_handle is part of libc itself.
- musl uses 0 instead of __dso_handle.
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor-runtime-api provides detailed documentation on object destruction mechanisms. Let's illustrate this with a GCC and glibc example:
1 | cat > a.cc <<'eof' |
An invocation yields:
1 | foo |
Key points:
- The compiler registers destructors with __cxa_atexit using the __dso_handle symbol as an argument.
- crtbeginS.o defines the .fini_array section (triggering __do_global_dtors_aux) and the hidden symbol __dso_handle.
- Since 2017, lld defines __dso_handle as a hidden symbol if crtbegin does not.
- dlclose invokes .fini_array functions.
- __cxa_finalize(d) iterates through the termination function list, calling matching destructors based on the DSO handle.
- __cxa_atexit implementations typically allocate memory dynamically and may fail. The failures are simply ignored.
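The registration described in the first key point can be approximated by hand. The following is a minimal sketch, assuming an Itanium C++ ABI platform (for example Linux with GCC or Clang) where __cxa_atexit and __dso_handle are available at link time. The names storage, destroy_a, and construct_and_register are hypothetical, and the exact code a compiler emits will differ.

```cpp
#include <cstdio>
#include <new>

// Declarations from the Itanium C++ ABI. Normally only compiler-generated
// code refers to these symbols; declaring them by hand is an assumption of
// this sketch. __dso_handle is provided by crtbegin or by the linker.
extern "C" int __cxa_atexit(void (*func)(void *), void *arg, void *dso_handle);
extern "C" void *__dso_handle;

struct A {
  int v = 42;
  ~A() { std::puts("~A"); }
};

// Raw storage, so that the compiler does not register a destructor itself.
alignas(A) static unsigned char storage[sizeof(A)];

// Adapter with the void(void *) signature that __cxa_atexit expects.
static void destroy_a(void *p) { static_cast<A *>(p)->~A(); }

// Constructs the object and registers its destructor for this DSO.
// Returns 0 on success, like __cxa_atexit itself.
int construct_and_register() {
  A *a = new (storage) A;
  return __cxa_atexit(destroy_a, a, &__dso_handle);
}
```

Calling construct_and_register() once from main should cause ~A to run during std::exit, or earlier if the containing shared object is unloaded and the runtime calls __cxa_finalize with the matching handle.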
Note: In glibc, the DF_1_NODELETE flag marks a shared object as not unloadable, so dlclose will not unload it. Additionally, a symbol lookup that binds to an STB_GNU_UNIQUE symbol automatically sets this flag.
musl provides a no-op implementation for dlclose and __cxa_finalize.
Thread storage duration variables
Objects with thread storage duration that have non-trivial destructors will register those destructors using __cxa_thread_atexit during construction.
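The observable effect can be shown with a small sketch: a thread_local object is constructed on first use in a thread, and its destructor runs when that thread exits. The names Session, tls_destructor_runs, and run_worker are hypothetical, and some toolchains may need -pthread to build this.

```cpp
#include <atomic>
#include <thread>

// Counts destructor invocations across all threads.
std::atomic<int> tls_destructor_runs{0};

struct Session {
  int id = 0;
  ~Session() { ++tls_destructor_runs; }  // non-trivial destructor
};

// Each thread that uses `session` gets its own copy, and the copy's
// destructor is registered when that copy is constructed.
thread_local Session session;

// Runs a worker thread that touches `session`, then reports how many
// destructors have run once the worker has been joined.
int run_worker() {
  std::thread t([] { session.id = 1; });  // first use constructs the copy
  t.join();                               // the worker's copy is destroyed by now
  return tls_destructor_runs.load();
}
```

The main thread never touches session in this sketch, so only the worker's copy is constructed and destroyed.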
When exit-time destructors are undesired
Exit-time destructors for static and thread storage duration variables can be undesirable for several reasons:
- Unnecessary overhead and complexity: This includes operating system kernels and memory-constrained systems.
- Potential race conditions: Destructors might execute during thread termination, while other threads still attempt to access the object. Examples: webkit
Clang provides -Wexit-time-destructors (disabled by default) to warn about exit-time destructors.
1 | % clang++ -c -Wexit-time-destructors g.cc |
Disabling exit-time destructors
The following sections describe some approaches to disabling exit-time destructors.
Pointer/reference to a dynamically-allocated object
We can use a reference or pointer that refers to a dynamically-allocated object.
1 | struct A { int v; ~A(); }; |
This approach prevents the destructor from running at program exit, as pointers and references have a trivial destructor. Note that this does not create a memory leak, since the pointer/reference is part of the root set.
The primary downside is unnecessary pointer indirection when accessing the object. Additionally, this approach uses a mutable pointer in the data segment and requires a memory allocation.
1 | # %bb.2: // initializer |
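As a minimal sketch of this approach, the following uses a namespace-scope reference bound to a heap object. The destructor_runs counter is a hypothetical addition used only to make the behavior observable.

```cpp
// Hypothetical counter that makes destructor calls observable.
int destructor_runs = 0;

struct A {
  int v = 0;
  ~A() { ++destructor_runs; }
};

// The reference itself has nothing to destroy, so no exit-time destructor is
// registered. The A object is intentionally never deleted.
A &a = *new A;
```

Every access to a goes through the stored pointer, which is the indirection cost mentioned above.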
Class template with an empty destructor
A common approach, as outlined in P1247, is to use a class template with an empty destructor to prevent exit-time destruction:
1 | template <class T> class no_destroy { |
libstdc++ employs a variant that uses a union member.
1 | struct A { ~A(); }; |
C++20 supports constexpr destructors:
1 | template <class T> union no_destroy { |
Libraries like absl::NoDestructor and folly::Indestructible offer similar functionality. The absl version optimizes for trivially destructible types.
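The union-based idea can be sketched as follows. This is an illustrative wrapper, not the code used by libstdc++, absl, or folly, and the Tracked type and destructor_runs counter are hypothetical helpers for observing the effect.

```cpp
#include <utility>

// Hypothetical counter that makes destructor calls observable.
int destructor_runs = 0;

struct Tracked {
  int id;
  explicit Tracked(int i) : id(i) {}
  ~Tracked() { ++destructor_runs; }
};

// A union never destroys its members implicitly. The user-provided empty
// destructor therefore leaves `value` alive forever.
template <class T> union no_destroy {
  T value;

  template <class... Args>
  explicit no_destroy(Args &&...args) : value(std::forward<Args>(args)...) {}

  ~no_destroy() {}  // intentionally does not call value.~T()
};

// Usable at namespace scope: ~Tracked never runs for this object.
no_destroy<Tracked> tracked(7);
```

Because ~no_destroy is user-provided, a compiler may still register the empty destructor at startup unless it optimizes the registration away.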
Compiler optimization for no-op destructors
Ideally, compilers should optimize out exit-time destructors for empty user-provided destructors:
1 | struct C { C(); ~C() {} }; |
LLVM has performed this optimization since 2011. Its GlobalOpt pass eliminates __cxa_atexit calls related to empty destructors, along with other global variable optimizations.
In contrast, GCC has had an open feature request for this optimization since 2005.
no_destroy attribute
Clang supports [[clang::no_destroy]] (alternative form: __attribute__((no_destroy))) to disable exit-time destructors for variables of static or thread storage duration. Its -fno-c++-static-destructors option allows disabling exit-time destructors globally.
- July 2018 discussion: https://discourse.llvm.org/t/rfc-suppress-c-static-destructor-registration/49128
- Patch: https://reviews.llvm.org/D50994 with follow-up https://reviews.llvm.org/D54344
- Documentation: https://clang.llvm.org/docs/AttributeReference.html#no-destroy
Standardization efforts for this attribute are underway (P1247R0).
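A minimal usage sketch of the attribute follows. The SKETCH_NO_DESTROY macro and the Registry type are hypothetical. The macro expands to nothing on compilers that do not report support for the attribute, in which case the destructor still runs at exit.

```cpp
#include <cstdio>

// Use the attribute only where the compiler reports support for it.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::no_destroy)
#define SKETCH_NO_DESTROY [[clang::no_destroy]]
#endif
#endif
#ifndef SKETCH_NO_DESTROY
#define SKETCH_NO_DESTROY  // fallback: the object is destroyed as usual
#endif

struct Registry {
  int entries = 0;
  ~Registry() { std::puts("~Registry"); }  // non-trivial destructor
};

// With Clang, no exit-time destructor is registered for this variable.
SKETCH_NO_DESTROY Registry registry;
```

Built with Clang, the program should exit without printing ~Registry. With a compiler that lacks the attribute, the message is printed at exit.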
I recently encountered a scenario where the no_destroy attribute would have been beneficial. I've filed a GCC feature request (PR114357) after I learned that GCC doesn't have the attribute.
Case study
LLVM provides ManagedStatic to construct an object on demand (good for reducing startup time) and to make destruction explicit through llvm_shutdown.
ManagedStatic is intended to be used at namespace scope. A prime example is LLVM's statistics mechanisms (-stats and -time-passes).
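The general pattern (lazy construction plus an explicit shutdown call) can be sketched as below. This is not LLVM's implementation: the names sketch::lazy_static, sketch::shutdown, and Stats are hypothetical, and the sketch is not thread-safe.

```cpp
#include <vector>

namespace sketch {

// One pending cleanup: a type-erased destroy function and its target.
struct cleanup {
  void (*fn)(void *);
  void *obj;
};

// The registry lives on the heap and is never freed, so the registry itself
// has no exit-time destructor.
inline std::vector<cleanup> &cleanups() {
  static std::vector<cleanup> *list = new std::vector<cleanup>;
  return *list;
}

// Constant-initialized wrapper: nothing is constructed until get() is called.
template <class T> class lazy_static {
  T *ptr_ = nullptr;

  static void destroy(void *self) {
    lazy_static *s = static_cast<lazy_static *>(self);
    delete s->ptr_;
    s->ptr_ = nullptr;
  }

public:
  T &get() {
    if (!ptr_) {
      ptr_ = new T();
      cleanups().push_back(cleanup{&destroy, this});
    }
    return *ptr_;
  }
  bool constructed() const { return ptr_ != nullptr; }
};

// Destroys all constructed objects in reverse order of construction.
inline void shutdown() {
  std::vector<cleanup> &list = cleanups();
  while (!list.empty()) {
    cleanup c = list.back();
    list.pop_back();
    c.fn(c.obj);
  }
}

} // namespace sketch

struct Stats {
  int passes_run = 0;
};

// Namespace-scope use, with no constructor or destructor run at startup or exit.
sketch::lazy_static<Stats> stats;
```

A program that never calls sketch::shutdown() simply skips the destructors, which is the fast-teardown trade-off discussed next.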
Programs using LLVM can strategically avoid calling llvm_shutdown for fast teardown by skipping some destructors. The lld linker employs this approach unless the LLD_IN_TEST environment variable is set to a non-zero integer.
DSO plugin users requiring library unloading may find ManagedStatic unsuitable. This is because:
- A DSO may not be able to determine if other active LLVM users exist within the process, making it unsafe to call llvm_shutdown.
- If llvm_shutdown is deferred until around program exit, executing destructors becomes unsafe once the DSO's code has been removed.
The mold linker improves perceived linking speed by spawning a separate process for the linking task. This allows the parent process (the one launched from the shell or other programs) to exit early. This approach eliminates overhead associated with static destructors and other operations.