C++ exception handling ABI

Updated in 2024-11.

I wrote an article a few weeks ago to introduce stack unwinding in detail. Today I will introduce C++ exception handling, an application of stack unwinding. Exception handling has a variety of ABIs (for interoperability of C++ implementations), the most widely used of which is Itanium C++ ABI: Exception Handling.

Itanium C++ ABI: Exception Handling

Simplified exception handling process (from throw to catch):

  • Call __cxa_allocate_exception to allocate space to store the exception object and the exception header __cxa_exception
  • Jump to __cxa_throw, set the __cxa_exception fields and then jump to _Unwind_RaiseException
  • In _Unwind_RaiseException, execute the search phase, call personality routines to find matching try catch (type matching)
  • In _Unwind_RaiseException, execute the cleanup phase: call personality routines to find stack frames containing out-of-scope variables, and for each stack frame, jump to its landing pad to execute the destructors. The landing pad uses _Unwind_Resume to resume the cleanup phase
  • The cleanup phase executed by _Unwind_RaiseException jumps to the landing pad corresponding to the matching try catch
  • The landing pad calls __cxa_begin_catch, executes the catch code, and then calls __cxa_end_catch
  • __cxa_end_catch decreases the handler count of the exception object, and if it reaches zero, it also destroys the exception object

Note: each stack frame may use a different personality routine. It is common that all frames share the same routine, though.

Among these steps, _Unwind_RaiseException is responsible for stack unwinding and is language independent. The language-related concepts (catch block, out-of-scope variable) in stack unwinding are interpreted/encapsulated by the personality. This is a key idea that makes the ABI applicable to other languages and allows other languages to be mixed with C++.

Therefore, Itanium C++ ABI: Exception Handling is divided into Level 1 Base ABI and Level 2 C++ ABI. Base ABI describes the language-independent stack unwinding part and defines the _Unwind_* API. Common implementations are:

  • libgcc: libgcc_s.so.1 and libgcc_eh.a
  • Multiple libraries named libunwind (libunwind.so or libunwind.a). With Clang, --rtlib=compiler-rt --unwindlib=libunwind selects libunwind at link time; either llvm-project/libunwind or nongnu.org/libunwind can be used

The C++ ABI is related to the C++ language and defines the __cxa_* API (__cxa_allocate_exception, __cxa_throw, __cxa_begin_catch, etc.). Common implementations are:

  • libsupc++, part of libstdc++
  • libc++abi in llvm-project

The C++ standard library implementation in llvm-project, libc++, can leverage libc++abi, libcxxrt or libsupc++, but libc++abi is recommended.

Level 1 Base ABI

Data structures

The main data structure is:

// Level 1
struct _Unwind_Exception {
  _Unwind_Exception_Class exception_class; // an identifier, used to tell whether the exception is native
  _Unwind_Exception_Cleanup_Fn exception_cleanup;
  _Unwind_Word private_1; // zero: normal unwind; non-zero: forced unwind, the _Unwind_Stop_Fn function
  _Unwind_Word private_2; // saved stack pointer
} __attribute__((aligned));

exception_class and exception_cleanup are set by the API that throws exceptions in Level 2. The Level 1 API does not process exception_class, but passes it to the personality routine. Personality routines use this value to distinguish native and foreign exceptions.

libc++abi's __cxa_throw sets exception_class to the uint64_t representation of "CLNGC++\0", while libsupc++ uses "GNUCC++\0". By convention the high four bytes identify the vendor and the low four bytes identify the language; for C++ the low four bytes are "C++\0". Exceptions thrown by libsupc++ (libstdc++) are therefore treated as foreign exceptions by libc++abi. Only catch (...) can catch foreign exceptions.

The exception propagation mechanism (std::exception_ptr and std::rethrow_exception) uses another exception_class identifier to represent dependent exceptions.
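As a small sketch of how these identifiers are laid out, the following packs the eight characters into a 64-bit value and checks the language bytes. The helper names (make_exception_class, is_cxx, same_vendor_and_language) are made up for illustration, and masking off the last byte to compare vendor and language is my understanding of how libc++abi treats primary and dependent exceptions alike, not something the ABI mandates.

```cpp
#include <cassert>
#include <cstdint>

// Pack eight characters into a 64-bit exception class, with the first
// character in the most significant byte. (Hypothetical helper.)
constexpr uint64_t make_exception_class(const char (&s)[9]) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v = (v << 8) | static_cast<unsigned char>(s[i]);
  return v;
}

constexpr uint64_t kClang = make_exception_class("CLNGC++\0");
constexpr uint64_t kGnu = make_exception_class("GNUCC++\0");

// The low four bytes name the language: 'C' '+' '+' '\0'.
constexpr bool is_cxx(uint64_t cls) {
  return (cls & 0xFFFFFFFFu) == 0x432B2B00u;
}

// Compare vendor and language while ignoring the last byte, which is
// assumed here to distinguish primary from dependent exceptions.
constexpr bool same_vendor_and_language(uint64_t a, uint64_t b) {
  return (a & ~0xFFull) == (b & ~0xFFull);
}
```

With this layout both runtimes produce C++ exception classes, yet neither recognizes the other's exceptions as native.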

exception_cleanup stores the destroying delete function of this exception object, which is used by __cxa_end_catch to destroy a foreign exception.

The private unwinder state (private_1 and private_2) in an exception object should be neither read nor written by personality routines or other parts of the language-specific runtime.

The information required for the unwind operation (for a given IP/SP, how to obtain register values such as the IP/SP of the caller's stack frame) is implementation-dependent, and the Level 1 ABI does not define it. On ELF systems, .eh_frame and .eh_frame_hdr (PT_GNU_EH_FRAME program header) store unwind information. See Stack unwinding.

Level 1 API

_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *obj); Perform stack unwinding for exceptions. It is noreturn under normal circumstances, and will give control to matched catch handlers (catch block) or non-catch handlers (code blocks that need to execute destructors) like longjmp. It is a two-phase process, divided into phase 1 (search phase) and phase 2 (cleanup phase).

  • In the search phase, find matched catch handler and record the stack pointer in private_2
    • Trace the call chain based on IP/SP and other saved registers
    • For each stack frame, skip if there is no personality routine; call if there is (actions set to _UA_SEARCH_PHASE)
    • If personality returns _URC_CONTINUE_UNWIND, continue searching
    • If personality returns _URC_HANDLER_FOUND, it means that a matched catch handler or unmatched exception specification is found, and the search stops
  • In the cleanup phase, jump to non-catch handlers (usually local variable destructors), and then transfer the control to the matched catch handler located in the search phase
    • Trace the call chain based on IP/SP and other saved registers
    • For each stack frame, skip if there is no personality routine; call if there is one (actions are set to _UA_CLEANUP_PHASE, and the stack frame marked by search phase will also set _UA_HANDLER_FRAME)
    • If personality returns _URC_CONTINUE_UNWIND, it means there is no landing pad, continue to unwind
    • If personality returns _URC_INSTALL_CONTEXT, it means there is a landing pad, jump to the landing pad
    • For intermediate stack frames that are not marked in the search phase, the landing pad performs cleanup work (usually destructors of out-of-scope variables), and calls _Unwind_Resume to jump back to the cleanup phase
    • For the stack frame marked by the search phase, the landing pad calls __cxa_begin_catch, then executes the code in the catch block, and finally calls __cxa_end_catch to destroy the exception object

The point of the two-phase process is to avoid any actual stack unwinding if there is no handler. If there are just cleanup frames, an abort function can be called. Cleanup frames are also less expensive than matching a handler. However, parsing .gcc_except_table is probably not much less expensive than additionally matching a handler:)

static _Unwind_Reason_Code unwind_phase1(unw_context_t *uc, _Unwind_Context *ctx,
                                         _Unwind_Exception *obj) {
  // Search phase: unwind and call personality with _UA_SEARCH_PHASE for each frame
  // until a handler (catch block) is found.
  unw_init_local(uc, ctx);
  for (;;) {
    if (ctx->fdeMissing) return _URC_END_OF_STACK;
    if (!step(ctx)) return _URC_FATAL_PHASE1_ERROR;
    ctx->getFdeAndCieFromIP();
    if (!ctx->personality) continue;
    switch (ctx->personality(1, _UA_SEARCH_PHASE, obj->exception_class, obj, ctx)) {
    case _URC_CONTINUE_UNWIND: break;
    case _URC_HANDLER_FOUND:
      unw_get_reg(ctx, UNW_REG_SP, &obj->private_2);
      return _URC_NO_REASON;
    default: return _URC_FATAL_PHASE1_ERROR; // e.g. stack corruption
    }
  }
  return _URC_NO_REASON;
}

static _Unwind_Reason_Code unwind_phase2(unw_context_t *uc, _Unwind_Context *ctx,
                                         _Unwind_Exception *obj) {
  // Cleanup phase: unwind and call personality with _UA_CLEANUP_PHASE for each frame
  // until reaching the handler. Restore the register state and transfer control.
  unw_init_local(uc, ctx);
  for (;;) {
    if (ctx->fdeMissing) return _URC_END_OF_STACK;
    if (!step(ctx)) return _URC_FATAL_PHASE2_ERROR;
    ctx->getFdeAndCieFromIP();
    if (!ctx->personality) continue;
    _Unwind_Action actions = _UA_CLEANUP_PHASE;
    size_t sp;
    unw_get_reg(ctx, UNW_REG_SP, &sp);
    if (sp == obj->private_2) actions |= _UA_HANDLER_FRAME;
    switch (ctx->personality(1, actions, obj->exception_class, obj, ctx)) {
    case _URC_CONTINUE_UNWIND:
      break;
    case _URC_INSTALL_CONTEXT:
      unw_resume(ctx); // Return if there is an error
      return _URC_FATAL_PHASE2_ERROR;
    default: return _URC_FATAL_PHASE2_ERROR; // Unknown result code
    }
  }
  return _URC_FATAL_PHASE2_ERROR;
}

_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *obj) {
  unw_context_t uc;
  _Unwind_Context ctx;
  __unw_getcontext(&uc);
  _Unwind_Reason_Code phase1 = unwind_phase1(&uc, &ctx, obj);
  if (phase1 != _URC_NO_REASON) return phase1;
  return unwind_phase2(&uc, &ctx, obj);
}

C++ does not support resumptive exception handling (correcting the exceptional condition and resuming execution at the point where it was raised), so the two-phase process is not necessary, but two-phase allows C++ and other languages to coexist on the call stack.

_Unwind_Reason_Code _Unwind_ForcedUnwind(_Unwind_Exception *obj, _Unwind_Stop_Fn stop, void *stop_parameter); Execute forced unwinding: Skip the search phase and perform a slightly different cleanup phase. private_2 is used as the parameter of the stop function. It is similar to a foreign exception but is rarely used.

void _Unwind_Resume(_Unwind_Exception *obj); Continue the unwind process of phase 2. It is similar to longjmp, is noreturn, and is the only Level 1 API that is directly called by the compiler. The compiler usually calls this function at the end of non-catch handlers.

void _Unwind_DeleteException(_Unwind_Exception *obj); Destroy the specified exception object. It is the only Level 1 API that handles exception_cleanup and is called by __cxa_end_catch.

Many implementations provide extensions. Notably _Unwind_Reason_Code _Unwind_Backtrace(_Unwind_Trace_Fn callback, void *ref); is another special unwind process: it ignores personality and notifies an external callback of stack frame information.
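A minimal sketch of the _Unwind_Backtrace extension, assuming the <unwind.h> shipped with libgcc or llvm-project/libunwind: the callback is invoked once per stack frame and counts the frames that report a non-zero IP. The names count_frame and count_frames are made up for this example.

```cpp
#include <cassert>
#include <unwind.h>

// Called once per stack frame. `arg` points to the frame counter.
static _Unwind_Reason_Code count_frame(_Unwind_Context *ctx, void *arg) {
  if (_Unwind_GetIP(ctx) != 0)
    ++*static_cast<int *>(arg);
  return _URC_NO_REASON; // keep walking
}

// Walk the current call stack without consulting any personality routine
// and return the number of frames visited.
__attribute__((noinline)) int count_frames() {
  int n = 0;
  _Unwind_Backtrace(count_frame, &n);
  return n;
}
```

The exact count depends on inlining and on how far up the unwind information reaches, so only a lower bound is meaningful.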

Level 2 C++ ABI

This part deals with language-related concepts such as throw, catch blocks, and out-of-scope variable destructors in C++.

Data structures

Each thread has a global stack of currently caught exceptions, linked through the nextException field of the exception header. caughtExceptions stores the most recent exception on the stack, and __cxa_exception::nextException points to the next exception in the stack.

struct __cxa_eh_globals {
  __cxa_exception *caughtExceptions;
  unsigned uncaughtExceptions;
};

int main() {
  try {
    throw 1;
  } catch (...) {
    try {
      throw 2;
    } catch (...) {
      // The global exception stack has two exceptions here.
    }
  }
}
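The stack behavior can be observed from portable C++ without touching __cxa_eh_globals. The sketch below uses std::current_exception, which reports the exception on top of the caught-exception stack; current_value, Observed and observe are names invented for this example.

```cpp
#include <cassert>
#include <exception>

// Return the int carried by the exception currently being handled.
// Must only be called while an int exception is being handled.
static int current_value() {
  try {
    std::rethrow_exception(std::current_exception());
  } catch (int v) {
    return v;
  }
}

struct Observed {
  int outer_before, inner, outer_after;
};

Observed observe() {
  Observed o{};
  try {
    throw 1;
  } catch (...) {
    o.outer_before = current_value(); // top of the stack: 1
    try {
      throw 2;
    } catch (...) {
      o.inner = current_value(); // 2 has been pushed on top of 1
    }
    o.outer_after = current_value(); // 2 has been popped, 1 is on top again
  }
  return o;
}
```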

The definition of __cxa_exception is as follows, and the end of it stores the _Unwind_Exception defined by Base ABI. __cxa_exception adds C++ semantic information on the basis of _Unwind_Exception.

// Level 2
struct __cxa_exception {
  void *reserve; // here on 64-bit platforms
  size_t referenceCount; // here on 64-bit platforms
  std::type_info *exceptionType;
  void (*exceptionDestructor)(void *);
  unexpected_handler unexpectedHandler; // by default std::get_unexpected()
  terminate_handler terminateHandler; // by default std::get_terminate()
  __cxa_exception *nextException; // linked to the next exception on the thread stack
  int handlerCount; // incremented in __cxa_begin_catch, decremented in __cxa_end_catch, negated in __cxa_rethrow; last non-dependent performs the cleanup

  // The following fields cache information the catch handler found in phase 1.
  int handlerSwitchValue; // ttypeIndex in libc++abi
  const char *actionRecord;
  const char *languageSpecificData;
  void *catchTemp; // landingPad
  void *adjustedPtr; // adjusted pointer of the exception object

  _Unwind_Exception unwindHeader;
};

The information needed to process the exception (for a given IP, whether it is in a try catch, whether there are out-of-scope variable destructors that need to be executed, whether there is a dynamic exception specification) is called the language-specific data area (LSDA). It is an implementation detail that is not defined by the Level 2 ABI.

Landing pad

A landing pad is a section of code related to exceptions in the text section which performs one of the three tasks:

  • In the cleanup clause, call destructors of out-of-scope variables or callbacks registered by __attribute__((cleanup(...))), and then use _Unwind_Resume to resume cleanup phase
  • A catch clause which captures the exception: call the destructors of out-of-scope variables, then call __cxa_begin_catch, execute the catch code, and finally call __cxa_end_catch
  • rethrow: call destructors of out-of-scope variables in the catch clause, then call __cxa_end_catch, and then use _Unwind_Resume to resume cleanup phase

If a try block has multiple catch clauses, there will be multiple action table entries in series in the language-specific data area, but the landing pad includes all (conceptually merged) catch clauses. Before the personality transfers control to the landing pad, it calls _Unwind_SetGR to set __builtin_eh_return_data_regno(1) to the switch value, which tells the landing pad which type matched.

A rethrow is triggered by __cxa_rethrow in the middle of the execution of the catch code. It needs to destruct the local variables defined by the catch clause and call __cxa_end_catch to offset the __cxa_begin_catch called at the beginning of the catch clause.

.gcc_except_table

The language-specific data area on the ELF platforms is usually stored in the .gcc_except_table section. This section is parsed by __gxx_personality_v0 and __gcc_personality_v0. Its structure is very simple:

  • header (@LPStart encoding, @TType encoding and base offset, call site encoding and call site table length, which gives the starting offset of the action records)
  • call site table: Describe the landing pad offset (0 if there is none) and the action record offset (biased by 1, 0 for no action) for each call site (an address range)
  • action table
  • type table (referenced by positive switch values)
  • dynamic exception specification (deprecated in C++, so rarely used) (referenced by negative switch values)

Here is an example:

  .section        .gcc_except_table,"a",@progbits
  .p2align 2
GCC_except_table0:
.Lexception0:
  .byte 255                        # @LPStart Encoding = omit
  .byte 3                          # @TType Encoding = udata4
  .uleb128 .Lttbase0-.Lttbaseref0  # @TType base offset
.Lttbaseref0:
  .byte 1                          # Call site Encoding = uleb128
  .uleb128 .Lcst_end0-.Lcst_begin0 # Call site table length; action records start at .Lcst_end0
.Lcst_begin0:                      # 2 call site code ranges
  .uleb128 .Ltmp0-.Lfunc_begin0    # >> Call Site 1 <<
  .uleb128 .Ltmp1-.Ltmp0           #   Call between .Ltmp0 and .Ltmp1
  .uleb128 .Ltmp2-.Lfunc_begin0    #     jumps to .Ltmp2
  .byte 1                          #   On action: 1
  .uleb128 .Ltmp1-.Lfunc_begin0    # >> Call Site 2 <<
  .uleb128 .Lfunc_end0-.Ltmp1      #   Call between .Ltmp1 and .Lfunc_end0
  .byte 0                          #     has no landing pad
  .byte 0                          #   On action: cleanup
.Lcst_end0:
  .byte 1                          # >> Action Record 1 <<
                                   #   Catch TypeInfo 1
  .byte 0                          #   No further actions
  .p2align 2
                                   # >> Catch TypeInfos <<
  .long _ZTIi                      # TypeInfo 1
.Lttbase0:

Each call site record has two values besides call site offset and length: landing pad offset and action record offset.

  • The landing pad offset is 0. The action record offset should also be 0. No landing pad
  • The landing pad offset is not 0. With landing pad
    • The action record offset is 0, also called cleanup (the description of "cleanup" is somewhat ambiguous, because Level 1 has the term cleanup phase), usually describing local variable destructors and __attribute__((cleanup(...)))
    • The action record offset is not 0. The action record offset points to an action record in the action table. catch or noexcept specifier or exception specification

Each action record has two values:

  • switch value (SLEB128): a positive index indicates the TypeInfo of the catch type in the type table; a negative number indicates the offset of the exception specification; 0 indicates a cleanup action which is similar to an action record offset of 0 in the call site record
  • offset to the next action record: 0 indicates there is no next action record. This singly linked list form can describe multiple catches or an exception specification list

The offset to the next action record can form not only a singly linked list but also a trie, although such compression is rarely seen in the wild.
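The table layout described above can be decoded with a few lines of code. The following is a sketch, not a full parser: it only accepts the encodings used in the example (@LPStart omitted, uleb128 call sites), and the byte values in kExampleLsda are hypothetical offsets chosen to stand in for the symbolic label differences. All type and function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct CallSite {
  uint64_t start, length, landingPad, action;
};

struct Lsda {
  std::vector<CallSite> callSites;
  const uint8_t *actionTable = nullptr;
};

static uint64_t readULEB128(const uint8_t *&p) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return result;
}

static int64_t readSLEB128(const uint8_t *&p) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (shift < 64 && (byte & 0x40))
    result |= ~static_cast<uint64_t>(0) << shift; // sign-extend
  return static_cast<int64_t>(result);
}

// Decode the header and the call site table.
Lsda parseLsda(const uint8_t *p) {
  Lsda out;
  if (*p++ != 0xff) return out;     // only "@LPStart omitted" is handled
  if (*p++ != 0xff) readULEB128(p); // skip the @TType base offset
  if (*p++ != 0x01) return out;     // only uleb128 call sites are handled
  uint64_t tableLength = readULEB128(p);
  const uint8_t *end = p + tableLength;
  while (p < end) {
    CallSite cs;
    cs.start = readULEB128(p);
    cs.length = readULEB128(p);
    cs.landingPad = readULEB128(p);
    cs.action = readULEB128(p); // biased by 1; 0 means no action
    out.callSites.push_back(cs);
  }
  out.actionTable = end; // the action table follows the call site table
  return out;
}

// Switch value of the first action record of a call site (0 if none).
int64_t firstSwitchValue(const Lsda &lsda, const CallSite &cs) {
  if (cs.action == 0) return 0;
  const uint8_t *a = lsda.actionTable + (cs.action - 1);
  return readSLEB128(a);
}

// Same shape as the assembly example, with made-up offsets.
const uint8_t kExampleLsda[] = {
    0xff,          // @LPStart encoding = omit
    0x03,          // @TType encoding = udata4
    17,            // @TType base offset
    0x01,          // call site encoding = uleb128
    8,             // call site table length
    4, 5, 0x20, 1, // call site 1: start, length, landing pad, action
    9, 0x30, 0, 0, // call site 2: no landing pad, no action
    1, 0,          // action record 1: switch value 1, no next record
    0,             // padding for .p2align 2
    0, 0, 0, 0,    // type table entry (placeholder for _ZTIi)
};
```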

The values of landing pad offset/action record offset corresponding to different areas in the program:

  • A non-try block without local variable destructor: landing_pad_offset==0 && action_record_offset==0
  • A non-try block with local variable destructors: landing_pad_offset!=0 && action_record_offset==0. phase 2 should stop and call cleanup
  • A non-try block with __attribute__((cleanup(...))): landing_pad_offset!=0 && action_record_offset==0. Same as above
  • A try block: landing_pad_offset!=0 && action_record_offset!=0. The landing pad points to the code block obtained by catch splicing. Action record describes a catch for a switch value greater than 0
  • A try block with catch (...): Same as above. The action record is a switch value greater than 0 pointing to an entry with a value of 0 in the type table (indicating catch any)
  • In a function with noexcept specifier, it is possible to propagate the exception to the caller area: landing_pad_offset!=0 && action_record_offset!=0. The landing pad points to the code block that calls std::terminate. The action record is a switch value greater than 0 pointing to an entry with a value of 0 in the type table (indicating catch any)
  • In a function with an exception specifier, it may propagate the exception to the caller area: landing_pad_offset!=0 && action_record_offset!=0. The landing pad points to the code block that calls __cxa_call_unexpected. Action record is a switch value less than 0 describing an exception specifier list

Level 2 API

void *__cxa_allocate_exception(size_t thrown_size); The compiler generates a call to this function for throw A();. It allocates a block of memory to store the __cxa_exception header and the A object. The __cxa_exception header is located immediately before the A object (at lower addresses). The following function illustrates the relationship between the address of the exception object operated on by the program and __cxa_exception:

static void *thrown_object_from_cxa_exception(__cxa_exception *exception_header) {
  return static_cast<void *>(exception_header + 1);
}

void __cxa_throw(void *thrown, std::type_info *tinfo, void (*destructor)(void *)); Apply the inverse of the above function to find the __cxa_exception header, fill in each field (referenceCount, exception_class, unexpectedHandler, terminateHandler, exceptionType, exceptionDestructor, unwindHeader.exception_cleanup), and then call _Unwind_RaiseException. This function is noreturn.
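The two calls can also be written by hand. This sketch assumes the <cxxabi.h> provided by libstdc++ or libc++abi, which declares both functions in namespace abi; throw_int and catch_int are hypothetical names. It is meant to show what a throw expression lowers to, not as something to use in real code.

```cpp
#include <cassert>
#include <cxxabi.h>
#include <typeinfo>

// Hand-written equivalent of `throw value;` for an int.
[[noreturn]] void throw_int(int value) {
  // Allocates the __cxa_exception header plus room for the int and returns
  // a pointer to the thrown object (just past the header).
  void *thrown = abi::__cxa_allocate_exception(sizeof(int));
  *static_cast<int *>(thrown) = value;
  // int has a trivial destructor, so no destructor is registered.
  abi::__cxa_throw(thrown, const_cast<std::type_info *>(&typeid(int)), nullptr);
}

int catch_int() {
  try {
    throw_int(0xB612);
  } catch (int x) {
    return x; // an ordinary handler catches the hand-thrown exception
  }
  return -1;
}
```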

void *__cxa_begin_catch(void *obj); The compiler generates a call to this function at the beginning of the catch block. For a native exception,

  • Increment handlerCount
  • Push the exception onto the global exception stack of the thread and decrement uncaughtExceptions
  • Return the adjusted pointer of the exception object

For a foreign exception (there is not necessarily a __cxa_exception header),

  • Push if the global exception stack of the thread is empty, otherwise execute std::terminate (a foreign exception is not guaranteed to have a field similar to __cxa_exception::nextException, so it cannot be linked into a non-empty stack)
  • Return static_cast<_Unwind_Exception *>(obj) + 1 (assuming _Unwind_Exception is next to the thrown object)

Simplified implementation of __cxa_throw:

void __cxa_throw(void *thrown, std::type_info *tinfo, void (*destructor)(void *)) {
  // The header is located immediately before the thrown object.
  __cxa_exception *hdr = (__cxa_exception *)thrown - 1;
  hdr->referenceCount = 1;
  hdr->exceptionType = tinfo;
  hdr->exceptionDestructor = destructor;
  hdr->unexpectedHandler = std::get_unexpected();
  hdr->terminateHandler = std::get_terminate();
  hdr->unwindHeader.exception_class = ...;
  __cxa_get_globals()->uncaughtExceptions++;
  _Unwind_RaiseException(&hdr->unwindHeader);
  // Failed to unwind, e.g. the .eh_frame FDE is absent.
  __cxa_begin_catch(&hdr->unwindHeader);
  std::terminate();
}
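The uncaughtExceptions counter incremented by __cxa_throw and decremented by __cxa_begin_catch is what std::uncaught_exceptions() reports. A small sketch (Probe, Counts and observe_uncaught are names made up here) samples it before the throw, in a destructor run during the cleanup phase, inside the handler, and afterwards.

```cpp
#include <cassert>
#include <exception>

static int seen_in_destructor = -1;

// The destructor runs from a cleanup landing pad during phase 2.
struct Probe {
  ~Probe() { seen_in_destructor = std::uncaught_exceptions(); }
};

struct Counts {
  int before, during_unwind, in_catch, after;
};

Counts observe_uncaught() {
  Counts c{};
  c.before = std::uncaught_exceptions();
  try {
    Probe p;
    throw 1; // the counter is incremented when the exception is thrown
  } catch (...) {
    // The handler has begun, so the exception no longer counts as uncaught.
    c.in_catch = std::uncaught_exceptions();
  }
  c.during_unwind = seen_in_destructor;
  c.after = std::uncaught_exceptions();
  return c;
}
```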

void __cxa_end_catch(); is called at the end of the catch block or when rethrowing. For a native exception:

  • Get the current exception from the global exception stack of the thread and decrement handlerCount
  • When handlerCount reaches 0, pop the global exception stack of the thread
  • Also when handlerCount reaches 0, call __cxa_free_exception (for a dependent exception, decrement referenceCount and call __cxa_free_exception when it reaches 0)

For a foreign exception,

  • Call _Unwind_DeleteException
  • Execute __cxa_eh_globals::caughtExceptions = nullptr; (due to the nature of __cxa_begin_catch, there is exactly one exception in the stack)

void __cxa_rethrow(); marks the exception object so that __cxa_end_catch does not destroy it when handlerCount drops to 0, because the object will be reused by the cleanup phase resumed by _Unwind_Resume.

Note that, except for __cxa_begin_catch and __cxa_end_catch, most __cxa_* functions cannot handle foreign exceptions (they do not have the __cxa_exception header).
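To see the foreign exception path in action, one can raise an exception through the Level 1 API directly. The following is a sketch under several assumptions: exception_class is an integer type (true on x86-64, not on ARM EHABI), the class value "EXAMPLE\0" is made up, and the C++ runtime in use lets catch (...) handle foreign exceptions as described above. The names foreign_cleanup, throw_foreign and catch_foreign are invented for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <unwind.h>

static int cleanup_calls = 0;

// Invoked via _Unwind_DeleteException when the handler finishes.
static void foreign_cleanup(_Unwind_Reason_Code, _Unwind_Exception *) {
  ++cleanup_calls;
}

// No __cxa_exception header precedes this object, so the C++ runtime must
// treat it as foreign. Static storage keeps the example free of allocation.
static _Unwind_Exception foreign;

void throw_foreign() {
  foreign.exception_class = 0x4558414D504C4500ull; // "EXAMPLE\0", made up
  foreign.exception_cleanup = foreign_cleanup;
  _Unwind_RaiseException(&foreign);
  std::abort(); // only reached if unwinding failed
}

// Returns true if the exception was caught and the cleanup function had not
// yet run while the handler was active.
bool catch_foreign() {
  try {
    throw_foreign();
  } catch (...) {
    return cleanup_calls == 0;
  }
  return false;
}
```

After catch_foreign returns, __cxa_end_catch has run, so the cleanup function should have been called exactly once.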

Examples

For the following code:

#include <stdio.h>
struct A { ~A(); };
struct B { ~B(); };
void foo() { throw 0xB612; }
void bar() { B b; foo(); }
void qux() { try { A a; bar(); } catch (int x) { puts(""); } }

The compiled assembly conceptually looks like this:

void foo() {
  int *thrown = (int *)__cxa_allocate_exception(4);
  *thrown = 0xB612;
  __cxa_throw(thrown, &typeid(int), /*destructor=*/nullptr);
}
void bar() {
  B b; foo(); return;
  landing_pad: b.~B(); _Unwind_Resume(obj); // obj: the _Unwind_Exception * passed by the personality
}
void qux() {
  A a; bar(); return;
  landing_pad: a.~A(); __cxa_begin_catch(obj); puts(""); __cxa_end_catch();
}

Control flow:

  • qux calls bar. bar calls foo. foo throws an exception
  • foo dynamically allocates a memory block, stores the thrown int and __cxa_exception header, and then executes __cxa_throw
  • __cxa_throw fills in other fields of __cxa_exception and calls _Unwind_RaiseException

Next, _Unwind_RaiseException drives the two-phase process of Level 1.

  • _Unwind_RaiseException executes phase 1: search phase
    • For bar, call personality with _UA_SEARCH_PHASE as the actions parameter and return _URC_CONTINUE_UNWIND (no catch handler)
    • For qux, call personality with _UA_SEARCH_PHASE as the actions parameter and return _URC_HANDLER_FOUND (with catch handler)
    • The stack pointer of qux's stack frame is recorded (stored in private_2) to mark the frame, and the search stops
  • _Unwind_RaiseException executes phase 2: cleanup phase
    • bar's stack frame is not marked by search phase, call personality with _UA_CLEANUP_PHASE as actions parameter, return _URC_INSTALL_CONTEXT
    • Jump to the landing pad of the bar's stack frame
    • After the landing pad finishes the cleanup, it uses _Unwind_Resume to return to the cleanup phase
    • The stack frame of qux is marked by search phase, call personality with _UA_CLEANUP_PHASE|_UA_HANDLER_FRAME as the actions parameter, and return _URC_INSTALL_CONTEXT
    • Jump to the landing pad of the qux stack frame
    • The landing pad calls __cxa_begin_catch, executes the catch code, and then calls __cxa_end_catch
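The order of events in this walkthrough can be checked with a runnable variant of the example. This sketch gives the destructors bodies that append to a trace string (an addition for illustration; the original declares them without bodies) and makes qux return the caught value.

```cpp
#include <cassert>
#include <string>

static std::string trace;

struct A { ~A() { trace += "~A "; } };
struct B { ~B() { trace += "~B "; } };

void foo() { throw 0xB612; }

// bar has no handler: its landing pad only destroys b and resumes unwinding.
void bar() { B b; foo(); }

// qux's landing pad destroys a, then enters the catch clause.
int qux() {
  try {
    A a;
    bar();
  } catch (int x) {
    trace += "catch";
    return x;
  }
  return 0;
}
```

Both destructors run during the cleanup phase, innermost frame first, before the handler body executes.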

__gxx_personality_v0

A personality routine is called by Level 1 ABI (both phase 1 and phase 2) to provide language-related processing. Different languages, implementations or architectures may use different personality routines. Common personalities are as follows:

  • __gxx_personality_v0: C++
  • __gxx_personality_sj0: sjlj
  • __gcc_personality_v0: C -fexceptions for __attribute__((cleanup(...)))
  • __CxxFrameHandler3: Windows MSVC
  • __gxx_personality_seh0: MinGW-w64 -fseh-exceptions
  • __objc_personality_v0: ObjC in the macOS environment

The most common C++ implementation on ELF systems is __gxx_personality_v0. It is implemented by:

  • GCC: libstdc++-v3/libsupc++/eh_personality.cc
  • libc++abi: src/cxa_personality.cpp

_Unwind_Reason_Code (*__personality_routine)(int version, _Unwind_Action action, uint64 exceptionClass, _Unwind_Exception *exceptionObject, _Unwind_Context *context);

In the absence of errors:

  • For _UA_SEARCH_PHASE, returns
    • _URC_CONTINUE_UNWIND: there is no LSDA, no landing pad, only a non-catch handler, or a matched exception specification
    • _URC_HANDLER_FOUND: there is a matched catch handler or an unmatched exception specification
  • For _UA_CLEANUP_PHASE, returns
    • _URC_CONTINUE_UNWIND: there is no LSDA, no landing pad, or (not produced by a compiler) no cleanup action
    • _URC_INSTALL_CONTEXT: the other cases

Before transferring control to the landing pad, the personality calls _Unwind_SetGR to set two registers (architecture dependent, __builtin_eh_return_data_regno(0) and __builtin_eh_return_data_regno(1)) to store _Unwind_Exception * and switchValue.

Code:

_Unwind_Reason_Code __gxx_personality_v0(int version, _Unwind_Action actions, uint64_t exceptionClass, _Unwind_Exception *exc, _Unwind_Context *ctx) {
  // is_native: whether exceptionClass identifies an exception thrown by this runtime.
  scan_results results;
  if (actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME) && is_native) {
    auto *hdr = (__cxa_exception *)(exc+1) - 1;
    // Load cached results from phase 1.
    results.switchValue = hdr->handlerSwitchValue;
    results.actionRecord = hdr->actionRecord;
    results.languageSpecificData = hdr->languageSpecificData;
    results.landingPad = reinterpret_cast<uintptr_t>(hdr->catchTemp);
    results.adjustedPtr = hdr->adjustedPtr;

    _Unwind_SetGR(...);
    _Unwind_SetGR(...);
    _Unwind_SetIP(ctx, results.landingPad);
    return _URC_INSTALL_CONTEXT;
  }
  scan_eh_tab(results, actions, is_native, exc, ctx);
  if (results.reason == _URC_CONTINUE_UNWIND ||
      results.reason == _URC_FATAL_PHASE1_ERROR)
    return results.reason;
  if (actions & _UA_SEARCH_PHASE) {
    auto *hdr = (__cxa_exception *)(exc+1) - 1;
    // Cache LSDA results in hdr.
    hdr->handlerSwitchValue = results.switchValue;
    hdr->actionRecord = results.actionRecord;
    hdr->languageSpecificData = results.languageSpecificData;
    hdr->catchTemp = reinterpret_cast<void *>(results.landingPad);
    hdr->adjustedPtr = results.adjustedPtr;
    return _URC_HANDLER_FOUND;
  }
  // _UA_CLEANUP_PHASE
  _Unwind_SetGR(...);
  _Unwind_SetGR(...);
  _Unwind_SetIP(ctx, results.landingPad);
  return _URC_INSTALL_CONTEXT;
}

For a native exception, when the personality returns _URC_HANDLER_FOUND in the search phase, the LSDA related information of the stack frame will be cached. When the personality is called again in the cleanup phase with the argument actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME), the personality loads the cache and there is no need to parse .gcc_except_table.

In the remaining three cases, the personality has to parse .gcc_except_table:

  • actions & _UA_SEARCH_PHASE
  • actions & _UA_CLEANUP_PHASE && actions & _UA_HANDLER_FRAME && !is_native: catch (...) can catch a foreign exception. An exception specification terminates upon a foreign exception.
  • actions & _UA_CLEANUP_PHASE && !(actions & _UA_HANDLER_FRAME): non-catch handlers and unmatched catch handlers, matched exception specification. Another case is _Unwind_ForcedUnwind.

A simplified scan_eh_tab:

static void scan_eh_tab(...) {
  ...
  const uint8_t *lsda = (const uint8_t *)_Unwind_GetLanguageSpecificData(context);
  if (lsda == nullptr) { res.reason = _URC_CONTINUE_UNWIND; return; }
  res.languageSpecificData = lsda;
  // IP points after the call instruction; subtract 1 to stay inside the call site.
  uintptr_t ipOffset = _Unwind_GetIP(context) - 1 - _Unwind_GetRegionStart(context);
  for each call site entry {
    if (!(start <= ipOffset && ipOffset < start + length))
      continue;
    res.landingPad = landingPad;
    if (landingPad == 0) { res.reason = _URC_CONTINUE_UNWIND; return; }
    if (actionRecord == 0) { // cleanup
      res.reason = actions & _UA_SEARCH_PHASE ? _URC_CONTINUE_UNWIND : _URC_HANDLER_FOUND;
      return;
    }
    // A catch or a dynamic exception specification.
    const uint8_t *action = actionTableStart + (actionRecord - 1);
    bool hasCleanup = false;
    for (;;) {
      res.actionRecord = action; // start of this record, before readSLEB128 advances action
      int64_t switchValue = readSLEB128(&action);
      if (switchValue > 0) { // catch
        auto *catchType = ...;
        if (catchType == nullptr) { // catch (...)
          res.switchValue = switchValue;
          res.adjustedPtr = getThrownObjectPtr(exc); res.reason = _URC_HANDLER_FOUND;
          return;
        } else if (is_native) { // catch (T ...)
          auto *hdr = (__cxa_exception *)(exc+1) - 1;
          if (catchType->can_catch(hdr->exceptionType, adjustedPtr)) {
            res.switchValue = switchValue;
            res.adjustedPtr = adjustedPtr; res.reason = _URC_HANDLER_FOUND;
            return;
          }
        }
      } else if (switchValue < 0) { // dynamic exception specification
        if (actions & _UA_FORCE_UNWIND) {
          // Skip if forced unwinding.
        } else if (is_native) {
          if (!exception_spec_can_catch) {
            // The landing pad will call __cxa_call_unexpected.
            assert(actions & _UA_SEARCH_PHASE);
            res.switchValue = switchValue;
            res.adjustedPtr = adjustedPtr; res.reason = _URC_HANDLER_FOUND;
            return;
          }
        } else {
          // A foreign exception cannot be matched by the exception specification. The landing pad will call __cxa_call_unexpected.
          res.switchValue = switchValue;
          res.adjustedPtr = getThrownObjectPtr(exc); res.reason = _URC_HANDLER_FOUND;
          return;
        }
      } else { // switchValue == 0: cleanup
        hasCleanup = true;
      }
      const uint8_t *temp = action;
      int64_t actionOffset = readSLEB128(&temp);
      if (actionOffset == 0) { // End of action list
        res.reason = (hasCleanup && (actions & _UA_CLEANUP_PHASE))
                         ? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
        return;
      }
      action += actionOffset;
    }
  }
  call_terminate();
}

__gcc_personality_v0

libgcc and compiler-rt/lib/builtins implement this function to handle __attribute__((cleanup(...))). The implementation does not return _URC_HANDLER_FOUND in the search phase, so the cleanup handler cannot serve as a catch handler. However, we can supply our own implementation that returns _URC_HANDLER_FOUND in the search phase... On x86-64, __builtin_eh_return_data_regno(0) is RAX. The personality places the exception object in RAX, and the cleanup handler can read RAX when it is called from the landing pad.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
// a.cc
#include <exception>
#include <stdio.h>

extern "C" void my_catch();
extern "C" void throw_exception() { throw 42; }

int main() {
fprintf(stderr, "uncaught exceptions: %d\n", std::uncaught_exceptions());
my_catch();
fprintf(stderr, "uncaught exceptions: %d\n", std::uncaught_exceptions());
}

// b.c
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unwind.h>

void throw_exception();

struct __cxa_eh_globals {
struct __cxa_exception *caughtExceptions;
unsigned uncaughtExceptions;
};

struct __cxa_eh_globals *__cxa_get_globals();

static uintptr_t readULEB128(const uint8_t **a) {
uintptr_t res = 0, shift = 0;
const uint8_t *p = *a;
uint8_t b;
do {
b = *p++;
res |= (b & 0x7f) << shift;
shift += 7;
} while (b & 0x80);
*a = p;
return res;
}

_Unwind_Reason_Code __gcc_personality_v0(int version, _Unwind_Action actions,
uint64_t exception_class,
struct _Unwind_Exception *obj,
struct _Unwind_Context *ctx) {
const uint8_t *lsda = _Unwind_GetLanguageSpecificData(ctx);
if (lsda == 0)
return _URC_CONTINUE_UNWIND;
uintptr_t func = _Unwind_GetRegionStart(ctx);
uintptr_t pc = _Unwind_GetIP(ctx) - 1 - func;
if (*lsda++ != 255) // Skip LPStart
readULEB128(&lsda);
if (*lsda++ != 255) // Skip TType
readULEB128(&lsda);
uintptr_t call_site_table_len = 0;
if (*lsda++ == 1)
call_site_table_len = readULEB128(&lsda);
const uint8_t *end = lsda + call_site_table_len;
while (lsda < end) {
uintptr_t start = readULEB128(&lsda), len = readULEB128(&lsda),
lpad = readULEB128(&lsda);
readULEB128(&lsda); // Skip the action record offset, which is unused here
if (!(start <= pc && pc < start + len))
continue;
if (lpad == 0)
return _URC_CONTINUE_UNWIND;
if (actions & _UA_SEARCH_PHASE)
return _URC_HANDLER_FOUND;
_Unwind_SetGR(ctx, __builtin_eh_return_data_regno(0), (uintptr_t)obj);
_Unwind_SetGR(ctx, __builtin_eh_return_data_regno(1), 0); // switchValue==0
_Unwind_SetIP(ctx, func + lpad);
return _URC_INSTALL_CONTEXT;
}
return _URC_FATAL_PHASE2_ERROR;
}

struct Catch {
struct _Unwind_Exception *obj;
jmp_buf env;
bool do_catch;
};

__attribute__((used))
static void my_jump(struct Catch *c) {
if (c->do_catch) {
struct __cxa_eh_globals *globals = __cxa_get_globals();
globals->uncaughtExceptions--;
longjmp(c->env, 1);
}
}

__attribute__((naked)) static void my_cleanup(struct Catch *c) {
asm("movq %rax, (%rdi); jmp my_jump");
}

void my_catch() {
__attribute__((cleanup(my_cleanup))) struct Catch c;
if (setjmp(c.env) == 0) {
c.do_catch = 1;
throw_exception();
} else {
fprintf(stderr, "caught exception: %p\n", c.obj);
fprintf(stderr, "value: %d\n", *(int *)(c.obj + 1));
c.do_catch = 0;
}
}
% clang -c -fexceptions a.cc b.c
% clang++ a.o b.o
% ./a.out
uncaught exceptions: 0
caught exception: 0x10f7f10
value: 42
uncaught exceptions: 0

Rethrow

The landing pad section briefly described the code executed by rethrow. A caught exception is usually destroyed in __cxa_end_catch, so __cxa_rethrow marks the exception object and increases handlerCount.

C++11 introduced Exception Propagation (N2179; std::rethrow_exception etc.), which libstdc++ implements with __cxa_dependent_exception. For the design, see https://gcc.gnu.org/legacy-ml/libstdc++/2008-05/msg00079.html

struct __cxa_dependent_exception {
void *reserve;
void *primaryException;
};

std::current_exception and std::rethrow_exception will increase the reference count.

In libstdc++, __cxa_rethrow calls GCC extension _Unwind_Resume_or_Rethrow which can resume forced unwinding.
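
As a user-level illustration of exception propagation, the following minimal sketch captures an exception with std::current_exception and rethrows it twice with std::rethrow_exception. The helper names capture and rethrow_and_describe are made up for this example; how the runtime implements the reference counting underneath is not shown.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Capture the exception currently being handled. The returned
// std::exception_ptr keeps the exception object alive after the handler exits.
std::exception_ptr capture() {
  std::exception_ptr p;
  try {
    throw std::runtime_error("boom");
  } catch (...) {
    p = std::current_exception();
  }
  return p;
}

// Rethrow a captured exception and return its message. The same
// std::exception_ptr can be rethrown more than once.
std::string rethrow_and_describe(const std::exception_ptr &p) {
  assert(p != nullptr);
  std::string message;
  try {
    std::rethrow_exception(p);
  } catch (const std::exception &e) {
    message = e.what();
  }
  return message;
}
```

Calling rethrow_and_describe(capture()) yields the original message, and the exception object stays valid for as long as some std::exception_ptr refers to it.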

LLVM IR

Under construction.

  • nounwind: cannot unwind
  • uwtable: force generation of the unwind table regardless of nounwind
if uwtable
  if nounwind
    CantUnwind
  else
    Unwind Table
else
  do nothing

Compiler behavior

  • -fno-exceptions -fno-asynchronous-unwind-tables: neither .eh_frame nor .gcc_except_table exists
  • -fno-exceptions -fasynchronous-unwind-tables: .eh_frame exists, .gcc_except_table doesn't
  • -fexceptions: both .eh_frame and .gcc_except_table exist
    • In GCC, for a noexcept function, a possibly-throwing call site unhandled by a try block does not get an entry in the .gcc_except_table call site table. If the function has no try block, it gets a header-only .gcc_except_table (4 bytes)
    • In Clang, there is a call site entry calling __clang_call_terminate. The size overhead is larger than GCC's scheme. Improving this requires LLVM IR work

When an exception propagates from a function to its caller (libgcc_s/libunwind & libsupc++/libc++abi):

  • no .eh_frame: _Unwind_RaiseException returns _URC_END_OF_STACK. __cxa_throw calls std::terminate
  • .eh_frame without .gcc_except_table: pass-through (local variable destructors are not called). This is the case of -fno-exceptions -fasynchronous-unwind-tables.
  • .eh_frame with .gcc_except_table not covering the throwing call site: __gxx_personality_v0 calls std::terminate since no call site code range matches
  • .eh_frame with .gcc_except_table covering the throwing call site: do possible cleanup and unwind to the parent frame

Combined with the above description, when an exception will propagate to a caller of a noexcept function:

  • -fno-exceptions -fno-asynchronous-unwind-tables: propagating through a function calls std::terminate
  • -fno-exceptions -fasynchronous-unwind-tables: pass-through. Local variable destructors are not called. This behavior is unexpected.
  • -fexceptions: propagating through a noexcept function calls std::terminate

When std::terminate is called, there is a diagnostic that looks like terminate called after throwing an instance of 'int' (libstdc++; libc++ has a similar one). There is no stack trace. If the process installs a SIGABRT signal handler, the handler may get a stack trace and symbolize the addresses.
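
A diagnostic of this kind can be produced from the type information stored in the exception header. The sketch below is not the libstdc++ or libc++abi terminate handler. It only shows, assuming a toolchain whose <cxxabi.h> declares abi::__cxa_current_exception_type and abi::__cxa_demangle, how code running inside a handler can recover the demangled type name. The function name current_exception_type_name is made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>
#include <typeinfo>

// Return the demangled type name of the exception currently being handled, or
// an empty string when no exception is being handled.
std::string current_exception_type_name() {
  std::type_info *type = abi::__cxa_current_exception_type();
  if (!type)
    return "";
  int status = 0;
  // __cxa_demangle returns a malloc'ed buffer on success.
  char *demangled = abi::__cxa_demangle(type->name(), nullptr, nullptr, &status);
  std::string name = (status == 0 && demangled) ? demangled : type->name();
  std::free(demangled);
  return name;
}
```

The function is meant to be called from a catch block or from a handler installed with std::set_terminate.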

Catching exceptions while unwinding through -fno-exceptions code is a proposal to improve the diagnostics.

Personality and typeinfo encoding

.eh_frame contains information about the unwind operation. See Stack unwinding for its format.

In -fpie/-fpic mode, the personality and type info encodings have the DW_EH_PE_indirect|DW_EH_PE_pcrel bits on most targets.

void raise() { throw 42; }
bool foo() {
try { raise(); } catch (int) { return true; }
return false;
}
int main() { foo(); }
_Z3foov:
.cfi_startproc
.cfi_personality 155, DW.ref.__gxx_personality_v0
.cfi_lsda 27, .Lexception0
...

.section .gcc_except_table,"a",@progbits
...
# >> Catch TypeInfos <<
.Ltmp3: # TypeInfo 1
.long .L_ZTIi.DW.stub-.Ltmp3
.Lttbase0:

.data
.p2align 3, 0x0
.L_ZTIi.DW.stub:
.quad _ZTIi
.hidden DW.ref.__gxx_personality_v0
.weak DW.ref.__gxx_personality_v0
.section .data.DW.ref.__gxx_personality_v0,"aGw",@progbits,DW.ref.__gxx_personality_v0,comdat
.p2align 3, 0x0
.type DW.ref.__gxx_personality_v0,@object
.size DW.ref.__gxx_personality_v0, 8
DW.ref.__gxx_personality_v0:
.quad __gxx_personality_v0

In the example, .eh_frame contains a PC-relative relocation referencing DW.ref.__gxx_personality_v0, and .gcc_except_table contains a PC-relative relocation referencing .L_ZTIi.DW.stub. The relocations are link-time constants, so .eh_frame can remain read-only.

DW.ref.__gxx_personality_v0 and .L_ZTIi.DW.stub reside in writable sections which will contain dynamic relocations if __gxx_personality_v0 and _ZTIi are defined in a shared object - which is often the case.
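
The numeric operands of .cfi_personality and .cfi_lsda in the listing (155 and 27) are DW_EH_PE_* encoding bytes. The following sketch decodes such a byte into its components. The function name describe_eh_encoding is made up for this example, and the constant values are the usual DW_EH_PE_* definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Describe a DW_EH_PE_* pointer encoding byte: an optional indirect bit (0x80),
// an application part (0x70) and a value format (0x0f). 0xff means "omit".
std::string describe_eh_encoding(uint8_t enc) {
  if (enc == 0xff)
    return "omit";
  std::string out;
  if (enc & 0x80)
    out += "indirect|";
  switch (enc & 0x70) { // what the value is relative to
  case 0x00: break;     // absolute
  case 0x10: out += "pcrel|"; break;
  case 0x20: out += "textrel|"; break;
  case 0x30: out += "datarel|"; break;
  case 0x40: out += "funcrel|"; break;
  case 0x50: out += "aligned|"; break;
  default: out += "unknown|"; break;
  }
  switch (enc & 0x0f) { // how the value is stored
  case 0x00: out += "absptr"; break;
  case 0x01: out += "uleb128"; break;
  case 0x02: out += "udata2"; break;
  case 0x03: out += "udata4"; break;
  case 0x04: out += "udata8"; break;
  case 0x09: out += "sleb128"; break;
  case 0x0a: out += "sdata2"; break;
  case 0x0b: out += "sdata4"; break;
  case 0x0c: out += "sdata8"; break;
  default: out += "unknown"; break;
  }
  return out;
}
```

With this decoder, 155 (0x9b) reads as indirect|pcrel|sdata4 and 27 (0x1b) as pcrel|sdata4.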

For -fno-pic code, different targets have different ideas. AArch64 and RISC-V use DW_EH_PE_indirect|DW_EH_PE_pcrel as well. On x86, .cfi_personality refers to __gxx_personality_v0. This will lead to a canonical PLT if __gxx_personality_v0 is defined in a shared object (e.g. libstdc++.so.6). I sent a patch https://gcc.gnu.org/PR108622 to use DW_EH_PE_indirect|DW_EH_PE_pcrel.

R_MIPS_32 and R_MIPS_64 personality encoding

https://github.com/llvm/llvm-project/issues/58377

void foo() { try { throw 1; } catch (...) {} }

mips64el-linux-gnuabi64-g++ -fpic and clang++ --target=mips64el-unknown-linux-gnuabi64 -fpic use DW_EH_PE_absptr | DW_EH_PE_indirect to encode personality routine pointers. Using DW_EH_PE_absptr instead of DW_EH_PE_pcrel is wrong. GNU ld works around the compiler design problem by converting DW_EH_PE_absptr to DW_EH_PE_pcrel. ld.lld does not support this and will report an error:

% clang++ --target=mips64el-linux-gnuabi -fpic -fuse-ld=lld -shared ex.cc
ld.lld: error: relocation R_MIPS_64 cannot be used against symbol 'DW.ref.__gxx_personality_v0'; recompile with -fPIC
>>> defined in /tmp/ex-40a996.o
>>> referenced by ex.cc
>>> /tmp/ex-40a996.o:(.eh_frame+0x13)
...

R_MIPS_32 for 32-bit builds is similar.

Potentially-throwing __cxa_end_catch

__cxa_end_catch is potentially-throwing because it may destroy an exception object with a potentially-throwing destructor (e.g. ~C() noexcept(false) { ... }).

struct A { ~A(); };
void opaque();
void foo() {
A a;
// The exception object has an unknown type and may throw. The landing pad
// then needs to call A::~A for `a` before jumping to _Unwind_Resume.
try { opaque(); } catch (...) { }
}

To support an exception object with a potentially-throwing destructor, Clang generates conservative code for a catch-all clause or a catch clause matching a record type:

  • assume that the exception object may have a throwing destructor
  • emit invoke void @__cxa_end_catch (as the call is not marked with the nounwind attribute).
  • emit a landing pad to destroy local variables and call _Unwind_Resume

Per C++ [dcl.fct.def.coroutine], a coroutine's function body implies a catch (...). Clang's code generation pessimizes even simple code, like:

UserFacing foo() {
A a;
opaque();
co_return;
// For `invoke void @__cxa_end_catch()`, the landing pad destroys the
// promise_type and deletes the coro frame.
}

Throwing destructors are typically discouraged. In many environments, the destructors of exception objects are guaranteed to never throw, making our conservative code generation approach seem wasteful.

Furthermore, throwing destructors tend not to work well in practice:

  • GCC does not emit call site records for the region containing __cxa_end_catch. This has been the case since 2000.
  • If a catch-all clause catches an exception object whose destructor throws, both GCC and Clang using libstdc++ leak the allocated exception object.

To avoid code generation pessimization, I added -fassume-nothrow-exception-dtor for Clang 18 to assume that __cxa_end_catch calls have the nounwind attribute. This requires that thrown exception objects' destructors will never throw.

To detect misuse, Clang diagnoses throw expressions whose operand type has a potentially-throwing destructor. Technically, it is possible that a potentially-throwing destructor never throws when called transitively by __cxa_end_catch, but these cases seem rare enough to justify a relaxed mode.
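
The same property can be checked in user code without the compiler option. The sketch below is not how Clang implements its diagnostic. It uses a hypothetical wrapper, checked_throw, that refuses to compile when the thrown type's destructor is potentially throwing.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Quiet { int value; };                            // implicitly noexcept destructor
struct Loud { int value; ~Loud() noexcept(false) {} };  // potentially-throwing destructor

// Hypothetical helper: throw `e`, but reject at compile time any type whose
// destructor may throw, since __cxa_end_catch would then be able to throw.
template <class E> [[noreturn]] void checked_throw(E &&e) {
  static_assert(std::is_nothrow_destructible<std::decay_t<E>>::value,
                "thrown type has a potentially-throwing destructor");
  throw std::forward<E>(e);
}

// Throws and catches a Quiet object, returning the value it carried.
int catch_quiet() {
  int result = -1;
  try {
    checked_throw(Quiet{7});
  } catch (const Quiet &q) {
    result = q.value;
  }
  return result;
}
```

Writing checked_throw(Loud{1}) would fail the static_assert, which is roughly the class of code the option asks users to avoid.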

Misc

Use libc++ and libc++abi

On Linux, compared with clang, clang++ additionally links against libstdc++/libc++ and libm.

Dynamically link against libc++.so (which depends on libc++abi.so) (additionally specify -pthread if threads are used):

clang++ -stdlib=libc++ -nostdlib++ a.cc -lc++ -lc++abi
# clang -stdlib=libc++ a.cc -lc++ -lc++abi does not pass -lm to the linker.

If compile actions and link actions are separate (-stdlib=libc++ passes -lc++ but its position is undesired, so just don't use it):

clang++ -nostdlib++ a.cc -lc++ -lc++abi

Statically link in libc++.a (which includes the members of libc++abi.a). This requires a -DLIBCXX_ENABLE_STATIC_ABI_LIBRARY=on build:

clang++ -stdlib=libc++ -static-libstdc++ -nostdlib++ a.cc -pthread

Statically link in libc++.a and libc++abi.a. This is a bit inferior because there is a duplicate -lc++ passed by the driver.

clang++ -stdlib=libc++ -static-libstdc++ -nostdlib++ a.cc -Wl,--push-state,-Bstatic -lc++ -lc++abi -Wl,--pop-state -pthread

libc++abi and libsupc++

It is worth noting that the layouts of the <exception> and <stdexcept> types provided by libc++abi (such as logic_error and runtime_error) are deliberately compatible with libsupc++. After libstdc++ abandoned ref-counted std::string in GCC 5, libsupc++ still uses __cow_string for logic_error and other exception classes. libc++abi uses a similar ref-counted string.

libsupc++ and libc++abi do not use inline namespaces and have conflicting symbol names. Therefore, a libc++/libc++abi application usually cannot use a shared object that is dynamically linked against libstdc++.so (ODR violation).

With some effort, you can still work around this problem: build the non-libsupc++ part of libstdc++ to get a self-made libstdc++.so.6. The executable then links against libc++abi, which provides the C++ ABI symbols required by libstdc++.so.6.

Monolithic .gcc_except_table

Prior to Clang 12, a monolithic .gcc_except_table was used. Like many other metadata sections, the main problem with a monolithic section is that it cannot be garbage collected by the linker. For RISC-V -mrelax and basic block sections, there is a bigger problem: .gcc_except_table has relocations referencing local symbols in text sections. If the referenced text sections are discarded in a COMDAT group, these relocations will be rejected by the linker (error: relocation refers to a symbol in a discarded section).

(Figure: .eh_frame with monolithic .gcc_except_table)

The solution is to use fragmented .gcc_except_table (https://reviews.llvm.org/D83655).

(Figure: fragmented .gcc_except_table)

But the actual deployment is not that simple:) ld.lld processes --gc-sections first (at that point it is not clear which .eh_frame pieces are live), and then processes (and garbage collects) .eh_frame.

During --gc-sections, all .eh_frame pieces are live. They mark all .gcc_except_table.* live. According to the GC rules for section groups, a .gcc_except_table.* marks the other sections (including .text.*) in the same section group live. As a result, .text.* in any section group cannot be garbage collected, which increases the output size.

(Figure: bad GC with .gcc_except_table.*)

https://reviews.llvm.org/D91579 fixed this problem: for .eh_frame, do not mark .gcc_except_table in a section group.

(Figure: good GC with .gcc_except_table.*)

clang -fbasic-block-sections=

This option produces one section for each basic block (more aggressive than -ffunction-sections) for aggressive machine basic block optimizations. There are some challenges integrating LSDA into this framework.

You can either allocate a .gcc_except_table for each basic block section needing LSDA, or let all basic block sections use the same .gcc_except_table. The LLVM implementation chose the latter, which has several advantages:

  • No duplicate headers
  • Sharable type table
  • Sharable action table (this only matters for the deprecated exception specification)

There is only one LPStart when using the same .gcc_except_table, and it is necessary to ensure that all offsets from landing pads to LPStart can be represented by relocations. Because most architectures do not have a difference relocation type (R_RISCV_SUB*), placing landing pads in the same section is the choice.

Exception handling ABI for the ARM architecture

The overall structure is the same as Itanium C++ ABI: Exception Handling, with some differences in data structure, _Unwind_*, etc.

https://maskray.me/blog/2020-11-08-stack-unwinding contains a few notes.

Compact Exception Tables for MIPS ABIs

Under construction.

The scheme uses the .eh_frame_entry and .gnu_extab sections.

Design thoughts:

  • Exception code ranges are sorted and must be linearly searched. Therefore it would be more compact to specify each relative to the previous one, rather than relative to a fixed base.
  • The landing pad is often close to the exception region that uses it. Therefore it is better to use the end of the exception region as the reference point, than use the function base address.
  • The action table can be integrated directly with the exception region definition itself. This removes one indirection. The threading of actions can still occur, by providing an offset to the next exception encoding of interest.
  • Often the action threading is to the next exception region, so optimizing that case is important.
  • Catch types and exception specification type lists cannot easily be encoded inline with the exception regions themselves. It is necessary to preserve the unique indices that are automatically created by the DWARF scheme.

It uses compact unwind descriptors similar to ARM EH. Builtin PR1 means there is no language-dependent data, while Builtin PR2 is used for C/C++.

Misc

Khalil Estell's CppCon 2024 talk C++ Exceptions for Smaller Firmware mentions that a custom exception implementation that drops some rarely used functionality can make the library code size much smaller, which suits firmware development.

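Chinese version

A few weeks ago I wrote an article introducing stack unwinding in detail. Today I introduce C++ exception handling, an application of stack unwinding. Exception handling has several ABIs (interoperability of C++ implementations), the most widely used of which is Itanium C++ ABI: Exception Handling.

Itanium C++ ABI: Exception Handling

Simplified exception handling process (from throw to catch):

  • Call __cxa_allocate_exception to allocate space for the exception object and the exception header __cxa_exception
  • Jump to __cxa_throw, which sets the __cxa_exception fields and then jumps to _Unwind_RaiseException
  • _Unwind_RaiseException executes the search phase, calling personality routines to find a matching try catch (type matching)
  • _Unwind_RaiseException executes the cleanup phase: it calls personality routines to find stack frames containing out-of-scope variables and, for each such frame, jumps to its landing pad to run the destructors. The landing pad uses _Unwind_Resume to return to the cleanup phase
  • The cleanup phase executed by _Unwind_RaiseException jumps to the landing pad corresponding to the matching try catch
  • The landing pad calls __cxa_begin_catch, executes the catch code, and then calls __cxa_end_catch
  • __cxa_end_catch destroys the exception object

Note: each stack frame may use a different personality routine. In practice it is common for multiple frames to share the same routine.

Among these steps, _Unwind_RaiseException is responsible for stack unwinding and is language independent. The language-related concepts in stack unwinding (catch block, out-of-scope variable) are interpreted/encapsulated by the personality. This is a key idea that makes the ABI applicable to other languages and allows other languages to be mixed with C++.

Therefore, Itanium C++ ABI: Exception Handling is divided into Level 1 Base ABI and Level 2 C++ ABI. Base ABI describes the language-independent stack unwinding part and defines the _Unwind_* API. Common implementations are:

  • libgcc: libgcc_s.so.1 and libgcc_eh.a
  • Multiple libraries named libunwind (libunwind.so or libunwind.a). With Clang, --rtlib=compiler-rt --unwindlib=libunwind selects linking against libunwind; either llvm-project/libunwind or nongnu.org/libunwind can be used

C++ ABI is specific to the C++ language and defines the __cxa_* API (__cxa_allocate_exception, __cxa_throw, __cxa_begin_catch, etc.). Common implementations are:

  • libsupc++, part of libstdc++
  • libc++abi in llvm-project

libc++, the C++ standard library implementation in llvm-project, can use libc++abi, libcxxrt, or libsupc++; libc++abi is recommended.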

Level 1 Base ABI

Data structures

The main data structure is:

// Level 1
struct _Unwind_Exception {
_Unwind_Exception_Class exception_class; // an identifier, used to tell whether the exception is native
_Unwind_Exception_Cleanup_Fn exception_cleanup;
_Unwind_Word private_1; // zero: normal unwind; non-zero: forced unwind, the _Unwind_Stop_Fn function
_Unwind_Word private_2; // saved stack pointer
} __attribute__((aligned));

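exception_class and exception_cleanup are set by the Level 2 API that throws the exception. The Level 1 API does not process exception_class; it only passes the value to the personality routine, which uses it to distinguish native and foreign exceptions.

libc++abi's __cxa_throw sets exception_class to a uint64_t representing "CLNGC++\0". libsupc++ uses a uint64_t representing "GNUCC++\0". The ABI requires that the low bytes contain "C++\0". Exceptions thrown by libstdc++ are treated as foreign exceptions by libc++abi. Only catch (...) can catch foreign exceptions.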
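
To make the encoding concrete, the sketch below packs the eight characters into a uint64_t with the first character in the most significant byte and checks the low four bytes for "C++\0". The helper names make_exception_class and is_cxx_exception_class are made up for this example and are not part of either runtime. "TESTLANG" in the usage is a hypothetical non-C++ identifier.

```cpp
#include <cassert>
#include <cstdint>

// Pack eight characters into a uint64_t, first character in the most
// significant byte. The array has 9 elements because of the implicit NUL that
// terminates the string literal.
constexpr uint64_t make_exception_class(const char (&s)[9]) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v = (v << 8) | static_cast<unsigned char>(s[i]);
  return v;
}

// The low four bytes identify the language; "C++\0" is 0x43 0x2B 0x2B 0x00.
constexpr bool is_cxx_exception_class(uint64_t cls) {
  return (cls & 0xFFFFFFFFu) == 0x432B2B00u;
}
```

For example, make_exception_class("GNUCC++\0") and make_exception_class("CLNGC++\0") differ only in the vendor half, so both pass is_cxx_exception_class, while make_exception_class("TESTLANG") does not.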

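The exception propagation mechanism uses another exception_class identifier to represent dependent exceptions.

exception_cleanup stores the destroying delete function of the exception object, which __cxa_end_catch uses to destroy a foreign exception.

private_1 and private_2 are private to Level 1 and should not be used by the personality.

The information needed for the unwind operation (for a given IP/SP, how to obtain the IP/SP and other registers of the caller's frame) is implementation-dependent and is not defined by the Level 1 ABI. On ELF systems, .eh_frame and .eh_frame_hdr (PT_GNU_EH_FRAME program header) store the unwind information. See Stack unwinding.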

Level 1 API

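_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *obj); performs stack unwinding for an exception. It is noreturn under normal circumstances and, like longjmp, transfers control to the matched catch handler (catch block) or to non-catch handlers (code blocks that need to run destructors). It is a two-phase process, divided into phase 1 (search phase) and phase 2 (cleanup phase).

  • The search phase finds the matched catch handler and records the stack pointer in private_2
    • Trace the call chain based on IP/SP and other saved registers
    • For each stack frame, skip it if it has no personality routine; otherwise call the routine (with actions set to _UA_SEARCH_PHASE)
    • If the personality returns _URC_CONTINUE_UNWIND, continue searching
    • If the personality returns _URC_HANDLER_FOUND, a matched catch handler or an unmatched exception specification has been found; stop searching
  • The cleanup phase jumps to non-catch handlers (usually local variable destructors) and then transfers control to the matched catch handler located in phase 1
    • Trace the call chain based on IP/SP and other saved registers
    • For each stack frame, skip it if it has no personality routine; otherwise call the routine (with actions set to _UA_CLEANUP_PHASE; the frame marked by the search phase also gets _UA_HANDLER_FRAME)
    • If the personality returns _URC_CONTINUE_UNWIND, there is no landing pad; continue unwinding
    • If the personality returns _URC_INSTALL_CONTEXT, there is a landing pad; jump to it
    • For intermediate frames not marked by the search phase, the landing pad performs cleanup (typically destructors of out-of-scope variables) and calls _Unwind_Resume to return to the cleanup phase
    • For the frame marked by the search phase, the landing pad calls __cxa_begin_catch, executes the code in the catch block, and finally calls __cxa_end_catch to destroy the exception object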
static _Unwind_Reason_Code unwind_phase1(unw_context_t *uc, _Unwind_Context *ctx,
_Unwind_Exception *obj) {
// Search phase: unwind and call personality with _UA_SEARCH_PHASE for each frame
// until a handler (catch block) is found.
unw_init_local(uc, ctx);
for(;;) {
if (ctx->fdeMissing) return _URC_END_OF_STACK;
if (!step(ctx)) return _URC_FATAL_PHASE1_ERROR;
ctx->getFdeAndCieFromIP();
if (!ctx->personality) continue;
switch (ctx->personality(1, _UA_SEARCH_PHASE, obj->exception_class, obj, ctx)) {
case _URC_CONTINUE_UNWIND: break;
case _URC_HANDLER_FOUND:
unw_get_reg(ctx, UNW_REG_SP, &obj->private_2);
return _URC_NO_REASON;
default: return _URC_FATAL_PHASE1_ERROR; // e.g. stack corruption
}
}
return _URC_NO_REASON;
}

static _Unwind_Reason_Code unwind_phase2(unw_context_t *uc, _Unwind_Context *ctx,
_Unwind_Exception *obj) {
// Cleanup phase: unwind and call personality with _UA_CLEANUP_PHASE for each frame
// until reaching the handler. Restore the register state and transfer control.
unw_init_local(uc, ctx);
for(;;) {
if (ctx->fdeMissing) return _URC_END_OF_STACK;
if (!step(ctx)) return _URC_FATAL_PHASE2_ERROR;
ctx->getFdeAndCieFromIP();
if (!ctx->personality) continue;
_Unwind_Action actions = _UA_CLEANUP_PHASE;
size_t sp;
unw_get_reg(ctx, UNW_REG_SP, &sp);
if (sp == obj->private_2) actions |= _UA_HANDLER_FRAME;
switch (ctx->personality(1, actions, obj->exception_class, obj, ctx)) {
case _URC_CONTINUE_UNWIND:
break;
case _URC_INSTALL_CONTEXT:
unw_resume(ctx); // Return if there is an error
return _URC_FATAL_PHASE2_ERROR;
default: return _URC_FATAL_PHASE2_ERROR; // Unknown result code
}
}
return _URC_FATAL_PHASE2_ERROR;
}

_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *obj) {
unw_context_t uc;
_Unwind_Context ctx;
__unw_getcontext(&uc);
_Unwind_Reason_Code phase1 = unwind_phase1(&uc, &ctx, obj);
if (phase1 != _URC_NO_REASON) return phase1;
return unwind_phase2(&uc, &ctx, obj);
}

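C++ does not support resumptive exception handling (correcting the exceptional condition and resuming execution at the point where it was raised), so the two-phase process is not strictly necessary, but it allows C++ and other languages to coexist on the call stack.

_Unwind_Reason_Code _Unwind_ForcedUnwind(_Unwind_Exception *obj, _Unwind_Stop_Fn stop, void *stop_parameter); performs forced unwinding: it skips the search phase and executes a slightly different cleanup phase. private_2 is used as the argument of the stop function. This function is rarely used.

void _Unwind_Resume(_Unwind_Exception *obj); resumes the phase 2 unwinding process. Like longjmp, it is noreturn. It is the only Level 1 API called directly by the compiler, which usually calls it at the end of non-catch handlers.

void _Unwind_DeleteException(_Unwind_Exception *obj); destroys the specified exception object. It is the only Level 1 API that handles exception_cleanup, and it is called by __cxa_end_catch.

Many implementations provide an extension: _Unwind_Reason_Code _Unwind_Backtrace(_Unwind_Trace_Fn callback, void *ref); is another special unwind process that ignores the personality and reports stack frame information to an external callback.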
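
A minimal sketch of the _Unwind_Backtrace extension, assuming the <unwind.h> shipped with GCC or Clang: the callback records the IP of each frame into a fixed-size buffer. FrameLog, record_frame, and count_frames are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unwind.h>

// Fixed-size storage so that the callback never allocates or throws.
struct FrameLog {
  static constexpr std::size_t kMax = 64;
  std::uintptr_t ips[kMax];
  std::size_t count = 0;
};

// Called once per frame by _Unwind_Backtrace. Returning anything other than
// _URC_NO_REASON stops the walk.
static _Unwind_Reason_Code record_frame(struct _Unwind_Context *ctx, void *arg) {
  FrameLog *log = static_cast<FrameLog *>(arg);
  if (log->count == FrameLog::kMax)
    return _URC_END_OF_STACK; // buffer full: stop walking
  log->ips[log->count++] = _Unwind_GetIP(ctx);
  return _URC_NO_REASON;
}

// Walk the current call stack and return the number of frames visited.
std::size_t count_frames() {
  FrameLog log;
  _Unwind_Backtrace(record_frame, &log);
  return log.count;
}
```

The recorded addresses can then be symbolized by other means. How many frames are visited depends on the unwind information available for the callers.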

Level 2 C++ ABI

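This part deals with language-related concepts such as C++ throw, catch blocks, and out-of-scope variable destructors.

Data structures

Each thread has a global exception stack. caughtExceptions stores the top (most recent) exception of the stack, and __cxa_exception::nextException points to the next exception in the stack.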

struct __cxa_eh_globals {
__cxa_exception *caughtExceptions;
unsigned uncaughtExceptions;
};

int main() {
try {
throw 1;
} catch (...) {
try {
throw 2;
} catch (...) {
// The global exception stack has two exceptions here.
}
}
}
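
The uncaughtExceptions counter is observable from standard C++ through std::uncaught_exceptions() (C++17). The following sketch, with made-up names Probe, Counts, and observe_uncaught, samples the counter from a destructor that runs during unwinding, from inside the handler, and after the handler.

```cpp
#include <cassert>
#include <exception>

// Records std::uncaught_exceptions() when destroyed. During unwinding the
// destructor runs from a cleanup landing pad, before the handler is entered.
struct Probe {
  int *out;
  ~Probe() { *out = std::uncaught_exceptions(); }
};

struct Counts {
  int during_unwind;
  int in_handler;
  int after;
};

Counts observe_uncaught() {
  Counts c{-1, -1, -1};
  try {
    Probe p{&c.during_unwind};
    throw 1;
  } catch (...) {
    // Entering the handler makes the exception "caught".
    c.in_handler = std::uncaught_exceptions();
  }
  c.after = std::uncaught_exceptions();
  return c;
}
```

The counter is 1 while the Probe destructor runs and 0 once the handler has been entered.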

The definition of __cxa_exception is as follows. The _Unwind_Exception defined by the Base ABI is stored at its end. __cxa_exception adds C++ semantic information on top of _Unwind_Exception.

// Level 2
struct __cxa_exception {
void *reserve; // here on 64-bit platforms
size_t referenceCount; // here on 64-bit platforms
std::type_info *exceptionType;
void (*exceptionDestructor)(void *);
unexpected_handler unexpectedHandler; // by default std::get_unexpected()
terminate_handler terminateHandler; // by default std::get_terminate()
__cxa_exception *nextException; // linked to the next exception on the thread stack
int handlerCount; // incremented in __cxa_begin_catch, decremented in __cxa_end_catch, negated in __cxa_rethrow; last non-dependent performs the clean

// The following fields cache information the catch handler found in phase 1.
int handlerSwitchValue; // ttypeIndex in libc++abi
const char *actionRecord;
const char *languageSpecificData;
void *catchTemp; // landingPad
void *adjustedPtr; // adjusted pointer of the exception object

_Unwind_Exception unwindHeader;
};

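The information needed to handle an exception (for a given IP, whether it is inside a try catch, whether there are out-of-scope variable destructors to run, whether there is a dynamic exception specification) is called the language-specific data area (LSDA). It is implementation-dependent and is not defined by the Level 2 ABI.

Landing pad

A landing pad is a piece of exception-related code in the text section. There are three kinds:

  • cleanup clause: usually calls destructors of out-of-scope variables or callbacks registered by __attribute__((cleanup(...))), then uses _Unwind_Resume to return to the cleanup phase
  • catch clause that catches the exception: calls destructors of out-of-scope variables, then calls __cxa_begin_catch, executes the catch code, and finally calls __cxa_end_catch
  • rethrow: calls destructors of out-of-scope variables in the catch clause, then calls __cxa_end_catch, and then uses _Unwind_Resume to return to the cleanup phase

If a try has multiple catch clauses, the language-specific data area contains multiple chained action table entries, but the landing pad describes the merged catch clauses. Before transferring control to the landing pad, the personality calls _Unwind_SetGR to set __builtin_eh_return_data_regno(1) to switchValue, telling the landing pad which type matched.

A rethrow is triggered by __cxa_rethrow in the middle of executing the catch code. It needs to destruct the local variables defined in the catch clause and call __cxa_end_catch to balance the __cxa_begin_catch called at the start of the catch clause.

.gcc_except_table

On ELF systems the language-specific data area is usually stored in the .gcc_except_table section. This section is parsed by __gxx_personality_v0 and __gcc_personality_v0. Its structure is simple:

  • header (the encodings of @LPStart, @TType, and call sites, plus the starting offset of action records)
  • call site table: describes, for each call site (an address range), the landing pad offset (0 if none) and the action record offset (biased by 1, 0 for no action) to use
  • action table
  • type table (referenced by positive switch values)
  • dynamic exception specification (deprecated in C++, so rarely used) (referenced by negative switch values)

Here is an example: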

  .section        .gcc_except_table,"a",@progbits
.p2align 2
GCC_except_table0:
.Lexception0:
.byte 255 # @LPStart Encoding = omit
.byte 3 # @TType Encoding = udata4
.uleb128 .Lttbase0-.Lttbaseref0 # The start of action records
.Lttbaseref0:
.byte 1 # Call site Encoding = uleb128
.uleb128 .Lcst_end0-.Lcst_begin0
.Lcst_begin0: # 2 call site code ranges
.uleb128 .Ltmp0-.Lfunc_begin0 # >> Call Site 1 <<
.uleb128 .Ltmp1-.Ltmp0 # Call between .Ltmp0 and .Ltmp1
.uleb128 .Ltmp2-.Lfunc_begin0 # jumps to .Ltmp2
.byte 1 # On action: 1
.uleb128 .Ltmp1-.Lfunc_begin0 # >> Call Site 2 <<
.uleb128 .Lfunc_end0-.Ltmp1 # Call between .Ltmp1 and .Lfunc_end0
.byte 0 # has no landing pad
.byte 0 # On action: cleanup
.Lcst_end0:
.byte 1 # >> Action Record 1 <<
# Catch TypeInfo 1
.byte 0 # No further actions
.p2align 2
# >> Catch TypeInfos <<
.long _ZTIi # TypeInfo 1

Besides the call site offset and length, each call site record has two values: the landing pad offset and the action record offset.

  • landing pad offset is 0: there is no landing pad, and the action record offset should also be 0
  • landing pad offset is non-zero: there is a landing pad
    • action record offset is 0: also called cleanup (the term "cleanup" is somewhat ambiguous because Level 1 has the term cleanup phase). It usually describes local variable destructors and __attribute__((cleanup(...)))
    • action record offset is non-zero: it points to an action record in the action table, describing a catch, a noexcept specifier, or an exception specification

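Each action record has two values:

  • switch value (SLEB128): a positive value is the index into the type table of the TypeInfo of the caught type; a negative value is the offset of an exception specification in the type table; 0 indicates a cleanup action, with an effect similar to an action record offset of 0 in the call site record
  • offset to next action record: the next action record to process, with 0 marking the end. This singly linked list form can describe multiple chained catch clauses or an exception specification list

The offset to next action record can serve not only as a singly linked list but also as a trie, though scenarios that can exploit the trie property almost never arise.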
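
Both fields of an action record are SLEB128 values. The following sketch decodes one record from raw bytes. ActionRecord and readActionRecord are names made up for this example, and the decoder assumes well-formed input of at most 64 bits per value.

```cpp
#include <cassert>
#include <cstdint>

// Decode one signed LEB128 value and advance *p past it.
int64_t readSLEB128(const uint8_t **p) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *(*p)++;
    if (shift < 64)
      result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  // Sign-extend if the sign bit (0x40) of the last byte is set.
  if (shift < 64 && (byte & 0x40))
    result |= ~static_cast<uint64_t>(0) << shift;
  return static_cast<int64_t>(result);
}

struct ActionRecord {
  int64_t switchValue; // >0: type table index; <0: exception spec; 0: cleanup
  int64_t nextOffset;  // 0 terminates the chain
};

// Decode the two SLEB128 fields of the action record starting at p.
ActionRecord readActionRecord(const uint8_t *p) {
  ActionRecord r;
  r.switchValue = readSLEB128(&p);
  r.nextOffset = readSLEB128(&p);
  return r;
}
```

For the record in the example table above (.byte 1 followed by .byte 0), this yields a switch value of 1 and a next offset of 0, i.e. a single catch of TypeInfo 1.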

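Values of landing pad offset/action record offset for different regions of a program:

  • A non-try region without local variable destructors: landing_pad_offset==0 && action_record_offset==0
  • A non-try region with local variable destructors: landing_pad_offset!=0 && action_record_offset==0. Phase 2 should stop and call the cleanup
  • A non-try region with __attribute__((cleanup(...))) variables: landing_pad_offset!=0 && action_record_offset==0. Same as above
  • A try region: landing_pad_offset!=0 && action_record_offset!=0. The landing pad points to the code block formed by concatenating the catch clauses. The action record has a type filter greater than 0, describing a catch
  • A try region with catch (...): same as above. The action record has a type filter greater than 0 that refers to a type table entry with value 0 (meaning catch any)
  • A region, in a function with a noexcept specifier, that may propagate an exception to the caller: landing_pad_offset!=0 && action_record_offset!=0. The landing pad points to a code block that calls std::terminate. The action record has a type filter greater than 0 that refers to a type table entry with value 0 (meaning catch any)
  • A region, in a function with an exception specifier, that may propagate an exception to the caller: landing_pad_offset!=0 && action_record_offset!=0. The landing pad points to a code block that calls __cxa_call_unexpected. The action record has a type filter less than 0, describing an exception specifier list

Level 2 API

void *__cxa_allocate_exception(size_t thrown_size);. The compiler generates a call to this function for throw A();, allocating a block of memory to hold the __cxa_exception and the A object. The __cxa_exception is immediately to the left of the A object. The following function shows the relationship between the address of the exception object that the program operates on and the __cxa_exception: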

static void *thrown_object_from_cxa_exception(__cxa_exception *exception_header) {
return static_cast<void *>(exception_header + 1);
}

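void __cxa_throw(void *thrown, std::type_info *tinfo, void (*destructor)(void *)); applies the inverse of the function above to find the __cxa_exception header, fills in its fields (referenceCount, exception_class, unexpectedHandler, terminateHandler, exceptionType, exceptionDestructor, unwindHeader.exception_cleanup), and then calls _Unwind_RaiseException. This function is noreturn.

void *__cxa_begin_catch(void *obj);. The compiler generates a call to this function at the beginning of a catch block. For a native exception:

  • Increment handlerCount
  • Push the exception onto the thread's global exception stack and decrement uncaughtExceptions
  • Return the adjusted pointer of the exception object

For a foreign exception (which does not necessarily have a __cxa_exception header):

  • Push the exception if the thread's global exception stack is empty; otherwise call std::terminate (it is unknown whether the exception has a field like __cxa_exception::nextException)
  • Return static_cast<_Unwind_Exception *>(obj) + 1 (assuming the thrown object immediately follows the _Unwind_Exception)

Simplified implementation: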

void __cxa_throw(void *thrown, std::type_info *tinfo, void (*destructor)(void *)) {
__cxa_exception *hdr = (__cxa_exception *)thrown - 1;
hdr->exceptionType = tinfo; hdr->exceptionDestructor = destructor;
hdr->unexpectedHandler = std::get_unexpected();
hdr->terminateHandler = std::get_terminate();
hdr->unwindHeader.exception_class = ...;
__cxa_get_globals()->uncaughtExceptions++;
_Unwind_RaiseException(&hdr->unwindHeader);
// Failed to unwind, e.g. the .eh_frame FDE is absent.
__cxa_begin_catch(&hdr->unwindHeader);
std::terminate();
}

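void __cxa_end_catch(); is called at the end of a catch block or on rethrow. For a native exception:

  • Get the current exception from the thread's global exception stack and decrement handlerCount
  • If handlerCount reaches zero, pop the thread's global exception stack
  • If it is a native exception: call __cxa_free_exception when handlerCount drops to 0 (for a dependent exception, decrement referenceCount and call __cxa_free_exception when it reaches 0)

For a foreign exception:

  • Call _Unwind_DeleteException
  • Execute __cxa_eh_globals::caughtExceptions = nullptr; (by the nature of __cxa_begin_catch, there is exactly one exception in the stack)

void __cxa_rethrow(); marks the exception object so that it is not destroyed when __cxa_end_catch decrements handlerCount to 0, because the object will be reused by the cleanup phase resumed by _Unwind_Resume.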
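
The lifetime rules can be observed from ordinary C++ code. The sketch below counts live instances of a made-up type Tracked and samples the count in an inner handler that rethrows, in the outer handler, and after both handlers have finished. It only shows the observable effect, not the runtime's bookkeeping.

```cpp
#include <cassert>

// Counts live instances so that the lifetime of the exception object is visible.
struct Tracked {
  static int alive;
  Tracked() { ++alive; }
  Tracked(const Tracked &) { ++alive; }
  ~Tracked() { --alive; }
};
int Tracked::alive = 0;

struct Observation {
  int in_inner;
  int in_outer;
  int after;
};

Observation observe_rethrow() {
  Observation o{-1, -1, -1};
  try {
    try {
      throw Tracked{};
    } catch (const Tracked &) {
      o.in_inner = Tracked::alive;
      throw; // rethrow: the same exception object keeps propagating
    }
  } catch (const Tracked &) {
    o.in_outer = Tracked::alive;
  }
  o.after = Tracked::alive; // destroyed once the last handler exits
  return o;
}
```

The exception object survives the exit of the inner handler because of the rethrow and is destroyed only when the outer handler finishes.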

Note that, except for __cxa_begin_catch and __cxa_end_catch, most __cxa_* functions cannot handle foreign exceptions (which have no __cxa_exception header).

Example

For the following code:

#include <stdio.h>
struct A { ~A(); };
struct B { ~B(); };
void foo() { throw 0xB612; }
void bar() { B b; foo(); }
void qux() { try { A a; bar(); } catch (int x) { puts(""); } }

The compiled assembly conceptually looks like this:

void foo() {
  int *thrown = (int *)__cxa_allocate_exception(4);
  *thrown = 0xB612;
  __cxa_throw(thrown, &typeid(int), /*destructor=*/nullptr);
}
void bar() {
  B b; foo(); return;
  landing_pad: b.~B(); _Unwind_Resume(obj);
}
void qux() {
  A a; bar(); return;
  landing_pad: a.~A(); __cxa_begin_catch(obj); puts(""); __cxa_end_catch();
}

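Control flow:

  • qux calls bar, bar calls foo, and foo throws an exception
  • foo dynamically allocates a memory block that holds the thrown int and the __cxa_exception header, then calls __cxa_throw
  • __cxa_throw fills in the other fields of __cxa_exception and calls _Unwind_RaiseException

Next, _Unwind_RaiseException drives the two-phase process of Level 1.

  • _Unwind_RaiseException executes phase 1: search phase
    • For bar, the personality is called with _UA_SEARCH_PHASE as the actions argument and returns _URC_CONTINUE_UNWIND (no catch handler)
    • For qux, the personality is called with _UA_SEARCH_PHASE as the actions argument and returns _URC_HANDLER_FOUND (there is a catch handler)
    • The stack pointer of qux's frame is recorded (saved in private_2) and the search stops
  • _Unwind_RaiseException executes phase 2: cleanup phase
    • bar's frame is not the one marked by the search phase; the personality is called with _UA_CLEANUP_PHASE as the actions argument and returns _URC_INSTALL_CONTEXT
    • Jump to the landing pad of bar's frame
    • The landing pad destroys b and then uses _Unwind_Resume to return to the cleanup phase
    • qux's frame is the one marked by the search phase; the personality is called with _UA_CLEANUP_PHASE|_UA_HANDLER_FRAME as the actions argument and returns _URC_INSTALL_CONTEXT
    • Jump to the landing pad of qux's frame
    • The landing pad calls __cxa_begin_catch, executes the catch code, and then calls __cxa_end_catch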
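
The two phases can be modeled with a toy simulation. This is only a sketch: Frame and simulate_raise are hypothetical names, the frames are plain structs rather than real .eh_frame/LSDA data, and the strings merely mirror the return codes listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a stack frame as seen by the personality.
struct Frame {
  std::string name;
  bool has_cleanup; // landing pad that destroys locals, then _Unwind_Resume
  bool has_handler; // landing pad with a matching catch clause
};

// Frames are ordered from the caller of the throwing function outward.
std::string simulate_raise(const std::vector<Frame> &frames) {
  std::string trace;
  // Phase 1: search. No frame is modified; only the handler frame is located.
  std::size_t handler = frames.size();
  for (std::size_t i = 0; i < frames.size(); ++i) {
    if (frames[i].has_handler) {
      trace += "search:" + frames[i].name + "=HANDLER_FOUND;";
      handler = i;
      break;
    }
    trace += "search:" + frames[i].name + "=CONTINUE_UNWIND;";
  }
  if (handler == frames.size())
    return trace + "END_OF_STACK;"; // __cxa_throw would call std::terminate
  // Phase 2: cleanup. Frames below the handler frame only run cleanups.
  for (std::size_t i = 0; i < handler; ++i) {
    assert(!frames[i].has_handler); // the search phase stopped at the first handler
    trace += "cleanup:" + frames[i].name +
             (frames[i].has_cleanup ? "=INSTALL_CONTEXT;" : "=CONTINUE_UNWIND;");
  }
  trace += "cleanup:" + frames[handler].name + "=INSTALL_CONTEXT(handler);";
  return trace;
}
```

For the example, simulate_raise({{"bar", true, false}, {"qux", true, true}}) yields the same sequence as the list above. With no handler frame, the model stops after the search phase without running any cleanup.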

__gxx_personality_v0

The personality routine is called by Level 1 in both phase 1 and phase 2 to provide language-specific processing. Different languages, implementations, or architectures may use different personality routines. Common personality routines are:

  • __gxx_personality_v0: C++
  • __gxx_personality_sj0: sjlj
  • __gcc_personality_v0: C -fexceptions, used for __attribute__((cleanup(...)))
  • __CxxFrameHandler3: Windows MSVC
  • __gxx_personality_seh0: MinGW-w64 -fseh-exceptions
  • __objc_personality_v0: ObjC on Mac OS X

The most commonly used C++ personality on ELF systems is __gxx_personality_v0, which is implemented in:

  • GCC: libstdc++-v3/libsupc++/eh_personality.cc
  • libc++abi: src/cxa_personality.cpp

_Unwind_Reason_Code (*__personality_routine)(int version, _Unwind_Action action, uint64 exceptionClass, _Unwind_Exception *exceptionObject, _Unwind_Context *context);

In the absence of errors:

  • For _UA_SEARCH_PHASE, returns
    • _URC_CONTINUE_UNWIND: no lsda, or there is no landing pad, there is a non-catch handler or a matched exception specification
    • _URC_HANDLER_FOUND: there is a matched catch handler or an unmatched exception specification
  • For _UA_CLEANUP_PHASE, returns
    • _URC_CONTINUE_UNWIND: no lsda, or there is no landing pad, or (not produced by a compiler) there is no cleanup action
    • _URC_INSTALL_CONTEXT: the other cases

Before transferring control to the landing pad, the personality calls _Unwind_SetGR to set two registers (architecture-specific: __builtin_eh_return_data_regno(0) and __builtin_eh_return_data_regno(1)) that hold the _Unwind_Exception * and the switchValue.

Code:

_Unwind_Reason_Code __gxx_personality_v0(int version, _Unwind_Action actions, uint64_t exceptionClass, _Unwind_Exception *exc, _Unwind_Context *ctx) {
if (actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME) && is_native) {
auto *hdr = (__cxa_exception *)(exc+1) - 1;
// Load cached results from phase 1.
results.switchValue = hdr->handlerSwitchValue;
results.actionRecord = hdr->actionRecord;
results.languageSpecificData = hdr->languageSpecificData;
results.landingPad = reinterpret_cast<uintptr_t>(hdr->catchTemp);
results.adjustedPtr = hdr->adjustedPtr;

_Unwind_SetGR(...);
_Unwind_SetGR(...);
_Unwind_SetIP(ctx, results.landingPad);
return _URC_INSTALL_CONTEXT;
}
scan_eh_tab(results, actions, is_native, exc, ctx);
if (results.reason == _URC_CONTINUE_UNWIND ||
results.reason == _URC_FATAL_PHASE1_ERROR)
return results.reason;
if (actions & _UA_SEARCH_PHASE) {
auto *hdr = (__cxa_exception *)(exc+1) - 1;
// Cache LSDA results in hdr.
hdr->handlerSwitchValue = results.switchValue;
hdr->actionRecord = results.actionRecord;
hdr->languageSpecificData = results.languageSpecificData;
hdr->catchTemp = reinterpret_cast<void *>(results.landingPad);
hdr->adjustedPtr = results.adjustedPtr;
return _URC_HANDLER_FOUND;
}
// _UA_CLEANUP_PHASE
_Unwind_SetGR(...);
_Unwind_SetGR(...);
_Unwind_SetIP(ctx, results.landingPad);
return _URC_INSTALL_CONTEXT;
}

For a native exception, when the personality returns _URC_HANDLER_FOUND in the search phase, it caches the LSDA-related information of that stack frame. When the personality is called again in the cleanup phase with actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME), it knows that it can load the cache and does not need to parse .gcc_except_table again.

In the remaining three cases, scan_eh_tab is called to parse .gcc_except_table:

  • actions & _UA_SEARCH_PHASE
  • actions & _UA_CLEANUP_PHASE && actions & _UA_HANDLER_FRAME && !is_native: a foreign exception can be caught by catch (...), but should terminate when it meets an exception specification
  • actions & _UA_CLEANUP_PHASE && !(actions & _UA_HANDLER_FRAME): non-catch handlers, unmatched catch handlers, and matched exception specifications. Another possibility is phase 2 of _Unwind_ForcedUnwind

static void scan_eh_tab(...) {
...
const uint8_t *lsda = (const uint8_t *)_Unwind_GetLanguageSpecificData(context);
if (lsda == nullptr) { res.reason = _URC_CONTINUE_UNWIND; return; }
res.languageSpecificData = lsda;
uintptr_t ipOffset = _Unwind_GetIP(context) - 1 - _Unwind_GetRegionStart(context);
for each call site entry {
if (!(start <= ipOffset && ipOffset < start + length))
continue;
res.landingPad = landingPad;
if (landingPad == 0) { res.reason = _URC_CONTINUE_UNWIND; return; }
if (actionRecord == 0) { // cleanup
res.reason = actions & _UA_SEARCH_PHASE ? _URC_CONTINUE_UNWIND : _URC_HANDLER_FOUND;
return;
}
// A catch or a dynamic exception specification.
const uint8_t *action = actionTableStart + (actionRecord - 1);
bool hasCleanup = false;
for(;;) {
res.actionRecord = action;
int64_t switchValue = readSLEB128(&action);
if (switchValue > 0) { // catch
auto *catchType = ...;
if (catchType == nullptr) { // catch (...)
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = getThrownObjectPtr(exc); res.reason = _URC_HANDLER_FOUND;
return;
} else if (is_native) { // catch (T ...)
auto *hdr = (__cxa_exception *)(exc+1) - 1;
if (catchType->can_catch(hdr->exceptionType, adjustedPtr)) {
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = adjustedPtr; res.reason = _URC_HANDLER_FOUND;
return;
}
}
} else if (switchValue < 0) { // dynamic exception specification
if (actions & _UA_FORCE_UNWIND) {
// Skip if forced unwinding.
} else if (is_native) {
if (!exception_spec_can_catch) {
// The landing pad will call __cxa_call_unexpected.
assert(actions & _UA_SEARCH_PHASE);
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = adjustedPtr; res.reason = _URC_HANDLER_FOUND;
return;
}
} else {
// A foreign exception cannot be matched by the exception specification. The landing pad will call __cxa_call_unexpected.
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = getThrownObjectPtr(exc); res.reason = _URC_HANDLER_FOUND;
return;
}
} else { // switchValue == 0: cleanup
hasCleanup = true;
}
const uint8_t *temp = action;
int64_t actionOffset = readSLEB128(&temp);
if (actionOffset == 0) { // End of action list
res.reason = hasCleanup && actions & _UA_CLEANUP_PHASE
? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
return;
}
action += actionOffset;
}
}
call_terminate();
}
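
scan_eh_tab depends on LEB128 decoding and on following the chain of action records. Below is a minimal sketch of both, assuming a well-formed table. The decoder names follow the pseudocode above, while action_filters is a hypothetical helper that collects the switch values of one action chain.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Decode an unsigned LEB128 value and advance *p past it.
uint64_t readULEB128(const uint8_t **p) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    assert(shift < 64); // this sketch rejects over-long encodings
    byte = *(*p)++;
    result |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return result;
}

// Decode a signed LEB128 value and advance *p past it.
int64_t readSLEB128(const uint8_t **p) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    assert(shift < 64);
    byte = *(*p)++;
    result |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (shift < 64 && (byte & 0x40)) // sign-extend from the last payload bit
    result |= ~uint64_t(0) << shift;
  return int64_t(result);
}

// Walk one action chain. `first` is the 1-based offset stored in a call site
// record. Each record is a SLEB128 switch value followed by a SLEB128 offset
// to the next record, relative to the offset field itself; 0 ends the chain.
std::vector<int64_t> action_filters(const uint8_t *table, uint64_t first) {
  std::vector<int64_t> filters;
  const uint8_t *action = table + (first - 1);
  for (;;) {
    filters.push_back(readSLEB128(&action));
    const uint8_t *temp = action;
    int64_t next = readSLEB128(&temp);
    if (next == 0)
      break;
    action += next;
  }
  return filters;
}
```

For the table {0x01, 0x00, 0x02, 0x7d}, starting at first == 3 visits the record with switch value 2, follows the offset -3 back to the start of the table, and then visits the record with switch value 1.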

__gcc_personality_v0

libgcc and compiler-rt/lib/builtins implement this function to handle __attribute__((cleanup(...))). The default implementation never returns _URC_HANDLER_FOUND in the search phase, so a cleanup handler cannot serve as a catch handler. However, we can provide our own implementation that returns _URC_HANDLER_FOUND in the search phase... On x86-64, __builtin_eh_return_data_regno(0) is RAX. We can let the cleanup handler pass RAX to the landing pad.

// a.cc
#include <exception>
#include <stdio.h>

extern "C" void my_catch();
extern "C" void throw_exception() { throw 42; }

int main() {
fprintf(stderr, "uncaught exceptions: %d\n", std::uncaught_exceptions());
my_catch();
fprintf(stderr, "uncaught exceptions: %d\n", std::uncaught_exceptions());
}

// b.c
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unwind.h>

void throw_exception();

struct __cxa_eh_globals {
struct __cxa_exception *caughtExceptions;
unsigned uncaughtExceptions;
};

struct __cxa_eh_globals *__cxa_get_globals();

static uintptr_t readULEB128(const uint8_t **a) {
uintptr_t res = 0, shift = 0;
const uint8_t *p = *a;
uint8_t b;
do {
b = *p++;
res |= (uintptr_t)(b & 0x7f) << shift; // widen before shifting to avoid int overflow
shift += 7;
} while (b & 0x80);
*a = p;
return res;
}

_Unwind_Reason_Code __gcc_personality_v0(int version, _Unwind_Action actions,
uint64_t exception_class,
struct _Unwind_Exception *obj,
struct _Unwind_Context *ctx) {
const uint8_t *lsda = _Unwind_GetLanguageSpecificData(ctx);
if (lsda == 0)
return _URC_CONTINUE_UNWIND;
uintptr_t func = _Unwind_GetRegionStart(ctx);
uintptr_t pc = _Unwind_GetIP(ctx) - 1 - func;
if (*lsda++ != 255) // Skip LPStart
readULEB128(&lsda);
if (*lsda++ != 255) // Skip TType
readULEB128(&lsda);
uintptr_t call_site_table_len = 0;
if (*lsda++ == 1)
call_site_table_len = readULEB128(&lsda);
const uint8_t *end = lsda + call_site_table_len;
while (lsda < end) {
uintptr_t start = readULEB128(&lsda), len = readULEB128(&lsda),
lpad = readULEB128(&lsda);
readULEB128(&lsda); // Skip the action field, the fourth field of a call site record.
if (!(start <= pc && pc < start + len))
continue;
if (lpad == 0)
return _URC_CONTINUE_UNWIND;
if (actions & _UA_SEARCH_PHASE)
return _URC_HANDLER_FOUND;
_Unwind_SetGR(ctx, __builtin_eh_return_data_regno(0), (uintptr_t)obj);
_Unwind_SetGR(ctx, __builtin_eh_return_data_regno(1), 0); // switchValue==0
_Unwind_SetIP(ctx, func + lpad);
return _URC_INSTALL_CONTEXT;
}
return _URC_FATAL_PHASE2_ERROR;
}

struct Catch {
struct _Unwind_Exception *obj;
jmp_buf env;
bool do_catch;
};

__attribute__((used))
static void my_jump(struct Catch *c) {
if (c->do_catch) {
struct __cxa_eh_globals *globals = __cxa_get_globals();
globals->uncaughtExceptions--;
longjmp(c->env, 1);
}
}

__attribute__((naked)) static void my_cleanup(struct Catch *c) {
asm("movq %rax, (%rdi); jmp my_jump");
}

void my_catch() {
__attribute__((cleanup(my_cleanup))) struct Catch c;
if (setjmp(c.env) == 0) {
c.do_catch = 1;
throw_exception();
} else {
fprintf(stderr, "caught exception: %p\n", c.obj);
fprintf(stderr, "value: %d\n", *(int *)(c.obj + 1));
c.do_catch = 0;
}
}
% clang -c -fexceptions a.cc b.c
% clang++ a.o b.o
% ./a.out
uncaught exceptions: 0
caught exception: 0x10f7f10
value: 42
uncaught exceptions: 0
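
For comparison, the ordinary non-catch behavior of a cleanup handler can be checked with a much smaller program. This is a sketch assuming GCC or Clang, compiled as C++ so that throw is available; in that case the frame uses __gxx_personality_v0 rather than __gcc_personality_v0. on_cleanup and cleanups_seen_by_handler are hypothetical names.

```cpp
#include <cassert>

static int cleanups = 0;
static void on_cleanup(int *) { ++cleanups; } // invoked with the address of the variable

static void thrower() { throw 1; }

// Returns how many cleanup calls have happened by the time the handler runs.
int cleanups_seen_by_handler() {
  cleanups = 0;
  try {
    int guard __attribute__((cleanup(on_cleanup))) = 0;
    (void)guard;
    thrower(); // unwinding leaves the scope of guard and runs on_cleanup
  } catch (int v) {
    assert(v == 1);
    return cleanups;
  }
  return -1; // not reached: thrower always throws
}
```

The cleanup runs during the cleanup phase and the exception then continues to the catch clause, so the function returns 1.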

Rethrow

The Landing pad section above briefly described the code executed by rethrow. A caught exception is usually destroyed in __cxa_end_catch, so __cxa_rethrow marks the exception object and increases handlerCount.

C++11 introduced Exception Propagation (N2179; std::rethrow_exception etc.), which libstdc++ implements with __cxa_dependent_exception. For the design, see https://gcc.gnu.org/legacy-ml/libstdc++/2008-05/msg00079.html

struct __cxa_dependent_exception {
void *reserve;
void *primaryException;
};

std::current_exception and std::rethrow_exception increase the reference count.

In libstdc++, __cxa_rethrow calls the GCC extension _Unwind_Resume_or_Rethrow (which can resume forced unwinding).
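
The reference counting is observable through std::exception_ptr. The sketch below assumes an Itanium C++ ABI runtime (libsupc++ or libc++abi), where std::current_exception shares the exception object instead of copying it. Payload, capture, rethrow_value, and propagation_trace are hypothetical names.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical exception type that counts live instances.
struct Payload {
  static int live;
  int value;
  explicit Payload(int v) : value(v) { ++live; }
  Payload(const Payload &o) : value(o.value) { ++live; }
  ~Payload() { --live; }
};
int Payload::live = 0;

// The returned exception_ptr keeps the object alive after the catch block ends.
std::exception_ptr capture() {
  try {
    throw Payload(7);
  } catch (...) {
    return std::current_exception();
  }
}

// Rethrows through a dependent exception that refers to the same primary object.
int rethrow_value(const std::exception_ptr &p) {
  try {
    std::rethrow_exception(p);
  } catch (const Payload &e) {
    return e.value;
  }
}

std::string propagation_trace() {
  std::string trace;
  {
    std::exception_ptr p = capture();
    assert(p != nullptr);
    trace += "captured:" + std::to_string(Payload::live) + ";";
    trace += "value:" + std::to_string(rethrow_value(p)) + ";";
    trace += "rethrown:" + std::to_string(Payload::live) + ";";
  }
  // The last reference is gone, so the exception object has been destroyed.
  trace += "released:" + std::to_string(Payload::live) + ";";
  return trace;
}
```

Under the stated assumption there is a single Payload object for the whole sequence, and it is destroyed only when the last exception_ptr goes away.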

LLVM IR

To be expanded.

  • nounwind: cannot unwind
  • uwtable: force generation of the unwind table regardless of nounwind

if uwtable
if nounwind
CantUnwind
else
Unwind Table
else
do nothing

Compiler behavior

  • -fno-exceptions -fno-asynchronous-unwind-tables: neither .eh_frame nor .gcc_except_table exists
  • -fno-exceptions -fasynchronous-unwind-tables: .eh_frame exists, .gcc_except_table doesn't
  • -fexceptions: both .eh_frame and .gcc_except_table exist
    • In GCC, for a noexcept function, a possibly-throwing call site unhandled by a try block does not get an entry in the .gcc_except_table call site table. If the function has no try block, it gets a header-only .gcc_except_table (4 bytes)
    • In Clang, there is a call site entry calling __clang_call_terminate. The size overhead is larger than GCC's scheme. Improving this requires LLVM IR work

When an exception is about to propagate to the caller of a function:

  • no .eh_frame: _Unwind_RaiseException returns _URC_END_OF_STACK. __cxa_throw calls std::terminate
  • .eh_frame without .gcc_except_table: pass-through (local variable destructors are not called). This is the case of -fno-exceptions -fasynchronous-unwind-tables.
  • .eh_frame with empty .gcc_except_table: __gxx_personality_v0 calls std::terminate since no call site code range matches
  • .eh_frame with proper .gcc_except_table: unwind

Combining the above, when an exception is about to propagate to the caller of a noexcept function:

  • -fno-exceptions -fno-asynchronous-unwind-tables: propagating through a function calls std::terminate
  • -fno-exceptions -fasynchronous-unwind-tables: pass-through. Local variable destructors are not called. This behavior is unexpected.
  • -fexceptions: propagating through a noexcept function calls std::terminate

When std::terminate is called, there is a diagnostic looking like terminate called after throwing an instance of 'int' (libstdc++; libc++ has a similar one). There is no stack trace. If the process installs a SIGABRT signal handler, the handler may get a stack trace and symbolize the addresses.

Catching exceptions while unwinding through -fno-exceptions code is a proposal to improve the diagnostics.
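
The type name in that diagnostic can be obtained through <cxxabi.h>. The following sketch assumes an Itanium C++ ABI runtime; current_exception_type_name and thrown_type_of_int are hypothetical helpers. It shows only the lookup and the demangling, not the actual terminate handler of either library.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>
#include <typeinfo>

// Returns the demangled type name of the exception currently being handled,
// or an empty string if there is none.
std::string current_exception_type_name() {
  std::type_info *type = abi::__cxa_current_exception_type();
  if (type == nullptr)
    return "";
  int status = 0;
  char *demangled = abi::__cxa_demangle(type->name(), nullptr, nullptr, &status);
  std::string name = (status == 0 && demangled) ? demangled : type->name();
  std::free(demangled); // __cxa_demangle allocates the result with malloc
  return name;
}

std::string thrown_type_of_int() {
  try {
    throw 42;
  } catch (...) {
    assert(abi::__cxa_current_exception_type() != nullptr);
    return current_exception_type_name();
  }
}
```

A function installed with std::set_terminate could call the first helper to print a similar message before aborting.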

Personality and typeinfo encoding

.eh_frame contains information about the unwind operation. See Stack unwinding for its format.

In -fpie/-fpic mode, the personality and type info encodings have the DW_EH_PE_indirect|DW_EH_PE_pcrel bits on most targets.

void raise() { throw 42; }
bool foo() {
try { raise(); } catch (int) { return true; }
return false;
}
int main() { foo(); }
_Z3foov:
.cfi_startproc
.cfi_personality 155, DW.ref.__gxx_personality_v0
.cfi_lsda 27, .Lexception0
...

.section .gcc_except_table,"a",@progbits
...
# >> Catch TypeInfos <<
.Ltmp3: # TypeInfo 1
.long .L_ZTIi.DW.stub-.Ltmp3
.Lttbase0:

.data
.p2align 3, 0x0
.L_ZTIi.DW.stub:
.quad _ZTIi
.hidden DW.ref.__gxx_personality_v0
.weak DW.ref.__gxx_personality_v0
.section .data.DW.ref.__gxx_personality_v0,"aGw",@progbits,DW.ref.__gxx_personality_v0,comdat
.p2align 3, 0x0
.type DW.ref.__gxx_personality_v0,@object
.size DW.ref.__gxx_personality_v0, 8
DW.ref.__gxx_personality_v0:
.quad __gxx_personality_v0

In the example, .eh_frame contains a PC-relative relocation referencing DW.ref.__gxx_personality_v0, and .gcc_except_table contains a PC-relative relocation referencing .L_ZTIi.DW.stub. The relocations are link-time constants, so both sections can remain read-only.

DW.ref.__gxx_personality_v0 and .L_ZTIi.DW.stub reside in writable sections which will contain dynamic relocations if __gxx_personality_v0 and _ZTIi are defined in a shared object - which is often the case.

For -fno-pic code, different targets have different ideas. AArch64 and RISC-V use DW_EH_PE_indirect|DW_EH_PE_pcrel as well. On x86, .cfi_personality refers to __gxx_personality_v0. This will lead to a canonical PLT if __gxx_personality_v0 is defined in a shared object (e.g. libstdc++.so.6). I sent a patch https://gcc.gnu.org/PR108622 to use DW_EH_PE_indirect|DW_EH_PE_pcrel.

R_MIPS_32 and R_MIPS_64 personality encoding

https://github.com/llvm/llvm-project/issues/58377

void foo() { try { throw 1; } catch (...) {} }

mips64el-linux-gnuabi64-g++ -fpic and clang++ --target=mips64el-unknown-linux-gnuabi64 -fpic use DW_EH_PE_absptr | DW_EH_PE_indirect to encode personality routine pointers. Using DW_EH_PE_absptr instead of DW_EH_PE_pcrel is wrong. GNU ld works around the compiler design problem by converting DW_EH_PE_absptr to DW_EH_PE_pcrel. ld.lld does not support this and will report an error:

% clang++ --target=mips64el-linux-gnuabi -fpic -fuse-ld=lld -shared ex.cc
ld.lld: error: relocation R_MIPS_64 cannot be used against symbol 'DW.ref.__gxx_personality_v0'; recompile with -fPIC
>>> defined in /tmp/ex-40a996.o
>>> referenced by ex.cc
>>> /tmp/ex-40a996.o:(.eh_frame+0x13)
...

R_MIPS_32 for 32-bit builds is similar.

Potentially-throwing __cxa_end_catch

__cxa_end_catch is potentially-throwing because it may destroy an exception object with a potentially-throwing destructor (e.g. ~C() noexcept(false) { ... }).

struct A { ~A(); };
void opaque();
void foo() {
A a;
// The exception object has an unknown type and its destructor may throw. The landing pad
// then needs to call A::~A for `a` before jumping to _Unwind_Resume.
try { opaque(); } catch (...) { }
}

To support an exception object with a potentially-throwing destructor, Clang generates conservative code for a catch-all clause or a catch clause matching a record type:

  • assume that the exception object may have a throwing destructor
  • emit invoke void @__cxa_end_catch (as the call is not marked with the nounwind attribute).
  • emit a landing pad to destroy local variables and call _Unwind_Resume

Per C++ [dcl.fct.def.coroutine], a coroutine's function body implies a catch (...). Clang's code generation pessimizes even simple code, like:

UserFacing foo() {
A a;
opaque();
co_return;
// For `invoke void @__cxa_end_catch()`, the landing pad destroys the
// promise_type and deletes the coro frame.
}

Throwing destructors are typically discouraged. In many environments, the destructors of exception objects are guaranteed to never throw, making our conservative code generation approach seem wasteful.

Furthermore, throwing destructors tend not to work well in practice:

  • GCC does not emit call site records for the region containing __cxa_end_catch. This has been the case for a long time, since 2000.
  • If a catch-all clause catches an exception object whose destructor throws, both GCC and Clang using libstdc++ leak the allocated exception object.

To avoid code generation pessimization, I added -fassume-nothrow-exception-dtor for Clang 18 to assume that __cxa_end_catch calls have the nounwind attribute. This requires that thrown exception objects' destructors will never throw.

To detect misuses, diagnose throw expressions with a potentially-throwing destructor. Technically, it is possible that a potentially-throwing destructor never throws when called transitively by __cxa_end_catch, but these cases seem rare enough to justify a relaxed mode.
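
A similar check can be approximated in user code with type traits. This is only a sketch of the idea, not how the compiler diagnostic is implemented: checked_throw is a hypothetical helper that refuses to compile for exception types with a potentially-throwing destructor.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Quiet { int v; };                    // implicitly noexcept destructor
struct Loud { ~Loud() noexcept(false) {} }; // potentially-throwing destructor

// Hypothetical wrapper around throw that rejects types such as Loud.
template <class T>
[[noreturn]] void checked_throw(T &&value) {
  using E = typename std::decay<T>::type;
  static_assert(std::is_nothrow_destructible<E>::value,
                "exception type has a potentially-throwing destructor");
  throw std::forward<T>(value);
}

int probe() {
  try {
    checked_throw(Quiet{3});
  } catch (const Quiet &q) {
    return q.v;
  }
  return 0; // not reached: checked_throw never returns
}
```

checked_throw(Loud{}) would fail the static_assert, which mirrors the relaxed mode described above: the type is rejected even if its destructor never actually throws.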

Miscellaneous

Using libc++ and libc++abi

On Linux, compared with clang, clang++ additionally links against libstdc++/libc++ and libm.

Dynamically link against libc++.so (which depends on libc++abi.so) (additionally specify -pthread if threads are used):

clang++ -stdlib=libc++ -nostdlib++ a.cc -lc++ -lc++abi
# clang -stdlib=libc++ a.cc -lc++ -lc++abi does not pass -lm to the linker.

If compile actions and link actions are separate (-stdlib=libc++ passes -lc++ but its position is undesired, so just don't use it):

clang++ -nostdlib++ a.cc -lc++ -lc++abi

Statically link in libc++.a (which includes the members of libc++abi.a). This requires a -DLIBCXX_ENABLE_STATIC_ABI_LIBRARY=on build:

clang++ -stdlib=libc++ -static-libstdc++ -nostdlib++ a.cc -pthread

Statically link in libc++.a and libc++abi.a. This is a bit inferior because there is a duplicate -lc++ passed by the driver.

clang++ -stdlib=libc++ -static-libstdc++ -nostdlib++ a.cc -Wl,--push-state,-Bstatic -lc++ -lc++abi -Wl,--pop-state -pthread

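libc++abi and libsupc++

Notably, the type layouts that libc++abi provides for <exception> and <stdexcept> (such as logic_error and runtime_error) are deliberately compatible with libsupc++. After libstdc++ in GCC 5 abandoned the ref-counted std::string, libsupc++ still uses __cow_string for logic_error and the like. libc++abi uses a similar ref-counted string.

libsupc++ and libc++abi do not use inline namespaces and have conflicting symbol names, so a libc++/libc++abi application usually cannot use a shared object that dynamically links against libstdc++.so (ODR violation).

With some effort the problem can still be solved: build the non-libsupc++ part of libstdc++ to get a custom libstdc++.so.6. The executable links against libc++abi, which provides the C++ ABI symbols needed by libstdc++.so.6.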

Monolithic .gcc_except_table

Before Clang 12, a monolithic .gcc_except_table was used. As with many other metadata sections, the main problem of the monolithic design is that it cannot be garbage collected by the linker. For RISC-V -mrelax and basic block sections the problem is bigger: .gcc_except_table has relocations referencing local symbols in text sections. If a referenced text section in a COMDAT group is discarded, these relocations are rejected by the linker (error: relocation refers to a symbol in a discarded section).

(Figure: .eh_frame with monolithic .gcc_except_table)

The solution is to adopt a fragmented .gcc_except_table (https://reviews.llvm.org/D83655).

(Figure: fragmented .gcc_except_table)

But the actual deployment is not that simple :) LLD processes --gc-sections first (at which point it is not yet known which .eh_frame pieces are live), and processes .eh_frame (including its GC) afterwards.

During --gc-sections, all .eh_frame pieces are live. They mark all .gcc_except_table.* live. By the GC rules of section groups, a .gcc_except_table.* marks the other sections in the same section group (including .text.*) live. The result is that .text.* in all section groups cannot be GCed, which increases the output size.

(Figure: bad GC with .gcc_except_table.*)

https://reviews.llvm.org/D91579 fixed the problem: for .eh_frame, do not mark .gcc_except_table in a section group.

(Figure: good GC with .gcc_except_table.*)

-fbasic-block-sections=

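With basic block sections, each basic block section could get its own dedicated .gcc_except_table, or all basic block sections of a function could share one .gcc_except_table. The LLVM implementation chose the latter, which has several advantages: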

  • No duplicate headers
  • Sharable type table
  • Sharable action table (this only matters for the deprecated exception specification)

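Sharing one .gcc_except_table means there is only one LPStart, so the offsets from every landing pad to LPStart must be representable with relocations. Since most architectures lack a relocation type that represents a difference, placing the landing pads in the same section is the most suitable representation.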

Exception handling ABI for the ARM architecture

The overall structure is the same as Itanium C++ ABI: Exception Handling, with some differences in data structures, _Unwind_*, etc.

https://maskray.me/blog/2020-11-08-stack-unwinding contains a few notes.

Compact Exception Tables for MIPS ABIs

Described by .eh_frame_entry and .gnu_extab.

Design ideas:

  • Exception code ranges are sorted and must be linearly searched. Therefore it would be more compact to specify each relative to the previous one, rather than relative to a fixed base.
  • The landing pad is often close to the exception region that uses it. Therefore it is better to use the end of the exception region as the reference point, than use the function base address.
  • The action table can be integrated directly with the exception region definition itself. This removes one indirection. The threading of actions can still occur, by providing an offset to the next exception encoding of interest.
  • Often the action threading is to the next exception region, so optimizing that case is important.
  • Catch types and exception specification type lists cannot easily be encoded inline with the exception regions themselves. It is necessary to preserve the unique indices that are automatically created by the DWARF scheme.

It uses compact unwind descriptors similar to ARM EH. Builtin PR1 means there is no language-dependent data; Builtin PR2 is used for C/C++.