# Unwinding through a signal handler

This post collects some notes on unwinding through a signal handler. You may want to read "Stack unwinding" first.

(printf and dladdr are not required to be async-signal-safe, but in this small example we know that calling them from the handler cannot cause problems.)
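For a handler that may run at an arbitrary point, a more conservative pattern is to format addresses by hand and emit them with write(2), which POSIX lists as async-signal-safe. The following is a minimal sketch of that pattern, not code from this post; `format_hex` and `safe_print_addr` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Format v as "0x..." into buf (needs at least 2 * sizeof v + 3 bytes) and
   return the length. Touches no libc state, so a signal handler can call it. */
static size_t format_hex(uintptr_t v, char *buf) {
  static const char digits[] = "0123456789abcdef";
  char tmp[2 * sizeof v];
  size_t n = 0, len = 0;
  do { /* collect digits, least significant first */
    tmp[n++] = digits[v & 15];
    v >>= 4;
  } while (v != 0);
  buf[len++] = '0';
  buf[len++] = 'x';
  while (n > 0)
    buf[len++] = tmp[--n];
  buf[len] = '\0';
  return len;
}

/* Write one address and a newline to fd using only write(2). */
static void safe_print_addr(int fd, uintptr_t addr) {
  char buf[2 * sizeof addr + 4];
  size_t len = format_hex(addr, buf);
  buf[len++] = '\n'; /* overwrite the terminating NUL */
  ssize_t ignored = write(fd, buf, len);
  (void)ignored;
}
```

A handler could call `safe_print_addr(STDERR_FILENO, pc)` for each frame instead of printf.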

Tip: we can additionally print the process's memory mappings, which makes it easier to tell which module each reported address belongs to.
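One way to get the mappings on Linux is to copy /proc/self/maps to the output. This is a sketch under that assumption; `dump_memory_mappings` is a hypothetical name, and because it uses stdio it carries the same async-signal-safety caveat as printf.

```c
#include <stdio.h>
#include <string.h>

/* Copy /proc/self/maps (Linux-specific) to out. Returns the number of
   lines copied, or -1 if the file cannot be opened. */
static int dump_memory_mappings(FILE *out) {
  FILE *f = fopen("/proc/self/maps", "r");
  if (f == NULL)
    return -1;
  char line[512];
  int count = 0;
  while (fgets(line, sizeof line, f) != NULL) {
    fputs(line, out);
    /* A long line may arrive in several chunks; count it once,
       when its newline has been read. */
    if (strchr(line, '\n') != NULL)
      count++;
  }
  fclose(f);
  return count;
}
```

Calling `dump_memory_mappings(stdout)` at the end of the handler prints one line per mapping, including the ones for libc and the VDSO.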

Build the program with either llvm-project libunwind or nongnu libunwind:

(Some targets default to -fno-asynchronous-unwind-tables. In the absence of C++ exceptions, we need at least -funwind-tables.)

## glibc x86-64

With either implementation, the output looks like the following on Linux glibc x86-64. I annotated the lines with location information.

__restore_rt is a signal trampoline defined in glibc sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:

glibc's sigaction sets the sa_restorer field of the kernel's sigaction structure to __restore_rt and sets the SA_RESTORER flag in sa_flags. Before jumping to the signal handler, the kernel sets up the __restore_rt frame with the saved process context (a ucontext_t structure). See kernel arch/x86/kernel/signal.c:setup_rt_frame. When the signal handler returns, control passes to __restore_rt. See man 2 sigreturn.
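The setup can be observed from inside a handler: its return address is the trampoline, and the third argument of an SA_SIGINFO handler points to the saved ucontext_t. The sketch below is an illustration with hypothetical names (`record_handler`, `capture_signal_frame`); it assumes GCC or Clang for `__builtin_return_address` and reads the interrupted PC only on x86-64 and AArch64.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes REG_RIP in <ucontext.h> */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

#if defined(__x86_64__) && !defined(REG_RIP)
#define REG_RIP 16 /* index of RIP in gregs on Linux x86-64 */
#endif

struct signal_frame_info {
  uintptr_t trampoline;     /* address the handler returns to */
  uintptr_t interrupted_pc; /* PC saved by the kernel, 0 if not read */
};

static volatile uintptr_t g_trampoline, g_interrupted_pc;

static void record_handler(int signo, siginfo_t *si, void *ctx) {
  ucontext_t *uc = (ucontext_t *)ctx;
  (void)signo;
  (void)si;
  /* The kernel arranged for the handler to return into the trampoline. */
  g_trampoline = (uintptr_t)__builtin_return_address(0);
#if defined(__x86_64__)
  g_interrupted_pc = (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__aarch64__)
  g_interrupted_pc = (uintptr_t)uc->uc_mcontext.pc;
#else
  (void)uc;
  g_interrupted_pc = 0;
#endif
}

/* Install the handler for SIGUSR1, raise the signal, report what was seen. */
static struct signal_frame_info capture_signal_frame(void) {
  struct sigaction sa;
  struct signal_frame_info info;
  memset(&sa, 0, sizeof sa);
  sa.sa_sigaction = record_handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGUSR1, &sa, NULL);
  raise(SIGUSR1); /* the handler runs before raise returns */
  info.trampoline = g_trampoline;
  info.interrupted_pc = g_interrupted_pc;
  return info;
}
```

Comparing `trampoline` with the memory mappings should place it in libc on x86-64 and in the VDSO on AArch64, while `interrupted_pc` points into raise or a function it calls.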

__restore_rt is implemented in assembly. It comes with DWARF call frame information in .eh_frame.

The DW_OP_breg7 RSP offsets correspond to the ucontext_t offsets of these registers.
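The ucontext_t offsets themselves can be computed with offsetof. The sketch below assumes the Linux x86-64 layout, where gregs is indexed by REG_R8 = 0, REG_RSP = 15 and REG_RIP = 16; `ucontext_greg_offset` is a hypothetical helper and returns -1 on other targets.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <ucontext.h>

/* Byte offset of uc_mcontext.gregs[index] from the start of ucontext_t on
   Linux x86-64, or -1 on other targets. */
static long ucontext_greg_offset(int index) {
#if defined(__x86_64__) && defined(__linux__)
  size_t elem = sizeof(((ucontext_t *)0)->uc_mcontext.gregs[0]);
  return (long)(offsetof(ucontext_t, uc_mcontext.gregs) + (size_t)index * elem);
#else
  (void)index;
  return -1;
#endif
}
```

On Linux x86-64, `ucontext_greg_offset(0)` (R8) is 40, `ucontext_greg_offset(15)` (RSP) is 160 and `ucontext_greg_offset(16)` (RIP) is 168. These are the values to compare against the DW_OP_breg7 operands in the CFI dump.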

With this information, libunwind can unwind through the trampoline without knowing the layout of the ucontext_t structure. Note that all general purpose registers are encoded. libunwind/docs/unw_get_reg.man says:

> However, for signal frames (see unw_is_signal_frame(3)), it is usually possible to access all registers.

Volatile (caller-saved) registers are also part of the saved process context. This is different from other frames, where the values of volatile registers are typically lost.

## glibc AArch64

The output looks like:

As a relatively new port, Linux AArch64 defines the signal trampoline __kernel_rt_sigreturn in the VDSO (see arch/arm64/kernel/vdso/sigreturn.S). This is unlike x86-64 which defines the function in libc. We can use gdb to dump the VDSO.

As of Linux 5.8 (https://git.kernel.org/linus/87676cfca14171fc4c99d96ae2f3e87780488ac4), vdso.so does not have PT_GNU_EH_FRAME. Therefore unwinders (llvm-project libunwind, nongnu libunwind, libgcc_s.so.1) ignore its unwind tables. In gdb, gdb/aarch64-linux-tdep.c recognizes the two instructions and encodes how the kernel sets up the ucontext_t structure.
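Recognizing the trampoline by its instructions amounts to a small pattern match. The following is an illustration rather than gdb's actual code: the function and macro names are hypothetical, and the two constants are the A64 encodings of mov x8, #139 (the rt_sigreturn syscall number) and svc #0.

```c
#include <stdbool.h>
#include <stdint.h>

#define A64_MOV_X8_RT_SIGRETURN UINT32_C(0xd2801168) /* mov x8, #139 */
#define A64_SVC_0 UINT32_C(0xd4000001)               /* svc #0 */

/* Return true if the two instructions at insns form the AArch64 Linux
   rt_sigreturn trampoline. The caller must ensure both words are readable. */
static bool is_aarch64_rt_sigreturn(const uint32_t *insns) {
  return insns[0] == A64_MOV_X8_RT_SIGRETURN && insns[1] == A64_SVC_0;
}
```

An unwinder that gets a match at the current PC can skip the unwind tables and read the saved registers from the signal frame on the stack.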

Previously, vdso.so carried a small set of CFI instructions that encoded X29 (FP) and X30 (LR).

However, there was a serious problem: CFI cannot fully describe a signal trampoline frame. AArch64 does not define a register number for PC and provides no direct way to encode the PC of the previous frame. Instead, the CIE sets return_address_register to X30, and the unwinder sets the PC to whatever value it recovers for the saved X30. Indeed, after unw_get_reg(&cursor, UNW_REG_IP, &pc); unw_get_reg(&cursor, UNW_AARCH64_X30, &x30);, we have pc == x30. This approach works fine when the LR values form a chain, since for two adjacent frames the sets {PC, X30} differ by only one element. In a signal frame, however, the interrupted PC and X30 are two independent values, so the CFI can describe the previous PC but not the previous X30.
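The limitation can be shown with a toy model. This is a deliberately simplified sketch, not any unwinder's code: `struct regs` and `cfi_step` are hypothetical, and the model tracks only PC and X30.

```c
#include <stdint.h>

struct regs {
  uint64_t pc;
  uint64_t x30;
};

/* Model one CFI step with return_address_register = X30: the single value
   recovered by the X30 rule becomes both PC and X30 of the previous frame. */
static struct regs cfi_step(uint64_t recovered_x30) {
  struct regs prev;
  prev.pc = recovered_x30;
  prev.x30 = recovered_x30;
  return prev;
}
```

Suppose the interrupted code had PC 0x1000 and X30 0x2000. Pointing the X30 rule at the saved PC gives `cfi_step(0x1000)`, which has the right PC but reports X30 as 0x1000. Pointing it at the saved X30 gives the right X30 but resumes unwinding at the wrong PC.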

## musl x86-64

src/signal/x86_64/restore.s implements a signal trampoline __restore_rt. There is no .eh_frame information.

nongnu libunwind does not know that __restore_rt is a signal trampoline (unw_is_signal_frame always returns 0). On ELF targets, -O1 and above typically imply -fomit-frame-pointer and many functions do not save RBP. Note: some functions may save RBP even with -fomit-frame-pointer.

In the absence of a valid frame chain, combined with the fact that nongnu libunwind does not recognize Linux x86-64's signal trampoline, libunwind cannot unwind through the __restore_rt frame. gdb recognizes the signal trampoline frame and with its FP-based unwinding it can retrieve several frames, but not the ones above raise.
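Since there is no CFI to rely on here, recognizing the trampoline comes down to matching its machine code. The sketch below illustrates such a check and is not gdb's source: the names are hypothetical, and the bytes are the encoding of mov $15, %rax followed by syscall, the body of __restore_rt.

```c
#include <stdbool.h>
#include <string.h>

static const unsigned char x86_64_rt_sigreturn_code[] = {
    0x48, 0xc7, 0xc0, 0x0f, 0x00, 0x00, 0x00, /* mov $15, %rax */
    0x0f, 0x05                                /* syscall */
};

/* Return true if the bytes at pc are the Linux x86-64 rt_sigreturn
   trampoline. The caller must ensure that 9 bytes at pc are readable. */
static bool is_x86_64_rt_sigreturn(const unsigned char *pc) {
  return memcmp(pc, x86_64_rt_sigreturn_code,
                sizeof x86_64_rt_sigreturn_code) == 0;
}
```

After a match, the unwinder can treat the stack pointer as pointing at the saved ucontext_t and restore registers from it directly.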

If musl is built with -fno-omit-frame-pointer, nongnu libunwind will use its FP-based fallback (see src/x86_64/Gstep.c). The output looks like:

unw_step uses the saved RBP to infer RSP/RBP/RIP in the previous frame. If the signal handler saves RBP and calls unw_step, the saved RBP is essentially the RBP value in the signal trampoline frame.
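The FP-based step can be sketched as follows. This is a simplified illustration and not the code in src/x86_64/Gstep.c: `struct fp_frame` and `fp_step` are hypothetical names, and the layout assumed is the usual x86-64 one where [RBP] holds the caller's RBP and [RBP+8] holds the return address.

```c
#include <stdbool.h>
#include <stdint.h>

struct fp_frame {
  uintptr_t ip; /* RIP */
  uintptr_t sp; /* RSP */
  uintptr_t bp; /* RBP */
};

/* Step to the caller's frame using only the frame pointer chain.
   Returns false when the chain ends or looks invalid. */
static bool fp_step(struct fp_frame *f) {
  if (f->bp == 0 || f->bp % sizeof(uintptr_t) != 0)
    return false;
  const uintptr_t *slot = (const uintptr_t *)f->bp;
  uintptr_t caller_bp = slot[0];   /* saved RBP */
  uintptr_t return_addr = slot[1]; /* pushed by the call instruction */
  if (return_addr == 0)
    return false;
  f->sp = f->bp + 2 * sizeof(uintptr_t); /* RSP just after the return */
  f->bp = caller_bp;
  f->ip = return_addr;
  return true;
}
```

The step is only as good as the chain it follows: if a function in between does not maintain RBP, `slot[0]` and `slot[1]` are unrelated stack contents.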

Actually, not every source file needs to be built with -fno-omit-frame-pointer. We just need to build the source files that transfer control to the user program, and their callers. For this example, building src/signal/raise.c with -fno-omit-frame-pointer allows us to unwind to main. Additionally rebuilding src/env/__libc_start_main.c allows us to unwind to _start.

musl's Makefile specifies -fno-asynchronous-unwind-tables (see "option to enable eh_frame" for a 2011 discussion). If -g is added to CFLAGS, libc.so will have .debug_frame. gdb can then retrieve the caller of raise:

nongnu libunwind can be built with --enable-debug-frame to support .debug_frame. Unfortunately, since it does not recognize the signal trampoline, it cannot retrieve the main frame for this example.

## Unwinders' compatibility with libc implementations

The values represent how the unwinder unwinds through the signal trampoline frame.

| | Linux glibc | Linux musl |
|---|---|---|
| nongnu libunwind AArch64 | recognizes signal trampoline in VDSO | not tested |
| nongnu libunwind x86-64 | .eh_frame in libc.so.6 | unwindable if FP is enabled |
| gdb AArch64 | recognizes signal trampoline in VDSO | not tested |
| gdb x86-64 | recognizes signal trampoline | recognizes signal trampoline |

Links to code related to signal trampoline frames:

• gcc libgcc/config/aarch64/linux-unwind.h:aarch64_fallback_frame_state
• gdb gdb/aarch64-linux-tdep.c:aarch64_linux_rt_sigframe, gdb/amd64-linux-tdep.c:amd64_linux_sigtramp_start
• llvm-project libunwind https://reviews.llvm.org/D90898
• Linux kernel arch/x86/kernel/signal.c:setup_rt_frame

## Core dump

The kernel core dumper coredump.c is simple. The glibc __restore_rt page or the VDSO is not prioritized in the presence of a core file limit. If the page is missing in the core file, gdb prog core -ex bt -batch will not be able to unwind past the signal trampoline. A userspace core dumper may be handy.