C++ exit-time destructors

In ISO C++ standards, [basic.start.term] specifies that:

Constructed objects ([dcl.init]) with static storage duration are destroyed and functions registered with std::atexit are called as part of a call to std::exit ([support.start.term]). The call to std::exit is sequenced before the destructions and the registered functions. [Note 1: Returning from main invokes std::exit ([basic.start.main]). — end note]

For example, consider the following code:

1
struct A { ~A(); } a;

The destructor for object a will be registered for execution at program termination.

Read More

A compact relocation format for ELF

ELF's design emphasizes natural size and alignment guidelines for its control structures. This principle, outlined in Proceedings of the Summer 1990 USENIX Conference, ELF: An Object File to Mitigate Mischievous Misoneism, promotes ease of random access for structures like program headers, section headers, and symbols.

All data structures that the object file format defines follow the "natural" size and alignment guidelines for the relevant class. If necessary, data structures contain explicit padding to ensure 4-byte alignment for 4-byte objects, to force structure sizes to a multiple of four, etc. Data also have suitable alignment from the beginning of the file. Thus, for example, a structure containing an Elf32_Addr member will be aligned on a 4-byte boundary within the file. Other classes would have appropriately scaled definitions. To illustrate, the 64-bit class would define Elf64 Addr as an 8-byte object, aligned on an 8-byte boundary. Following the strictest alignment for each object allows the format to work on any machine in a class. That is, all ELF structures on all 32-bit machines have congruent templates. For portability, ELF uses neither bit-fields nor floating-point values, because their representations vary, even among pro- cessors with the same byte order. Of course the programs in an ELF file may use these types, but the format itself does not.

Read More

MMU-less systems and FDPIC

This article describes ABI and toolchain considerations about systems without a Memory Management Unit (MMU). We will focus on FDPIC and the in-development FDPIC ABI for RISC-V, with updates as I delve deeper into the topic.

Embedded systems often lack MMUs, relying on real-time operating systems (RTOS) like VxWorks or special Linux configurations (CONFIG_MMU=n). In these systems, the offset between the text and data segments is often not knwon at compile time. Therefore, a dedicated register is typically set to somewhere in the data segment and writable data is accessed relative to this register.

Why is the offset not knwon at compile time? There are primarily two reasons.

First, eXecute in Place (XIP), where code resides in ROM while the data segment is copied to RAM. Therefore, the offset between the text and data segments is often not knwon at compile time.

Second, all processes share the same address space without MMU. However, it is still desired for these processes to share text segments. Therefore needs a mechanism for code to find its corresponding data.

Read More

Toolchain notes on z/Architecture

This article describes some notes about z/Architecture with a focus on the ELF ABI and ELF linkers. An lld/ELF patch sparked my motivation to study the architecture and write this post.

z/Architecture is a big-endian mainframe computer architecture supporting 24-bit, 31-bit, and 64-bit addressing modes. It is the latest generation in a lineage stretching back to the 1964 with IBM System/360 (32-bit general-purpose registers and 24-bit addressing). This lineage includes System/370 (1970), System/370 Extended Architecture (1983), Enterprise Systems Architecture/370 (1988), and Enterprise Systems Architecture/390 (1990). For a deeper dive into the design choices behind z/Architecture's extension from ESA/390, you can refer to "Development and attributes of z/Architecture." IBM System/360 at Computer History Museum

Read More

Raw symbol names in inline assembly

For operands in asm statements, GCC has supported the constraints "i" and "s" for a long time (since at least 1992).

1
2
3
4
5
6
7
8
9
10
11
// gcc/common.md
(define_constraint "i"
"Matches a general integer constant."
(and (match_test "CONSTANT_P (op)")
(match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))

(define_constraint "s"
"Matches a symbolic integer constant."
(and (match_test "CONSTANT_P (op)")
(match_test "!CONST_SCALAR_INT_P (op)")
(match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))

Read More

Modified condition/decision coverage (MC/DC) and compiler implementations

Key metrics for code coverage include:

  • function coverage: determines whether each function been executed.
  • line coverage (aka statement coverage): determines whether every line has been executed.
  • branch coverage: ensures that both the true and false branches of each conditional statement or the condition of each loop statement been evaluated.

Condition coverage offers a more fine-grained evaluation of branch coverage. It requires that each individual boolean subexpression (condition) within a compound expression be evaluated to both true and false. For example, in the boolean expression if (a>0 && f(b) && c==0), each of a>0, f(b), and c==0, condition coverage would require tests that:

Read More

RISC-V TLSDESC works!

Back in 2019, I studied a bit about RISC-V and filed Support Thread-Local Storage Descriptors (TLSDESC). Last year, Tatsuyuki Ishi added a specification for TLSDESC.

LLVM

On the LLVM side, the RISC-V TLSDESC work has been completed.

The the most important patch is [RISCV] Support Global Dynamic TLSDESC in the RISC-V backend by Paul Kirth. The linker patch by me is also significant. Furthermore, Clang requires a -mtls-dialect= patch.

These patches are expected to be included in the upcoming LLVM 18.1 release. To obtain TLSDESC code sequences, compile your program with clang --target=riscv64-linux -fpic -mtls-dialect=desc.

GCC

Latest patch: https://inbox.sourceware.org/gcc-patches/20231205070152.38360-1-ishitatsuyuki@gmail.com/

binutils

Latest patch: https://inbox.sourceware.org/binutils/20231128085109.28422-1-ishitatsuyuki@gmail.com/

glibc

Latest patch: https://inbox.sourceware.org/libc-alpha/20230914084033.222120-1-ishitatsuyuki@gmail.com/

musl

musl added support in February 2024.

Bionic

No patch yet.

Testing

The LLVM patches need testing. Unfortunately, I didn't have a RISC-V image at hand, so I used qemu-user.

Patch musl per Re: Draft riscv64 TLSDESC implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
diff --git c/arch/riscv64/reloc.h w/arch/riscv64/reloc.h
index 1ca13811..7c7c0611 100644
--- c/arch/riscv64/reloc.h
+++ w/arch/riscv64/reloc.h
@@ -17,6 +17,7 @@
#define REL_DTPMOD R_RISCV_TLS_DTPMOD64
#define REL_DTPOFF R_RISCV_TLS_DTPREL64
#define REL_TPOFF R_RISCV_TLS_TPREL64
+#define REL_TLSDESC R_RISCV_TLSDESC

#define CRTJMP(pc,sp) __asm__ __volatile__( \
"mv sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
diff --git c/include/elf.h w/include/elf.h
index 72d17c3a..7f342a23 100644
--- c/include/elf.h
+++ w/include/elf.h
@@ -3254,6 +3254,7 @@ enum
#define R_RISCV_TLS_DTPREL64 9
#define R_RISCV_TLS_TPREL32 10
#define R_RISCV_TLS_TPREL64 11
+#define R_RISCV_TLSDESC 12

#define R_RISCV_BRANCH 16
#define R_RISCV_JAL 17
diff --git c/src/ldso/riscv64/tlsdesc.s w/src/ldso/riscv64/tlsdesc.s
new file mode 100644
index 00000000..56d1ce89
--- /dev/null
+++ w/src/ldso/riscv64/tlsdesc.s
@@ -0,0 +1,33 @@
+.text
+.global __tlsdesc_static
+.hidden __tlsdesc_static
+.type __tlsdesc_static,%function
+__tlsdesc_static:
+ ld a0,8(a0)
+ jr t0
+
+.global __tlsdesc_dynamic
+.hidden __tlsdesc_dynamic
+.type __tlsdesc_dynamic,%function
+__tlsdesc_dynamic:
+ add sp,sp,-16
+ sd t1,(sp)
+ sd t2,8(sp)
+
+ ld t2,-8(tp) # t2=dtv
+
+ ld a0,8(a0) # a0=&{modidx,off}
+ ld t1,8(a0) # t1=off
+ ld a0,(a0) # a0=modidx
+ sll a0,a0,3 # a0=8*modidx
+
+ add a0,a0,t2 # a0=dtv+8*modidx
+ ld a0,(a0) # a0=dtv[modidx]
+ add a0,a0,t1 # a0=dtv[modidx]+off
+ sub a0,a0,tp # a0=dtv[modidx]+off-tp
+
+ ld t1,(sp)
+ ld t2,8(sp)
+ add sp,sp,16
+ jr t0
+

1
(mkdir -p out/rv64 && cd out/rv64 && ../../configure --target=riscv64-linux-gnu && make -j 50)

Adjust ~/musl/out/rv64/lib/musl-gcc.specs and update ~/musl/out/rv64/obj/musl-gcc

1
2
3
4
cat > ~/musl/out/rv64/obj/musl-gcc <<eof
#!/bin/sh
exec "${REALGCC:-riscv64-linux-gnu-gcc}" "$@" -specs ~/musl/out/rv64/lib/musl-gcc.specs
eof

I have also modified musl-clang (clang wrapper). Adjust ~/musl/out/rv64/obj/musl-clang to use --target=riscv64-linux-musl. Adjust ~/musl/out/rv64/obj/ld.musl-clang to define cc="/tmp/Rel/bin/clang --target=riscv64-linux-gnu" and invoke exec /tmp/Rel/bin/ld.lld "$@" -lc.

Prepare a runtime test mentioned at the end of https://maskray.me/blog/2021-02-14-all-about-thread-local-storage

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
cat > ./a.c <<eof
#include <assert.h>
int foo();
int bar();
int main() {
assert(foo() == 2);
assert(foo() == 4);
assert(bar() == 2);
assert(bar() == 4);
}
eof

cat > ./b.c <<eof
#include <stdio.h>
__thread int tls0;
extern __thread int tls1;
int foo() { return ++tls0 + ++tls1; }
static __thread int tls2, tls3;
int bar() { return ++tls2 + ++tls3; }
eof

echo '__thread int tls1;' > ./c.c

sed 's/ /\t/' > ./Makefile <<'eof'
.MAKE.MODE = meta curDirOk=true

CC := ~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w
LDFLAGS := -Wl,-rpath=.

all: a0 a1 a2

run: all
./a0 && ./a1 && ./a2

c.so: c.o; ${LINK.c} -shared $> -o $@
bc.so: b.o c.o; ${LINK.c} -shared $> -o $@
b.so: b.o c.so; ${LINK.c} -shared $> -o $@

a0: a.o b.o c.o; ${LINK.c} $> -o $@
a1: a.o b.so; ${LINK.c} $> -o $@
a2: a.o bc.so; ${LINK.c} $> -o $@
eof

bmake run => succeeded!

1
2
3
4
5
6
7
8
9
10
11
% bmake run
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -c a.c
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -c b.c
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -c c.c
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. a.o b.o c.o -o a0
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. -shared c.o -o c.so
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. -shared b.o c.so -o b.so
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. a.o b.so -o a1
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. -shared b.o c.o -o bc.so
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. a.o bc.so -o a2
./a0 && ./a1 && ./a2

Test GCC

During my development of the linker patch, the Clang Driver patch was actually not ready yet. I used a more hacky approach by compiling using GCC, replacing some assembly fragments with TLSDESC code sequences, and assemblying using Clang.

Compile b.c to bb.s. Replace general-dynamic code sequences (e.g. la.tls.gd a0,tls0; call __tls_get_addr@plt) with TLSDESC, e.g.

1
2
3
4
5
6
.Ltlsdesc_hi0:
auipc a0, %tlsdesc_hi(tls0)
ld a1, %tlsdesc_load_lo(.Ltlsdesc_hi0)(a0)
addi a0, a0, %tlsdesc_add_lo(.Ltlsdesc_hi0)
jalr t0, 0(a1), %tlsdesc_call(.Ltlsdesc_hi0)
add a0, a0, tp

Create an alias bin/ld.lld to be used with -Bbin -fuse-ld=lld. I made some adjustment to the Makefile so that an invocation looks like:

1
2
3
4
5
6
7
8
9
10
11
% bmake run
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -c a.c
/tmp/Rel/bin/clang --target=riscv64-linux -c bb.s -o b.o
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -c c.c
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. a.o b.o c.o -o a0
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. -shared c.o -o c.so
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. -shared b.o c.so -o b.so
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. a.o b.so -o a1
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. -shared b.o c.o -o bc.so
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. a.o bc.so -o a2
./a0 && ./a1 && ./a2

Exploring object file formats

My journey with the LLVM project began with a deep dive into the world of lld and binary utilities. Countless hours were spent unraveling the intricacies of object file formats and shaping LLVM's relevant components. Though my interests have since broadened, object file formats remain a personal fascination, often drawing me into discussions around potential changes within LLVM.

This article compares several prominent object file formats, drawing upon my experience and insights.

At the heart of each format lies the representation of essential components like symbols, sections, and relocations. For each control structure, We'll begin with ELF, a widely used format, before venturing into the landscapes of other notable formats.

Read More