MMU-less systems and FDPIC

This article describes ABI and toolchain considerations about systems without a Memory Management Unit (MMU). We will focus on FDPIC and the in-development FDPIC ABI for RISC-V, with updates as I delve deeper into the topic.

Embedded systems often lack MMUs, relying on real-time operating systems (RTOS) like VxWorks or special Linux configurations (CONFIG_MMU=n). In these systems, the offset between the text and data segments is often not knwon at compile time. Therefore, a dedicated register is typically set to somewhere in the data segment and writable data is accessed relative to this register.

Why is the offset not knwon at compile time? There are primarily two reasons.

First, eXecute in Place (XIP), where code resides in ROM while the data segment is copied to RAM. Therefore, the offset between the text and data segments is often not knwon at compile time.

Second, all processes share the same address space without MMU. However, it is still desired for these processes to share text segments. Therefore needs a mechanism for code to find its corresponding data.

Read More

Toolchain notes on z/Architecture

This article describes some notes about z/Architecture with a focus on the ELF ABI and ELF linkers. An lld/ELF patch sparked my motivation to study the architecture and write this post.

z/Architecture is a big-endian mainframe computer architecture supporting 24-bit, 31-bit, and 64-bit addressing modes. It is the latest generation in a lineage stretching back to the 1964 with IBM System/360 (32-bit general-purpose registers and 24-bit addressing). This lineage includes System/370 (1970), System/370 Extended Architecture (1983), Enterprise Systems Architecture/370 (1988), and Enterprise Systems Architecture/390 (1990). For a deeper dive into the design choices behind z/Architecture's extension from ESA/390, you can refer to "Development and attributes of z/Architecture." IBM System/360 at Computer History Museum

Read More

Raw symbol names in inline assembly

For operands in asm statements, GCC has supported the constraints "i" and "s" for a long time (since at least 1992).

1
2
3
4
5
6
7
8
9
10
11
// gcc/common.md
(define_constraint "i"
"Matches a general integer constant."
(and (match_test "CONSTANT_P (op)")
(match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))

(define_constraint "s"
"Matches a symbolic integer constant."
(and (match_test "CONSTANT_P (op)")
(match_test "!CONST_SCALAR_INT_P (op)")
(match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))

Read More

Modified condition/decision coverage (MC/DC) and compiler implementations

Key metrics for code coverage include:

  • function coverage: determines whether each function been executed.
  • line coverage (aka statement coverage): determines whether every line has been executed.
  • branch coverage: ensures that both the true and false branches of each conditional statement or the condition of each loop statement been evaluated.

Condition coverage offers a more fine-grained evaluation of branch coverage. It requires that each individual boolean subexpression (condition) within a compound expression be evaluated to both true and false. For example, in the boolean expression if (a>0 && f(b) && c==0), each of a>0, f(b), and c==0, condition coverage would require tests that:

Read More

RISC-V TLSDESC works!

Back in 2019, I studied a bit about RISC-V and filed Support Thread-Local Storage Descriptors (TLSDESC). Last year, Tatsuyuki Ishi added a specification for TLSDESC.

LLVM

On the LLVM side, the RISC-V TLSDESC work has been completed.

The the most important patch is [RISCV] Support Global Dynamic TLSDESC in the RISC-V backend by Paul Kirth. The linker patch by me is also significant. Furthermore, Clang requires a -mtls-dialect= patch.

These patches are expected to be included in the upcoming LLVM 18.1 release. To obtain TLSDESC code sequences, compile your program with clang --target=riscv64-linux -fpic -mtls-dialect=desc.

GCC

Latest patch: https://inbox.sourceware.org/gcc-patches/20231205070152.38360-1-ishitatsuyuki@gmail.com/

binutils

Latest patch: https://inbox.sourceware.org/binutils/20231128085109.28422-1-ishitatsuyuki@gmail.com/

glibc

Latest patch: https://inbox.sourceware.org/libc-alpha/20230914084033.222120-1-ishitatsuyuki@gmail.com/

musl

musl added support in February 2024.

Bionic

No patch yet.

Testing

The LLVM patches need testing. Unfortunately, I didn't have a RISC-V image at hand, so I used qemu-user.

Patch musl per Re: Draft riscv64 TLSDESC implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
diff --git c/arch/riscv64/reloc.h w/arch/riscv64/reloc.h
index 1ca13811..7c7c0611 100644
--- c/arch/riscv64/reloc.h
+++ w/arch/riscv64/reloc.h
@@ -17,6 +17,7 @@
#define REL_DTPMOD R_RISCV_TLS_DTPMOD64
#define REL_DTPOFF R_RISCV_TLS_DTPREL64
#define REL_TPOFF R_RISCV_TLS_TPREL64
+#define REL_TLSDESC R_RISCV_TLSDESC

#define CRTJMP(pc,sp) __asm__ __volatile__( \
"mv sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
diff --git c/include/elf.h w/include/elf.h
index 72d17c3a..7f342a23 100644
--- c/include/elf.h
+++ w/include/elf.h
@@ -3254,6 +3254,7 @@ enum
#define R_RISCV_TLS_DTPREL64 9
#define R_RISCV_TLS_TPREL32 10
#define R_RISCV_TLS_TPREL64 11
+#define R_RISCV_TLSDESC 12

#define R_RISCV_BRANCH 16
#define R_RISCV_JAL 17
diff --git c/src/ldso/riscv64/tlsdesc.s w/src/ldso/riscv64/tlsdesc.s
new file mode 100644
index 00000000..56d1ce89
--- /dev/null
+++ w/src/ldso/riscv64/tlsdesc.s
@@ -0,0 +1,33 @@
+.text
+.global __tlsdesc_static
+.hidden __tlsdesc_static
+.type __tlsdesc_static,%function
+__tlsdesc_static:
+ ld a0,8(a0)
+ jr t0
+
+.global __tlsdesc_dynamic
+.hidden __tlsdesc_dynamic
+.type __tlsdesc_dynamic,%function
+__tlsdesc_dynamic:
+ add sp,sp,-16
+ sd t1,(sp)
+ sd t2,8(sp)
+
+ ld t2,-8(tp) # t2=dtv
+
+ ld a0,8(a0) # a0=&{modidx,off}
+ ld t1,8(a0) # t1=off
+ ld a0,(a0) # a0=modidx
+ sll a0,a0,3 # a0=8*modidx
+
+ add a0,a0,t2 # a0=dtv+8*modidx
+ ld a0,(a0) # a0=dtv[modidx]
+ add a0,a0,t1 # a0=dtv[modidx]+off
+ sub a0,a0,tp # a0=dtv[modidx]+off-tp
+
+ ld t1,(sp)
+ ld t2,8(sp)
+ add sp,sp,16
+ jr t0
+

1
(mkdir -p out/rv64 && cd out/rv64 && ../../configure --target=riscv64-linux-gnu && make -j 50)

Adjust ~/musl/out/rv64/lib/musl-gcc.specs and update ~/musl/out/rv64/obj/musl-gcc

1
2
3
4
cat > ~/musl/out/rv64/obj/musl-gcc <<eof
#!/bin/sh
exec "${REALGCC:-riscv64-linux-gnu-gcc}" "$@" -specs ~/musl/out/rv64/lib/musl-gcc.specs
eof

I have also modified musl-clang (clang wrapper). Adjust ~/musl/out/rv64/obj/musl-clang to use --target=riscv64-linux-musl. Adjust ~/musl/out/rv64/obj/ld.musl-clang to define cc="/tmp/Rel/bin/clang --target=riscv64-linux-gnu" and invoke exec /tmp/Rel/bin/ld.lld "$@" -lc.

Prepare a runtime test mentioned at the end of https://maskray.me/blog/2021-02-14-all-about-thread-local-storage

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
cat > ./a.c <<eof
#include <assert.h>
int foo();
int bar();
int main() {
assert(foo() == 2);
assert(foo() == 4);
assert(bar() == 2);
assert(bar() == 4);
}
eof

cat > ./b.c <<eof
#include <stdio.h>
__thread int tls0;
extern __thread int tls1;
int foo() { return ++tls0 + ++tls1; }
static __thread int tls2, tls3;
int bar() { return ++tls2 + ++tls3; }
eof

echo '__thread int tls1;' > ./c.c

sed 's/ /\t/' > ./Makefile <<'eof'
.MAKE.MODE = meta curDirOk=true

CC := ~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w
LDFLAGS := -Wl,-rpath=.

all: a0 a1 a2

run: all
./a0 && ./a1 && ./a2

c.so: c.o; ${LINK.c} -shared $> -o $@
bc.so: b.o c.o; ${LINK.c} -shared $> -o $@
b.so: b.o c.so; ${LINK.c} -shared $> -o $@

a0: a.o b.o c.o; ${LINK.c} $> -o $@
a1: a.o b.so; ${LINK.c} $> -o $@
a2: a.o bc.so; ${LINK.c} $> -o $@
eof

bmake run => succeeded!

1
2
3
4
5
6
7
8
9
10
11
% bmake run
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -c a.c
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -c b.c
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -c c.c
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. a.o b.o c.o -o a0
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. -shared c.o -o c.so
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. -shared b.o c.so -o b.so
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. a.o b.so -o a1
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. -shared b.o c.o -o bc.so
~/musl/out/rv64/obj/musl-clang -O1 -g -fpic -mtls-dialect=desc -w -g -Wl,-rpath=. a.o bc.so -o a2
./a0 && ./a1 && ./a2

Test GCC

During my development of the linker patch, the Clang Driver patch was actually not ready yet. I used a more hacky approach by compiling using GCC, replacing some assembly fragments with TLSDESC code sequences, and assemblying using Clang.

Compile b.c to bb.s. Replace general-dynamic code sequences (e.g. la.tls.gd a0,tls0; call __tls_get_addr@plt) with TLSDESC, e.g.

1
2
3
4
5
6
.Ltlsdesc_hi0:
auipc a0, %tlsdesc_hi(tls0)
ld a1, %tlsdesc_load_lo(.Ltlsdesc_hi0)(a0)
addi a0, a0, %tlsdesc_add_lo(.Ltlsdesc_hi0)
jalr t0, 0(a1), %tlsdesc_call(.Ltlsdesc_hi0)
add a0, a0, tp

Create an alias bin/ld.lld to be used with -Bbin -fuse-ld=lld. I made some adjustment to the Makefile so that an invocation looks like:

1
2
3
4
5
6
7
8
9
10
11
% bmake run
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -c a.c
/tmp/Rel/bin/clang --target=riscv64-linux -c bb.s -o b.o
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -c c.c
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. a.o b.o c.o -o a0
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. -shared c.o -o c.so
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. -shared b.o c.so -o b.so
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. a.o b.so -o a1
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. -shared b.o c.o -o bc.so
~/musl/out/rv64/obj/musl-gcc -O1 -g -fpic -Bbin -fuse-ld=lld -g -Wl,-rpath=. a.o bc.so -o a2
./a0 && ./a1 && ./a2

Exploring object file formats

My journey with the LLVM project began with a deep dive into the world of lld and binary utilities. Countless hours were spent unraveling the intricacies of object file formats and shaping LLVM's relevant components. Though my interests have since broadened, object file formats remain a personal fascination, often drawing me into discussions around potential changes within LLVM.

This article compares several prominent object file formats, drawing upon my experience and insights.

At the heart of each format lies the representation of essential components like symbols, sections, and relocations. For each control structure, We'll begin with ELF, a widely used format, before venturing into the landscapes of other notable formats.

Read More

2023年总结

一如既往,主要在工具链领域耕耘。

llvm-project

I made 700+ commits this year. Many are clean-up commits or fixup for others' work. I hope that I can do more useful work next year.

  • Enabled --features=layering_check for Bazel's llvm and clang projects
  • Implemented LLVM_ENABLE_REVERSE_ITERATION for StringMap
  • Added llvm::xxh3_64bits and adopted it in llvm/clang/lld
  • Made AArch64 changes to msan and dfsan
  • Made various improvements to the clang driver, including -### exit code, auxiliary files, errors for unsupported target-specific options, -fsanitize=kcfi, and XRay
  • Made various sanitizer changes
  • Made -fsanitize=function work for C and non-x86 architectures
  • Fixed several major problems of assembler implementation of RISC-V linker relaxation
  • clangSema: %lb recognization for printf/scanf, checking the failure memory order for atomic_compare_exchange-family built-in functions, -Wc++11-narrowing-const-reference
  • llvm-objdump: @plt symbols for x86 .plt.got, mapping symbol improvements, --disassemble-symbols changes, etc
  • Supported R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 for .uleb128 directives
  • gcov: fix instrumentation crashes when using inline variables, #line, and #include, made llvm-cov gcov work with a smaller stack size
  • LTO: fixed local ifunc for ThinLTO, reported errors when parsing module-level inline assembly
  • CodeLayout: fixed a correctness bug and improved performance to be suitable for lld/ELF --call-graph-profile-sort=hfsort default

Reviewed many commits. A lot of people don't add a Reviewed By: tag. Anyway, counting commits with the tag can give an underestimate.

1
2
% git shortlog -sn 2679e8bba3e166e3174971d040b9457ec7b7d768...main --grep 'Reviewed .*MaskRay' | awk '{s+=$1}END{print s}'
395

Read More

reviews.llvm.org became a read-only archive

For approximately 10 years, reviews.llvm.org functioned as the code view site for the LLVM project, utilizing a Phabricator instance. This website hosted numerous invaluable code review discussions. However, following LLVM's transition to GitHub pull requests, there arises a necessity for a read-only archive of the existing Phabricator instance. (https://archive.org/ archives a subset of the reviews.llvm.org/Dxxxxx pages.)

The intent is to eliminate a SQL engine. Phabicator operates on a complex database scheme. To minimize time investment, the most feasible approach seems to involve downloading the static HTML pages and employing a lightweight scraping process.

Raphaël Gomès developed phab-archive to serve a read-only archive for Mercurial's Phabricator instance. I have modified the code to suit reviews.llvm.org.

The DNS records of reviews.llvm.org have been pointed to the archive website.

Read More

Exploring the section layout in linker output

This article describes section layout and its interaction with dynamic loaders and huge pages.

Let's begin with a Linux x86-64 example involving global variables exhibiting various properties such as read-only versus writable, zero-initialized versus non-zero, and more.

1
2
3
4
5
6
7
#include <stdio.h>
const int ro = 1;
int w0, w1 = 1;
int *const pw0 = &w0;
int main() {
printf("%d %d %d %p\n", ro, w0, w1, pw0);
}

Read More