LLVM 19.1 will soon be released. This post provides a summary of my contributions in this release cycle to record my learning progress.
lld 19 ELF changes
LLVM 19 will be released. As usual, I maintain lld/ELF and have added some notes to https://github.com/llvm/llvm-project/blob/release/19.x/lld/docs/ReleaseNotes.rst. I've meticulously reviewed nearly all the patches that are not authored by me. I'll delve into some of the key changes.
Mapping symbols: rethinking for efficiency
In object files, certain code patterns embed data among instructions, and some code switches between instruction sets. This can create hurdles for disassemblers, which might misinterpret data as code, resulting in inaccurate output. Furthermore, code written for one instruction set could be incorrectly disassembled as another. To address these issues, some architectures (Arm, C-SKY, NDS32, RISC-V, etc.) define mapping symbols to explicitly denote state transitions. Let's explore this concept using an AArch32 code example:
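To make the idea concrete from the consumer's side, here is a minimal Python sketch of how a disassembler-like tool might use mapping symbols to decide what each address holds. It assumes the AArch32 convention of `$a` (Arm code), `$t` (Thumb code), and `$d` (data), optionally followed by a `.suffix`. The function names and the sample symbol table are hypothetical and are not taken from any real tool.

```python
from bisect import bisect_right

# AArch32 mapping symbol names and the state each one selects.
STATE_BY_PREFIX = {"$a": "arm", "$t": "thumb", "$d": "data"}


def state_of(name):
    """Return the state a mapping symbol selects, or None if it is not one."""
    # Mapping symbols may carry a suffix, e.g. "$d.1".
    base = name.split(".", 1)[0]
    return STATE_BY_PREFIX.get(base)


def build_state_map(symbols):
    """symbols: iterable of (address, name). Returns sorted (address, state) pairs."""
    marks = [(addr, state_of(name)) for addr, name in symbols]
    return sorted((a, s) for a, s in marks if s is not None)


def lookup(state_map, address):
    """State in effect at `address`: the last mapping symbol at or below it."""
    addrs = [a for a, _ in state_map]
    i = bisect_right(addrs, address)
    return state_map[i - 1][1] if i else None


# Hypothetical section: Arm code, a literal pool, then Thumb code.
symbols = [(0x0, "$a"), (0x8, "$d.1"), (0xC, "$t"), (0x0, "main")]
state_map = build_state_map(symbols)
print(lookup(state_map, 0x4))   # arm
print(lookup(state_map, 0x8))   # data
print(lookup(state_map, 0x10))  # thumb
```

Ordinary symbols such as `main` are ignored, so only the mapping symbols drive the state.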
Linker compatibility and the "User-Agent" problem
The output of ld.lld -v includes a message "compatible with GNU linkers" to address the detection mechanism used by GNU Libtool. This problem is described by Software compatibility and our own "User-Agent" problem.
The latest m4/libtool.m4 continues to rely on a GNU check.
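The effect of such a check can be sketched in a few lines of Python. This is only an illustration: it assumes the check amounts to a substring match on the word "GNU" in the linker's version output, and both the function name and the sample version strings are made up for the example rather than captured from real runs.

```python
def looks_like_gnu_ld(version_output):
    """Mimic a substring-based detection: accept the linker if it mentions GNU."""
    return "GNU" in version_output


# Illustrative version strings (assumed, not captured from real runs).
lld_plain = "LLD 19.1.0"
lld_compat = "LLD 19.1.0 (compatible with GNU linkers)"

print(looks_like_gnu_ld(lld_plain))   # False
print(looks_like_gnu_ld(lld_compat))  # True
```

Much like a browser User-Agent string, the extra wording exists only so that a string match in someone else's script succeeds.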
Integrated assembler improvements in LLVM 19
Within the LLVM project, MC is a library responsible for handling assembly, disassembly, and object file formats. Intro to the LLVM MC Project, which was written back in 2010, remains a good source to understand the high-level structures.
In the latest release cycle, substantial effort has been dedicated to refining MC's internal representation for improved performance and readability. These changes have decreased compile time significantly. This blog post will delve into the details, providing insights into the specific changes.
Understanding orphan sections
GNU ld's output section layout is determined by a linker script, which can be either internal (default) or external (specified with -T or -dT). Within the linker script, SECTIONS commands define how input sections are mapped into output sections.
Input sections not explicitly placed by SECTIONS commands are termed "orphan sections".
Orphan sections are sections present in the input files which are not explicitly placed into the output file by the linker script. The linker will still copy these sections into the output file by either finding, or creating a suitable output section in which to place the orphaned input section.
GNU ld's default behavior is to create output sections to hold these orphan sections and insert these output sections into appropriate places.
Orphan section placement is crucial because GNU ld's built-in linker scripts, while understanding common sections like .text/.rodata/.data, are unaware of custom sections. These custom sections should still be included in the final output file.
- Grouping: Orphan input sections are grouped into orphan output sections that share the same name.
- Placement: These grouped orphan output sections are then inserted into the output sections defined in the linker script. They are placed near similar sections to minimize the number of PT_LOAD segments needed.
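The two steps above can be sketched as follows. This is a deliberately simplified Python model, not GNU ld's actual algorithm: it assumes every section carries a coarse "kind" label (a stand-in for section flags), and it treats "similar" as "has the same kind". All names here are hypothetical.

```python
def place_orphans(script_outputs, input_sections):
    """
    script_outputs: list of (output_name, kind) in linker-script order, where
                    kind is a coarse class such as "text", "rodata", or "data".
    input_sections: list of (name, kind) in input order.
    Returns the final ordered list of output section names.
    """
    placed = {name for name, _ in script_outputs}

    # Grouping: orphans with the same name share one orphan output section.
    orphans = {}
    for name, kind in input_sections:
        if name not in placed:
            orphans.setdefault(name, kind)

    layout = list(script_outputs)
    for name, kind in orphans.items():
        # Placement: insert after the last output section of the same kind,
        # or append at the end when nothing similar exists.
        similar = [i for i, (_, k) in enumerate(layout) if k == kind]
        idx = similar[-1] if similar else len(layout) - 1
        layout.insert(idx + 1, (name, kind))
    return [name for name, _ in layout]


script = [(".text", "text"), (".rodata", "rodata"), (".data", "data")]
inputs = [
    (".text", "text"),
    ("my_code", "text"),
    ("my_table", "rodata"),
    ("my_code", "text"),   # second input section with the same orphan name
    (".data", "data"),
]
print(place_orphans(script, inputs))
# ['.text', 'my_code', '.rodata', 'my_table', '.data']
```

Keeping `my_code` next to `.text` and `my_table` next to `.rodata` is what lets sections with like permissions stay contiguous, so they can share a PT_LOAD segment.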
Evolution of the ELF object file format
The ELF object file format is adopted by many UNIX-like operating systems. While I've previously delved into the control structures of ELF and its predecessors, tracing the historical evolution of ELF and its relationship with the System V ABI can be interesting in itself.
The format consists of the generic specification, processor-specific specifications, and OS-specific specifications. Three key documents often surface when searching for the generic specification:
- Tool Interface Standard (TIS) Portable Formats Specification, version 1.2 on https://refspecs.linuxfoundation.org/
- System V Application Binary Interface - DRAFT - 10 June 2013 on www.sco.com
- Oracle Solaris Linkers and Libraries Guide
The TIS specification breaks ELF into the generic specification, a processor-specific specification (x86), and an OS-specific specification (System V Release 4). However, it has not been updated since 1995. The Solaris guide, though well-written, includes Solaris-specific extensions not applicable to Linux and *BSD. This leaves us primarily with the System V ABI hosted on www.sco.com, which dedicates Chapters 4 and 5 to the ELF format.
Let's trace the ELF history to understand its relationship with the System V ABI.
Exploring GNU extensions in the Linux kernel
The Linux kernel is written in C, but it also leverages extensions provided by GCC. In 2022, it moved from GCC/Clang -std=gnu89 to -std=gnu11. This article explores my notes on how these GNU extensions are utilized within the kernel.
Clang's -O0 output: branch displacement and size increase
tl;dr Clang 19 will remove the -mrelax-all default at -O0, significantly decreasing the text section size for x86.
Span-dependent instructions
In assembly languages, some instructions with an immediate operand can be encoded in two (or more) forms with different sizes. On x86-64, a direct JMP/JCC can be encoded either in 2 bytes with an 8-bit relative offset, or with a 32-bit relative offset in 5 bytes (JMP) or 6 bytes (JCC). A short jump is preferred because it takes less space. However, when the target of the jump is too far away (out of range for an 8-bit relative offset), a near jump must be used.
    ja foo    # jump short if above, 77 <rel8>
A 1978 paper by Thomas G. Szymanski ("Assembling Code for Machines with Span-Dependent Instructions") used the term "span-dependent instructions" to refer to such instructions with short and long forms. Assemblers grapple with the challenge of choosing the optimal size for these instructions, often referred to as the "branch displacement problem" since branches are the most common type. A good resource for understanding Szymanski's work is Assembling Span-Dependent Instructions.
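One common way to attack the branch displacement problem is to start optimistically with every jump in its short form and grow only the jumps that do not fit, repeating until nothing changes. The Python sketch below models that idea for conditional jumps only, using the 2-byte and 6-byte JCC sizes. The tiny "program" representation and the function name are invented for illustration and do not reflect any assembler's real data structures.

```python
SHORT, LONG = 2, 6  # byte sizes of jcc rel8 and jcc rel32 on x86-64


def relax(items):
    """
    items: list of ("label", name), ("jcc", target_label), or ("bytes", n).
    Returns the chosen size of each jcc, in program order.
    """
    jumps = [i for i, (kind, _) in enumerate(items) if kind == "jcc"]
    size = {i: SHORT for i in jumps}  # optimistic start: every jump is short

    changed = True
    while changed:
        changed = False
        # Lay out the code with the current size choices.
        offset, start, labels = 0, {}, {}
        for i, (kind, arg) in enumerate(items):
            start[i] = offset
            if kind == "label":
                labels[arg] = offset
            elif kind == "jcc":
                offset += size[i]
            else:
                offset += arg
        # Grow any short jump whose displacement does not fit in rel8.
        # The displacement is relative to the end of the jump instruction.
        for i in jumps:
            if size[i] == SHORT:
                disp = labels[items[i][1]] - (start[i] + SHORT)
                if not -128 <= disp <= 127:
                    size[i] = LONG
                    changed = True
    return [size[i] for i in jumps]


# The first jump just barely fits (displacement 127) until the second jump
# grows, which pushes "end" out of range and forces the first to grow too.
program = [
    ("jcc", "end"),
    ("jcc", "far"),
    ("bytes", 125),
    ("label", "end"),
    ("bytes", 200),
    ("label", "far"),
]
print(relax(program))  # [6, 6]
```

Because sizes only ever grow, the loop is guaranteed to terminate. The example also shows why a single pass is not enough: growing one jump can push another jump's target out of range.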
When QOI meets XZ
QOI, the Quite OK Image format, has been gaining in popularity. Chris Wellons offers a great analysis.
QOI's key advantage is its simplicity. Being a byte-oriented format without entropy encoding, it can be further compressed with generic data compression programs like LZ4, XZ, and zstd. PNG, on the other hand, uses DEFLATE compression internally and is typically resistant to further compression. By applying a stronger compression algorithm on QOI output, you can often achieve a smaller file size compared to PNG.
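As a rough sketch of the second stage, Python's standard `lzma` module can wrap any byte string in an XZ container. The payload below is a synthetic, repetitive byte pattern that merely stands in for QOI output. It is not a valid QOI image, and a real experiment would read the bytes of an actual `.qoi` file instead. The helper names are hypothetical.

```python
import lzma


def xz_compress(data: bytes) -> bytes:
    """Compress a byte string (e.g. the contents of a .qoi file) as XZ."""
    return lzma.compress(
        data, format=lzma.FORMAT_XZ, preset=9 | lzma.PRESET_EXTREME
    )


def xz_decompress(blob: bytes) -> bytes:
    """Recover the original bytes from an XZ container."""
    return lzma.decompress(blob, format=lzma.FORMAT_XZ)


# Synthetic stand-in for QOI output: repetitive, byte-aligned records.
payload = bytes([0xFE, 10, 20, 30, 0xC3]) * 4000
packed = xz_compress(payload)

assert xz_decompress(packed) == payload  # lossless round trip
print(len(payload), len(packed))
```

How much a real image shrinks depends entirely on its content, so the sizes printed here say nothing about typical QOI versus PNG results.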