In ISO C++ standards, [basic.start.term] specifies that:
Constructed objects ([dcl.init]) with static storage duration are
destroyed and functions registered with std::atexit are called as part
of a call to std::exit ([support.start.term]). The call to std::exit is
sequenced before the destructions and the registered functions. [Note
1: Returning from main invokes std::exit ([basic.start.main]). — end
note]
For example, consider the following code:
struct A { ~A(); } a;
The destructor for object a will be registered for execution at
program termination.
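A minimal sketch of the two kinds of termination actions the quoted paragraph mentions: a static-storage object with a non-trivial destructor, and a function registered with std::atexit. The names on_exit_hook and register_hook are hypothetical and exist only for illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <type_traits>

struct A {
  // User-provided destructor, so A is not trivially destructible.
  ~A() { std::puts("~A()"); }
};

A a;  // static storage duration: destroyed as part of std::exit

// Hypothetical hook, called as part of std::exit once registered.
void on_exit_hook() { std::puts("on_exit_hook()"); }

// Hypothetical helper: std::atexit returns 0 when registration succeeds.
bool register_hook() { return std::atexit(on_exit_hook) == 0; }

// Only objects with non-trivial destructors need work at termination.
static_assert(!std::is_trivially_destructible<A>::value,
              "A requires a destructor call at program termination");
```

If main calls register_hook() and then returns, the hook should run before ~A(), because the standard runs these actions in reverse order of registration and construction.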
ELF's design follows natural size and alignment guidelines for its
control structures. This principle, outlined in "ELF: An Object File to
Mitigate Mischievous Misoneism" (Proceedings of the Summer 1990 USENIX
Conference), eases random access to structures such as program headers,
section headers, and symbols:
All data structures that the object file format defines follow the
"natural" size and alignment guidelines for the relevant class. If
necessary, data structures contain explicit padding to ensure 4-byte
alignment for 4-byte objects, to force structure sizes to a multiple of
four, etc. Data also have suitable alignment from the beginning of the
file. Thus, for example, a structure containing an Elf32_Addr member
will be aligned on a 4-byte boundary within the file. Other classes
would have appropriately scaled definitions. To illustrate, the 64-bit
class would define Elf64_Addr as an 8-byte object, aligned on an 8-byte
boundary. Following the strictest alignment for each object allows the
format to work on any machine in a class. That is, all ELF structures on
all 32-bit machines have congruent templates. For portability, ELF uses
neither bit-fields nor floating-point values, because their
representations vary, even among processors with the same byte order.
Of course the programs in an ELF file may use these types, but the
format itself does not.
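To make the "natural size and alignment" rule concrete, here is a sketch that mirrors the ELF symbol table entries with fixed-width integers and checks their layout at compile time. Sym32 and Sym64 are hypothetical names; the field order follows the generic ABI, and the checks assume a mainstream ABI in which fixed-width integers are naturally aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the 32-bit symbol entry.
struct Sym32 {
  std::uint32_t st_name;
  std::uint32_t st_value;
  std::uint32_t st_size;
  std::uint8_t st_info;
  std::uint8_t st_other;
  std::uint16_t st_shndx;
};

// Hypothetical mirror of the 64-bit symbol entry. The fields are reordered
// so that the 8-byte members fall on 8-byte boundaries without padding.
struct Sym64 {
  std::uint32_t st_name;
  std::uint8_t st_info;
  std::uint8_t st_other;
  std::uint16_t st_shndx;
  std::uint64_t st_value;
  std::uint64_t st_size;
};

static_assert(sizeof(Sym32) == 16, "size is a multiple of four");
static_assert(alignof(Sym32) == 4, "4-byte alignment for the 32-bit class");
static_assert(offsetof(Sym32, st_shndx) == 14, "no hidden padding");
static_assert(sizeof(Sym64) == 24, "size is a multiple of eight");
static_assert(offsetof(Sym64, st_value) == 8, "8-byte member on an 8-byte boundary");
```

Because every member already sits at its natural offset, a reader can index into an array of such entries directly, which is the random-access benefit described above.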
This article describes ABI and toolchain considerations for systems
without a Memory Management Unit (MMU). It focuses on FDPIC and the
in-development FDPIC ABI for RISC-V, and I will update it as I delve
deeper into the topic.
Embedded systems often lack MMUs, relying on real-time operating
systems (RTOS) like VxWorks or special Linux configurations
(CONFIG_MMU=n). In these systems, the offset between the
text and data segments is often not known at compile time. Therefore, a
dedicated register is typically set to somewhere in the data segment and
writable data is accessed relative to this register.
Why is the offset not known at compile time? There are primarily two
reasons.
First, eXecute in Place (XIP): code resides in ROM while the data
segment is copied to RAM, so the offset between the text and data
segments is not fixed at compile time.
Second, without an MMU, all processes share the same address space.
However, it is still desirable for these processes to share text
segments. Therefore, a mechanism is needed for code to find its
corresponding data.
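The register-relative scheme can be modeled in a few lines. This is a toy sketch, not any particular ABI: kCounterOffset, address_of_counter, and the base addresses are hypothetical values chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical link-time constant: offset of a variable `counter` from the
// start of the data segment.
constexpr std::uint32_t kCounterOffset = 0x10;

// Models a data access relative to the dedicated register. The code is
// identical for every process; only the base value differs.
constexpr std::uint32_t address_of_counter(std::uint32_t data_base) {
  return data_base + kCounterOffset;
}

// Two processes sharing this text, each with its own data segment in RAM.
static_assert(address_of_counter(0x20000000) == 0x20000010,
              "process 1 finds its own counter");
static_assert(address_of_counter(0x20008000) == 0x20008010,
              "process 2 finds its own counter");
```

The shared code never embeds an absolute data address, so it works unchanged wherever the loader places each data segment.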
This article collects some notes on z/Architecture
with a focus on the ELF ABI and ELF linkers. An lld/ELF patch
sparked my motivation to study the architecture and write this post.
z/Architecture
is a big-endian mainframe computer architecture supporting 24-bit,
31-bit, and 64-bit addressing modes. It is the latest generation in a
lineage stretching back to 1964 with the IBM System/360 (32-bit
general-purpose registers and 24-bit addressing). This lineage includes
System/370 (1970), System/370 Extended Architecture (1983), Enterprise
Systems Architecture/370 (1988), and Enterprise Systems Architecture/390
(1990). For a deeper dive into the design choices behind
z/Architecture's extension from ESA/390, you can refer to
"Development and attributes of z/Architecture."
function coverage: determines whether each function has been
executed.
line coverage (aka statement coverage): determines whether every
line has been executed.
branch coverage: ensures that both the true and false branches of
each conditional statement, or the condition of each loop statement, have
been evaluated.
Condition coverage offers a more fine-grained evaluation than branch
coverage. It requires that each individual boolean subexpression
(condition) within a compound expression be evaluated to both true and
false. For example, in the boolean expression
if (a>0 && f(b) && c==0), condition coverage would require tests that
evaluate each of a>0, f(b), and c==0 to both true and false.
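As a sketch of what such a test set can look like, the following uses a hypothetical predicate f (true for even inputs) and a hypothetical wrapper named decision. Because && short-circuits, a condition is only evaluated when every condition to its left is true, so the inputs are chosen accordingly.

```cpp
// Hypothetical predicate standing in for f(b).
constexpr bool f(int b) { return b % 2 == 0; }

// The decision from the example above.
constexpr bool decision(int a, int b, int c) {
  return a > 0 && f(b) && c == 0;
}

// Four inputs that drive every condition to both true and false.
static_assert(decision(1, 2, 0), "a>0, f(b), and c==0 all evaluate to true");
static_assert(!decision(0, 2, 0), "a>0 is false; the rest is short-circuited");
static_assert(!decision(1, 1, 0), "f(b) is false; c==0 is short-circuited");
static_assert(!decision(1, 2, 1), "c==0 is false");
```

Branch coverage alone would be satisfied by just the first two inputs, which is why condition coverage is the stricter criterion here.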
These patches are expected to be included in the upcoming LLVM 18.1
release. To obtain TLSDESC code sequences, compile your program with
clang --target=riscv64-linux -fpic -mtls-dialect=desc.
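A small input for trying the command above might look like the following sketch. The function name address_of_tls0 is hypothetical. With -fpic, a thread-local variable with default visibility is normally accessed through a dynamic TLS model, so a Clang build that includes the patches should emit a TLSDESC sequence for the access when -mtls-dialect=desc is given.

```cpp
#include <type_traits>

// Thread-local variable with default visibility. Under -fpic the compiler
// cannot assume it lives in the executable's own TLS block.
thread_local int tls0 = 0;

// Taking the address forces the compiler to materialize the TLS access.
int *address_of_tls0() { return &tls0; }

static_assert(std::is_same<decltype(address_of_tls0()), int *>::value,
              "the helper yields a pointer to the thread's copy of tls0");
```

Adding -S to the command and inspecting the assembly is a quick way to check which code sequence was generated.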
I have also modified musl-clang (clang wrapper). Adjust
~/musl/out/rv64/obj/musl-clang to use
--target=riscv64-linux-musl. Adjust
~/musl/out/rv64/obj/ld.musl-clang to define
cc="/tmp/Rel/bin/clang --target=riscv64-linux-gnu" and
invoke exec /tmp/Rel/bin/ld.lld "$@" -lc.
During my development of the linker patch, the Clang Driver patch was
not ready yet. I used a hackier approach: compiling with GCC,
replacing some assembly fragments with TLSDESC code sequences, and
assembling with Clang.
Compile b.c to bb.s. Replace
general-dynamic code sequences (e.g.
la.tls.gd a0,tls0; call __tls_get_addr@plt) with TLSDESC,
e.g.
My journey with the LLVM project began with a deep dive into the
world of lld and binary utilities. Countless hours were spent unraveling
the intricacies of object file formats and shaping LLVM's relevant
components. Though my interests have since broadened, object file
formats remain a personal fascination, often drawing me into discussions
around potential changes within LLVM.
This article compares several prominent object file formats, drawing
upon my experience and insights.
At the heart of each format lies the representation of essential
components like symbols, sections, and relocations. For each control
structure, we'll begin with ELF, a widely used format, before venturing
into the landscapes of other notable formats.