ELF's design emphasizes natural
size and alignment guidelines for its control structures. While
ensured efficient processing in the old days, this can lead to larger
file sizes. I propose "Light ELF" (EV_LIGHT
, version 2) – a
whimsical exploration inspired by Light Elves of Tolkien's legendarium
(who had seen the light of the Two Trees in Valinor).
In a light ELF file, the e_version
member of the ELF
header is set to 2. EV_CURRENT
remains 1 for backward
compatibility.
1 |
When linking a program, traditional ELF (version 1) and light ELF (version 2) files can be mixed together.
Relocations
Light ELF utilizes CREL for relocations. REL and RELA from traditional ELF are unused.
Existing lazy binding schemes rely on random access to relocation
entries within the DT_JMPREL
table. Due to CREL's
sequential nature, keeping lazy binding requires a memory allocation
that holds decoded JUMP_SLOT relocations.
Section header table
Traditional ELF (version 1) section
header tables can be large. Light ELF addresses this through a
compact section header table format signaled by
e_shentsize == 0
in the ELF header.
A compact section header table for ELF contains the detail. Its current version is copied below for your convenience.
nshdr
denotes the number of sections (including
SHN_UNDEF
). The section header table (located at
e_shoff
) begins with nshdr
Elf_Word
values. These values specify the offset of each
section header relative to e_shoff
.
Following these offsets, nshdr
section headers are
encoded. Each header begins with a presence
byte indicating
which subsequent Elf_Shdr
members use explicit values vs.
defaults:
sh_name
, ULEB128 encodedsh_type
, ULEB128 encoded (ifpresence & 1
), defaults toSHT_PROGBITS
sh_flags
, ULEB128 encoded (ifpresence & 2
), defaults to 0sh_addr
, ULEB128 encoded (ifpresence & 4
), defaults to 0sh_offset
, ULEB128 encodedsh_size
, ULEB128 encoded (ifpresence & 8
), defaults to 0sh_link
, ULEB128 encoded (ifpresence & 16
), defaults to 0sh_info
, ULEB128 encoded (ifpresence & 32
), defaults to 0sh_addralign
, ULEB128 encoded as log2 value (ifpresence & 64
), defaults to 1sh_entsize
, ULEB128 encoded (ifpresence & 128
), defaults to 0
In traditional ELF, sh_addralign
can be 0 or a positive
integral power of two, where 0 and 1 mean the section has no alignment
constraints. While the compact encoding cannot encode
sh_addralign
value of 0, there is no loss of
generality.
Example C++ code that decodes a specific section header:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16// readULEB128(const uint8_t *&p);
const uint8_t *sht = base + ehdr->e_shoff;
const uint8_t *p = sht + ((Elf_Word*)sht)[i];
uint8_t presence = *p++;
Elf_Shdr shdr = {};
shdr.sh_name = readULEB128(p);
shdr.sh_type = presence & 1 ? readULEB128(p) : ELF::SHT_PROGBITS;
shdr.sh_flags = presence & 2 ? readULEB128(p) : 0;
shdr.sh_addr = presence & 4 ? readULEB128(p) : 0;
shdr.sh_offset = readULEB128(p);
shdr.sh_size = presence & 8 ? readULEB128(p) : 0;
shdr.sh_link = presence & 16 ? readULEB128(p) : 0;
shdr.sh_info = presence & 32 ? readULEB128(p) : 0;
shdr.sh_addralign = presence & 64 ? 1UL << readULEB128(p) : 1;
shdr.sh_entsize = presence & 128 ? readULEB128(p) : 0;
While the current format allows for O(1) in-place random access of section headers using offsets at the beginning of the table, this access pattern seems uncommon in practice. At least, I haven't encountered (or remembered) any instances within the llvm-project codebase. Therefore, I'm considering removing this functionality.
In a release build of llvm-project
(-O3 -ffunction-sections -fdata-sections -Wa,--crel
, the
traditional section header tables occupy 16.4% of the .o
file size while the compact section header table drastically reduces the
ratio to 4.7%.
Symbol table
Like other sections, symbol
table and string table sections (SHT_SYMTAB
and
SHT_STRTAB
) can be compressed through
SHF_COMPRESSED
. However, compressing the dynamic symbol
table (.dynsym
) and its associated string table
(.dynstr
) is not recommended.
Symbol table sections have a non-zero sh_entsize
, which
remains unchanged after compression.
The string table, which stores symbol names (also section names in LLVM output), is typically much larger than the symbol table itself. To reduce its size, we can utilize a text compression algorithm. While compressing the string table, compressing the symbol table along with it might make sense, but using a compact encoding for the symbol table itself won't provide significant benefits.
Program headers
Program headers, while individually large (each
Elf64_Phdr
is 56 bytes) and no random access is needed,
typically have a limited quantity within an executable or shared object.
Consequently, their overall size contribution is relatively small. Light
ELF maintains the existing format.
Section compression
Compressed sections face a challenge due to header overhead especially for ELFCLASS64.
1 | typedef struct { |
The overhead and alignment
padding limit the effectiveness when used with features like
-ffunction-sections
and -fdata-sections
that
generate many smaller sections. For example, I have found that the large
Elf64_Chdr
makes evaluating compressed .rela.*
sections difficult. Light ELF addresses this challenge by introducing an
header format of smaller footprint:
ch_type
, ULEB128 encodedch_size
, ULEB128 encodedch_addralign
, ULEB128 encoded as log2 value
This approach allows Light ELF to represent the header information in just 3 bytes for smaller sections, compared to the 24 bytes required by the traditional format. The content is no longer guaranteed to be word-aligned, a property that most compression libraries don't require anyway.
Furthermore, compressed
sections with the SHF_ALLOC
flag are allowed. Using
them outside of relocatable files needs caution, though.
Experiments
I have developed a Clang/lld prototype that implements compact section header table and CREL (https://github.com/MaskRay/llvm-project/tree/april-2024).
.o size |
sht size | build |
---|---|---|
136599656 | 18429824 | -O3 |
112088088 | 18431616 | -O3 -Wa,--crel |
97517175 | 3860703 | -O3 -Wa,--crel,--cshdr |
2166435360 | 260303360 | -g |
1755222784 | 260305152 | -g -Wa,--crel |
1557420523 | 62502891 | -g -Wa,--crel,--chsdr |
WebAssembly has a dense object file format. We can make a comparison with WebAssembly.
1 | clang -c --target=wasm64 a.cc |
Light ELF: a thought experiment
By now, you might have realized that post is about a joke. While
bumping e_version
and modifying Elf_Chdr
might
not be feasible, it's interesting to consider the possibilities of
compact section headers and compressed symbol/string tables. Perhaps
this can spark some interesting discussions!