UNDER CONSTRUCTION (COFF, Mach-O)
Symbol processing is a major step in a linker. In most binary formats, the linker maintains a global symbol table and performs symbols resolution for each input file (object file, shared object, archive, LLVM bitcode file). Some command line options can define/undefine symbols as well. The symbol resolution can affect archive processing and many subsequent steps (LTO, relocation processing, as-needed shared objects, etc).
ELF
Symbol tables
An object file can have optional symbol tables.
A relocatable object file almost always has a symbol table, which is
represented by a section .symtab
of type
SHT_SYMTAB
. The symbol table is sometimes called a "static
symbol table".
An executable or shared object almost always has a dynamic symbol
table, which is represented by a section .dynsym
of type
SHT_DYNSYM
. The dynamic symbol table specifies defined and
undefined symbols, which can be seen as its export and import lists.
They are needed by runtime relocation processing and symbol binding. A
position dependent statically linked executable usually has no dynamic
symbol table, because (1) it usually does not need dynamic relocations
and (2) there is only one component and every needed symbol is defined
internally, no need for symbol binding.
An executable or shared object may have an optional symbol table of
type SHT_SYMTAB
. ld produces the symbol table
(.symtab
) by default. The static symbol table is a superset
of the dynamic symbol table and has many entries (local symbols and
other non-exported symbols) not needed by runtime. It has value for
symbolization without debug information but otherwise is not useful.
Therefore an executable or shared object is usually post-processed by
strip --strip-all
which can remove .symtab
along with .strtab
and debug sections.
An archive (also named a static library) is like a tarball. It almost always contains multiple relocatable object files.
Symbols
A symbol table holds an array of entries. Each symbol table entry indicates a symbol. Let's look at the representations of a 32-bit ELF object file and a 64-bit ELF object file:
1 | typedef struct { |
Here is the description from the ELF specification:
st_name
: This member holds an index into the object file's symbol string table, which holds the character representations of the symbol names. If the value is non-zero, it represents a string table index that gives the symbol name. Otherwise, the symbol table entry has no name.st_value
: This member gives the value of the associated symbol. Depending on the context, this may be an absolute value, an address, and so on; details appear below.st_size
: Many symbols have associated sizes. For example, a data object's size is the number of bytes contained in the object. This member holds 0 if the symbol has no size or an unknown size.st_info
: This member specifies the symbol's type and binding attributes. A list of the values and meanings appears below. The following code shows how to manipulate the values for both 32 and 64-bit objects.st_other
: This member currently specifies a symbol's visibility. A list of the values and meanings appears below. The following code shows how to manipulate the values for both 32 and 64-bit objects. Other bits contain 0 and have no defined meaning.st_shndx
: Every symbol table entry is defined in relation to some section. This member holds the relevant section header table index. As thesh_link
and sh_info interpretation table and the related text describe, some section indexes indicate special meanings. If this member containsSHN_XINDEX
, then the actual section header index is too large to fit in this field. The actual value is contained in the associated section of typeSHT_SYMTAB_SHNDX
.
Explanation:
st_name
indicates the name.
st_shndx
and st_value
indicate whether the
symbol is defined or undefined, and the associated section and the
offset if defined. If st_shndx==SHN_UNDEF
, we say the
symbol is undefined. For an undefined symbol foo
, we often
say the object file references foo
. If
st_shndx!=SHN_UNDEF
, we say the symbol is defined.
Some st_shndx
values are special. If
st_shndx==SHN_ABS
, this is an absolute symbol. If
st_shndx==SHN_COMMON
, this is a common symbol (FORTRAN
COMMON
blocks or C tentative definitions). The binding must
be STB_GLOBAL
. A common symbol can also be represented with
the type STT_COMMON
but that is uncommon.
st_info
encodes the type and the binding. Among types,
STT_FILE
, STT_SECTION
and STT_TLS
are special. Most symbols are of type STT_NOTYPE
,
STT_OBJECT
, and STT_FUNC
. Other types are
uncommon. The binding is a very important attribute. All of
STB_LOCAL
, STB_GLOBAL
, and
STB_WEAK
are important. A symbol of binding
STB_LOCAL
is often called a local symbol. A local symbol
must be defined. It is not visible outside the object file, therefore it
doesn't contribute to the global symbol table. STB_WEAK
represents a weak symbol. See Weak symbol for details.
STB_GLOBAL
represents a regular symbol visible outside the
object file. Both weak and global symbols contribute to the global
symbol table.
st_other
encodes the visibility. The other bits are used
by ppc64 ELFv2, AArch64, MIPS, etc. The visibility attribute represents
different symbol resolution strategies for a non-local symbol. The
linker only uses the information for a relocatable object file, not for
a shared object.
A STV_HIDDEN
or STV_INTERNAL
symbol will be
made STB_LOCAL
in the linker output. This provides a
mechanism to ensure a relocatable object file symbol will not be visible
to other components. A STV_PROTECTED
symbol provides a way
to defeat performance loss due to symbol interposition for a relocatable
object file which will be linked into a shared object.
STV_DEFAULT
is the default.
SGI
introduced STV_INTERNAL
for their link-time
interprocedural optimization. They are more restricted than
STV_HIDDEN
in that they are not accessed outside of the
compile unit.
If multiple relocatable object files have a non-local symbol, the
most constraining visibility will be the visibility in the output. The
attributes, ordered from least to most constraining, are:
STV_DEFAULT
, STV_PROTECTED
,
STV_HIDDEN
, and STV_INTERNAL
. For a
non-definition declaration in C/C++, we can make it
STV_PROTECTED
or STV_HIDDEN
to ensure the
symbol must be defined in the component. Actually, if every undefined is
STV_PROTECTED
by default, the model will be similar to
PE/COFF's non-export by default model.
st_size
just wastes space. The only minor use case is to
provide a hint to .symtab
based symbolization.
Symbol resolution
The following pseudocode gives a summary of input file processing in the linker.
1 | for file in input { |
The linker maintains a global symbol table for non-local symbols
(STB_GLOBAL
, STB_WEAK
, or
STB_GNU_UNIQUE
). The table can be seen as a collection
mapping names to states. Each state encodes the symbol kind. In the
following list, I place the ld.lld internal struct name before the
description.
Undefined
: An undefined symbol only referenced by shared objects (different from the below in that this does not need an output symbol table entry)Undefined
: An undefined symbol referenced by at least one relocatable object file/bitcode fileLazyArchive
/LazyObjFile
: An entry in an archive index or a definition in a relocatable object file/bitcode file inside a pair of--start-lib
--end-lib
Shared
: A definition in a shared objectCommonSymbol
: A common symbol in a relocatable object file or LLVM bitcode fileDefined
: A definition in a relocatable object file or LLVM bitcode file
The kinds are listed in increasing precedence. If the symbol in the current input file has a higher precedence than the global symbol table entry, the global symbol table entry will be overwritten.
Note the first Undefined
kind. Such an undefined symbol
only referenced by shared objects will not contribute a symbol table
entry to the output. It is needed for several purposes:
- Archive member extraction
--no-allow-shlib-undefined
- Mark the symbol as exported if it ends up defined
If a Shared
is defined by multiple shared objects, the
first shared object wins.
CommonSymbol
can be regarded a special
Defined
which has a precedence lower than
STB_GLOBAL
but higher than STB_WEAK
.
Among Defined
symbols, a STB_GLOBAL
overrides a STB_WEAK
. If two STB_GLOBAL
occur,
the linker will report a duplicate definition error. There is an
exception for two SHN_ABS
symbols with the same value.
I will illustrate the symbol resolution rules with examples. As an
abbreviation, I use "insert foo
" to mean "insert
foo
to the global symbol table".
Duplicate definitions between relocatable object files
Both a.o
and b.o
define
STB_GLOBAL
foo
. The linker command line is
ld a.o b.o
.
- For
a.o
, insertfoo
as aDefined
. - For
b.o
, notice thatfoo
exists. Both areSTB_GLOBAL
=> duplicate definition error.
STB_GLOBAL overrides STB_WEAK between relocatable object files
a.o
defines STB_GLOBAL
foo
.
c.o
defines STB_WEAK
foo
. The
linker command line is ld a.o c.o
.
- For
a.o
, insertfoo
as aDefined
. - For
c.o
, notice thatfoo
exists. The existingSTB_GLOBAL
definition wins.
In the output, the definition will come from a.o
.
For ld c.o a.o
, the incoming STB_GLOBAL
definition will override the existing STB_WEAK
definition.
Note: the STB_GLOBAL
overriding STB_WEAK
rule is between two relocatable object files.
Common overrides STB_WEAK between relocatable object files
c.o
defines STB_WEAK
foo
.
d.o
defines STB_GLOBAL SHN_COMMON
foo
. The linker command line is
ld c.o d.o
.
- For
c.o
, insertfoo
as aDefined
. - For
d.o
, the incomingCommonSymbol
wins.
The linker command line is ld d.o c.o
.
- For
d.o
, insertfoo
as aCommonSymbol
. - For
c.o
, notice thatfoo
exists. The incomingSTB_WEAK
definition is ignored.
Relocatable object file overriding shared object
a.so
defines STB_GLOBAL
foo
.
c.o
defines STB_WEAK
foo
. The
linker command line is ld a.so c.o
.
- For
a.so
, insertfoo
as aShared
. - For
c.o
, notice thatfoo
exists. The relocatable object file definition wins.
For ld c.o a.so
, the definition in a.so
will be ignored.
Note: the binding in a shared object is ignored for symbol
resolution. The STB_GLOBAL
overriding STB_WEAK
rule does not apply, because a shared object is involved.
First shared object wins
a.so
defines STB_GLOBAL
foo
.
c.so
defines STB_WEAK
foo
. The
linker command line is ld c.so a.so
.
- For
c.so
, insertfoo
as aShared
. - For
a.so
, notice thatfoo
exists. The first shared object wins.
Note: the binding in a shared object is ignored for symbol
resolution. The STB_GLOBAL
overriding STB_WEAK
rule does not apply, because two shared objects are involved.
An undefined symbol in a shared object does not change the binding
w.o
references STB_WEAK
foo
.
x.so
references STB_GLOBAL
foo
.
The linker command line is ld w.o x.so
.
- For
w.o
, insertfoo
as anSTB_WEAK
Undefined
. - For
x.so
, notice thatfoo
exists. Ignore the incoming symbol.
The output binding will be STB_WEAK
. For an executable
link, -z defs
is the default. The linker will report an
error.
Archive processing
Archive processing is the most difficult part of the symbol resolution. The complexity comes from the two states an archive member has.
- not extracted (lld calls this state "lazy")
- extracted (non-lazy), like a regular relocatable object file
Initially every archive member is in a lazy state. A defined non-local symbol is called a lazy symbol. The undefined symbols do not participate in symbol resolution. After symbol resolution, if an archive member stays in the lazy state, its symbols will be ignored and not affect the output symbol table. The link behavior is as if the archive member was not present.
When the linker scans an archive, if a lazy symbol can be used to
resolve an STB_GLOBAL
undefined symbol referenced by a
relocatable object file, bitcode file, or shared object, the linker will
extract the member defining it. Upon extraction, the member turns into a
non-lazy state. It will be processed like a regular relocatable object
file/bitcode file.
As a replacement for thin archives, gold and ld.lld support
--start-lib
and --end-lib
. A relocatable
object file/bitcode file surrounded by the options is said lazy in the
ld.lld representation. Such a lazy object file is similar to an archive
with just one member. A symbol defined by such a lazy object file is
also a lazy symbol.
Here is the simplified symbol resolution algorithm focusing on
archive member extraction. Basically the linker maintains a set of
non-weak (must be STB_GLOBAL
) undefined symbols. All
relocatable object files, bitcode files, and shared objects can
contribute to the set. For an incoming archive, each member is
inspected. If a member resolves a STB_GLOBAL
undefined
symbol, the member will be extracted. (I emphasize
STB_GLOBAL
because a STB_WEAK
undefined symbol
does not trigger extraction. In Mach-O, a weak reference triggers
extraction as well.)
1 | // Used by traditional Unix ELF linkers |
As you see, the while (cont)
loop is a bottleneck.
Typically linkers require an index for each archive. The index is a list
of (name, member) pairs for defined non-local symbols. See Archives and
--start-lib for the background. With the index, the
if sym.isDefined() && undef.count(sym)
part can be
optimized.
Some options can alter the archive semantics.
For an archive following --whole-archive
, its members
enter the non-lazy state directly.
The archives surrounded by a pair of --start-group
and
--end-group
can be perceived as a single archive. More
precisely, the linker will consider all the archives surrounded by
--start-group
and --end-group
in the
while (cont)
loop.
Some takeaway:
The symbol definition precedence is roughly: relocatable object file/bitcode file > shared object > archive. An archive definition loses to a shared definition, but it can win over the shared definition if extraction happens.
Without --whole-archive
, an archive may contribute zero
or some members to the relocatable object file/bitcode file set.
If the defined symbol set of an archive member is a subset of a preceding relocatable object file/bitcode file/shared object or an archive member which is guaranteed to be extracted, the archive member will be ignored. The behavior is somewhat similar to symbol interposition. See ELF interposition and -Bsymbolic for details.
For ld ... definition.a reference.o
, the definition may
be entirely dropped and a later reference.o
will lead to an
undefined symbol error.
For an archive member to be extracted, the dependents need to appear before the archive or within another member of the archive. This property leads to a loose topological order requirement.
An external program 'lorder' or build system's integrated topological sorting feature is needed to order archives. In the old days, the program "lorder" (https://www.gnu.org/software/coreutils/manual/html_node/tsort-background.html) was used.
The archive order matters (loss of commutativity) and can hide brittle build problems due to duplicate definitions. Minor ordering tweaks can cause subtle behavior changes (symbol resolution).
When providing an interceptor library (a library providing overriding
definitions), you usually want to make it optional, i.e. the intercepted
library does not have a dependency on the interceptor. However, when
linking both the intercepted archive and the interceptor archive, you
need to be careful with the order, which usually requires some special
plumbing in the build system. A common approach is
--whole-archive
, which unfortunately loses the nice lazy
property.
I will use some examples to explain the symbol resolution rules related to lazy symbols.
Shared object overriding archive
Both a.so
and b.a(b.o)
define
foo
. The linker command line is
ld a.so b.a
.
- For
a.so
, insertfoo
as aShared
. - For
b.a
, try inserting every symbol from the index.foo
is already a shared definition, soa.so
wins.b.a(b.o)
is not extracted.
ld a.so --start-lib b.o --end-lib
is similar.
See LLD selects symbol from later dynamic library over earlier archive, for LTO-generated libcall for a phenomenon.
Extracted archive member overriding shared object
0.o
has a STB_GLOBAL
undefined symbol
bar
. a.o
defines foo
.
b.a(b.o)
defines foo
and bar
. The
linker command line is ld 0.o a.so b.a
.
- For
0.o
, insertbar
as anUndefined
. - For
a.so
, insertfoo
as aShared
. - For
b.a
, try inserting every symbol from the index.foo
is already a shared definition, so insertingfoo
is ignored. However,bar
in the archive index can resolve theUndefined
entry in the global symbol table, so the member providingbar
(b.o
) is extracted. b.a(b.o)
is added like a relocatable object file. Itsfoo
definition overrides theShared
entry in the global symbol table. Itsbar
definition overrides theUndefined
entry.
Note: in Mach-O, if the a.so
definition is weak,
b.a
is loaded even if it is weak. A non-weak dylib
definition has the same precedence as a weak/non-weak archive
definition.
The linker command line is ld 0.o b.a a.so
.
- For
0.o
, insertbar
as anUndefined
. - For
b.a
, insertfoo
as aLazyArchive
.bar
in the archive index can resolve theUndefined
entry, sob.a(b.o)
is extracted. b.a(b.o)
is added like a relocatable object file. Itsfoo
definition overrides theLazyArchive
entry in the global symbol table. Itsbar
definitions overrides theUndefined
entry.- For
a.so
, its shared definition loses to theDefined
entry in the global symbol table.
Relocatable object file suppressing archive member extraction
m1.o
defines memcpy
.
libc.a(memcpy.o)
defines memcpy
. The linker
command line is ld ... m1.o -lc
.
- For
m1.o
, insertmemcpy
as aDefined
. - For
libc.a
, try inserting every symbol from the index. Some members are extracted. However, because no symbol defined bymemcpy.o
isUndefined
in the global symbol table,memcpy.o
is not extracted.
As a result, m1.o
succeeds in shadowing
libc.a(memcpy.o)
. In practice, m1.o
may be an
object file providing optimized memcpy
routines.
m2.o
defines memcmp
.
libc.a(memcmp.o)
defines memcmp
and
STB_WEAK
bcmp
. The linker command line is
ld ... m2.o -lc
.
bcmp
is referenced and thus inserted as anUndefined
.- For
m2.o
, insertmemcpy
as aDefined
. - For
libc.a
,bcmp
in the archive index can resolve theUndefined
entry, thereforelibc.a(memcmp.o)
is extracted. libc.a(memcmp.o)
is added like a relocatable object file. ItsSTB_GLOBAL
memcmp
definition causes a duplicate definition error with the existingSTB_GLOBAL
definition.
The example above demonstrates that if we want to shadow an archive member, it is better to ensure all its symbols which may be referenced are defined. I.e. The defined symbols of the interceptor should be a superset of the archive member being shadowed.
Symbol order
1 | % cat order.s |
Let's say we have two applications A (A.o+b.o+c.o) and a (a.o+b.o+c.o). A.o and a.o reference different definitions in b.o: b2 vs b1.
ld.lld's symbol order is dependent on the first occurrence. The forward symtab of A consists of symbols in A.o+b.o+c.o with duplicates removed. Let's imagine a backward order if we link A using this reversed order: c.o, b.o, A.o.
- The forward symtab of A: A1 b2 c1 A2 b1 c2.
- The forward symtab of a: a1 b1 c1 a2 b2 c2.
- The backward symtab of A: c1 c2 b1 b2 A1 A2.
- The backward symtab of a: c1 c2 b1 b2 a1 a2.
It's clear that backward symtabs are more similar.
However, in some use cases, most input files are archives (or
--start-lib/--end-lib
surrounded relocatable files). After
my https://reviews.llvm.org/D95985 in 2021,
A_archive
and a_archive
have similar symtabs
as well!
- A_archive: A1 A2 b1 b2 c1 c2
- a_archive: a1 a2 b1 b2 c1 c2
a.out
a.out is the ancestor of Mach-O, COFF, and ELF.
There are a text segment and a data segment. There are no flexible
COFF/ELFF style sections, so no need for a ELF-style
st_shndx
.
1 | struct nlist { |
n_type
:N_UNDF, N_ABS, N_TEXT, N_DATA, N_BSS, N_FN
. TheN_EXT
bit is used to represent an external symbol.
PE/COFF
The 8-byte short name and the 2-byte type just waste space.
The storage class basically maps to ELF's binding. There are many useless values. A weak definition needs an inconvenient auxiliary symbol record. A weak reference cannot be expressed.
The auxiliary symbol record representation makes the symbol table non-uniform, i.e. not an array of fixed-length entries. This makes random-access inconvenient.
Note: there is no 64-bit representation.
1 | struct Symbol { |
With some exceptions, an unresolved undefined symbol leads to a linker error, even if it is not referenced.
A COFF file contributes defined and undefined symbols.
A defined symbol can be any of the following kinds:
- special (ignored in the global symbol table)
- common (section number is
IMAGE_SYM_UNDEFINED
and value is not 0) - absolute (section number is -1)
- regular (section number is positive)
An undefined symbol has a storage class of
IMAGE_SYM_CLASS_EXTERNAL
, a section number of
IMAGE_SYM_UNDEFINED
(zero), and a value of zero.
An undefined symbol with a storage class of
IMAGE_SYM_CLASS_WEAK_EXTERNAL
is a weak external, which is
actually like a weak definition in ELF.
Code symbols
1 | // b.dll |
1 | # b.dll |
The linker parses the import file and creates a definition for
__imp_f
pointing to the import address table entry and
creates a definition f
pointing to the thunk. Normally,
f
is unused and will be discarded. 1
call *__imp_f(%rip)
However, if the user code does call f
instead, the
f
definition will be retained: 1
2
3
4 call f
f:
jmpq *__imp_f(%rip)
call *__imp_f(%rip)
is like -fno-plt
codegen for ELF.
Data symbols
1 | // b.dll |
1 | # b.dll |
The linker parses the import file and creates a definition for
__imp_var
pointing to the import address table entry.
Unlike a code symbol, the linker does not create a definition for
var
(without the __imp_
prefix).
With a dllimport
: 1
2movq __imp_var(%rip), %rax
movl (%rax), %eax
If dllimport
is not specified, 1
movq var(%rip), %rax
MinGW supports runtime pseudo relocations to patch the text section
so that absolute and PC-relative references to the symbol will be bound
to the actual definition. 1
movq var(%rip), %rax # the runtime will rewrite this to point to the definition in b.dll
Mach-O
The 256-section limitation is unfortunate. As a workaround, subsections were invented.
1 | struct nlist { |