Updated in 2023-08.
Many people just want to know how to define or reference versioned symbols properly. You may jump to Recommended usage below.
In 1995, Solaris' link editor and ld.so introduced the symbol versioning mechanism. Ulrich Drepper and Eric Youngdale borrowed Solaris' symbol versioning in 1997 and designed the GNU style symbol versioning for glibc.
When a shared object is updated and the behavior of a symbol changes, a DT_SONAME version bump is traditionally required to indicate ABI incompatibility (such as changing the type of parameters or return values). The DT_SONAME version bump can be inconvenient when there are many dependent applications. If we don't bump DT_SONAME, a dependent application/shared object built with the old version may run abnormally at run-time. Symbol versioning provides a way to maintain backward compatibility without changing DT_SONAME.
The following part describes the representation, and then describes the behaviors from the perspectives of assembler, linker, and ld.so. One may wish to skip the representation part when reading for the first time.
Representation
In a shared object or executable file that uses symbol versioning, there are up to three sections related to symbol versioning. .gnu.version_r and .gnu.version_d among them are optional:
- .gnu.version (version symbol section of type SHT_GNU_versym). The DT_VERSYM tag in the dynamic table points to the section. Assuming there are N entries in .dynsym, .gnu.version contains N uint16_t values, with the i-th entry indicating the version ID of the i-th symbol. Put another way, .gnu.version is a parallel table to .dynsym.
- .gnu.version_r (version requirement section of type SHT_GNU_verneed). The DT_VERNEED/DT_VERNEEDNUM tags in the dynamic table delimit this section. This section describes the version information used by the undefined versioned symbols in the module.
- .gnu.version_d (version definition section of type SHT_GNU_verdef). The DT_VERDEF/DT_VERDEFNUM tags in the dynamic table delimit this section. This section describes the version information used by the defined versioned symbols in the module.
1 | // Version definitions |
Currently GNU ld sets the VER_FLG_WEAK flag when a version node has no symbol associated with it. This behavior matches Solaris. BZ24718#c15 proposed "set VER_FLG_WEAK on version reference if all symbols are weak" and was rejected.
vd_cnt is one plus the number of children version definitions. vd_cnt is not used by glibc or FreeBSD rtld. ld.lld just always sets the field to 1.
The advantage of using a parallel table for .gnu.version is that symbol versioning is optional. ld.so implementations which do not support symbol versioning can freely assume no symbol has a version. The effect is as if all references bind to the default version definitions. musl ld.so falls into this category.
Version index values
Index 0 is called VER_NDX_LOCAL. The binding of the symbol will be changed to STB_LOCAL. Index 1 is called VER_NDX_GLOBAL. It has no special effect and is used for unversioned symbols. Index 2 to 0xffef are used for user defined versions.
Defined versioned symbols have two forms:
- foo@@v2, the default version.
- foo@v2, a non-default version (hidden version). The VERSYM_HIDDEN bit of the version ID is set.
Undefined versioned symbols have only the foo@v2 form.
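The index rules above can be sketched as a small decoder for one .gnu.version entry of a defined symbol. This is only an illustration: the X_-prefixed constants and the function names are made up for this sketch (the values 0x8000 and 0x7fff are the usual VERSYM_HIDDEN bit and index mask), and it does not handle reserved indexes above 0xffef.

```c
#include <stdint.h>

/* Illustrative constants; prefixed so they cannot clash with system headers. */
enum {
  X_VER_NDX_LOCAL = 0,
  X_VER_NDX_GLOBAL = 1,
  X_VERSYM_HIDDEN = 0x8000,  /* set for non-default (foo@v) definitions */
  X_VERSYM_VERSION = 0x7fff  /* mask extracting the version index */
};

typedef enum {
  VERSYM_KIND_LOCAL,       /* index 0: binding becomes STB_LOCAL */
  VERSYM_KIND_UNVERSIONED, /* index 1: unversioned symbol */
  VERSYM_KIND_DEFAULT,     /* foo@@v */
  VERSYM_KIND_NON_DEFAULT  /* foo@v */
} versym_kind;

/* Strips the hidden bit and returns the version index. */
uint16_t versym_index(uint16_t versym) {
  return (uint16_t)(versym & X_VERSYM_VERSION);
}

/* Classifies one .gnu.version entry of a defined symbol. */
versym_kind classify_versym(uint16_t versym) {
  uint16_t idx = versym_index(versym);
  if (idx == X_VER_NDX_LOCAL)
    return VERSYM_KIND_LOCAL;
  if (idx == X_VER_NDX_GLOBAL)
    return VERSYM_KIND_UNVERSIONED;
  return (versym & X_VERSYM_HIDDEN) ? VERSYM_KIND_NON_DEFAULT
                                    : VERSYM_KIND_DEFAULT;
}
```

For instance, an entry of 0x8002 would be classified as a non-default definition with version index 2.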
There is a special case: a versioned symbol referenced by a copy relocation in an executable. The symbol acts as a definition in runtime relocation processing but its version ID references .gnu.version_r instead of .gnu.version_d. The resolution of PR28158 picks the form @.
Usually versioned symbols are only defined in shared objects, but executables can have defined versioned symbols as well. (When a shared object is updated, the old symbols are retained so that other shared objects do not need to be relinked, and executable files usually do not provide versioned symbols for other shared objects to reference.)
Example
readelf -V can dump the symbol versioning tables.
In the .gnu.version_d output below:
- Version index 1 (VER_NDX_GLOBAL) is the filename (soname if shared object). The VER_FLG_BASE flag is set.
- Version index 2 is a user defined version. Its name is LUA_5.3.
In the .gnu.version_r output below, each of version indexes 3~10 represents a version in a needed shared object. The name GLIBC_2.2.5 appears thrice, each for a different shared object.
The .gnu.version table assigns a version index to each .dynsym entry. An entry (version ID) corresponds to an Index: entry in .gnu.version_d or a Version: entry in .gnu.version_r.
1 | % readelf -V /usr/bin/lua5.3 |
Symbol versioning in object files
The GNU scheme allows .symver directives to label the versions of the symbols in relocatable object files. The symbol names residing in .o contain @ or @@.
Assembler behavior
GNU as and the LLVM integrated assembler implement the directive.
- .symver foo, foo@v1
  - If foo is undefined, produce foo@v1
  - If foo is defined, produce foo and foo@v1 with the same binding (STB_LOCAL, STB_WEAK, or STB_GLOBAL) and st_other value (i.e. the same visibility). Personally I think this behavior is a design flaw {gas-copy}. The proposed V4 PATCH gas: Extend .symver directive can address this problem.
- .symver foo, foo@@v1
  - If foo is undefined, error
  - If foo is defined, produce foo and foo@@v1 with the same binding and st_other value.
- .symver foo, foo@@@v1
  - If foo is undefined, produce foo@v1
  - If foo is defined, produce foo@@v1
With GNU as 2.35 (PR25295) or Clang 13:
- .symver foo, foo@v1, remove
  - If foo is undefined, produce foo@v1
  - If foo is defined, produce foo@v1
  - This is a recommended way to define a non-default version symbol.
  - Unfortunately, in GNU as, foo cannot be used in a relocation (PR28157).
Linker behavior
The linker enters the symbol resolution stage after reading in object files, archive files, shared objects, LTO files, linker scripts, etc.
GNU ld uses indirect symbols to represent versioned symbols. There are complicated rules, and these rules are not documented. The symbol resolution rules that I personally derived:
- Defined foo resolves undefined foo (traditional unversioned rule)
- Defined foo@v1 resolves undefined foo@v1 (a non-default version symbol is like a separate symbol)
- Defined foo@@v1 (default version) resolves both undefined foo and foo@v1
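The three rules above can be written as a small string predicate. This is a sketch of the rules as stated here, not of how any linker is implemented; definition_resolves is a hypothetical helper name, and symbol names are assumed to use the foo, foo@v1 and foo@@v1 spellings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does a definition named `def` satisfy an undefined
 * reference named `undef` under the three rules listed above? */
bool definition_resolves(const char *def, const char *undef) {
  /* Rules 1 and 2: identical names match (foo/foo, foo@v1/foo@v1). */
  if (strcmp(def, undef) == 0)
    return true;

  /* Only a default version definition (foo@@v1) can match anything else. */
  const char *at = strstr(def, "@@");
  if (at == NULL)
    return false;
  size_t base_len = (size_t)(at - def);

  /* Rule 3a: foo@@v1 resolves the unversioned reference foo. */
  if (strlen(undef) == base_len && strncmp(def, undef, base_len) == 0)
    return true;

  /* Rule 3b: foo@@v1 resolves foo@v1 (same name and same version). */
  return strncmp(def, undef, base_len) == 0 && undef[base_len] == '@' &&
         strcmp(undef + base_len + 1, at + 2) == 0;
}
```

Under this model, foo@@v2 satisfies an undefined foo but not an undefined foo@v1, which stays a separate symbol.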
If there are multiple default version definitions (such as foo@@v1 foo@@v2), a duplicate definition error should be issued even if one is weak. Usually a symbol has zero or one default version (@@) definition, and an arbitrary number of non-default version (@) definitions.
If the linker sees undefined foo and foo@v1 first, it will treat them as two symbols. When the linker sees the definition foo@@v1, conceptually foo and foo@@v1 should be combined. If the linker sees foo@@v2 instead, foo@@v2 should resolve foo and foo@v1 should be a separate symbol.
- Combining Versions describes the problem.
- gold/symtab.cc Symbol_table::define_default_version uses a heuristic rule to solve this problem. It special cases on visibility, but I feel that this rule is unneeded.
- Before 2.36, GNU ld reported a bogus multiple definition error for defined weak foo@@v1 and defined global foo@v1: PR ld/26978
- Before 2.36, GNU ld had a bug that the visibility of undefined foo@v1 does not affect the output visibility of foo@@v1: PR ld/26979
- I fixed the object file side problem of ld.lld 12.0 in https://reviews.llvm.org/D92259
foo
Archive files and lazy object files may still have incompatibility issues.
When ld.lld sees a defined foo@@v1, it adds both foo and foo@v1 into the symbol table, thus foo@@v1 can resolve both undefined foo and foo@v1. After processing all input files, a pass iterates symbols and redirects foo@v1 to foo@@v1. Because ld.lld treats them as separate symbols during input processing, a defined foo@v1 cannot suppress the extraction of an archive member defining foo@@v1, leading to a behavior incompatible with GNU ld. This probably does not matter, though.
GNU ld has another strange behavior: if both foo and foo@v1 are defined (at the same position), foo will be removed. I strongly believe it is an issue in GNU ld but the maintainer rejected PR ld/27210. I implemented a similar hack in ld.lld 13.0.0 (https://reviews.llvm.org/D107235) but hope binutils can fix the assembler issue (https://sourceware.org/pipermail/binutils/2021-August/117677.html).
Version script
To define a versioned symbol in a shared object or an executable, a version script must be specified. If there is no defined versioned symbol, the version script can be omitted.
1 | # Make all symbols other than foo and bar local. |
A version script has three purposes:
- Define versions.
- Specify some patterns so that matched defined non-local symbols (which do not have @ in the name) are tied to the specified version.
- Scope reduction
  - for a matched symbol, its binding will be changed to STB_LOCAL and will not be exported to the dynamic symbol table.
  - for a defined unversioned symbol, it can be matched by a local: pattern in any version node. E.g. foo can be matched by v1 { local: foo; };
  - for a defined versioned symbol, it can be matched by a local: pattern in the associated version node. E.g. both foo@@v1 and foo@v1 can be matched by v1 { local: foo; };.
A version script consists of one anonymous version tag ({...};) or a list of named version tags (v1 {...};). If you use an anonymous version tag with other version tags, GNU ld will error: anonymous version tag cannot be combined with other version tags.
A local: part can be placed in any version tag. Which version tag is used does not matter.
If a defined symbol is matched by multiple version tags, the following precedence rules apply (binutils-gdb/bfd/linker.c:find_version_for_sym):
- The first version tag with an exact pattern (i.e. there is no wildcard) wins.
- Otherwise, the last version tag with a non-* wildcard pattern in global: wins.
- Otherwise, the last version tag with a non-* wildcard pattern in local: wins.
- Otherwise, the last version tag with a * pattern wins.
In gold and ld.lld, the rules are like:
- The first version tag with an exact pattern (i.e. there is no wildcard) wins.
- Otherwise, the last version tag with a non-* wildcard pattern wins. If the version tag has non-* wildcard patterns in both global: and local:, the global: one wins.
- Otherwise, the last version tag with a * pattern wins. (Prior to LLD 18, the first instead of the last)
For example, given v1 { local: p*;}; v2 { global: pq*;}; v3 { local: pqr*;};, local: pqr* is selected for a defined non-local symbol pqrs in gold and ld.lld while global: pq* is selected in GNU ld.
** is also a catch-all pattern, but its precedence is higher than *.
GNU ld reports an error when a pattern appears in both global: and local:.
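The two precedence schemes can be modelled in a few lines. This is a rough sketch that follows the rules as listed above, not the actual code of any linker: struct vs_pattern and find_version_tag are hypothetical names, patterns are assumed to be stored in version-script order with ascending tag numbers, POSIX fnmatch stands in for the linkers' glob matching, and only the winning tag is reported (not whether the match was global: or local:).

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One pattern from a version script; `tag` is the position of its version
 * tag (0 for the first tag, 1 for the second, ...). */
struct vs_pattern {
  int tag;
  bool is_local;
  const char *glob;
};

/* Returns the winning tag for `sym`, or -1 if no pattern matches.
 * gnu_ld selects the GNU ld rules; otherwise the gold/ld.lld rules apply. */
int find_version_tag(const struct vs_pattern *pats, size_t n, const char *sym,
                     bool gnu_ld) {
  int glob_global = -1, glob_local = -1, star = -1;
  for (size_t i = 0; i < n; i++) {
    const struct vs_pattern *p = &pats[i];
    if (strpbrk(p->glob, "*?[") == NULL) {
      if (strcmp(p->glob, sym) == 0)
        return p->tag; /* the first exact pattern always wins */
      continue;
    }
    if (strcmp(p->glob, "*") == 0) {
      star = p->tag; /* remember the last catch-all tag */
      continue;
    }
    if (fnmatch(p->glob, sym, 0) != 0)
      continue;
    if (p->is_local)
      glob_local = p->tag; /* last matching non-* wildcard in local: */
    else
      glob_global = p->tag; /* last matching non-* wildcard in global: */
  }
  if (gnu_ld) {
    if (glob_global != -1)
      return glob_global;
    if (glob_local != -1)
      return glob_local;
  } else {
    int last = glob_global > glob_local ? glob_global : glob_local;
    if (last != -1)
      return last;
  }
  return star;
}
```

With the three tags from the example (numbered 0, 1, 2), the symbol pqrs yields tag 1 (v2) under the GNU ld rules and tag 2 (v3) under the gold/ld.lld rules. A ** pattern is treated as an ordinary wildcard here, so it outranks *.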
Most patterns are exact so gold and ld.lld iterate patterns instead of symbols to improve performance.
GNU ld and gold add an absolute symbol (st_shndx=SHN_ABS) for each defined version to .symtab and .dynsym. ld.so does not need the symbol, so this behavior looks strange to me.
In a -r link, --version-script is ignored. Technically local: version nodes may be useful together with -r, but GNU ld and ld.lld just ignore --version-script.
How a versioned symbol is produced
An undefined symbol can be assigned a version if:
- its name does not contain @ (.symver is unused) and a shared object provides a default version definition.
- its name contains @ and a shared object defines the symbol. GNU ld errors if there is no such shared object. After https://reviews.llvm.org/D92260, ld.lld will report an error as well.
A defined symbol can be assigned a version if:
- its name does not contain @ and it is matched by a pattern in a named version tag in a version script.
- its name contains @
  - If -shared, the version should be defined by a version script, otherwise GNU ld errors version node not found for symbol. This exception looks strange to me so I have filed PR ld/26980.
  - If -no-pie or -pie, a version definition is unneeded in GNU ld. This behavior is strange.
Recommended usage
Personal recommendation:
To define a default version symbol, don't use .symver. Just list the symbol name in a version node in the version script. If you really want to use .symver, use .symver foo, foo@@@v2 so that foo is not present. If you require binutils>=2.35 or Clang>=13, .symver foo, foo@@v2, remove works as well.
To define a non-default version symbol, add a suffix to the original symbol name (.symver foo_v1, foo@v1) to prevent conflicts with foo. This will however leave a (usually undesirable) foo_v1. If you don't strip foo_v1 from the object file, you may localize it with a local: pattern in the version script. With the newer toolchain, you can use .symver foo_v1, foo@v1, remove.
1 | cat > a.c <<e |
1 | % readelf -W --dyn-syms a.so | grep @ |
Most of the time, you want an undefined symbol to be bound to the default version symbol at link time. It is usually unnecessary to set the version with .symver.
If you really need to set a version, either .symver foo_v1, foo@@@v1 or .symver foo_v1, foo@v1 is fine.
1 | cat > b.c <<e |
The reference is bound to the non-default version foo@v1:
% readelf -W --dyn-syms b.so | grep foo
5: 0000000000000000 0 FUNC GLOBAL DEFAULT UND foo@v1 (2)
If you omit .symver, the reference will be bound to the default version foo@@v2.
Why is .symver xxx, foo@v1 bad with a defined symbol?
There are two cases.
First, xxx is not foo (the unadorned name of the versioned symbol). This is the most common usage. Without loss of generality, we use .symver foo_v1, foo@v1 as the example.
If the version script does not localize foo_v1, we will get foo_v1 in .dynsym. The extra symbol is almost always undesired.
Second, xxx is foo. The foo definition can satisfy unversioned references from other TUs. If you think about it, it is very rare for a non-default version definition to be used outside the TU.
1 | # a.s |
If the version script contains v1 {};, the output will have just foo@v1 with GNU ld and ld.lld>=13.0.0. The output will have both foo and foo@v1 with gold and older ld.lld.
If the version script contains v1 { foo; };, the output will have just foo@v1 with GNU ld, gold, and ld.lld>=13.0.0. The output will have both foo and foo@v1 with older ld.lld.
If the version script contains v2 { foo; };, the pattern will be ignored. Unfortunately no linker reports a warning for this error-prone case.
Having distinct behaviors is unfortunate. And the second case requires complexity in the linker internals.
rtld behavior
Linux Standard Base Core Specification, Generic Part describes the behavior of rtld (ld.so). Kan added symbol versioning support to FreeBSD rtld in 2005.
The DT_VERNEED and DT_VERNEEDNUM tags in the dynamic table delimit the version requirements of a shared object/executable file: the required (needed) versions and required (needed) shared object names (Vernaux::vna_name).
When an object with DT_VERNEED is loaded, glibc rtld performs some checks (_dl_check_all_versions). For each Vernaux entry (a Verneed's auxiliary entry), glibc rtld checks whether the referenced shared object has a DT_VERDEF table. If no, ld.so handles the case as a graceful degradation and prints no version information available (required by %s); if yes and the table does not define the version, ld.so reports (if the Vernaux entry has the VER_FLG_WEAK bit) a warning or (otherwise) an error. [verneed-check]
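The decision just described can be summarized as a small function. This is a simplified model of the outcome only, not glibc's actual code: the enum, the function name check_vernaux, and its boolean parameters are invented for this sketch.

```c
#include <stdbool.h>

typedef enum {
  VERNEED_OK,              /* the needed version is defined */
  VERNEED_NO_VERSION_INFO, /* "no version information available (required by %s)" */
  VERNEED_WARNING,         /* version missing, Vernaux entry has VER_FLG_WEAK */
  VERNEED_ERROR            /* version missing, Vernaux entry is not weak */
} verneed_result;

/* Models the outcome of checking one Vernaux entry against the shared
 * object it names. */
verneed_result check_vernaux(bool dep_has_verdef, bool dep_defines_version,
                             bool vernaux_is_weak) {
  if (!dep_has_verdef)
    return VERNEED_NO_VERSION_INFO; /* graceful degradation */
  if (dep_defines_version)
    return VERNEED_OK;
  return vernaux_is_weak ? VERNEED_WARNING : VERNEED_ERROR;
}
```

Only the last outcome prevents the program from starting; the "no version information" case merely prints a diagnostic.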
Usually a minor release does not bump soname. Suppose that libB.so depends on libA 1.3 (soname is libA.so.1) and calls a function which does not exist in libA 1.2. If PLT lazy binding is used, libB.so may seem to work on a system with libA 1.2, until the PLT of the 1.3 symbol is called. If symbol versioning is not used and you want to solve this problem, you have to record the minor version number (libA.so.1.3) in the soname. However, bumping soname is all-or-nothing: all the dependent shared objects need to be relinked. If symbol versioning is used, you can continue to use the soname libA.so.1. ld.so will report an error if libA 1.2 is used, because the 1.3 version required by libB.so does not exist.
When searching a definition for foo,
- for an object without DT_VERSYM
  - it can be bound to foo
- for an object with DT_VERSYM
  - it can be bound to foo of version VER_NDX_GLOBAL. This takes precedence over the next two rules
  - it can be bound to foo of any default version
  - it can be bound to foo of non-default version index 2 in relocation resolving phase (not dlsym/dlvsym). The rule retains compatibility when a shared object becomes versioned.
Note that binding an undefined foo to foo@v1 with version index 2 is allowed by ld.so but not by the linker {reject-non-default}. The rtld behavior retains compatibility when a shared object becomes versioned: the symbols with the smallest version (index 2) indicate the previously unversioned symbols. If a new version of a shared object needs to deprecate an unversioned bar, you can remove bar and define bar@compat instead. Libraries using bar are unaffected but new linking against bar is disallowed.
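The lookup rules for an unversioned reference can be condensed into a predicate over one candidate definition. This is a sketch of the rules as listed above, not glibc or FreeBSD code: can_bind_unversioned and its parameters are hypothetical, 0x7fff/0x8000 are the version index mask and hidden bit, and the precedence between several acceptable candidates is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Can an unversioned reference to foo bind to a candidate definition of foo
 * whose .gnu.version entry is `versym`?
 * obj_has_versym: the defining object has DT_VERSYM.
 * relocation_phase: true for relocation resolving, false for dlsym/dlvsym. */
bool can_bind_unversioned(bool obj_has_versym, uint16_t versym,
                          bool relocation_phase) {
  if (!obj_has_versym)
    return true; /* unversioned object: plain foo */

  uint16_t idx = (uint16_t)(versym & 0x7fff); /* version index */
  bool hidden = (versym & 0x8000) != 0;       /* non-default version */

  if (idx == 1)
    return true; /* VER_NDX_GLOBAL */
  if (idx >= 2 && !hidden)
    return true; /* any default version */
  /* Compatibility rule: non-default version with index 2, relocations only. */
  return idx == 2 && hidden && relocation_phase;
}
```

For example, a non-default definition with index 3 is never acceptable for an unversioned reference, while one with index 2 is acceptable only during relocation processing.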
When there are multiple versions of foo, dlsym(RTLD_DEFAULT, ...) returns the default version. On glibc before 2.36, dlsym(RTLD_NEXT, ...) returned the first version (BZ14932). This was because in elf/dl-sym.c:do_sym, the RTLD_NEXT branch did not pass the flags DL_LOOKUP_RETURN_NEWEST to dl_lookup_symbol_x. FreeBSD does not have the issue.
When searching a definition for foo@v1,
- for an object without DT_VERSYM
  - it can be bound to foo. In glibc, elf/dl-lookup.c:check_match asserts that the filename does not match the vn_file filename
- for an object with DT_VERSYM
  - it can be bound to foo@v1 or foo@@v1
  - it can be bound to foo of version VER_NDX_GLOBAL in relocation resolving phase (not dlsym/dlvsym)
Say b.so references malloc@GLIBC_2.2.5. The executable defines an unversioned malloc due to linking in a malloc implementation. At run-time, malloc@GLIBC_2.2.5 in b.so will bind to the executable. For example, address/memory/thread sanitizers leverage this behavior: shared objects do not need to link in interceptors; having the interceptor in the executable is sufficient. libxml2 relied on the behavior to drop versioning on symbols while retaining compatibility for objects linking against older versions of libxml2.
In glibc, when a versioned reference is bound to a shared object without symbol versioning, elf/dl-lookup.c:check_match asserts that the filename does not match the vn_file filename. If the filename matches, glibc thinks that the runtime shared object is older than the link-time shared object, and will report an assertion error.
echo 'void foo(); int main() { foo(); }' > a.c
echo 'v1 { foo; };' > c0.ver
echo 'void foo() {}' > c.c
sed 's/^ /\t/' > Makefile <<'eof'
.MAKE.MODE := meta curdirOk=1
CFLAGS := -fpic
LDFLAGS := -Wl,--no-as-needed
a: a.c c.so c0.so
 $(LINK.c) a.c c0.so -Wl,-rpath=$$PWD -o $@
c0.so: c.c c0.ver
 $(LINK.c) -shared -Wl,-soname=c.so,--version-script=c0.ver c.c -o $@
c.so: c.c
 $(LINK.c) -shared -Wl,-soname=c.so -nostdlib c.c -o $@
clean:
 rm -f a *.so *.o *.meta
eof
% bmake && ./a
./a: /tmp/d/c.so: no version information available (required by ./a)
Inconsistency detected by ld.so: dl-lookup.c: 107: check_match: Assertion `version->filename == NULL || ! _dl_name_match_p (version->filename, map)' failed!
However, this check is pretty dumb, as most shared objects have DT_VERSYM due to versioned references to libc like __cxa_finalize@GLIBC_2.2.5 (from GCC crtbeginS.o).
Aside from this assertion, vn_file is essentially ignored for symbol search since glibc 2.30 BZ24741. Previously during relocation resolving, after an object failed to provide a match, if it matched vn_file, rtld would report an error symbol %s version %s not defined in file %s with link time reference.
ld.so: Support moving versioned symbols between sonames [BZ #24741] has a side benefit. Previously, if b.so has a versioned weak reference foo@v1 where v1 references c.so, rtld would report an error symbol %s version %s not defined in file %s with link time reference if c.so does not define foo@@v1 or foo@v1. This behavior did not match the expectation of weak references. Newer rtld no longer reports the error.
echo 'void fb(); int main() { fb(); }' > a.c
echo '__attribute__((weak)) void foo(); void fb() { if (foo) foo(); }' > b.c
echo 'v1 { foo; };' > c0.ver
echo 'void foo() {}' > c.c
echo 'v1 { };' > c.ver
sed 's/^ /\t/' > Makefile <<'eof'
.MAKE.MODE := meta curdirOk=1
CFLAGS := -fpic
a: a.c b.so c.so c0.so
 $(LINK.c) a.c b.so -Wl,-rpath=$$PWD -o $@
b.so: b.c c0.so
 $(LINK.c) -shared $> -Wl,-rpath=$$PWD -o $@
c0.so: c.c c.ver
 $(LINK.c) -shared -Wl,-soname=c.so,--version-script=c0.ver c.c -o $@
c.so: c.c c.ver
 $(LINK.c) -shared -Wl,-soname=c.so,--version-script=c.ver -Dfoo=foo1 c.c -o $@
clean:
 rm -f a *.so *.o *.meta
eof
Example
Run the following code to create a.c b.c b0.c Makefile, then run bmake.
1 | cat > ./a.c <<'eof' |
Before glibc 2.36, the output is:
% ./a
foo(0) = 0, ok
foo(0) = 3, ok
foo(0) = 0, wrong
Upgraded symbols in glibc
When to prevent execution of new binaries with old glibc has a summary about when a new symbol version is introduced.
Note that GNU nm before binutils 2.35 does not display @ or @@.
1 | nm -D /lib/x86_64-linux-gnu/libc.so.6 | \ |
The output on my x86-64 system:
1 | pthread_cond_broadcast @GLIBC_2.2.5 @@GLIBC_2.3.2 |
- realpath@@GLIBC_2.3: the previous version returns EINVAL when the second parameter is NULL
- memcpy@@GLIBC_2.14 BZ12518: the previous version guarantees a forward copying behavior. Shockwave Flash at that time had a "memcpy downward" bug which required the workaround.
- quick_exit@@GLIBC_2.24 BZ20198: the previous version calls the destructors of thread_local objects.
- glob64@@GLIBC_2.27: the previous version does not follow dangling symlinks.
How to remove symbol versioning
Imagine that you want to build an application with a prebuilt shared object which has versioned references, but you can only find shared objects providing the unversioned definitions. The linker will helpfully error:
ld.lld: error: undefined reference to foo@v1 [--no-allow-shlib-undefined]
As the diagnostic suggests, you can add --allow-shlib-undefined to get rid of the error. It is not recommended but the built application may happen to work.
For this case, an alternative hacky solution is:
1 | # 64-bit |
With the removal of .gnu.version, the linker will think that out.so references foo instead of foo@v1. However, llvm-objcopy will zero out the section contents. At runtime, glibc ld.so will complain unsupported version 0 of Verneed record. To make glibc happy, you can delete DT_VER* tags from the dynamic table. The above code snippet uses an r2 command to locate DT_VERNEED (0x6ffffffe) and rewrite it to DT_NULL (a DT_NULL entry stops the parsing of the dynamic table). The difference of the readelf -d output is roughly:
1 | 0x000000006ffffffb (FLAGS_1) Flags: NOW |
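The DT_VERNEED to DT_NULL rewrite can also be pictured on an in-memory copy of the dynamic table. This is a hedged sketch rather than a replacement for the r2 command: struct dyn_entry is a stand-in for Elf64_Dyn, truncate_at_verneed is a hypothetical name, and it assumes every entry after DT_VERNEED can be dropped, which depends on the order in which the linker emitted the tags.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for Elf64_Dyn so the sketch does not depend on <elf.h>. */
struct dyn_entry {
  int64_t tag;
  uint64_t val;
};

enum { X_DT_NULL = 0, X_DT_VERNEED = 0x6ffffffe };

/* Overwrites the first DT_VERNEED entry with DT_NULL so that parsing of the
 * dynamic table stops there. Returns the index of the rewritten entry, or -1
 * if the table has no DT_VERNEED before its terminating DT_NULL. */
long truncate_at_verneed(struct dyn_entry *dyn, size_t n) {
  for (size_t i = 0; i < n && dyn[i].tag != X_DT_NULL; i++) {
    if (dyn[i].tag == X_DT_VERNEED) {
      dyn[i].tag = X_DT_NULL;
      dyn[i].val = 0;
      return (long)i;
    }
  }
  return -1;
}
```

Entries that follow the rewritten slot stay in the file but are no longer seen by a loader that stops at the first DT_NULL.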
ld.lld
- If an undefined symbol is not defined by a shared object, GNU ld will report an error. ld.lld before 12.0 did not error (I fixed it in https://reviews.llvm.org/D92260).
GNU function attribute
There is a GNU function attribute which lowers to a .symver assembly directive. The attribute is implemented by GCC but not by Clang.
1 | extern "C" __attribute__((symver("foo@@v2"))) void foo() {} |
Unfortunately, @@@ and ,remove are not supported. For this reason, and because Clang does not implement the function attribute, I discourage using this feature.
Remarks
GCC and Clang support asm specifiers and #pragma redefine_extname for renaming a symbol. For example, if you declare int foo() asm("foo_v1"); and then reference foo, the symbol in .o will be foo_v1.
For example, the biggest change in musl v1.2.0 is the time64 support for its supported 32-bit architectures. musl adopted a scheme based on asm specifiers:
1 | // include/features.h |
- In .o, the time32 symbol remains utimes and is compatible with the ABI required by programs linked against old musl versions; the time64 symbol is __utimes_time64.
- The public header redirects utimes to __utimes_time64.
  - cons: if the user declares utimes by themself, they will not link against the correct __utimes_time64.
- The "good-looking" name utimes is used for the preferred time64 implementation internally and the "ugly" name __utimes_time32 is used for the legacy time32 implementation.
  - If the time32 implementation is called elsewhere, the "ugly" name can make it stand out.
For the above example, here is an implementation with symbol versioning:
1 | // API header include/sys/time.h |
Note that @@@ cannot be used. The header is included in a defining translation unit and @@@ will lead to a default version definition while we want a non-default version definition. According to Assembler behavior, the undesirable __utimes_time32 is present. Be careful to use a version script to localize it.
So what is the significance of symbol versioning? After thinking carefully, I see the following:
- Refuse linking against old symbols while keeping compatibility with unversioned old libraries. {reject-non-default}
- No need to label declarations.
- The version definition can be delayed until link time. The version script provides a flexible pattern matching mechanism to assign versions.
- Scope reduction. Arguably another mechanism like --dynamic-list might have been developed if version scripts did not provide local:.
- There are some semantic issues in renaming builtin functions with asm specifiers in GCC and Clang (they do not know that the renamed symbol has built-in semantics). See 2020-10-15-intra-call-and-libc-symbol-renaming
- [verneed-check]
For the first item, the asm specifier scheme relies on conventions to prevent problems (users should include the header), whereas symbol versioning can be enforced by ld.
Design flaws:
- .symver foo, foo@v1 {gas-copy}
- Verdaux is a bit redundant. In practice, one Verdef has only one auxiliary Verdaux entry.
- This is arguably a minor problem but annoying for a framework providing multiple shared objects. ld.so requires "a versioned symbol is implemented in the same shared object in which it was found at link time", which disallows moving definitions between shared objects. Fortunately, glibc 2.30 BZ24741 relaxes this requirement, essentially ignoring Vernaux::vna_name.
Before that, glibc used a forwarder to move clock_* functions from librt.so to libc.so:
1 | // rt/clock-compat.c |
libc.so defines __clock_getres and clock_getres. librt.so defines an ifunc called clock_getres which forwards to libc.so __clock_getres.
Related links
Chinese version
In 1995, Solaris' link editor and ld.so introduced the symbol versioning mechanism. Ulrich Drepper and Eric Youngdale borrowed Solaris' symbol versioning in 1997 and designed the GNU style symbol versioning for glibc.
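When a shared object is updated and the behavior of a symbol changes (an ABI change such as a different parameter or return type, or a behavior change), the traditional approach is to bump DT_SONAME: dependent shared objects must be recompiled and relinked before they can keep using it. If DT_SONAME is not changed, dependent shared objects may silently misbehave. Symbol versioning provides backward compatibility without changing DT_SONAME.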
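The following part describes the representation, and then describes the symbol versioning behaviors from the perspectives of the assembler, the linker, and ld.so. One may wish to skip the representation part when reading for the first time.
Representation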
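In a shared object or executable file that uses symbol versioning, there are up to three sections related to symbol versioning. .gnu.version_r and .gnu.version_d among them are optional:
- .gnu.version (version symbol section). The DT_VERSYM tag in the dynamic table points to the section. Assuming .dynsym has N entries, .gnu.version contains N uint16_t values. The i-th entry describes the version of the i-th dynamic symbol table entry.
- .gnu.version_r (version requirement section). The DT_VERNEED/DT_VERNEEDNUM tags in the dynamic table delimit this section. It describes the version information used by the undefined versioned symbols in the module.
- .gnu.version_d (version definition section). The DT_VERDEF/DT_VERDEFNUM tags in the dynamic table delimit this section. It records the version information used by the defined versioned symbols in the module.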
1 | // Version definitions |
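Currently GNU ld does not set VER_FLG_WEAK. BZ24718#c15 proposed "set VER_FLG_WEAK on version reference if all symbols are weak".
The advantage of using a parallel table is that an ld.so which does not support symbol versioning (that is, which ignores DT_VERSYM, DT_VERDEF, and DT_VERNEED) keeps working, as if no symbol had a version. musl ld.so falls into this category.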
Version index values
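Index 0 is called VER_NDX_LOCAL. The binding of a symbol with version ID 0 will be changed to STB_LOCAL. Index 1 is called VER_NDX_GLOBAL. It has no special effect and is used for unversioned symbols. Index 2 to 0xffef are used for other versions.
Defined versioned symbols have two forms:
- foo@@v2, called the default version
- foo@v2, called a non-default version or hidden version. The VERSYM_HIDDEN bit of its version ID is set
Undefined symbols have only the foo@v2 form.
Usually versioned symbols are only defined in shared objects, but executables can have versioned symbols as well. (When a shared object is updated, the old symbols are retained so that other shared objects do not need to be relinked, and executable files usually do not provide versioned symbols for other shared objects to reference.)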
Example
readelf -V can dump the symbol versioning tables.
In the .gnu.version_d section of the output below:
- Version index 1 (VER_NDX_GLOBAL) is the filename (soname if shared object). The VER_FLG_BASE flag is set.
- Version index 2 is a user defined version. Its name is LUA_5.3.
In the .gnu.version_r section of the output below, each of version indexes 3~10 represents a version in a needed shared object. The name GLIBC_2.2.5 appears three times, each for a different shared object.
The .gnu.version table assigns a version index to each .dynsym symbol.
1 | % readelf -V /usr/bin/lua5.3 |
Symbol versioning in object files
The GNU symbol versioning scheme allows .symver directives to label the versions of symbols in .o files. The symbol names in .o literally contain @ or @@.
Assembler behavior
GNU as and the LLVM integrated assembler implement the directive.
- For .symver foo, foo@v1
  - If foo is undefined, the .o has one symbol named foo@v1
  - If foo is defined, the .o has two symbols, foo and foo@v1, with the same binding (both STB_LOCAL, both STB_WEAK, or both STB_GLOBAL) and the same st_other (the same visibility). Personally I think this behavior is a design flaw {gas-copy}
- For .symver foo, foo@@v1
  - If foo is undefined, the assembler reports an error
  - If foo is defined, the .o has two symbols, foo and foo@@v1, with the same binding and st_other
- For .symver foo, foo@@@v1
  - If foo is undefined, the .o has one symbol named foo@v1
  - If foo is defined, the .o has one symbol named foo@@v1
With GNU as 2.35 (PR25295) or Clang 13:
- .symver foo, foo@v1, remove
  - If foo is undefined, the .o has one symbol named foo@v1
  - If foo is defined, the .o has one symbol named foo@v1
  - I recommend defining non-default version symbols this way
  - Unfortunately, in GNU as, foo cannot be used in a relocation (PR28157).
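Linker behavior
The linker enters the symbol resolution stage after reading in object files, archive files, shared objects, LTO files, linker scripts, etc.
GNU ld uses indirect symbols to represent versioned symbols. There are complicated rules at many stages, and none of them are documented. The symbol resolution rules that I personally derived:
- Defined foo resolves undefined foo (traditional unversioned rule)
- Defined foo@v1 resolves undefined foo@v1
- Defined foo@@v1 resolves both undefined foo and foo@v1
If there are multiple default version definitions (such as foo@@v1 foo@@v2), a duplicate definition error is triggered. Usually a symbol has zero or one default version (@@) definition, and an arbitrary number of non-default version (@) definitions.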
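In the ld.lld implementation, when foo@@v1 is seen in a shared object, both foo and foo@v1 are inserted into the symbol table, so it can resolve both undefined foo and foo@v1.
If the linker sees undefined foo and foo@v1 first, it will treat them as two symbols. When it later sees the definition foo@@v1, conceptually foo and foo@@v1 should be combined. If it sees the definition foo@@v2 instead, foo@@v2 should resolve foo while foo@v1 remains a separate symbol.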
- Combining Versions describes the problem.
- gold/symtab.cc Symbol_table::define_default_version uses a heuristic rule to handle this problem. It special cases on visibility, but I feel that this rule may be unneeded.
- Before 2.36, GNU ld reported a bogus multiple definition error for defined weak foo@@v1 and defined global foo@v1: PR ld/26978
- Before 2.36, GNU ld had a bug that the visibility of undefined foo@v1 does not affect the output visibility of foo@@v1: PR ld/26979
- I fixed the object file side problem of ld.lld 12.0 in https://reviews.llvm.org/D92259
foo
Archive files and lazy object files may still have incompatibility issues.
When ld.lld sees a defined foo@@v1, it adds both foo and foo@v1 into the symbol table, thus foo@@v1 can resolve both undefined foo and foo@v1. After processing all input files, a pass iterates symbols and redirects foo@v1 to foo@@v1. Because ld.lld treats them as separate symbols during input processing, a defined foo@v1 cannot suppress the extraction of an archive member defining foo@@v1, leading to a behavior incompatible with GNU ld. This probably does not matter, though.
GNU ld has another strange behavior: if both foo and foo@v1 are defined, foo will be removed. I strongly believe it is an issue in GNU ld but the maintainer rejected PR ld/27210.
Version script
Defining a version in an output shared object or executable requires a version script. If all versioned symbols are undefined, no version script is needed.
# Make all symbols other than foo and bar local.
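A version script has three purposes:
- Define versions
- Specify patterns so that matched symbols that are defined and unversioned get the specified version
- Scope reduction
  - If a symbol matched by a local: pattern is defined and unversioned, its binding is changed to STB_LOCAL and it is not exported to the dynamic symbol table
  - A defined unversioned symbol can be matched by a local: pattern in any version node
  - A defined versioned symbol can be matched by a local: pattern in the corresponding version node. For example, both foo@@v1 and foo@v1 can be matched by v1 { local: foo; };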
A version script consists of either one anonymous version tag ({...};) or a number of named version tags (v1 {...};). If an anonymous version tag is used together with other version tags, GNU ld reports the error anonymous version tag cannot be combined with other version tags.
local: can be placed in any version tag.
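If a defined symbol is matched by multiple version tags, the following precedence rules apply (binutils-gdb/bfd/linker.c:find_version_for_sym):
- The first version tag with an exact pattern
- The last version tag with a non-* wildcard pattern
- The first version tag with a * pattern
A ** pattern also matches every symbol, but it takes precedence over *.
Most patterns contain no wildcard, so gold and ld.lld iterate over patterns rather than symbols to improve performance.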
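The precedence rules can be sketched as follows. This is a simplified Python model, not the code in bfd/linker.c: the function name pick_version and the list-of-pairs representation of a version script are assumptions for illustration, local: patterns are left out, and Python's fnmatch may differ from the linker's glob matching in corner cases.

```python
from fnmatch import fnmatchcase

WILDCARD_CHARS = set("*?[")


def pick_version(name, script):
    """Pick the version tag for `name`.

    `script` is a list of (tag, [patterns]) pairs in file order.
    """
    first_exact = last_glob = first_star = None
    for tag, patterns in script:
        for pattern in patterns:
            if pattern == "*":
                if first_star is None:
                    first_star = tag           # rule 3: first "*" wins
            elif WILDCARD_CHARS.isdisjoint(pattern):
                if pattern == name and first_exact is None:
                    first_exact = tag          # rule 1: first exact match wins
            elif fnmatchcase(name, pattern):
                last_glob = tag                # rule 2: last non-"*" glob wins
    for candidate in (first_exact, last_glob, first_star):
        if candidate is not None:
            return candidate
    return None


script = [("v1", ["*"]), ("v2", ["f*"]), ("v3", ["fo*"]), ("v4", ["foo"])]
print(pick_version("foo", script))  # v4
print(pick_version("fox", script))  # v3
print(pick_version("bar", script))  # v1
```

Because ** is treated as an ordinary wildcard pattern rather than as *, it falls under the second rule and therefore beats a plain *.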
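How a versioned symbol is produced
An undefined symbol gets its version in one of these ways:
- The name does not contain @ (.symver is not used): some shared object defines a default version of the symbol
- The name contains @: the symbol must be defined by some shared object, otherwise GNU ld reports an error. Since https://reviews.llvm.org/D92260, ld.lld reports an error as well
A defined symbol gets its version in one of these ways:
- The name does not contain @: the symbol is matched by a pattern in a named version tag of the version script and gets that version
- The name contains @
  - -shared: the version v1 must be defined by the version script, otherwise GNU ld reports an error (version node not found for symbol)
  - -no-pie or -pie: GNU ld produces the version definition v1 even without a version script. This behavior is strange.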
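The cases for a defined symbol can be summarized in a few lines of Python. This is only a model of the GNU ld behavior described above; the function name, its parameters, and the exception type are hypothetical.

```python
def version_of_defined(symbol, script_versions, matched_tag=None, shared=True):
    """Return the version a defined symbol ends up with, or None if unversioned.

    symbol:          name in the object file: 'foo', 'foo@v1' or 'foo@@v1'
    script_versions: set of version names defined by the version script
    matched_tag:     named version tag whose pattern matched the symbol, if any
    shared:          True for -shared, False for -no-pie / -pie
    """
    _, sep, version = symbol.partition("@")
    if not sep:
        # No @ in the name: the version can only come from a pattern match.
        return matched_tag
    version = version.lstrip("@")
    if version in script_versions:
        return version
    if shared:
        # Modeled after GNU ld's "version node not found for symbol" error.
        raise LookupError(f"version node not found for symbol {symbol}")
    # Executables: GNU ld produces the version definition by itself.
    return version


print(version_of_defined("foo", {"v1"}, matched_tag="v1"))   # v1
print(version_of_defined("foo@v1", set(), shared=False))     # v1
```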
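Recommended usage
My personal recommendations:
To define a default version symbol, don't use .symver. Just list the symbol in the corresponding version node of the version script.
If you really want to use .symver, use .symver foo, foo@@@v2 so that the .o contains only foo@@v2 and no foo.
To define a non-default version symbol, add a suffix to the original symbol name (.symver foo_v1, foo@v1) to avoid a conflict with foo. The .o then contains both foo_v1 and foo@v1. There is currently no convenient way to drop the (usually unwanted) foo_v1, so make sure the version script sets foo_v1 to local.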
cat > a.c <<e
% readelf -W --dyn-syms a.so | grep @
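Undefined versioned symbols are usually bound at link time, so object files do not need to name the version. If you really need to reference a specific version, .symver foo, foo@@@v1 is recommended, even though .symver foo, foo@v1 achieves the same effect.
Most of the time you want an undefined symbol to be bound to the default version at link time. There is usually no need for .symver.
If you really need to reference a specific version, either .symver foo_v1, foo@@@v1 or .symver foo_v1, foo@v1 works.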
cat > b.c <<e
The undefined symbol is bound to the non-default version foo@v1:
% readelf -W --dyn-syms b.so | grep foo
5: 0000000000000000 0 FUNC GLOBAL DEFAULT UND foo@v1 (2)
If .symver is omitted, the symbol is bound to the default version foo@@v2.
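rtld behavior
Linux Standard Base Core Specification, Generic Part describes the ld.so behavior. kan added symbol versioning support to FreeBSD rtld in 2005.
DT_VERNEED and DT_VERNEEDNUM in the dynamic table identify the external version definitions required by a shared object or executable, and which shared object (named by Verneed::vn_file) must provide each definition.
If a Vernaux entry (attached to a Verneed) does not have the VER_FLG_WEAK flag, and the DT_VERDEF table of the target shared object exists but does not define the required version, ld.so reports an error.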
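Minor releases usually do not bump the soname. Suppose libB.so depends on libA 1.3 (soname libA.so.1) and uses an API (function) that does not exist in version 1.2. With PLT lazy binding, libB.so appears to work on a system with libA 1.2 installed, until the PLT entry of the 1.3 function is called.
Without symbol versioning, solving this problem requires recording the minor version number in the soname (libA.so.1.3). With symbol versioning, libA.so.1 can still be used: ld.so reports an error on a system with libA 1.2 installed, because the 1.3 version required by the DT_VERNEED of libB.so does not exist.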
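The load-time check can be modeled like this. It is a simplified Python sketch of the rule described above, not glibc code; the dictionaries stand in for the Verneed/Vernaux and Verdef tables, and the version names LIBA_1.2 and LIBA_1.3 are invented for the libA/libB scenario.

```python
def check_verneed(needed, loaded):
    """Return error messages for version requirements that cannot be met.

    needed: {soname: [(version, is_weak), ...]}   (models Verneed/Vernaux)
    loaded: {soname: set of defined versions, or None if the object
             has no DT_VERDEF}
    """
    errors = []
    for soname, requirements in needed.items():
        if soname not in loaded:
            continue  # a missing library is a separate error, not modeled here
        defined = loaded[soname]
        if defined is None:
            continue  # no DT_VERDEF table: nothing to check
        for version, is_weak in requirements:
            if version not in defined and not is_weak:
                errors.append(f"version {version} not found in {soname}")
    return errors


lib_b_needs = {"libA.so.1": [("LIBA_1.3", False)]}
print(check_verneed(lib_b_needs, {"libA.so.1": {"LIBA_1.2"}}))
print(check_verneed(lib_b_needs, {"libA.so.1": {"LIBA_1.2", "LIBA_1.3"}}))
```

The first call reports one error (libA 1.2 is installed), the second reports none. The failure happens when libB.so is loaded, not when the missing function is first called through the PLT.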
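When searching for a definition of foo:
- For an object without DT_VERDEF
  - The reference can bind to a definition of foo
- For an object with DT_VERDEF
  - The reference can bind to a foo defined with version VER_NDX_GLOBAL
  - The reference can bind to a foo defined with any default version
  - During relocation resolving (not dlvsym), the reference can bind to a foo defined with the non-default version of index 2
Note that the last case (an undefined foo resolving to foo@v1 with version index 2) is allowed by ld.so but rejected by the linker {reject-non-default}.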
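This rtld behavior means that versions can be added to a .so library while keeping compatibility.
Suppose a new release wants to retire the unversioned bar. It can remove bar and define bar@compat instead. An undefined bar in a library that depends on this .so still resolves at run time, but that library can no longer be used to link other programs.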
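When searching for a definition of foo@v1:
- For an object without DT_VERDEF
  - The reference can bind to a definition of foo
- For an object with DT_VERDEF
  - The reference can bind to a definition of foo@v1 or foo@@v1
  - During relocation resolving (not dlvsym), the reference can bind to a foo defined with version VER_NDX_GLOBAL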
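Suppose b.so references malloc@GLIBC_2.2.5, and the executable defines malloc because it contains a malloc implementation. At run time, malloc@GLIBC_2.2.5 in b.so binds to the executable.
Address/memory/thread sanitizers take advantage of this behavior: shared objects do not need to link against the interceptors. Only the executable does.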
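Both lookup tables can be folded into one predicate. The Python sketch below is a model of the rules listed above, not glibc's or FreeBSD's lookup code; the function name can_bind and the tuple layout of a definition are assumptions made for this example.

```python
VER_NDX_GLOBAL = 1  # version index of an unversioned definition


def can_bind(ref_version, definition, obj_has_verdef, relocating=True):
    """Decide whether a reference may bind to a candidate definition.

    ref_version:    None for undefined foo, 'v1' for undefined foo@v1
    definition:     (version_name, version_index, is_default); an unversioned
                    definition is (None, VER_NDX_GLOBAL, False)
    obj_has_verdef: whether the defining object has DT_VERDEF
    relocating:     True during relocation resolving, False for dlvsym
    """
    name, index, is_default = definition
    if not obj_has_verdef:
        return True                      # only a plain foo can exist here
    if ref_version is None:
        if index == VER_NDX_GLOBAL or is_default:
            return True
        # Non-default version with index 2, relocation resolving only.
        return relocating and index == 2
    if name == ref_version:
        return True                      # foo@v1 or foo@@v1
    return relocating and index == VER_NDX_GLOBAL


# b.so references malloc@GLIBC_2.2.5; the executable defines a plain malloc
# and, in this model, has no DT_VERDEF.
print(can_bind("GLIBC_2.2.5", (None, VER_NDX_GLOBAL, False), obj_has_verdef=False))  # True
```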
Upgraded symbols in glibc
Note that nm -D from binutils before 2.35 does not display @ and @@.
nm -D /lib/x86_64-linux-gnu/libc.so.6 | \
Output on my x86-64 system:
pthread_cond_broadcast @GLIBC_2.2.5 @@GLIBC_2.3.2
clock_nanosleep @@GLIBC_2.17 @GLIBC_2.2.5
_sys_siglist @@GLIBC_2.3.3 @GLIBC_2.2.5
sys_errlist @@GLIBC_2.12 @GLIBC_2.2.5 @GLIBC_2.3 @GLIBC_2.4
quick_exit @GLIBC_2.10 @@GLIBC_2.24
memcpy @@GLIBC_2.14 @GLIBC_2.2.5
regexec @GLIBC_2.2.5 @@GLIBC_2.3.4
pthread_cond_destroy @GLIBC_2.2.5 @@GLIBC_2.3.2
nftw @GLIBC_2.2.5 @@GLIBC_2.3.3
pthread_cond_timedwait @@GLIBC_2.3.2 @GLIBC_2.2.5
clock_getres @GLIBC_2.2.5 @@GLIBC_2.17
pthread_cond_signal @@GLIBC_2.3.2 @GLIBC_2.2.5
fmemopen @GLIBC_2.2.5 @@GLIBC_2.22
pthread_cond_init @GLIBC_2.2.5 @@GLIBC_2.3.2
clock_gettime @GLIBC_2.2.5 @@GLIBC_2.17
sched_setaffinity @GLIBC_2.3.3 @@GLIBC_2.3.4
glob @@GLIBC_2.27 @GLIBC_2.2.5
sys_nerr @GLIBC_2.2.5 @GLIBC_2.4 @@GLIBC_2.12 @GLIBC_2.3
_sys_errlist @GLIBC_2.3 @GLIBC_2.4 @@GLIBC_2.12 @GLIBC_2.2.5
sys_siglist @GLIBC_2.2.5 @@GLIBC_2.3.3
clock_getcpuclockid @GLIBC_2.2.5 @@GLIBC_2.17
realpath @GLIBC_2.2.5 @@GLIBC_2.3
sys_sigabbrev @GLIBC_2.2.5 @@GLIBC_2.3.3
posix_spawnp @@GLIBC_2.15 @GLIBC_2.2.5
posix_spawn @@GLIBC_2.15 @GLIBC_2.2.5
_sys_nerr @@GLIBC_2.12 @GLIBC_2.4 @GLIBC_2.3 @GLIBC_2.2.5
nftw64 @GLIBC_2.2.5 @@GLIBC_2.3.3
pthread_cond_wait @GLIBC_2.2.5 @@GLIBC_2.3.2
sched_getaffinity @GLIBC_2.3.3 @@GLIBC_2.3.4
clock_settime @GLIBC_2.2.5 @@GLIBC_2.17
glob64 @@GLIBC_2.27 @GLIBC_2.2.5
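A listing of this shape can also be produced with a short script. The Python sketch below groups versions by symbol name from nm -D text; it assumes the binutils 2.35+ output format in which the last field looks like name@VERSION or name@@VERSION. The addresses in the sample are made up.

```python
from collections import defaultdict


def upgraded_symbols(nm_output):
    """Return {name: [versions]} for symbols that have more than one version."""
    versions = defaultdict(list)
    for line in nm_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        name, sep, version = fields[-1].partition("@")
        if sep:
            # Keep the @ / @@ prefix so default versions stay recognizable.
            versions[name].append("@" + version)
    return {name: v for name, v in versions.items() if len(v) > 1}


sample = """\
00000000000a0000 T memcpy@GLIBC_2.2.5
00000000000a1000 i memcpy@@GLIBC_2.14
00000000000b0000 T strlen@@GLIBC_2.2.5
"""
for name, v in upgraded_symbols(sample).items():
    print(name, *v)   # memcpy @GLIBC_2.2.5 @@GLIBC_2.14
```

To run it on a real library, the text can come from subprocess.run(["nm", "-D", path], capture_output=True, text=True).stdout.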
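A new version must be added when the ABI of a symbol changes (for example, the type of a parameter or of the return value changes). A new version is sometimes also added when the API behavior changes, for example:
- realpath@@GLIBC_2.3: the previous realpath returned EINVAL when the second parameter was NULL
- memcpy@@GLIBC_2.14 BZ12518: the previous memcpy copied forward. Shockwave Flash at that time had a memcpy downward bug that was triggered once memcpy adopted a more complicated copy strategy
- quick_exit@@GLIBC_2.24 BZ20198: the previous quick_exit called the destructors of thread_local objects
- glob64@@GLIBC_2.27: the previous glob did not follow dangling symlinks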
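Removing symbol versioning
This follows from the [reject] rule mentioned earlier: if in.so references foo@v1 but only an unversioned definition of foo exists, linking an executable against in.so fails because of the default --no-allow-shlib-undefined behavior. The error message from ld.lld:
ld.lld: error: undefined reference to foo@v1 [--no-allow-shlib-undefined]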
If for whatever reason you must reuse in.so, a rather hacky workaround is:
cp in.so out.so
r2 -wqc '/x feffff6f00000000 @ section..dynamic; w0 16 @ hit0_0' out.so
llvm-objcopy -R .gnu.version out.so
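With .gnu.version removed, the linker considers out.so to reference foo rather than foo@v1.
llvm-objcopy zeroes the corresponding region when it removes sections. If the resulting out.so is used at run time, glibc reports the error unsupported version 0 of Verneed record.
To use the file at run time, the DT_VER* entries can be removed from the dynamic table. The r2 command above locates DT_VERNEED (0x6ffffffe) and rewrites it to DT_NULL (parsing of the dynamic table stops at DT_NULL). The output of readelf -d looks roughly like: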
0x000000006ffffffb (FLAGS_1) Flags: NOW
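What the r2 command does to the dynamic table can be reproduced in a few lines of Python. This sketch works on the raw bytes of a little-endian ELF64 .dynamic section; locating that section inside out.so is not shown, and the function names are invented for this example.

```python
import struct

DT_NULL = 0
DT_VERNEED = 0x6FFFFFFE
DYN = struct.Struct("<qQ")  # Elf64_Dyn: d_tag, d_un (little-endian, 16 bytes)


def zero_verneed(dynamic):
    """Return a copy of the .dynamic bytes with the DT_VERNEED entry zeroed.

    Zeroing the 16-byte entry turns it into DT_NULL, so loaders stop
    parsing the table there and ignore every later entry.
    """
    out = bytearray(dynamic)
    for off in range(0, len(out) - DYN.size + 1, DYN.size):
        tag, _ = DYN.unpack_from(out, off)
        if tag == DT_NULL:
            break
        if tag == DT_VERNEED:
            out[off:off + DYN.size] = bytes(DYN.size)
            break
    return bytes(out)


def visible_tags(dynamic):
    """List the d_tag values a loader sees before the first DT_NULL."""
    tags = []
    for off in range(0, len(dynamic) - DYN.size + 1, DYN.size):
        tag, _ = DYN.unpack_from(dynamic, off)
        if tag == DT_NULL:
            break
        tags.append(tag)
    return tags
```

Like the r2 one-liner, this only works as intended when every entry that must stay visible precedes DT_VERNEED in the table.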
ld.lld
- If an undefined symbol is not defined by a shared object, GNU ld will report an error. ld.lld before 12.0 did not error (I fixed it in https://reviews.llvm.org/D92260).
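Remarks
GCC/Clang support asm specifiers and #pragma redefine_extname for renaming a symbol. For example, if we declare int foo() asm("foo_v1"); and then reference foo, the symbol in the .o will be foo_v1.
As an example, the biggest change in musl 1.2.0 is time64 support for 32-bit architectures. musl adopted a scheme based on asm specifiers:
// include/features.h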
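- In the .o, the time32 definition is still named utimes, which keeps ABI compatibility with old programs. The time64 definition is named __utimes_time64
- The public header uses an asm specifier to redirect utimes to __utimes_time64
  - The downside is that a user who declares utimes by hand, without including the public header, gets the deprecated time32 definition. Declaring functions by hand like this is discouraged
- In the internal implementation, the "pretty" name utimes denotes the time64 definition, and the "ugly" name __utimes_time32 denotes the deprecated time32 definition
  - If the time32 implementation is called by other functions, the "ugly" name makes it obvious that the call site is related to the deprecated time32 definition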
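For the example above, an implementation based on symbol versioning would look roughly like this:
// API header include/sys/time.h
Note that .symver cannot use @@@ here. The header is also used by the translation unit that contains the definition, where @@@ would produce a default version definition, whereas we want a non-default version.
As discussed in the assembler behavior section, the unsatisfactory part is that the symbol __utimes_time32 also exists in the defining translation unit. Remember to localize it with a version script when linking.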
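So what is the point of symbol versioning? After careful thought, I see the following advantages:
- It rejects linking against old symbols without hindering run-time symbol resolution (reject-non-default)
- Declarations need no annotations
- The choice of version can be deferred until link time. The version script provides a flexible pattern matching mechanism for assigning versions
- Scope reduction. However, another mechanism similar to --dynamic-list might be invented for localizing symbols
- For builtin functions known to the compiler, renaming has some semantic issues in the GCC/Clang implementations (the symbol foo carries builtin semantics X). See 2020-10-15-intra-call-and-libc-symbol-renaming
- [verneed-check]
Regarding the first item, the asm specifier scheme relies on a convention to avoid accidental linking (users should include the header), whereas symbol versioning lets ld enforce it.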
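Design drawbacks:
- The behavior of .symver foo, foo@v1 when foo is defined [gas-copy]: the symbol foo is retained (a redundant symbol at link time), and binding/st_other are kept in sync (which makes it inconvenient to set a different binding or visibility)
- Verdaux is somewhat redundant. In practice a Verdef has only one Verdaux
- The Verneed/Vernaux structure ties a version to the soname of the shared object that provides the definition. ld.so requires that "a versioned symbol is implemented in the same shared object in which it was found at link time", which makes it inconvenient to move defined symbols between shared objects. Fortunately, glibc 2.30 BZ24741 relaxed this requirement, effectively ignoring the recorded file name (Verneed::vn_file)
Before that, glibc moved the clock_* functions from librt.so to libc.so with the following approach:
// rt/clock-compat.c
libc.so defines __clock_getres and clock_getres. librt.so uses an ifunc named clock_getres that forwards to __clock_getres in libc.so.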