In 1995, Solaris' link editor and ld.so introduced the symbol versioning mechanism. Ulrich Drepper and Eric Youngdale borrowed Solaris symbol versioning in 1997 and designed the GNU style symbol versioning for glibc.

When a shared object is updated and the behavior of a symbol changes (an ABI change, such as a changed parameter or return type, or a semantic change), a DT_SONAME bump is traditionally required. Otherwise a dependent application or shared object built against the old version may misbehave. This is inconvenient when the number of dependent applications is large.

Symbol versioning provides backward compatibility without changing DT_SONAME.

The following sections describe the representation first, and then the behavior from the perspectives of the assembler, the linker, and ld.so. You may wish to skip the representation part on a first reading.

## Representation

In a shared object or executable file that uses symbol versioning, there are up to three sections related to symbol versioning. Of these, .gnu.version_r and .gnu.version_d are optional:

• .gnu.version (version symbol section). The DT_VERSYM tag in the dynamic table points to this section. Assuming there are N entries in .dynsym, .gnu.version contains N uint16_t values, with the i-th entry indicating the version ID of the i-th symbol. Put another way, .gnu.version is a parallel table to .dynsym.
• .gnu.version_r (version requirement section). The DT_VERNEED/DT_VERNEEDNUM tags in the dynamic table delimit this section. This section describes the version information used by the undefined versioned symbols in the module.
• .gnu.version_d (version definition section). The DT_VERDEF/DT_VERDEFNUM tags in the dynamic table delimit this section. This section describes the version information used by the defined versioned symbols in the module.

Currently GNU ld does not set the VER_FLG_WEAK flag. BZ24718#c15 proposed "set VER_FLG_WEAK on version reference if all symbols are weak".

The advantage of using a parallel table for .gnu.version is that symbol versioning is optional. ld.so implementations which do not support symbol versioning can freely assume that no symbol has a version. The effect is as if all references bind to the default version definitions. musl ld.so falls into this category.

### Version index values

Index 0 is called VER_NDX_LOCAL. A symbol with this index has its binding changed to STB_LOCAL. Index 1 is called VER_NDX_GLOBAL. It has no special effect and is used for unversioned symbols. Indexes 2 to 0xffef are used for user-defined versions.

Defined versioned symbols have two forms:

• foo@@v2, the default version.
• foo@v2, a non-default version (hidden version). The VERSYM_HIDDEN bit of the version ID is set.

Undefined versioned symbols have only the foo@v2 form.
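As a rough illustration of the parallel table and the hidden bit, here is a minimal Python sketch that decodes raw .gnu.version bytes. The constant values VERSYM_HIDDEN (0x8000) and VERSYM_VERSION (0x7fff) are taken from glibc's elf.h; the helper names and the four-entry table are made up for the example, and the default/non-default labels only make sense for defined symbols.

```python
import struct

VER_NDX_LOCAL = 0
VER_NDX_GLOBAL = 1
VERSYM_HIDDEN = 0x8000   # set for non-default (foo@v) definitions
VERSYM_VERSION = 0x7fff  # mask selecting the version index

def decode_versym(data, little_endian=True):
    """Decode .gnu.version bytes into (index, hidden) pairs, one per .dynsym entry."""
    count = len(data) // 2
    fmt = ("<" if little_endian else ">") + f"{count}H"
    values = struct.unpack(fmt, data[:count * 2])
    return [(v & VERSYM_VERSION, bool(v & VERSYM_HIDDEN)) for v in values]

def describe(index, hidden):
    """Human-readable label for the version ID of a defined symbol."""
    if index == VER_NDX_LOCAL:
        return "local"
    if index == VER_NDX_GLOBAL:
        return "global (unversioned)"
    return f"version {index} " + ("(non-default)" if hidden else "(default)")

# Hypothetical table for four .dynsym entries.
raw = struct.pack("<4H", 0, 1, 2, 0x8003)
entries = decode_versym(raw)
labels = [describe(i, h) for i, h in entries]
```

The i-th pair belongs to the i-th .dynsym entry, which is all that "parallel table" means here.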

Usually versioned symbols are only defined in shared objects, but executables can have defined versioned symbols as well. (When a shared object is updated, the old symbols are retained so that other shared objects do not need to be relinked, and executable files usually do not provide versioned symbols for other shared objects to reference.)

### Example

readelf -V can dump the symbol versioning tables.

In the .gnu.version_d output below:

• Version index 1 (VER_NDX_GLOBAL) is the filename (soname if shared object). The VER_FLG_BASE flag is set.
• Version index 2 is a user defined version. Its name is LUA_5.3.

In the .gnu.version_r output below, each of the version indexes 3 to 10 represents a version in a shared object that this one depends on. The name GLIBC_2.2.5 appears thrice, each time for a different shared object.

The .gnu.version table assigns a version index to each .dynsym entry.
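To make the .gnu.version_d description more concrete, here is a small Python sketch that walks a chain of Verdef entries and reads the first Verdaux name of each. The field layout follows the Elf64_Verdef/Elf64_Verdaux definitions in elf.h (little-endian assumed). The section bytes are synthetic, the soname liblua.so.5.3 is a made-up placeholder, and vd_hash is left as 0, whereas a real file stores the ELF hash of the name.

```python
import struct

VER_FLG_BASE = 0x1
# vd_version, vd_flags, vd_ndx, vd_cnt, vd_hash, vd_aux, vd_next
VERDEF = struct.Struct("<HHHHIII")
# vda_name, vda_next
VERDAUX = struct.Struct("<II")

def cstr(strtab, offset):
    """Read a NUL-terminated string from a string table."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode()

def parse_verdef(section, strtab):
    """Return (index, name, is_base) for each Verdef in a non-empty section."""
    result = []
    offset = 0
    while True:
        _, flags, ndx, _, _, aux, nxt = VERDEF.unpack_from(section, offset)
        name_offset, _ = VERDAUX.unpack_from(section, offset + aux)
        result.append((ndx, cstr(strtab, name_offset), bool(flags & VER_FLG_BASE)))
        if nxt == 0:          # vd_next == 0 marks the last entry
            break
        offset += nxt
    return result

# Synthetic string table and section (hypothetical soname).
strtab = b"\0liblua.so.5.3\0LUA_5.3\0"
base_name = strtab.index(b"liblua.so.5.3")
user_name = strtab.index(b"LUA_5.3")
entry_size = VERDEF.size + VERDAUX.size
section = (
    VERDEF.pack(1, VER_FLG_BASE, 1, 1, 0, VERDEF.size, entry_size)
    + VERDAUX.pack(base_name, 0)
    + VERDEF.pack(1, 0, 2, 1, 0, VERDEF.size, 0)
    + VERDAUX.pack(user_name, 0)
)
versions = parse_verdef(section, strtab)
```

Only the first Verdaux of each Verdef is read, which matches the common case of one auxiliary entry per definition.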

### Symbol versioning in object files

The GNU scheme allows .symver directives to label the versions of the symbols in object files. The names of such symbols in a .o file contain @ or @@.

## Assembler behavior

GNU as and the LLVM integrated assembler both implement the .symver directive.

• .symver foo, foo@v1
  • If foo is undefined, produce foo@v1
  • If foo is defined, produce foo and foo@v1 with the same binding (STB_LOCAL, STB_WEAK, or STB_GLOBAL) and st_other value (i.e. the same visibility). Personally I think this behavior is a design flaw {gas-copy}. The proposed V4 PATCH gas: Extend .symver directive can address this problem.
• .symver foo, foo@@v1
  • If foo is undefined, error
  • If foo is defined, produce foo and foo@@v1 with the same binding and st_other value.
• .symver foo, foo@@@v1
  • If foo is undefined, produce foo@v1
  • If foo is defined, produce foo@@v1

With GNU as 2.35 (PR25295) or Clang 13:

• .symver foo, foo@v1, remove
  • If foo is undefined, produce foo@v1
  • If foo is defined, produce foo@v1
  • This is the recommended way to define a non-default version symbol.
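The cases above can be summarized as a small table-like function. This is only a Python model of which symbol names end up in the .o, written from the rules listed here; symver_symbols is a hypothetical helper, not an assembler API, and the error message is invented rather than the assembler's real diagnostic.

```python
def symver_symbols(name, spec, defined, remove=False):
    """Model the symbols emitted for `.symver name, spec[, remove]`.

    name:    the original symbol (e.g. "foo" or "foo_v1")
    spec:    "foo@v1", "foo@@v1" or "foo@@@v1"
    defined: whether `name` is defined in this object file
    """
    base = spec.split("@", 1)[0]
    version = spec.rsplit("@", 1)[1]
    ats = spec.count("@")

    if ats == 3:
        # foo@@@v1: default version if defined, plain reference otherwise.
        return [f"{base}@@{version}" if defined else f"{base}@{version}"]
    if ats == 2:
        # foo@@v1 needs a definition.
        if not defined:
            raise ValueError("foo@@v1 used with an undefined symbol")
        versioned = f"{base}@@{version}"
    else:
        versioned = f"{base}@{version}"
        if not defined:
            return [versioned]
    # Without `remove`, the original symbol stays next to the versioned one.
    return [versioned] if remove else [name, versioned]

kept = symver_symbols("foo_v1", "foo@v1", defined=True)
removed = symver_symbols("foo_v1", "foo@v1", defined=True, remove=True)
```

Comparing kept and removed shows why the remove form is attractive: the original symbol no longer lingers in the object file.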

Personal recommendation:

• To define a default version symbol: use .symver foo, foo@@@v2 so that foo is not present. If you require binutils>=2.35 or Clang>=13, .symver foo, foo@@v2, remove works as well.
• To define a non-default version symbol, add a suffix to the original symbol name (.symver foo_v1, foo@v1) to prevent conflicts with foo. This will however leave the (usually undesirable) foo_v1 behind. If you don't strip foo_v1 from the object file, you may localize it with a local: pattern in the version script. With a newer toolchain, you can use .symver foo_v1, foo@v1, remove.
• The version of an undefined symbol is usually bound at link time. It is usually unnecessary to set the version with .symver. If you really need to set a version, either .symver foo_v1, foo@@@v1 or .symver foo_v1, foo@v1 is fine.

## Linker behavior

The linker enters the symbol resolution stage after reading in object files, archive files, shared objects, LTO files, linker scripts, etc.

GNU ld uses indirect symbols to represent versioned symbols. The rules are complicated and undocumented. The symbol resolution rules that I personally derived:

• Defined foo resolves undefined foo (traditional unversioned rule)
• Defined foo@v1 resolves undefined foo@v1 (a non-default version symbol is like a separate symbol)
• Defined foo@@v1 (default version) resolves both undefined foo and foo@v1
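The three rules can be written down as a predicate. The following Python sketch is only a model of the rules as listed, not of GNU ld's actual indirect-symbol machinery; split and linker_resolves are hypothetical names, and any combination not covered by the three rules is simply treated as "does not resolve".

```python
def split(symbol):
    """Split 'foo', 'foo@v1' or 'foo@@v1' into (name, version, is_default)."""
    name, at, rest = symbol.partition("@")
    if not at:
        return name, None, False
    if rest.startswith("@"):
        return name, rest[1:], True
    return name, rest, False

def linker_resolves(definition, reference):
    """Whether a definition satisfies an undefined reference at link time."""
    def_name, def_version, is_default = split(definition)
    ref_name, ref_version, _ = split(reference)
    if def_name != ref_name:
        return False
    if def_version is None:
        # Traditional unversioned rule: foo resolves foo.
        return ref_version is None
    if is_default:
        # foo@@v1 resolves both foo and foo@v1.
        return ref_version is None or ref_version == def_version
    # foo@v1 behaves like a separate symbol and resolves only foo@v1.
    return ref_version == def_version

default_resolves_plain = linker_resolves("foo@@v1", "foo")
hidden_resolves_plain = linker_resolves("foo@v1", "foo")
```

The last two lines show the asymmetry: a default version satisfies an unversioned reference, while a non-default version does not.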

If there are multiple default version definitions (such as foo@@v1 foo@@v2), a duplicate definition error should be issued even if one is weak. Usually a symbol has zero or one default version (@@) definition, and an arbitrary number of non-default version (@) definitions.

If the linker sees undefined foo and foo@v1 first, it will treat them as two symbols. When the linker later sees the definition foo@@v1, conceptually foo and foo@@v1 should be combined. If the linker sees foo@@v2 instead, foo@@v2 should resolve foo, and foo@v1 should remain a separate symbol.

• Combining Versions describes the problem.
• gold/symtab.cc Symbol_table::define_default_version uses a heuristic rule to solve this problem. It special cases on visibility, but I feel that this rule is unneeded.
• Before 2.36, GNU ld reported a bogus multiple definition error for defined weak foo@@v1 and defined global foo@v1: PR ld/26978
• Before 2.36, GNU ld had a bug where the visibility of undefined foo@v1 did not affect the output visibility of foo@@v1: PR ld/26979
• I fixed the object file side of the problem for LLD 12.0 in https://reviews.llvm.org/D92259. Archive files and lazy object files may still have incompatibility issues.

When LLD sees a defined foo@@v1, it adds both foo and foo@v1 into the symbol table, thus foo@@v1 can resolve both undefined foo and foo@v1. After processing all input files, a pass iterates over the symbols and redirects foo@v1 to foo@@v1. Because LLD treats them as separate symbols during input processing, a defined foo@v1 cannot suppress the extraction of an archive member defining foo@@v1, leading to a behavior incompatible with GNU ld. This probably does not matter, though.

GNU ld has another strange behavior: if both foo and foo@v1 are defined, foo will be removed. I strongly believe it is an issue in GNU ld but the maintainer rejected PR ld/27210.

### Version script

To define a versioned symbol in a shared object or an executable, a version script must be specified. If all versioned symbols are undefined, then the version script can be omitted.

A version script has three purposes:

• Define versions.
• Specify some patterns so that matched defined symbols (which do not have @ in the name) are tied to the specified version.
• Scope reduction: for a defined unversioned symbol matched by a local: pattern, its binding will be changed to STB_LOCAL and will not be exported to the dynamic symbol table.

A version script consists of one anonymous version tag ({...};) or a list of named version tags (v1 {...};). If you use an anonymous version tag with other version tags, GNU ld will error: anonymous version tag cannot be combined with other version tags. A local: part can be placed in any version tag. Which version tag is used does not matter.

If a defined symbol is matched by multiple version tags, the following precedence rules apply (binutils-gdb/bfd/linker.c:find_version_for_sym):

• The first version tag with an exact pattern (i.e. there is no wildcard) wins.
• Otherwise, the last version tag with a non-* wildcard pattern wins.
• Otherwise, the first version tag with a * pattern wins.

The gotcha is that ** is a wildcard pattern which matches any symbol but its precedence is higher than *.
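The precedence rules, including the ** gotcha, can be modeled in a few lines. This Python sketch uses fnmatch as a stand-in for the linker's glob matching and ignores local: patterns and language-specific matching such as extern "C++"; find_version and the tag names are hypothetical, not taken from binutils.

```python
from fnmatch import fnmatchcase

def is_wildcard(pattern):
    """A pattern is exact when it contains no glob metacharacter."""
    return any(c in pattern for c in "*?[")

def find_version(symbol, tags):
    """Pick the version tag for a symbol.

    tags: ordered list of (tag_name, [patterns]) as written in the script.
    """
    last_wildcard = None   # last tag with a matching non-* wildcard pattern
    first_star = None      # first tag with a bare * pattern
    for tag, patterns in tags:
        for pattern in patterns:
            if not is_wildcard(pattern):
                if pattern == symbol:
                    return tag            # first exact match wins outright
            elif pattern == "*":
                if first_star is None:
                    first_star = tag
            elif fnmatchcase(symbol, pattern):
                last_wildcard = tag
    return last_wildcard if last_wildcard is not None else first_star

tags = [("v1", ["*"]), ("v2", ["f*"]), ("v3", ["fo*"]), ("v4", ["foo"])]
gotcha = [("v1", ["*"]), ("v2", ["**"])]
```

With the gotcha script, every symbol lands in v2 even though v1 lists * first, because ** counts as a non-* wildcard pattern.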

Most patterns are exact so gold and LLD iterate patterns instead of symbols to improve performance.

### How a versioned symbol is produced

An undefined symbol can be assigned a version if:

• its name does not contain @ (.symver is unused) and a shared object provides a default version definition.
• its name contains @ and a shared object defines the symbol. GNU ld errors if there is no such shared object. After https://reviews.llvm.org/D92260, LLD reports an error as well.

A defined symbol can be assigned a version if:

• its name does not contain @ and it is matched by a pattern in a named version tag in a version script.
• its name contains @
  • If -shared, the version should be defined by a version script, otherwise GNU ld errors version node not found for symbol. This exception looks strange to me so I have filed PR ld/26980.
  • If -no-pie or -pie, a version definition is unneeded in GNU ld. This behavior is strange.

## ld.so behavior

Linux Standard Base Core Specification, Generic Part describes the behavior of ld.so. Kan added symbol versioning support to FreeBSD rtld in 2005.

The DT_VERNEED and DT_VERNEEDNUM tags in the dynamic table delimit the version requirements of a shared object or executable file: the required versions and the names of the shared objects expected to provide them (Vernaux::vna_name).

For each Vernaux entry (an auxiliary entry of a Verneed) without the VER_FLG_WEAK bit, ld.so checks whether the referenced shared object has a DT_VERDEF table. If it does not, ld.so treats the case as a graceful degradation; if it does and the table does not define the version, ld.so reports an error. [verneed-check]

Usually a minor release does not bump the soname. Suppose that libB.so depends on libA 1.3 (soname is libA.so.1) and calls a function which does not exist in libA 1.2. If PLT lazy binding is used, libB.so may seem to work on a system with libA 1.2, until the PLT entry of the 1.3 symbol is called. If symbol versioning is not used and you want to solve this problem, you have to record the minor version number (libA.so.1.3) in the soname. However, bumping the soname is all-or-nothing: all the dependent shared objects need to be relinked. If symbol versioning is used, you can continue to use the soname libA.so.1. ld.so will report an error if libA 1.2 is used, because the 1.3 version required by libB.so does not exist.
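The startup check can be sketched as follows. This is a Python model of the rule described above, not glibc's code: check_verneed is a hypothetical helper, the version names LIBA_1.2 and LIBA_1.3 are invented for the libA example, and the error text is made up. VER_FLG_WEAK is 0x2 in elf.h.

```python
VER_FLG_WEAK = 0x2

def check_verneed(needs, loaded):
    """Model the ld.so version requirement check.

    needs:  list of (soname, version, flags), one per Vernaux entry.
    loaded: dict mapping soname to the set of version names in its
            DT_VERDEF table, or None if the object has no DT_VERDEF.
            Every needed soname is assumed to be loaded.
    """
    errors = []
    for soname, version, flags in needs:
        if flags & VER_FLG_WEAK:
            continue                  # weak requirements are not checked
        defined = loaded[soname]
        if defined is None:
            continue                  # unversioned library: graceful degradation
        if version not in defined:
            errors.append(f"{soname}: version {version} not found")
    return errors

# libB.so was linked against libA 1.3 (hypothetical version names).
libB_needs = [("libA.so.1", "LIBA_1.2", 0), ("libA.so.1", "LIBA_1.3", 0)]
with_libA_1_3 = check_verneed(libB_needs, {"libA.so.1": {"LIBA_1.2", "LIBA_1.3"}})
with_libA_1_2 = check_verneed(libB_needs, {"libA.so.1": {"LIBA_1.2"}})
with_unversioned = check_verneed(libB_needs, {"libA.so.1": None})
```

Running against libA 1.2 fails up front instead of at the first lazy PLT call, which is the point of the paragraph above.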

In the symbol resolution stage:

• An undefined foo can be resolved to a definition of foo or foo@@v2 (only the definitions with index number 1 (VER_NDX_GLOBAL) and 2 are used in the reference match).
• An undefined foo@v1 can be resolved to a definition of foo, foo@v1, or foo@@v1.

Note that an undefined foo resolving to foo@v1 is allowed by ld.so but not by the linker {reject-non-default}. This difference provides a mechanism to refuse linking against old symbols while keeping compatibility with unversioned old libraries. If a new version of a shared object needs to deprecate an unversioned bar, you can remove bar and define bar@compat instead. Libraries using bar are unaffected but new links against bar are disallowed.

## Upgraded symbols in glibc

Note that GNU nm before binutils 2.35 does not display @ or @@.

The output on my x86-64 system:

• realpath@@GLIBC_2.3: the previous version returns EINVAL when the second parameter is NULL
• memcpy@@GLIBC_2.14 BZ12518: the previous version guarantees a forward copying behavior. Shockwave Flash at that time had a "memcpy downward" bug which required the workaround.
• quick_exit@@GLIBC_2.24 BZ20198: the previous version calls the destructors of thread_local objects.
• glob64@@GLIBC_2.27: the previous version does not follow dangling symlinks.

## How to remove symbol versioning

Imagine that you want to build an application with a prebuilt shared object which has versioned references, but you can only find shared objects providing the unversioned definitions. The linker will helpfully error:

As the diagnostic suggests, you can add --allow-shlib-undefined to get rid of the error. It is not recommended but the built application may happen to work.

For this case, an alternative hacky solution is:

With the removal of .gnu.version, the linker will think that out.so references foo instead of foo@v1. However, llvm-objcopy will zero out the section contents. At runtime, glibc ld.so will complain: unsupported version 0 of Verneed record. To make glibc happy, you can delete the DT_VER* tags from the dynamic table. The above code snippet uses an r2 command to locate DT_VERNEED (0x6ffffffe) and rewrite it to DT_NULL (a DT_NULL entry stops the parsing of the dynamic table). The difference of the readelf -d output is roughly:
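For readers without r2 at hand, the same idea can be sketched in Python on the raw bytes of a .dynamic section. This is an illustration under assumptions, not the command used above: it assumes a little-endian ELF64 file (16-byte Elf64_Dyn entries), operates on synthetic bytes rather than a real file, and cut_at_verneed and live_tags are hypothetical helper names. The tag values come from elf.h.

```python
import struct

DT_NULL = 0
DT_NEEDED = 1
DT_VERSYM = 0x6ffffff0
DT_VERNEED = 0x6ffffffe
DT_VERNEEDNUM = 0x6fffffff
DYN = struct.Struct("<qQ")  # Elf64_Dyn: d_tag, d_un

def cut_at_verneed(dynamic):
    """Return a copy where the first DT_VERNEED entry becomes DT_NULL."""
    out = bytearray(dynamic)
    for offset in range(0, len(out) - DYN.size + 1, DYN.size):
        tag, _ = DYN.unpack_from(out, offset)
        if tag == DT_NULL:
            break
        if tag == DT_VERNEED:
            DYN.pack_into(out, offset, DT_NULL, 0)
            break
    return bytes(out)

def live_tags(dynamic):
    """Tags a loader would see, i.e. everything before the first DT_NULL."""
    tags = []
    for offset in range(0, len(dynamic) - DYN.size + 1, DYN.size):
        tag, _ = DYN.unpack_from(dynamic, offset)
        if tag == DT_NULL:
            break
        tags.append(tag)
    return tags

# Synthetic dynamic table with made-up values.
before = b"".join(DYN.pack(tag, value) for tag, value in [
    (DT_NEEDED, 0x10),
    (DT_VERNEED, 0x400),
    (DT_VERNEEDNUM, 1),
    (DT_VERSYM, 0x3f0),
    (DT_NULL, 0),
])
after = cut_at_verneed(before)
```

Everything after the rewritten entry is hidden from the loader as well, so this only works when the entries following DT_VERNEED are ones you can afford to lose.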

## LLD

• If an undefined symbol is not defined by a shared object, GNU ld will report an error. LLD before 12.0 did not error (I fixed it in https://reviews.llvm.org/D92260).

## Remarks

GCC and Clang support the asm specifier and #pragma redefine_extname for renaming a symbol. For example, if you declare int foo() asm("foo_v1"); and then reference foo, the symbol in .o will be foo_v1.

For example, the biggest change in musl v1.2.0 is the time64 support for its supported 32-bit architectures. musl adopted a scheme based on asm specifiers:

• In .o, the time32 symbol remains utimes and is compatible with the ABI required by programs linked against old musl versions; the time64 symbol is __utimes_time64.
• The public header redirects utimes to __utimes_time64.
  • Cons: if users declare utimes themselves, they will not link against the correct __utimes_time64.
• The "good-looking" name utimes is used for the preferred time64 implementation internally and the "ugly" name __utimes_time32 is used for the legacy time32 implementation.
  • If the time32 implementation is called elsewhere, the "ugly" name can make it stand out.

For the above example, here is an implementation with symbol versioning:

Note that @@@ cannot be used. The header is included in a defining translation unit and @@@ would lead to a default version definition, while we want a non-default version definition.

According to Assembler behavior, the undesirable __utimes_time32 is present. Be careful to use a version script to localize it.

So what is the significance of symbol versioning? After some careful thought:

• Refuse linking against old symbols while keeping compatibility with unversioned old libraries. {reject-non-default}
• No need to label declarations.
• The version definition can be delayed until link time. The version script provides a flexible pattern matching mechanism to assign versions.
• Scope reduction. Arguably another mechanism like --dynamic-list might have been developed if version scripts did not provide local:.
• There are some semantic issues in renaming builtin functions with asm specifiers in GCC and Clang (they do not know that the renamed symbol has built-in semantics). See 2020-10-15-intra-call-and-libc-symbol-renaming
• [verneed-check]

For the first item, the asm specifier scheme relies on conventions to prevent problems (users should include the header), whereas symbol versioning can be enforced by ld.

Design flaws:

• The behavior of .symver foo, foo@v1 when foo is defined {gas-copy}: the symbol foo is retained (a redundant symbol at link time), and binding/st_other are kept in sync (making it inconvenient to set a different binding or visibility).
• Verdaux is a bit redundant. In practice, one Verdef has only one auxiliary Verdaux entry.
• This is arguably a minor problem but annoying for a framework providing multiple shared objects. ld.so requires "a versioned symbol is implemented in the same shared object in which it was found at link time", which disallows moving definitions between shared objects. Fortunately, glibc 2.30 BZ24741 relaxes this requirement, essentially ignoring Vernaux::vna_name.

Before that, glibc used a forwarder to move clock_* functions from librt.so to libc.so:

libc.so defines __clock_getres and clock_getres. librt.so defines an ifunc called clock_getres which forwards to libc.so __clock_getres.

# Chinese version

In 1995, Solaris' link editor and ld.so introduced the symbol versioning mechanism. In 1997, Ulrich Drepper and Eric Youngdale borrowed from Solaris symbol versioning and designed the GNU-style symbol versioning used by glibc.

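## Representation

• .gnu.version (version symbol section). The DT_VERSYM tag in the dynamic table points to this section. Assuming .dynsym has N entries, .gnu.version contains N uint16_t values. The i-th entry describes the version that the i-th dynamic symbol belongs to.
• .gnu.version_r (version requirement section). The DT_VERNEED/DT_VERNEEDNUM tags in the dynamic table delimit this section. It describes the version information used by the undefined versioned symbols in the module.
• .gnu.version_d (version definition section). The DT_VERDEF/DT_VERDEFNUM tags in the dynamic table delimit this section. It records the version information used by the versioned symbols defined in the module.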

### Version index values

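Index 0 is called VER_NDX_LOCAL. The binding of a symbol whose version ID is 0 is changed to STB_LOCAL. Index 1 is called VER_NDX_GLOBAL. It has no special effect and is used for unversioned symbols. Indexes 2 to 0xffef are used for other versions.

• foo@@v2, called the default version.
• foo@v2, called a non-default version, also known as a hidden version. The VERSYM_HIDDEN bit of its version ID is set.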

### Example

readelf -V can dump the symbol versioning tables.

• Version index 1 (VER_NDX_GLOBAL) is the filename (soname if shared object). The VER_FLG_BASE flag is set.
• Version index 2 is a user defined version. Its name is LUA_5.3.

The .gnu.version table assigns a version index to each .dynsym symbol.

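## Assembler behavior

GNU as and the LLVM integrated assembler provide implementations.

• For .symver foo, foo@v1
  • If foo is undefined, the .o contains one symbol named foo@v1.
  • If foo is defined, the .o contains two symbols, foo and foo@v1, with the same binding (both STB_LOCAL, both STB_WEAK, or both STB_GLOBAL) and the same st_other (the same visibility). Personally I consider this behavior a design flaw {gas-copy}.
• For .symver foo, foo@@v1
  • If foo is undefined, the assembler reports an error.
  • If foo is defined, the .o contains two symbols, foo and foo@@v1, with the same binding and st_other.
• For .symver foo, foo@@@v1
  • If foo is undefined, the .o contains one symbol named foo@v1.
  • If foo is defined, the .o contains one symbol named foo@@v1.

• To define a default version symbol, use .symver foo, foo@@@v2, which produces only foo@@v2 in the .o and no foo.
• To define a non-default version symbol, add a suffix to the original symbol name (.symver foo_v1, foo@v1) to prevent a conflict with foo. The .o will contain both foo_v1 and foo@v1. There is currently no convenient way to remove the (usually unwanted) foo_v1; typically you take care to make foo_v1 local when specifying the version script.
• Undefined versioned symbols are usually bound at link time, so object files do not need to specify the version. If you really need to reference one, .symver foo, foo@@@v1 is recommended, even though .symver foo, foo@v1 achieves the same effect.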

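## Linker behavior

GNU ld uses indirect symbols to represent versioned symbols. There are complicated rules at many stages, and none of them are documented. The symbol resolution rules that I personally derived:

• A defined foo satisfies an undefined foo (the traditional unversioned rule).
• A defined foo@v1 satisfies an undefined foo@v1.
• A defined foo@@v1 satisfies both an undefined foo and an undefined foo@v1.

In LLD's implementation, on seeing foo@@v1 in a shared object, both foo and foo@v1 are inserted into the symbol table, so both an undefined foo and an undefined foo@v1 can be satisfied.

• Combining Versions describes this problem.
• gold/symtab.cc Symbol_table::define_default_version handles this problem with a heuristic rule. It special-cases visibility, but I feel this rule may be unnecessary.
• Before 2.36, GNU ld reported a bogus multiple definition error for defined weak foo@@v1 and defined global foo@v1: PR ld/26978
• Before 2.36, GNU ld had a bug where the visibility of undefined foo@v1 did not affect the output visibility of foo@@v1: PR ld/26979
• I fixed the object file side of the problem for LLD 12.0 in https://reviews.llvm.org/D92259. Archive files and lazy object files may still have incompatibility issues.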

### Version script

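A version script has three purposes:

• Define versions.
• Specify patterns so that matching defined, unversioned symbols get the specified version.
• Scope reduction: for a symbol matched by a local: pattern, if it is defined and unversioned, its binding is changed to STB_LOCAL and it is not exported to the dynamic symbol table.

• The version tag of the first exact pattern.
• The version tag of the last wildcard pattern that is not *.
• The first version tag containing *.

Although the ** pattern also matches every symbol, its precedence is higher than that of *.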

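### How a versioned symbol is produced

• The name does not contain @ (.symver is not used): some shared object defines a default version symbol.
• The name contains @: the symbol must be defined by some shared object, otherwise GNU ld reports an error. After https://reviews.llvm.org/D92260, LLD reports an error as well.

• The name does not contain @: it obtains a version by matching a pattern in a named version tag of the version script.
• The name contains @
  • -shared: version v1 must be defined by the version script, otherwise GNU ld reports an error (version node not found for symbol).
  • -no-pie or -pie: GNU ld generates the version definition v1 without needing a version script. This behavior is strange.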

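## ld.so behavior

Linux Standard Base Core Specification, Generic Part describes the behavior of ld.so. kan added symbol versioning support to FreeBSD rtld in 2005.

The DT_VERNEED and DT_VERNEEDNUM tags in the dynamic table identify the external version definitions needed by a shared object or executable, and which shared object (Vernaux::vna_name) must provide each definition. If a Vernaux entry (belonging to a Verneed) does not have the VER_FLG_WEAK flag, and the DT_VERDEF table of the target shared object exists but does not define the needed version, an error is reported.

• An undefined unversioned foo can resolve to a definition of foo or foo@@v2 (the version index of v2 should be 1 (VER_NDX_GLOBAL) or 2).
• An undefined versioned foo@v1 can resolve to a definition of foo, foo@v1, or foo@@v1.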

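## Upgraded symbols in glibc

• realpath@@GLIBC_2.3: the previous realpath returns EINVAL when the second parameter is NULL.
• memcpy@@GLIBC_2.14 BZ12518: the previous memcpy copies forward. Shockwave Flash at the time had a memcpy downward bug that was triggered once memcpy adopted a more complex copy strategy.
• quick_exit@@GLIBC_2.24 BZ20198: the previous quick_exit calls the destructors of thread_local objects.
• glob64@@GLIBC_2.27: the previous glob does not follow dangling symlinks.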

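## How to remove symbol versioning

llvm-objcopy zeroes out the corresponding regions when it removes sections. If the resulting out.so is used at run time, glibc reports unsupported version 0 of Verneed record. If it must be used at run time, you can delete the DT_VER* tags from the dynamic table. The r2 command above locates DT_VERNEED (0x6ffffffe) and rewrites it to DT_NULL (parsing of the dynamic table stops at DT_NULL). The readelf -d output is roughly: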

## LLD

• If an undefined symbol is not defined by a shared object, GNU ld will report an error. LLD before 12.0 did not error (I fixed it in https://reviews.llvm.org/D92260).

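## Remarks

GCC and Clang support the asm specifier and #pragma redefine_extname for renaming a symbol. For example, if you declare int foo() asm("foo_v1"); and then reference foo, the symbol in the .o will be foo_v1.

• In the .o, the time32 definition remains utimes and provides ABI compatibility for old programs; the time64 definition is __utimes_time64.
• The public header uses an asm specifier to redirect utimes to __utimes_time64.
  • The drawback is that if users declare utimes themselves without including the public header, they get the deprecated time32 definition. Declaring it yourself like this is not recommended.
• In the internal implementation, the "good-looking" name utimes denotes the time64 definition, and the "ugly" name __utimes_time32 denotes the deprecated time32 definition.
  • If the time32 implementation is called by other functions, the "ugly" name clearly marks that the call site involves the deprecated time32 definition.

• Refuse linking against old symbols without hindering run-time symbol resolution (reject-non-default).
• No need to label declarations.
• The version definition can be deferred until link time. The version script provides a flexible pattern matching mechanism for assigning versions.
• Scope reduction. However, another mechanism similar to --dynamic-list might have been invented to localize symbols.
• For builtin functions known to the compiler, renaming has some semantic issues in the GCC/Clang implementations (the symbol foo carries built-in semantics X). See 2020-10-15-intra-call-and-libc-symbol-renaming
• [verneed-check]

• The behavior of .symver foo, foo@v1 when foo is defined [gas-copy]: the symbol foo is retained (a redundant symbol at link time), and binding/st_other are kept in sync (making it inconvenient to set a different binding or visibility).
• Verdaux is somewhat redundant. In practice, a Verdef has only one Verdaux.
• The Verneed/Vernaux structure ties a version to the soname of the shared object providing the version definition. ld.so requires "a versioned symbol is implemented in the same shared object in which it was found at link time", which makes it inconvenient to move defined symbols between shared objects. Fortunately, glibc 2.30 BZ24741 relaxed this requirement, essentially ignoring Vernaux::vna_name.

libc.so defines __clock_getres and clock_getres. librt.so uses an ifunc named clock_getres that forwards to __clock_getres in libc.so.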