Everything I know about glibc

UNDER CONSTRUCTION

glibc is an implementation of the user-space side of standard C/POSIX functions with Linux extensions.

Build

Since glibc 2.35, glibc can be built with lld as the linker. I typically create a symlink /usr/local/bin/ld to a recent lld. Using ld.lld requires --with-default-link=yes.

mkdir out/gcc && cd out/gcc
../../configure --prefix=/tmp/glibc/gcc --with-default-link=yes && make -j 50 && make -j 50 install && 'cp' -f /usr/lib/x86_64-linux-gnu/{libgcc_s.so.1,libstdc++.so.6} /tmp/glibc/gcc/lib/

Some tests need {libgcc_s.so.1,libstdc++.so.6}.
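Since only ld.lld needs --with-default-link=yes, it can help to check which linker `ld` actually resolves to before configuring. Below is a minimal sketch; linker_flavor and extra_configure_args are hypothetical helper names (not part of glibc), and the matching relies on the usual first line of `--version` output from lld, gold and GNU ld.

```shell
# Classify a linker from the first line of its `--version` output.
# Hypothetical helper, not part of glibc.
linker_flavor() {
  case "$1" in
    *LLD*)        echo lld ;;
    *"GNU gold"*) echo gold ;;
    *"GNU ld"*)   echo bfd ;;
    *)            echo unknown ;;
  esac
}

# Print the extra configure argument needed for the given flavor, if any.
extra_configure_args() {
  if [ "$1" = lld ]; then
    echo "--with-default-link=yes"
  fi
}

flavor=$(linker_flavor "$(ld --version 2>/dev/null | head -n 1)")
echo "ld is $flavor; extra configure args: $(extra_configure_args "$flavor")"
```

The printed argument can then be appended to the ../../configure command line above.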

Since 2021-12 (milestone: 2.36), an architecture supporting static-pie (SUPPORT_STATIC_PIE) enables static-pie by default, unless disabled by --disable-static-pie.

Cross compilation

To cross compile for aarch64:

mkdir out/aarch64 && cd out/aarch64
../../configure --prefix=/tmp/glibc/aarch64 --host=aarch64-linux-gnu

To cross compile for i686:

../../configure --prefix=/tmp/glibc/i686 --host=i686-linux-gnu CC='gcc -m32' CXX='g++ -m32' && make -j 50 && make -j 50 install && cp -f /usr/lib/i386-linux-gnu/{libgcc_s.so.1,libstdc++.so.6} /tmp/glibc/i686/

For cross compiling, run-built-tests is yes only if test-wrapper is set.
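The rule above can be written down as a tiny decision function. This is only a sketch of the logic as I understand it, not the actual makefile code; run_built_tests is a hypothetical name, and the native (non-cross) case returning yes is my assumption.

```shell
# run_built_tests CROSS_COMPILING TEST_WRAPPER
# Prints "no" only when cross compiling without a test wrapper.
# Assumption: native builds always run the built tests.
run_built_tests() {
  local cross_compiling=$1 test_wrapper=$2
  if [ "$cross_compiling" = yes ] && [ -z "$test_wrapper" ]; then
    echo no
  else
    echo yes
  fi
}

run_built_tests yes ''
run_built_tests yes 'timeout -k 5 5'
run_built_tests no ''
```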

If you don't have a C++ compiler, specify CXX=.

Thanks to binfmt_misc and qemu-user, we can run make check. Unfortunately some tests may get stuck. I use the timeout program as the test wrapper.

make -j 50 check test-wrapper='timeout -k 5 5'

# Find tests which might be killed by `timeout`.
rg -g '*.test-result' 'original exit status 124'
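If rg is not installed, the same query can be sketched with find and grep. list_timed_out is a hypothetical helper name; 124 is the exit status `timeout` reports when the time limit is hit.

```shell
# List *.test-result files under DIR (default: .) that record exit status 124.
# Uses find and grep instead of rg.
list_timed_out() {
  find "${1:-.}" -name '*.test-result' \
    -exec grep -l 'original exit status 124' {} + | sort
}

# Usage, from the build directory:
#   list_timed_out .
```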

Misc

A very unfortunate fact: glibc can only be built with -O2, not -O0 or -O1. If you want to have an un-optimized debug build, deleting an object file and recompiling it with -g usually works. Another workaround is #pragma GCC optimize ("O0").

The -O2 issue is probably related to (1) expected inlining and (2) avoiding dynamic relocations.

To regenerate configure from configure.ac, see https://sourceware.org/glibc/wiki/Regeneration. Consider installing the required version of autoconf in a separate location.

In elf/, .o objects use -fpie -DPIC -DMODULE_NAME=libc, while .os objects use -fPIC -DPIC -DMODULE_NAME=rtld.

Build or test one directory

In the build directory, run:

# build
make -j50 subdirs='elf stdlib'

# test
make -j50 check subdirs='elf stdlib'

Alternatively,

make -r -C ~/Dev/glibc/stdlib objdir=$PWD subdir=stdlib subdir_lib

To rerun tests in elf/, delete elf/*.out files and run make -j50 check subdirs=elf.

Delete all failed tests so that the next make check invocation will re-run them:

rg -lg '*.test-result' '^FAIL' | while read i; do rm -f ${i/.test-result/.out} $i; done
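The one-liner above can also be wrapped in a function with a dry-run mode, which is handy before deleting anything. A minimal sketch using find and grep instead of rg; forget_failed_tests and its --dry-run argument are hypothetical names of my own.

```shell
# forget_failed_tests DIR [--dry-run]
# Remove the .out and .test-result files of failed tests under DIR so that
# the next `make check` reruns them. With --dry-run, only print the files.
forget_failed_tests() {
  local dir=${1:-.} mode=${2:-}
  find "$dir" -name '*.test-result' -exec grep -l '^FAIL' {} + |
  while IFS= read -r result; do
    if [ "$mode" = --dry-run ]; then
      echo "would remove ${result%.test-result}.out $result"
    else
      rm -f -- "${result%.test-result}.out" "$result"
    fi
  done
}

# Usage, from the build directory:
#   forget_failed_tests . --dry-run
#   forget_failed_tests .
```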

To run a program with the built dynamic loader:

$build/testrun.sh $program

# Debug a test. --direct can avoid spawning a new process.
cgdb -ex 'set exec-wrapper env LD_LIBRARY_PATH=.:./math:./elf:./dlfcn:./nss:./nis:./rt:./resolv:./mathvec:./support:./crypt:./nptl' elf/tst-nodelete --direct

(

../../configure --prefix=/tmp/glibc/lld && make -j 50 && make -j 50 install && 'cp' -f /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 /tmp/glibc/lld/lib/
)

GRTE

In an llvm-project build directory, build clang, lld, and clang_rt.crtbegin.o:

ninja -C /tmp/out/custom1 clang crt lld

git checkout origin/google/grte/v5-2.27/master
mkdir -p out/grte && cd out/grte
../../configure --prefix=/tmp/grte/play --disable-werror --disable-float128 --with-clang --with-lld --enable-static-pie CC=/tmp/out/custom1/bin/clang CXX=/tmp/out/custom1/bin/clang++
make -j 30
make -j 30 install

build-many-glibcs.py

Run the following commands to populate /tmp/glibc-many with toolchains. Caution: please make sure the target file system has tens of gigabytes of free space.

Preparation:

scripts/build-many-glibcs.py /tmp/glibc-many checkout --shallow
scripts/build-many-glibcs.py /tmp/glibc-many host-libraries

# Build a bootstrap GCC (static-only, C-only, --with-newlib).
# /tmp/glibc-many/src/gcc/gcc/configure --srcdir=/tmp/glibc-many/src/gcc/gcc --prefix=/tmp/glibc-many/install/compilers/aarch64-linux-gnu --with-sysroot=/tmp/glibc-many/install/compilers/aarch64-linux-gnu/sysroot --with-gmp=/tmp/glibc-many/install/host-libraries --with-mpfr=/tmp/glibc-many/install/host-libraries --with-mpc=/tmp/glibc-many/install/host-libraries --enable-shared --enable-threads --enable-languages=c,c++,lto ... --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=aarch64-glibc-linux-gnu ...
scripts/build-many-glibcs.py /tmp/glibc-many compilers aarch64-linux-gnu
scripts/build-many-glibcs.py /tmp/glibc-many compilers powerpc64le-linux-gnu
scripts/build-many-glibcs.py /tmp/glibc-many compilers sparc64-linux-gnu

  • --shallow passes --depth 1 to the git clone command.
  • --keep all keeps intermediary build directories intact. You may want this option to investigate build issues.

The glibcs command will delete the glibc build directory, build glibc, and run make check.

# Build glibc using bootstrap GCC in <path>/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-gcc
# Then build GCC using the built glibc.
# /tmp/glibc-many/src/glibc/configure --prefix=/usr --enable-profile --build=x86_64-pc-linux-gnu --host=aarch64-glibc-linux-gnu CC=aarch64-glibc-linux-gnu-gcc CXX=aarch64-glibc-linux-gnu-g++ AR=aarch64-glibc-linux-gnu-ar AS=aarch64-glibc-linux-gnu-as LD=aarch64-glibc-linux-gnu-ld NM=aarch64-glibc-linux-gnu-nm OBJCOPY=aarch64-glibc-linux-gnu-objcopy OBJDUMP=aarch64-glibc-linux-gnu-objdump RANLIB=aarch64-glibc-linux-gnu-ranlib READELF=aarch64-glibc-linux-gnu-readelf STRIP=aarch64-glibc-linux-gnu-strip
# Find built glibc in <path>/install/glibcs/aarch64-linux-gnu
scripts/build-many-glibcs.py /tmp/glibc-many glibcs aarch64-linux-gnu
# Find the logs and test results under /tmp/glibc-many/logs/glibcs/aarch64-linux-gnu/

scripts/build-many-glibcs.py /tmp/glibc-many glibcs powerpc64le-linux-gnu

scripts/build-many-glibcs.py /tmp/glibc-many glibcs sparc64-linux-gnu

For the glibcs command, add --full-gcc to build C++.

many=/tmp/glibc-many
$many/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-g++ -Wl,--dynamic-linker=$many/install/glibcs/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1 -Wl,-rpath=$many/install/compilers/aarch64-linux-gnu/sysroot/lib64:$many/install/compilers/aarch64-linux-gnu/aarch64-glibc-linux-gnu/lib64 a.cc
./a.out

$many/install/compilers/x86_64-linux-gnu/bin/x86_64-glibc-linux-gnu-gcc -Wl,--dynamic-linker=$many/install/glibcs/x86_64-linux-gnu/lib64/ld-linux-x86-64.so.2 -Wl,-rpath=$many/install/compilers/x86_64-linux-gnu/sysroot/lib64:$many/install/compilers/x86_64-linux-gnu/x86_64-glibc-linux-gnu/lib64 a.c
./a.out

"On build-many-glibcs.py and most stage1 compiler bootstrap, gcc is build statically against newlib. the static linked gcc (with a lot of disabled features) is then used to build glibc and then the stage2 gcc (which will then have all the features that rely on libc enabled) so the stage1 gcc might not have the require started files"

During development, some interesting targets:

make -C out/debug check-abi

$build/abi-versions.h

For each port, shlib-versions files describe the lowest version of each library (e.g. ld, libc, libpthread). The information is collected into $build/Versions.v, next $build/Versions.def, then $build/Versions.all, finally $build/abi-versions.h.

Run rg -g 'shlib-versions' DEFAULT to find the glibc version in which each port was introduced.
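To pull the version out of a single file without rg, an awk one-liner is enough. This sketch assumes each shlib-versions file carries a line of the form `DEFAULT <whitespace> GLIBC_x.y`; default_version is a hypothetical helper name.

```shell
# Print the DEFAULT version recorded in one shlib-versions file.
# Assumes a line whose first field is DEFAULT and whose second is the version.
default_version() {
  awk '$1 == "DEFAULT" { print $2; exit }' "$1"
}

# Usage, from the glibc source directory:
#   default_version sysdeps/unix/sysv/linux/x86_64/64/shlib-versions
```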

Take x86_64 as an example. sysdeps/unix/sysv/linux/x86_64/64/shlib-versions says that x86_64's earliest symbol is at GLIBC_2.2.5. $build/abi-versions.h contains:

// Macros of earlier versions expand to ABI_libpthread_GLIBC_2_2_5, aka 1
#define ABI_libpthread_GLIBC_2_0 ABI_libpthread_GLIBC_2_2_5
#define ABI_libpthread_GLIBC_2_1 ABI_libpthread_GLIBC_2_2_5
...
#define ABI_libpthread_GLIBC_2_2_4 ABI_libpthread_GLIBC_2_2_5
#define ABI_libpthread_GLIBC_2_2_5 1
#define ABI_libpthread_GLIBC_2_2_6 2
...
#define ABI_libpthread_GLIBC_2_33 36
#define ABI_libpthread_GLIBC_2_34 37

// Macros of earlier versions expand to GLIBC_2.2.5
#define VERSION_libpthread_GLIBC_2_0 GLIBC_2.2.5
...
#define VERSION_libpthread_GLIBC_2_2_5 GLIBC_2.2.5
#define VERSION_libpthread_GLIBC_2_2_6 GLIBC_2.2.6

For each library (e.g. libpthread), the macro for the earliest version (ABI_libpthread_GLIBC_2_2_5) is defined as 1.
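The numbering scheme can be reproduced with a few lines of shell. This is only an illustration of the layout shown above, not how the glibc build system generates the header; emit_abi_macros is a hypothetical name, and the version list passed in is assumed to be ordered from oldest to newest and to contain the earliest version of the port.

```shell
# emit_abi_macros LIB EARLIEST VERSION...
# Versions older than EARLIEST alias to EARLIEST's macro; EARLIEST gets 1,
# and each later version gets the next integer.
emit_abi_macros() {
  local lib=$1 earliest=$2 n=0 v id earliest_id
  shift 2
  earliest_id=$(printf '%s' "$earliest" | tr . _)
  for v in "$@"; do
    id=$(printf '%s' "$v" | tr . _)
    if [ "$n" -eq 0 ] && [ "$v" != "$earliest" ]; then
      printf '#define ABI_%s_%s ABI_%s_%s\n' "$lib" "$id" "$lib" "$earliest_id"
    else
      n=$((n + 1))
      printf '#define ABI_%s_%s %d\n' "$lib" "$id" "$n"
    fi
  done
}

emit_abi_macros libpthread GLIBC_2.2.5 GLIBC_2.0 GLIBC_2.1 GLIBC_2.2.5 GLIBC_2.2.6
```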

Let's see a C source file using symbol versioning.

// __asm__ (".symver " "__new_sem_init" "," "sem_init" "@@" "GLIBC_2.34");
versioned_symbol (libc, __new_sem_init, sem_init, GLIBC_2_34);

#if OTHER_SHLIB_COMPAT(libpthread, GLIBC_2_1, GLIBC_2_34)
// __asm__ (".symver " "__new_sem_init" "," "sem_init" "@" "GLIBC_2.2.5");
compat_symbol (libpthread, __new_sem_init, sem_init, GLIBC_2_1);
#endif

_OTHER_SHLIB_COMPAT(lib, introduced, obsoleted) checks whether ABI_##lib##_##obsoleted is 0 (undefined) or (ABI_##lib##_##introduced - 0) < (ABI_##lib##_##obsoleted - 0) (obsoleted is greater than the earliest version). When the condition is true, the version range [introduced, obsoleted) overlaps the version range of the port, and therefore _OTHER_SHLIB_COMPAT evaluates to true.
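To make the condition concrete, here is the same check emulated in shell with the x86_64 numbers from abi-versions.h above. other_shlib_compat is a hypothetical helper; it takes the numeric values of the two ABI_ macros, with 0 standing in for an undefined macro.

```shell
# other_shlib_compat INTRODUCED OBSOLETED
# Arguments are the numeric values of ABI_<lib>_<introduced> and
# ABI_<lib>_<obsoleted>; pass 0 for an undefined macro.
# Succeeds when the compat code would be compiled in.
other_shlib_compat() {
  local introduced=$1 obsoleted=$2
  [ "$obsoleted" -eq 0 ] || [ "$introduced" -lt "$obsoleted" ]
}

# x86_64 libpthread: GLIBC_2_0 and GLIBC_2_1 both map to 1, GLIBC_2_34 to 37.
other_shlib_compat 1 37 && echo "[GLIBC_2_1, GLIBC_2_34): compat symbol kept"
other_shlib_compat 1 1  || echo "[GLIBC_2_0, GLIBC_2_1): compat symbol dropped"
```

On x86_64 the range [GLIBC_2_0, GLIBC_2_1) collapses to [1, 1), which is empty, so a compat symbol guarded by that range is not needed there.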

compat_symbol (libpthread, __old_sem_init, sem_init, GLIBC_2_0); binds __old_sem_init to "sem_init" "@" "VERSION_libpthread_GLIBC_2_0", which expands to "sem_init" "@" "GLIBC_2.2.5". Ideally, the ,remove option of .symver should be used (https://sourceware.org/bugzilla/show_bug.cgi?id=28197).