Before September 2020, FreeBSD could only be built on a FreeBSD host. Alexander Richardson did a lot of work to make building on non-FreeBSD hosts possible: https://wiki.freebsd.org/BuildingOnNonFreeBSD.
Get prebuilt Clang and LLD
I use prebuilt Clang and LLD from Chromium: https://chromium.googlesource.com/chromium/src/tools/clang/+/refs/heads/main/scripts/update.py. My output directory is
FreeBSD src enables -Werror by default. Our Clang is new and may have many new diagnostics. We can use -DWITHOUT_WERROR to drop -Werror.
Now kick off our build.
mkdir -p obj/default
tools/build/make.py forwards unknown options to bmake.
It may stop at some step due to various non-hermetic issues.
Read man bmake to learn about MAKEOBJDIRPREFIX. bmake comes with built-in support for separate src/obj trees. The recursive makefile build system used by FreeBSD does not handle relative MAKEOBJDIRPREFIX paths well, so just use an absolute path.
I tried Bear but it could not handle such a complex build system. The process got stuck at some step, so I just gave up on it.
bmake meta mode
I learned a bit about bmake (default make on FreeBSD) and noticed a nice feature: meta mode. In meta mode bmake records build commands into .meta files. For the next build, bmake will consult .meta files to evaluate whether the target has become out-of-date. This is more robust than just comparing file modification times. Build Systems à la Carte says such a build system is self-tracking.
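To make the self-tracking idea concrete, here is a minimal Python sketch of that kind of check. It is not bmake's implementation: the function name and the way the recorded command and timestamps are passed in are assumptions made purely for illustration.

```python
def is_out_of_date(recorded_cmd, current_cmd, target_mtime, input_mtimes):
    """Hypothetical out-of-date check in the spirit of meta mode.

    recorded_cmd: command saved by the previous build (None if never built).
    current_cmd:  command the makefile would run now.
    """
    if recorded_cmd is None:
        return True                      # no record: must build
    if recorded_cmd != current_cmd:
        return True                      # flags or tool changed: rebuild
    # Fall back to the classic timestamp comparison.
    return any(m > target_mtime for m in input_mtimes)


# Timestamps alone say the target is fresh, but the flags changed.
print(is_out_of_date("cc -O2 -c a.c", "cc -O0 -g -c a.c", 200, [100]))  # True
# Same command, inputs older than the target: nothing to do.
print(is_out_of_date("cc -O2 -c a.c", "cc -O2 -c a.c", 200, [100]))     # False
```

A build system that only compares modification times would skip the first case and leave a stale object behind.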
For FreeBSD src, we can enable meta mode with -DWITH_META_MODE. After buildworld, we can parse these .meta files under objdir and build compile_commands.json.
MAKEOBJDIRPREFIX=$PWD/obj/meta ./tools/build/make.py --cross-bindir=~/Stable/bin -j 20 buildworld TARGET=amd64 TARGET_ARCH=amd64 -DWITH_META_MODE -DNO_FILEMON -DWITHOUT_WERROR
Some notes about the other options.
bmake has code dealing with filemon, which is a FreeBSD driver. On Linux we need to disable it with -DNO_FILEMON.
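Turning .meta files into compile_commands.json entries can be sketched roughly as follows. This is a minimal Python sketch, not the script actually used here. It assumes that a .meta file carries `CMD ` and `CWD ` lines before a `-- command output --` section, and the sample text, paths, and function name are hypothetical.

```python
import json
import shlex

# Hypothetical .meta content; real files may carry more lines and sections.
SAMPLE_META = """\
# Meta data file /obj/meta/libexec/rtld-elf/rtld.o.meta
CMD cc -O2 -c /src/libexec/rtld-elf/rtld.c -o rtld.o
CWD /obj/meta/libexec/rtld-elf
TARGET rtld.o
-- command output --
"""

SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".S")


def meta_to_entry(meta_text):
    """Return one compile_commands.json entry for a .meta file, or None."""
    cwd = None
    cmds = []
    for line in meta_text.splitlines():
        if line.startswith("-- "):       # output sections follow; stop here
            break
        if line.startswith("CWD "):
            cwd = line[4:]
        elif line.startswith("CMD "):
            cmds.append(line[4:])
    for cmd in cmds:
        try:
            args = shlex.split(cmd)
        except ValueError:               # unbalanced quotes: skip this command
            continue
        if "-c" not in args:             # keep only compile steps
            continue
        sources = [a for a in args if a.endswith(SOURCE_SUFFIXES)]
        if cwd and sources:
            return {"directory": cwd, "command": cmd, "file": sources[-1]}
    return None


print(json.dumps([meta_to_entry(SAMPLE_META)], indent=2))
```

In practice one would walk the object directory, apply the function to each `*.meta` file, drop the `None` results, and write the list to compile_commands.json at the top of the source tree.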
As of 2021-08, building on Linux still has some issues. I mostly read libexec/rtld-elf and the build process can proceed beyond libexec/rtld-elf, so I am satisfied.
With compile_commands.json, my ccls can index the repository.
Here is a screenshot of browsing libexec/rtld-elf code in Emacs (lsp-mode + emacs-ccls).
(setq ccls-sem-highlight-method 'font-lock)
Contribute to libexec/rtld-elf
I stumbled upon FreeBSD libexec/rtld-elf in 2019 to sort out how ld.lld should set the p_memsz field of PT_GNU_RELRO. I noticed an issue but did not get a chance to create a patch. Scroll down for details.
When working on some TLS issues in ld.lld, I noticed that rtld did not handle p_vaddr % p_align != 0 correctly. (Note: fixed for i386 and amd64.)
In 2020 I noticed a symbol resolution issue related to STB_WEAK, but did not follow up with a patch. (Note: introduced the environment variable LD_DYNAMIC_WEAK=0 to match the ELF spec (glibc/musl behavior).)
Now that I have a proper setup, I can work on the aforementioned problems in a virtual machine running FreeBSD 12.2.
qemu-system-x86_64 -enable-kvm -m 16384 -smp 16 -drive file=~/Images/freebsd.qcow2,if=virtio -net nic,model=virtio -net user,hostfwd=tcp::2223-:22
% cat /etc/src.conf
# rsync changes to the src repository
(My experience with SUBDIR_OVERRIDE=libexec/rtld-elf is bad.)
The versions of rtld and libc should match, especially when different major versions are involved. Simple programs may work even if you don't use a libc of the matching version.
mkdir -p /tmp/opt/lib
Thanks to kib who reviewed these patches and lwhsu who added me to this contributor list: https://docs.freebsd.org/en/articles/contributors/#contrib-additional.
p_memsz of PT_GNU_RELRO
An ELF component usually needs a PT_LOAD program header with the permission bits PF_R|PF_W. Some sections only need to be writable during relocation processing and can be made read-only during regular program execution. glibc invented PT_GNU_RELRO, which has been ported to FreeBSD/NetBSD/OpenBSD.
While linkers ensure that there is an alignment boundary of max-page-size bytes between two PT_LOAD program headers, the alignment boundary following PT_GNU_RELRO is just common-page-size bytes. GNU ld, gold, and ld.lld ensure that p_vaddr+p_memsz is a multiple of common-page-size.
glibc and musl do something like
size_t start = roundDown(p_vaddr, PAGE_SIZE);
FreeBSD rtld did something like
size_t start = roundDown(p_vaddr, PAGE_SIZE);
// roundUp instead of roundDown
size_t size = roundUp(p_vaddr+p_memsz, PAGE_SIZE) - roundDown(p_vaddr, PAGE_SIZE);
mprotect(laddr(start), size, PROT_READ);
If PAGE_SIZE (the system page size) is larger than the link-time common-page-size, mprotect may incorrectly make some non-RELRO pages read-only.
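The difference between the two roundings is easy to see with concrete numbers. The Python sketch below only models the address arithmetic (no mprotect call). The segment values and the 64 KiB system page size are hypothetical, and the function names are made up for illustration.

```python
def round_down(x, align):
    return x - x % align


def round_up(x, align):
    return round_down(x + align - 1, align)


def relro_end_rounded_down(p_vaddr, p_memsz, page_size):
    """Range protected when the end is rounded down (glibc/musl style)."""
    start = round_down(p_vaddr, page_size)
    end = round_down(p_vaddr + p_memsz, page_size)
    return start, end - start


def relro_end_rounded_up(p_vaddr, p_memsz, page_size):
    """Range protected when the end is rounded up (old FreeBSD rtld style)."""
    start = round_down(p_vaddr, page_size)
    end = round_up(p_vaddr + p_memsz, page_size)
    return start, end - start


# Hypothetical PT_GNU_RELRO linked with common-page-size 0x1000:
# p_vaddr + p_memsz == 0x21000, a multiple of 0x1000 but not of 0x10000.
P_VADDR, P_MEMSZ = 0x20D80, 0x280

for page_size in (0x1000, 0x10000):
    down = relro_end_rounded_down(P_VADDR, P_MEMSZ, page_size)
    up = relro_end_rounded_up(P_VADDR, P_MEMSZ, page_size)
    print(hex(page_size), [hex(v) for v in down], [hex(v) for v in up])
```

With a 0x1000 page size both variants protect [0x20000, 0x21000). With a 0x10000 page size, rounding down protects nothing, while rounding up protects [0x20000, 0x30000), which includes the writable data after 0x21000.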
https://reviews.freebsd.org/D31498 fixed the bug.
STB_WEAK in symbol lookup
The first version of the ELF specification http://www.sco.com/developers/gabi/1998-04-29/ch5.dynamic.html says:
When resolving symbolic references, the dynamic linker examines the symbol tables with a breadth-first search. That is, it first looks at the symbol table of the executable program itself, then at the symbol tables of the DT_NEEDED entries (in order), and then at the second level DT_NEEDED entries, and so on...
(This paragraph has not been updated in the latest snapshot.)
The common(?) interpretation is that STB_WEAK and STB_GLOBAL make no difference for symbol lookup.
I asked https://groups.google.com/g/generic-abi/c/YdmpBmukW0g for clarification and archaeology.
glibc and musl use this symbol lookup behavior: resolve to the first found symbol definition, regardless of whether it is STB_GLOBAL or STB_WEAK.
FreeBSD/NetBSD/OpenBSD use this non-conforming behavior: when a weak symbol definition is found, remember the definition and keep searching in the remaining shared objects for a non-weak definition. If found, the non-weak definition is preferred, otherwise the remembered weak definition is returned.
https://reviews.freebsd.org/D26352 implemented the Linux behavior under the environment variable LD_DYNAMIC_WEAK=0.
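The two lookup policies can be compared with a small Python model. This is only a sketch of the search order described above, not rtld code: the object list stands for the breadth-first load order, and the object names, symbol tables, and function names are hypothetical.

```python
def lookup_first_match(objects, name):
    """glibc/musl style: the first definition wins, weak or not."""
    for obj_name, symtab in objects:
        if name in symtab:
            return obj_name
    return None


def lookup_prefer_non_weak(objects, name):
    """BSD style: remember the first weak definition, keep looking for a
    non-weak one, and fall back to the remembered weak definition."""
    first_weak = None
    for obj_name, symtab in objects:
        binding = symtab.get(name)
        if binding == "STB_GLOBAL":
            return obj_name
        if binding == "STB_WEAK" and first_weak is None:
            first_weak = obj_name
    return first_weak


# Breadth-first order: the executable, then its DT_NEEDED entries in order.
objects = [
    ("exe", {}),
    ("liba.so", {"foo": "STB_WEAK"}),
    ("libb.so", {"foo": "STB_GLOBAL"}),
]

print(lookup_first_match(objects, "foo"))      # liba.so
print(lookup_prefer_non_weak(objects, "foo"))  # libb.so
```

The two policies disagree only when a weak definition precedes a non-weak one in the search order.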
p_vaddr % p_align != 0 for PT_TLS
It is very complex. See All about thread-local storage.
https://reviews.freebsd.org/D31538 fixed the i386/amd64 ports.
bmake supports a few features which make a make-based build system less daunting:
- built-in support for separate src/obj trees
- meta mode (self-tracking build system)
- logical operators: && and || are supported in .if conditions
- variable modifiers
These are major pain points in GNU make.
GNU make needs something like http://make.mad-scientist.net/papers/multi-architecture-builds/ to support separate src/obj trees.
Most GNU make-based build systems cannot rebuild the target when the commands change. The Linux kernel uses .cmd files to solve the problem.
In glibc, the following pattern is quite common to work around the lack of logical AND operators.
Some variable modifiers can be difficult to remember, but a good use of them drops the need to spawn various shell utilities.
FreeBSD rtld supports many GNU extensions:
- GNU indirect functions (
- GNU symbol versioning
I came from a musl background. In many places I think FreeBSD's rtld is overly complex, but it is still relatively clean.
It has implemented some features which musl does not support:
- lazy binding PLT
- dlclose which actually unloads a DSO
In comparison, many things are quite messy in glibc rtld. It can learn a lot from musl and FreeBSD rtld.