LLVM 17 will soon be relased. This post provides a summary of my contributions in this release cycle to record my learning progress.
- Ported
-fsanitize=function
to non-x86 architectures and the C language. - lld maintenance. See lld 17 ELF changes
- LLVM binary utility maintenance, e.g.
- compiler-rt maintenance, e.g.
- changed lsan to work with high-entropy ASLR for x86-64 Linux
- removed crypt and crypt_r interceptors to work with glibc
- implemented
interceptors for glibc 2.38
__isoc23_strtol
and__isoc23_scanf
family functions
- Fixed ~20 places where the iteration order of
MapVector
orSetVector
was relied on. Proposed moreLLVM_ENABLE_REVERSE_ITERATION=on
coverage - Fixed longstanding brittle assembler support for RISC-V linker relaxation
- adapted a simplified 64-bit xxh3 from Cyan4973/xxHash
- as efficient as the reference implementation for lld's purpose
- migrated various xxhash64 uses in the repository
- XRay maintenance, e.g.
- Implemented
__xray_customevent
/__xray_typedevent
for AArch64 - Changed codegen to work with Mach-O. Though the runtime part requires some CMake (which I lack) expertise to move more progress
- Fixed semantics of some obscure driver options
- Implemented
- gcov maintenance
- Helped improving featuress like
-fexperimental-sanitize-metadata=
,-fpseudo-probe-for-profiling
, and-fsanitize=kcfi
MC
- Removed many obsoleted workarounds from the integrated assembler
- Fixed placement of function entry comments
- Re-architectured a substantial part of the integrated assembler that is used by RISC-V linker relaxation, fixing some longstanding bugs. See The dark side of RISC-V linker relaxation for detail.
Clang
Driver
- Made output files more meaningful when combining compilation and
linking phases for certain options:
-gsplit-dwarf
,-ftime-trace
,--coverage
. See Compiler output files for detail - made target-specific
option report an error when used with a different target. Previously
Clang merely
issued warnings like
warning: argument unused during compilation: '-msse' [-Wunused-command-line-argument]
-fdebug-prefix-map=
: made the last win when multiple prefixes match- added
-mllvm=
as an alias for-mllvm
- Downgraded
riscv*-
triples to-fdebug-default-version=4
due to linker-relaxation complexity
Others:
Learning plan
I shall learn more about instrumentation related technologies and code generation.
Code review
At the present, LLVM uses a self-hosted Phabricator for code review. I have accepted or rejected 381 Phabricator Differentials revisions (patches) during the 181 days. (I will mention how I get this number later.) I hope that the number is fabulous, considering so many other tasks I have to do at work for bread and butter. Here is an incomplete list:
- unified LTO
(
clang -funified-lto
) - fat LTO
(
clang -ffat-lto-objects
) (pulled out from the 17.x release to avoid an incomplete state) - many lld features
- revamp of sanitizer interceptors
-gsplit-dwarf
for Windows- RISC-V
-fsanitize=kcfi
. - RISC-V patches related to assembler, linker, and psABI
llvm-objdump -d
for BPF- LLVMSupport patches
- Mips and LoongArch patches
Future of LLVM code review process
In June 2023, a full disk issue occurred with the server, which led to the rejection of new patch uploads. To address this problem, I expanded the database disk size from 500GB to 600GB, helping to mitigate the issue.
Phabricator is winding down. Maintaining a website with over 600K lines of PHP code is not sustainable. As a result, our community is in need of a suitable replacement and GitHub's pull request system have been selected as the replacement. I belong to the group that shares concerns regarding the quality of GitHub's pull request system, as it utilizes a model that some consider flawed.
GitHub's pull request system tightly couples a pull request with a branch and has a poor commit tracking ability. In short, the system shoehorns into less flexible and less desired workflows.
If commits are reordered or added or removed in the middle of an existing series, the tool can get confused quite easily.
[...]
I have to think will my rewriting history here make re-review harder? [...] This is a textbook example of tooling deficiencies dictating a sub-optimal workflow and outcome: because pull requests don't track commits explicitly, I'm forced to adopt a non-ideal workflow or sacrifice something like commit quality in order to minimize risks that the review tool won't get confused.
That said, ultimately, the individuals investing their efforts will have a significant say in determining the replacement. We that are concerned of GitHub pull will aceept the result, but I hope that the level of harm caused by the less-capable system is deemed acceptable. My primary worries include:
- Efficient subscription: how do contributors effectively subscribe to patches they are interested in?
- Review time: will we witness an increase in less prepared patches? Will they consume more time from established reviewers? Or will they be rubber stamped more frequently?
- Will we have dirty commit messages such as "Fix typo", "Add test", "Resolve XXX's comments"? We can force "Squash and merge" (forbid "Create a merge commit" and "Rebase and merge"), but it is difficult to ensure clean commit messages, if contributors have to think of implications of rewriting history.
Appendix
I did a SQL query to find the numbers of accepted/rejected patches by
Phabricator username between the two tags llvmorg-17-init
and llvmorg-18-init
(181 days), which roughly corresponds to the LLVM 17.x release
cycle.
1 | select u.username, count(distinct r.lastActionDIffPHID) from differential_reviewer r join phabricator_user.user u where r.reviewerStatus in ('accepted','rejected') and 1674629847<=r.dateCreated and r.dateCreated<1674634847 and r.reviewerPHID=u.phid group by r.reviewerPHID; |
1 | +-----------------------------+--------------------------------------+ |
We have reviewers who are diligent in accepting/rejecting patches, and others who prefer to provide comments rather than changing the state to accepted/blocked. In any case, a big thanks to all those who review code as well as the contributors!