This article describes how to detect C++ One Definition Rule (ODR) violations. There are many good resources on the Internet about how ODR violations can introduce subtle bugs, so I will not repeat that here.
_FORTIFY_SOURCE
Updated in 2024-08.
glibc 2.3.4 introduced _FORTIFY_SOURCE in 2004 to catch security errors due to misuse of some C library functions. The initially supported functions were fprintf, gets, memcpy, memmove, mempcpy, memset, printf, snprintf, sprintf, stpcpy, strcat, strcpy, strncat, strncpy, vfprintf, vprintf, vsnprintf, vsprintf, and the focus was on buffer overflow detection and dangerous printf %n uses. The implementation leverages inline functions and __builtin_object_size (see [PATCH] Object size checking to prevent (some) buffer overflows). More functions were added over time, and __builtin_constant_p was used as well. As of 2022-11, glibc defines 79 default version *_chk functions.
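To make the object-size idea concrete, here is a minimal sketch of a size-checked copy. The names checked_copy and CHECKED_COPY are hypothetical and are not glibc's implementation: the real wrappers are the *_chk functions, which abort on overflow rather than return a status. The only compiler feature assumed is the GCC/Clang builtin __builtin_object_size.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper for illustration only; glibc's real wrappers are the
// *_chk functions and they abort instead of returning a status.
inline bool checked_copy(void *dst, std::size_t dst_size, const void *src,
                         std::size_t n) {
  // (size_t)-1 means the compiler could not determine the object size,
  // so no check is possible.
  if (dst_size != static_cast<std::size_t>(-1) && n > dst_size)
    return false;  // the copy would overflow the destination
  std::memcpy(dst, src, n);
  return true;
}

// The macro captures the destination size at the call site, where the
// compiler may know it. __builtin_object_size is a GCC/Clang builtin.
#define CHECKED_COPY(dst, src, n) \
  checked_copy((dst), __builtin_object_size((dst), 0), (src), (n))
```

With a local char buf[8], a call such as CHECKED_COPY(buf, src, 16) is expected to be rejected whenever the compiler can determine the size of buf. When it cannot, the check is skipped, which is why this kind of checking catches only some overflows.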
lld linked musl on PowerPC64
I was asked about a segfault related to lld linked musl libc.so on PowerPC64.
- /usr/lib/ld-musl-powerpc64le.so.1 /path/to/thing worked. The kernel ELF loader loads rtld and rtld loads the executable.
- /path/to/thing segfaulted. The kernel ELF loader loads both rtld and the executable.
Therefore the bug is likely due to a difference between the two modes.
Distribution of debug information
Updated in 2024-08.
Note: The article will likely get frequent updates in the next few days.
This article describes some approaches to distribute debug information. Commands below will use two simple C files for demonstration.
1 | cat > a.c <<eof |
C minifier with Clang
I recently revamped Competitive programming in Nim. In short, I can create a C amalgamation from a Nim program and submit the C source code to various competitive programming websites.
Then I use a Clang based tool to shorten the C source code. It does two things:
- Shorten function, variable, and type names
- Use the clangFormat library to remove some whitespace
For the first step, the tool uses a derived ASTFrontendAction to traverse the AST twice, once for collecting function/var/type names and once for renaming. Building clang::CompilerInstance from command lines needs some boilerplate. An alternative is to use clang::tooling::CommonOptionsParser and clang::tooling::ClangTool.
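The renaming pass needs a supply of short identifiers. The sketch below shows one way such names could be generated. It is not taken from the tool: short_name and next_free_name are hypothetical helpers, and the reserved set stands in for C keywords and any names that must be left alone.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>

// Hypothetical helper: the n-th name in the sequence
// a, b, ..., z, A, ..., Z, aa, ab, ...
std::string short_name(unsigned n) {
  static const std::string alphabet =
      "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
  const unsigned base = static_cast<unsigned>(alphabet.size());
  std::string s;
  for (;;) {
    s.push_back(alphabet[n % base]);  // least significant letter first
    if (n < base)
      break;
    n = n / base - 1;  // bijective numeration: there is no "zero" letter
  }
  std::reverse(s.begin(), s.end());
  return s;
}

// Hypothetical helper: return the next generated name that is not reserved
// (for example a keyword such as "do" or "if") and advance the counter.
std::string next_free_name(unsigned &counter,
                           const std::unordered_set<std::string> &reserved) {
  for (;;) {
    std::string name = short_name(counter++);
    if (reserved.count(name) == 0)
      return name;
  }
}
```

Handing out the shortest names to the most frequently used identifiers would shrink the output further. That would need the usage counts collected during the first traversal.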
1 | /* |
CMakeLists.txt
cmake_minimum_required(VERSION 3.14)
project(cminify LANGUAGES C CXX)

add_executable(cminify "")

set(DEFAULT_CMAKE_BUILD_TYPE Release)
set_property(TARGET cminify PROPERTY CXX_STANDARD 17)
set_property(TARGET cminify PROPERTY CXX_STANDARD_REQUIRED ON)
set_property(TARGET cminify PROPERTY CXX_EXTENSIONS OFF)

find_package(Clang REQUIRED)

if(CLANG_LINK_CLANG_DYLIB)
  target_link_libraries(cminify PRIVATE clang-cpp)
else()
  target_link_libraries(cminify PRIVATE
    clangIndex
    clangFormat
    clangTooling
    clangToolingInclusions
    clangToolingCore
    clangFrontend
    clangParse
    clangSerialization
    clangSema
    clangAST
    clangLex
    clangDriver
    clangBasic
  )
endif()

if(LLVM_LINK_LLVM_DYLIB)
  target_link_libraries(cminify PRIVATE LLVM)
else()
  target_link_libraries(cminify PRIVATE LLVMOption LLVMSupport)
endif()

if(NOT LLVM_ENABLE_RTTI)
  # releases.llvm.org libraries are compiled with -fno-rtti
  # The mismatch between lib{clang,LLVM}* and cminify can make libstdc++ std::make_shared return nullptr
  # _Sp_counted_ptr_inplace::_M_get_deleter
  if(MSVC)
    target_compile_options(cminify PRIVATE /GR-)
  else()
    target_compile_options(cminify PRIVATE -fno-rtti)
  endif()
endif()

target_sources(cminify PRIVATE main.cc)

foreach(include_dir ${LLVM_INCLUDE_DIRS} ${CLANG_INCLUDE_DIRS})
  get_filename_component(include_dir_realpath ${include_dir} REALPATH)
  # Don't add as SYSTEM if they are in CMAKE_CXX_IMPLICIT_INCLUDE_DIRECTORIES.
  # It would reorder the system search paths and cause issues with libstdc++'s
  # use of #include_next. See https://github.com/MaskRay/ccls/pull/417
  if(NOT "${include_dir_realpath}" IN_LIST CMAKE_CXX_IMPLICIT_INCLUDE_DIRECTORIES)
    target_include_directories(cminify SYSTEM PRIVATE ${include_dir})
  endif()
endforeach()

install(TARGETS cminify RUNTIME DESTINATION bin)
Define LLVM as the llvm-project repository and LLVMOUT as the build directory (make sure you have built at least these targets: ninja clang clangFormat clangIndex clangTooling).
cmake -GNinja -S. -Bout/release -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="$LLVMOUT;$LLVMOUT/tools/clang;$LLVM/llvm;$LLVM/clang"
ninja -C out/release
If LLVM and Clang's CMake, library, and header files are installed in well-known locations, then -DCMAKE_PREFIX_PATH can be omitted.
It's certainly not straightforward to find all these APIs. I mainly use ccls as a reference, which was inspired by clangIndex.
For writing this tool, I read a bit of the code of clang-rename, clang-format, and C-Reduce clang_delta.
C-Reduce provides clang_delta/RenameFun.cpp and two other passes (RenameVar, RenameParam) which do similar things. Its code is a bit dated now, as it was written against a Clang from circa 2012.
Let's see an example. Unfortunately, I have not found clangFormat options that remove whitespace after = and ,. That can perhaps be done by a post-processing string substitution tool without introducing too much risk.
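As a rough illustration of such a post-processing step, the sketch below drops spaces that directly follow = or , while leaving string and character literals untouched. The function name is hypothetical, and the sketch assumes its input has no comments, which is plausible for minified output but is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical post-processing pass: remove spaces that directly follow
// '=' or ',' outside string and character literals.
// Assumption: the input contains no comments.
std::string strip_spaces_after_eq_comma(const std::string &src) {
  std::string out;
  char quote = 0;  // delimiter of the literal we are inside, or 0 if none
  for (std::size_t i = 0; i < src.size(); ++i) {
    char c = src[i];
    out.push_back(c);
    if (quote != 0) {
      if (c == '\\' && i + 1 < src.size())
        out.push_back(src[++i]);  // copy the escaped character verbatim
      else if (c == quote)
        quote = 0;  // end of the literal
    } else if (c == '"' || c == '\'') {
      quote = c;  // start of a string or character literal
    } else if (c == '=' || c == ',') {
      while (i + 1 < src.size() && src[i + 1] == ' ')
        ++i;  // skip the spaces after '=' or ','
    }
  }
  return out;
}
```

The pass only deletes spaces after the two characters, so tokens are never merged into different ones: the character following = or , in valid C starts a new token either way.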
1 | % cat test/a.c |
Layering check with Clang
Updated in 2023-07.
This article describes some Clang header modules features that apply to #include. These features enforce a more explicit dependency graph, which serves as documentation and makes refactoring convenient. The benefits of clean header inclusions are well described in Include What You Use as well, so I won't repeat them here.
When using C++20 modules, these features apply to #include in a global module fragment (module;) but have no effect for import declarations.
Layering check
-fmodules-decluse
For a #include directive, this option emits an error if the following conditions are satisfied (see clang/lib/Lex/ModuleMap.cpp diagnoseHeaderInclusion):
- The main file is within a module (called "source module", say, A).
- The main file or an included file from the source module includes a file from another module B.
- A does not have a use-declaration of B (no use B).
For the first condition, -fmodule-map-file= is needed to load the source module map and -fmodule-name=A is needed to indicate that the source file is logically part of module A.
For the second condition, the module map defining B must be loaded by specifying -fimplicit-module-maps (implied by -fmodules and -fcxx-modules) or a -fmodule-map-file=.
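The three conditions can be restated as a small toy model. This is not Clang's implementation (that lives in diagnoseHeaderInclusion). ToyModule, owner_of, and violates_decluse are hypothetical names, and the model covers only the conditions listed above.

```cpp
#include <map>
#include <set>
#include <string>

// Toy model of a module map entry: the headers a module owns and the
// modules named in its use-declarations. Hypothetical types, not Clang's.
struct ToyModule {
  std::set<std::string> headers;
  std::set<std::string> uses;
};
using ToyModuleMap = std::map<std::string, ToyModule>;

// Return the name of the module owning `header`, or "" if none does.
std::string owner_of(const ToyModuleMap &mods, const std::string &header) {
  for (const auto &entry : mods)
    if (entry.second.headers.count(header))
      return entry.first;
  return "";
}

// True if including `header` from code that belongs to `source_module`
// meets all three conditions described above.
bool violates_decluse(const ToyModuleMap &mods,
                      const std::string &source_module,
                      const std::string &header) {
  auto src = mods.find(source_module);
  if (src == mods.end())
    return false;  // condition 1 fails: the main file is not in a module
  std::string target = owner_of(mods, header);
  if (target.empty() || target == source_module)
    return false;  // condition 2 fails: no other module B is involved
  return src->second.uses.count(target) == 0;  // condition 3: no "use B"
}
```

For example, if A declares use B but not use C, then in this model including a header of B from A is accepted while including a header of C is flagged.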
zstd compressed debug sections
Updated in 2022-10.
In January I wrote Compressed debug sections. The venerable zlib shows its age, and there are replacements that are better in every metric except adoption and memory footprint. The obvious choice was Zstandard, but I was not so confident about adopting it and solving the ecosystem issue. At any rate, I slowly removed some legacy .zdebug support from llvm-project so that a new format could be more easily introduced.
lld 15 ELF changes
llvm-project 15 was just released. I added some lld/ELF notes to https://github.com/llvm/llvm-project/blob/release/15.x/lld/docs/ReleaseNotes.rst. Here I will elaborate on some changes.
-march=, -mcpu=, and -mtune=
In GCC and Clang, there are three major options specifying the architecture and microarchitecture the generated code can run on. The general semantics are described below, but each target machine may assign different semantics.
- -march=X: (execution domain) Generate code that can use instructions available in the architecture X
- -mtune=X: (optimization domain) Optimize for the microarchitecture X, but do not change the ABI or make assumptions about available instructions
- -mcpu=X: Specify both -march= and -mtune= but can be overridden by the two options. The supported values are generally the same as -mtune=. The architecture name is inferred from X
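The general semantics above can be sketched as a small resolution function. This is a toy model under stated assumptions, not how any particular compiler driver is written. TargetOptions, EffectiveTarget, arch_of_cpu, and resolve are hypothetical names, the two-entry CPU table is only illustrative, the "generic" default is an assumption, and real targets may deviate from these rules.

```cpp
#include <map>
#include <string>

// An empty string means the option was not given on the command line.
struct TargetOptions {
  std::string march, mtune, mcpu;
};

struct EffectiveTarget {
  std::string arch;  // which instructions may be used
  std::string tune;  // which microarchitecture to optimize for
};

// Illustrative CPU-to-architecture table; real compilers keep much larger
// per-target tables.
std::string arch_of_cpu(const std::string &cpu) {
  static const std::map<std::string, std::string> table = {
      {"cortex-a72", "armv8-a"}, {"neoverse-n1", "armv8.2-a"}};
  auto it = table.find(cpu);
  if (it == table.end())
    return "generic";
  return it->second;
}

EffectiveTarget resolve(const TargetOptions &o) {
  EffectiveTarget t{"generic", "generic"};  // assumed defaults
  if (!o.mcpu.empty()) {  // -mcpu=X implies both -march= and -mtune=
    t.arch = arch_of_cpu(o.mcpu);
    t.tune = o.mcpu;
  }
  if (!o.march.empty())
    t.arch = o.march;  // an explicit -march= overrides the inferred value
  if (!o.mtune.empty())
    t.tune = o.mtune;  // an explicit -mtune= overrides the inferred value
  return t;
}
```

In this model, -mcpu=cortex-a72 -mtune=neoverse-n1 keeps the architecture inferred from cortex-a72 and tunes for neoverse-n1.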
glibc and DT_GNU_HASH
tl;dr "Easy Anti-Cheat"'s incompatibility with the shared objects provided by glibc 2.36 (libc.so.6, ld-linux-x86-64.so.2) is an instance of Hyrum's law.
- On 2022-08-02 glibc 2.36 was released.
- On the same day the x86-64 package was moved to [core] on Arch Linux.
- On 2022-08-03 Jelgnum reported that with the new glibc, "Easy Anti-Cheat" cannot load the anti-cheat module (GLIBC update broke EAC for most games that use it).
- Multiple Arch Linux game users confirmed the problem.
- Frogging101 bisected the problem to the glibc commit Do not use --hash-style=both for building glibc shared objects.
- The problem led to heated discussions, some clickbait news, and claims such as "glibc breaks ABI" and "glibc does not prioritize compatibility with pre-existing applications".
I feel compelled to demystify the incident and hope that people will stop defaming glibc.