Understanding alignment - from source to object file

Alignment refers to the practice of placing data or code at memory addresses that are multiples of a specific value, typically a power of 2. This is usually done to meet the requirements of the programming language, the ABI, or the underlying hardware. On some architectures, misaligned memory accesses are slower than aligned ones; on others, they trap.

This blog post explores how alignment is represented and managed as C++ code is transformed through the compilation pipeline: from source code to LLVM IR, assembly, and finally the object file. We'll focus on alignment for both variables and functions.
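As a small source-level illustration of the definition above, the following sketch uses standard C++ `alignas`/`alignof` together with two helper functions. The type `Vec4` and the helpers `isAligned`, `alignTo`, and `stackObjectIsAligned` are hypothetical names chosen for this example, and the helpers assume the alignment is a power of 2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type for illustration: alignas raises its alignment to 16.
struct alignas(16) Vec4 {
  float v[4];
};

static_assert(alignof(Vec4) == 16, "alignas(16) sets the type's alignment");
static_assert(sizeof(Vec4) % alignof(Vec4) == 0,
              "the size of a type is a multiple of its alignment");

// Returns true if addr is a multiple of align (align must be a power of 2).
inline bool isAligned(std::uintptr_t addr, std::size_t align) {
  return (addr & (align - 1)) == 0;
}

// Rounds addr up to the next multiple of align (align must be a power of 2).
inline std::uintptr_t alignTo(std::uintptr_t addr, std::size_t align) {
  return (addr + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
}

// An object with automatic storage must satisfy its type's alignment.
inline bool stackObjectIsAligned() {
  Vec4 x{};
  return isAligned(reinterpret_cast<std::uintptr_t>(&x), alignof(Vec4));
}
```

The mask tricks in `isAligned` and `alignTo` only work because the alignment is a power of 2, which is one reason alignments are normally restricted to such values.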


LLVM integrated assembler: Engineering better fragments

In my previous assembler posts, I've discussed improvements to expression resolution and relocation generation. Now, let's turn our attention to recent refinements within section fragments. Understanding how an assembler uses these fragments is key to appreciating the improvements we've made. At a high level, the process unfolds in three main stages:

  • Parsing phase: The assembler constructs section fragments. These fragments represent sequences of regular instructions or data, span-dependent instructions, alignment directives, and other elements.
  • Section layout phase: Once fragments are built, the assembler assigns offsets to them and finalizes the span-dependent content.
  • Relocation decision phase: In the final stage, the assembler evaluates fixups and, if necessary, updates the content of the fragments.
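The section layout phase can be sketched with a toy model. This is only an illustration under simplifying assumptions: `ToyFragment`, `FragKind`, and `layoutSection` are hypothetical names rather than LLVM's MC classes, and span-dependent instructions and fixups are left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only; these are not LLVM's MC classes.
enum class FragKind { Data, Align };

struct ToyFragment {
  FragKind kind;
  std::uint64_t size;   // Data: byte count. Align: computed by layout.
  std::uint64_t align;  // Align: requested alignment; unused for Data.
  std::uint64_t offset; // Assigned by layout.
};

// Assigns an offset to every fragment in order. An alignment fragment's
// size becomes the padding needed so that the next fragment starts at a
// multiple of the requested alignment. Returns the total section size.
inline std::uint64_t layoutSection(std::vector<ToyFragment> &frags) {
  std::uint64_t offset = 0;
  for (ToyFragment &f : frags) {
    f.offset = offset;
    if (f.kind == FragKind::Align)
      f.size = (f.align - offset % f.align) % f.align;
    offset += f.size;
  }
  return offset;
}
```

For example, laying out 3 bytes of data, an 8-byte alignment request, and 4 more bytes of data gives the alignment fragment 5 bytes of padding and places the second data fragment at offset 8.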


GCC 13.3.0 miscompiles LLVM

For years, I've been involved in updating LLVM's MC layer. A recent journey led me to eliminate the FK_PCRel_ fixup kinds:

MCFixup: Remove FK_PCRel_

The generic FK_Data_ fixup kinds handle both absolute and PC-relative
fixups. ELFObjectWriter sets IsPCRel to true for `.long foo-.`, so the
backend has to handle PC-relative FK_Data_.

However, the existence of FK_PCRel_ encouraged backends to implement it
as a separate fixup type, leading to redundant and error-prone code.

Removing FK_PCRel_ simplifies the overall fixup mechanism.


LLVM integrated assembler: Improving MCExpr and MCValue

In my previous post, Relocation Generation in Assemblers, I explored some key concepts behind LLVM’s integrated assembler. This post dives into recent improvements I’ve made to refine that system.

The LLVM integrated assembler handles fixups and relocatable expressions as distinct entities. Relocatable expressions, in particular, are encoded using the MCValue class, which originally looked like this:

class MCValue {
  const MCSymbolRefExpr *SymA = nullptr, *SymB = nullptr;
  int64_t Cst = 0;
  uint32_t RefKind = 0;
};
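To make the shape of such a value concrete, here is a toy sketch of a relocatable expression of the form SymA - SymB + Cst. Everything in it is an assumption made for illustration: `ToySymbol`, `ToyValue`, and `foldToConstant` are hypothetical names, symbols are reduced to a section ID plus an offset, the RefKind field is omitted, and complications such as linker relaxation are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins, not LLVM's MCSymbol or MCValue.
struct ToySymbol {
  int section;          // ID of the section that defines the symbol
  std::uint64_t offset; // offset of the symbol within that section
};

// Models a value of the form SymA - SymB + Cst.
struct ToyValue {
  const ToySymbol *symA; // added symbol, or nullptr
  const ToySymbol *symB; // subtracted symbol, or nullptr
  std::int64_t cst;      // constant addend
};

// Returns true and stores the result in out if the value is a constant
// known at assembly time. Returns false if a relocation would be needed.
inline bool foldToConstant(const ToyValue &v, std::int64_t &out) {
  if (!v.symA && !v.symB) {
    out = v.cst;
    return true;
  }
  // In this toy model, the difference of two symbols defined in the same
  // section does not depend on where the section is finally placed.
  if (v.symA && v.symB && v.symA->section == v.symB->section) {
    out = static_cast<std::int64_t>(v.symA->offset - v.symB->offset) + v.cst;
    return true;
  }
  return false;
}
```

A value that cannot be folded this way, for example one that references a single symbol or two symbols from different sections, is the kind of expression that ends up being described by a relocation.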


Relocation generation in assemblers

This post explores how the GNU assembler and the LLVM integrated assembler generate relocations, an important step in producing a relocatable file. Relocations identify parts of instructions or data that cannot be fully determined during assembly because they depend on the final memory layout, which is only established at link time or load time. These are essentially placeholders that will be filled in (typically with absolute addresses or PC-relative offsets) during the linking process.

Relocation generation: the basics

Symbol references are the primary candidates for relocations. For instance, in the x86-64 instruction movl sym(%rip), %eax (GNU syntax), the assembler calculates the displacement between the program counter (PC) and sym. This distance affects the instruction's encoding and typically triggers an R_X86_64_PC32 relocation, unless sym is a local symbol defined within the current section.
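The arithmetic behind that displacement can be sketched as follows. The x86-64 psABI defines R_X86_64_PC32 as S + A - P, where S is the symbol address, A the addend, and P the address of the relocated field. The helper names `pc32` and `ripDisp` are hypothetical, and the sketch assumes the usual 6-byte encoding of this instruction (opcode 0x8b, ModRM 0x05, then a 4-byte displacement), so the field starts 2 bytes into the instruction and the addend is -4.

```cpp
#include <cassert>
#include <cstdint>

// R_X86_64_PC32 computes S + A - P, truncated to 32 bits:
//   S = symbol address, A = addend, P = address of the relocated field.
inline std::int32_t pc32(std::uint64_t S, std::int64_t A, std::uint64_t P) {
  return static_cast<std::int32_t>(S + static_cast<std::uint64_t>(A) - P);
}

// Displacement stored in `movl sym(%rip), %eax` located at insnAddr.
// %rip refers to the end of the instruction, which is 4 bytes past the
// start of the displacement field, hence the addend of -4.
inline std::int32_t ripDisp(std::uint64_t insnAddr, std::uint64_t symAddr) {
  const std::uint64_t fieldOffset = 2; // after opcode 0x8b and ModRM 0x05
  return pc32(symAddr, -4, insnAddr + fieldOffset);
}
```

With the instruction at 0x1000 and sym at 0x2000, the stored displacement is 0xffa: the instruction ends at 0x1006, and 0x1006 + 0xffa is 0x2000.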

Both the GNU assembler and LLVM integrated assembler utilize multiple passes during assembly, with several key phases relevant to relocation generation:


Migrating comments to giscus

I followed this guide: https://www.patrickthurmond.com/blog/2023/12/11/commenting-is-available-now-thanks-to-giscus

Add the following to layout/_partial/article.ejs:

<% if (!index && post.comments) { %>
  <section class="giscus"></section>
  <script src="https://giscus.app/client.js"
          data-repo="MaskRay/maskray.me"
          data-repo-id="FILL IT UP"
          data-category="Blog Post Comments"
          data-category-id="FILL IT UP"
          data-mapping="pathname"
          data-strict="0"
          data-reactions-enabled="1"
          data-emit-metadata="0"
          data-input-position="bottom"
          data-theme="preferred_color_scheme"
          data-lang="en"
          data-loading="lazy"
          crossorigin="anonymous"
          async>
  </script>
<% } %>

Unfortunately, comments from Disqus have not been migrated yet. If you've left comments in the past, thank you, and apologies that they are gone for now.

While you can create GitHub Discussions via the GraphQL API, I haven't found a solution that works out of the box. https://www.davidangulo.xyz/posts/dirty-ruby-script-to-migrate-comments-from-disqus-to-giscus/ provides a Ruby script, which looks promising but no longer works:

Failed to define value method for :name, because EnterpriseOrderField already responds to that method. Use `value_method:` to override the method name or `value_method: false` to disable Enum value method generation.
Failed to define value method for :name, because EnvironmentOrderField already responds to that method. Use `value_method:` to override the method name or `value_method: false` to disable Enum value method generation.
Failed to define value method for :name, because LabelOrderField already responds to that method. Use `value_method:` to override the method name or `value_method: false` to disable Enum value method generation.
...
.local/share/gem/ruby/3.3.0/gems/graphql-client-0.25.0/lib/graphql/client.rb:338:in `query': wrong number of arguments (given 2, expected 1) (ArgumentError)
from g.rb:42:in `create_discussion'