Relocations¶

Relocations are present only in object files, not in linked outputs such as executables and dylibs. They tell the linker which parts of the binary hold addresses that the link step needs to fix up. They are stored as per-section lists of the following struct:

struct relocation_info {
    int32_t r_address;       // offset in the section to what is being relocated
    uint32_t r_symbolnum:24, // symbol index or section ordinal
             r_pcrel:1,      // whether the relocation is PC-relative
             r_length:2,     // log2 size of the item to be relocated
             r_extern:1,     // whether r_symbolnum refers to a symbol or a section
             r_type:4;       // target-specific relocation type
};

r_extern indicates how we should interpret r_symbolnum. For extern relocations, r_symbolnum is an index into the file’s symbol table; for local relocations, it instead stores the ordinal of a section in the object file.
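As an illustration of the layout above, here is a minimal sketch that decodes one raw 8-byte entry by hand instead of relying on compiler-specific bit-field ordering. It assumes a little-endian Mach-O file; the `decoded_reloc` type and the `decode_reloc` and `read_le32` helpers are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one relocation entry (hypothetical helper type). */
struct decoded_reloc {
    int32_t  r_address;
    uint32_t r_symbolnum;
    uint8_t  r_pcrel;
    uint8_t  r_length;
    uint8_t  r_extern;
    uint8_t  r_type;
};

/* Read a little-endian 32-bit word regardless of host endianness. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumption: little-endian file, so the bit-fields are packed starting
 * from the least significant bit of the second word. */
struct decoded_reloc decode_reloc(const uint8_t *raw) {
    assert(raw != NULL);
    uint32_t word0 = read_le32(raw);
    uint32_t word1 = read_le32(raw + 4);

    struct decoded_reloc r;
    r.r_address   = (int32_t)word0;
    r.r_symbolnum = word1 & 0x00ffffffu;           /* low 24 bits */
    r.r_pcrel     = (uint8_t)((word1 >> 24) & 1);
    r.r_length    = (uint8_t)((word1 >> 25) & 3);
    r.r_extern    = (uint8_t)((word1 >> 27) & 1);
    r.r_type      = (uint8_t)((word1 >> 28) & 0xf);
    return r;
}
```

A caller would then branch on `r_extern` to decide whether `r_symbolnum` is a symbol table index or a section ordinal.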

One “missing” piece of information is the addend, i.e. the offset that should be added to the target symbol or section’s address. Mach-O encodes the addend directly in the section contents (for code, the instruction stream) instead of within the relocation info struct. That is, we must read the bytes at r_address to find it.
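Reading the implicit addend might look like the sketch below. It assumes little-endian section contents and treats the stored field as a signed value of 1 << r_length bytes; `read_addend` is a hypothetical helper, not part of any real linker API.

```c
#include <assert.h>
#include <stdint.h>

/* Read the addend stored in the section bytes at r_address.
 * r_length is log2 of the field size: 0 = 1 byte, 1 = 2, 2 = 4, 3 = 8.
 * Assumption: the field is little-endian and is interpreted as signed. */
int64_t read_addend(const uint8_t *section, int32_t r_address,
                    unsigned r_length) {
    assert(r_address >= 0 && r_length <= 3);
    const uint8_t *p = section + r_address;
    unsigned size = 1u << r_length;

    /* Assemble the bytes least significant first. */
    uint64_t v = 0;
    for (unsigned i = 0; i < size; i++)
        v |= (uint64_t)p[i] << (8 * i);

    /* Sign-extend fields narrower than 64 bits. */
    if (size < 8) {
        uint64_t sign = 1ull << (8 * size - 1);
        v = (v ^ sign) - sign;
    }
    return (int64_t)v;
}
```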

If the linker can determine the final address of the target at link time, it will update the bytes at r_address accordingly. If not, it will emit binding instructions that tell dyld how to resolve these addresses at load time. See Dynamic Binding for details. Either way, the relocation entry itself is elided from the final binary.

Relocations are inspectable via llvm-readobj --relocations --expand-relocs.

Below are the semantics of the relocation r_type values for x86-64 and ARM64, as well as assembly snippets that show what inputs would cause the assembler to emit these relocations.

X86_64¶

X86_64_RELOC_UNSIGNED¶

Resolves to an absolute address. Naturally, r_pcrel is always false here. The relocation is “unsigned” since addresses cannot be negative.

Example:

.data
.quad _foo        # Absolute address of _foo
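A rough sketch of how a linker could resolve this relocation for an 8-byte field (r_length of 3): add the addend already stored at r_address to the target's final address and write the sum back. The function name and the assumption of a little-endian 64-bit field are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Resolve an 8-byte absolute relocation in place.
 * Assumption: r_length == 3 and the section bytes are little-endian. */
void apply_unsigned(uint8_t *section, int32_t r_address,
                    uint64_t target_addr) {
    assert(r_address >= 0);
    uint8_t *p = section + r_address;

    /* The addend is whatever the assembler left in the field. */
    uint64_t addend = 0;
    for (int i = 0; i < 8; i++)
        addend |= (uint64_t)p[i] << (8 * i);

    /* Overwrite the field with the final absolute address. */
    uint64_t value = target_addr + addend;
    for (int i = 0; i < 8; i++)
        p[i] = (uint8_t)(value >> (8 * i));
}
```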

X86_64_RELOC_SIGNED¶

Resolves to an address offset, relative to the current instruction pointer (%rip).

Example:

leaq _foo(%rip), %rax   # Load address of _foo into %rax
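The displacement for such a reference can be sketched as follows. On x86-64, %rip holds the address of the next instruction, so this example assumes the 4-byte displacement field is the last thing in the instruction (as it is for the leaq above). `pcrel_disp32` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the 32-bit %rip-relative displacement for a fixup.
 * fixup_addr is the final address of the 4-byte field itself.
 * Assumption: the field ends the instruction, so %rip == fixup_addr + 4. */
int32_t pcrel_disp32(uint64_t target_addr, int64_t addend,
                     uint64_t fixup_addr) {
    int64_t disp = (int64_t)(target_addr + (uint64_t)addend)
                 - (int64_t)(fixup_addr + 4);
    /* The result must fit in the signed 32-bit field. */
    assert(disp >= INT32_MIN && disp <= INT32_MAX);
    return (int32_t)disp;
}
```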

X86_64_RELOC_BRANCH¶

References a function. For statically linked functions, this resolves to an address offset relative to the current instruction pointer (%rip).

For dynamically linked functions, this resolves to an entry in the stubs section (what ELF calls the Procedure Linkage Table, or PLT).

Example:

callq _foo              # Call function _foo
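The choice between the two cases can be sketched like this. The `call_target` struct is a hypothetical summary of what a linker knows about the callee, and the example assumes a zero addend and a callq whose 4-byte displacement ends the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of what the linker knows about a call target. */
struct call_target {
    int      in_dylib;     /* nonzero if the function lives in a dylib */
    uint64_t symbol_addr;  /* final address, used when in_dylib == 0 */
    uint64_t stub_addr;    /* address of its stub, used when in_dylib != 0 */
};

/* Compute the displacement to store in the call instruction.
 * Assumption: zero addend; %rip == fixup_addr + 4 when the call executes. */
int32_t branch_disp32(const struct call_target *t, uint64_t fixup_addr) {
    /* Dynamically linked callees are reached through their stub. */
    uint64_t dest = t->in_dylib ? t->stub_addr : t->symbol_addr;
    int64_t disp = (int64_t)dest - (int64_t)(fixup_addr + 4);
    assert(disp >= INT32_MIN && disp <= INT32_MAX);
    return (int32_t)disp;
}
```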

X86_64_RELOC_GOT_LOAD¶

Resolves to an address within the Global Offset Table. Only used with mov opcodes that reference symbols which may be dynamically loaded (i.e. live in a dylib).

If the symbol ends up being statically linked, we don’t need to go through the GOT, and can instead reference the symbol directly by turning the mov into a lea opcode.

Example:

movq _foo@GOTPCREL(%rip), %rax   # Load address of _foo from GOT

If _foo ends up being statically linked, the above can be optimized to:

leaq _foo(%rip), %rax             # Load address of _foo directly
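At the byte level, this rewrite amounts to changing a single opcode byte: the movq above encodes as 48 8b 05 followed by the displacement, and the leaq as 48 8d 05 followed by the displacement. A minimal sketch, assuming r_address points at the 4-byte displacement so the opcode sits two bytes before it (`relax_got_load` is a hypothetical helper):

```c
#include <assert.h>
#include <stdint.h>

/* Try to turn "movq sym@GOTPCREL(%rip), reg" into "leaq sym(%rip), reg".
 * Returns 1 if the opcode was rewritten, 0 if the instruction is not a mov
 * and was left untouched.
 * Assumption: layout is [REX][opcode][ModRM][disp32], with r_address
 * pointing at disp32. */
int relax_got_load(uint8_t *section, int32_t r_address) {
    assert(r_address >= 2);
    uint8_t *opcode = section + r_address - 2;
    if (*opcode != 0x8b)   /* 0x8b: mov r64, r/m64 */
        return 0;
    *opcode = 0x8d;        /* 0x8d: lea r64, m */
    return 1;
}
```

After the rewrite, the displacement would need to be computed against the symbol itself rather than its GOT slot.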

X86_64_RELOC_GOT¶

Resolves to an address within the Global Offset Table. Used for all non-mov opcodes. No optimization can be done even if the symbol ends up being statically linked.

Example:

pushq _foo@GOTPCREL(%rip)        # Push address of _foo from GOT

X86_64_RELOC_TLV¶

References a thread-local variable. Resolves to an address within the __thread_ptrs section, which, like the GOT, is an array of address values.

Example:

movq _tlv_var@TLVP(%rip), %rdi   # Load TLV descriptor address
callq *(%rdi)                    # Call TLV getter function

X86_64_RELOC_SUBTRACTOR¶

Used to encode the difference between two symbol addresses.

Example:

.quad _foo - _bar

To encode the two separate referents (_foo and _bar), the assembler emits a pair of relocations: a SUBTRACTOR one whose r_symbolnum points at _bar, immediately followed by an UNSIGNED one whose r_symbolnum points at _foo. Both have the same r_address.
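Resolving such a pair for an 8-byte field can be sketched as below: subtract the SUBTRACTOR target from the UNSIGNED target, add the stored addend, and write the result back. The function name and the little-endian 64-bit field are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Resolve a SUBTRACTOR/UNSIGNED pair that shares one r_address.
 * unsigned_target:   final address of the UNSIGNED referent (_foo)
 * subtractor_target: final address of the SUBTRACTOR referent (_bar)
 * Assumption: 8-byte little-endian field (r_length == 3).
 * Returns the value written. */
int64_t apply_subtractor_pair(uint8_t *section, int32_t r_address,
                              uint64_t unsigned_target,
                              uint64_t subtractor_target) {
    assert(r_address >= 0);
    uint8_t *p = section + r_address;

    /* The addend is whatever the assembler left in the field. */
    uint64_t addend = 0;
    for (int i = 0; i < 8; i++)
        addend |= (uint64_t)p[i] << (8 * i);

    /* Unsigned arithmetic wraps, giving the two's-complement difference. */
    uint64_t value = unsigned_target - subtractor_target + addend;
    for (int i = 0; i < 8; i++)
        p[i] = (uint8_t)(value >> (8 * i));
    return (int64_t)value;
}
```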

ARM64¶

???
