elf: fix MIPS64 little-endian relocation parsing #519

Merged
m4b merged 3 commits into m4b:master from messense:fix-mips64-reloc-parsing
Mar 16, 2026
@messense
Contributor

Summary

Fixes #274 — MIPS64 LE ELF binaries fail to parse with an out-of-bounds error because r_sym and r_type return garbage values from the non-standard MIPS64 relocation layout.

Background

MIPS64 ELF uses a non-standard relocation info (r_info) layout different from standard ELF64:

  • Standard ELF64: r_sym (32 bits) | r_type (32 bits)
  • MIPS64: r_sym (32 bits) | r_ssym (8 bits) | r_type3 (8 bits) | r_type2 (8 bits) | r_type (8 bits)

On little-endian MIPS64, when this struct is read as a single u64, the byte order causes the fields to be scrambled. For example, an r_info with r_sym=0 and r_type=R_MIPS_REL32 gets read as 0x0312000000000000, causing:

  • r_sym to return 51,511,296 (instead of 0)
  • r_type to return 0 (instead of 3)

The bogus r_sym then causes the dynamic symbol table parsing to try to read millions of symbols, failing with an out-of-bounds error.
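
The scrambling can be reproduced in a few lines. The following is a minimal sketch, not code from this PR: `mips64el_raw_info` is a hypothetical helper, and the byte assignment (r_type2 = 0x12, r_type = 0x03) is inferred from the example value quoted above.

```rust
/// Standard ELF64 accessors: r_sym is the high 32 bits, r_type the low 32 bits.
fn r_sym(info: u64) -> u32 {
    (info >> 32) as u32
}

fn r_type(info: u64) -> u32 {
    (info & 0xffff_ffff) as u32
}

/// Hypothetical helper (not part of goblin): lay out a MIPS64 r_info the way
/// it sits in a little-endian file, then read the 8 bytes back as a single
/// little-endian u64, which is what a layout-unaware parser does.
fn mips64el_raw_info(sym: u32, ssym: u8, type3: u8, type2: u8, ty: u8) -> u64 {
    let s = sym.to_le_bytes();
    // r_sym is a 4-byte little-endian word, followed by four single bytes.
    let bytes = [s[0], s[1], s[2], s[3], ssym, type3, type2, ty];
    u64::from_le_bytes(bytes)
}
```

With `mips64el_raw_info(0, 0, 0, 0x12, 0x03)` the raw value is 0x0312000000000000, so the standard accessors report r_sym = 51,511,296 and r_type = 0, matching the numbers above.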

Approach

Based on the maintainer's feedback from #382, pass down whether it's a MIPS binary at runtime rather than using #[cfg] compile-time conditionals:

  1. reloc64::mips64el_r_info() — Converts MIPS64 LE r_info to standard ELF64 format using the same byte transformation as LLVM and the object crate

  2. Reloc::fixup_mips64el() — Applies the transformation after parsing a relocation entry

  3. RelocSection / RelocIterator — Carry an is_mips64el flag and conditionally apply the fixup during get() and iteration

  4. Elf::parse_with_opts() — Detects MIPS64 LE from the ELF header (e_machine == EM_MIPS && 64-bit && little-endian) and passes it through via RelocSection::parse_inner()
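
Steps 1 and 4 can be sketched as follows. This is an illustration under stated assumptions rather than the code merged here: the function bodies are written from the byte transformation used by LLVM and the `object` crate, `is_mips64el` is a hypothetical free function, and the constants are the standard ELF values.

```rust
// Standard ELF header constants.
const EM_MIPS: u16 = 8;
const ELFCLASS64: u8 = 2;
const ELFDATA2LSB: u8 = 1;

/// Hypothetical detection helper: true only for 64-bit little-endian MIPS.
fn is_mips64el(e_machine: u16, ei_class: u8, ei_data: u8) -> bool {
    e_machine == EM_MIPS && ei_class == ELFCLASS64 && ei_data == ELFDATA2LSB
}

/// Rearrange a MIPS64 LE r_info (a little-endian 32-bit symbol index followed
/// by four single bytes) into the standard ELF64 `(sym << 32) | type` layout.
fn mips64el_r_info(info: u64) -> u64 {
    (info << 32)                          // r_sym moves to the high 32 bits
        | ((info >> 8) & 0xff00_0000)     // r_ssym
        | ((info >> 24) & 0x00ff_0000)    // r_type3
        | ((info >> 40) & 0x0000_ff00)    // r_type2
        | ((info >> 56) & 0x0000_00ff)    // r_type
}
```

Applied to the example value 0x0312000000000000, the transformation yields 0x1203: the high 32 bits (r_sym) are 0 and the low byte (the primary r_type) is 3, with r_type2 kept in the next byte.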

Backward Compatibility

  • RelocSection::parse() signature is unchanged (defaults is_mips64el = false)
  • RelocCtx type alias is unchanged
  • r_sym() / r_type() / r_info() functions are unchanged
  • All existing tests pass

Tests

Added 5 new tests:

  • test_mips64el_r_info — validates the byte transformation with real data from #274 (MIPS64 parse error: "type is too big (1236271128) for 137416")
  • test_mips64el_r_info_with_sym — validates transformation when r_sym is non-zero
  • test_mips64el_r_info_zero — edge case: all-zero r_info
  • test_standard_r_sym_r_type_unchanged — ensures no regression for non-MIPS
  • test_mips64el_reloc_section_parse — integration test through the full RelocSection pipeline, verifying both broken (without fix) and correct (with fix) behavior

References

@messense force-pushed the fix-mips64-reloc-parsing branch from 0d7b9ab to 27b25a2 on February 28, 2026 07:31
Fix #274: MIPS64 ELF uses a non-standard relocation info layout where
the r_info field contains r_sym (32 bits) | r_ssym (8 bits) | r_type3
(8 bits) | r_type2 (8 bits) | r_type (8 bits), instead of the standard
ELF64 format of (sym << 32) | type.

On little-endian MIPS64 systems, when this struct is read as a single
u64, the byte order causes the fields to be scrambled. This resulted in
r_sym returning garbage values (e.g., 51511296 instead of 0), which then
caused the dynamic symbol table parsing to try to read millions of
symbols, failing with an out-of-bounds error.

The fix applies a byte transformation (matching the approach used by
LLVM and the `object` crate) to rearrange the MIPS64 LE r_info bytes
into the standard ELF64 format before extracting r_sym and r_type.

Changes:
- Add `reloc64::mips64el_r_info()` to convert MIPS64 LE r_info to
  standard ELF64 format
- Add `Reloc::fixup_mips64el()` to apply the transformation after
  parsing
- Add `is_mips64el` flag to `RelocSection` and `RelocIterator` to
  conditionally apply the fixup during iteration/access
- Add `RelocSection::parse_inner()` that accepts the `is_mips64el` flag
- Detect MIPS64 LE in `Elf::parse_with_opts()` from the ELF header's
  e_machine and endianness
- Backward compatible: `RelocSection::parse()` still works unchanged
  (defaults to no MIPS64 fixup)
- Add comprehensive unit and integration tests
@messense force-pushed the fix-mips64-reloc-parsing branch from 27b25a2 to 9dfc91e on February 28, 2026 07:40
Owner

@m4b m4b left a comment

First of all, thank you for your patience; I was extremely sick last week, and was unable to review.

Anyway, thank you for this PR, this is great! I like that it's backwards compatible, though it is tempting to make some of the backwards incompatible changes for a better api surface.

The main issue is whether my suggestions are backwards compatible? At first I thought they were not, but then I was surprised to see that RelocCtx was a non-pub typealias, which iiuc, would mean the trait implementation (which I thought were always public?) depends on a private type, which I assume means this is not a breaking change to change the TryFromCtx impl? E.g., no one can currently write down RelocCtx, or construct it, so I can't see how it would be a breaking change, but there may be some subtlety here I'm not seeing right now.

Assuming this is correct, I believe we should do the changes I suggest (putting the mips boolean into the RelocCtx, and doing the fixup in the TryFromCtx impl)

If this is not true (though semantically I don't quite understand how the trait impl could be public, somehow breaking for users, but also depend on a private type you can't even name due to visibility, e.g., in swift this is a compile time error), then we should proceed with yours and we can do the proper Ctx based implementation as a breaking change.

Thanks again for this great PR and pushing this over the finish line!

/// fixup. If you are parsing a MIPS64 LE binary, use [`Elf::parse`] or
/// [`Elf::parse_with_opts`] instead, which automatically detect MIPS64 LE
/// and apply the necessary `r_info` byte transformation.
pub fn parse(bytes: &'a [u8], offset: usize, filesz: usize, is_rela: bool, ctx: Ctx) -> crate::error::Result<RelocSection<'a>> {
Owner

if we want to make this a breaking change I think the better api here is to have caller construct the reloc ctx, so we'd make RelocCtx public and do something like:

pub fn parse(bytes: &'a [u8], offset: usize, filesz: usize, ctx: RelocCtx) -> crate::error::Result<RelocSection<'a>> 

I don't think we should do this though in this patch, if this change isn't already breaking (which it doesn't appear to be, even with suggestion about doing the mips logic in the Ctx)

@m4b
Owner

m4b commented Mar 9, 2026

Oh I should also mention the only other "oddity" here is that Reloc::size(is_rela, Ctx) is pub, but thankfully size doesn't depend on the mips property, so we can still keep it as is for now, unless we want to do a breaking change later as suggested with RelocCtx for the parse method, in which case we would also update size. Anyway, not a big deal.

Address code review feedback:
- Convert RelocCtx from a type alias (bool, Ctx) to a struct with
  is_rela, is_mips64el, and ctx fields
- Move fixup_mips64el() call from RelocSection::get() and
  RelocIterator::next() into the TryFromCtx implementation
- Remove is_mips64el field from RelocSection and RelocIterator
  since RelocCtx now carries it

This is a non-breaking change since RelocCtx was not pub.
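
A rough sketch of the shape this commit describes: a context struct with is_rela, is_mips64el, and ctx fields, with the fixup applied inside the parsing step. Everything here is an illustrative stand-in, not goblin's code: `Ctx` is reduced to an endianness flag, `parse_reloc64` is a hypothetical plain function in place of the scroll `TryFromCtx` impl, and only the 64-bit layout is handled.

```rust
/// Stand-in for the container context (assumption: only endianness matters here).
#[derive(Clone, Copy)]
struct Ctx {
    little_endian: bool,
}

/// Context struct with the three fields named in the commit message.
#[derive(Clone, Copy)]
struct RelocCtx {
    is_rela: bool,
    is_mips64el: bool,
    ctx: Ctx,
}

struct Reloc {
    r_offset: u64,
    r_addend: Option<i64>,
    r_sym: u32,
    r_type: u32,
}

fn read_u64(bytes: &[u8], at: usize, ctx: Ctx) -> Option<u64> {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes.get(at..at + 8)?);
    Some(if ctx.little_endian {
        u64::from_le_bytes(raw)
    } else {
        u64::from_be_bytes(raw)
    })
}

/// Same byte transformation as LLVM and the `object` crate.
fn mips64el_r_info(info: u64) -> u64 {
    (info << 32)
        | ((info >> 8) & 0xff00_0000)
        | ((info >> 24) & 0x00ff_0000)
        | ((info >> 40) & 0x0000_ff00)
        | ((info >> 56) & 0x0000_00ff)
}

/// Hypothetical parser: returns the entry and the number of bytes consumed.
fn parse_reloc64(bytes: &[u8], rctx: RelocCtx) -> Option<(Reloc, usize)> {
    let r_offset = read_u64(bytes, 0, rctx.ctx)?;
    let mut r_info = read_u64(bytes, 8, rctx.ctx)?;
    if rctx.is_mips64el {
        // The fixup lives in the parsing step, so callers never see raw r_info.
        r_info = mips64el_r_info(r_info);
    }
    let (r_addend, size) = if rctx.is_rela {
        (Some(read_u64(bytes, 16, rctx.ctx)? as i64), 24)
    } else {
        (None, 16)
    };
    let reloc = Reloc {
        r_offset,
        r_addend,
        r_sym: (r_info >> 32) as u32,
        r_type: r_info as u32,
    };
    Some((reloc, size))
}
```

Because the flag travels in the context, a section or iterator type built on top of this would not need its own is_mips64el field, which is the simplification the commit message lists.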
@messense
Contributor Author

messense commented Mar 9, 2026

Thanks for the review, hope you are getting better!

@messense messense requested a review from m4b March 9, 2026 12:05
@philipc
Collaborator

philipc commented Mar 9, 2026

While RelocCtx isn't public, it is an alias for a tuple of public types, so that tuple can still manage to leak into the public API, such as in these trait implementations: https://docs.rs/goblin/0.10.5/goblin/elf/reloc/struct.Reloc.html#impl-SizeWith%3C(bool,+Ctx)%3E-for-Reloc. Changing RelocCtx to a private struct means that SizeWith implementation is removed from the public API. In practice I doubt this matters, just depends how strict you want to be for breaking changes.

Owner

@m4b m4b left a comment

This is great; a few minor changes and then it's ready, I think! I'm on the fence whether we take these latest changes and make it a breaking change, or just add it in a minor release (e.g., moving from tuple to private struct).
As philipc points out, tuples are not nominal, but structural, in rust, so even though the alias is private, one can still "name" (e.g., construct) the type that is implemented, since it uses two public types (bool, Ctx).

in practice I highly doubt anyone is relying on the SizeWith and TryFromCtx impls being visible or what the ctx for the impl is, but it is technically a breaking change to change Reloc's ctx from a tuple to a struct.
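
The structural-typing point can be shown with a self-contained sketch. The names mirror the discussion, but the module, the `SizeWith` trait, and `Ctx` here are simplified stand-ins written for illustration, not scroll's or goblin's definitions.

```rust
mod reloc {
    /// Stand-in for the public container context.
    #[derive(Clone, Copy)]
    pub struct Ctx {
        pub is_64: bool,
    }

    // Private alias: code outside this module cannot write `RelocCtx`...
    type RelocCtx = (bool, Ctx);

    /// Simplified stand-in for a size-with-context trait.
    pub trait SizeWith<C> {
        fn size_with(ctx: &C) -> usize;
    }

    pub struct Reloc;

    // ...but the impl is really for the tuple `(bool, Ctx)`, which is built
    // from public types, so it is part of the public API.
    impl SizeWith<RelocCtx> for Reloc {
        fn size_with(&(is_rela, ctx): &RelocCtx) -> usize {
            match (ctx.is_64, is_rela) {
                (true, true) => 24,  // Elf64_Rela
                (true, false) => 16, // Elf64_Rel
                (false, true) => 12, // Elf32_Rela
                (false, false) => 8, // Elf32_Rel
            }
        }
    }
}

/// "Downstream" code: names the impl through the tuple, never the alias.
pub fn downstream_size(is_rela: bool) -> usize {
    use reloc::{Ctx, Reloc, SizeWith};
    <Reloc as SizeWith<(bool, Ctx)>>::size_with(&(is_rela, Ctx { is_64: true }))
}
```

If the alias were replaced by a private struct, `downstream_size` would stop compiling, which is why the change is technically breaking even though the alias itself was never nameable.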

One thing we could do is take your first initial commit, merge that, do a minor release, then take your 2nd commit and make that a part of a breaking release if we want to be really particular.

A nice compromise I think I will end up doing is just do a minor release with the RelocCtx change (the requested changes), and if it breaks anyone, I'll yank, and release with the first change, and put the RelocCtx as a breaking change.

- Remove RelocCtx::new() constructor, use struct literal instead to
  make is_mips64el: false explicit at call sites
- Change IntoCtx<(bool, Ctx)> to IntoCtx<RelocCtx> for consistency
  with TryIntoCtx
- Handle is_mips64el in TryIntoCtx for round-trip correctness
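
For round-trip correctness, writing a relocation back out has to undo the transformation applied on read. The commit does not show how this is done; the sketch below is one way to express it, with `mips64el_r_info_inverse` as a hypothetical name, derived by reversing each shift of the read-side transformation.

```rust
/// Read side: MIPS64 LE r_info -> standard ELF64 layout.
fn mips64el_r_info(info: u64) -> u64 {
    (info << 32)
        | ((info >> 8) & 0xff00_0000)
        | ((info >> 24) & 0x00ff_0000)
        | ((info >> 40) & 0x0000_ff00)
        | ((info >> 56) & 0x0000_00ff)
}

/// Write side (hypothetical name): standard ELF64 layout -> MIPS64 LE r_info.
/// Each term reverses the corresponding shift of the read-side function.
fn mips64el_r_info_inverse(info: u64) -> u64 {
    (info >> 32)                          // r_sym back to the low 32 bits
        | ((info & 0xff00_0000) << 8)     // r_ssym
        | ((info & 0x00ff_0000) << 24)    // r_type3
        | ((info & 0x0000_ff00) << 40)    // r_type2
        | ((info & 0x0000_00ff) << 56)    // r_type
}
```

Both functions only permute bytes, so composing them in either order returns the original value; for example, the standard-layout value 0x1203 maps back to 0x0312000000000000.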
@messense messense requested a review from m4b March 10, 2026 11:45
Owner

@m4b m4b left a comment

awesome, thanks for your patience and this PR @messense !

@m4b m4b merged commit 86de3b4 into m4b:master Mar 16, 2026
5 checks passed
Development

Successfully merging this pull request may close these issues.

MIPS64 parse error: "type is too big (1236271128) for 137416"

3 participants