elf: fix MIPS64 little-endian relocation parsing#519
Conversation
0d7b9ab to
27b25a2
Compare
Fix #274: MIPS64 ELF uses a non-standard relocation info layout where the r_info field contains r_sym (32 bits) | r_ssym (8 bits) | r_type3 (8 bits) | r_type2 (8 bits) | r_type (8 bits), instead of the standard ELF64 format of (sym << 32) | type. On little-endian MIPS64 systems, when this struct is read as a single u64, the byte order causes the fields to be scrambled. This resulted in r_sym returning garbage values (e.g., 51511296 instead of 0), which then caused the dynamic symbol table parsing to try to read millions of symbols, failing with an out-of-bounds error. The fix applies a byte transformation (matching the approach used by LLVM and the `object` crate) to rearrange the MIPS64 LE r_info bytes into the standard ELF64 format before extracting r_sym and r_type. Changes: - Add `reloc64::mips64el_r_info()` to convert MIPS64 LE r_info to standard ELF64 format - Add `Reloc::fixup_mips64el()` to apply the transformation after parsing - Add `is_mips64el` flag to `RelocSection` and `RelocIterator` to conditionally apply the fixup during iteration/access - Add `RelocSection::parse_inner()` that accepts the `is_mips64el` flag - Detect MIPS64 LE in `Elf::parse_with_opts()` from the ELF header's e_machine and endianness - Backward compatible: `RelocSection::parse()` still works unchanged (defaults to no MIPS64 fixup) - Add comprehensive unit and integration tests
27b25a2 to
9dfc91e
Compare
m4b
left a comment
There was a problem hiding this comment.
First of all, thank you for your patience; I was extremely sick last week, and was unable to review.
Anyway, thank you for this PR, this is great! I like that it's backwards compatible, though it is tempting to make some of the backwards imcompatible changes for a better api surface.
The main issue is whether my suggestions are backwards compatible? At first I thought they were not, but then I was surprised to see that RelocCtx was a non-pub typealias, which iiuc, would mean the trait implementation (which I thought were always public?) depends on a private type, which I assume means this is not a breaking change to change the TryFromCtx impl? E.g., no one can currently write down RelocCtx, or construct it, so I can't see how it would be a breaking change, but there may be some subtlety here I'm not seeing right now.
Assuming this is correct, I believe we should do the changes I suggest (putting the mips boolean into the RelocCtx, and doing the fixup in the TryFromCtx impl)
If this is not true (though semantically I don't' quite understand how the trait impl could be public, somehow breaking for users, but also depend on a private type you can't even name due to visibility, e.g., in swift this is a compile time error), then we should proceed with yours and we can do the proper Ctx based implementation as a breaking change.
Thanks again for this great PR and pushing this over the finish line!
| /// fixup. If you are parsing a MIPS64 LE binary, use [`Elf::parse`] or | ||
| /// [`Elf::parse_with_opts`] instead, which automatically detect MIPS64 LE | ||
| /// and apply the necessary `r_info` byte transformation. | ||
| pub fn parse(bytes: &'a [u8], offset: usize, filesz: usize, is_rela: bool, ctx: Ctx) -> crate::error::Result<RelocSection<'a>> { |
There was a problem hiding this comment.
if we want to make this a breaking change I think the better api here is to have caller construct the reloc ctx, so we'd make RelocCtx public and do something like:
pub fn parse(bytes: &'a [u8], offset: usize, filesz: usize, ctx: RelocCtx) -> crate::error::Result<RelocSection<'a>> I don't think we should do this though in this patch, if this change isn't already breaking (which it doesn't appear to be, even with suggestion about doing the mips logic in the Ctx)
|
Oh I should also mention the only other "oddity" here is that |
Address code review feedback: - Convert RelocCtx from a type alias (bool, Ctx) to a struct with is_rela, is_mips64el, and ctx fields - Move fixup_mips64el() call from RelocSection::get() and RelocIterator::next() into the TryFromCtx implementation - Remove is_mips64el field from RelocSection and RelocIterator since RelocCtx now carries it This is a non-breaking change since RelocCtx was not pub.
|
Thanks for the review, hope you are getting better! |
|
While |
m4b
left a comment
There was a problem hiding this comment.
this is great few minor changes then ready I think! I'm on the fence whether we take these latest changes and make it a breaking change, or just add it in minor release (e.g., moving from tuple to private struct).
As philipc points out, tuples are not nominal, but structural, in rust, so even though the alias is private, one can still "name" (e.g., construct) the type that is implemented, since it uses two public types (bool, Ctx).
in practice I highly doubt anyone is relying on the SizeWith and TryFromCtx impls being visible or what the ctx for the impl is, but it is technically a breaking change to change Reloc's ctx from a tuple to a struct.
One thing we could do is take your first initial commit, merge that, do a minor release, then take your 2nd commit and make that apart of a breaking release if want to be really particular.
A nice compromise I think I will end up doing is just do a minor release with the RelocCtx change (the requested changes), and if it breaks anyone, I'll yank, and release with the first change, and put the RelocCtx as a breaking change.
- Remove RelocCtx::new() constructor, use struct literal instead to make is_mips64el: false explicit at call sites - Change IntoCtx<(bool, Ctx)> to IntoCtx<RelocCtx> for consistency with TryIntoCtx - Handle is_mips64el in TryIntoCtx for round-trip correctness
Summary
Fixes #274 — MIPS64 LE ELF binaries fail to parse with an out-of-bounds error because
r_symandr_typereturn garbage values from the non-standard MIPS64 relocation layout.Background
MIPS64 ELF uses a non-standard relocation info (
r_info) layout different from standard ELF64:r_sym(32 bits) |r_type(32 bits)r_sym(32 bits) |r_ssym(8 bits) |r_type3(8 bits) |r_type2(8 bits) |r_type(8 bits)On little-endian MIPS64, when this struct is read as a single
u64, the byte order causes the fields to be scrambled. For example, anr_infowithr_sym=0andr_type=R_MIPS_REL32gets read as0x0312000000000000, causing:r_symto return 51,511,296 (instead of 0)r_typeto return 0 (instead of 3)The bogus
r_symthen causes the dynamic symbol table parsing to try to read millions of symbols, failing with an out-of-bounds error.Approach
Based on the maintainer's feedback from #382 — pass down whether it's a MIPS binary at runtime rather than using
#[cfg]compile-time conditionals:reloc64::mips64el_r_info()— Converts MIPS64 LEr_infoto standard ELF64 format using the same byte transformation as LLVM and theobjectcrateReloc::fixup_mips64el()— Applies the transformation after parsing a relocation entryRelocSection/RelocIterator— Carry anis_mips64elflag and conditionally apply the fixup duringget()and iterationElf::parse_with_opts()— Detects MIPS64 LE from the ELF header (e_machine == EM_MIPS && 64-bit && little-endian) and passes it through viaRelocSection::parse_inner()Backward Compatibility
RelocSection::parse()signature is unchanged (defaultsis_mips64el = false)RelocCtxtype alias is unchangedr_sym()/r_type()/r_info()functions are unchangedTests
Added 5 new tests:
test_mips64el_r_info— validates the byte transformation with real data from MIPS64 parse error: "type is too big (1236271128) for 137416" #274test_mips64el_r_info_with_sym— validates transformation when r_sym is non-zerotest_mips64el_r_info_zero— edge case: all-zero r_infotest_standard_r_sym_r_type_unchanged— ensures no regression for non-MIPStest_mips64el_reloc_section_parse— integration test through the fullRelocSectionpipeline, verifying both broken (without fix) and correct (with fix) behaviorReferences
#[cfg])objectcrate implementation