perf(ebpf): use fused get_value() API for shard lookups, removing raw…#76

Merged: zz85 merged 2 commits into main from fused-get-value, Feb 21, 2026

@zz85 zz85 commented Feb 21, 2026

… bpf_map_lookup_elem workaround

Replace the two-step shard_lookup (typed outer get() + raw inner bpf_map_lookup_elem) with the fused ArrayOfMaps::get_value() API from Brskt's updated hashmapofmaps-new branch. The fused method performs both outer and inner bpf_map_lookup_elem calls without intermediate struct indirection (MapDef::as_ptr() on the inner map), which was causing the BPF verifier's state exploration to explode in the DWARF binary search loop.
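
For readers unfamiliar with the two shapes, here is a minimal plain-Rust model of them. It is not eBPF code and does not use aya; `ShardTable`, the `UnwindEntry` fields, and the `two_step`/`fused` helpers are hypothetical names chosen for illustration. Both shapes return the same entry. The only difference modelled here is that the two-step shape hands the caller an intermediate inner-map handle, while the fused shape does not.

```rust
/// Hypothetical stand-in for one unwind table row; the field names are
/// assumptions, not the project's real `UnwindEntry` layout.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct UnwindEntry {
    pub pc: u64,
    pub cfa_offset: i16,
}

/// Plain-Rust model of an array-of-maps: an outer array of shards,
/// each shard being an inner array of entries.
pub struct ShardTable {
    shards: Vec<Vec<UnwindEntry>>,
}

impl ShardTable {
    pub fn new(shards: Vec<Vec<UnwindEntry>>) -> Self {
        Self { shards }
    }

    /// Outer lookup only: the caller receives an intermediate inner handle.
    pub fn get(&self, shard_id: u32) -> Option<&[UnwindEntry]> {
        self.shards.get(shard_id as usize).map(|v| v.as_slice())
    }

    /// Fused lookup: outer then inner in one call, no intermediate exposed.
    pub fn get_value(&self, shard_id: u32, idx: &u32) -> Option<&UnwindEntry> {
        self.shards.get(shard_id as usize)?.get(*idx as usize)
    }
}

/// Two-step shape: the caller performs the inner lookup itself.
pub fn two_step(t: &ShardTable, shard_id: u32, idx: u32) -> Option<UnwindEntry> {
    let inner = t.get(shard_id)?;
    inner.get(idx as usize).copied()
}

/// Fused shape: a single call returns the entry.
pub fn fused(t: &ShardTable, shard_id: u32, idx: u32) -> Option<UnwindEntry> {
    t.get_value(shard_id, &idx).copied()
}
```

In this model the two functions are interchangeable; the verifier cost described above is a property of the real eBPF program and cannot be reproduced in userspace Rust.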

This also updates the userspace ArrayOfMaps usage to pass typed Array<MapData, UnwindEntryPod> references directly to set(), matching the updated aya API.
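
As a rough illustration of why passing a typed reference is preferable to passing a raw file descriptor, the sketch below models the idea in plain Rust. `InnerArray`, `OuterArrayOfMaps`, and `fd_at` are hypothetical names and this is not the aya API; the point is only that a `set()` taking `&InnerArray<V>` rejects an inner map with the wrong value type at compile time, which a bare integer fd cannot do.

```rust
use std::marker::PhantomData;

/// Hypothetical model of a typed inner array map; `fd` stands in for the
/// kernel file descriptor a real map handle would own.
pub struct InnerArray<V> {
    fd: i32,
    _value: PhantomData<V>,
}

impl<V> InnerArray<V> {
    pub fn new(fd: i32) -> Self {
        Self { fd, _value: PhantomData }
    }
}

/// Hypothetical model of an outer map whose slots hold inner-map fds.
pub struct OuterArrayOfMaps<V> {
    slots: Vec<Option<i32>>,
    _value: PhantomData<V>,
}

impl<V> OuterArrayOfMaps<V> {
    pub fn with_capacity(n: usize) -> Self {
        Self { slots: vec![None; n], _value: PhantomData }
    }

    /// Typed insertion: only an `InnerArray<V>` with the matching value
    /// type is accepted, so a mismatched inner map fails to compile.
    pub fn set(&mut self, index: u32, inner: &InnerArray<V>) -> Result<(), String> {
        let slot = self
            .slots
            .get_mut(index as usize)
            .ok_or_else(|| format!("index {index} out of bounds"))?;
        *slot = Some(inner.fd);
        Ok(())
    }

    /// Read back the fd stored in a slot, if any.
    pub fn fd_at(&self, index: u32) -> Option<i32> {
        self.slots.get(index as usize).copied().flatten()
    }
}
```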

Measured on kernel 6.14, 16 CPUs:

| Version | `profile_cpu` verification time |
|---|---|
| Raw `bpf_map_lookup_elem` (pre-#73) | 345ms |
| Typed two-step `get()` + `Array::get()` | 10.65s |
| Workaround: typed outer, raw inner | 345ms |
| Fused `get_value()` API (this commit) | 433ms |

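
To put the measurements in relative terms, the arithmetic can be sketched as below. The helper names are hypothetical; the inputs are the figures reported above, converted to milliseconds.

```rust
/// Ratio of two verification times given in milliseconds.
pub fn ratio(slower_ms: f64, faster_ms: f64) -> f64 {
    slower_ms / faster_ms
}

/// Relative overhead of `new_ms` over `base_ms`, as a percentage.
pub fn overhead_percent(new_ms: f64, base_ms: f64) -> f64 {
    (new_ms - base_ms) / base_ms * 100.0
}
```

With the reported numbers, `ratio(10650.0, 433.0)` is about 24.6, so the fused call verifies roughly 25 times faster than the typed two-step version, and `overhead_percent(433.0, 345.0)` is about 25.5, the cost relative to the raw workaround.
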
    # Run DWARF profile to check verifier time
    $ sudo ./target/release/probee --dwarf --time 3000 --collapse /dev/null 2>&1

    Starting Opt { collapse: Some("/dev/null"), svg: None, html: None, json: None, serve: false, tui: false, tui_refresh_ms: 2000, update_mode: "accumulate", no_tui_mouse: false, skip_idle: false, time: Some(3000), frequency: 99, kprobe: None, uprobe: [], uprobe_pid: None, list_probes: None, tracepoint: None, group_by_cpu: false, dwarf: Some(true), pid: None, cpu: None, self_profile: false, stream_mode: 2, cmd: None, command: [], off_cpu: false, min_block_time: 1, max_block_time: 18446744073709551615 }
    CPUs: 16
    eBPF verification completed in 433.000746ms
    Waiting for Ctrl-C...
    Processed 407 queue events
    Processing stacks...
    Total value: 5524 (samples or us off-CPU time)
    Writing to file: /dev/null
    PointerStackFramesCache entries: 73/16384, hits: 65, miss: 73, hit ratio: 47.10
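
The `433.000746ms` figure has the shape of Rust's `{:?}` formatting for `std::time::Duration`. A minimal sketch of how such a measurement could be taken follows; it assumes the tool wraps the program load in a wall-clock timer, which the log does not confirm, and `timed` and `report` are hypothetical helper names.

```rust
use std::time::{Duration, Instant};

/// Run `load` and return its result together with the elapsed wall time.
pub fn timed<T>(load: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = load();
    (out, start.elapsed())
}

/// Build a log line using `Duration`'s Debug formatting, which picks the
/// unit (s, ms, µs, ns) and trims trailing zeros.
pub fn report(elapsed: Duration) -> String {
    format!("eBPF verification completed in {:?}", elapsed)
}
```

Note that a wall-clock measurement around the load call would include any other load-time work, not only the verifier pass.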

Summary by CodeRabbit

  • Refactor

    • Optimized lookup flow for stack unwinding data to improve runtime performance and reduce verifier/inspection overhead.
    • Strengthened internal type handling and references for more robust, predictable behavior when loading and using shard data.
    • Removed redundant intermediate lookup steps to simplify internal logic and improve maintainability.
  • Documentation

    • Updated verifier and implementation notes to reflect the fused lookup approach and its performance implications.

coderabbitai bot commented Feb 21, 2026

📝 Walkthrough

Replaces two-step outer+inner eBPF map lookup with a fused typed get_value() call in the eBPF program and updates userspace to insert inner arrays by reference into ArrayOfMaps instead of passing raw file descriptors.

Changes

| Cohort / File(s) | Summary |
|---|---|
| eBPF shard lookup: `profile-bee-ebpf/src/lib.rs` | Replaces manual `bpf_map_lookup_elem` + inner-map lookup with `UNWIND_SHARDS.get_value(shard_id, &idx)` returning `UnwindEntry` directly; updates inline docs to reflect fused lookup. |
| Userspace map typing & insertion: `profile-bee/bin/profile-bee.rs`, `profile-bee/src/ebpf.rs` | Removes `InnerMap` import; makes `ArrayOfMaps` instantiation explicit with `Array<MapData, UnwindEntryPod>` as inner type and inserts inner arrays by reference (`&inner_array`) instead of using `inner_array.fd()`. |
| Docs / agent notes: `AGENTS.md` | Updates verifier/notes to describe fused `ArrayOfMaps::get_value()` pattern and rationale for avoiding two-step lookups. |

Sequence Diagram(s)

sequenceDiagram
  participant Userspace
  participant KernelMaps
  participant eBPF
  rect rgba(180,220,255,0.5)
    Userspace->>KernelMaps: create inner Array<MapData,UnwindEntryPod>
    Userspace->>KernelMaps: insert inner array reference into ArrayOfMaps (outer)
  end
  rect rgba(200,255,200,0.5)
    eBPF->>KernelMaps: UNWIND_SHARDS.get_value(shard_id, &idx)
    KernelMaps-->>eBPF: UnwindEntry (direct)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through maps with tiny paws so fleet,
No more two-step digging—one call is neat.
Inner arrays held close, by reference bound,
A single get returns what once was found.
Thump-thump! I celebrate with carrot-cheer 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: replacing raw map lookups with a fused `get_value()` API for shard lookups to improve performance. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@profile-bee-ebpf/src/lib.rs`:
- Around line 1089-1096: Update the "Critical Pitfalls" section in AGENTS.md to
reflect the current implementation: replace the statement that "shard_lookup()
uses an 8-way static match — dynamic array indexing fails verification" with a
note that shard_lookup() now uses the fused get_value() API
(UNWIND_SHARDS.get_value(shard_id as u32, &idx)) which performs both outer and
inner bpf_map_lookup_elem calls and avoids verifier state explosion; mention the
refactor (hashmapofmaps-new / Brskt) and that the fused approach is the
supported pattern instead of static match-based indexing.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@AGENTS.md`:
- Line 270: Update the "Key Dependencies" table entry that currently lists "aya
/ aya-ebpf" to explicitly require the Brskt fork and branch (Brskt/aya with the
hashmapofmaps-new branch) so readers install the correct aya variant; locate the
Key Dependencies table in AGENTS.md (the table starting near the previously
noted line) and change the aya entry to mention "Brskt/aya (hashmapofmaps-new
branch required)" or equivalent clarifying text and optionally add a brief note
referencing the shard_lookup() usage of ArrayOfMaps::get_value() to justify the
requirement.

 - All loops MUST be bounded (`for _ in 0..CONST`). No `while`, no dynamic bounds.
 - Map accesses must be bounds-checked. The verifier tracks these statically.
-- `shard_lookup()` uses an 8-way static `match` — dynamic array indexing fails verification.
+- `shard_lookup()` uses `ArrayOfMaps::get_value()` (from the Brskt/aya `hashmapofmaps-new` branch) which fuses the outer and inner `bpf_map_lookup_elem` into a single call without intermediate struct indirection. Using the two-step approach (`get()` then `Array::get()`) causes verifier state explosion (~10s vs ~400ms). The fused `get_value()` is the required pattern.
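
The bounded-loop rule and the binary search it constrains can be sketched in plain Rust as follows. This is an illustration under stated assumptions, not the project's code: `MAX_STEPS`, `bounded_search`, and the `lookup` closure (standing in for the fused map lookup) are hypothetical, and the table is assumed to be sorted by start address.

```rust
/// Fixed iteration cap; 20 halvings cover tables of up to 2^20 - 1
/// entries (an assumed shard size).
pub const MAX_STEPS: u32 = 20;

/// Find the index of the last entry whose start address is <= `pc`,
/// using a constant-bounded `for` loop instead of `while`.
/// `lookup` is a hypothetical stand-in for the fused map lookup.
pub fn bounded_search(len: u32, pc: u64, lookup: impl Fn(u32) -> Option<u64>) -> Option<u32> {
    let mut lo = 0u32;
    let mut hi = len; // search window is [lo, hi)
    let mut found = None;
    for _ in 0..MAX_STEPS {
        if lo >= hi {
            break;
        }
        let mid = lo + (hi - lo) / 2;
        // A failed lookup aborts the whole search with None,
        // mirroring a NULL result from a map helper.
        let start = lookup(mid)?;
        if start <= pc {
            found = Some(mid);
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    found
}
```

Each iteration performs one lookup, so whatever verifier state the lookup shape adds is paid once per unrolled step. That is consistent with the PR's observation that the lookup shape dominates verification time in this loop.
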
⚠️ Potential issue | 🟡 Minor

Update the Key Dependencies table to note the required aya fork.

Line 270 correctly documents the Brskt/aya hashmapofmaps-new branch requirement, but the Key Dependencies table at line 314 still lists plain aya / aya-ebpf without qualification. A contributor following only that table would install mainline aya and get a compilation error with no hint of why.

📝 Proposed fix to Key Dependencies table (line 314)
-| `aya` / `aya-ebpf` | eBPF loading, map access, program attachment |
+| `aya` / `aya-ebpf` (fork: `Brskt/aya hashmapofmaps-new` branch) | eBPF loading, map access, program attachment — requires the forked branch for `ArrayOfMaps::get_value()` |

@zz85 zz85 merged commit dd4ef69 into main Feb 21, 2026
5 checks passed
@zz85 zz85 deleted the fused-get-value branch February 21, 2026 19:15