Skip to content

Conversation

@vadorovsky
Copy link
Member

@vadorovsky vadorovsky commented Oct 13, 2025

Problem

Building them with itertools::into_group_map takes 20.9ms.

before

Summary of Changes

Building them manually with a pre-allocated hash map, using pubkey hasher, takes 5.2ms.

after_1

Ref: #8280

@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch 2 times, most recently from e42e215 to cb8fedb Compare October 13, 2025 15:14
@codecov-commenter
Copy link

codecov-commenter commented Oct 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.1%. Comparing base (d8dcdad) to head (80b2e04).
⚠️ Report is 16 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #8451     +/-   ##
=========================================
- Coverage    83.2%    83.1%   -0.1%     
=========================================
  Files         846      846             
  Lines      368550   368566     +16     
=========================================
- Hits       306656   306536    -120     
- Misses      61894    62030    +136     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch 2 times, most recently from 37b32a9 to b758fe5 Compare October 14, 2025 07:07
@vadorovsky vadorovsky marked this pull request as ready for review October 14, 2025 07:24
slot_leaders: Vec<Pubkey>,
// Inverted index from pubkeys to indices where they are the leader.
leader_slots_map: HashMap<Pubkey, Arc<Vec<usize>>>,
leader_slots_map: HashMap<Pubkey, Vec<usize>, PubkeyHasherBuilder>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

We might need to change this fn to not cloning the vec anymore since it is no longer an Arc.

Copy link
Member Author

@vadorovsky vadorovsky Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this!

I tried to remove this cloned() and make it work with Vec<usize> (without Arc), but the problem is that the returned iterator is used outside of the LeaderSchedule, so then compiler was rightfully complaining about lifetimes and returning references to local values.

So I decided to put the Arc back. Now I can see its purpose.

Copy link
Member Author

@vadorovsky vadorovsky Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, adding Arc back unfortunately increases the execution time from 5.2ms to 9.9ms. 🥲 But I have no better ideas and that's still better than cloning.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks reasonable to me.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find a way to remove Arc.

#8499

The Arc was needed because get_leader_upcoming_slots returned a boxed iterator requiring ownership of the Vec. By modifying get_leader_upcoming_slots to return a borrowed iterator (Box<dyn Iterator + '_>), and collecting the Arc references upfront in next_leader_slot, the schedules stay alive for the entire iterator chain, eliminating the need for Arc wrapping and avoiding intermediate Vec allocations.

@HaoranYi
Copy link

Excellent work! I have just one comment!

HaoranYi
HaoranYi previously approved these changes Oct 15, 2025
Copy link

@HaoranYi HaoranYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please wait for other reviewer's approval before merge.

@brooksprumo brooksprumo removed their request for review October 15, 2025 16:14
@brooksprumo
Copy link

I'm going to bow out and defer to the other reviewers since I'm OOO.

jstarry
jstarry previously approved these changes Oct 16, 2025
@jstarry
Copy link

jstarry commented Oct 16, 2025

We should probably review our uses of itertools across the codebase because many of the utility methods create hashsets and hashmaps which are not pre-allocated.

@vadorovsky
Copy link
Member Author

We should probably review our uses of itertools across the codebase because many of the utility methods create hashsets and hashmaps which are not pre-allocated.

I'm yet to profile all uses of into_group_map() in ledger/runtime/accounts-db, but my gut feeling for now is that we should move away from it entirely, especially given that the manual alternative I'm replacing it with is basically this few-liner:

let mut grouped = HashMap::with_capacity_and_hasher(cap, hasher);
for (key, value) in input {
    grouped_slot_leaders
        .entry(value)
        .and_modify(|keys| {
            keys.push(key);
    })
    .or_insert(vec![key]);
}

Perhaps I should already move it to some common function in solana-perf.

Given the lack of elasticity of itertools (no way of providing capacities and hashers), I think its usage is a footgun. I tried to make it accept custom hashers, but the PR got closed: rust-itertools/itertools#1057.

let slots = Arc::get_mut(slots).expect("should be the only reference");
slots.push(slot)
})
.or_insert(Arc::new(vec![slot]));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could also experiment with smallvec, with just a slot value you could keep like 2-3 elements inline without allocating

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Or maybe even arrayvec.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the profile, I think you see grow taking a lot of time, so if anything you should allocate with higher capacity?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it's Vec<usize>, yeah use arrayvec

}
None => HashMap::with_hasher(PubkeyHasherBuilder::default()),
};
for (slot, leader) in slot_leaders.iter().enumerate() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kamil's comment made me realize that slot_leaders will always be setup so that the same leader will always the same for 4 consecutive slots. We could use that fact to do more optimizing

@HaoranYi
Copy link

We should probably review our uses of itertools across the codebase because many of the utility methods create hashsets and hashmaps which are not pre-allocated.

Good point — applying the same pattern to staked_nodes in stake_cache gave us a 3–4× speed-up as well: #8516

Building them with `itertools::into_group_map` takes 20.9ms.

Building them manually with a pre-allocated hash map, using pubkey
hasher, takes 5.2ms.
@vadorovsky vadorovsky dismissed stale reviews from jstarry and HaoranYi via 80b2e04 October 21, 2025 08:49
@vadorovsky vadorovsky force-pushed the compute-epoch-schedule-group-map branch from cd7b5b6 to 80b2e04 Compare October 21, 2025 08:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants