
Conversation

@benlwalker
Contributor

What?

Change nixlbench to call makeXferReq on every I/O. The prepXferDlist calls are outside of the main I/O loop.

Why?

  1. We observe that all real software using NIXL calls either makeXferReq or createXferReq in the I/O path, immediately before posting the request.
  2. I cannot imagine a system that wouldn't call makeXferReq on demand. Once a request is created, it's tied to all of the ranges it will be transferring. There's no fathomable system that could know, at start-up time, all of the exact batches it will transfer over the life of the application.

Reason 1 alone is insufficient to justify the change, because all real software I can find also busy-polls in a loop after calling postXferReq, and that doesn't mean we should make NIXL synchronous. Instead, it's the combination of reasons 1 and 2 that justifies the change.

How?

For the GUSLI path there was already logic to call createXferReq in the I/O path. I first made that the behavior for every path, rather than just for GUSLI. Then I split createXferReq into its two-step equivalent: prepXferDlist calls up front and makeXferReq in the I/O path, because that's more efficient and still realistic.

Remove recreate_per_iteration parameter from execTransferIterations and
always recreate transfer requests for each iteration. This reflects
real-world usage, where at least makeXferReq is always called in the
hot path. No one does the exact same I/O over and over.

Signed-off-by: Ben Walker <[email protected]>
…akeXferReq

Replace createXferReq with the two-step approach of prepXferDlist and
makeXferReq. This optimization moves the prepXferDlist calls outside
the iteration loop, avoiding redundant descriptor list preparation on
each iteration.

While most real uses of NIXL seem to call createXferReq in the hot path,
the most optimized ones at least do prepXferDlist ahead of time. This
makes nixlbench match that behavior.

Signed-off-by: Ben Walker <[email protected]>
@copy-pr-bot

copy-pr-bot bot commented Dec 16, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi benlwalker! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

@aranadive
Contributor

Let's keep the previous createXferReq path as well. It denotes the absolute limit for the system config, similar to many performance tests, and also exercises the NIXL API in general. makeXferReq can denote an alternate path for applications that perform flexible allocations.

    std::cerr << "prepXferDlist (local) failed: " << nixlEnumStrings::statusStr(prep_rc)
              << std::endl;
    return -1;
    }
Contributor


I would propose using a scope guard to automatically release handles when they go out of scope. This way we don't need release calls in many places, and it's also exception safe. Just allocate it once right after prepXferDlist, and do the same for remote_dlist_hndl:

    auto local_guard = make_scope_guard ([&] {
        agent->releasedDlistH(local_dlist_hndl);
    });

@benlwalker
Contributor Author

Let's keep the previous createXferReq path as well. It denotes the absolute limit for the system config, similar to many performance tests, and also exercises the NIXL API in general. makeXferReq can denote an alternate path for applications that perform flexible allocations.

I strongly disagree here. It's not possible to create a system that doesn't call either createXferReq or makeXferReq on each I/O request, so it isn't worth measuring without it. If we need to pull together the team to have a meeting on this, let's do that after the holidays.

