EXAMPLES: Modify nixl_ep to use UCX channels #1122

2dm · 2025-12-13T03:45:16Z

What?

This PR introduces updates to the nixl-ep example:

- Updates from the DeepEP library
- Use of the channel API

copy-pr-bot · 2025-12-13T03:45:20Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2025-12-13T03:45:25Z

👋 Hi 2dm! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

examples/device/ep/meson.build

examples/device/ep/nixl_ep/__init__.py

examples/device/ep/csrc/kernels/nixl_ep.cu

itayalroy · 2025-12-22T14:56:09Z

Do we have a perf comparison between multiple workers with a single channel and a single worker with multiple channels?
I think it's worth running at least a sanity check to verify that this does not cause major degradation.

itayalroy

This PR does not reduce UCX backend's num_workers to 1

@eranrs, this is probably why you did not see a huge benefit with it. The PR needs to be fixed and re-tested for control-path perf.

- Kernel assertion for 0 combined tokens: Update internode_ll.cu (ai-dynamo#374)
- num_sms calculation simplification: Fix: avoid floating point exception (ai-dynamo#379)
- Increase max topk to 11: support topk10 in low latency kernel
- Remove fences, optimized nvlink transfer: Canonicalize TMA usages (ai-dynamo#410)
- Speed up local copy loop macro: Speed up dispatch send by refining loop unrolling (ai-dynamo#385)
- Topk dtype (not only int64): Make dtype of topk_idx configurable (ai-dynamo#422)
- Assertion and reinterpret_cast to static cast: Fix OOB (ai-dynamo#454)
- Support hidden-dim 3072 (ai-dynamo#458)

From <https://github.com/deepseek-ai/DeepEP/commits/main/?before=92fe2deaec24bc92ebd9de276daa6ca9ed602ed4+35>

brminich · 2026-01-08T14:53:40Z

/ok to test

copy-pr-bot · 2026-01-08T14:53:43Z

/ok to test

@brminich, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

brminich · 2026-01-08T14:53:45Z

/build

brminich · 2026-01-08T14:53:57Z

/ok to test 0b866f3

itayalroy · 2026-01-10T02:06:51Z

Merged in #1175

2dm requested a review from a team as a code owner December 13, 2025 03:45

pull-request-size bot added the size/L label Dec 13, 2025

github-actions bot added the external-contribution label Dec 13, 2025

2dm force-pushed the ep_ch_api branch from 504127e to 4235a26 on December 13, 2025 04:11

itayalroy reviewed Dec 14, 2025

examples/device/ep/meson.build (outdated, resolved)

itayalroy reviewed Dec 15, 2025

examples/device/ep/nixl_ep/__init__.py (resolved)

itayalroy reviewed Dec 15, 2025

examples/device/ep/csrc/kernels/nixl_ep.cu (resolved)

2dm requested a review from itayalroy December 16, 2025 17:47

itayalroy requested changes Jan 5, 2026


RoeyAzran1992 and others added 7 commits January 7, 2026 09:08

moving to channel api, depricating worker_id

d01e1af

code review fixes

ec4f19b

Mask setting for connected nodes

bb8bc7c

elastic test fix

58d57ca

Format fixes

a009e14

meson fix

d2854b8

2dm force-pushed the ep_ch_api branch from e807bc3 to d2854b8 on January 7, 2026 17:09

copyright year update

0b866f3

copy-pr-bot bot temporarily deployed to SWX_AWS January 8, 2026 14:54 Inactive

copy-pr-bot bot had a problem deploying to GITLAB January 8, 2026 14:54 Failure

copy-pr-bot bot temporarily deployed to SWX_AWS January 8, 2026 14:54 Inactive

itayalroy closed this Jan 10, 2026


4 participants
