
feat: correct routed_experts from sglang(trick of routing_replay)#884

Merged
rchardx merged 31 commits into inclusionAI:main from ZiyiTsang:r3replay
Mar 2, 2026

Conversation

@ZiyiTsang
Collaborator

@ZiyiTsang ZiyiTsang commented Feb 3, 2026

Description

Routing replay, proposed in this Paper, has become a trick for stabilizing RL training of MoE models.

SGLang 0.5.7 supports returning routed_experts through its API.
This PR therefore adds support for retrieving routed_experts from SGLang; they will be fed into Megatron in a later change.
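
For readers unfamiliar with the trick, the idea can be sketched roughly as follows. This is not code from this PR or from Megatron: the function names, the top-k router, and the renormalization over the chosen experts are all assumptions made for illustration. The point is only that the trainer reuses the expert indices recorded at rollout time instead of re-deriving them from its own router logits.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(logits, k):
    # Indices of the k largest logits, highest first.
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]


def gate_weights(logits, k, replay_indices=None):
    """Return {expert_index: weight} for a single token (hypothetical helper).

    Without replay_indices, the experts are the trainer's own top-k.
    With replay_indices (e.g. the routed_experts recorded during rollout),
    those experts are used instead, while the weights are still computed
    from the trainer's logits so gradients would reach the router.
    """
    chosen = list(replay_indices) if replay_indices is not None else top_k_indices(logits, k)
    probs = softmax(logits)
    norm = sum(probs[i] for i in chosen)  # assumed: renormalize over chosen experts
    return {i: probs[i] / norm for i in chosen}


# Trainer logits narrowly prefer expert 0 over expert 1, but suppose the
# rollout engine recorded experts [1, 2] for this token.
trainer_logits = [2.0, 1.9, 0.1, -1.0]
own_choice = gate_weights(trainer_logits, k=2)
replayed = gate_weights(trainer_logits, k=2, replay_indices=[1, 2])
```

In this toy case `own_choice` selects experts 0 and 1, whereas `replayed` selects experts 1 and 2, matching what the rollout engine is assumed to have used.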

To test with:

python3 -m areal.launcher.local examples/math/gsm8k_rl.py --config examples/math/gsm8k_grpo.yaml gconfig.return_routed_experts=True

Related Issue

N/A

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not
    work as expected)
  • Documentation update
  • Code refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Checklist

  • I have read the Contributing Guide
  • I have run formatting tools (pre-commit or manual)
  • I have run relevant unit tests and they pass
  • I have added tests for new functionality
  • I have updated documentation if needed
  • My branch is up to date with main
  • This PR introduces breaking changes (if yes, fill out details below)
  • If this PR changes documentation, I have built and previewed it locally with
    jb build docs
  • No critical issues raised by AI reviewers (/gemini review)

Breaking Change Details (if applicable):

Additional Context


Need help? Check the Contributing Guide or ask in
GitHub Discussions!

@gemini-code-assist

This comment was marked as outdated.

@ZiyiTsang

This comment was marked as resolved.

gemini-code-assist[bot]

This comment was marked as resolved.

gemini-code-assist[bot]

This comment was marked as resolved.

ZiyiTsang and others added 3 commits February 4, 2026 11:56
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@ZiyiTsang ZiyiTsang force-pushed the r3replay branch 2 times, most recently from 24fc9c1 to 7adde36 Compare February 4, 2026 04:17
@ZiyiTsang ZiyiTsang changed the title WIP: Support Routing Replay (Phase I: Return routed_experts from sglang) Support Routing Replay (Phase I: Return routed_experts from sglang) Feb 19, 2026
@ZiyiTsang ZiyiTsang marked this pull request as ready for review February 19, 2026 12:28
@ZiyiTsang ZiyiTsang changed the title Support Routing Replay (Phase I: Return routed_experts from sglang) feat: correct routed_experts from sglang(trick of routing_replay) Feb 19, 2026
Comment thread areal/infra/remote_inf_engine.py
Comment thread areal/api/cli_args.py
Comment thread areal/api/cli_args.py Outdated
Comment thread areal/api/cli_args.py
Comment thread areal/api/cli_args.py Outdated
Comment thread areal/api/cli_args.py
@rchardx
Collaborator

rchardx commented Feb 26, 2026

(Moved to inline comment on uv.lock)

Comment thread areal/api/cli_args.py Outdated
Comment thread uv.lock Outdated
@ZiyiTsang ZiyiTsang requested a review from rchardx February 26, 2026 06:06
Comment thread uv.lock
@ZiyiTsang
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces support for retrieving routed_experts information from the sglang backend, which is useful for analyzing Mixture of Experts (MoE) models. The changes span across configuration, data structures, and engine logic to propagate the option and handle the returned data. The implementation looks good, with proper error handling for unsupported backends and documentation updates. I have a couple of suggestions to improve maintainability by adding a clarifying comment and removing a redundant validation check.

Comment on lines +97 to +99
num_sgl_token = (
    meta_info["prompt_tokens"] + meta_info["completion_tokens"] - 1
)
Contributor


medium

For better maintainability, could you add a brief comment explaining why 1 is subtracted in the num_sgl_token calculation? This would clarify the logic for future developers who might not be familiar with the specifics of the sglang API's token counting.
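
As an illustration of what such a comment might document, here is a minimal sketch. It assumes, without confirmation from this PR, that the final sampled token never passes through the model, so routing data exists for only prompt_tokens + completion_tokens - 1 positions. The helper names and the flat [token][layer][top_k] layout are hypothetical, not the SGLang API.

```python
def expected_routed_tokens(meta_info):
    # Same expression as in the diff. Assumed reason for the "- 1": the last
    # generated token is sampled but never fed through the model, so no
    # expert routing is recorded for it.
    return meta_info["prompt_tokens"] + meta_info["completion_tokens"] - 1


def split_routed_experts(flat, meta_info, num_layers, top_k):
    """Reshape a flat list of expert ids into [token][layer][k].

    The flat, token-major layout is an assumption made for this sketch.
    """
    num_tokens = expected_routed_tokens(meta_info)
    expected_len = num_tokens * num_layers * top_k
    if len(flat) != expected_len:
        raise ValueError(
            f"expected {expected_len} expert ids for {num_tokens} tokens, "
            f"got {len(flat)}"
        )
    per_token = num_layers * top_k
    return [
        [
            flat[t * per_token + layer * top_k : t * per_token + (layer + 1) * top_k]
            for layer in range(num_layers)
        ]
        for t in range(num_tokens)
    ]


meta = {"prompt_tokens": 3, "completion_tokens": 2}
routed = split_routed_experts(list(range(16)), meta, num_layers=2, top_k=2)
```

With 3 prompt tokens and 2 completion tokens, the sketch expects routing data for 4 positions, and a length check like this would surface an off-by-one early rather than silently misaligning tokens and experts.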

Comment on lines +599 to +602
if self.config.rollout.return_routed_experts:
    raise ValueError(
        "return_routed_experts is not supported with vLLM backend. Please disable return_routed_experts or switch to SGLang backend."
    )
Contributor


medium

This validation check is redundant because the _validate_cfg method, which is called earlier in __init__, already performs the same check. To avoid code duplication, this if block can be removed.
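
A rough sketch of the suggested shape, with the check living in one place only. The class and field names below are hypothetical stand-ins rather than the actual AReaL classes; only the error message is taken from the diff.

```python
from dataclasses import dataclass


@dataclass
class RolloutConfig:
    # Hypothetical stand-in for the real rollout config.
    backend: str = "sglang"
    return_routed_experts: bool = False


def validate_cfg(cfg):
    # Single source of truth for the backend/feature compatibility check.
    if cfg.backend == "vllm" and cfg.return_routed_experts:
        raise ValueError(
            "return_routed_experts is not supported with vLLM backend. "
            "Please disable return_routed_experts or switch to SGLang backend."
        )


class Trainer:
    # Hypothetical class; stands in for whatever calls _validate_cfg in __init__.
    def __init__(self, cfg):
        validate_cfg(cfg)  # runs once, before anything else uses cfg
        self.cfg = cfg

    def init_rollout(self):
        # No second check here: __init__ already rejected invalid configs.
        return {
            "backend": self.cfg.backend,
            "return_routed_experts": self.cfg.return_routed_experts,
        }


ok = Trainer(RolloutConfig(backend="sglang", return_routed_experts=True)).init_rollout()
```

Constructing `Trainer(RolloutConfig(backend="vllm", return_routed_experts=True))` raises `ValueError` in `__init__`, so later code paths can rely on the config being valid.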

@rchardx rchardx added the safe-to-test Ready to run unit-tests in a PR. label Mar 2, 2026
Collaborator

@rchardx rchardx left a comment


LGTM

@rchardx rchardx merged commit b984b5c into inclusionAI:main Mar 2, 2026
6 of 8 checks passed
leandermaben pushed a commit to leandermaben/AReaL that referenced this pull request Mar 24, 2026
…nclusionAI#884)

The routing replay becomes a trick for stabilizing the RL training in MOE.

The sglang 0.5.7 has support to return routed_experts in API.
Therefore this P.R support to get the routed_experts from sglang, which will be put into Megatron later.

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Wentai Zhang <zhangwentai.zwt@antgroup.com>
SathyaGnanakumar pushed a commit to danielkiely/AReaL that referenced this pull request Apr 29, 2026
…nclusionAI#884)
