[0.9.1][bugfix] Unify MoE routing init with standard torch_npu operator #2400

SlightwindSec · 2025-08-15T10:51:57Z

Description

This PR removes the dependency on a proof-of-concept (POC) build of torch_npu for the MoE routing initialization feature.

Before:
To get the best performance, users needed a torch_npu version containing the npu_moe_init_routing_quant operator. Official versions would trigger a slower, pure PyTorch fallback.

After:
The code is updated to use the npu_moe_init_routing_v2 operator, which is included in the official torch_npu releases and provides equivalent performance. This change unifies the implementation, removes the fallback logic, and makes the high-performance path accessible to all users without requiring a special library version.
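The before/after difference described above can be sketched with a stand-in module. This is not the actual code from this PR: `official_build`, `pytorch_fallback`, and the two-argument call signatures are hypothetical placeholders chosen for illustration, and only the two operator names are taken from the description.

```python
from types import SimpleNamespace


def pytorch_fallback(tokens, expert_ids):
    # Hypothetical placeholder for the slower pure-PyTorch path.
    return ("fallback", tokens, expert_ids)


def init_routing_before(npu, tokens, expert_ids):
    # Old behavior: take the fast path only if the POC-only operator exists.
    op = getattr(npu, "npu_moe_init_routing_quant", None)
    if op is not None:
        return op(tokens, expert_ids)
    return pytorch_fallback(tokens, expert_ids)


def init_routing_after(npu, tokens, expert_ids):
    # New behavior: always call the operator shipped in official releases.
    # (Simplified call; the real operator takes more arguments.)
    return npu.npu_moe_init_routing_v2(tokens, expert_ids)


# Stand-in for an official torch_npu build: only the v2 operator is present.
official_build = SimpleNamespace(
    npu_moe_init_routing_v2=lambda tokens, expert_ids: ("v2", tokens, expert_ids)
)

before = init_routing_before(official_build, [1.0, 2.0], [0, 1])
after = init_routing_after(official_build, [1.0, 2.0], [0, 1])
```

With the stand-in build, `before` ends up on the fallback path while `after` reaches the v2 operator, which is the situation the PR describes for users on official releases.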

Signed-off-by: SlightwindSec <[email protected]>

gemini-code-assist

Code Review

This pull request refactors the MoE routing initialization for W8A8 dynamic quantization. It replaces the dependency on a proof-of-concept torch_npu operator, npu_moe_init_routing_quant, with the official npu_moe_init_routing_v2 operator. This change successfully unifies the implementation by removing the conditional logic and the pure PyTorch fallback path, which simplifies the code and improves maintainability. The update appears correct and aligns with the goal of using standardized, performant operators from the official torch_npu library.
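For readers unfamiliar with the step being replaced, the following is a conceptual sketch of what routing initialization combined with per-token dynamic int8 quantization might compute. It is an assumption about the general technique, written in plain Python, and not the torch_npu operator or the removed fallback; the function name, arguments, and return layout are all hypothetical.

```python
def init_routing_with_dynamic_quant(tokens, topk_ids, num_experts):
    """tokens: list of float rows; topk_ids: expert ids chosen for each row."""
    # One (expert, source_row) pair per token-to-expert assignment.
    pairs = [(e, row) for row, experts in enumerate(topk_ids) for e in experts]
    # Stable sort groups rows by expert and keeps token order within an expert.
    pairs.sort(key=lambda p: p[0])

    expanded, scales, counts = [], [], [0] * num_experts
    for expert, row in pairs:
        x = tokens[row]
        # Per-token dynamic scale maps the largest magnitude to 127
        # (falls back to 1.0 for an all-zero row).
        scale = max(abs(v) for v in x) / 127.0 or 1.0
        expanded.append([max(-128, min(127, round(v / scale))) for v in x])
        scales.append(scale)
        counts[expert] += 1
    return expanded, scales, counts


expanded, scales, counts = init_routing_with_dynamic_quant(
    tokens=[[1.0, -2.0], [0.5, 0.25]],
    topk_ids=[[1, 0], [0]],
    num_experts=2,
)
```

Here expert 0 receives two rows and expert 1 receives one, and each expanded row carries its own scale so it can be dequantized later.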


Yikun · 2025-08-28T07:09:58Z

v0.9.1-dev is in code freeze. Can you confirm whether this is still needed, or just move it to the main branch? Thanks.

SlightwindSec · 2025-08-29T06:28:08Z

> v0.9.1-dev is in code freeze. Can you confirm whether this is still needed, or just move it to the main branch? Thanks.

OK

Fix: Relax the torch_npu version dependency for the POC build.

32461dc

Signed-off-by: SlightwindSec <[email protected]>

gemini-code-assist bot reviewed Aug 15, 2025


github-actions bot added the module:quantization label Aug 15, 2025

fix typo

c54d7af

Signed-off-by: SlightwindSec <[email protected]>

github-actions bot added the documentation Improvements or additions to documentation label Aug 16, 2025

SlightwindSec closed this Aug 29, 2025
