Merged
@@ -185,16 +185,17 @@ def moe_topk_select(
probs_for_choice.reshape([seq_length, n_group, -1]).topk(2, axis=-1)[0].sum(axis=-1)
) # [seq_len, n_group]
group_idx = paddle.topk(group_scores, k=topk_group, axis=-1, sorted=True)[1] # [seq_len, topk_group]
-group_mask = paddle.zeros_like(group_scores).put_along_axis(
-    group_idx, paddle.to_tensor(1.0, dtype=group_scores.dtype), axis=-1
+group_mask = paddle.sum(
+    paddle.nn.functional.one_hot(group_idx, num_classes=n_group).cast(group_scores.dtype),
+    axis=1,  # Sum over topk_group dimension -> [seq_len, n_group]
Comment on lines +188 to +190
Copilot AI Mar 30, 2026

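The current PR title does not follow the repository's Cherry-Pick convention: it must include at least one tag (such as [BugFix]) after [Cherry-Pick], and the original develop PR number must be appended to the end of the title (here it should be (#7069)). Otherwise the CI Cherry-Pick check may fail. Please adjust the title to match the template.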

Copilot generated this review using guidance from repository custom instructions.
)
Comment on lines +189 to 191
Copilot AI Mar 30, 2026

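Key fields in the PR description (Modifications, Usage or Command, Accuracy Tests) are empty. Since this change fixes a routing-selection bug under cudagraph, please add: how to reproduce the issue and its scope of impact, how the fix works, and at least one runnable verification command or accuracy/regression result, so that the risk to the release branch can be assessed.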
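As a rough illustration of the fix rationale requested above, the following framework-free sketch compares the two `group_mask` formulations from the diff using plain Python lists instead of Paddle tensors. The function names `scatter_mask` and `one_hot_sum_mask` and the sample indices are hypothetical. It only shows that the two formulations agree numerically when the indices in each row are distinct, as top-k indices are. It does not reproduce or explain the cudagraph behaviour itself.

```python
def scatter_mask(group_idx, n_group):
    """Write 1.0 at each selected group index (put_along_axis-style formulation)."""
    mask = [[0.0] * n_group for _ in group_idx]
    for row, selected in zip(mask, group_idx):
        for g in selected:
            row[g] = 1.0
    return mask


def one_hot_sum_mask(group_idx, n_group):
    """One-hot encode each selected index, then sum over the topk_group axis."""
    mask = []
    for selected in group_idx:
        # one_hots has shape [topk_group, n_group] for this token
        one_hots = [[1.0 if g == j else 0.0 for j in range(n_group)] for g in selected]
        # column-wise sum collapses the topk_group axis -> [n_group]
        mask.append([sum(col) for col in zip(*one_hots)])
    return mask


# Hypothetical data: 3 tokens, n_group = 4, topk_group = 2.
group_idx = [[0, 2], [3, 1], [1, 0]]
assert scatter_mask(group_idx, 4) == one_hot_sum_mask(group_idx, 4)
print(one_hot_sum_mask(group_idx, 4))
```

Note that the one-hot sum would yield 2.0 instead of 1.0 if an index were repeated within a row, so the equivalence relies on the distinctness of the top-k indices.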
score_mask = (
group_mask.unsqueeze(-1).expand([seq_length, n_group, n_experts // n_group]).reshape([seq_length, -1])
) # [seq_len, n_experts]
probs_for_choice = probs_for_choice.masked_fill(~score_mask.astype(paddle.bool), float("-inf"))

_, topk_ids = paddle.topk(probs_for_choice, top_k, axis=-1)
-topk_weights = paddle.take_along_axis(gate_probs, topk_ids, axis=-1)
+topk_weights = paddle.index_sample(gate_probs, topk_ids)
Comment on lines 197 to +198
Copilot AI Mar 30, 2026

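This change addresses the topk/group-mask selection problem in the cudagraph scenario, but the current unit tests (for example tests/operators/test_noaux_tc_redundant.py) only cover numerical correctness and do not cover CUDA Graph capture/replay. Consider adding or extending a unit test that runs moe_topk_select (including the n_group>1 && topk_group<n_group branch) inside a paddle.device.cuda.graphs.CUDAGraph capture/replay, so that this kind of regression does not recur.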
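For reference when writing such a test, the row-wise gather on the commented lines can be sketched without Paddle. Both `paddle.take_along_axis(gate_probs, topk_ids, axis=-1)` and `paddle.index_sample(gate_probs, topk_ids)` are expected to compute `out[i][j] = gate_probs[i][topk_ids[i][j]]` for 2-D inputs. The helper name `gather_rows` and the sample values below are hypothetical, and the sketch could serve as a simple oracle against which captured and replayed outputs are compared.

```python
def gather_rows(values, indices):
    """Per-row gather: out[i][j] = values[i][indices[i][j]]."""
    return [[row[k] for k in idx] for row, idx in zip(values, indices)]


# Hypothetical data: 2 tokens, 3 experts, top_k = 2.
gate_probs = [[0.1, 0.6, 0.3], [0.5, 0.2, 0.3]]
topk_ids = [[1, 2], [0, 2]]

topk_weights = gather_rows(gate_probs, topk_ids)
print(topk_weights)  # [[0.6, 0.3], [0.5, 0.3]]
```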

# normalize combine weights
if renormalize: