[Spec][MOE][Internal Op] Specification of MOE internal operation #32255

base: master

Conversation
# Experts computation part (GEMM3_SWIGLU)
x_proj = matmul(reshaped_hidden_states, weight_0, transpose_a=False, transpose_b=True)
x_proj2 = matmul(reshaped_hidden_states, weight_1, transpose_a=False, transpose_b=True)
swiglu = swish(x_proj, beta=expert_beta)
x_proj = x_proj2 * swiglu
down_proj = matmul(x_proj, weight_2, transpose_a=False, transpose_b=True)
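For readers less familiar with the pattern, here is a minimal NumPy sketch of the quoted GEMM3_SWIGLU computation; the tensor shapes and the swish/SiLU definition are assumptions for illustration, not taken from the spec under review.

import numpy as np

def swish(x, beta=1.0):
    # swish(x) = x * sigmoid(beta * x); reduces to SiLU for beta = 1.0
    return x / (1.0 + np.exp(-beta * x))

def gemm3_swiglu(hidden, w0, w1, w2, expert_beta=1.0):
    # hidden: [tokens, hidden_size]; w0, w1: [intermediate, hidden_size];
    # w2: [hidden_size, intermediate] -- layouts matching transpose_b=True above
    x_proj = hidden @ w0.T                              # gate projection
    x_proj2 = hidden @ w1.T                             # up projection
    gated = x_proj2 * swish(x_proj, beta=expert_beta)   # SwiGLU gating
    return gated @ w2.T                                 # down projection

rng = np.random.default_rng(0)
hidden = rng.standard_normal((4, 8))    # 4 tokens, hidden_size 8
w0 = rng.standard_normal((16, 8))
w1 = rng.standard_normal((16, 8))
w2 = rng.standard_normal((8, 16))
print(gemm3_swiglu(hidden, w0, w1, w2).shape)  # (4, 8)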
The GPU plugin's request is to transpose those weights at the conversion stage, so both of the MatMul transpose_a/transpose_b attributes should be False at this point:
Suggested change (set transpose_b to False):

# Experts computation part (GEMM3_SWIGLU)
x_proj = matmul(reshaped_hidden_states, weight_0, transpose_a=False, transpose_b=False)
x_proj2 = matmul(reshaped_hidden_states, weight_1, transpose_a=False, transpose_b=False)
swiglu = swish(x_proj, beta=expert_beta)
x_proj = x_proj2 * swiglu
down_proj = matmul(x_proj, weight_2, transpose_a=False, transpose_b=False)
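For context, a small NumPy sketch of the equivalence behind this request: transposing the weight once at conversion time produces the same result as a runtime MatMul with transpose_b=True, so the runtime attribute can then be False. The shapes here are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
hidden = rng.standard_normal((4, 8))       # [tokens, hidden_size]
weight_0 = rng.standard_normal((16, 8))    # [intermediate, hidden_size]

# Current spec: MatMul(hidden, weight_0, transpose_b=True)
before = hidden @ weight_0.T

# Proposed: transpose once at conversion time, then transpose_b=False at runtime
weight_0_converted = weight_0.T            # done once, offline
after = hidden @ weight_0_converted

assert np.allclose(before, after)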
cc: @yeonbok
…ts/operation-specs/internal/moe.rst
@@ -0,0 +1,151 @@
.. {#openvino_docs_ops_internal_MOE}

MOE
Let's not use the MoE name, because we may want to use it for an external operation and for a real MoE operation. Right now it is a sort of FusedExperts.
.. code-block:: py
   :force:

# Common part: Reshape hidden states and prepare for expert computation
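As an aside, a sketch of what this reshape typically amounts to; the 3-D input layout is an assumption, not part of the quoted spec.

import numpy as np

batch, seq_len, hidden_size = 2, 3, 8
hidden_states = np.arange(batch * seq_len * hidden_size, dtype=np.float32)
hidden_states = hidden_states.reshape(batch, seq_len, hidden_size)

# Flatten to [batch * seq_len, hidden_size] so each token is routed independently
reshaped_hidden_states = hidden_states.reshape(-1, hidden_size)
print(reshaped_hidden_states.shape)  # (6, 8)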
I propose adding router_topk_output_indices to this logic. It would show how the weights are prepared. Right now it is not clear how router_topk_output_indices is used in the specified operation. A possible sketch follows.
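To make the proposal concrete, here is a hypothetical sketch of how router_topk_output_indices could select per-token expert weights and combine the expert outputs; the stacked weight layout, the router_topk_output_weights name, and all shapes are assumptions for illustration, not the spec.

import numpy as np

def swish(x, beta=1.0):
    return x / (1.0 + np.exp(-beta * x))

tokens, num_experts, topk, hidden_size, inter = 4, 8, 2, 8, 16
rng = np.random.default_rng(2)
hidden = rng.standard_normal((tokens, hidden_size))
# Per-expert weights stacked on a leading expert axis (assumed layout)
w0 = rng.standard_normal((num_experts, inter, hidden_size))
w1 = rng.standard_normal((num_experts, inter, hidden_size))
w2 = rng.standard_normal((num_experts, hidden_size, inter))
router_topk_output_indices = rng.integers(0, num_experts, (tokens, topk))
router_topk_output_weights = rng.random((tokens, topk))

out = np.zeros_like(hidden)
for t in range(tokens):
    for k in range(topk):
        e = router_topk_output_indices[t, k]   # expert picked by the router
        gate = swish(hidden[t] @ w0[e].T)      # gate projection for expert e
        up = hidden[t] @ w1[e].T               # up projection for expert e
        # Weighted sum of the selected experts' down projections
        out[t] += router_topk_output_weights[t, k] * ((up * gate) @ w2[e].T)
print(out.shape)  # (4, 8)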
Good job! Thank you, Kasia. I left a couple of comments.
Co-authored-by: Tatiana Savina <[email protected]>
Details:
- They will not appear in the converted model's public IR
- Describes the MOE operation used in PR:
Tickets: