allow separate learning rates "muon_lr" and "adam_lr" for muon optimizer #7658
Conversation
Signed-off-by: Guokai Ma <[email protected]>
    if muon_params:
        accepted_parameters = dict()
-       for key in ["lr", "momentum", "weight_decay"]:
+       for key in ["lr", "momentum", "weight_decay", "muon_lr"]:
Does the user need to specify "muon_lr" for their training jobs?
Not mandatory: 'lr' applies to both the Muon and Adam parameter groups. However, 'muon_lr' will be used for the Muon group if present, and likewise 'adam_lr' for the Adam group. Think of them as advanced user settings.
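For illustration, a minimal sketch of the fallback behavior described above; the function name resolve_group_lrs and the standalone dict are hypothetical, not DeepSpeed API:

    # Sketch: dedicated per-group keys override the shared 'lr' when present.
    def resolve_group_lrs(optimizer_params: dict) -> tuple:
        """Return (muon_lr, adam_lr), falling back to the shared 'lr'."""
        base_lr = optimizer_params["lr"]
        muon_lr = optimizer_params.get("muon_lr", base_lr)
        adam_lr = optimizer_params.get("adam_lr", base_lr)
        return muon_lr, adam_lr

    # 'lr' alone applies to both groups ...
    assert resolve_group_lrs({"lr": 1e-3}) == (1e-3, 1e-3)
    # ... while 'muon_lr' (and 'adam_lr') override it when given.
    assert resolve_group_lrs({"lr": 1e-3, "muon_lr": 2e-2}) == (2e-2, 1e-3)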
…er (deepspeedai#7658) This PR allows separate learning rates for the Muon and Adam parts of the Muon optimizer. Follow-up to deepspeedai#7657 Signed-off-by: Guokai Ma <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Signed-off-by: Luke Friedrichs <[email protected]>
This PR allows separate learning rates for the Muon and Adam parts of the Muon optimizer. Follow-up to #7657
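A user-facing config might then carry both keys as shown below; this is an assumed sketch, and the "type": "Muon" string and overall layout are illustrative rather than taken from this PR (only the key names "lr", "momentum", "weight_decay", "muon_lr", and "adam_lr" appear in the diff and discussion):

    # Hypothetical DeepSpeed config dict illustrating the new keys.
    ds_config = {
        "optimizer": {
            "type": "Muon",  # assumed type string; Muon support comes from #7657
            "params": {
                "lr": 1e-3,          # shared default for both parameter groups
                "momentum": 0.95,
                "weight_decay": 0.01,
                "muon_lr": 2e-2,     # optional: overrides 'lr' for the Muon group
                "adam_lr": 1e-3,     # optional: overrides 'lr' for the Adam group
            },
        },
    }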