From NVIDIA Megatron-LM for visibility #18

RaymondLi0 · 2023-01-24T20:01:13Z

No description provided.

Co-authored-by: Jon Barker <[email protected]> Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: Vitaly Kurin <[email protected]> Co-authored-by: Helen Ngo <[email protected]> Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: Keshav Santhanam <[email protected]> Co-authored-by: Robert Kirby <[email protected]>

…fline implementation Co-authored-by: Chenhan Yu <[email protected]> Co-authored-by: Oliver Koenig <[email protected]> Co-authored-by: Ye Yu <[email protected]> Co-authored-by: Ye Yu <[email protected]> Co-authored-by: Ye Yu <[email protected]>

RaymondLi0 changed the base branch from multi-query-attention to before-merge June 20, 2023 20:12

RaymondLi0 changed the base branch from before-merge to multi-query-attention June 20, 2023 20:12

Mcore Bot and others added 28 commits August 11, 2025 04:11

chore: Version bump

3da9032

ADLR/megatron-lm!3330 - feat(MoE): support CP and recompute for MTP

08abeed

Co-authored-by: lit <[email protected]> Co-authored-by: Yuzhong Wang <[email protected]>

Merge branch 'shifang/mtp_cp' into 'main'

650ab87

feat(MoE): support CP and recompute for MTP See merge request ADLR/megatron-lm!3330

ADLR/megatron-lm!3806 - Disallow expandable segments for cudagraphs entirely

eea7c08

Merge branch 'helenn-ban-expandable-segments' into 'main'

94a3711

Disallow expandable segments for cudagraphs entirely See merge request ADLR/megatron-lm!3806

ADLR/megatron-lm!3710 - MXFP8 DP AG overlap enablement

13fd57a

Merge branch 'mxfp8-dp-ag-overlap-mr' into 'main'

410222b

MXFP8 DP AG overlap enablement See merge request ADLR/megatron-lm!3710

ci(hotfix): Disable broken tests

2f1027d

Signed-off-by: oliver könig <[email protected]>

ci(hotfix): Catch WaitTimeExceeded

a5f0057

Signed-off-by: oliver könig <[email protected]>

ADLR/megatron-lm!3406 - Update README

cde92b2

Co-authored-by: Santosh Bhavani <[email protected]>

Merge branch 'update-readme' into 'main'

2dd030e

Update README See merge request ADLR/megatron-lm!3406

ADLR/megatron-lm!3808 - Move FullCudaGraphWrapper implementation to Megatron Core.

d87bfd1

Merge branch 'vrengasamy/full_cuda_graph_core' into 'main'

91e2ee5

Move FullCudaGraphWrapper implementation to Megatron Core. See merge request ADLR/megatron-lm!3808

ADLR/megatron-lm!3631 - Fixes and updates for external cudagraph

2b6b46b

Merge branch 'robinz/external_cudagraph_update' into 'main'

9545270

Fixes and updates for external cudagraph See merge request ADLR/megatron-lm!3631

ADLR/megatron-lm!3799 - build: Bump TE

16ad771

Merge branch 'ko3n1g/build/te-2.6' into 'main'

0d33682

build: Bump TE See merge request ADLR/megatron-lm!3799

ADLR/megatron-lm!3606 - Debug distributed checkpoint for Transformer Engine fused MLP

82e5ff6

Merge branch 'tmoon/te-op-fuser-debug-checkpoint' into 'main'

d1a8777

Debug distributed checkpoint for Transformer Engine fused MLP See merge request ADLR/megatron-lm!3606

ADLR/megatron-lm!3812 - Add argument to control collnet enablement

7704169

Merge branch 'ibsharp-knob' into 'main'

4819438

Add argument to control collnet enablement See merge request ADLR/megatron-lm!3812

ADLR/megatron-lm!3569 - Dynamic Backend Inference MLA

46eb0a3

Merge branch 'mla-flash' into 'main'

29a0607

Dynamic Backend Inference MLA See merge request ADLR/megatron-lm!3569

ADLR/megatron-lm!3422 - Adding support for multiple validation sets

b6a7f40

Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: Brandon Norick <[email protected]>

Merge branch 'bnorick/multi-validation' into 'main'

2e88416

Adding support for multiple validation sets See merge request ADLR/megatron-lm!3422

ADLR/megatron-lm!3718 - Fix log prob calculation, pipeline parallelism, and sequence parallelism for dynamic engine

1b07529

Merge branch 'dynamic_logprobs_fix' into 'main'

7a31b35

Fix log prob calculation, pipeline parallelism, and sequence parallelism for dynamic engine See merge request ADLR/megatron-lm!3718

ADLR/megatron-lm!3746 - Fix cuda graph with first/last layer bf16 by calling the right fp8_context

c86819f

deepakn94 and others added 30 commits September 9, 2025 20:59

Merge branch 'dnarayanan/param_norm_fixes' into 'main'

5e8c9c4

Fix bug in param_norm computation where some ranks might call collective and some might not See merge request ADLR/megatron-lm!3918

ADLR/megatron-lm!3993 - Fix BERT + virtual pipeline parallelism

18420b6

Merge branch 'dnarayanan/fix_bert_vpp' into 'main'

1dc7019

Fix BERT + virtual pipeline parallelism See merge request ADLR/megatron-lm!3993

ADLR/megatron-lm!3620 - Dynamic inference functional tests | Cuda graphs.

9a4002e

Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: Siddharth Singh <[email protected]>

Merge branch 'lmcafee/dyn-inf-functional-tests-cuda-graph' into 'main'

159a6a0

Dynamic inference functional tests | Cuda graphs. See merge request ADLR/megatron-lm!3620

ADLR/megatron-lm!3965 - Create inference graphs immediately but defer training graph creation until create_cudagraphs

2d60db7

Merge branch 'helenn-create-graphs-immediately-inference-only' into 'main'

8ed1e2c

Create inference graphs immediately but defer training graph creation until create_cudagraphs See merge request ADLR/megatron-lm!3965

ADLR/megatron-lm!3995 - Set mimo_vlm and gpt_dynamic_inference tests to be flaky

6023444

Merge branch 'chtruong/flaky-test' into 'main'

1584dca

Set mimo_vlm and gpt_dynamic_inference tests to be flaky See merge request ADLR/megatron-lm!3995

chore: Version bump

2fbece5

ci(hotfix): Notify release

bdf57ae

Signed-off-by: oliver könig <[email protected]>

ci(hotfix): Publish

1c0eb4a

Signed-off-by: oliver könig <[email protected]>

ci(hotfix): Publish

a43b818

Signed-off-by: oliver könig <[email protected]>

ADLR/megatron-lm!3998 - chore: Upgrade dependencies (2025-09-15)

1e6f75a

Co-authored-by: Mcore Bot <[email protected]>

Merge branch 'ci-bot/build/upgrade-dependencies-2025-09-15' into 'main'

c76ed86

chore: Upgrade dependencies (2025-09-15) See merge request ADLR/megatron-lm!3998

ADLR/megatron-lm!3938 - Fix typos in pretrain_mamba.py

ef5e03c

Co-authored-by: vignesh1507 <[email protected]>

ADLR/megatron-lm!3805 - Dynamic inference engine | Events

848c8c9

Co-authored-by: Mcore Bot <[email protected]>

ADLR/megatron-lm!3323 - fix: prevent an integer overflow on numpy >= 2

84e9c3a

Co-authored-by: John Kamalu <[email protected]>

ADLR/megatron-lm!3987 - Add megatron.core.rerun_state_machine.RerunState to safe globals

23d2ada

Co-authored-by: Mcore Bot <[email protected]>

ADLR/megatron-lm!3931 - fix: Fix non TE optimizer ckpt issue

2ebb6ee

ADLR/megatron-lm!4011 - ci: Don't run legacy tests on release branch

15a0d47

ADLR/megatron-lm!4003 - Fix gc.freeze() slowdown: Add a gc.collect(), but only on the last layer

a3f9e56

ADLR/megatron-lm!3992 - [Flux] Remove Redundant Host & Device Sync In Gradient Clipping

5653514

Co-authored-by: Wil Kong <[email protected]>

ADLR/megatron-lm!4018 - bugfix: Added support for FSDP grad accum fusion

9f72f47

Co-authored-by: Selvaraj Anandaraj <[email protected]>

ADLR/megatron-lm!3879 - Add unit tests, and refactor fully_shard into fully_shard_model and fully_shard_optimizer.

199113b

Co-authored-by: Mcore Bot <[email protected]>

ADLR/megatron-lm!4022 - Add ModelOpt pruning example

5a58976

ADLR/megatron-lm!3859 - Gradient comparison test

74bec5b

ADLR/megatron-lm!4033 - ci: Don't publish protected branches

93a0d8e
