Add extra RL files #2077

tdene · 2025-10-31T20:41:57Z

What does this PR do?

Add extra RL files

⚠️ For major changes (either in lines of code or in impact), please first share and discuss a design doc with the team.

Contribution process

flowchart LR
    A[Pre-checks] --> B[PR Tests]
    subgraph Code Review/Approval
        C1[Expert Review] --> C2[Final Review]
    end
    B --> C1
    C2 --> D[Merge]

Pre-checks

I want this PR in a versioned release and have added the appropriate Milestone (e.g., Core 0.8)
I have added relevant unit tests
I have added relevant functional tests
I have added proper typing to my code (see Typing guidelines)
I have added relevant documentation
I have run the autoformatter.sh on my PR

Code review

The following process is enforced via the CODEOWNERS file for changes into megatron/core. For changes outside of megatron/core, it is up to the PR author whether or not to tag the Final Reviewer team.

For MRs into `main` branch

(Step 1): Add PR label `Expert Review`

(Step 2): Collect the expert reviewers' reviews

Attach the `Expert Review` label when your PR is ready for review.
GitHub auto-assigns expert reviewers based on your changes. They will get notified and pick up your PR soon.

⚠️ Only proceed to the next step once all reviewers have approved, merge conflicts are resolved, and the CI is passing.
Final Review might be declined if these requirements are not fulfilled.
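
As a rough illustration of the gate described above, the three conditions can be written as a single check. This is only a sketch: `PullRequestStatus`, its fields, and `ready_for_final_review` are hypothetical names made up for this example, not part of the repository's tooling or of any GitHub API.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequestStatus:
    """Hypothetical snapshot of a PR; all field names are assumptions."""
    assigned_reviewers: set = field(default_factory=set)
    approvals: set = field(default_factory=set)
    has_merge_conflicts: bool = False
    ci_passing: bool = False


def ready_for_final_review(status: PullRequestStatus) -> bool:
    """Return True only if every condition from the warning above holds."""
    # Every assigned expert reviewer must have approved (and at least one must exist).
    all_approved = bool(status.assigned_reviewers) and (
        status.assigned_reviewers <= status.approvals
    )
    return all_approved and not status.has_merge_conflicts and status.ci_passing


# Example: one reviewer has not approved yet, so the PR is not ready.
pending = PullRequestStatus(
    assigned_reviewers={"alice", "bob"},
    approvals={"alice"},
    has_merge_conflicts=False,
    ci_passing=True,
)
print(ready_for_final_review(pending))
```

In this sketch a PR with no assigned reviewers is treated as not ready, which is a design choice of the example rather than something the process text states.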

(Step 3): Final Review

Add the `Final Review` label.
GitHub auto-assigns final reviewers based on your changes. They will get notified and pick up your PR soon.

(Optional Step 4): Cherry-pick into release branch

If this PR also needs to be merged into core_r* release branches, after this PR has been merged, select Cherry-pick to open a new PR into the release branch.

For MRs into `dev` branch

The proposed review process for `dev` branch is under active discussion.

MRs are mergeable after one approval by either [email protected] or [email protected].

Merging your PR

Any member of core-adlr and core-nemo will be able to merge your PR.

ArEsKay3

👍

copy-pr-bot · 2025-10-31T21:11:38Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

tdene · 2025-10-31T21:14:00Z

/ok to test

copy-pr-bot · 2025-10-31T21:14:04Z

/ok to test

@tdene, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

tdene · 2025-10-31T21:14:53Z

/ok to test 0c14bc6

* ci: Move test optimizer into its own bucket (NVIDIA#1909) Signed-off-by: oliver könig <[email protected]> * ci: Use matrix for approval-bot Signed-off-by: oliver könig <[email protected]> * ci: Update function name Signed-off-by: oliver könig <[email protected]> * ci: Adjust approval-bot for copy-pr-bot Signed-off-by: oliver könig <[email protected]> * ci: Parametrize workflow Signed-off-by: oliver könig <[email protected]> * ci: Parametrize workflow Signed-off-by: oliver könig <[email protected]> * ci: Remove attribute Signed-off-by: oliver könig <[email protected]> * ci: Update container image tag to use GitHub SHA * chore: Remove file * ci: Fix approval bot Signed-off-by: oliver könig <[email protected]> * ci: Configure cherrypick bot (NVIDIA#1925) Signed-off-by: oliver könig <[email protected]> * Ci approve dev (NVIDIA#1933) Signed-off-by: oliver könig <[email protected]> * ci: Update nightly schedule (NVIDIA#1934) Signed-off-by: oliver könig <[email protected]> * ci: Bump pre-flight for runs on main/dev (NVIDIA#1935) Signed-off-by: oliver könig <[email protected]> * ci: Allow skipping on main (NVIDIA#1936) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/pr template community bot (NVIDIA#1937) * ci: More granular unit tests buckets (NVIDIA#1932) Signed-off-by: oliver könig <[email protected]> * Add sequence packing to RL (NVIDIA#1911) Add sequence packing to RL * chore: Update template (NVIDIA#1939) Signed-off-by: oliver könig <[email protected]> * chore: Add description about who can merge (NVIDIA#1940) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/fix main on eos (NVIDIA#1938) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/internal mrs (NVIDIA#1942) Signed-off-by: oliver könig <[email protected]> * ci: Fix branch of approval bot (NVIDIA#1944) Signed-off-by: oliver könig <[email protected]> * ci: Approvalbot for other branches (NVIDIA#1947) Signed-off-by: oliver könig <[email protected]> * ci(fix): Approval bot 
(NVIDIA#1949) Signed-off-by: oliver könig <[email protected]> * ci(fix): Approval gate Signed-off-by: oliver könig <[email protected]> * ci: Approval gate rule Signed-off-by: oliver könig <[email protected]> * ci: Update golden values nightly Signed-off-by: oliver könig <[email protected]> * ci: Approval gate Signed-off-by: oliver könig <[email protected]> * ci: Approval bot Signed-off-by: oliver könig <[email protected]> * ci: Sync branches Signed-off-by: oliver könig <[email protected]> * ci: Smaller image Signed-off-by: oliver könig <[email protected]> * ci: Better output Signed-off-by: oliver könig <[email protected]> * ci: sync branches Signed-off-by: oliver könig <[email protected]> * ci: Fix sync bot Signed-off-by: oliver könig <[email protected]> * ci: Finalize Signed-off-by: oliver könig <[email protected]> * ci: Finalize Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/sync branches (NVIDIA#1956) Signed-off-by: oliver könig <[email protected]> * ci: Increase time limit for main tests Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/add milestone (NVIDIA#1951) Signed-off-by: oliver könig <[email protected]> * Remove M-FSDP testing under LTS environment (NVIDIA#1959) * ci: Run on push to release branch (NVIDIA#1960) Signed-off-by: oliver könig <[email protected]> * ci: Add golden values for inference Signed-off-by: oliver könig <[email protected]> * Fix typo in rl section of CODEOWNERS (NVIDIA#1968) * ci: Update copyright checker (NVIDIA#1973) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/auto reminder GitHub (NVIDIA#1955) Signed-off-by: oliver könig <[email protected]> * ci: Update secret Signed-off-by: oliver könig <[email protected]> * ci(fix): `Run tests` label (NVIDIA#1970) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Disable tests again Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Add merge-group to copyright check Signed-off-by: oliver könig <[email protected]> * ci(hotfix): 
Copyright check on merge-queue Signed-off-by: oliver könig <[email protected]> * zarr soft deprecation (NVIDIA#2004) Signed-off-by: dimapihtar <[email protected]> Co-authored-by: oliver könig <[email protected]> * Make `get_asyncio_loop` safe to use repeatedly (NVIDIA#1990) Co-authored-by: oliver könig <[email protected]> * Update symmetric registration interface to sync-up with upstream pytorch change (NVIDIA#1924) Signed-off-by: Youngeun Kwon <[email protected]> Signed-off-by: Youngeun <[email protected]> Co-authored-by: oliver könig <[email protected]> * chore: Update codeowners (NVIDIA#2012) Signed-off-by: oliver könig <[email protected]> * Deduplicate dynamic engine + coordinator. (NVIDIA#1981) Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: oliver könig <[email protected]> * Safely access state dict args in load ckpt (NVIDIA#1957) Signed-off-by: Maanu Grover <[email protected]> * Allow mixed-batch sampling in dynamic inference (NVIDIA#1927) * Stop Nemo_CICD_Test from failing in forks (NVIDIA#2024) * Clean up dynamic inference step (NVIDIA#1992) Co-authored-by: Lawrence McAfee <[email protected]> Co-authored-by: oliver könig <[email protected]> * ci: Auto-update copy-pr-bot vetters (NVIDIA#1850) Signed-off-by: oliver könig <[email protected]> Co-authored-by: AJ Schmidt <[email protected]> * Have datasets account for tokenizers which incorrectly define PAD (NVIDIA#2017) * ci: Enable integration tests (NVIDIA#2023) Signed-off-by: oliver könig <[email protected]> * ci: Fix build-push-wheel workflow (NVIDIA#2022) Signed-off-by: oliver könig <[email protected]> * chore: Update tooling for interactive jobs (NVIDIA#2032) Signed-off-by: oliver könig <[email protected]> * revert(hotfix): ci: trustees_override (NVIDIA#2041) Signed-off-by: oliver könig <[email protected]> * add missing warnings import in model parallel config (NVIDIA#2039) Signed-off-by: ykarnati <[email protected]> * Reduce-scatter implementation with FP32 accumulation (NVIDIA#1967) 
Signed-off-by: Deepak Narayanan <[email protected]> * ci(fix): Workflows on `main` (NVIDIA#2045) Signed-off-by: oliver könig <[email protected]> * build: Bump modelopt (NVIDIA#2046) Signed-off-by: oliver könig <[email protected]> * Remove TestCaptureFreezeGC unit test. (NVIDIA#1978) * ci: Add multi-approval action (NVIDIA#2051) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Repair codeowners file * ci(hotfix): Set docs allowed to fail Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/test iteration time (NVIDIA#2067) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Remove performance for ckpt-resume Signed-off-by: oliver könig <[email protected]> * Allow inference test throughput to vary by 10% (NVIDIA#2070) * ci(hotfix): Inference test pipeline Signed-off-by: oliver könig <[email protected]> * chore: Fix autoformatter (NVIDIA#2073) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Remove iteration-time from t5 Signed-off-by: oliver könig <[email protected]> * ci(hotfix): disable inference test Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Disable inference test Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Bypass approvalbot in merge-queue (NVIDIA#2082) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Enable merge-group for approval bot Signed-off-by: oliver könig <[email protected]> * chore: Update local tooling (NVIDIA#2066) Signed-off-by: oliver könig <[email protected]> * Add extra RL files (NVIDIA#2077) Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: oliver könig <[email protected]> * Prevent summary jobs from running in forks (NVIDIA#2083) Co-authored-by: oliver könig <[email protected]> * ci: Fix test scope (NVIDIA#2091) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Remove publish workflows Signed-off-by: oliver könig <[email protected]> * Refactor the attention metadata into separate classes (NVIDIA#2001) Co-authored-by: Siddharth 
Singh <[email protected]> Co-authored-by: oliver könig <[email protected]> * Guard against incorrectly using MoE prefill graphs (NVIDIA#2030) Co-authored-by: oliver könig <[email protected]> * Revert "Refactor the attention metadata into separate classes (NVIDIA#2001)" This reverts commit a652e2c. * Run mr-slim tests in lightweight-mode (NVIDIA#2106) Signed-off-by: Charlie Truong <[email protected]> * Inference | Lazy compile UVM allocator. (NVIDIA#1977) Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: oliver könig <[email protected]> * chore: Reenable trustees (NVIDIA#2108) Signed-off-by: oliver könig <[email protected]> * Revert "Inference | Lazy compile UVM allocator. (NVIDIA#1977)" This reverts commit 7487c53. * ci(fix): Changeset of copyright checker (NVIDIA#2110) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/chore/update release settings (NVIDIA#2097) Signed-off-by: oliver könig <[email protected]> * Remove unnecessary check on rotary_pos_cos (NVIDIA#2003) Signed-off-by: Keshav Santhanam <[email protected]> * (Reverted) Inference | Lazy compile UVM allocator. 
(NVIDIA#2125) Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: oliver könig <[email protected]> * Refactor Attention Metadata to Separate Classes (NVIDIA#2112) Co-authored-by: Siddharth Singh <[email protected]> Co-authored-by: oliver könig <[email protected]> * Refactor model_provider to model_builder format for ModelOpt examples (NVIDIA#2107) * wandb Inference stats logging (NVIDIA#2026) Co-authored-by: root <[email protected]> Co-authored-by: William Dykas <[email protected]> Co-authored-by: root <[email protected]> * Make `PipelineParallelLayout` always return str from ` __repr__` (NVIDIA#2055) Signed-off-by: Ananth Subramaniam <[email protected]> Co-authored-by: oliver könig <[email protected]> * Add flash_attn_3 as first option for FA3 import (NVIDIA#2010) Signed-off-by: Keshav Santhanam <[email protected]> Co-authored-by: oliver könig <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Add debugging hint for case when cudagraphs are created but no matching runner is found (NVIDIA#2129) * ci: LTS container (NVIDIA#2133) Signed-off-by: oliver könig <[email protected]> * Revert "ci: LTS container (NVIDIA#2133)" This reverts commit eb48e81. 
* Fix param init (NVIDIA#2033) Signed-off-by: Chen Cui <[email protected]> * Hotfix to unit tests on hopper FA3 (NVIDIA#2143) * Add BytesIO to safe_globals (NVIDIA#2074) * add deprecation warning for legacy tokenizer system (NVIDIA#2145) Signed-off-by: dimapihtar <[email protected]> * replay: ci: Bump LTS container (NVIDIA#2157) Signed-off-by: oliver könig <[email protected]> * Hotfix to unit tests on hopper FA3 (bis) (NVIDIA#2179) * Fix has_modelopt_state() for native Torch checkpoint format (NVIDIA#2160) Signed-off-by: Asha Anoosheh <[email protected]> * chore: Remove codeowners (NVIDIA#2175) Signed-off-by: oliver könig <[email protected]> * Fix FP8 inference with sequence parallelism (NVIDIA#2009) Signed-off-by: Keshav Santhanam <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Replace ModelOpt generation server (NVIDIA#2147) Signed-off-by: Asha Anoosheh <[email protected]> * Add hybrid model support for dynamic inference engine (NVIDIA#1907) Signed-off-by: Keshav Santhanam <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Async task and event loop safety in Megatron Core (NVIDIA#2025) Co-authored-by: Robert Kirby <[email protected]> * Rename skip_prompt_log_probs (NVIDIA#2181) * Dynamic inference context | UVM only. (NVIDIA#1983) Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Update copy-pr-bot.yaml [skip ci] Signed-off-by: oliver könig <[email protected]> * Revert "Dynamic inference context | UVM only. (NVIDIA#1983)" This reverts commit d6979d6. 
* ci: Run `auto-update-copy-pr-bot` only on forks (NVIDIA#2191) Signed-off-by: oliver könig <[email protected]> * Inference throughput tests: refactor goldens to be in list format (NVIDIA#2072) * Enable TE custom quantization recipe (NVIDIA#2005) Signed-off-by: Evgeny <[email protected]> Signed-off-by: root <Evgeny> Co-authored-by: oliver könig <[email protected]> Co-authored-by: root <Evgeny> * Remove redundant logits calculations in gpt_model --------- Signed-off-by: oliver könig <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Youngeun Kwon <[email protected]> Signed-off-by: Youngeun <[email protected]> Signed-off-by: Maanu Grover <[email protected]> Signed-off-by: ykarnati <[email protected]> Signed-off-by: Deepak Narayanan <[email protected]> Signed-off-by: Charlie Truong <[email protected]> Signed-off-by: Keshav Santhanam <[email protected]> Signed-off-by: Ananth Subramaniam <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Asha Anoosheh <[email protected]> Signed-off-by: Evgeny <[email protected]> Signed-off-by: root <Evgeny> Co-authored-by: oliver könig <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Youngeun Kwon <[email protected]> Co-authored-by: Lawrence McAfee <[email protected]> Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: Maanu Grover <[email protected]> Co-authored-by: Lawrence McAfee <[email protected]> Co-authored-by: AJ Schmidt <[email protected]> Co-authored-by: Yashaswi Karnati <[email protected]> Co-authored-by: Deepak Narayanan <[email protected]> Co-authored-by: helen ngo <[email protected]> Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: kanz-nv <[email protected]> Co-authored-by: Siddharth Singh <[email protected]> Co-authored-by: Charlie Truong <[email protected]> Co-authored-by: Keshav Santhanam <[email protected]> Co-authored-by: Asha Anoosheh <[email 
protected]> Co-authored-by: wdykas <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: William Dykas <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: Ananth Subramaniam <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: Evgeny Tsykunov <[email protected]>

* ci: Move test optimizer into its own bucket (NVIDIA#1909) Signed-off-by: oliver könig <[email protected]> * ci: Use matrix for approval-bot Signed-off-by: oliver könig <[email protected]> * ci: Update function name Signed-off-by: oliver könig <[email protected]> * ci: Adjust approval-bot for copy-pr-bot Signed-off-by: oliver könig <[email protected]> * ci: Parametrize workflow Signed-off-by: oliver könig <[email protected]> * ci: Parametrize workflow Signed-off-by: oliver könig <[email protected]> * ci: Remove attribute Signed-off-by: oliver könig <[email protected]> * ci: Update container image tag to use GitHub SHA * chore: Remove file * ci: Fix approval bot Signed-off-by: oliver könig <[email protected]> * ci: Configure cherrypick bot (NVIDIA#1925) Signed-off-by: oliver könig <[email protected]> * Ci approve dev (NVIDIA#1933) Signed-off-by: oliver könig <[email protected]> * ci: Update nightly schedule (NVIDIA#1934) Signed-off-by: oliver könig <[email protected]> * ci: Bump pre-flight for runs on main/dev (NVIDIA#1935) Signed-off-by: oliver könig <[email protected]> * ci: Allow skipping on main (NVIDIA#1936) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/pr template community bot (NVIDIA#1937) * ci: More granular unit tests buckets (NVIDIA#1932) Signed-off-by: oliver könig <[email protected]> * Add sequence packing to RL (NVIDIA#1911) Add sequence packing to RL * chore: Update template (NVIDIA#1939) Signed-off-by: oliver könig <[email protected]> * chore: Add description about who can merge (NVIDIA#1940) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/fix main on eos (NVIDIA#1938) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/internal mrs (NVIDIA#1942) Signed-off-by: oliver könig <[email protected]> * ci: Fix branch of approval bot (NVIDIA#1944) Signed-off-by: oliver könig <[email protected]> * ci: Approvalbot for other branches (NVIDIA#1947) Signed-off-by: oliver könig <[email protected]> * ci(fix): Approval bot 
(NVIDIA#1949) Signed-off-by: oliver könig <[email protected]> * ci(fix): Approval gate Signed-off-by: oliver könig <[email protected]> * ci: Approval gate rule Signed-off-by: oliver könig <[email protected]> * ci: Update golden values nightly Signed-off-by: oliver könig <[email protected]> * ci: Approval gate Signed-off-by: oliver könig <[email protected]> * ci: Approval bot Signed-off-by: oliver könig <[email protected]> * ci: Sync branches Signed-off-by: oliver könig <[email protected]> * ci: Smaller image Signed-off-by: oliver könig <[email protected]> * ci: Better output Signed-off-by: oliver könig <[email protected]> * ci: sync branches Signed-off-by: oliver könig <[email protected]> * ci: Fix sync bot Signed-off-by: oliver könig <[email protected]> * ci: Finalize Signed-off-by: oliver könig <[email protected]> * ci: Finalize Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/sync branches (NVIDIA#1956) Signed-off-by: oliver könig <[email protected]> * ci: Increase time limit for main tests Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/add milestone (NVIDIA#1951) Signed-off-by: oliver könig <[email protected]> * Remove M-FSDP testing under LTS environment (NVIDIA#1959) * ci: Run on push to release branch (NVIDIA#1960) Signed-off-by: oliver könig <[email protected]> * ci: Add golden values for inference Signed-off-by: oliver könig <[email protected]> * Fix typo in rl section of CODEOWNERS (NVIDIA#1968) * ci: Update copyright checker (NVIDIA#1973) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/auto reminder GitHub (NVIDIA#1955) Signed-off-by: oliver könig <[email protected]> * ci: Update secret Signed-off-by: oliver könig <[email protected]> * ci(fix): `Run tests` label (NVIDIA#1970) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Disable tests again Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Add merge-group to copyright check Signed-off-by: oliver könig <[email protected]> * ci(hotfix): 
Copyright check on merge-queue Signed-off-by: oliver könig <[email protected]> * zarr soft deprecation (NVIDIA#2004) Signed-off-by: dimapihtar <[email protected]> Co-authored-by: oliver könig <[email protected]> * Make `get_asyncio_loop` safe to use repeatedly (NVIDIA#1990) Co-authored-by: oliver könig <[email protected]> * Update symmetric registration interface to sync-up with upstream pytorch change (NVIDIA#1924) Signed-off-by: Youngeun Kwon <[email protected]> Signed-off-by: Youngeun <[email protected]> Co-authored-by: oliver könig <[email protected]> * chore: Update codeowners (NVIDIA#2012) Signed-off-by: oliver könig <[email protected]> * Deduplicate dynamic engine + coordinator. (NVIDIA#1981) Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: oliver könig <[email protected]> * Safely access state dict args in load ckpt (NVIDIA#1957) Signed-off-by: Maanu Grover <[email protected]> * Allow mixed-batch sampling in dynamic inference (NVIDIA#1927) * Stop Nemo_CICD_Test from failing in forks (NVIDIA#2024) * Clean up dynamic inference step (NVIDIA#1992) Co-authored-by: Lawrence McAfee <[email protected]> Co-authored-by: oliver könig <[email protected]> * ci: Auto-update copy-pr-bot vetters (NVIDIA#1850) Signed-off-by: oliver könig <[email protected]> Co-authored-by: AJ Schmidt <[email protected]> * Have datasets account for tokenizers which incorrectly define PAD (NVIDIA#2017) * ci: Enable integration tests (NVIDIA#2023) Signed-off-by: oliver könig <[email protected]> * ci: Fix build-push-wheel workflow (NVIDIA#2022) Signed-off-by: oliver könig <[email protected]> * chore: Update tooling for interactive jobs (NVIDIA#2032) Signed-off-by: oliver könig <[email protected]> * revert(hotfix): ci: trustees_override (NVIDIA#2041) Signed-off-by: oliver könig <[email protected]> * add missing warnings import in model parallel config (NVIDIA#2039) Signed-off-by: ykarnati <[email protected]> * Reduce-scatter implementation with FP32 accumulation (NVIDIA#1967) 
Signed-off-by: Deepak Narayanan <[email protected]> * ci(fix): Workflows on `main` (NVIDIA#2045) Signed-off-by: oliver könig <[email protected]> * build: Bump modelopt (NVIDIA#2046) Signed-off-by: oliver könig <[email protected]> * Remove TestCaptureFreezeGC unit test. (NVIDIA#1978) * ci: Add multi-approval action (NVIDIA#2051) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Repair codeowners file * ci(hotfix): Set docs allowed to fail Signed-off-by: oliver könig <[email protected]> * Ko3n1g/ci/test iteration time (NVIDIA#2067) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Remove performance for ckpt-resume Signed-off-by: oliver könig <[email protected]> * Allow inference test throughput to vary by 10% (NVIDIA#2070) * ci(hotfix): Inference test pipeline Signed-off-by: oliver könig <[email protected]> * chore: Fix autoformatter (NVIDIA#2073) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Remove iteration-time from t5 Signed-off-by: oliver könig <[email protected]> * ci(hotfix): disable inference test Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Disable inference test Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Bypass approvalbot in merge-queue (NVIDIA#2082) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Enable merge-group for approval bot Signed-off-by: oliver könig <[email protected]> * chore: Update local tooling (NVIDIA#2066) Signed-off-by: oliver könig <[email protected]> * Add extra RL files (NVIDIA#2077) Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: oliver könig <[email protected]> * Prevent summary jobs from running in forks (NVIDIA#2083) Co-authored-by: oliver könig <[email protected]> * ci: Fix test scope (NVIDIA#2091) Signed-off-by: oliver könig <[email protected]> * ci(hotfix): Remove publish workflows Signed-off-by: oliver könig <[email protected]> * Refactor the attention metadata into separate classes (NVIDIA#2001) Co-authored-by: Siddharth 
Singh <[email protected]> Co-authored-by: oliver könig <[email protected]> * Guard against incorrectly using MoE prefill graphs (NVIDIA#2030) Co-authored-by: oliver könig <[email protected]> * Revert "Refactor the attention metadata into separate classes (NVIDIA#2001)" This reverts commit a652e2c. * Run mr-slim tests in lightweight-mode (NVIDIA#2106) Signed-off-by: Charlie Truong <[email protected]> * Inference | Lazy compile UVM allocator. (NVIDIA#1977) Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: oliver könig <[email protected]> * chore: Reenable trustees (NVIDIA#2108) Signed-off-by: oliver könig <[email protected]> * Revert "Inference | Lazy compile UVM allocator. (NVIDIA#1977)" This reverts commit 7487c53. * ci(fix): Changeset of copyright checker (NVIDIA#2110) Signed-off-by: oliver könig <[email protected]> * Ko3n1g/chore/update release settings (NVIDIA#2097) Signed-off-by: oliver könig <[email protected]> * Remove unnecessary check on rotary_pos_cos (NVIDIA#2003) Signed-off-by: Keshav Santhanam <[email protected]> * (Reverted) Inference | Lazy compile UVM allocator. 
(NVIDIA#2125) Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: oliver könig <[email protected]> * Refactor Attention Metadata to Separate Classes (NVIDIA#2112) Co-authored-by: Siddharth Singh <[email protected]> Co-authored-by: oliver könig <[email protected]> * Refactor model_provider to model_builder format for ModelOpt examples (NVIDIA#2107) * wandb Inference stats logging (NVIDIA#2026) Co-authored-by: root <[email protected]> Co-authored-by: William Dykas <[email protected]> Co-authored-by: root <[email protected]> * Make `PipelineParallelLayout` always return str from ` __repr__` (NVIDIA#2055) Signed-off-by: Ananth Subramaniam <[email protected]> Co-authored-by: oliver könig <[email protected]> * Add flash_attn_3 as first option for FA3 import (NVIDIA#2010) Signed-off-by: Keshav Santhanam <[email protected]> Co-authored-by: oliver könig <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Add debugging hint for case when cudagraphs are created but no matching runner is found (NVIDIA#2129) * ci: LTS container (NVIDIA#2133) Signed-off-by: oliver könig <[email protected]> * Revert "ci: LTS container (NVIDIA#2133)" This reverts commit eb48e81. 
* Fix param init (NVIDIA#2033) Signed-off-by: Chen Cui <[email protected]> * Hotfix to unit tests on hopper FA3 (NVIDIA#2143) * Add BytesIO to safe_globals (NVIDIA#2074) * add deprecation warning for legacy tokenizer system (NVIDIA#2145) Signed-off-by: dimapihtar <[email protected]> * replay: ci: Bump LTS container (NVIDIA#2157) Signed-off-by: oliver könig <[email protected]> * Hotfix to unit tests on hopper FA3 (bis) (NVIDIA#2179) * Fix has_modelopt_state() for native Torch checkpoint format (NVIDIA#2160) Signed-off-by: Asha Anoosheh <[email protected]> * chore: Remove codeowners (NVIDIA#2175) Signed-off-by: oliver könig <[email protected]> * Fix FP8 inference with sequence parallelism (NVIDIA#2009) Signed-off-by: Keshav Santhanam <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Replace ModelOpt generation server (NVIDIA#2147) Signed-off-by: Asha Anoosheh <[email protected]> * Add hybrid model support for dynamic inference engine (NVIDIA#1907) Signed-off-by: Keshav Santhanam <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Async task and event loop safety in Megatron Core (NVIDIA#2025) Co-authored-by: Robert Kirby <[email protected]> * Rename skip_prompt_log_probs (NVIDIA#2181) * Dynamic inference context | UVM only. (NVIDIA#1983) Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> * Update copy-pr-bot.yaml [skip ci] Signed-off-by: oliver könig <[email protected]> * Revert "Dynamic inference context | UVM only. (NVIDIA#1983)" This reverts commit d6979d6. 
* ci: Run `auto-update-copy-pr-bot` only on forks (NVIDIA#2191) Signed-off-by: oliver könig <[email protected]> * Inference throughput tests: refactor goldens to be in list format (NVIDIA#2072) * Enable TE custom quantization recipe (NVIDIA#2005) Signed-off-by: Evgeny <[email protected]> Signed-off-by: root <Evgeny> Co-authored-by: oliver könig <[email protected]> Co-authored-by: root <Evgeny> * Add MoE parameters to ModelOpt pruning example + conf fixes (NVIDIA#2205) Signed-off-by: Keval Morabia <[email protected]> * Add repr to pg collection class (NVIDIA#2089) Co-authored-by: Jared Casper <[email protected]> * Move `data_samplers.py` from `legacy` to `training.datasets` & add `DistributedSignalHandler` to DataLoader workers (NVIDIA#2068) * Fix Megatron-FSDP checkpoint save failure (NVIDIA#2138) --------- Signed-off-by: oliver könig <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Youngeun Kwon <[email protected]> Signed-off-by: Youngeun <[email protected]> Signed-off-by: Maanu Grover <[email protected]> Signed-off-by: ykarnati <[email protected]> Signed-off-by: Deepak Narayanan <[email protected]> Signed-off-by: Charlie Truong <[email protected]> Signed-off-by: Keshav Santhanam <[email protected]> Signed-off-by: Ananth Subramaniam <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Asha Anoosheh <[email protected]> Signed-off-by: Evgeny <[email protected]> Signed-off-by: root <Evgeny> Signed-off-by: Keval Morabia <[email protected]> Co-authored-by: oliver könig <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Youngeun Kwon <[email protected]> Co-authored-by: Lawrence McAfee <[email protected]> Co-authored-by: Mcore Bot <[email protected]> Co-authored-by: Maanu Grover <[email protected]> Co-authored-by: Lawrence McAfee <[email protected]> Co-authored-by: AJ Schmidt <[email protected]> Co-authored-by: Yashaswi 
Karnati <[email protected]> Co-authored-by: Deepak Narayanan <[email protected]> Co-authored-by: helen ngo <[email protected]> Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: kanz-nv <[email protected]> Co-authored-by: Siddharth Singh <[email protected]> Co-authored-by: Charlie Truong <[email protected]> Co-authored-by: Keshav Santhanam <[email protected]> Co-authored-by: Asha Anoosheh <[email protected]> Co-authored-by: wdykas <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: William Dykas <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: Ananth Subramaniam <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Teodor-Dumitru Ene <[email protected]> Co-authored-by: Robert Kirby <[email protected]> Co-authored-by: Evgeny Tsykunov <[email protected]> Co-authored-by: Keval Morabia <[email protected]> Co-authored-by: Jared Casper <[email protected]> Co-authored-by: Antoni-Joan Solergibert <[email protected]>

Add extra RL files

d8b0d81

tdene requested a review from a team as a code owner October 31, 2025 20:41

copy-pr-bot bot had a problem deploying to nemo-ci October 31, 2025 20:42 Failure

copy-pr-bot bot temporarily deployed to nemo-ci October 31, 2025 20:42 Inactive

ko3n1g added this to the Core 0.16 milestone Oct 31, 2025

Merge branch 'main' into tde/rl_extra_files

3164c87

copy-pr-bot bot temporarily deployed to nemo-ci October 31, 2025 20:42 Inactive

tdene enabled auto-merge October 31, 2025 20:42

copy-pr-bot bot temporarily deployed to nemo-ci October 31, 2025 20:42 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci October 31, 2025 20:42 Failure

copy-pr-bot bot temporarily deployed to nemo-ci October 31, 2025 20:42 Inactive

copy-pr-bot bot temporarily deployed to test October 31, 2025 20:43 Inactive

copy-pr-bot bot temporarily deployed to public October 31, 2025 20:46 Inactive

ArEsKay3 approved these changes Oct 31, 2025

copy-pr-bot bot had a problem deploying to nemo-ci October 31, 2025 20:57 Error

copy-pr-bot bot temporarily deployed to nemo-ci October 31, 2025 21:05 Inactive

tdene disabled auto-merge October 31, 2025 21:06

tdene enabled auto-merge October 31, 2025 21:13

copy-pr-bot bot temporarily deployed to nemo-ci October 31, 2025 21:14 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci October 31, 2025 21:15 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 2, 2025 17:49 Inactive

copy-pr-bot bot temporarily deployed to test November 2, 2025 17:50 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 2, 2025 17:51 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 2, 2025 17:52 Inactive

copy-pr-bot bot temporarily deployed to public November 2, 2025 17:53 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 2, 2025 18:08 Inactive

ko3n1g added this pull request to the merge queue Nov 2, 2025

Merged via the queue into NVIDIA:main with commit dc7a0ca Nov 2, 2025
45 checks passed

3 participants