MXFP8 support in Userbuffers #1711
Merged
Changes from 47 commits
55 commits
25bcbda Initial work toward restoring UB support in te.Sequential (timmoon10)
49c6a02 Forward UB linear runs, but has numerical error (timmoon10)
4bacf2a Debug UB forward tests (timmoon10)
bd1d50a Minor tweaks (timmoon10)
b8b325b Remove Python checks for MXFP8 UB linear forward (timmoon10)
c8b2c51 Add dim check for MXFP8 full tiles (timmoon10)
9f562b6 Move QuantizedTensor logic out of UB comm and into Python helper func… (timmoon10)
c7a5e65 Support MXFP8 AGs (timmoon10)
0c1a98f Coalesce NCCL all-gathers for MXFP8 all-gather (timmoon10)
4304ddf Merge branch 'main' into mxfp8-ub-debug (timmoon10)
15c34ec Merge branch 'main' into optimize-wgrad-allgather (timmoon10)
0917b20 Initial impl of backward UB linear in te.Sequential (timmoon10)
33a3dbb Merge branch 'optimize-wgrad-allgather' into mxfp8-ub-debug (timmoon10)
a86cbb9 Debug UB linear backward with no quantization (timmoon10)
3ff2955 Fix chunk dims for dgrad GEMM + dx RS (timmoon10)
0942ee3 Debugging MXFP8 UB cases (timmoon10)
df59ba0 Use NCCL to overlap dy AG with dgrad GEMM (timmoon10)
1047a09 Merge branch 'main' into mxfp8-ub-debug (timmoon10)
ffa65bf Debug UB GEMM tests (timmoon10)
8531147 Initial refactoring of linear module forward (timmoon10)
0767cea Refactor linear module backward (timmoon10)
1a03c8f Debug linear module UB tests (timmoon10)
0639166 Tweak test tensor dims (timmoon10)
6f855aa Merge branch 'main' into mxfp8-ub-debug (timmoon10)
d4ac2ea Do not store autograd context within wgrad GEMM closure (timmoon10)
7803cb0 Fix linter warnings (timmoon10)
a8f1ada [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
46675e5 Update LayerNormLinear (timmoon10)
7119346 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
7df9b74 Update LayerNormMLP (timmoon10)
a85a79e Debug UB tests (timmoon10)
13fefeb Merge branch 'main' into mxfp8-ub-debug (timmoon10)
6f7da09 Fix linter warnings (timmoon10)
9e16039 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
f717755 Debug test failures (timmoon10)
f4544d9 Minor style tweaks (timmoon10)
776fbe5 Merge branch 'main' into mxfp8-ub-debug (timmoon10)
8c825f7 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
f783b95 Fix incorrect usage for GEMM input with block-scaled FP8 (timmoon10)
9288e6a Merge branch 'main' into mxfp8-ub-debug (timmoon10)
dfb53ca Merge branch 'main' into mxfp8-ub-debug (timmoon10)
546e02a Fix RS out dims (timmoon10)
8e63f8c Disable dgrad GEMM + UB AG + NCCL AG overlapping (timmoon10)
ab04e50 Merge branch 'main' into mxfp8-ub-debug (timmoon10)
9794f91 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
418aab2 Disable dgrad GEMM + UB AG + NCCL AG overlap in te.Sequential (timmoon10)
da753ea [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
fe02a7e Merge branch 'main' into mxfp8-ub-debug (timmoon10)
19459ec Restore support for internal quantized tensors (timmoon10)
0efa1a9 Add tests for MXFP8 GEMM with UB (timmoon10)
fb5ca6e [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
70057c6 Fix linter warnings (timmoon10)
7153804 Debug test failures (timmoon10)
d1fc045 Debug test failures (timmoon10)
6994f29 Merge branch 'main' into mxfp8-ub-debug (timmoon10)