Pull requests: NVIDIA/TransformerEngine
Pull requests list
[ TE-JAX ] Expose cp_strategy argument to DPA api
2.7.0
#2090
opened Aug 19, 2025 by
kocchop
7 of 12 tasks
[JAX] Error checking for mesh resource and update GemmPrimitive to use global_mesh_resource().fsdp_resource
2.7.0
#2088
opened Aug 19, 2025 by
jberchtold-nvidia
1 of 13 tasks
[Draft] FP8 AllGather in FP8 GroupedGEMM
#2086
opened Aug 19, 2025 by
mingxu1067
2 of 13 tasks
[PyTorch] Add test for TRT integration + fix for mxfp8 export
#2083
opened Aug 19, 2025 by
pggPL
7 tasks done
[Common] Add checks to CUDA kernel launch and CUDA API calls
#2074
opened Aug 14, 2025 by
yaox12
13 tasks
[PyTorch] ONNX export of FP8 Current Scaling
#2068
opened Aug 12, 2025 by
pggPL
8 of 13 tasks
Support communication/gemm overlap for [Wgrad->Dgrad] execution order in the bwd pass.
#2065
opened Aug 12, 2025 by
fanshiqing
3 of 13 tasks
Support MLA Context Parallel (CP) exchanging latent KV
#2064
opened Aug 12, 2025 by
yuzhongw-nvidia
13 tasks
Support PyPI wheel for cuda13
build
Build system
#2057
opened Aug 11, 2025 by
ksivaman
2 of 13 tasks
Add better ordering enforcement to split_overlap_rs gemms.
#2056
opened Aug 11, 2025 by
chaseblock
6 of 13 tasks
[Draft] Add primary weights fp8 support for mxfp8
#2055
opened Aug 11, 2025 by
kunlunl
13 tasks
[PyTorch] Add Cutlass GroupGEMM Support for fine-grained MoE Model
#2045
opened Aug 8, 2025 by
cassiewilliam
1 of 13 tasks
[JAX] Fix layernorm distributed test sharding and collective assertions
#2041
opened Aug 7, 2025 by
jberchtold-nvidia
8 of 13 tasks
Offloading support for multiple attention layouts
#2024
opened Aug 3, 2025 by
sanandaraj5597
[Draft] Add FP8 attention with current scaling
#2012
opened Jul 30, 2025 by
cyanguwa
8 of 13 tasks