Pull requests: NVIDIA/TensorRT-LLM
doc: Update perf-benchmarking doc on GPU configuration for consistent benchmarking.
#3458
opened Apr 10, 2025 by
bobboli
fix: nvbugs/5187237 nvbugs/5112075: fix deterministic mode error
#3448
opened Apr 10, 2025 by
VALLIS-NERIA
Draft
feat: Update fmha kernels to support Eagle3 on blackwell.
#3446
opened Apr 10, 2025 by
PerkzZheng
infra: Support auto trigger test stage for special file change.
#3443
opened Apr 10, 2025 by
ZhanruiSunCh
test: move mistral / mixtral test cases in QA test list into the new accuracy test suite
#3440
opened Apr 10, 2025 by
crazydemo
2 tasks
perf: Eliminate the need for attention DP padding when possible
Community Engagement
Community want to contribute
#3439
opened Apr 10, 2025 by
jinyangyuan-nvidia
feat: Add group_rms_norm kernel to normalize multiple inputs in a single operator.
#3438
opened Apr 10, 2025 by
SimengLiu-nv
Draft
feat: replace MLA context (192x128 packed) fmha from ampere-style to hopper-style
#3436
opened Apr 10, 2025 by
zhou-yuxin
Draft
infra: always trigger multi gpu test to protect llama and deepseek
#3434
opened Apr 10, 2025 by
QiJune
fix: Fixing issue with first gen token being returned twice in streaming
#3427
opened Apr 9, 2025 by
pcastonguay