Pull requests: quic/efficient-transformers
Pull requests list
First Block Caching Infra for diffusers
Diffusers
Use for PR related to diffusers in efficient-transformers.
#941
opened Apr 24, 2026 by
quic-amitraj
Contributor
feat(moe): NSP-blocked expert dispatch for Qwen3MOE and GPT-OSS prefill
enhancement
New feature or request
#935
opened Apr 21, 2026 by
vbaddi
Contributor
updated blocking in diffusers with cross attention check instead of SL
#932
opened Apr 21, 2026 by
tv-karthikeya
Contributor
CB Bug fix for Qwen3VL Dense and basic cleaning of example script and Model File
#926
opened Apr 20, 2026 by
qcdipankar
Contributor
Enabling support of rerankers models 2B and 8B of qwen3vl
#921
opened Apr 18, 2026 by
quic-amitraj
Contributor
Removed redundancies from QEFFHybridCache and QEFFHybridChunkedCache
#914
opened Apr 13, 2026 by
quic-mamta
Contributor
Draft
revert(export): Revert proxy-only ONNX transform gating and restore default export behavior
1.21.0
#912
opened Apr 10, 2026 by
vbaddi
Contributor
feat: Enable benchmark-mode module inventory/export across all CausalLM architectures
enhancement
New feature or request
#906
opened Apr 3, 2026 by
vbaddi
Contributor
Merge ft_experimental_v1 branch to main
fine-tuning
ready for review
#887
opened Mar 25, 2026 by
quic-akuruvil
Contributor
Undo deepstack_features based changes for Qwen3VL and Qwen3VL_MOE models
#869
opened Mar 18, 2026 by
quic-dhirajku
Contributor
Draft
MLA : update attention in fused_forward, head blocking and add prefillonly transform
#857
opened Mar 16, 2026 by
quic-mamta
Contributor
Add support for num_crops and valid_size from vLLM
#796
opened Feb 17, 2026 by
quic-vargupt
Contributor