Commit 35d8fd8

quic-sanising, sanising, quic-dhirajku, and quic-hemagnih authored
Extend On-Device Sampling Support to more Causal Language Models (#553)
### 📢 Expanded On-Device Sampling Support in QEfficient

Excited to share that **On-Device Sampling**, previously available only for `LlamaForCausalLM`, is now supported across a broader set of architectures! This enhancement brings faster, more efficient inference directly to the QAIC device.

#### ✅ Newly Supported Architectures:

1. `FalconForCausalLM`
2. `GemmaForCausalLM`
3. `GPT2LMHeadModel`
4. `GPTJForCausalLM`
5. `GraniteForCausalLM`
6. `GraniteMoeForCausalLM`
7. `LlamaForCausalLM` (existing)
8. `MptForCausalLM`
9. `Phi3ForCausalLM`
10. `Qwen2ForCausalLM`

#### ⚠️ Architectures Still Pending Support:

1. `GPTBigCodeForCausalLM`
2. `InternVLChatModel`
3. `MistralForCausalLM`
4. `MixtralForCausalLM`
5. `LlamaSwiftKVForCausalLM`
6. `Grok1ModelForCausalLM`

We’re actively working to extend support to these models. Contributions, feedback, and testing from the community are always welcome to help accelerate this effort!

---------

Signed-off-by: quic-sanising <[email protected]>
Signed-off-by: sanising <[email protected]>
Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Co-authored-by: sanising <[email protected]>
Co-authored-by: Dhiraj Kumar Sah <[email protected]>
Co-authored-by: Hem Agnihotri <[email protected]>
1 parent efb34ea commit 35d8fd8

1 file changed, 9 additions and 1 deletion

QEfficient/transformers/models/pytorch_transforms.py

Lines changed: 9 additions & 1 deletion
@@ -676,8 +676,16 @@ class SamplerTransform:
 
     # supported architectures
     _module_mapping = {
-        # Llama
+        QEffFalconForCausalLM,
+        QEffGemmaForCausalLM,
+        QEffGPT2LMHeadModel,
+        QEffGPTJForCausalLM,
+        QEffGraniteForCausalLM,
+        QEffGraniteMoeForCausalLM,
         QEffLlamaForCausalLM,
+        QEffMptForCausalLM,
+        QEffPhi3ForCausalLM,
+        QEffQwen2ForCausalLM,
     }
 
     @classmethod
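
The diff only shows the `_module_mapping` set, not the classmethod that consumes it. As a rough illustration of how a set of supported classes can gate a transform, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Fake*` model classes, the `SamplerGateSketch` class, the `include_sampler` flag, and the `sampler_enabled` attribute are hypothetical stand-ins, and the real `SamplerTransform` may behave differently.

```python
from typing import Tuple


# Hypothetical stand-ins for the QEff* model classes; nothing here imports QEfficient.
class FakeLlamaForCausalLM:
    def forward(self, x):
        return x


class FakeQwen2ForCausalLM(FakeLlamaForCausalLM):
    pass


class FakeMistralForCausalLM(FakeLlamaForCausalLM):
    pass


class SamplerGateSketch:
    """Hypothetical gate modeled on the `_module_mapping` set in the diff."""

    # Supported architectures, stored as a set of classes.
    _module_mapping = {
        FakeLlamaForCausalLM,
        FakeQwen2ForCausalLM,
    }

    @classmethod
    def apply(cls, model, include_sampler: bool = False) -> Tuple[object, bool]:
        # Sampling not requested: return the model untouched.
        if not include_sampler:
            return model, False
        # Assumed behavior: match on the exact class, so a subclass of a
        # supported model is still rejected unless it is listed itself.
        if type(model) not in cls._module_mapping:
            raise NotImplementedError(
                f"On-device sampling is not supported for {type(model).__name__}"
            )
        # Placeholder for the real transformation (for example, swapping forward).
        model.sampler_enabled = True
        return model, True
```

Under these assumptions, `SamplerGateSketch.apply(FakeQwen2ForCausalLM(), include_sampler=True)` returns the model with the flag set, while a `FakeMistralForCausalLM` instance raises `NotImplementedError`. With an exact-class check like this one, each architecture has to be listed explicitly, which would match how the commit adds one entry per model.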
