Add HF_MODEL to load models directly from huggingface #17801

yieldthought · 2025-02-10T19:47:45Z

Problem description

Downloading the weights for models is so 2019. We just want to do things like set HF_MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-7B and run the demo and have it work first time.

What's changed

Add HF_MODEL as an alternative to setting LLAMA_DIR:

- Loads the model from HuggingFace directly using their organisation/model-name format
- Creates tenstorrent cache tensor files in the existing LLAMA_CACHE_PATH if you set it, otherwise in model_cache/$HF_MODEL
- Tested with mistralai/Mistral-7B-Instruct-v0.3 and works out-of-the-box
- Work around as_tensor issue to enable models with bias to run again on N150
- Generalise 2d matmul in0_block_w selection

With these changes even more HF models run out-of-the-box!
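The cache-location rule described above can be sketched roughly as follows. This is only an illustration and not the code from this PR: the helper name `resolve_hf_cache_dir` and the `env` parameter are hypothetical, and the only behaviour taken from the description is that LLAMA_CACHE_PATH wins when set and the fallback is model_cache/$HF_MODEL.

```python
import os
from pathlib import Path


def resolve_hf_cache_dir(env=None):
    """Pick the directory for cached tensor files when HF_MODEL is set.

    Hypothetical helper for illustration; the real model config code may
    structure this differently.
    """
    env = os.environ if env is None else env

    hf_model = env.get("HF_MODEL")  # e.g. "organisation/model-name"
    if not hf_model:
        raise ValueError("HF_MODEL is not set")

    # An explicitly set LLAMA_CACHE_PATH takes priority.
    explicit = env.get("LLAMA_CACHE_PATH")
    if explicit:
        return Path(explicit)

    # Otherwise fall back to model_cache/$HF_MODEL.
    return Path("model_cache") / hf_model


# Usage with a stand-in environment instead of the real os.environ.
example_env = {"HF_MODEL": "mistralai/Mistral-7B-Instruct-v0.3"}
print(resolve_hf_cache_dir(example_env).as_posix())
```

Passing a plain dict in place of `os.environ` keeps the sketch easy to try without touching the real environment.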

Checklist

- All post commit CI passes
- Model regression CI passes (single-card, t3k hanging in main)

mtairum

Looks good

mtairum · 2025-02-11T14:43:15Z

Double check the accuracy numbers, the pipeline was failing due to that.

mtairum · 2025-02-11T17:50:49Z

Rebased and updated Perf.md.

Re-running:

yieldthought requested review from cglagovichTT, mtairum and uaydonat as code owners February 10, 2025 19:47

mtairum approved these changes Feb 11, 2025


yieldthought and others added 5 commits February 11, 2025 16:55

#0: Add HF_MODEL to load models directly from huggingface

622610c

#0: Fix Qwen on N150 using old reshape syntax, fix in0_block_w for 2d mm

5f8a026

#0: Support HF models with text accuracy test

b256793

#0: Removed unused import

9b76f7e

#0: Update PERF.md

9adc481

mtairum force-pushed the yieldthought/hf-url branch from 6fb0f58 to 9adc481 Compare February 11, 2025 17:47

mtairum merged commit d2f0b15 into main Feb 12, 2025
217 of 218 checks passed

mtairum deleted the yieldthought/hf-url branch February 12, 2025 03:25
