| title | Models |
|---|---|
| description | Browse 100+ open-source models available on DeepInfra. |
| icon | brain |
DeepInfra hosts many of the most popular machine learning models. The full list below is split into categories based on functionality.
We are constantly adding more. DeepInfra is usually among the first to offer a new model once it is released, at highly competitive prices for open-source model inference.
- Text generation / LLMs — Llama, DeepSeek, Mistral, Qwen, Gemma, and more
- Embeddings — Qwen3 Embedding, BAAI/bge, sentence-transformers, and more
- Rerankers — Cross-encoder rerankers for RAG pipelines
- Vision / multimodal — Qwen2.5-VL, Llama Vision, and more
- OCR — Specialized models for document text extraction
- Text to image — FLUX, Stable Diffusion, and more
- Text to video — Generate video clips from text prompts
- Text to speech — Convert text to natural-sounding audio
- Speech recognition — Whisper and other ASR models
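For instance, text-generation models in the list above can be called through DeepInfra's OpenAI-compatible chat completions endpoint. A minimal sketch using only the standard library; the model name is illustrative and a `DEEPINFRA_API_KEY` environment variable is assumed to hold your API key:

```python
import json
import os
import urllib.request

API_URL = "https://api.deepinfra.com/v1/openai/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """Send one chat turn and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(model, prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            # API key is read from the environment.
            "Authorization": f"Bearer {os.environ['DEEPINFRA_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Requires DEEPINFRA_API_KEY to be set; the model name is a placeholder.
    print(chat("meta-llama/Meta-Llama-3.1-8B-Instruct", "Hello!"))
```

The same request shape works for any of the text-generation models; only the `model` field changes.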
Each model has a dedicated page where you can:
- Try it out interactively
- See its API documentation
- Grab ready-to-use code examples
We also support deploying custom models on DeepInfra infrastructure. Run your own fine-tuned or trained-from-scratch LLM on dedicated A100/H100/H200/B200/B300 GPUs.
Some models have more than one version available. You can infer against a particular version using the `{"model": "MODEL_NAME:VERSION", ...}` format.
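In a request body, version pinning is just a colon-separated suffix on the model name (both `MODEL_NAME` and `VERSION` here are placeholders for values from the model's page):

```python
import json

# Pin a specific model version with the MODEL_NAME:VERSION form.
payload = {
    "model": "MODEL_NAME:VERSION",  # placeholder; copy the real pair from the model page
    "messages": [{"role": "user", "content": "Hello"}],
}

body = json.dumps(payload)
```

Omitting `:VERSION` targets the model's default (latest) version.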
You can also infer against a deploy_id using `{"model": "deploy_id:DEPLOY_ID", ...}`. This is especially useful for Custom LLMs — you can start inferring before the deployment finishes and before you have the model name + version pair.
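A sketch of the deploy_id form (the `DEPLOY_ID` value is a placeholder for the id shown on your deployment's page):

```python
import json

def payload_for_deploy(deploy_id: str, prompt: str) -> dict:
    """Target a specific deployment instead of a model name."""
    return {
        "model": f"deploy_id:{deploy_id}",
        "messages": [{"role": "user", "content": prompt}],
    }

# DEPLOY_ID is a placeholder; substitute the id from your deployment.
print(json.dumps(payload_for_deploy("DEPLOY_ID", "Hello")))
```

Everything else about the request (endpoint, headers, response shape) stays the same as for a named model.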
If you think there is a model that we should run, let us know at info@deepinfra.com. We read every email.