
Ollama provider for local inference#322

Merged
gsabran merged 10 commits into getcmd-dev:main from knox:ollama-provider
Jan 10, 2026
Conversation

@knox
Contributor

@knox knox commented Dec 21, 2025

This introduces an Ollama provider for local inference.

Ollama can be run locally or on a remote endpoint, e.g. a host providing AI in a private network.

This implementation uses the Ollama Provider V2 for the Vercel AI SDK for AI interactions, and plain HTTP to retrieve the available models and their details from a given Ollama endpoint.

It was tested successfully with the following models:

  • qwen3-coder:30b
  • devstral-small-2:24b
  • deepseek-r1:14b
  • qwen2.5-coder:7b

What's still missing here is making capable models available for code completion, and perhaps hiding models that are not capable of chat from the user.
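For context, model discovery against an Ollama endpoint goes through its REST API, starting with `GET /api/tags` to list the locally installed models. Below is a minimal sketch of parsing that response; the trimmed-down response type and the sample payload (including the `size` values) are illustrative, not the PR's actual code. Field names follow Ollama's API reference.

```typescript
// Sketch only: a trimmed-down shape of Ollama's GET /api/tags response
// (https://github.com/ollama/ollama/blob/main/docs/api.md#list-local-models).
// The real response carries more fields (modified_at, digest, details, ...).
interface OllamaTagsResponse {
  models: { name: string; size: number }[];
}

// Extract the model names a client would present in provider settings.
function modelNames(response: OllamaTagsResponse): string[] {
  return response.models.map((m) => m.name);
}

// Hypothetical payload; the sizes are illustrative, not real values.
const sample: OllamaTagsResponse = {
  models: [
    { name: "qwen3-coder:30b", size: 18_000_000_000 },
    { name: "qwen2.5-coder:7b", size: 4_700_000_000 },
  ],
};
console.log(modelNames(sample).join(", ")); // → qwen3-coder:30b, qwen2.5-coder:7b
```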

@gsabran
Collaborator

gsabran commented Dec 22, 2025

I skimmed through and this looks in good shape. I'll be a bit slow to review and test over the winter break.

@gsabran
Collaborator

gsabran commented Dec 22, 2025

Thanks a lot for making this change, this is exciting

Collaborator

@gsabran gsabran left a comment


This looks really good. I tested it and after figuring out the Ollama configuration it seems to work.

I think adding a page to cmd's doc portal on how to do the configuration (i.e. you need to load the models in Ollama first, plus a pointer to how to do this) would help. We should probably suggest restarting cmd when new models are added to Ollama, or add a sync button in the AI provider settings to reload them (the latter would be better, obviously, but fine to leave out of scope as well).


extension AIModel {
public var supportsCompletion: Bool {
static func modelSupportsCompletion(id: String) -> Bool {
Collaborator


nit: the getter feels more ergonomic

Contributor Author


The computed property supportsCompletion got refactored into a value property, see L139. The extension functions in L164 are leftovers from the previous logic that derived model features from their names. In my understanding, everything about model features should be handled in the local server instead, but I decided to keep this out of scope for this PR.

name: "Ollama",
executableName: "ollama",
defaultBaseUrl: URL(string: "http://localhost:11434")!,
installationInstructions: URL(string: "https://docs.ollama.com/quickstart")!,
Collaborator


We should add an entry in cmd's docs that describes how to configure Ollama with cmd, and from there we can link to https://docs.ollama.com/quickstart

docs are in /docs

Contributor Author


docs added in 5fbe3ba
But I left this link here as is; it follows the same pattern as the external agents, which link to their external docs directly.

*/
private async fetchModelDetails(baseUrl: string, modelName: string): Promise<OllamaModelDetails | null> {
try {
const response = await fetch(`${baseUrl}/show`, {
Collaborator


nit: could we add a pointer to the API ref https://github.com/ollama/ollama/blob/main/docs/api.md#show-model-information in a comment?

Contributor Author


added in 6eab883
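To make the thread concrete, here is a hedged sketch of the request shape `POST /api/show` expects per the linked API reference. The `showRequest` helper, its return type, and the base URL are illustrative, not code from this PR.

```typescript
// Sketch of the POST /api/show request shape, per Ollama's API reference:
// https://github.com/ollama/ollama/blob/main/docs/api.md#show-model-information
// `showRequest` is a hypothetical helper, not code from this PR.
interface ShowRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function showRequest(baseUrl: string, modelName: string): ShowRequest {
  return {
    url: `${baseUrl}/show`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // /api/show expects a JSON body naming the model to inspect.
    body: JSON.stringify({ model: modelName }),
  };
}

const req = showRequest("http://localhost:11434/api", "qwen2.5-coder:7b");
console.log(req.url);  // → http://localhost:11434/api/show
console.log(req.body); // → {"model":"qwen2.5-coder:7b"}
```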

"general.parameter_count"?: number
"general.size_label"?: string
"general.license"?: string
[key: string]: string | number | null | undefined
Collaborator


nit:

Suggested change
[key: string]: string | number | null | undefined
// Ollama's API is not fully typed and some parameters get scoped keys such as `qwen3.context_length` / `llama.context_length`
// For this reason this property is a catch all from where the relevant values will be extracted.
[key: string]: string | number | null | undefined

Contributor Author


added in 6eab883


// Extract context length from model_info
function extractContextLength(modelInfo: Record<string, string | number | null | undefined>): number {
// Search for any key ending with ".context_length"
Collaborator


Suggested change
// Search for any key ending with ".context_length"
// Search for any key ending with ".context_length". This is because Ollama uses scoped keys (e.g. `qwen3.context_length` / `llama.context_length`)

Contributor Author


added in 6eab883
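A minimal sketch of the suffix-based lookup being discussed, with an illustrative `model_info` payload; the key names follow the scoped-key pattern from Ollama's `/api/show` responses, but the sample values are made up.

```typescript
// Ollama scopes model_info keys by model family (e.g. `qwen3.context_length`,
// `llama.context_length`), so we match on the suffix rather than a fixed key.
type ModelInfo = Record<string, string | number | null | undefined>;

function extractContextLength(modelInfo: ModelInfo): number | undefined {
  for (const [key, value] of Object.entries(modelInfo)) {
    if (key.endsWith(".context_length") && typeof value === "number") {
      return value;
    }
  }
  return undefined; // no scoped context_length key present
}

// Illustrative payload; the values are made up.
const info: ModelInfo = {
  "general.parameter_count": 30_500_000_000,
  "qwen3.context_length": 262_144,
};
console.log(extractContextLength(info)); // → 262144
```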

contributing.md Outdated
export GROQ_LOCAL_SERVER_PROXY="http://localhost:10004/openai/v1"
export GEMINI_LOCAL_SERVER_PROXY="http://localhost:10005/v1beta"
export GITHUB_COPILOT_PROXY="http://localhost:9090"
export OLLAMA_LOCAL_SERVER_PROXY="http://localhost:1006"
Collaborator


nit: 10006 to remain consistent.

Contributor Author


fixed in 6eab883

@knox
Contributor Author

knox commented Dec 24, 2025

This looks really good.

Thanks for the review. I will pick up on your feedback in the coming days.

@knox
Contributor Author

knox commented Dec 29, 2025

I think adding a page to cmd's doc portal on how to do the configuration

I agree that we should add docs on how to configure the new provider, and I will prepare something.

(i.e. you need to load the models in Ollama first + pointer to how to do this) would help.

This seems to be a bit of a misunderstanding: Ollama loads the respective model automatically when it receives a completion request.

The prerequisites to use this provider with cmd are quite simple: install Ollama, make sure it's running (the default install sets up autostart), and install one or more models. I will explain this in the docs.

Should probably suggest restarting cmd when new models are added to Ollama, or to add a sync button on the AI provider settings to reload the settings (the latter would be better obviously, but fine to leave out of scope as well)

In fact, I believe there already is an easy way to reload models, though it's not very obvious to users: by disabling and re-enabling the provider, model discovery is triggered again and a current list of models is retrieved.

In my humble opinion, the whole provider and model settings could benefit from a thorough revisit. I just did the minimum for the new provider to fit into the existing mechanics.

Collaborator

@gsabran gsabran left a comment


Thanks a lot, and sorry for the slow review

@knox
Contributor Author

knox commented Jan 10, 2026

Thanks for the review. 👍

How to fulfill the pending required "Mintlify Deployment" check?

@gsabran gsabran merged commit 1fbe779 into getcmd-dev:main Jan 10, 2026
13 checks passed
@gsabran
Collaborator

gsabran commented Jan 10, 2026

🤷‍♂️ not sure, merged!

@knox knox deleted the ollama-provider branch January 10, 2026 19:48
@gsabran
Collaborator

gsabran commented Jan 12, 2026

This should be included in the new release
