feat(ollama): migrate ollama to native api #1075
Conversation
📝 Walkthrough (summary by CodeRabbit)

Adds native Ollama provider integration: types, bidirectional converters, message/tool/image utilities, request/response pools, streaming (newline-delimited JSON), completeRequest error handling, embedding and chat endpoints (/api/chat, /api/embed, /api/tags), and tests for local and cloud scenarios.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant OllamaProvider
    participant OllamaAPI
    participant ResponsePool
    Client->>OllamaProvider: ChatCompletion(bifrostReq)
    OllamaProvider->>OllamaProvider: ToOllamaChatRequest (convert request)
    OllamaProvider->>ResponsePool: Acquire response object
    OllamaProvider->>OllamaAPI: POST /api/chat (JSON)
    OllamaAPI-->>OllamaProvider: OllamaChatResponse (or NDJSON stream)
    OllamaProvider->>OllamaProvider: ToBifrostChatResponse (convert & enrich)
    OllamaProvider->>ResponsePool: Release response object
    OllamaProvider-->>Client: BifrostChatResponse
```
```mermaid
sequenceDiagram
    participant Client
    participant OllamaProvider
    participant OllamaAPI
    participant StreamParser
    Client->>OllamaProvider: ChatCompletionStream(bifrostReq)
    OllamaProvider->>OllamaProvider: ToOllamaChatRequest (convert request)
    OllamaProvider->>OllamaAPI: POST /api/chat (streaming)
    OllamaAPI-->>StreamParser: Newline-delimited JSON stream
    loop per JSON line
        StreamParser->>OllamaProvider: OllamaStreamResponse
        OllamaProvider->>OllamaProvider: ToBifrostStreamResponse (convert chunk)
        OllamaProvider->>Client: BifrostStream (delta + metadata)
    end
```
```mermaid
sequenceDiagram
    participant Client
    participant OllamaProvider
    participant OllamaAPI
    Client->>OllamaProvider: Embedding(bifrostReq)
    OllamaProvider->>OllamaProvider: ToOllamaEmbeddingRequest (convert request)
    OllamaProvider->>OllamaAPI: POST /api/embed
    OllamaAPI-->>OllamaProvider: OllamaEmbeddingResponse
    OllamaProvider->>OllamaProvider: ToBifrostEmbeddingResponse (convert & enrich)
    OllamaProvider-->>Client: BifrostEmbeddingResponse
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45–75 minutes
Pre-merge checks and finishing touches: ❌ failed checks (1 warning), ✅ passed checks (2 passed).

📜 Recent review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used
📓 Path-based instructions (1)
⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z. Applied to files:
📚 Learning: 2025-12-19T09:26:54.961Z. Applied to files:
📚 Learning: 2025-12-15T10:16:21.909Z. Applied to files:
🧬 Code graph analysis (4)
core/providers/ollama/models.go (2)
core/providers/ollama/ollama_test.go (4)
core/providers/ollama/chat.go (3)
core/providers/ollama/types.go (2)
🔇 Additional comments (32)
2789db7 to 5608aa4 (compare)
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
ui/lib/schemas/providerForm.ts (1)
201-243: Schema incorrectly mandates Ollama API keys, contradicting configuration intent.

The config at `ui/lib/constants/config.ts` (line 64) explicitly sets `isKeyRequiredByProvider.ollama = false`, indicating keys should be optional for Ollama. However, the schema logic at line 222 makes `keysRequired = true` for Ollama (since it's not "custom" and not in the `["sgl"]` exempt list), forcing at least one non-empty key via lines 224–240. This prevents local Ollama usage and contradicts the stated design.

Additionally, `utils.ts` normalizes provider names with `.toLowerCase()`, but `providerForm.ts` does not, creating a case-sensitivity mismatch that could cause validation bugs.

Suggested fix:

```diff
- const keysRequired = data.selectedProvider === "custom" || !["sgl"].includes(data.selectedProvider);
+ const provider = data.selectedProvider.trim().toLowerCase();
+ const keysRequired = provider === "custom" || !["sgl", "ollama"].includes(provider);
```

This aligns schema validation with configuration and normalizes casing for consistency.
core/providers/ollama/ollama_test.go (2)
12-75: Defer `client.Shutdown()` to ensure cleanup on `t.Fatal`/`FailNow`.

```diff
 client, ctx, cancel, err := testutil.SetupTest()
 if err != nil {
 	t.Fatalf("Error initializing test setup: %v", err)
 }
 defer cancel()
+defer client.Shutdown()
@@
-	client.Shutdown()
 }
```
77-133: Same cleanup issue in `TestOllamaCloud`: make `client.Shutdown()` a defer.

```diff
 client, ctx, cancel, err := testutil.SetupTest()
 if err != nil {
 	t.Fatalf("Error initializing test setup: %v", err)
 }
 defer cancel()
+defer client.Shutdown()
@@
-	client.Shutdown()
 }
```
🧹 Nitpick comments (6)
core/providers/ollama/chat.go (2)
237-265: Base64 detection: consider supporting unpadded base64 and avoiding obvious false positives.

Using `StdEncoding.DecodeString` rejects unpadded base64 (common) and can accept some "random" strings that happen to decode.

```diff
 func isBase64(s string) bool {
 	if len(s) < 4 {
 		return false
 	}
-	_, err := base64.StdEncoding.DecodeString(s)
-	return err == nil
+	// Try padded and raw (unpadded) forms.
+	if _, err := base64.StdEncoding.DecodeString(s); err == nil {
+		return true
+	}
+	if _, err := base64.RawStdEncoding.DecodeString(s); err == nil {
+		return true
+	}
+	return false
 }
```
296-362: Reverse conversion isn't symmetric (drops `keep_alive`, `format`, and many options).

If this method is intended for true "passthrough/reverse conversion", it should round-trip the fields set in `ToOllamaChatRequest` (or the comment should narrow the promise).

core/providers/ollama/types.go (4)
166-182: Consider consolidating duplicate types.

`OllamaStreamResponse` is identical to `OllamaChatResponse`. Consider using a type alias to reduce duplication:

```go
// OllamaStreamResponse is the same structure as OllamaChatResponse used during streaming.
type OllamaStreamResponse = OllamaChatResponse
```

Alternatively, if distinct types are intentional for clarity, the current approach is acceptable.
214-225: Silently ignored error on JSON marshal.

The error from `json.Marshal` is discarded. While marshaling a `map[string]interface{}` rarely fails, if it does, the `Arguments` field will contain `"null"` instead of valid JSON, which could cause downstream parsing issues.

Consider logging or handling the error:

```diff
- args, _ := json.Marshal(tc.Function.Arguments)
+ args, err := json.Marshal(tc.Function.Arguments)
+ if err != nil {
+ 	args = []byte("{}")
+ }
```
320-331: Silently ignored error on JSON marshal (same issue as line 215). Same concern as in the non-streaming conversion; handle the error consistently.
357-370: Inconsistent finish reason mapping compared to `mapFinishReason`.

This inline switch doesn't handle the `"load"` and `"unload"` cases like `mapFinishReason` does (lines 271-272). Consider extracting a shared helper to ensure consistent behavior:

```diff
- if r.Done {
- 	if r.DoneReason != nil {
- 		switch *r.DoneReason {
- 		case "stop":
- 			choice.FinishReason = schemas.Ptr("stop")
- 		case "length":
- 			choice.FinishReason = schemas.Ptr("length")
- 		default:
- 			choice.FinishReason = schemas.Ptr("stop")
- 		}
- 	} else {
- 		choice.FinishReason = schemas.Ptr("stop")
- 	}
- }
+ if r.Done {
+ 	choice.FinishReason = mapDoneReasonToFinishReason(r.DoneReason, r.Done)
+ }
```

Then create a shared helper function that both methods can use.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
core/providers/ollama/chat.go (1 hunks)
core/providers/ollama/embedding.go (1 hunks)
core/providers/ollama/models.go (1 hunks)
core/providers/ollama/ollama.go (5 hunks)
core/providers/ollama/ollama_test.go (3 hunks)
core/providers/ollama/types.go (1 hunks)
ui/app/workspace/providers/views/utils.ts (1 hunks)
ui/lib/schemas/providerForm.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/ollama/chat.go
core/providers/ollama/embedding.go
ui/app/workspace/providers/views/utils.ts
core/providers/ollama/ollama_test.go
core/providers/ollama/types.go
core/providers/ollama/models.go
ui/lib/schemas/providerForm.ts
core/providers/ollama/ollama.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/ollama/chat.go
core/providers/ollama/embedding.go
core/providers/ollama/ollama_test.go
core/providers/ollama/types.go
core/providers/ollama/models.go
core/providers/ollama/ollama.go
🧬 Code graph analysis (3)
core/providers/ollama/types.go (4)
core/schemas/models.go (1)
Model (109-129)
core/schemas/chatcompletions.go (6)
BifrostChatResponse (26-41), ChatAssistantMessageToolCall (564-570), ChatAssistantMessageToolCallFunction (573-576), ChatAssistantMessage (541-545), BifrostResponseChoice (582-590), BifrostLLMUsage (640-647)
core/schemas/bifrost.go (1)
BifrostResponseExtraFields (285-294)
core/schemas/embedding.go (2)
BifrostEmbeddingResponse (22-28), EmbeddingData (118-122)
core/providers/ollama/models.go (2)
core/schemas/models.go (1)
Model (109-129)
core/providers/ollama/types.go (1)
OllamaModel(140-147)
ui/lib/schemas/providerForm.ts (1)
ui/app/workspace/providers/views/utils.ts (1)
keysRequired (1-1)
🔇 Additional comments (16)
ui/app/workspace/providers/views/utils.ts (1)
1-1: Backend and frontend schemas confirm only `sgl` should be keyless; the `ollama` key requirement is correct.

The code aligns with the backend validation schema in `ui/lib/schemas/providerForm.ts`, which consistently exempts only `sgl` from the keys requirement. Both the backend and frontend use identical logic: `keysRequired = custom || !["sgl"].includes(provider)`. The commit message ("Refactor: update UI schema to enforce sgLang key requirements") indicates this was intentional for sgl, with no evidence that `ollama` was ever intended to be keyless.

Minor suggestion (non-blocking): normalize provider casing once to avoid double `toLowerCase()`.
toLowerCase().core/providers/ollama/embedding.go (1)
9-55: Forward embedding conversion looks solid (incl. nil-safety and conditional Options).

core/providers/ollama/models.go (3)
9-21: `ToOllamaModel` is fine (and the "future use" note is helpful).

39-67: `GetModelInfo` formatting is clean and nicely conditional.
23-37: Keep `m.Name` for `schemas.Model.ID`: it is the canonical field in Ollama's official /api/tags API.

The official Ollama documentation specifies `models[].name` as the model identifier field in /api/tags responses. The `model` field is a non-standard duplicate that appears in some third-party docs and builds, not in the canonical specification. Using `m.Name` ensures stability across Ollama versions and aligns with the official API contract. The current implementation is correct.

Likely an incorrect or invalid review comment.
core/providers/ollama/chat.go (1)
14-154: Request mapping is comprehensive; response-format + tool mapping are good.core/providers/ollama/ollama.go (2)
42-83: Pooling helpers are straightforward and safe (reset + nil-guard).
98-120: FYI: per guidelines, if PR #1075 is part of a PR stack, double-check ordering/dependencies.

[coding_guidelines] says to consider the whole stack "if there is one"; I can't see that context here.

core/providers/ollama/types.go (8)
1-10: LGTM! Package declaration and imports are appropriate for the type definitions and conversion utilities.

12-92: LGTM! Request types are well-structured with comprehensive documentation and appropriate JSON tags. The `OllamaOptions` struct covers a thorough set of Ollama-specific parameters.

94-109: LGTM! Response type accurately represents Ollama's chat completion response structure with appropriate optional fields.

111-130: LGTM! Embedding types correctly represent Ollama's embedding API, with `Input` as `interface{}` to accommodate both single strings and string arrays.

132-157: LGTM! List models types accurately represent Ollama's `/api/tags` endpoint response structure.

257-276: LGTM! The finish reason mapping handles Ollama-specific values appropriately, with sensible defaults for edge cases.

278-291: LGTM! Usage conversion is straightforward. Note that this always returns a non-nil `BifrostLLMUsage` even when token counts are unavailable (defaulting to zeros), which is a reasonable design choice.

391-432: LGTM! Embedding response conversion correctly handles the float64 to float32 conversion and appropriately sets usage for embedding requests (prompt tokens only).
5608aa4 to 9055732 (compare)
Actionable comments posted: 0
🧹 Nitpick comments (2)
core/providers/ollama/utils.go (1)
215-224: Hardcoded JPEG MIME type may be incorrect for PNG images.

When converting images from Ollama back to Bifrost, the code always uses the `data:image/jpeg;base64,` prefix regardless of the actual image format. If the original image was PNG, this produces an incorrect data URL.

🔎 Suggested improvement

Consider detecting the image type from the base64 magic bytes or preserving the original format. For now, this is a minor issue since the base64 data itself is correct:

```diff
 // Add images
 for _, img := range msg.Images {
-	dataURL := "data:image/jpeg;base64," + img
+	// Detect image type from base64 magic bytes
+	mimeType := "image/jpeg" // Default fallback
+	if len(img) >= 4 {
+		// PNG starts with iVBORw (decoded: 0x89 0x50 0x4E 0x47)
+		if strings.HasPrefix(img, "iVBORw") {
+			mimeType = "image/png"
+		}
+	}
+	dataURL := "data:" + mimeType + ";base64," + img
```

core/providers/ollama/ollama.go (1)
514-521: Scanner error handling could include more context.

When `scanner.Err()` returns an error, it's logged and processed, but it might be helpful to know if it was due to buffer overflow (which could happen with very large responses despite the 10MB limit).

🔎 Suggested improvement

```diff
 if err := scanner.Err(); err != nil {
-	provider.logger.Warn(fmt.Sprintf("Error reading Ollama stream: %v", err))
+	provider.logger.Warn(fmt.Sprintf("Error reading Ollama stream (buffer limit: 10MB): %v", err))
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
core/providers/ollama/chat.go (1 hunks)
core/providers/ollama/embedding.go (1 hunks)
core/providers/ollama/ollama.go (5 hunks)
core/providers/ollama/types.go (1 hunks)
core/providers/ollama/utils.go (1 hunks)
core/providers/ollama/utils_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- core/providers/ollama/chat.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/ollama/embedding.go
core/providers/ollama/utils_test.go
core/providers/ollama/ollama.go
core/providers/ollama/utils.go
core/providers/ollama/types.go
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/ollama/embedding.go
core/providers/ollama/utils_test.go
core/providers/ollama/ollama.go
core/providers/ollama/utils.go
core/providers/ollama/types.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/ollama/embedding.go
core/providers/ollama/utils_test.go
core/providers/ollama/ollama.go
core/providers/ollama/utils.go
core/providers/ollama/types.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/ollama/utils_test.go
🧬 Code graph analysis (1)
core/providers/ollama/utils_test.go (2)
core/schemas/chatcompletions.go (6)
ChatMessageRoleAssistant (419-419), ChatAssistantMessage (541-545), ChatAssistantMessageToolCall (564-570), ChatAssistantMessageToolCallFunction (573-576), ChatMessageRoleTool (422-422), ChatMessageRoleUser (420-420)
core/providers/ollama/types.go (3)
OllamaMessage (28-35), OllamaToolCall (38-40), OllamaToolCallFunction (43-46)
🔇 Additional comments (23)
core/providers/ollama/embedding.go (3)
9-54: LGTM! Clean Bifrost to Ollama embedding request conversion.

The function correctly handles nil checks, maps the model, and extracts text/texts from the Bifrost `EmbeddingInput`. Extra parameters (`truncate`, `keep_alive`, `num_ctx`) are properly transferred to Ollama options.
81-94: The `[]interface{}` case is now properly handled.

The implementation correctly iterates through the slice, validates each element is a string, and breaks early if any element fails type assertion. This addresses the previous review concern about JSON-roundtripped shapes.

101-115: Extra params mapping back to Bifrost looks correct.

The condition on line 102 properly checks all three fields before creating the params struct, avoiding unnecessary allocations.
core/providers/ollama/utils_test.go (4)
9-80: Comprehensive test coverage for `extractBase64Image`.

Good coverage of data URLs (JPEG, PNG), raw base64, HTTP/HTTPS URLs, empty strings, and malformed data URLs. The test cases align well with the expected behavior documented in the implementation.
155-269: Excellent test coverage for tool call conversion semantics.

The tests correctly verify critical Ollama-specific behavior:

- Tool calls only appear on assistant messages
- Tool responses use `tool_name` for correlation (not `tool_call_id`)
- The mapping from `ToolCallID` to function name is validated

This ensures the conversion correctly handles Ollama's native tool call semantics.
271-300: Good edge case coverage for tool response without prior assistant message.

This test validates the fallback behavior where the `Name` field is used when no prior tool call exists to map from. This is important for robustness.
400-481: Round-trip conversion tests look correct.

The `convertMessagesFromOllama` tests properly verify:

- Assistant messages with tool calls are reconstructed correctly
- Tool response messages map `tool_name` to both `Name` and `ToolCallID` fields

core/providers/ollama/utils.go (4)
25-106: Well-documented tool call conversion with Ollama-specific semantics.

The extensive comments explaining Ollama's tool call correlation by function name (not IDs) and the thinking placeholder pattern are helpful. The logic correctly:

- Filters out thinking placeholders before conversion
- Extracts thinking content from `ExtraContent`
- Skips invalid tool messages without `ToolName`
171-189: Thinking content preservation via placeholder tool call is a reasonable workaround.

Given that `ChatAssistantMessage` doesn't have an `ExtraContent` field, using a dummy `_thinking_placeholder` tool call to preserve thinking content for passthrough scenarios is a pragmatic solution. The pattern is consistently applied in both directions.
358-375: Base64 validation with padding fallback is robust.

Trying both `StdEncoding` and `RawStdEncoding` handles cases where padding may be missing. The minimum length check of 4 is appropriate.
389-420: Tool call argument parsing with fallback is well-handled.

When JSON unmarshaling fails, the raw arguments are preserved under the `_raw_arguments` key, which allows debugging while preventing data loss.

core/providers/ollama/ollama.go (7)
41-81: Response pooling implementation is correct.

The pools are properly initialized with `sync.Pool`, and the acquire/release functions correctly reset structs before returning them to prevent data leakage between requests.
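The acquire/release pattern described here can be sketched as follows; the `ollamaChatResponse` type and helper names are stand-ins for the PR's actual pooled types:

```go
package main

import (
	"fmt"
	"sync"
)

// ollamaChatResponse stands in for the pooled response type; fields are illustrative.
type ollamaChatResponse struct {
	Model string
	Done  bool
}

var responsePool = sync.Pool{
	New: func() interface{} { return &ollamaChatResponse{} },
}

// acquireChatResponse returns a response object from the pool.
func acquireChatResponse() *ollamaChatResponse {
	return responsePool.Get().(*ollamaChatResponse)
}

// releaseChatResponse resets the object before returning it to the pool,
// preventing data from leaking between requests.
func releaseChatResponse(r *ollamaChatResponse) {
	if r == nil {
		return // nil-guard, as the reviewed helpers do
	}
	*r = ollamaChatResponse{} // zero all fields
	responsePool.Put(r)
}

func main() {
	r := acquireChatResponse()
	r.Model = "llama3.2"
	releaseChatResponse(r)
	r2 := acquireChatResponse()
	fmt.Printf("%q %v\n", r2.Model, r2.Done) // pooled object comes back zeroed
}
```

Resetting on release (rather than on acquire) keeps stale request data out of the pool entirely, which is the safer ordering for anything that might hold user content.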
97-101: Pool pre-warming during initialization is a good optimization. Pre-allocating pool entries based on concurrency helps reduce allocation pressure during initial requests.
126-170: Centralized request handling is clean and well-structured.

The `completeRequest` helper properly:

- Sets extra headers from config
- Handles Bearer token authentication
- Uses `MakeRequestWithContext` for timeout support
- Decodes and copies the response body before releasing
370-378: Proper documentation of fasthttp streaming context limitations.

The comment at lines 371-373 correctly documents that fasthttp doesn't natively support context cancellation for streaming. The implementation works around this with the `select` check before `scanner.Scan()`.
414-432: Context cancellation check before `scanner.Scan()` is correct.

This addresses the previous review concern about fasthttp's lack of native context cancellation for streaming. The `select` with a `default` case allows the loop to proceed if the context is still active.
460-468: Latency bookkeeping is now correct.

The chunk latency is computed (`time.Since(lastChunkTime)`) before `lastChunkTime` is updated. This ensures each chunk's latency reflects the time since the previous chunk was received, addressing the previous review concern.
490-498: Verify latency assignment for Responses fallback path.

The `chunkLatencyMs` computed at line 460 is reused for Responses chunks at line 497. However, when `isDone && Type == ResponsesStreamResponseTypeCompleted` (line 490), the latency is correctly set to total time since start. This looks correct.

core/providers/ollama/types.go (5)
14-95: Comprehensive Ollama API type definitions.

The request types accurately model Ollama's native API with proper JSON tags and optional field handling. The `OllamaOptions` struct covers both common and advanced model parameters.
189-293: `ToBifrostChatResponse` conversion is thorough.

The conversion correctly handles:

- Timestamp parsing with RFC3339Nano format
- Tool call mapping with function name as ID (Ollama limitation documented)
- Thinking content preservation via placeholder pattern
- Finish reason mapping
331-437: `ToBifrostStreamResponse` handles streaming chunks correctly.

The streaming conversion properly:

- Sets the `chat.completion.chunk` object type
- Builds delta content only when there's actual data (`hasDelta` check)
- Maps finish reason on the final chunk
- Adds usage statistics only on the final chunk (`r.Done`)
454-469: Embedding float64 to float32 conversion is acceptable. While there is some precision loss, float32 is standard for embeddings in most systems and the trade-off for memory efficiency is reasonable.
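The conversion in question amounts to a per-element narrowing cast, sketched here (the function name is illustrative, not the PR's):

```go
package main

import "fmt"

// toFloat32 converts an Ollama embedding vector (decoded from JSON as float64)
// to the float32 representation the review describes Bifrost storing. Some
// precision is lost, which is the trade-off called acceptable above.
func toFloat32(v []float64) []float32 {
	out := make([]float32, len(v))
	for i, f := range v {
		out[i] = float32(f)
	}
	return out
}

func main() {
	emb := toFloat32([]float64{0.125, -0.5, 1.0})
	fmt.Println(emb) // prints [0.125 -0.5 1]
}
```

Values that are exact in float32 (like the ones above) survive unchanged; arbitrary embedding components lose roughly half their significant digits, which halves memory per vector.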
495-505: The `configuredSet` is now properly used for model filtering.

The previous review concern about the unused `configuredSet` is addressed. The code now correctly filters models when `configuredModels` is non-empty (line 503).
7d62378 to 4537ab1 (compare)
@Pratham-Mishra04 the Ollama cloud provider requires an API key while local Ollama doesn't. The current provider implements both, but in the UI we set either a base URL or an API key. Should I proceed to make a separate UI for this, or a separate provider implementation for Ollama cloud?
Summary
Switches the Ollama provider from OpenAI-compatible APIs to Ollama's native APIs, enabling API key support for Ollama Cloud deployments. This allows users to configure API keys for authenticated Ollama instances while maintaining backward compatibility with local Ollama setups.
Changes
New Files:
Modified Files:
Key Design Decisions:
Type of change
Affected areas
How to test

```shell
# Start a local Ollama server
ollama serve

# Pull test models
ollama pull llama3.2:latest
ollama pull nomic-embed-text:latest

# Run tests (no environment variables needed!)
cd core
go test -v ./providers/ollama/... -timeout 5m

# Optional: override the default models
export OLLAMA_MODEL="llama3.2:latest"
export OLLAMA_EMBEDDING_MODEL="nomic-embed-text:latest"
go test -v ./providers/ollama/... -timeout 5m
```
If adding new configs or environment variables, document them here.
Screenshots/Recordings
If UI changes, add before/after screenshots or short clips.
Breaking changes
If yes, describe impact and migration instructions.
Related issues
Closes Issue #1011
Security considerations
Note any security implications (auth, secrets, PII, sandboxing, etc.).
Checklist
Read `docs/contributing/README.md` and followed the guidelines