feat: added support for gemini native converters #1018
Conversation
📝 Walkthrough

Replaces non-streaming Gemini conversions with streaming-first implementations: adds SSE streaming handlers and per-chunk delta assembly for ChatCompletion and Responses, expands Gemini↔Bifrost conversion surfaces, integrates Gemini into Vertex/transport layers, introduces ThinkingLevel, and updates integration tests for reasoning and streaming.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Transport as Bifrost-HTTP
    participant Provider as GeminiProvider
    participant Gemini as Gemini API
    Client->>Transport: Start stream request (ChatCompletionStream / ResponsesStream)
    Transport->>Provider: Forward request + postHookRunner
    Provider->>Provider: Build GeminiGenerationRequest (ToGemini*Request)
    Provider->>Gemini: POST /generateContent_stream (SSE)
    Gemini-->>Provider: SSE chunks (parts)
    loop per SSE chunk
        Provider->>Provider: processGeminiStreamChunk -> GenerateContentResponse
        Provider->>Provider: assemble delta (text/thought/tool/file)
        Provider-->>Transport: emit Bifrost stream chunk (delta + metadata)
        Transport-->>Client: deliver chunk
    end
    Provider->>Provider: on final chunk attach usage/final outputs
    Provider-->>Transport: final event / done
    Transport-->>Client: stream end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60–90 minutes
Force-pushed 4374efc to 2cc10d8
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/gemini/chat.go (1)
64-156: Preserve Gemini function-call IDs in non-streaming chat responses.

In `ToBifrostChatResponse`, tool calls currently set:

- `Type` using `ChatToolChoiceTypeFunction` (a choice type), and
- `ID` to `function.Name` (the function name), ignoring `part.FunctionCall.ID`.

This differs from the streaming path, which:

- Uses `ChatToolTypeFunction`, and
- Preferentially uses `FunctionCall.ID`, falling back to `Name`.

Losing the provider-supplied ID breaks correlation when Gemini emits multiple calls with the same name.

Consider aligning non-streaming logic with streaming, e.g.:

```diff
-if part.FunctionCall != nil {
-	function := schemas.ChatAssistantMessageToolCallFunction{
-		Name: &part.FunctionCall.Name,
-	}
-	if part.FunctionCall.Args != nil {
-		jsonArgs, err := json.Marshal(part.FunctionCall.Args)
-		if err != nil {
-			jsonArgs = []byte(fmt.Sprintf("%v", part.FunctionCall.Args))
-		}
-		function.Arguments = string(jsonArgs)
-	}
-
-	toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
-		Index:    uint16(len(toolCalls)),
-		Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
-		ID:       function.Name,
-		Function: function,
-	})
-}
+if part.FunctionCall != nil {
+	function := schemas.ChatAssistantMessageToolCallFunction{
+		Name: &part.FunctionCall.Name,
+	}
+	if part.FunctionCall.Args != nil {
+		jsonArgs, err := json.Marshal(part.FunctionCall.Args)
+		if err != nil {
+			jsonArgs = []byte(fmt.Sprintf("%v", part.FunctionCall.Args))
+		}
+		function.Arguments = string(jsonArgs)
+	}
+
+	callID := part.FunctionCall.Name
+	if part.FunctionCall.ID != "" {
+		callID = part.FunctionCall.ID
+	}
+
+	toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
+		Index:    uint16(len(toolCalls)),
+		Type:     schemas.Ptr(string(schemas.ChatToolTypeFunction)),
+		ID:       &callID,
+		Function: function,
+	})
+}
```

This keeps IDs consistent across streaming and non-streaming chat.
🧹 Nitpick comments (2)
core/providers/gemini/utils.go (1)
262-293: Consider lazy initialization of token details structs.

`OutputTokensDetails` and `InputTokensDetails` are initialized unconditionally on lines 272-273, but these allocations are only meaningful when there's data to populate. Consider initializing them only when needed to avoid unnecessary allocations.

```diff
 usage := &schemas.ResponsesResponseUsage{
 	TotalTokens:  int(metadata.TotalTokenCount),
 	InputTokens:  int(metadata.PromptTokenCount),
 	OutputTokens: int(metadata.CandidatesTokenCount),
-	OutputTokensDetails: &schemas.ResponsesResponseOutputTokens{},
-	InputTokensDetails:  &schemas.ResponsesResponseInputTokens{},
 }

 // Add cached tokens if present
 if metadata.CachedContentTokenCount > 0 {
 	usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
 		CachedTokens: int(metadata.CachedContentTokenCount),
 	}
 }

 if metadata.CandidatesTokensDetails != nil {
+	usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
 	for _, detail := range metadata.CandidatesTokensDetails {
 		switch detail.Modality {
 		case "AUDIO":
 			usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
 		}
 	}
 }
```

core/schemas/responses.go (1)
130-140: JSON schema struct extensions look correct; consider tightening `Schema` typing.

The added `Name`, `Description`, and `Strict` fields align with richer JSON schema metadata and are safe. The `Schema *any` field is flexible but very loose; if callers always treat this as raw JSON, consider `json.RawMessage` or a documented alias type instead to make usage clearer and avoid accidental double-marshalling.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (2 hunks)
- core/providers/gemini/responses.go (9 hunks)
- core/providers/gemini/speech.go (1 hunks)
- core/providers/gemini/utils.go (7 hunks)
- core/providers/vertex/vertex.go (8 hunks)
- core/providers/vertex/vertex_test.go (1 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/tests/test_google.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/gemini_test.go
- core/providers/vertex/vertex.go
- core/providers/vertex/vertex_test.go
- core/schemas/utils.go
- core/schemas/responses.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/speech.go
- tests/integrations/tests/test_google.py
- core/providers/gemini/gemini.go
- core/providers/gemini/utils.go
- core/providers/gemini/chat.go
- core/providers/gemini/responses.go
🧬 Code graph analysis (7)
core/providers/vertex/vertex.go (4)
- core/schemas/utils.go (1): IsGeminiModel (1052-1054)
- core/providers/gemini/chat.go (1): ToGeminiChatCompletionRequest (11-62)
- core/providers/gemini/types.go (3): GeminiGenerationRequest (55-73), FunctionCall (1091-1101), FunctionResponse (1104-1126)
- core/providers/gemini/gemini.go (1): HandleGeminiChatCompletionStream (294-485)
core/schemas/responses.go (1)
- core/providers/gemini/types.go (1): Schema (724-775)
transports/bifrost-http/integrations/genai.go (3)
- transports/bifrost-http/handlers/inference.go (1): ResponsesRequest (193-197)
- core/schemas/bifrost.go (1): ResponsesRequest (91-91)
- transports/bifrost-http/integrations/router.go (3): EmbeddingResponseConverter (102-102), ResponsesResponseConverter (98-98), ResponsesStreamResponseConverter (123-123)
tests/integrations/tests/test_google.py (1)
- tests/integrations/tests/utils/config_loader.py (2): get_model (140-161), get_model (443-445)
core/providers/gemini/gemini.go (6)
- core/providers/gemini/chat.go (1): ToGeminiChatCompletionRequest (11-62)
- core/schemas/bifrost.go (8): RequestType (83-83), ChatCompletionRequest (89-89), BifrostStream (318-325), BifrostError (353-362), Gemini (48-48), ModelProvider (32-32), BifrostResponseExtraFields (284-293), ResponsesRequest (91-91)
- core/providers/utils/utils.go (7): ShouldSendBackRawResponse (482-487), NewBifrostOperationError (449-460), NewProviderAPIError (464-479), ProcessAndSendBifrostError (571-601), ProcessAndSendResponse (535-565), GetBifrostResponseForStreamResponse (777-805), ProcessAndSendError (607-653)
- core/schemas/utils.go (1): Ptr (16-18)
- core/utils.go (1): Ptr (51-53)
- core/providers/gemini/responses.go (1): ToGeminiResponsesRequest (63-102)
core/providers/gemini/utils.go (4)
- core/providers/gemini/types.go (16): Type (778-778), Schema (724-775), FinishReason (22-22), FinishReasonStop (28-28), FinishReasonMaxTokens (30-30), FinishReasonSafety (33-33), FinishReasonRecitation (35-35), FinishReasonLanguage (37-37), FinishReasonOther (39-39), FinishReasonBlocklist (41-41), FinishReasonProhibitedContent (43-43), FinishReasonSPII (46-46), FinishReasonMalformedFunctionCall (48-48), FinishReasonImageSafety (50-50), FinishReasonUnexpectedToolCall (52-52), FunctionResponse (1104-1126)
- core/schemas/responses.go (4): ResponsesParameters (85-112), ResponsesTextConfig (118-121), ResponsesTextConfigFormat (123-128), ResponsesTextConfigFormatJSONSchema (131-140)
- core/schemas/utils.go (1): Ptr (16-18)
- core/schemas/chatcompletions.go (3): BifrostLLMUsage (640-647), ChatPromptTokensDetails (649-655), ChatCompletionTokensDetails (657-667)
core/providers/gemini/chat.go (5)
- core/schemas/chatcompletions.go (10): ChatAssistantMessageToolCall (564-570), ChatContentBlock (505-512), ChatContentBlockTypeText (497-497), ChatAssistantMessageToolCallFunction (573-576), ChatAssistantMessage (541-545), BifrostResponseChoice (582-590), ChatNonStreamResponseChoice (605-608), BifrostChatResponse (26-41), ChatStreamResponseChoiceDelta (616-622), ChatStreamResponseChoice (611-613)
- core/providers/gemini/types.go (8): GeminiGenerationRequest (55-73), GenerationConfig (631-697), Content (922-930), Type (778-778), Role (13-13), GenerateContentResponse (1287-1303), FinishReason (22-22), RoleModel (17-17)
- core/schemas/utils.go (1): Ptr (16-18)
- core/schemas/bifrost.go (1): BifrostError (353-362)
- core/providers/gemini/utils.go (1): ConvertGeminiFinishReasonToBifrost (213-218)
🔇 Additional comments (14)
core/providers/vertex/vertex_test.go (1)
26-27: LGTM! The model name changes align with the PR's native Gemini converter integration. Removing the `google/` prefix is consistent with how `IsGeminiModel()` detects Gemini models via substring matching on "gemini".

core/providers/gemini/utils.go (3)
194-218: Finish reason mapping looks appropriate. The mapping of Gemini finish reasons to Bifrost format is well-structured. Mapping `FinishReasonMalformedFunctionCall` and `FinishReasonUnexpectedToolCall` to "tool_calls" makes sense as they indicate tool-related termination reasons.
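The shape of this mapping can be pictured with a small standalone sketch. The string constants and the exact set of Bifrost finish reasons below are assumptions for illustration only; the real definitions live in core/providers/gemini/types.go and the real converter is `ConvertGeminiFinishReasonToBifrost`.

```go
package main

import "fmt"

// convertFinishReason is an illustrative stand-in for the converter
// described above; the reason strings are assumed, not the real constants.
func convertFinishReason(reason string) string {
	switch reason {
	case "STOP":
		return "stop"
	case "MAX_TOKENS":
		return "length"
	case "MALFORMED_FUNCTION_CALL", "UNEXPECTED_TOOL_CALL":
		// Tool-related termination reasons collapse to "tool_calls".
		return "tool_calls"
	case "SAFETY", "RECITATION", "BLOCKLIST", "PROHIBITED_CONTENT", "SPII", "IMAGE_SAFETY":
		return "content_filter"
	default:
		return "stop"
	}
}

func main() {
	fmt.Println(convertFinishReason("UNEXPECTED_TOOL_CALL"))
}
```

The key design point is that several distinct provider reasons fan in to one normalized value, which trades fidelity for a stable cross-provider API.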
883-886: Hardcoded `Strict: false` may not always be appropriate. The `Strict` field is always set to `false`. If the original Gemini schema had strictness requirements, this information is lost. Consider whether this should be configurable or derived from the schema.
932-954: LGTM! The `extractFunctionResponseOutput` helper has good defensive nil checks and follows a sensible extraction strategy: prefer the "output" key when present, otherwise marshal the entire response.

core/providers/gemini/gemini_test.go (1)
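The extraction strategy described above can be sketched in a few lines. This is a hedged illustration, not the actual helper: the real function operates on the Gemini `FunctionResponse` type, whereas the sketch below uses a plain map for self-containment.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractFunctionResponseOutput mirrors the described strategy:
// prefer a string under the "output" key, otherwise fall back to
// marshalling the whole response map.
func extractFunctionResponseOutput(resp map[string]any) string {
	if resp == nil {
		return "" // defensive nil check
	}
	if out, ok := resp["output"]; ok {
		if s, ok := out.(string); ok {
			return s
		}
	}
	b, err := json.Marshal(resp)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(extractFunctionResponseOutput(map[string]any{"output": "42"}))
}
```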
50-54: Verify the test scenario flag changes are intentional. The test flags have been flipped: `Transcription` and `TranscriptionStream` are now enabled, while `SpeechSynthesis` and `SpeechSynthesisStream` are now disabled. Please confirm this reflects the actual capability changes in this PR.

core/providers/gemini/speech.go (1)
110-127: LGTM! Good defensive nil-check added to prevent a potential nil pointer dereference when `bifrostReq.Input` is nil.

core/schemas/utils.go (1)
1052-1054: LGTM! The `IsGeminiModel` function follows the established pattern used by `IsAnthropicModel` and `IsMistralModel`, providing consistent model type detection across providers.

tests/integrations/tests/test_google.py (1)
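Per the earlier comment, detection works via substring matching on "gemini". A minimal sketch of such a helper, assuming case-insensitive matching (the real implementation in core/schemas/utils.go may differ in detail):

```go
package main

import (
	"fmt"
	"strings"
)

// IsGeminiModel sketches substring-based model detection: it matches
// both prefixed ("google/gemini-...") and unprefixed model names.
func IsGeminiModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "gemini")
}

func main() {
	fmt.Println(IsGeminiModel("google/gemini-2.0-flash-001"))
}
```

Substring matching is what makes the `google/` prefix removal in the tests a no-op for provider routing.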
507-516: Verify the change in error handling behavior is intentional. This test previously expected an exception for invalid roles but now expects a successful response. This represents a significant change in error handling semantics. Please confirm this aligns with the expected behavior of the new native Gemini converters.
core/providers/vertex/vertex.go (3)
297-312: LGTM! The Gemini branch follows the established pattern from the Anthropic branch: convert to native format, strip unsupported fields, and marshal/unmarshal for the Vertex API map format.
630-703: LGTM! The Gemini streaming implementation correctly:
- Uses the v1beta1 API version for streaming endpoints
- Adds the `alt=sse` parameter for SSE streaming
- Handles both API key (query param) and OAuth2 authentication paths
- Delegates to the shared `HandleGeminiChatCompletionStream` handler
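The endpoint assembly described here can be sketched as follows. The base URL, path layout, and helper name are illustrative assumptions; only the `alt=sse` query parameter and the v1beta1 version come from the review itself.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildStreamURL sketches assembling an SSE streaming endpoint:
// v1beta1 path, alt=sse for server-sent events, and an optional
// API key as a query parameter (the OAuth2 path would instead set
// an Authorization header on the request).
func buildStreamURL(base, model, apiKey string) string {
	u, err := url.Parse(base + "/v1beta1/models/" + model + ":streamGenerateContent")
	if err != nil {
		return ""
	}
	q := u.Query()
	q.Set("alt", "sse")
	if apiKey != "" {
		q.Set("key", apiKey)
	}
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	fmt.Println(buildStreamURL("https://example.invalid", "gemini-2.0-flash", ""))
}
```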
1384-1400: LGTM! The `stripVertexGeminiUnsupportedFields` helper cleanly removes the `id` fields from function_call and function_response objects that aren't supported by Vertex AI's Gemini API.

core/providers/gemini/chat.go (1)
10-62: Verify `convertParamsToGenerationConfig` defaults when given an empty modalities slice. The call to `convertParamsToGenerationConfig(bifrostReq.Params, []string{})` with an empty modalities slice should be reviewed to ensure it correctly defaults `ResponseModalities` for chat operations (e.g., to text). Confirm that this change does not regress any existing behavior that relied on non-empty default modalities.

core/providers/gemini/responses.go (1)
63-101: Error handling verification incomplete; unable to assess full context. The review comment cannot be fully verified without access to:
- The actual repository and all call sites of `ToGeminiResponsesRequest`
- The PR stack context (as required by the coding guidelines)
- Verification that the suggested signature change aligns with all callers across the codebase

To properly rewrite this review, please provide:
- Access to the repository, or
- The list of all call sites of `ToGeminiResponsesRequest`, or
- The git diff/PR context showing all related changes in the stack
transports/bifrost-http/integrations/genai.go (1)
35-81: GenAI routes correctly switched to Responses API and native Gemini converters. The RequestConverter and the new `ResponsesResponseConverter`/`ResponsesStreamResponseConverter` are wired consistently with the router types and Gemini helpers, and the embedding/speech/transcription branches remain intact. Verify that no remaining `ChatResponseConverter`/`ChatStreamResponseConverter` usages exist for GenAI Gemini routes in the integration layer.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/gemini/responses.go (1)
2165-2172: FileData must be base64-decoded before casting to bytes. At line 2169, `block.ResponsesInputMessageContentBlockFile.FileData` is cast directly to `[]byte`, but the type definition explicitly documents this field as "Base64 encoded file data". The code must decode it first. This mirrors the correct pattern already used in the same file for audio data (lines 2134-2136) and image data (lines 2110-2112).

```diff
 } else if block.ResponsesInputMessageContentBlockFile.FileData != nil {
+	decodedData, err := base64.StdEncoding.DecodeString(*block.ResponsesInputMessageContentBlockFile.FileData)
+	if err != nil {
+		return nil, fmt.Errorf("failed to decode base64 file data: %w", err)
+	}
 	return &Part{
 		InlineData: &Blob{
 			MIMEType: "application/octet-stream", // default
-			Data:     []byte(*block.ResponsesInputMessageContentBlockFile.FileData),
+			Data:     decodedData,
 		},
 	}, nil
 }
```
🧹 Nitpick comments (6)
core/providers/gemini/speech.go (1)
110-127: Consider returning an error when input is empty. The nil check prevents a potential nil pointer dereference, which is good. However, when `bifrostReq.Input` is nil or empty, the function returns a `GeminiGenerationRequest` without any `Contents`, which may cause unexpected behavior when making API calls. Consider whether an error should be returned for missing input:

```diff
+	if bifrostReq.Input == nil || bifrostReq.Input.Input == "" {
+		return nil, fmt.Errorf("speech input is required")
+	}
-	if bifrostReq.Input != nil && bifrostReq.Input.Input != "" {
-		geminiReq.Contents = []Content{
+	geminiReq.Contents = []Content{
 		{
 			Parts: []*Part{
 				{
 					Text: bifrostReq.Input.Input,
 				},
 			},
 		},
 	}
-	// Add speech config to generation config if voice config is provided
-	if bifrostReq.Params != nil && bifrostReq.Params.VoiceConfig != nil {
-		// Handle both single voice and multi-voice configurations
-		if bifrostReq.Params.VoiceConfig.Voice != nil || len(bifrostReq.Params.VoiceConfig.MultiVoiceConfig) > 0 {
-			addSpeechConfigToGenerationConfig(&geminiReq.GenerationConfig, bifrostReq.Params.VoiceConfig)
-		}
+	// Add speech config to generation config if voice config is provided
+	if bifrostReq.Params != nil && bifrostReq.Params.VoiceConfig != nil {
+		// Handle both single voice and multi-voice configurations
+		if bifrostReq.Params.VoiceConfig.Voice != nil || len(bifrostReq.Params.VoiceConfig.MultiVoiceConfig) > 0 {
+			addSpeechConfigToGenerationConfig(&geminiReq.GenerationConfig, bifrostReq.Params.VoiceConfig)
+		}
 	}
```
507-516: Test name may be misleading after behavior change. The test name `test_12_error_handling_invalid_roles` and docstring "Error handling for invalid roles" suggest it tests error scenarios, but it now validates successful responses. Consider renaming for clarity:

```diff
-    def test_12_error_handling_invalid_roles(self, google_client, test_config):
-        """Test Case 12: Error handling for invalid roles"""
+    def test_12_invalid_roles_handling(self, google_client, test_config):
+        """Test Case 12: Verify API handles invalid roles gracefully"""
```

core/providers/gemini/utils.go (1)
262-293: Unnecessary initialization of empty detail structs. Lines 272-273 initialize `OutputTokensDetails` and `InputTokensDetails` to empty structs, but then lines 277-280 reassign `InputTokensDetails` only when cached tokens > 0. This leaves potentially empty structs in the response when there are no details. Consider initializing the detail structs only when there's actual data:

```diff
 usage := &schemas.ResponsesResponseUsage{
 	TotalTokens:  int(metadata.TotalTokenCount),
 	InputTokens:  int(metadata.PromptTokenCount),
 	OutputTokens: int(metadata.CandidatesTokenCount),
-	OutputTokensDetails: &schemas.ResponsesResponseOutputTokens{},
-	InputTokensDetails:  &schemas.ResponsesResponseInputTokens{},
 }

 // Add cached tokens if present
 if metadata.CachedContentTokenCount > 0 {
 	usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
 		CachedTokens: int(metadata.CachedContentTokenCount),
 	}
 }

 if metadata.CandidatesTokensDetails != nil {
+	usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
 	for _, detail := range metadata.CandidatesTokensDetails {
 		switch detail.Modality {
 		case "AUDIO":
 			usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
 		}
 	}
 }
```

core/providers/gemini/responses.go (3)
539-556: Pool's New function creates state with a non-zero CreatedAt, but `acquireGeminiResponsesStreamState` resets it. The `sync.Pool`'s `New` function initializes `CreatedAt` with `int(time.Now().Unix())`, but this value gets overwritten in `acquireGeminiResponsesStreamState` at line 587. The initialization in `New` is redundant, though harmless. Consider simplifying the pool's New function since all fields are reset in acquire anyway:

```diff
 var geminiResponsesStreamStatePool = sync.Pool{
 	New: func() interface{} {
 		return &GeminiResponsesStreamState{
 			ItemIDs:             make(map[int]string),
 			ToolCallIDs:         make(map[int]string),
 			ToolCallNames:       make(map[int]string),
 			ToolArgumentBuffers: make(map[int]string),
-			CurrentOutputIndex:   0,
-			CreatedAt:            int(time.Now().Unix()),
-			HasEmittedCreated:    false,
-			HasEmittedInProgress: false,
-			TextItemClosed:       false,
-			HasStartedText:       false,
-			HasStartedToolCall:   false,
 		}
 	},
 }
```
604-636: Duplicate code: `flush()` is identical to the reset logic in `acquireGeminiResponsesStreamState()`. The `flush()` method (lines 604-636) duplicates the reset logic in `acquireGeminiResponsesStreamState()` (lines 561-593). Consider extracting it into a single method:

```diff
+// reset clears all state fields to prepare for reuse
+func (state *GeminiResponsesStreamState) reset() {
+	if state.ItemIDs == nil {
+		state.ItemIDs = make(map[int]string)
+	} else {
+		clear(state.ItemIDs)
+	}
+	// ... rest of clearing logic
+	state.CreatedAt = int(time.Now().Unix())
+	state.HasEmittedCreated = false
+	// ... remaining fields
+}

 func acquireGeminiResponsesStreamState() *GeminiResponsesStreamState {
 	state := geminiResponsesStreamStatePool.Get().(*GeminiResponsesStreamState)
-	// Clear maps
-	if state.ItemIDs == nil {
-		// ... all the duplicate code
+	state.reset()
 	return state
 }

-func (state *GeminiResponsesStreamState) flush() {
-	// ... duplicate code
-}
+func (state *GeminiResponsesStreamState) flush() {
+	state.reset()
+}
```
1961-1965: Role mapping may silently default non-assistant roles to "user". Lines 1961-1965 map assistant to "model" but default everything else to "user". If there are other roles (e.g., "developer", "tool"), they'll be incorrectly mapped. Consider explicit handling of known roles:

```diff
 if msg.Role != nil {
 	if *msg.Role == schemas.ResponsesInputMessageRoleAssistant {
 		content.Role = "model"
-	} else {
+	} else if *msg.Role == schemas.ResponsesInputMessageRoleUser {
 		content.Role = "user"
+	} else {
+		// Log or handle unexpected role
+		content.Role = "user" // Fallback
 	}
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (1 hunks)
- core/providers/gemini/responses.go (9 hunks)
- core/providers/gemini/speech.go (1 hunks)
- core/providers/gemini/utils.go (7 hunks)
- core/providers/vertex/vertex.go (8 hunks)
- core/providers/vertex/vertex_test.go (1 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/tests/test_google.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/gemini_test.go
- core/schemas/responses.go
- core/providers/vertex/vertex_test.go
- core/schemas/utils.go
- core/providers/vertex/vertex.go
- core/providers/gemini/speech.go
- core/providers/gemini/utils.go
- tests/integrations/tests/test_google.py
- core/providers/gemini/gemini.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/chat.go
- core/providers/gemini/responses.go
🧬 Code graph analysis (7)
core/schemas/responses.go (1)
- core/providers/gemini/types.go (1): Schema (724-775)
core/providers/gemini/utils.go (5)
- core/providers/gemini/types.go (5): GeminiGenerationRequest (55-73), GenerationConfig (631-697), Type (778-778), Schema (724-775), Modality (706-706)
- core/schemas/responses.go (7): ResponsesParameters (85-112), ResponsesTextConfig (118-121), ResponsesTextConfigFormat (123-128), ResponsesResponseUsage (255-262), ResponsesResponseOutputTokens (272-282), ResponsesResponseInputTokens (264-270), ResponsesTextConfigFormatJSONSchema (131-140)
- core/schemas/utils.go (1): Ptr (16-18)
- core/utils.go (1): Ptr (51-53)
- core/schemas/chatcompletions.go (3): BifrostLLMUsage (640-647), ChatPromptTokensDetails (649-655), ChatCompletionTokensDetails (657-667)
tests/integrations/tests/test_google.py (1)
- tests/integrations/tests/utils/config_loader.py (2): get_model (140-161), get_model (443-445)
core/providers/gemini/gemini.go (5)
- core/providers/gemini/chat.go (1): ToGeminiChatCompletionRequest (11-62)
- core/providers/utils/utils.go (4): ShouldSendBackRawResponse (482-487), GetPathFromContext (211-216), ProcessAndSendBifrostError (571-601), ProcessAndSendResponse (535-565)
- core/schemas/utils.go (1): Ptr (16-18)
- core/utils.go (1): Ptr (51-53)
- core/providers/gemini/responses.go (1): ToGeminiResponsesRequest (77-127)
transports/bifrost-http/integrations/genai.go (5)
- transports/bifrost-http/handlers/inference.go (1): ResponsesRequest (193-197)
- core/schemas/bifrost.go (1): ResponsesRequest (91-91)
- transports/bifrost-http/integrations/router.go (3): EmbeddingResponseConverter (102-102), ResponsesResponseConverter (98-98), ResponsesStreamResponseConverter (123-123)
- core/schemas/responses.go (2): BifrostResponsesResponse (45-83), BifrostResponsesStreamResponse (1426-1464)
- core/providers/gemini/responses.go (2): ToGeminiResponsesResponse (154-340), ToGeminiResponsesStreamResponse (342-510)
core/providers/gemini/chat.go (3)
- core/schemas/chatcompletions.go (12): ChatAssistantMessageToolCall (564-570), ChatContentBlock (505-512), ChatContentBlockTypeText (497-497), ChatAssistantMessageToolCallFunction (573-576), ChatToolChoiceTypeFunction (331-331), ChatMessage (427-436), ChatAssistantMessage (541-545), BifrostResponseChoice (582-590), ChatNonStreamResponseChoice (605-608), BifrostChatResponse (26-41), ChatStreamResponseChoiceDelta (616-622), ChatStreamResponseChoice (611-613)
- core/schemas/utils.go (1): Ptr (16-18)
- core/providers/gemini/utils.go (1): ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/responses.go (2)
- core/providers/gemini/types.go (13): GeminiGenerationRequest (55-73), ToolConfig (146-151), FunctionCallingConfig (92-99), SafetySetting (81-89), Part (936-960), Role (13-13), Content (922-930), Type (778-778), FunctionCall (1091-1101), FunctionResponse (1104-1126), FileData (1080-1088), Blob (963-971), Tool (600-627)
- core/schemas/utils.go (4): ParseModelString (23-36), SafeExtractFromMap (519-525), SafeExtractString (280-297), Ptr (16-18)
🔇 Additional comments (21)
core/providers/gemini/gemini_test.go (1)
63-63: LGTM! Trailing newline addition aligns with POSIX standards for text files.
tests/integrations/tests/test_google.py (1)
518-566: LGTM! The streaming test correctly uses `generate_content_stream` and properly handles both response structures (candidates-based and direct text). The validation logic is thorough.

core/schemas/utils.go (1)
1052-1055: LGTM! The `IsGeminiModel` function follows the established pattern of `IsAnthropicModel` and `IsMistralModel`, providing consistent model detection across the codebase.

core/providers/vertex/vertex_test.go (1)
26-27: No action required; model name resolution works correctly. The Vertex provider explicitly handles unprefixed Gemini model names: it strips the `google/` prefix at line 1262 of core/providers/vertex/vertex.go (`deployment = strings.TrimPrefix(deployment, "google/")`), and `IsGeminiModel()` correctly identifies Gemini models by checking for the "gemini" substring, which matches both prefixed and unprefixed names. The test configuration change to `"gemini-2.0-flash-001"` is compatible with existing model resolution logic.

core/schemas/responses.go (1)
132-135: Potential field duplication with parent struct. The `Name` and `Strict` fields added here duplicate the same fields in `ResponsesTextConfigFormat` (lines 125, 127). This could cause confusion about which field takes precedence during serialization/deserialization. If both structs need these fields for different serialization contexts (e.g., nested vs. flat JSON representations), consider adding a brief comment clarifying the distinction.
core/providers/vertex/vertex.go (1)
1384-1400: LGTM! The helper function correctly strips the unsupported ID fields from function_call and function_response parts for Vertex Gemini compatibility.
transports/bifrost-http/integrations/genai.go (1)
50-75: LGTM! The GenAI integration correctly routes non-embedding/speech/transcription requests through the Responses API path. The converter updates align with the new `BifrostResponsesResponse` and `BifrostResponsesStreamResponse` types, and the relevant code snippets from router.go confirm the expected signatures.

core/providers/gemini/chat.go (1)
160-307: LGTM! The streaming conversion implementation is well-structured, with proper handling of:
- Thought vs. regular text content differentiation
- ThoughtSignature preservation for Gemini 3 Pro compatibility
- Empty delta filtering to avoid unnecessary chunks
- Proper finish reason conversion using the shared helper
core/providers/gemini/gemini.go (2)
293-485: LGTM! The shared streaming handler for chat completion is well-implemented, with:
- Proper resource management via defer statements
- Comprehensive error handling for cancellation, timeout, HTTP errors, and API errors
- Appropriate buffer sizing for large streaming responses
- Correct stream lifecycle management with end indicators
572-809: LGTM! The responses stream handler properly manages stream state with pooling, handles errors appropriately, and finalizes the stream correctly when the scanner completes. The pattern of setting the stream end indicator for final responses ensures proper lifecycle signaling.
core/providers/gemini/utils.go (5)
11-70: LGTM on the ResponsesParameters conversion. The conversion logic correctly maps Gemini generation config fields to the new ResponsesParameters structure. The use of ExtraParams for provider-specific fields (top_k, candidate_count, stop_sequences, presence_penalty, frequency_penalty, seed) is appropriate for maintaining flexibility.
194-218: Finish reason mapping looks correct. The mapping from Gemini finish reasons to Bifrost format is comprehensive. Minor observation: `FinishReasonMalformedFunctionCall` and `FinishReasonUnexpectedToolCall` both map to `"tool_calls"`, which may lose fidelity about the error nature, but this is reasonable for a normalized API.
233-260: LGTM on chat usage conversion. The `convertGeminiUsageMetadataToChatUsage` function correctly handles nil input, maps token counts, and conditionally populates details for cached and reasoning tokens.
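The conversion pattern just described can be sketched with stand-in types. The field names below are assumptions based on this review (they echo Gemini's usage metadata fields), not the actual Bifrost schemas; the point is the shape: direct count mapping plus conditional detail population.

```go
package main

import "fmt"

// Illustrative stand-ins for Gemini usage metadata and a chat usage struct.
type usageMetadata struct {
	PromptTokenCount        int
	CandidatesTokenCount    int
	TotalTokenCount         int
	CachedContentTokenCount int
	ThoughtsTokenCount      int
}

type chatUsage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
	CachedTokens     *int // detail: only set when present
	ReasoningTokens  *int // detail: only set when present
}

// convertUsage mirrors the strategy: handle nil, map counts directly,
// and attach detail fields only when they carry data.
func convertUsage(m *usageMetadata) *chatUsage {
	if m == nil {
		return nil
	}
	u := &chatUsage{
		PromptTokens:     m.PromptTokenCount,
		CompletionTokens: m.CandidatesTokenCount,
		TotalTokens:      m.TotalTokenCount,
	}
	if m.CachedContentTokenCount > 0 {
		c := m.CachedContentTokenCount
		u.CachedTokens = &c
	}
	if m.ThoughtsTokenCount > 0 {
		r := m.ThoughtsTokenCount
		u.ReasoningTokens = &r
	}
	return u
}

func main() {
	u := convertUsage(&usageMetadata{PromptTokenCount: 10, CandidatesTokenCount: 5, TotalTokenCount: 15})
	fmt.Println(u.TotalTokens, u.CachedTokens == nil)
}
```

Keeping the detail pointers nil when there is nothing to report is exactly what the lazy-initialization nitpick above asks for in the Responses-usage variant.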
839-890: LGTM on the buildOpenAIResponseFormat signature change. The function now correctly returns `*schemas.ResponsesTextConfig` and properly handles the three cases: explicit schema, JSON schema from map, and fallback to json_object mode.
932-954: LGTM on the extractFunctionResponseOutput helper. The function safely handles nil checks and has a sensible fallback strategy: try the "output" field first, then marshal the entire response. This is consistent with how function responses are handled elsewhere in the codebase.
core/providers/gemini/responses.go (6)
886-957: LGTM on function call part processing. The `processGeminiFunctionCallPart` correctly:
- Closes any open text item first
- Uses the function call ID with fallback to name
- Emits both OutputItemAdded and FunctionCallArgumentsDone events
- Properly tracks state
1115-1161: Empty text emitted on output_text.done event. At line 1124, `emptyText := ""` is created and passed as the `Text` field in the `OutputTextDone` event. This seems intentional, but consider whether the accumulated text should be tracked and emitted instead for completeness. Is emitting an empty text string on `output_text.done` consistent with the Bifrost responses API specification? Some consumers may expect the complete text in this event.
1539-1577: Function call handling creates a separate reasoning message for the thought signature. This approach stores thought signatures in a separate `ResponsesReasoning` message following the function call message. It is a design choice that works but introduces coupling between adjacent messages. The approach is documented in the comments and maintains scalability. The reconstruction in `convertResponsesMessagesToGeminiContents` (lines 2013-2018) correctly looks ahead to find the paired reasoning message.
2063-2085: LGTM on expanded content block type handling.The switch case now properly handles:
- Both input and output text types
- Reasoning content with Thought flag
- Refusal content converted to plain text
This aligns well with the bidirectional conversion requirements.
959-1035: Inconsistent outputIndex initialization for inline data. At lines 975-978, if `!state.HasStartedText`, outputIndex is set to 0, but for other part types (thought, function call), it's set to 1. This inconsistency could cause item ID collisions or ordering issues.

```diff
 // Create new output item for the inline data
 outputIndex := state.CurrentOutputIndex + 1
 if !state.HasStartedText {
-	outputIndex = 0
+	outputIndex = 0 // Inline data takes first slot if no text
 }
```

Consider documenting the indexing strategy or unifying the approach across all part types.
Likely an incorrect or invalid review comment.
738-799: Review comment is incorrect – no nil dereference risk exists. MessageID is guaranteed to be initialized before `processGeminiTextPart` accesses it. In `ToBifrostResponsesStream`, the `if !state.HasEmittedCreated` block (which initializes MessageID) always executes before `processGeminiPart` is called. Since `HasEmittedCreated` starts as false on first invocation and is set to true immediately after initialization, subsequent events reuse the same persisted state with MessageID already set. The defensive nil check is unnecessary.
2cc10d8 to eae8367 (Compare)
Actionable comments posted: 2
♻️ Duplicate comments (2)
core/providers/gemini/chat.go (1)
77-112: Align non‑stream tool call IDs with streaming logic. Here the non‑stream path always sets `ID` to the function name:

```go
toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
	Index:    uint16(len(toolCalls)),
	Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
	ID:       function.Name,
	Function: function,
})
```

But in the streaming path you correctly prefer `FunctionCall.ID` when present and only fall back to the function name. For consistency (and to preserve provider‑assigned tool call IDs), the non‑stream path should use the same logic.

A minimal fix:

```diff
-	function := schemas.ChatAssistantMessageToolCallFunction{
-		Name: &part.FunctionCall.Name,
-	}
+	function := schemas.ChatAssistantMessageToolCallFunction{
+		Name: &part.FunctionCall.Name,
+	}
+
+	// Use ID if available, otherwise use function name (matches streaming path)
+	callID := part.FunctionCall.Name
+	if part.FunctionCall.ID != "" {
+		callID = part.FunctionCall.ID
+	}
	...
-	toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
-		Index:    uint16(len(toolCalls)),
-		Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
-		ID:       function.Name,
-		Function: function,
-	})
+	toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
+		Index:    uint16(len(toolCalls)),
+		Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
+		ID:       &callID,
+		Function: function,
+	})
```

This keeps ID behavior consistent across streaming and non‑streaming Gemini chat responses and avoids losing original call IDs.
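The preferred fallback order can be isolated in a tiny helper, sketched below; `toolCallID` is an illustrative name, not a function from the codebase:

```go
package main

import "fmt"

// toolCallID applies the fallback the review suggests for both the
// streaming and non-streaming paths: prefer the provider-assigned
// FunctionCall.ID when non-empty, else fall back to the function name.
func toolCallID(id, name string) string {
	if id != "" {
		return id
	}
	return name
}

func main() {
	fmt.Println(toolCallID("call_abc123", "get_weather")) // provider ID wins
	fmt.Println(toolCallID("", "get_weather"))            // fallback to name
}
```

Using one helper in both paths removes the risk of the two conversions drifting apart again.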
core/providers/gemini/responses.go (1)
114-123: Guard against nil `bifrostReq.Params` before accessing `ExtraParams`. At line 114, `bifrostReq.Params.ExtraParams` is accessed without first verifying that `bifrostReq.Params` is non-nil. The nil-check at line 87 only protects lines 88–99; after that block, `Params` can still be nil (e.g., if the request has no parameters but does have `ExtraParams` in a different code path). Dereferencing `.ExtraParams` on a nil `Params` will panic.

Apply this diff to guard the access:

```diff
-	if bifrostReq.Params.ExtraParams != nil {
+	if bifrostReq.Params != nil && bifrostReq.Params.ExtraParams != nil {
 		if safetySettings, ok := schemas.SafeExtractFromMap(bifrostReq.Params.ExtraParams, "safety_settings"); ok {
```
🧹 Nitpick comments (3)
core/providers/gemini/gemini_test.go (1)
35-56: Consider enabling reasoning and transcription scenarios to cover new Gemini capabilities. Given this PR adds Gemini-native reasoning and transcription support, it may be worth turning `Reasoning`, `Transcription`, and (if implemented) `TranscriptionStream` to `true` here (or adding dedicated tests) so the comprehensive suite exercises those paths end-to-end.

tests/integrations/tests/test_google.py (1)
507-517: Clarify intent of `test_12_error_handling_invalid_roles`. The test name/docstring still talk about "error handling" but the body now asserts a successful response with candidates. If the new contract is "invalid roles are tolerated/normalized," consider renaming the test or updating the docstring to reflect that it verifies graceful handling rather than an error path.

core/providers/vertex/vertex_test.go (1)
core/providers/vertex/vertex_test.go (1)
24-48: Model IDs update LGTM; comment for `Reasoning` scenario is now stale. Switching `ChatModel`/`VisionModel` to `gemini-2.0-flash-001` aligns these tests with the new native Gemini paths. However, the `Reasoning: false` comment still says "because we are not using native gemini converters", which is no longer accurate. Consider updating that comment (or adding a TODO) so future readers don't misinterpret why reasoning is disabled.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (1 hunks)
- core/providers/gemini/responses.go (9 hunks)
- core/providers/gemini/speech.go (1 hunks)
- core/providers/gemini/utils.go (7 hunks)
- core/providers/vertex/vertex.go (8 hunks)
- core/providers/vertex/vertex_test.go (1 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/tests/test_google.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- core/schemas/responses.go
- core/providers/gemini/speech.go
- core/schemas/utils.go
- transports/bifrost-http/integrations/genai.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/vertex/vertex.go
- core/providers/vertex/vertex_test.go
- core/providers/gemini/utils.go
- core/providers/gemini/gemini_test.go
- tests/integrations/tests/test_google.py
- core/providers/gemini/chat.go
- core/providers/gemini/gemini.go
- core/providers/gemini/responses.go
🧬 Code graph analysis (6)
core/providers/vertex/vertex.go (6)
- core/schemas/utils.go (1)
  - IsGeminiModel (1052-1054)
- core/providers/gemini/chat.go (1)
  - ToGeminiChatCompletionRequest (11-62)
- core/providers/gemini/types.go (1)
  - GeminiGenerationRequest (55-73)
- core/providers/utils/utils.go (1)
  - HandleProviderResponse (343-387)
- core/providers/gemini/gemini.go (2)
  - HandleGeminiChatCompletionStream (294-485), HandleGeminiResponsesStream (573-809)
- core/providers/gemini/responses.go (1)
  - ToGeminiResponsesRequest (76-126)
core/providers/gemini/utils.go (3)
- core/providers/gemini/types.go (7)
  - GeminiGenerationRequest (55-73), GenerationConfig (631-697), Type (778-778), Schema (724-775), GenerateContentResponseUsageMetadata (1259-1284), Modality (706-706), FunctionResponse (1104-1126)
- core/schemas/utils.go (1)
  - Ptr (16-18)
- core/utils.go (1)
  - Ptr (51-53)
tests/integrations/tests/test_google.py (1)
- tests/integrations/tests/utils/config_loader.py (2)
  - get_model (140-161), get_model (443-445)
core/providers/gemini/chat.go (4)
- core/schemas/chatcompletions.go (12)
  - ChatAssistantMessageToolCall (564-570), ChatContentBlock (505-512), ChatAssistantMessageToolCallFunction (573-576), ChatToolChoiceTypeFunction (331-331), ChatMessage (427-436), ChatMessageRoleAssistant (419-419), ChatAssistantMessage (541-545), BifrostResponseChoice (582-590), BifrostChatResponse (26-41), ChatStreamResponseChoiceDelta (616-622), ChatToolTypeFunction (197-197), ChatStreamResponseChoice (611-613)
- core/schemas/utils.go (1)
  - Ptr (16-18)
- core/schemas/models.go (1)
  - Model (109-129)
- core/providers/gemini/utils.go (1)
  - ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/gemini.go (8)
- core/providers/gemini/chat.go (1)
  - ToGeminiChatCompletionRequest (11-62)
- core/schemas/bifrost.go (3)
  - ChatCompletionRequest (89-89), Gemini (48-48), ResponsesRequest (91-91)
- core/schemas/provider.go (1)
  - Provider (280-307)
- core/providers/utils/utils.go (3)
  - ShouldSendBackRawResponse (482-487), GetPathFromContext (211-216), GetBifrostResponseForStreamResponse (777-805)
- core/schemas/utils.go (1)
  - Ptr (16-18)
- core/utils.go (1)
  - Ptr (51-53)
- core/schemas/responses.go (2)
  - BifrostResponsesRequest (32-39), BifrostResponsesStreamResponse (1426-1464)
- core/providers/gemini/responses.go (2)
  - ToGeminiResponsesRequest (76-126), FinalizeGeminiResponsesStream (1210-1212)
core/providers/gemini/responses.go (3)
- core/providers/gemini/types.go (15)
  - GeminiGenerationRequest (55-73), ToolConfig (146-151), FunctionCallingConfig (92-99), SafetySetting (81-89), Part (936-960), Role (13-13), Content (922-930), Type (778-778), FunctionCall (1091-1101), FunctionResponse (1104-1126), FileData (1080-1088), Blob (963-971), Tool (600-627), FunctionCallingConfigModeAuto (109-109), FunctionCallingConfigModeNone (116-116)
- core/schemas/responses.go (14)
  - BifrostResponsesResponse (45-83), ResponsesInputMessageRoleAssistant (332-332), ResponsesInputMessageRoleUser (333-333), ResponsesMessageTypeFunctionCall (296-296), ResponsesToolMessage (461-481), ResponsesReasoning (730-733), ResponsesMessageTypeFunctionCallOutput (297-297), ResponsesMessageTypeReasoning (308-308), BifrostResponsesStreamResponse (1426-1464), ResponsesMessage (315-327), ResponsesMessageTypeMessage (291-291), ResponsesMessageContent (339-344), ResponsesMessageContentBlock (399-410), ResponsesTool (1029-1044)
- core/schemas/utils.go (3)
  - SafeExtractFromMap (519-525), SafeExtractString (280-297), Ptr (16-18)
🔇 Additional comments (8)
core/providers/gemini/gemini_test.go (1)
63-63: No functional change – formatting only. This change just adds the final newline/closing brace; there's no behavioral impact and the test setup remains valid.
tests/integrations/tests/test_google.py (1)
518-566: Streaming aggregation logic for `test_13_streaming` looks solid. Chunk iteration, content aggregation, and the robot-related assertions align with the Gemini streaming shape (candidates → content.parts → text) and validate both that chunks arrive and that the story is on-topic. No issues from a provider/streaming perspective.
core/providers/vertex/vertex.go (2)
272-305: Vertex Gemini chat paths are wired coherently. The Gemini branches in `ChatCompletion` and `ChatCompletionStream` correctly:

- Build requests via `gemini.ToGeminiChatCompletionRequest`,
- Strip unsupported function IDs with `stripVertexGeminiUnsupportedFields`,
- Route to native `generateContent` / `streamGenerateContent?alt=sse` endpoints, and
- Map responses back through Gemini's converters with consistent ExtraFields (provider, model, latency, raw response).
Nothing blocking here from a Vertex/Gemini integration standpoint.
Also applies to: 386-391, 477-497
1391-1407: `stripVertexGeminiUnsupportedFields` correctly clears unsupported IDs in-place. Iterating over `requestBody.Contents` and mutating `part.FunctionCall.ID`/`part.FunctionResponse.ID` via the `*Part` pointers ensures the underlying request is sanitized for Vertex Gemini without extra allocations. This matches the stated intent of dropping unsupported `id` fields before hitting Vertex.

core/providers/gemini/chat.go (1)
160-307: Streaming conversion (`ToBifrostChatCompletionStream`) is well‑structured. The streaming delta builder:

- Separates regular text vs. thought content via `part.Thought`,
- Properly maps Gemini `RoleModel` → assistant,
- Preserves tool calls (with IDs and Gemini thought signatures in `ExtraContent`), and
- Emits a final chunk with finish reason and usage only once.
This matches Bifrost’s streaming schema expectations and should interoperate cleanly with the new Gemini/Vertex streaming handlers.
core/providers/gemini/utils.go (1)
11-70: Responses parameter, finish‑reason, usage, and function‑response helpers look consistent.

- `convertGenerationConfigToResponsesParameters` cleanly maps Gemini `GenerationConfig` into `ResponsesParameters`, using `ExtraParams` for knobs like `top_k`, penalties, seed, stop sequences, and response schemas, and wiring `ResponseMIMEType` into the new `ResponsesTextConfig`.
- `ConvertGeminiFinishReasonToBifrost` and `geminiFinishReasonToBifrost` cover the expected Gemini finish reasons and sensibly default to the raw provider value.
- The two usage converters correctly project `GenerateContentResponseUsageMetadata` into chat and responses usage structs, including cached and reasoning tokens where present.
- `extractFunctionResponseOutput` gives a reasonable best‑effort: prefer `"output"` as a string, otherwise fall back to JSON of the full response map.

Nothing blocking here; these helpers should make request/response conversions much easier to maintain.
Also applies to: 194-293, 839-890, 933-954
core/providers/gemini/gemini.go (2)
209-244: Native Gemini ChatCompletion and streaming wiring look correct. The new `ChatCompletion` and `ChatCompletionStream`:

- Gate operations via `CheckOperationAllowed`,
- Build request bodies with `ToGeminiChatCompletionRequest`,
- Use a shared `completeRequest` for non‑streaming and `HandleGeminiChatCompletionStream` for SSE streaming,
- Populate ExtraFields (provider, model, latency, raw response) consistently.
The streaming handler’s SSE loop, chunk indexing, finish detection, and error propagation all align with Bifrost’s streaming contracts.
Also applies to: 246-307
487-526: Responses and ResponsesStream Gemini paths are consistent with chat streaming. `Responses` and `ResponsesStream` mirror the chat flows:

- Convert via `ToGeminiResponsesRequest`,
- Reuse `completeRequest` for non‑streaming and `HandleGeminiResponsesStream` for SSE,
- Maintain per‑chunk sequence indices, latency, and finalization (including `FinalizeGeminiResponsesStream` and usage attachment),
- Integrate with post‑hooks and `GetBifrostResponseForStreamResponse` like other providers.

The shared `processGeminiStreamChunk` and `extractGeminiUsageMetadata` helpers centralize parsing and usage extraction, reducing duplicated error‑prone logic. No issues spotted in these paths.

Also applies to: 528-809
e0964ba to 461269a (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (1)
core/providers/gemini/responses.go (1)
76-126: Guard against nil Params and avoid silently swallowing conversion errors in ToGeminiResponsesRequest. Two issues here:

1. Nil pointer risk on Params (regression of earlier fix). `bifrostReq.Params` is optional on `BifrostResponsesRequest`, but this block unconditionally dereferences it:

```go
if bifrostReq.Params.ExtraParams != nil { ... }
```

If a caller provides `Input` without `Params`, this will panic. Minimal fix:

```diff
-	if bifrostReq.Params.ExtraParams != nil {
+	if bifrostReq.Params != nil && bifrostReq.Params.ExtraParams != nil {
 		if safetySettings, ok := schemas.SafeExtractFromMap(bifrostReq.Params.ExtraParams, "safety_settings"); ok {
 			...
 		}
 	}
```

2. Conversion errors from `convertResponsesMessagesToGeminiContents` are lost. If `convertResponsesMessagesToGeminiContents` returns an error (e.g., invalid image/audio block), `ToGeminiResponsesRequest` returns `nil`, but the callers wrap it in:

```go
func() (any, error) { return ToGeminiResponsesRequest(request), nil }
```

So `CheckContextAndGetRequestBody` never sees the error and happily marshals a `nil` body (`null`) instead of returning a structured `BifrostError`.

Consider changing `ToGeminiResponsesRequest` to return `(*GeminiGenerationRequest, error)` and threading that through the builder, so malformed Responses inputs fail fast with a clear error instead of a confusing provider 4xx.
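The suggested signature change can be sketched as below. The types and validation condition are simplified stand-ins for the real `GeminiGenerationRequest` and message conversion; only the error-threading pattern is the point:

```go
package main

import (
	"errors"
	"fmt"
)

// geminiRequest is a simplified stand-in for GeminiGenerationRequest.
type geminiRequest struct{ Model string }

// toGeminiResponsesRequest returns an error instead of nil on failure,
// so the request builder can surface a structured failure rather than
// marshaling a nil body as "null".
func toGeminiResponsesRequest(model string, contentsValid bool) (*geminiRequest, error) {
	if !contentsValid {
		return nil, errors.New("invalid responses input: unsupported content block")
	}
	return &geminiRequest{Model: model}, nil
}

func main() {
	// The caller's closure now propagates the error instead of dropping it.
	build := func() (any, error) {
		return toGeminiResponsesRequest("gemini-2.0-flash", false)
	}
	if _, err := build(); err != nil {
		fmt.Println("conversion failed fast:", err)
	}
}
```

With this shape, the `func() (any, error)` wrapper shown above forwards the conversion error naturally instead of hard-coding `nil`.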
🧹 Nitpick comments (4)
tests/integrations/tests/test_google.py (1)
507-516: Test name no longer reflects actual behavior. The test is named `test_12_error_handling_invalid_roles` but now expects a successful response with valid candidates. The docstring still says "Error handling for invalid roles" but the test asserts success. If invalid roles no longer cause errors (perhaps Bifrost/Gemini now handles them gracefully), consider renaming the test to reflect its actual purpose, e.g., `test_12_invalid_roles_handled_gracefully`:

```diff
 @skip_if_no_api_key("google")
-def test_12_error_handling_invalid_roles(self, google_client, test_config):
-    """Test Case 12: Error handling for invalid roles"""
+def test_12_invalid_roles_handled_gracefully(self, google_client, test_config):
+    """Test Case 12: Verify invalid roles are handled gracefully without errors"""
     response = google_client.models.generate_content(
         model=get_model("google", "chat"), contents=GENAI_INVALID_ROLE_CONTENT
     )
```

core/providers/gemini/utils.go (1)
194-218: Fallback for unknown finish reasons may leak raw provider values. `ConvertGeminiFinishReasonToBifrost` falls back to `string(providerReason)` when a reason isn't in `geminiFinishReasonToBifrost`. That can expose new/raw Gemini enums instead of a stable Bifrost set (`stop`, `length`, `content_filter`, `tool_calls`, etc.). Consider normalizing unknown reasons to a generic bucket (e.g. `"other"` or `"stop"`) and only emitting raw strings via an auxiliary field if you need them.
233-293: Also surface reasoning tokens in Responses usage metadata. `convertGeminiUsageMetadataToChatUsage` correctly maps `ThoughtsTokenCount` into `ChatCompletionTokensDetails.ReasoningTokens`, but `convertGeminiUsageMetadataToResponsesUsage` ignores it even though `ResponsesResponseOutputTokens` supports reasoning tokens.

You can preserve that signal like this:

```diff
 func convertGeminiUsageMetadataToResponsesUsage(metadata *GenerateContentResponseUsageMetadata) *schemas.ResponsesResponseUsage {
@@
-	usage := &schemas.ResponsesResponseUsage{
-		TotalTokens:         int(metadata.TotalTokenCount),
-		InputTokens:         int(metadata.PromptTokenCount),
-		OutputTokens:        int(metadata.CandidatesTokenCount),
-		OutputTokensDetails: &schemas.ResponsesResponseOutputTokens{},
-		InputTokensDetails:  &schemas.ResponsesResponseInputTokens{},
-	}
+	usage := &schemas.ResponsesResponseUsage{
+		TotalTokens:  int(metadata.TotalTokenCount),
+		InputTokens:  int(metadata.PromptTokenCount),
+		OutputTokens: int(metadata.CandidatesTokenCount),
+		// details structs are allocated lazily below
+	}
@@
-	if metadata.CachedContentTokenCount > 0 {
-		usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
-			CachedTokens: int(metadata.CachedContentTokenCount),
-		}
-	}
+	if metadata.CachedContentTokenCount > 0 {
+		if usage.InputTokensDetails == nil {
+			usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{}
+		}
+		usage.InputTokensDetails.CachedTokens = int(metadata.CachedContentTokenCount)
+	}
@@
-	if metadata.CandidatesTokensDetails != nil {
+	if metadata.CandidatesTokensDetails != nil {
 		for _, detail := range metadata.CandidatesTokensDetails {
@@
-			case "AUDIO":
-				usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
+			case "AUDIO":
+				if usage.OutputTokensDetails == nil {
+					usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
+				}
+				usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
 			}
 		}
 	}
+
+	// Also map reasoning tokens from thoughts
+	if metadata.ThoughtsTokenCount > 0 {
+		if usage.OutputTokensDetails == nil {
+			usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
+		}
+		usage.OutputTokensDetails.ReasoningTokens = int(metadata.ThoughtsTokenCount)
+	}
```

This keeps Chat and Responses usage aligned and makes reasoning‑token accounting available to clients.
core/providers/gemini/responses.go (1)
1224-1360: Inline/file output blocks always use input content block types. In `convertGeminiContentsToResponsesMessages`:

- Text blocks correctly use input vs output types based on `content.Role` (model → `ResponsesOutputMessageContentTypeText`, user → input text).
- For inline data and file data, you always use the input block types (`ResponsesInputMessageContentBlockTypeImage`/`Audio`/`File`), even when the role is assistant/model:

```go
Type: func() schemas.ResponsesMessageContentBlockType {
	if strings.HasPrefix(part.InlineData.MIMEType, "image/") {
		return schemas.ResponsesInputMessageContentBlockTypeImage
	} else if strings.HasPrefix(part.InlineData.MIMEType, "audio/") {
		return schemas.ResponsesInputMessageContentBlockTypeAudio
	}
	return schemas.ResponsesInputMessageContentBlockTypeText
}(),
```

For consistency with how text/reasoning are handled (and with the Responses schema's separation of input vs output block types), it may be worth:

- Selecting output variants when `content.Role == "model"` for images/audio/files, and
- Reserving input variants for user‑authored content.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (1 hunks)
- core/providers/gemini/responses.go (9 hunks)
- core/providers/gemini/speech.go (1 hunks)
- core/providers/gemini/utils.go (7 hunks)
- core/providers/vertex/vertex.go (8 hunks)
- core/providers/vertex/vertex_test.go (1 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/tests/test_google.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/vertex/vertex_test.go
- core/schemas/responses.go
- core/providers/gemini/gemini_test.go
- core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- tests/integrations/tests/test_google.py
- core/providers/gemini/speech.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/utils.go
- core/providers/gemini/chat.go
- core/providers/gemini/gemini.go
- core/providers/vertex/vertex.go
- core/providers/gemini/responses.go
🧬 Code graph analysis (6)
tests/integrations/tests/test_google.py (1)
- tests/integrations/tests/utils/config_loader.py (2)
  - get_model (140-161), get_model (443-445)
transports/bifrost-http/integrations/genai.go (4)
- core/schemas/bifrost.go (1)
  - ResponsesRequest (91-91)
- core/providers/gemini/embedding.go (1)
  - ToGeminiEmbeddingResponse (57-100)
- core/schemas/responses.go (2)
  - BifrostResponsesResponse (45-83), BifrostResponsesStreamResponse (1426-1464)
- core/providers/gemini/responses.go (2)
  - ToGeminiResponsesResponse (153-339), ToGeminiResponsesStreamResponse (341-509)
core/providers/gemini/utils.go (5)
- core/providers/gemini/types.go (16)
  - Type (778-778), Schema (724-775), FinishReason (22-22), FinishReasonStop (28-28), FinishReasonMaxTokens (30-30), FinishReasonSafety (33-33), FinishReasonRecitation (35-35), FinishReasonLanguage (37-37), FinishReasonOther (39-39), FinishReasonBlocklist (41-41), FinishReasonProhibitedContent (43-43), FinishReasonSPII (46-46), FinishReasonMalformedFunctionCall (48-48), FinishReasonImageSafety (50-50), FinishReasonUnexpectedToolCall (52-52), FunctionResponse (1104-1126)
- core/schemas/responses.go (7)
  - ResponsesParameters (85-112), ResponsesTextConfig (118-121), ResponsesTextConfigFormat (123-128), ResponsesResponseUsage (255-262), ResponsesResponseOutputTokens (272-282), ResponsesResponseInputTokens (264-270), ResponsesTextConfigFormatJSONSchema (131-140)
- core/schemas/utils.go (1)
  - Ptr (16-18)
- core/utils.go (1)
  - Ptr (51-53)
- core/schemas/chatcompletions.go (3)
  - BifrostLLMUsage (640-647), ChatPromptTokensDetails (649-655), ChatCompletionTokensDetails (657-667)
core/providers/gemini/chat.go (3)
- core/schemas/chatcompletions.go (7)
  - ChatAssistantMessageToolCall (564-570), ChatContentBlock (505-512), ChatAssistantMessage (541-545), BifrostResponseChoice (582-590), BifrostChatResponse (26-41), ChatStreamResponseChoiceDelta (616-622), ChatStreamResponseChoice (611-613)
- core/schemas/utils.go (1)
  - Ptr (16-18)
- core/providers/gemini/utils.go (1)
  - ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/gemini.go (5)
- core/providers/gemini/chat.go (1)
  - ToGeminiChatCompletionRequest (11-62)
- core/providers/utils/utils.go (4)
  - ShouldSendBackRawResponse (482-487), GetPathFromContext (211-216), NewBifrostOperationError (449-460), ProcessAndSendResponse (535-565)
- core/schemas/utils.go (1)
  - Ptr (16-18)
- core/utils.go (1)
  - Ptr (51-53)
- core/providers/gemini/responses.go (2)
  - ToGeminiResponsesRequest (76-126), FinalizeGeminiResponsesStream (1220-1222)
core/providers/gemini/responses.go (3)
- core/providers/gemini/types.go (11)
  - GeminiGenerationRequest (55-73), ToolConfig (146-151), SafetySetting (81-89), Part (936-960), Content (922-930), Type (778-778), FunctionCall (1091-1101), FunctionResponse (1104-1126), FileData (1080-1088), Blob (963-971), Tool (600-627)
- core/schemas/responses.go (9)
  - BifrostResponsesRequest (32-39), ResponsesParameters (85-112), BifrostResponsesResponse (45-83), BifrostResponsesStreamResponse (1426-1464), ResponsesMessage (315-327), ResponsesMessageContent (339-344), ResponsesMessageContentBlock (399-410), ResponsesTool (1029-1044), ResponsesToolChoice (958-961)
- core/schemas/utils.go (3)
  - ParseModelString (23-36), SafeExtractFromMap (519-525), SafeExtractString (280-297)
🔇 Additional comments (31)
core/providers/gemini/speech.go (1)
110-127: LGTM! Proper nil check added to prevent potential panic. The nil check on line 110 correctly guards against nil pointer dereference when `bifrostReq.Input` is nil. This is a good defensive programming practice.

One consideration: the voice config handling (lines 121-126) is now inside the `if` block, meaning if `Input` is nil or empty, the request will have no contents and no voice config. This appears intentional since a speech request without input text would be invalid anyway.

tests/integrations/tests/test_google.py (1)
519-566: LGTM! Well-implemented streaming test. The streaming test properly:

- Uses the correct Google GenAI SDK streaming method (`generate_content_stream`)
- Handles both the structured response format and fallback to the direct text attribute
- Validates chunk count and content length
- Uses a comprehensive list of robot-related terms for flexible content validation
Good defensive handling of the response structure at lines 537-546.
transports/bifrost-http/integrations/genai.go (2)
49-62: LGTM! Correctly migrated to Responses API. The changes properly:

- Switch from `ChatRequest` to `ResponsesRequest` with `ToBifrostResponsesRequest()` (line 51)
- Update converters to use `BifrostResponsesResponse` and `ToGeminiResponsesResponse` (lines 60-61)
BifrostResponsesResponseandToGeminiResponsesResponse(lines 60-61)This aligns with the PR objective to use native Gemini API format instead of OpenAI-compatible endpoints.
72-78: LGTM! Streaming converter updated for Responses API. The streaming converter is correctly renamed and updated to use `BifrostResponsesStreamResponse` and `ToGeminiResponsesStreamResponse`, maintaining consistency with the non-streaming path.

core/providers/vertex/vertex.go (6)
297-312: LGTM! Gemini branch in ChatCompletion properly implemented. The Gemini model handling correctly:

- Uses `gemini.ToGeminiChatCompletionRequest()` for conversion
- Strips unsupported fields via `stripVertexGeminiUnsupportedFields()`
- Sets the deployment model
- Follows the same marshal/unmarshal pattern as other branches for Vertex API compatibility
478-499: LGTM! Gemini response handling in ChatCompletion is correct. The response processing:

- Uses the proper `GenerateContentResponse` type
- Sets all required `ExtraFields`, including `ModelDeployment` when applicable
- Handles raw response correctly based on configuration
656-668: Verify API version consistency between streaming and non-streaming paths. The streaming Gemini URLs use `v1beta1`:

```go
completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1beta1/projects/%s/locations/%s/publishers/google/models/%s:streamGenerateContent", ...)
```

However, the non-streaming ChatCompletion path (lines 388-391) uses `v1`:

```go
completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:generateContent", ...)
```

If this is intentional (e.g., SSE streaming is only available in v1beta1), this is fine. Otherwise, consider aligning the versions.
916-1012: LGTM! Gemini branch in Responses is well-implemented. The implementation correctly:

- Calls `stripVertexGeminiUnsupportedFields()` (line 927)
- Handles auth errors by removing the client from the pool (lines 984-986)
- Sets `ModelDeployment` when it differs from `request.Model` (lines 1004-1006)
- Uses a consistent v1 API endpoint matching the non-streaming ChatCompletion path
All previously raised review concerns have been addressed.
1137-1229: LGTM! ResponsesStream Gemini branch properly implemented. The implementation:

- Correctly calls `stripVertexGeminiUnsupportedFields()` (line 1159), addressing the previous review concern
- Uses the v1beta1 endpoint for streaming (consistent with ChatCompletionStream)
- Properly handles authentication via API key or OAuth2
- Uses the shared `gemini.HandleGeminiResponsesStream` handler
1395-1411: LGTM! Helper function correctly strips unsupported fields. The `stripVertexGeminiUnsupportedFields` function properly:

- Iterates through all contents and parts
- Clears the `ID` field from `FunctionCall` and `FunctionResponse` objects
- Operates in-place on the request (no allocation overhead)

Note: Setting `ID = ""` works correctly with `omitempty` JSON tags, ensuring the field is omitted from the serialized request.

core/providers/gemini/chat.go (3)
11-23: LGTM! Signature simplification is appropriate. The removal of the `responseModalities` parameter from `ToGeminiChatCompletionRequest` simplifies the API. The function now internally passes an empty modalities list to `convertParamsToGenerationConfig`, which is appropriate for chat completions where response modalities are typically not needed.
77-157: LGTM! Non-streaming response handling is well-structured. The updated `ToBifrostChatResponse`:

- Properly aggregates content blocks and tool calls from all parts
- Uses `FunctionCall.ID` when available, falling back to name (lines 106-109), addressing the previous review concern
- Handles `FunctionResponse` by extracting output into text content blocks
- Correctly structures the response with `ChatNonStreamResponseChoice`

The aggregation pattern ensures all content from Gemini's multi-part response format is captured in the Bifrost response.
165-312: LGTM! Well-implemented streaming response conversion. The `ToBifrostChatCompletionStream` function:

- Streaming state: returns a `(response, error, isLastChunk)` tuple for proper stream lifecycle management
- Role mapping: converts Gemini's "model" role to "assistant" (lines 200-206)
- Content separation: properly distinguishes between regular text, thought/reasoning content, and tool calls (lines 212-265)
- ThoughtSignature preservation: correctly preserves `thought_signature` in `ExtraContent` for Gemini 3 Pro compatibility (lines 247-254)
- Empty chunk handling: skips chunks with no meaningful content unless it's the final chunk (lines 283-287)
- Finish reason and usage: only attached on the last chunk (lines 291-294, 307-309)
core/providers/gemini/utils.go (3)
11-70: GenerationConfig → ResponsesParameters mapping looks correct. The conversion from `GenerationConfig` to `schemas.ResponsesParameters` (temperature, top‑p, logprobs, max tokens, stop sequences, penalties, response MIME/schema) is consistent with the Responses schema and keeps provider‑specific knobs under `ExtraParams`. No functional issues spotted here.
839-890: Structured `response_format` → ResponsesTextConfig mapping looks solid. `buildOpenAIResponseFormat`'s conversion of Gemini `Schema`/`responseJsonSchema` into `ResponsesTextConfig` (`json_schema` vs `json_object` fallback, lowercasing types, preserving title as `Name`, and defaulting `Strict` to `false`) matches the Responses text config types and should interop cleanly with OpenAI‑style structured outputs.
933-954: Function response extraction provides sensible fallback
extractFunctionResponseOutputpreferring a top‑level"output"string and otherwise JSON‑encoding the wholeResponsemap is a good compromise to get both clean text outputs and robust behavior for arbitrary payloads.core/providers/gemini/gemini.go (7)
209-244: ChatCompletion now uses native Gemini converters correctly

The non-streaming `ChatCompletion` path cleanly switches to `ToGeminiChatCompletionRequest` + `completeRequest` + `ToBifrostChatResponse`, with ExtraFields (provider, model, request type, latency, raw response) populated consistently with other providers. No issues here.

246-291: ChatCompletionStream wiring into shared Gemini SSE handler looks good

The streaming chat entrypoint delegates to `HandleGeminiChatCompletionStream` with the right URL (`/models/<model>:streamGenerateContent?alt=sse` via context override), headers, extra headers, and provider/model metadata. This keeps the provider surface small and reuses the common Gemini SSE logic.
293-485: Shared chat streaming handler has robust error and lifecycle handling

`HandleGeminiChatCompletionStream` correctly:

- Sets `resp.StreamBody = true` and uses `bufio.Scanner` over `BodyStream()`.
- Handles network errors (cancel, timeout, generic) and HTTP errors with structured `BifrostError`s.
- Parses SSE `data:` lines into `GenerateContentResponse` chunks via `processGeminiStreamChunk`.
- Tracks `responseID`/`modelName`, converts chunks via `ToBifrostChatCompletionStream`, and annotates ExtraFields (request type, provider, model, chunk index, per-chunk and total latency, optional raw chunk).

Stream-end is signaled both on `isLastChunk` and scanner errors with the appropriate context key and post-hook processing. No correctness concerns.
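The SSE scanning loop described above has a familiar shape; a simplified sketch (the real handler also sizes the scanner buffer and routes errors into structured types):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseSSEData extracts the payloads carried on "data: " lines of an SSE
// stream, skipping blank separator lines - the same kind of loop a streaming
// handler runs over a response body reader.
func parseSSEData(stream string) []string {
	var payloads []string
	scanner := bufio.NewScanner(strings.NewReader(stream))
	for scanner.Scan() {
		line := scanner.Text()
		if data, ok := strings.CutPrefix(line, "data: "); ok {
			payloads = append(payloads, data)
		}
	}
	return payloads
}

func main() {
	sse := "data: {\"a\":1}\n\ndata: {\"b\":2}\n"
	fmt.Println(parseSSEData(sse))
}
```

Each extracted payload would then be JSON-decoded into a chunk struct before conversion.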
487-526: Non-streaming Responses path correctly reuses core request/response helpers

`GeminiProvider.Responses` now:

- Gates via `CheckOperationAllowed` for Gemini/Responses.
- Uses `ToGeminiResponsesRequest` + `completeRequest` to hit `:generateContent`.
- Converts via `ToResponsesBifrostResponsesResponse` and fills ExtraFields (provider, model, request type, latency, optional raw response).

Assuming `ToGeminiResponsesRequest` is fixed as noted in `responses.go` (nil-Params guard & error surfacing), this implementation looks correct.
528-809: Responses streaming integration aligns with the new Responses state machine

`ResponsesStream` and `HandleGeminiResponsesStream`:

- Share the same SSE plumbing as chat (headers, URL, error handling).
- Maintain `chunkIndex` and a global `sequenceNumber` passed into `ToBifrostResponsesStream`, so `sequence_number` is strictly increasing across all events.
- Track stream state via `GeminiResponsesStreamState` (acquired/released from a pool), keeping tool-call lifecycle and text items coherent.
- On normal EOF, finalize with `FinalizeGeminiResponsesStream`, which now guards against double `response.completed`/`output_item.done` via `HasEmittedCompleted`.

This addresses the earlier double-terminal-event issue and keeps the Responses streaming contract intact.
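The single-terminal-emission guard reduces to a one-bit latch; a minimal sketch with a hypothetical state type (not the actual `GeminiResponsesStreamState`):

```go
package main

import "fmt"

// streamState mimics the guard that ensures the terminal event is emitted
// exactly once, whether triggered by a finish reason or by finalization.
type streamState struct {
	hasEmittedCompleted bool
}

// emitCompleted returns true only the first time it is called; subsequent
// calls are no-ops, so both code paths can call it safely.
func (s *streamState) emitCompleted() bool {
	if s.hasEmittedCompleted {
		return false
	}
	s.hasEmittedCompleted = true
	return true
}

func main() {
	s := &streamState{}
	fmt.Println(s.emitCompleted()) // finish-reason path emits
	fmt.Println(s.emitCompleted()) // finalization path is suppressed
}
```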
872-1077: Speech streaming reuse of Gemini chunk processing is consistent

`SpeechStream` now uses `processGeminiStreamChunk` and `extractGeminiUsageMetadata` to:

- Parse SSE chunks into `GenerateContentResponse`.
- Aggregate inline audio data from `InlineData` parts.
- Attach token usage from the terminal chunk into `SpeechUsage`.
- Emit `SpeechStreamResponseTypeDelta` chunks with per-chunk latency and optional raw payload, followed by a single `SpeechStreamResponseTypeDone` with total latency and usage.

Network and API errors are handled analogously to chat/responses. The structure looks sound.
1345-1377: Shared helpers for stream chunk parsing and usage extraction are straightforward

`processGeminiStreamChunk` cleanly separates error-payload detection (the `"error"` key) from normal chunks and returns descriptive errors used upstream to differentiate Gemini API errors from decode issues. `extractGeminiUsageMetadata` centralizes token extraction for reuse across speech/transcription streaming logic. Both are minimal and correct.

core/providers/gemini/responses.go (8)
15-74: Provider→Responses request conversion is now safer and more complete

`ToBifrostResponsesRequest`:

- Parses `request.Model` into `(provider, model)` via `ParseModelString`.
- Always initializes `params` (with an `ExtraParams` map) via `convertGenerationConfigToResponsesParameters`, then overlays instructions, tools/toolChoice, safety settings, and cached_content.
- Populates `bifrostReq.Input` from `Contents` via `convertGeminiContentsToResponsesMessages`.

This removes previous nil-`Params` hazards and ensures all Gemini metadata (safety, cached content) is preserved through the Responses layer.
153-339: BifrostResponsesResponse → Gemini GenerateContentResponse mapping is thorough

`ToGeminiResponsesResponse`:

- Copies `Model` and `ID`/`CreatedAt` into `ModelVersion`, `ResponseID`, and `CreateTime`.
- Groups output messages by role into candidates (`role: "model"`/`"user"`), translating:
  - Text and content blocks via `convertContentBlockToGeminiPart`.
  - Function calls with proper `Name`, `ID` (from CallID), arguments JSON, and optional `ThoughtSignature` from adjacent reasoning messages.
  - Function call outputs into `FunctionResponse` with `Name`, `ID`, and structured `Response`.
  - Reasoning messages into `Thought` text parts.

Finish reasons are derived from `IncompleteDetails.Reason` (mapping `max_tokens` and `content_filter` to Gemini enums, defaulting to `STOP`/`OTHER`), and usage is reconstructed from `bifrostResp.Usage`. This looks correct and consistent with the other converters.
341-509: Responses streaming back-conversion covers the important event types

`ToGeminiResponsesStreamResponse`:

- Ignores purely lifecycle events (`ping`, `created`, `in_progress`, etc.) and only materializes events that correspond to Gemini content: text deltas, reasoning deltas, function calls, content parts, completions, failures, refusals.
- Builds a single `Candidate` with appropriate `Parts` and `FinishReason`, and restores basic usage metadata on `Completed` events (including cached tokens and audio tokens via `CandidatesTokensDetails`).

The behavior for `FunctionCallArgumentsDelta`/`ContentPartDone`/reasoning-done events (returning `nil` and letting the caller rely on the final `Completed` event) is reasonable, and should be enough for Vertex or other Gemini-facing integrations that only need a best-effort reverse mapping.
511-639: Streaming state object and pool correctly enforce single terminal emission

`GeminiResponsesStreamState` and its pool helpers:

- Track lifecycle flags (`HasEmittedCreated`, `HasEmittedInProgress`, `HasEmittedCompleted`) plus per-output indices, item/tool IDs, and argument buffers.
- Initialize and reset all maps/fields in `acquireGeminiResponsesStreamState` and `flush`.
- Use `HasEmittedCompleted` in `closeGeminiOpenItems` to guarantee `response.completed` (and associated `output_item.done`s) is only emitted once, whether via `FinishReason` on the last chunk or via finalization.

This resolves the earlier double-terminal-event issue while keeping state reusable.
641-722: Responses stream conversion cleanly orchestrates lifecycle and closing logic

`ToBifrostResponsesStream`:

- Issues `response.created` and `response.in_progress` once per stream (with a synthetic message ID and model/version).
- Routes each Gemini `Part` through specialized handlers (`processGeminiTextPart`, `processGeminiThoughtPart`, `processGeminiFunctionCallPart`, inline/file handlers), passing the running `sequenceNumber`.
- On `candidate.FinishReason != ""`, calls `closeGeminiOpenItems` with usage metadata, which in turn emits any remaining `output_item.done`s and a single `response.completed`.

Sequence numbers and output indices are computed in a way that remains monotonic and consistent with the Responses streaming schema. No correctness issues here.
1501-1616: Function call / response mapping preserves IDs and thought signatures

In `convertGeminiCandidatesToResponsesOutput`, function calls are converted with:

- Arguments marshalled to JSON.
- `CallID` preferring `part.FunctionCall.ID` and falling back to `Name` if the ID is blank.
- `Name` always set from `FunctionCall.Name`.
- An optional `ThoughtSignature` emitted as a sibling reasoning message with `EncryptedContent`.

Function responses use `extractFunctionResponseOutput` to build a textual payload and set `ResponsesToolMessage.CallID` from `part.FunctionResponse.ID`, with `Name` filled from `FunctionResponse.Name` when present.

This combination keeps Gemini-supplied call IDs intact and carries thought signatures through, which should avoid the lossy behavior you previously had.
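The sibling-reasoning-message idea can be sketched as follows (hypothetical message shapes chosen for illustration):

```go
package main

import "fmt"

// outputMessage is a simplified output item (hypothetical fields).
type outputMessage struct {
	Kind             string // "function_call" or "reasoning"
	CallID           string
	EncryptedContent string
}

// convertFunctionCall emits the tool-call message and, when a thought
// signature was attached, a sibling reasoning message that carries it as
// encrypted content so it survives the round trip.
func convertFunctionCall(callID, thoughtSignature string) []outputMessage {
	out := []outputMessage{{Kind: "function_call", CallID: callID}}
	if thoughtSignature != "" {
		out = append(out, outputMessage{
			Kind:             "reasoning",
			EncryptedContent: thoughtSignature,
		})
	}
	return out
}

func main() {
	fmt.Println(len(convertFunctionCall("c1", "sig-abc"))) // call plus reasoning sibling
	fmt.Println(len(convertFunctionCall("c2", "")))        // call only
}
```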
1928-2069: Responses→Gemini content conversion handles system, messages, and tools comprehensively

`convertResponsesMessagesToGeminiContents`:

- Separates system messages into a dedicated `systemInstruction` `Content`.
- Maps assistant/user roles into Gemini `Content.Role` (`"model"` vs `"user"`).
- Converts text/blocks via `convertContentBlockToGeminiPart`.
- Translates function calls into `FunctionCall` parts (including arguments and ID), and function call outputs into `FunctionResponse` parts with `Name`/`ID` and an `"output"` payload.
- Preserves thought signatures by looking ahead to a following reasoning message with `EncryptedContent`.

This gives you a full round-trip between Bifrost Responses and Gemini native contents with tool-calling and reasoning intact.
2071-2188: ContentBlock→Gemini Part conversion covers all supported modalities

`convertContentBlockToGeminiPart` correctly:

- Maps text, reasoning, and refusal blocks into `Text`/`Thought` parts (with refusals downgraded to plain text for Gemini).
- Converts base64 images and audio into `InlineData` with appropriate MIME types, handling media-type prefixes and defaults.
- Converts image/file URLs into `FileData` with sanitized URIs.
- Handles file blocks with either URL or inline data as `FileData`/`InlineData`.

Error cases are localized to malformed URLs/base64 and bubble up as `fmt.Errorf`, which is appropriate given how callers treat them. Overall this converter looks solid.
Force-pushed from ffd9580 to e04023a
Force-pushed from 461269a to 41ccb0f
Actionable comments posted: 1
♻️ Duplicate comments (1)
core/providers/gemini/responses.go (1)
114-123: Guard against nil `Params` before accessing `ExtraParams`.

Line 114 accesses `bifrostReq.Params.ExtraParams` without first checking if `bifrostReq.Params` is nil. Since `Params` is only set conditionally at line 87, it can be nil here, causing a panic.

Apply this diff:

```diff
-	if bifrostReq.Params.ExtraParams != nil {
+	if bifrostReq.Params != nil && bifrostReq.Params.ExtraParams != nil {
 		if safetySettings, ok := schemas.SafeExtractFromMap(bifrostReq.Params.ExtraParams, "safety_settings"); ok {
 			if settings, ok := safetySettings.([]SafetySetting); ok {
 				geminiReq.SafetySettings = settings
 			}
 		}
 		if cachedContent, ok := schemas.SafeExtractString(bifrostReq.Params.ExtraParams["cached_content"]); ok {
 			geminiReq.CachedContent = cachedContent
 		}
 	}
```
🧹 Nitpick comments (4)
core/providers/gemini/speech.go (1)
110-127: Consider returning an error when Input is nil or empty.

The nil check prevents a panic, which is good. However, when `bifrostReq.Input` is nil or `Input.Input` is empty, the function returns a `GeminiGenerationRequest` without any `Contents`, which may result in an invalid API request to Gemini.

Consider returning an error for invalid input to provide clearer feedback:

```diff
 // ToGeminiSpeechRequest converts a BifrostSpeechRequest to a GeminiGenerationRequest
 func ToGeminiSpeechRequest(bifrostReq *schemas.BifrostSpeechRequest) (*GeminiGenerationRequest, error) {
 	if bifrostReq == nil {
 		return nil, fmt.Errorf("bifrostReq is nil")
 	}
+	if bifrostReq.Input == nil || bifrostReq.Input.Input == "" {
+		return nil, fmt.Errorf("speech input is required")
+	}
 	// ... rest of validation ...
 	// Convert speech input to Gemini format
-	if bifrostReq.Input != nil && bifrostReq.Input.Input != "" {
+	geminiReq.Contents = []Content{
+		{
+			Parts: []*Part{
+				{
+					Text: bifrostReq.Input.Input,
+				},
+			},
+		},
+	}
```

core/providers/vertex/vertex.go (2)
951-953: Duplicate raw response assignment in Anthropic Responses path.

Lines 944-946 already set `RawResponse` when `ShouldSendBackRawResponse` returns true. Lines 951-953 duplicate this check and assignment, which is unnecessary.

```diff
 if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
 	response.ExtraFields.RawResponse = rawResponse
 }

 if request.Model != deployment {
 	response.ExtraFields.ModelDeployment = deployment
 }
-
-if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
-	response.ExtraFields.RawResponse = rawResponse
-}

 return response, nil
```
1030-1031: Remove debug comment.

The comment "print response body" appears to be a leftover debug note that should be removed.

```diff
-	// print response body
 	geminiResponse := &gemini.GenerateContentResponse{}
```

core/providers/gemini/gemini.go (1)
183-186: Duplicate raw response assignment in listModelsByKey.

Lines 179-181 already handle setting `RawResponse`. Lines 183-186 duplicate this check unnecessarily.

```diff
 // Set raw response if enabled
 if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
 	response.ExtraFields.RawResponse = rawResponse
 }
-
-if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
-	response.ExtraFields.RawResponse = rawResponse
-}

 return response, nil
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (1 hunks)
- core/providers/gemini/responses.go (9 hunks)
- core/providers/gemini/speech.go (1 hunks)
- core/providers/gemini/utils.go (7 hunks)
- core/providers/vertex/vertex.go (8 hunks)
- core/providers/vertex/vertex_test.go (1 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/tests/test_google.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- core/schemas/responses.go
- core/providers/gemini/gemini_test.go
- transports/bifrost-http/integrations/genai.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/vertex/vertex_test.go
- core/providers/gemini/speech.go
- core/schemas/utils.go
- core/providers/vertex/vertex.go
- tests/integrations/tests/test_google.py
- core/providers/gemini/chat.go
- core/providers/gemini/gemini.go
- core/providers/gemini/utils.go
- core/providers/gemini/responses.go
🧬 Code graph analysis (1)
core/providers/gemini/gemini.go (3)

- core/providers/gemini/chat.go (1): ToGeminiChatCompletionRequest (11-62)
- core/providers/utils/utils.go (2): ShouldSendBackRawResponse (533-538), GetPathFromContext (212-217)
- core/providers/gemini/responses.go (1): ToGeminiResponsesRequest (76-126)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (18)
core/providers/vertex/vertex_test.go (1)
26-27: Model naming updated to align with native Gemini detection.

The removal of the `google/` prefix aligns with the new `schemas.IsGeminiModel()` function, which uses `strings.Contains(model, "gemini")`. The Vertex provider's `getModelDeployment` method handles deployment resolution internally, so this change is appropriate.

core/schemas/utils.go (1)

1052-1054: LGTM - Consistent model detection pattern.

The `IsGeminiModel` function follows the same pattern as `IsAnthropicModel` and `IsMistralModel`, providing a simple and consistent way to detect Gemini models by substring matching.

core/providers/vertex/vertex.go (1)
1439-1455: LGTM - Vertex-specific field stripping for Gemini compatibility.

The `stripVertexGeminiUnsupportedFields` helper correctly removes the `ID` field from `FunctionCall` and `FunctionResponse` parts, which Vertex AI's Gemini API doesn't support. This is properly called in all Gemini paths (ChatCompletion, ChatCompletionStream, Responses, ResponsesStream).

core/providers/gemini/chat.go (3)
11-23: Signature simplification approved.

The removal of the `responseModalities` parameter from `ToGeminiChatCompletionRequest` simplifies the API. The empty slice passed to `convertParamsToGenerationConfig` maintains backward compatibility while preparing for future modality handling.

106-117: Tool call ID handling now consistent with streaming path.

The non-streaming path now properly checks `part.FunctionCall.ID` first before falling back to the function name, matching the streaming path logic. This addresses the previous review concern.
165-312: Well-structured streaming conversion.

The `ToBifrostChatCompletionStream` method cleanly handles:
- Text and thought/reasoning content separation
- Tool call extraction with ID preservation
- ThoughtSignature preservation for Gemini 3 Pro
- Finish reason conversion
- Usage metadata on final chunks
The early return for empty delta content (lines 283-287) prevents sending unnecessary empty chunks.
core/providers/gemini/utils.go (3)
194-218: Comprehensive finish reason mapping.

The `geminiFinishReasonToBifrost` map covers all Gemini finish reasons with sensible mappings:

- Safety-related reasons → `content_filter`
- Token limit → `length`
- Tool-related issues → `tool_calls`

One consideration: `FinishReasonMalformedFunctionCall` mapping to `tool_calls` might be semantically incorrect, since it indicates an error rather than successful tool use. However, this is a minor edge case. Consider whether `FinishReasonMalformedFunctionCall` should map to an error-specific reason or remain as `tool_calls`; the current mapping may cause confusion when debugging malformed function calls.
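A lookup table with a default fallback is all such a mapping needs; a sketch with hypothetical keys and values chosen for illustration (the real map lives in `utils.go`):

```go
package main

import "fmt"

// An illustrative Gemini-to-Bifrost finish-reason table (hypothetical
// constant spellings, not the repo's actual map).
var geminiFinishReasonToBifrost = map[string]string{
	"STOP":                    "stop",
	"MAX_TOKENS":              "length",
	"SAFETY":                  "content_filter",
	"RECITATION":              "content_filter",
	"MALFORMED_FUNCTION_CALL": "tool_calls",
}

// convertFinishReason falls back to "stop" for unmapped values, mirroring
// the sensible-default behavior the review describes.
func convertFinishReason(reason string) string {
	if mapped, ok := geminiFinishReasonToBifrost[reason]; ok {
		return mapped
	}
	return "stop"
}

func main() {
	fmt.Println(convertFinishReason("SAFETY"))
	fmt.Println(convertFinishReason("SOMETHING_NEW"))
}
```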
262-293: Good separation of usage metadata conversion.

Having separate functions for chat (`convertGeminiUsageMetadataToChatUsage`) and responses (`convertGeminiUsageMetadataToResponsesUsage`) usage metadata is appropriate given the different output structures required by each API.

932-954: Robust function response output extraction.

The `extractFunctionResponseOutput` function properly:

- Handles nil cases
- Tries to extract the `output` key first (common convention)
- Falls back to marshaling the entire response
- Returns an empty string on failure (safe fallback)
core/providers/gemini/gemini.go (3)
304-496: Well-implemented streaming handler for chat completions.

The `HandleGeminiChatCompletionStream` function properly handles:
- SSE parsing with appropriate buffer sizing (10MB max)
- Error conditions (context cancellation, timeouts, HTTP errors, API errors)
- Chunk indexing and latency tracking
- Response ID and model propagation
- Post-response conversion hook
- Final chunk detection via finish reason
583-820: Responses stream handler with proper lifecycle management.

The `HandleGeminiResponsesStream` implementation correctly manages:

- Stream state via `acquireGeminiResponsesStreamState`/`releaseGeminiResponsesStreamState` (object pooling)
- Usage metadata tracking across chunks
- Sequence numbering for responses events
- Finalization via `FinalizeGeminiResponsesStream`
- Multiple responses per SSE event handling (lines 728-774)

The pattern of emitting final events when the scanner completes without error (lines 781-816) ensures proper stream completion even if no explicit "completed" event is received.
501-504: Correct provider in operation check.

The `CheckOperationAllowed` call now correctly uses `schemas.Gemini` instead of the previously incorrect `schemas.Anthropic`. This addresses the past review concern.

core/providers/gemini/responses.go (6)
15-74: LGTM! Params initialization properly addresses previous nil pointer concerns.

The defensive initialization at lines 28-33 ensures `params` is never nil before field assignments, correctly resolving the past review feedback.

228-253: LGTM! Function call ID preservation correctly implemented.

Lines 238-239 properly preserve the `CallID` as `FunctionCall.ID`, keeping IDs distinct through round-trips as intended by the previous fix.

511-639: LGTM! Stream state management with pooling is correctly implemented.

The addition of `HasEmittedCompleted` (line 516) and proper initialization/reset in the pool lifecycle (lines 589, 634) correctly address the double-emission concern from previous reviews.

1166-1222: LGTM! Double-emission prevention correctly implemented.

The guard at lines 1168-1170 and flag setting at line 1214 ensure `response.completed` and `output_item.done` events are emitted exactly once, resolving the previous issue.

1558-1574: LGTM! Function call ID handling correctly preserves Gemini IDs.

Lines 1558-1567 properly use `part.FunctionCall.ID` for `CallID` (with Name as fallback) and preserve the function name separately, maintaining ID distinctiveness across conversions.

2083-2097: LGTM! New reasoning and refusal content types properly handled.

Lines 2083-2089 correctly map reasoning content to Gemini's `Thought` flag, and lines 2091-2097 appropriately handle refusals as text content for Gemini compatibility.
Actionable comments posted: 1
🧹 Nitpick comments (1)
core/providers/vertex/vertex.go (1)
1030-1031: Remove debug comment.

The comment "print response body" appears to be leftover from debugging.

```diff
-	// print response body
 	geminiResponse := &gemini.GenerateContentResponse{}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (1 hunks)
- core/providers/gemini/responses.go (9 hunks)
- core/providers/gemini/speech.go (1 hunks)
- core/providers/gemini/utils.go (7 hunks)
- core/providers/vertex/vertex.go (8 hunks)
- core/providers/vertex/vertex_test.go (1 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/tests/test_google.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/gemini/speech.go
- core/schemas/responses.go
- core/schemas/utils.go
- core/providers/gemini/gemini_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- transports/bifrost-http/integrations/genai.go
- tests/integrations/tests/test_google.py
- core/providers/gemini/chat.go
- core/providers/vertex/vertex.go
- core/providers/gemini/utils.go
- core/providers/vertex/vertex_test.go
- core/providers/gemini/gemini.go
- core/providers/gemini/responses.go
🧬 Code graph analysis (5)
transports/bifrost-http/integrations/genai.go (6)
- core/schemas/bifrost.go (1): ResponsesRequest (91-91)
- transports/bifrost-http/handlers/inference.go (1): ResponsesRequest (257-261)
- transports/bifrost-http/integrations/router.go (3): EmbeddingResponseConverter (102-102), ResponsesResponseConverter (98-98), ResponsesStreamResponseConverter (123-123)
- core/providers/gemini/embedding.go (1): ToGeminiEmbeddingResponse (57-100)
- core/schemas/responses.go (2): BifrostResponsesResponse (45-84), BifrostResponsesStreamResponse (1430-1469)
- core/providers/gemini/responses.go (2): ToGeminiResponsesResponse (153-339), ToGeminiResponsesStreamResponse (341-509)
tests/integrations/tests/test_google.py (1)
- tests/integrations/tests/utils/config_loader.py (2): get_model (140-161), get_model (443-445)
core/providers/gemini/chat.go (3)
- core/schemas/chatcompletions.go (9): BifrostChatRequest (12-19), ChatAssistantMessageToolCall (686-692), ChatContentBlock (590-597), ChatContentBlockTypeText (582-582), ChatAssistantMessageToolCallFunction (695-698), ChatToolChoiceTypeFunction (373-373), ChatMessage (469-478), ChatMessageContent (524-527), ChatAssistantMessage (626-632)
- core/providers/gemini/types.go (8): GeminiGenerationRequest (55-73), GenerationConfig (631-697), Content (922-930), Type (778-778), Role (13-13), FinishReason (22-22), FinishReasonUnspecified (26-26), RoleModel (17-17)
- core/providers/gemini/utils.go (1): ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/gemini.go (7)
- core/providers/gemini/chat.go (1): ToGeminiChatCompletionRequest (11-62)
- core/schemas/bifrost.go (12): RequestType (83-83), ChatCompletionRequest (89-89), BifrostStream (321-328), BifrostError (356-365), Gemini (48-48), ChatCompletionStreamRequest (90-90), ModelProvider (32-32), ErrorField (374-381), BifrostErrorExtraFields (423-427), BifrostResponseExtraFields (286-296), ResponsesRequest (91-91), ResponsesStreamRequest (92-92)
- core/schemas/provider.go (2): Provider (282-309), PostHookRunner (279-279)
- core/providers/utils/utils.go (2): ShouldSendBackRawResponse (533-538), ReleaseStreamingResponse (818-825)
- core/utils.go (1): Ptr (51-53)
- core/schemas/responses.go (3): BifrostResponsesRequest (32-39), BifrostResponsesResponse (45-84), BifrostResponsesStreamResponse (1430-1469)
- core/providers/gemini/responses.go (1): ToGeminiResponsesRequest (76-126)
core/providers/gemini/responses.go (3)
- core/providers/gemini/types.go (11): GeminiGenerationRequest (55-73), ToolConfig (146-151), Part (936-960), Role (13-13), Content (922-930), Type (778-778), FunctionCall (1091-1101), FunctionResponse (1104-1126), FileData (1080-1088), Blob (963-971), Tool (600-627)
- core/schemas/responses.go (13): BifrostResponsesRequest (32-39), BifrostResponsesResponse (45-84), ResponsesToolMessage (465-485), ResponsesReasoning (734-737), ResponsesMessage (317-330), ResponsesMessageContent (342-347), ResponsesMessageContentBlock (402-414), ResponsesToolMessageOutputStruct (534-538), ResponsesInputMessageContentBlockImage (416-419), ResponsesInputMessageContentBlockAudio (427-430), ResponsesInputMessageContentBlockFile (421-425), ResponsesTool (1033-1048), ResponsesOutputMessageContentRefusal (461-463)
- core/schemas/utils.go (4): ParseModelString (23-36), SafeExtractFromMap (519-525), SafeExtractString (280-297), Ptr (16-18)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (32)
core/providers/vertex/vertex_test.go (1)
26-27: LGTM! Model identifiers correctly updated for the native Gemini API.

The removal of the `google/` prefix aligns with the native Gemini API format. This change is consistent with the broader refactoring to use native Gemini endpoints instead of OpenAI-compatible ones.

tests/integrations/tests/test_google.py (2)

507-516: Test correctly updated to reflect Gemini's error handling behavior.

The test now validates that Gemini handles invalid roles gracefully by returning a successful response rather than raising an exception. This aligns with the native Gemini API behavior.

518-566: Well-structured streaming test with proper chunk handling.

The streaming test correctly:

- Uses `generate_content_stream` for SSE streaming
- Iterates through chunks with defensive `hasattr` checks
- Accumulates content from both the candidates structure and the direct text attribute
- Validates both chunk count and content quality
core/providers/vertex/vertex.go (5)
310-326: LGTM! Gemini request conversion properly integrated.

The Gemini branch correctly:

- Converts Bifrost requests to native Gemini format
- Strips Vertex-unsupported fields (ID in function_call/function_response)
- Uses marshal/unmarshal for Vertex API compatibility

497-522: LGTM! Gemini response handling is consistent with other branches.

The response handling correctly:

- Parses the native Gemini response format
- Sets all ExtraFields (RequestType, Provider, ModelRequested, ModelDeployment, Latency)
- Handles raw request/response when configured

660-733: LGTM! Gemini streaming path properly implemented.

The streaming implementation correctly:

- Constructs URLs with the `alt=sse` parameter for SSE streaming
- Supports both API key (query string) and OAuth2 (header) authentication
- Delegates to the shared Gemini streaming handler

1181-1273: LGTM! ResponsesStream Gemini branch properly handles streaming.

The implementation correctly:

- Calls `stripVertexGeminiUnsupportedFields` to sanitize requests (line 1203)
- Uses consistent URL construction and auth handling
- Properly sets up the post-response converter for model metadata

1439-1455: LGTM! Helper function correctly strips Vertex-unsupported fields.

The `stripVertexGeminiUnsupportedFields` function cleanly removes ID fields from function_call and function_response parts that Vertex AI's Gemini API doesn't support.

transports/bifrost-http/integrations/genai.go (1)
49-75: LGTM! Route configuration correctly updated to use the Responses API.

The migration from Chat-based to Responses-based types aligns with the PR's objective to use the native Gemini format:

- Request conversion now uses `ToBifrostResponsesRequest()`
- Response converters properly use `ToGeminiResponsesResponse` and `ToGeminiResponsesStreamResponse`
- Type signatures match the new `BifrostResponsesResponse` and `BifrostResponsesStreamResponse` schemas

core/providers/gemini/chat.go (4)
10-23: LGTM! Simplified signature with reasonable defaults.The function signature simplification removes the
responseModalitiesparameter and defaults to an empty slice. This is appropriate since chat completion requests typically don't need explicit modality configuration—speech/transcription requests are handled through separate code paths.
77-157: LGTM! Non-streaming response conversion properly handles content and tool calls.The implementation correctly:
- Collects content blocks for text content
- Handles tool calls with proper ID fallback (prefers
FunctionCall.ID, falls back to name)- Extracts function response output to text blocks
- Builds a single assistant message with both content and tool calls
165-312: Well-structured streaming conversion with proper thought signature handling.The
ToBifrostChatCompletionStreammethod correctly:
- Returns a 3-tuple with response, error, and isLastChunk flag
- Separates thought content from regular text content
- Preserves thought_signature in ExtraContent for Gemini 3 Pro compatibility
- Only adds usage metadata on the final chunk
- Skips empty deltas to avoid noise in the stream (lines 283-287)
119-130: No action needed: `extractFunctionResponseOutput` is properly defined in `utils.go` (line 936) and handles multiple response formats: it extracts string values from the "output" field, falls back to JSON marshaling for map responses, and includes nil checks for safety.

core/providers/gemini/utils.go (6)
11-70: LGTM! Conversion logic correctly updated for Responses API.

The refactored method properly maps Gemini generation config to the ResponsesParameters structure. The mapping of Logprobs to TopLogProbs and the ExtraParams handling align with the new Responses-oriented schema.

73-95: LGTM! Standalone function improves modularity.

Converting from a method to a standalone function is a good refactor that makes the code more reusable and easier to test.

194-218: LGTM! Comprehensive finish reason mapping.

The finish reason mapping provides good coverage of Gemini-specific reasons and includes a sensible fallback for unmapped values.

233-293: LGTM! Thorough usage metadata conversion for both APIs.

Both converters properly handle nil cases, cached tokens, reasoning tokens, and modality-specific details. The separation between Chat and Responses usage structures is appropriate.

839-890: LGTM! Response format properly adapted for Responses API.

The function correctly returns the new ResponsesTextConfig structure with appropriate nesting for JSON schema configuration. The fallback to json_object mode when no schema is provided is sensible.

933-954: LGTM! Clean utility for function response extraction.

The function provides a sensible two-tier extraction strategy: first attempting to extract the "output" field, then falling back to marshaling the entire response. The nil checks are appropriate.
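The two-tier strategy described above can be sketched as follows. This is an illustrative reduction, not the actual `extractFunctionResponseOutput` from `utils.go`; the simplified function name and shape here are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractOutput sketches the two-tier strategy: tier 1 returns the "output"
// field when it is a string; tier 2 falls back to marshaling the whole
// response map as JSON. Nil input yields an empty string.
func extractOutput(response map[string]any) string {
	if response == nil {
		return ""
	}
	if out, ok := response["output"].(string); ok {
		return out
	}
	raw, err := json.Marshal(response)
	if err != nil {
		return ""
	}
	return string(raw)
}

func main() {
	fmt.Println(extractOutput(map[string]any{"output": "42"})) // tier 1
	fmt.Println(extractOutput(map[string]any{"status": "ok"})) // tier 2
}
```

The nil guard mirrors the safety property the review calls out: a missing function response degrades to an empty string rather than a panic.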
core/providers/gemini/gemini.go (5)
220-255: LGTM! ChatCompletion properly implemented.

The implementation correctly uses the native Gemini API format, populates response metadata, and handles raw response inclusion conditionally. The error handling and field population are appropriate.
257-302: LGTM! ChatCompletionStream correctly configured.

The streaming setup properly configures SSE headers, uses the correct Gemini streaming endpoint `:streamGenerateContent?alt=sse`, and passes all necessary parameters to the shared streaming handler.
304-496: LGTM! Comprehensive streaming implementation with robust error handling.

The streaming handler properly:
- Sets up SSE streaming with appropriate buffers
- Handles cancellation, timeouts, and HTTP errors
- Processes chunks with API error detection
- Tracks response metadata (ID, model version)
- Finalizes streams with proper latency calculation
- Handles scanner errors gracefully
The buffer sizes (1MB initial, 10MB max) are appropriate for handling large audio/image chunks.
498-537: LGTM! Responses method correctly implemented.

The implementation properly uses native Gemini format, converts responses appropriately, and populates all required metadata fields. The permission check now correctly uses `schemas.Gemini`.
539-820: LGTM! Responses streaming properly implements lifecycle with state management.

The streaming implementation correctly:
- Uses pooled stream state for efficient lifecycle tracking
- Processes chunks with state-aware conversion
- Handles multiple responses per event
- Detects completion via response type
- Finalizes streams with proper cleanup
- Releases resources in defer statements
The past review concern about double-emitting terminal events has been addressed with the `HasEmittedCompleted` flag in the stream state (lines 659-660, 783-816).

core/providers/gemini/responses.go (8)
15-74: LGTM! Nil safety properly addressed.

The conversion correctly initializes `Params` before any field access (lines 28-33), addressing the past review concern about potential nil pointer dereferences. All subsequent Params field assignments are now safe.

76-126: LGTM! ExtraParams nil check properly added.

The conversion now safely checks `bifrostReq.Params.ExtraParams != nil` at line 114 before accessing it, addressing the past review concern about potential nil pointer dereference.
153-339: LGTM! Comprehensive bidirectional conversion preserves IDs.

The conversion properly:
- Preserves function call IDs from CallID field (lines 238-240)
- Handles function responses with correct ID mapping (lines 274-276)
- Preserves thought signatures for Gemini 3 Pro compatibility (lines 247-249)
- Groups messages by role appropriately
- Converts usage metadata correctly
This addresses past review concerns about ID preservation.
511-639: LGTM! Stream state management with proper completion tracking.

The stream state properly:

- Uses pooling for efficient resource management
- Includes the `HasEmittedCompleted` flag to prevent double-emission (line 516)
- Clears all maps and resets flags in acquire/flush methods
- Initializes all fields to safe defaults
This addresses the past review concern about duplicate terminal events.
641-711: LGTM! Streaming conversion with proper lifecycle management.

The conversion correctly:
- Emits lifecycle events only once using state flags
- Processes all candidate parts
- Closes open items when finish reason is present
- Relies on closeGeminiOpenItems for completion guard
1166-1217: LGTM! Completion guard prevents duplicate terminal events.

The function properly:

- Guards against re-execution with the `HasEmittedCompleted` check (lines 1168-1170)
- Closes all open items (text and tool calls)
- Emits response.completed exactly once
- Sets the completion flag (line 1214)
This fully addresses the past review concern about double-emitting terminal events.
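Stripped to its essentials, the single-emission guard described above reduces to a check-then-set flag. The type and method names below mirror the review's description but are illustrative, not the actual bifrost stream state:

```go
package main

import "fmt"

// streamState sketches the completion-tracking flag the review describes.
type streamState struct {
	HasEmittedCompleted bool
}

// emitCompleted reports whether a "response.completed" event should be
// sent now; it returns true exactly once per stream, so the terminal
// event can never be emitted twice.
func (s *streamState) emitCompleted() bool {
	if s.HasEmittedCompleted {
		return false
	}
	s.HasEmittedCompleted = true
	return true
}

func main() {
	s := &streamState{}
	fmt.Println(s.emitCompleted()) // true: first call emits
	fmt.Println(s.emitCompleted()) // false: later calls are no-ops
}
```

Because the flag is cleared on acquire/flush, a pooled state object starts each stream able to emit exactly one terminal event again.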
1503-1729: LGTM! Function call conversion preserves IDs correctly.

The conversion properly:
- Extracts CallID with preference for ID field, fallback to Name (lines 1558-1561)
- Uses distinct variables for CallID and Name (lines 1565-1566)
- Extracts function response output using utility function (line 1593)
- Preserves thought signatures for compatibility (lines 1576-1589)
This addresses the past review concern about function call ID preservation and ensures round-trip consistency.
1929-2070: LGTM! Bidirectional conversion maintains ID consistency.

The conversion properly:
- Maps roles correctly between Responses and Gemini formats
- Preserves function call IDs from CallID field (lines 2019-2021)
- Handles thought signature preservation via lookahead (lines 2023-2031)
- Extracts function response names with proper fallback (lines 2044-2049)
- Maintains consistency with the reverse conversion
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/schemas/responses.go (1)
133-142: Avoid ambiguous JSON shape: enforce "Schema vs inline fields" exclusivity; also reconsider `*any`.

Right now `ResponsesTextConfigFormatJSONSchema` can serialize both `schema` (the new `Schema *any`) and `properties`/`type`/`required` (existing), which can create conflicting/invalid JSON for downstream providers. Consider:

- Making `Schema` a `map[string]any` or a `sonic.RawMessage`/`json.RawMessage`-like type (instead of `*any`), and
- Adding a `MarshalJSON` (or a validation helper) that errors when `Schema != nil` and any of `Type`/`Properties`/`Required`/`AdditionalProperties` are also set.

core/providers/gemini/speech.go (1)
103-121: Prefer returning an error for empty speech input (instead of sending an empty Gemini request).

As-is, `ToGeminiSpeechRequest` can return a request with `ResponseModalities=[AUDIO]` but no `Contents`, which is likely a provider error and harder to diagnose than a local validation error.
♻️ Duplicate comments (4)
tests/integrations/tests/test_google.py (2)
512-521: Test intent mismatch: the "error handling" test asserts success.

Either rename the test (and update its docstring) to reflect "graceful handling of alternate role content", or switch the assertions to validate an error path; right now it reads as contradictory.
889-966: Fix unused `test_config` args + verify ThinkingConfig API for google-genai==1.0.0.

- Rename to `_test_config` (or drop the arg) to satisfy Ruff ARG002.
- Please verify `types.ThinkingConfig(include_thoughts=..., thinking_budget=...)` and `usage_metadata.thoughts_token_count` are valid/returned under google-genai 1.0.0 and your chosen `get_model("google", "chat")` model; otherwise these tests will be flaky.

```diff
- def test_26_extended_thinking(self, google_client, test_config):
+ def test_26_extended_thinking(self, google_client, _test_config):
  ...
- def test_27_extended_thinking_streaming(self, google_client, test_config):
+ def test_27_extended_thinking_streaming(self, google_client, _test_config):
```

Also applies to: 966-1054

core/providers/vertex/vertex.go (1)
core/providers/vertex/vertex.go (1)
500-525: Gemini branches bypass context-aware raw-request/response toggles.

Both Gemini non-streaming branches call `HandleProviderResponse(..., provider.sendBackRawRequest, provider.sendBackRawResponse)` even though later they check `providerUtils.ShouldSendBackRawRequest/Response(...)`. This prevents per-request ctx overrides and can do unnecessary work.

```diff
- rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(resp.Body(), &geminiResponse, jsonBody, provider.sendBackRawRequest, provider.sendBackRawResponse)
+ rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(
+     resp.Body(), &geminiResponse, jsonBody,
+     providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+     providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+ )
```

Also applies to: 1012-1033
core/providers/gemini/responses.go (1)
428-433: Fix field-path check for encrypted reasoning content (likely wrong / possibly won't compile).

The condition checks `bifrostResp.Item.EncryptedContent` but then dereferences `bifrostResp.Item.ResponsesReasoning.EncryptedContent`. These need to match.

```diff
- if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
+ if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
      candidate.Content.Parts = append(candidate.Content.Parts, &Part{
          ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
      })
  }
```
🧹 Nitpick comments (6)
tests/integrations/tests/utils/common.py (1)
225-246: Make reasoning prompts less brittle (avoid demanding "step-by-step" if not strictly needed).

If the tests only need "reasoning tokens present + answer quality", consider rewording to "show your calculations / explain briefly" to reduce variability across Gemini/Vertex routing and future model changes.
core/providers/gemini/speech.go (1)
142-155: Guard `ctx.Value(BifrostContextKeyResponseFormat).(string)` to avoid panics.

Even though the change is formatting-only here, this block still assumes the context key exists and is a string; consider a safe default to `"wav"` when unset.

core/providers/vertex/vertex.go (1)
283-286: Deduplicate `google/` prefix stripping (one helper).

This same normalization appears in 4 places; a tiny helper (e.g., `normalizeVertexGeminiDeployment(string) string`) would reduce drift.

Also applies to: 578-582, 834-837, 1065-1068
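The suggested helper could look like this. The name `normalizeVertexGeminiDeployment` comes from the suggestion above and is hypothetical; it does not exist in the codebase yet:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeVertexGeminiDeployment centralizes the "google/" prefix
// stripping that is currently repeated at four call sites. TrimPrefix
// is a no-op when the prefix is absent, so bare names pass through.
func normalizeVertexGeminiDeployment(deployment string) string {
	return strings.TrimPrefix(deployment, "google/")
}

func main() {
	fmt.Println(normalizeVertexGeminiDeployment("google/gemini-2.0-flash-001")) // prefix stripped
	fmt.Println(normalizeVertexGeminiDeployment("gemini-2.5-flash"))            // already bare, unchanged
}
```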
core/providers/gemini/gemini.go (1)
216-251: Consider aligning `ChatCompletion` request-body conversion with the explicit nil-check pattern.

`ChatCompletionStream`, `Responses`, and `ResponsesStream` all guard against `reqBody == nil`, but `ChatCompletion` does not. It's probably safe (the converter only returns nil when `request` is nil), but consistency would reduce future footguns when the converter grows errorable paths.

core/providers/gemini/utils.go (1)
11-83: Utilities look solid; reasoning/usage mappings are coherent.

The ThinkingConfig→Reasoning mapping, finish-reason normalization, and usage conversions (including cached/reasoning/audio tokens) look consistent with the new Gemini native + Responses surfaces.
Also applies to: 208-233, 247-311, 873-924, 967-988
core/providers/gemini/responses.go (1)
1262-1308: `output_text.done` emits an empty string; confirm this matches your stream contract.

`closeGeminiTextItem` sends `output_text.done` with `Text: ""`, but you never accumulate the full text in state. If consumers expect the "done" event to include the finalized text (common in evented protocols), this will be surprising.

Options:

- track an accumulated text buffer in `GeminiResponsesStreamState` and emit it on `output_text.done`, or
- explicitly document that `output_text.done` is a pure lifecycle marker and the full text must be reconstructed from deltas.
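The first option can be sketched like this. Types are illustrative; the real `GeminiResponsesStreamState` is not reproduced here:

```go
package main

import (
	"fmt"
	"strings"
)

// textAccumulator sketches carrying the full text into the done event.
type textAccumulator struct {
	buf strings.Builder
}

// onDelta appends one output_text.delta payload.
func (a *textAccumulator) onDelta(delta string) {
	a.buf.WriteString(delta)
}

// onDone returns the finalized text to attach to output_text.done.
func (a *textAccumulator) onDone() string {
	return a.buf.String()
}

func main() {
	a := &textAccumulator{}
	for _, d := range []string{"Hel", "lo, ", "world"} {
		a.onDelta(d)
	}
	fmt.Println(a.onDone()) // prints "Hello, world"
}
```

The cost is one growing buffer per open text item, which a pooled state would need to reset on flush.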
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/embedding.go (0 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (2 hunks)
- core/providers/gemini/responses.go (14 hunks)
- core/providers/gemini/speech.go (2 hunks)
- core/providers/gemini/transcription.go (0 hunks)
- core/providers/gemini/types.go (1 hunks)
- core/providers/gemini/utils.go (8 hunks)
- core/providers/vertex/vertex.go (13 hunks)
- core/providers/vertex/vertex_test.go (2 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (1 hunks)
- tests/integrations/tests/test_google.py (4 hunks)
- tests/integrations/tests/utils/common.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
- core/providers/gemini/embedding.go
- core/providers/gemini/transcription.go
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/gemini/types.go
- tests/integrations/config.yml
- core/schemas/utils.go
- core/providers/vertex/vertex_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/speech.go
- core/providers/gemini/gemini_test.go
- core/providers/vertex/vertex.go
- core/schemas/responses.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/gemini.go
- tests/integrations/tests/utils/common.py
- tests/integrations/tests/test_google.py
- core/providers/gemini/utils.go
- core/providers/gemini/chat.go
- core/providers/gemini/responses.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/gemini/speech.go
- core/providers/gemini/gemini_test.go
- core/providers/vertex/vertex.go
- core/schemas/responses.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/gemini.go
- core/providers/gemini/utils.go
- core/providers/gemini/chat.go
- core/providers/gemini/responses.go
🧬 Code graph analysis (5)
core/providers/vertex/vertex.go (5)
core/schemas/utils.go (1)
- `IsGeminiModel` (1052-1054)

core/providers/gemini/chat.go (1)
- `ToGeminiChatCompletionRequest` (12-63)

core/providers/gemini/types.go (1)
- `GeminiGenerationRequest` (55-73)

core/providers/utils/utils.go (2)
- `HandleProviderResponse` (359-445)
- `CheckContextAndGetRequestBody` (257-275)

core/providers/gemini/gemini.go (2)
- `HandleGeminiChatCompletionStream` (304-495)
- `HandleGeminiResponsesStream` (597-834)
core/schemas/responses.go (1)
core/providers/gemini/types.go (1)
- `Schema` (724-775)
core/providers/gemini/gemini.go (6)
core/providers/gemini/chat.go (1)
- `ToGeminiChatCompletionRequest` (12-63)

core/schemas/bifrost.go (9)
- `RequestType` (83-83)
- `ChatCompletionRequest` (89-89)
- `BifrostStream` (323-330)
- `BifrostError` (358-367)
- `Gemini` (48-48)
- `ModelProvider` (32-32)
- `ErrorField` (376-383)
- `BifrostErrorExtraFields` (425-429)
- `ResponsesRequest` (91-91)

core/schemas/provider.go (2)
- `Provider` (282-309)
- `PostHookRunner` (279-279)

core/schemas/utils.go (1)
- `Ptr` (16-18)

core/utils.go (1)
- `Ptr` (51-53)

core/providers/gemini/responses.go (1)
- `ToGeminiResponsesRequest` (71-121)
tests/integrations/tests/test_google.py (2)
tests/integrations/tests/utils/config_loader.py (2)
- `get_model` (140-161)
- `get_model` (443-445)

tests/integrations/tests/utils/common.py (1)
- `skip_if_no_api_key` (1625-1636)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (13)
- `GeminiGenerationRequest` (55-73)
- `ToolConfig` (146-151)
- `FunctionCallingConfig` (92-99)
- `GenerationConfig` (631-697)
- `SafetySetting` (81-89)
- `GenerateContentResponse` (1298-1314)
- `Candidate` (1223-1249)
- `Content` (933-941)
- `FunctionCall` (1102-1112)
- `FunctionResponse` (1115-1137)
- `FinishReason` (22-22)
- `GenerateContentResponseUsageMetadata` (1270-1295)
- `Tool` (600-627)

core/schemas/utils.go (3)
- `SafeExtractFromMap` (519-525)
- `SafeExtractString` (280-297)
- `Ptr` (16-18)

core/schemas/models.go (1)
- `Model` (109-129)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py
890-890: Unused method argument: test_config
(ARG002)
966-966: Unused method argument: test_config
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (7)
core/providers/gemini/gemini_test.go (1)
35-57: Reasoning model switch to a preview SKU may make these tests flaky across environments.

If CI/dev environments don't all have access to `"gemini-3-pro-preview"`, consider gating the `Reasoning` scenario/model behind an env var (or falling back to a stable reasoning-capable model when unavailable), so the stack stays green.

tests/integrations/tests/test_google.py (1)
86-89: Gemini reasoning prompt wiring looks fine.

core/providers/vertex/vertex.go (3)
20-24: Gemini native integration import is expected for this stack.
314-329: Vertex Gemini native request shaping/endpoint selection looks consistent (`generateContent` on v1).

Also applies to: 403-408
1392-1408: > Likely an incorrect or invalid review comment.core/providers/gemini/gemini.go (1)
369-395: The SSE parsing implementation is correct for Gemini's documented format.

Gemini's official API documentation confirms that each SSE event contains exactly one `data:` line with a single JSON object (`GenerateContentResponse`). The server does not send multi-line events, `event:` fields, or `id:` fields. The code's line-by-line parsing with `data:` prefix extraction correctly implements Gemini's SSE contract.
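That contract keeps the parser trivial. A minimal sketch of the line-by-line approach (illustrative only, not the bifrost handler; the function name is an assumption):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// collectSSEPayloads scans an SSE stream where every event is a single
// "data: <json>" line, as Gemini's docs describe, and returns the raw
// JSON payloads. Blank separator lines and any other fields are ignored.
func collectSSEPayloads(stream string) []string {
	var payloads []string
	scanner := bufio.NewScanner(strings.NewReader(stream))
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "data: ") {
			payloads = append(payloads, strings.TrimPrefix(line, "data: "))
		}
	}
	return payloads
}

func main() {
	sse := "data: {\"a\":1}\n\ndata: {\"b\":2}\n"
	fmt.Println(collectSSEPayloads(sse))
}
```

In the real handler each payload would then be unmarshaled into a `GenerateContentResponse`; a larger scanner buffer (as the review notes, 1MB/10MB) matters for audio/image chunks.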
35-56: No issues found with callback ordering or nil payload handling.

The `PreCallback` (`extractAndSetModelFromURL`) executes before `RequestConverter` (router.go lines 338-346), properly populating `geminiReq.Model`, `geminiReq.Stream`, `geminiReq.IsEmbedding`, `geminiReq.IsSpeech`, and `geminiReq.IsTranscription` before the routing logic branches on these flags.

For streaming, the router explicitly skips nil payloads via the `continue` statement (router.go lines 809-812) when `convertedResponse == nil && err == nil`, ensuring no blank SSE frames are emitted for skipped lifecycle events.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/schemas/utils.go (1)
1047-1055: Add a proper doc comment + make `IsGeminiModel` case-insensitive.

As an exported helper, it should have a comment, and `strings.Contains(model, "gemini")` is brittle if callers pass `Gemini-...`.

```diff
  // IsMistralModel checks if the model is a Mistral or Codestral model.
  func IsMistralModel(model string) bool {
      return strings.Contains(model, "mistral") || strings.Contains(model, "codestral")
  }
 -func IsGeminiModel(model string) bool {
 -    return strings.Contains(model, "gemini")
 -}
 +// IsGeminiModel checks if the model id refers to a Gemini model.
 +func IsGeminiModel(model string) bool {
 +    return strings.Contains(strings.ToLower(model), "gemini")
 +}
```

core/providers/gemini/utils.go (1)
576-610: If `thought_signature` is base64 in ExtraContent, decode it here (keep fallback for older payloads).

This should be updated together with the streaming fix in `core/providers/gemini/chat.go`.

```diff
+import "encoding/base64"
 ...
 if toolCall.ExtraContent != nil {
     if googleData, ok := toolCall.ExtraContent["google"].(map[string]interface{}); ok {
         if thoughtSig, ok := googleData["thought_signature"].(string); ok {
-            part.ThoughtSignature = []byte(thoughtSig)
+            if decoded, err := base64.StdEncoding.DecodeString(thoughtSig); err == nil {
+                part.ThoughtSignature = decoded
+            } else {
+                // Backward-compat: treat as raw string bytes if not base64
+                part.ThoughtSignature = []byte(thoughtSig)
+            }
         }
     }
 }
```
♻️ Duplicate comments (5)
tests/integrations/tests/test_google.py (2)
512-521: Rename this test (or change assertions): it's not "error handling" anymore.

It still reads like an error-path test but now asserts a successful response. This was already called out in prior review feedback.
889-964: Extended-thinking test: fix unused arg + confirm SDK/model support for `thinking_config` and `thoughts_token_count`.

- Ruff flags `test_config` as unused (same pattern noted earlier).
- Please verify the repo's pinned `google-genai` SDK version supports `types.ThinkingConfig(...)` and that the configured model reliably returns `usage_metadata.thoughts_token_count`.

```diff
- def test_26_extended_thinking(self, google_client, test_config):
+ def test_26_extended_thinking(self, google_client, _test_config):
```
190-193: Stream “last chunk” detection should also consider usage metadata.
Right now `isLastChunk` depends only on `FinishReason`, so you can miss emitting the terminal chunk/usage.

```diff
- isLastChunk := candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified
+ isLastChunk := (candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified) ||
+     response.UsageMetadata != nil
```

Also applies to: 333-337
259-266: Critical: don’t serializethought_signatureas raw[]byte→stringin streaming tool calls.
This can be invalid UTF-8 and will get corrupted on JSON marshal. Base64 it (and decode when converting back to Gemini requests).- "thought_signature": string(part.ThoughtSignature), + "thought_signature": base64.StdEncoding.EncodeToString(part.ThoughtSignature),core/providers/vertex/vertex.go (1)
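A quick round-trip shows why base64 is the safe encoding for arbitrary signature bytes (illustrative sketch, not the bifrost code; the helper names are assumptions):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// A thought signature is arbitrary bytes; string(sig) can be invalid
// UTF-8 that JSON marshaling mangles, while base64 round-trips losslessly.
func encodeSignature(sig []byte) string {
	return base64.StdEncoding.EncodeToString(sig)
}

func decodeSignature(encoded string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(encoded)
}

func main() {
	sig := []byte{0xff, 0xfe, 0x00, 0x42} // not valid UTF-8
	encoded := encodeSignature(sig)
	decoded, err := decodeSignature(encoded)
	fmt.Println(encoded, len(decoded), err)
}
```

The decode side is where the matching fix in `utils.go` applies: decode base64 first, and only fall back to raw bytes for older payloads.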
500-525: Use context-aware raw flags in Gemini ChatCompletion response handling (currently bypassed).

`HandleProviderResponse(..., provider.sendBackRawRequest, provider.sendBackRawResponse)` ignores per-request overrides (unlike other branches in this file).

```diff
- rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(resp.Body(), &geminiResponse, jsonBody, provider.sendBackRawRequest, provider.sendBackRawResponse)
+ rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(
+     resp.Body(),
+     &geminiResponse,
+     jsonBody,
+     providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+     providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+ )
```
🧹 Nitpick comments (3)
tests/integrations/tests/utils/common.py (1)
225-247: New Gemini reasoning prompts are OK; keep an eye on flakiness/cost.

These are long, multi-step prompts; if CI starts timing out, consider shortening them or making "reasoning intensity" configurable (env-driven) in the tests that consume them.
core/providers/vertex/vertex.go (1)
1392-1408: Optional: nil-guard `stripVertexGeminiUnsupportedFields` for defensive safety.

This helper can safely no-op on nil to reduce future call-site risk.

```diff
 func stripVertexGeminiUnsupportedFields(requestBody *gemini.GeminiGenerationRequest) {
+    if requestBody == nil {
+        return
+    }
     for _, content := range requestBody.Contents {
         for _, part := range content.Parts {
             ...
         }
     }
 }
```

core/providers/gemini/gemini.go (1)
225-251: Add explicit nil-check for non-streaming ChatCompletion request conversion (parity with stream).

```diff
 jsonData, err := providerUtils.CheckContextAndGetRequestBody(
     ctx,
     request,
-    func() (any, error) { return ToGeminiChatCompletionRequest(request), nil },
+    func() (any, error) {
+        reqBody := ToGeminiChatCompletionRequest(request)
+        if reqBody == nil {
+            return nil, fmt.Errorf("chat completion request is not provided or could not be converted to Gemini format")
+        }
+        return reqBody, nil
+    },
     provider.GetProviderKey(),
 )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/embedding.go (0 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (2 hunks)
- core/providers/gemini/responses.go (14 hunks)
- core/providers/gemini/speech.go (2 hunks)
- core/providers/gemini/transcription.go (0 hunks)
- core/providers/gemini/types.go (1 hunks)
- core/providers/gemini/utils.go (8 hunks)
- core/providers/vertex/vertex.go (13 hunks)
- core/providers/vertex/vertex_test.go (2 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (1 hunks)
- tests/integrations/tests/test_google.py (4 hunks)
- tests/integrations/tests/utils/common.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
- core/providers/gemini/transcription.go
- core/providers/gemini/embedding.go
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/gemini/gemini_test.go
- core/providers/gemini/types.go
- core/providers/gemini/speech.go
- core/schemas/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/vertex/vertex_test.go
- core/providers/vertex/vertex.go
- core/schemas/utils.go
- tests/integrations/tests/test_google.py
- tests/integrations/config.yml
- core/providers/gemini/chat.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/utils.go
- tests/integrations/tests/utils/common.py
- core/providers/gemini/responses.go
- core/providers/gemini/gemini.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/vertex/vertex_test.go
- core/providers/vertex/vertex.go
- core/schemas/utils.go
- core/providers/gemini/chat.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/utils.go
- core/providers/gemini/responses.go
- core/providers/gemini/gemini.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.
Applied to files:
transports/bifrost-http/integrations/genai.go
🧬 Code graph analysis (4)
tests/integrations/tests/test_google.py (2)
tests/integrations/tests/utils/config_loader.py (2)
- `get_model` (140-161)
- `get_model` (443-445)

tests/integrations/tests/utils/common.py (1)
- `skip_if_no_api_key` (1625-1636)
core/providers/gemini/chat.go (4)
core/schemas/chatcompletions.go (13)
- `BifrostChatRequest` (12-19)
- `ChatAssistantMessageToolCall` (686-692)
- `ChatContentBlock` (590-597)
- `ChatReasoningDetails` (723-730)
- `ChatAssistantMessageToolCallFunction` (695-698)
- `ChatToolChoiceTypeFunction` (373-373)
- `ChatMessage` (469-478)
- `ChatAssistantMessage` (626-632)
- `BifrostResponseChoice` (704-712)
- `ChatNonStreamResponseChoice` (745-748)
- `BifrostChatResponse` (26-41)
- `ChatStreamResponseChoiceDelta` (756-763)
- `ChatStreamResponseChoice` (751-753)

core/providers/gemini/types.go (6)
- `GeminiGenerationRequest` (55-73)
- `GenerationConfig` (631-697)
- `Content` (933-941)
- `Type` (778-778)
- `Role` (13-13)
- `FinishReason` (22-22)

core/schemas/utils.go (1)
- `Ptr` (16-18)

core/providers/gemini/utils.go (1)
- `ConvertGeminiFinishReasonToBifrost` (227-232)
core/providers/gemini/utils.go (4)
core/providers/gemini/types.go (22)
- `ThinkingLevel` (915-915)
- `ThinkingLevelUnspecified` (918-918)
- `ThinkingLevelLow` (919-919)
- `ThinkingLevelHigh` (920-920)
- `Type` (778-778)
- `Schema` (724-775)
- `FinishReason` (22-22)
- `FinishReasonStop` (28-28)
- `FinishReasonMaxTokens` (30-30)
- `FinishReasonSafety` (33-33)
- `FinishReasonRecitation` (35-35)
- `FinishReasonLanguage` (37-37)
- `FinishReasonOther` (39-39)
- `FinishReasonBlocklist` (41-41)
- `FinishReasonProhibitedContent` (43-43)
- `FinishReasonSPII` (46-46)
- `FinishReasonMalformedFunctionCall` (48-48)
- `FinishReasonImageSafety` (50-50)
- `FinishReasonUnexpectedToolCall` (52-52)
- `GenerateContentResponseUsageMetadata` (1270-1295)
- `Modality` (706-706)
- `FunctionResponse` (1115-1137)

core/schemas/responses.go (5)
- `ResponsesParameters` (87-114)
- `ResponsesParametersReasoning` (238-243)
- `ResponsesTextConfig` (120-123)
- `ResponsesTextConfigFormat` (125-130)
- `ResponsesTextConfigFormatJSONSchema` (133-142)

core/schemas/utils.go (1)
- `Ptr` (16-18)

core/utils.go (1)
- `Ptr` (51-53)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (9)
- `GeminiGenerationRequest` (55-73)
- `ToolConfig` (146-151)
- `FunctionCallingConfig` (92-99)
- `GenerationConfig` (631-697)
- `SafetySetting` (81-89)
- `Content` (933-941)
- `FunctionCall` (1102-1112)
- `FunctionResponse` (1115-1137)
- `Tool` (600-627)

core/schemas/utils.go (4)
- `ParseModelString` (23-36)
- `SafeExtractFromMap` (519-525)
- `SafeExtractString` (280-297)
- `Ptr` (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py
890-890: Unused method argument: test_config
(ARG002)
966-966: Unused method argument: test_config
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (12)
tests/integrations/config.yml (1)
62-75: Gemini model bump looks consistent; please sanity-check cross-stack expectations.

Switching chat/vision/tools/streaming to `gemini-2.5-flash` is straightforward and matches the PR's "native Gemini + streaming-first" direction. Just make sure other PRs in the Graphite stack (and any pinned fixtures) aren't still asserting `gemini-2.0-flash*` model ids for these scenarios.

core/providers/vertex/vertex_test.go (2)
25-49: Verify ChatModel/VisionModel naming consistency for Vertex (prefix vs no prefix).
ChatModel is still google/gemini-2.0-flash-001 while VisionModel is now gemini-2.0-flash-001. If Vertex parsing now expects the native Gemini format for Gemini models, consider aligning both (or add a short comment why they must differ).
37-41: Scenario gating for multiple tool calls is clear. Setting MultipleToolCalls: false matches the documented constraint and should reduce flaky failures in this suite.
tests/integrations/tests/utils/common.py (1)
248-256: IMAGE_URL_MESSAGES structure change is the right direction. Splitting text and image into separate content blocks is more interoperable across providers/converters than a combined structure.
tests/integrations/tests/test_google.py (2)
37-38: Test plan doc update matches the new extended-thinking coverage.
86-89: Imports for Gemini-specific prompts are fine; ensure they’re only used in Google-keyed tests.
core/providers/vertex/vertex.go (1)
282-286: Good normalization of the google/ deployment prefix.
This makes Vertex deployments consistent with Gemini model naming expectations.
core/providers/gemini/chat.go (1)
78-174: Non-stream Gemini→Bifrost assembly looks solid (tool IDs, args serialization, thought signature base64).
Good improvements to preserve IDs and keep signature JSON-safe.
transports/bifrost-http/integrations/genai.go (2)
35-84: GenAI route refactor to Responses is consistent and the nil-skip for lifecycle events is sensible.
72-80: Verify SSE writer behavior when ResponsesStreamResponseConverter returns an empty event type.
Given the transport-layer guidance to use typed SSE events for Responses-style streams, confirm that "" doesn’t accidentally trigger legacy termination behavior (e.g., [DONE]) or omit required event names. Check the SSE writer implementation to ensure empty event types are handled correctly and align with stream event naming conventions.
core/providers/gemini/utils.go (1)
11-84: ResponsesParameters mapping (incl. ThinkingConfig) looks good.
core/providers/gemini/gemini.go (1)
369-485: SSE parsing is correct for Gemini's format. Gemini sends complete JSON data in individual SSE events (one data: line per complete JSON payload), so the line-oriented scanner correctly processes each chunk without needing to buffer across multiple lines.
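The line-oriented approach the comment describes can be sketched in isolation. This is a minimal illustration, not Bifrost's actual parser; the extractSSEData helper is hypothetical and assumes each data: line carries one complete JSON payload, as Gemini's streaming endpoint emits.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// extractSSEData scans an SSE stream line by line and returns the
// payloads carried on "data:" lines. No cross-line buffering is done,
// matching the assumption that each data: line is a complete payload.
func extractSSEData(stream string) []string {
	var payloads []string
	scanner := bufio.NewScanner(strings.NewReader(stream))
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "data:") {
			payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
			if payload != "" {
				payloads = append(payloads, payload)
			}
		}
	}
	return payloads
}

func main() {
	stream := "data: {\"a\":1}\n\ndata: {\"b\":2}\n"
	fmt.Println(extractSSEData(stream))
}
```

If a provider ever split one JSON payload across multiple data: lines, this scanner would need an accumulation step; per the comment above, Gemini does not.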
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/providers/vertex/vertex.go (1)
1272-1275: Add global region handling to embedding endpoint to match other methods. The embedding endpoint (lines 1273-1274) fails to special-case the global region, unlike all other methods in the file. When region is "global", the current code produces an invalid URL: https://global-aiplatform.googleapis.com/... instead of https://aiplatform.googleapis.com/... with locations/global.
- url := fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:predict",
-     key.VertexKeyConfig.Region, key.VertexKeyConfig.ProjectID, key.VertexKeyConfig.Region, deployment)
+ host := "aiplatform.googleapis.com"
+ location := region
+ if region != "global" {
+     host = fmt.Sprintf("%s-aiplatform.googleapis.com", region)
+ } else {
+     location = "global"
+ }
+ url := fmt.Sprintf("https://%s/v1/projects/%s/locations/%s/publishers/google/models/%s:predict",
+     host, projectID, location, deployment)
core/providers/gemini/utils.go (1)
11-84: ExtraParams stores penalty pointers; this can break downstream SafeExtractFloat64.
presence_penalty and frequency_penalty are stored as *float64 (Lines 54-58). Downstream conversions typically expect primitives (float64) for safe extraction/marshalling consistency. Store dereferenced values instead.
- if config.PresencePenalty != nil {
-     params.ExtraParams["presence_penalty"] = config.PresencePenalty
- }
+ if config.PresencePenalty != nil {
+     params.ExtraParams["presence_penalty"] = *config.PresencePenalty
+ }
- if config.FrequencyPenalty != nil {
-     params.ExtraParams["frequency_penalty"] = config.FrequencyPenalty
- }
+ if config.FrequencyPenalty != nil {
+     params.ExtraParams["frequency_penalty"] = *config.FrequencyPenalty
+ }
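Why the dereference matters can be shown standalone: a *float64 stored in a map[string]interface{} does not satisfy a float64 type assertion, which is the failure mode a SafeExtractFloat64-style helper would hit. The asFloat64 helper below is illustrative, not Bifrost's actual function.

```go
package main

import "fmt"

// asFloat64 mimics a "safe extract" helper: it only succeeds when the
// stored value is a plain float64, not a *float64.
func asFloat64(params map[string]interface{}, key string) (float64, bool) {
	v, ok := params[key]
	if !ok {
		return 0, false
	}
	f, ok := v.(float64)
	return f, ok
}

func main() {
	penalty := 0.5
	params := map[string]interface{}{
		"presence_penalty_ptr": &penalty, // pointer: assertion to float64 fails
		"presence_penalty":     penalty,  // dereferenced: assertion succeeds
	}
	_, okPtr := asFloat64(params, "presence_penalty_ptr")
	val, okVal := asFloat64(params, "presence_penalty")
	fmt.Println(okPtr, okVal, val) // false true 0.5
}
```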
♻️ Duplicate comments (8)
core/providers/vertex/vertex.go (2)
504-505: Use ctx-aware raw flags in Gemini branches (per-request overrides currently bypassed).
Right now Gemini branches pass provider.sendBackRawRequest/Response into HandleProviderResponse, while later gating uses ShouldSendBackRaw*, which can (a) ignore per-request enable/disable, and (b) do unnecessary work.
- rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(resp.Body(), &geminiResponse, jsonBody, provider.sendBackRawRequest, provider.sendBackRawResponse)
+ rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(
+     resp.Body(),
+     &geminiResponse,
+     jsonBody,
+     providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+     providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+ )
(Apply similarly to the Responses Gemini branch.)
Also applies to: 1012-1013
689-728: Do not send Vertex Gemini auth via ?key=; always use OAuth2 Bearer and avoid URL secret leakage.
These Gemini publisher streaming URLs append key= and skip Authorization when present. This is both a likely functional bug (Vertex publisher endpoints typically require OAuth2) and a security footgun (API keys in URLs get logged, cached, and propagated).
- // Auth query is used to pass the API key in the query string
- authQuery := ""
- if key.Value != "" {
-     authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
- }
  // Construct the URL for Gemini streaming
  var completeURL string
  if region == "global" {
      completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1/projects/%s/locations/global/publishers/google/models/%s:streamGenerateContent", projectID, deployment)
  } else {
      completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:streamGenerateContent", region, projectID, region, deployment)
  }
- // Add alt=sse parameter
- if authQuery != "" {
-     completeURL = fmt.Sprintf("%s?alt=sse&%s", completeURL, authQuery)
- } else {
-     completeURL = fmt.Sprintf("%s?alt=sse", completeURL)
- }
+ completeURL = fmt.Sprintf("%s?alt=sse", completeURL)
  headers := map[string]string{
+     "Content-Type":  "application/json",
      "Accept":        "text/event-stream",
      "Cache-Control": "no-cache",
  }
- // If no auth query, use OAuth2 token
- if authQuery == "" {
-     tokenSource, err := getAuthTokenSource(key)
-     ...
-     headers["Authorization"] = "Bearer " + token.AccessToken
- }
+ // Always use OAuth2 token for Vertex publisher endpoints
+ tokenSource, err := getAuthTokenSource(key)
+ if err != nil { ... }
+ token, err := tokenSource.Token()
+ if err != nil { ... }
+ headers["Authorization"] = "Bearer " + token.AccessToken
Open question worth verifying: does the Vertex AI Gemini publisher :streamGenerateContent endpoint support API keys via ?key= / x-goog-api-key at all, or is OAuth2 Bearer required?
Also applies to: 1164-1202
tests/integrations/tests/test_google.py (2)
86-89: Fix unused test_config args in new tests (ruff), and align numbering if intended.
Ruff flags test_config unused in test_26_extended_thinking / test_27_extended_thinking_streaming.
- def test_26_extended_thinking(self, google_client, test_config):
+ def test_26_extended_thinking(self, google_client, _test_config):
- def test_27_extended_thinking_streaming(self, google_client, test_config):
+ def test_27_extended_thinking_streaming(self, google_client, _test_config):
Also applies to: 889-1054
512-521: Test name/docstring still say “error handling”, but assertions expect success.
Either rename to “graceful handling” (and keep success assertions) or switch assertions back to validating the error path.core/providers/gemini/chat.go (2)
176-339: Fix thought signature serialization: don’t cast []byte to string in JSON payloads. In ToBifrostChatCompletionStream, thought_signature is set via string(part.ThoughtSignature) (Lines 259-266). This will corrupt non-UTF8 bytes on JSON marshal. Base64 encode (same file already uses base64 for reasoningDetails).
This matches a previously raised issue.
@@
  if len(part.ThoughtSignature) > 0 {
+     thoughtSigB64 := base64.StdEncoding.EncodeToString(part.ThoughtSignature)
      toolCall.ExtraContent = map[string]interface{}{
          "google": map[string]interface{}{
-             "thought_signature": string(part.ThoughtSignature),
+             "thought_signature": thoughtSigB64,
          },
      }
  }
188-193: “Last chunk” detection contradicts the comment and can miss stream termination. The comment says “finish reason or usage metadata” (Line 190), but isLastChunk only checks FinishReason (Line 191). If Gemini ends with usage metadata but no finish reason, you may never emit the terminal chunk/usage. This matches a previously raised issue and looks regressed here.
- isLastChunk := candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified
+ isLastChunk := (candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified) ||
+     response.UsageMetadata != nil
Also applies to: 333-339
core/providers/gemini/responses.go (2)
428-433: Bug: nil-check uses the wrong field path (Item.EncryptedContent). In ToGeminiResponsesStreamResponse, the condition checks bifrostResp.Item.EncryptedContent but then dereferences bifrostResp.Item.ResponsesReasoning.EncryptedContent (Lines 429-432). This can panic or skip valid signatures.
This matches a previously raised issue.
- if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
+ if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
      candidate.Content.Parts = append(candidate.Content.Parts, &Part{
          ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
      })
  }
1049-1053: outputIndex collisions are likely when streams start with tool calls / non-text items. Several handlers force outputIndex = 0 when !state.HasStartedText (e.g., function response / inline data / file data). If the stream begins with tool calls, indices can collide and overwrite state.ItemIDs / tool buffers.
This matches a previously raised issue.
@@ type GeminiResponsesStreamState struct {
@@
- CurrentOutputIndex int // Current output index
+ CurrentOutputIndex int // Last allocated output index (init to -1)
@@
}
+
+ func (state *GeminiResponsesStreamState) nextOutputIndex() int {
+     state.CurrentOutputIndex++
+     return state.CurrentOutputIndex
+ }
@@
- state.CurrentOutputIndex = 0
+ state.CurrentOutputIndex = -1
Then replace ad-hoc calculations like:
- outputIndex := state.CurrentOutputIndex + 1
- if !state.HasStartedText { outputIndex = 0 }
- state.CurrentOutputIndex = outputIndex
+ outputIndex := state.nextOutputIndex()
Also applies to: 1122-1126, 1199-1204
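The allocator pattern suggested in the diff above can be sketched on its own: every new output item takes the next monotonically increasing index, so a stream beginning with tool calls cannot collide with later text items. Names here are illustrative, not Bifrost's actual types.

```go
package main

import "fmt"

// streamState holds the last allocated output index.
type streamState struct {
	currentOutputIndex int // initialize to -1 so the first item gets 0
}

func newStreamState() *streamState {
	return &streamState{currentOutputIndex: -1}
}

// nextOutputIndex allocates a fresh index regardless of item type.
func (s *streamState) nextOutputIndex() int {
	s.currentOutputIndex++
	return s.currentOutputIndex
}

func main() {
	s := newStreamState()
	// A stream that begins with two tool calls, then text: no collisions.
	fmt.Println(s.nextOutputIndex(), s.nextOutputIndex(), s.nextOutputIndex()) // 0 1 2
}
```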
🧹 Nitpick comments (6)
core/providers/vertex/vertex.go (2)
282-287: Deduplicate google/ deployment normalization (helper).
Same prefix-stripping logic is repeated across multiple entrypoints; consider a small helper on VertexProvider to keep this consistent and avoid drift.
Also applies to: 577-582, 833-838, 1064-1069
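The suggested helper can be as small as a strings.TrimPrefix wrapper; the name normalizeGoogleDeployment is hypothetical, not something the PR defines.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeGoogleDeployment strips an optional "google/" prefix so
// Vertex deployment names match native Gemini model naming. Inputs
// without the prefix pass through unchanged.
func normalizeGoogleDeployment(deployment string) string {
	return strings.TrimPrefix(deployment, "google/")
}

func main() {
	fmt.Println(normalizeGoogleDeployment("google/gemini-2.0-flash-001"))
	fmt.Println(normalizeGoogleDeployment("gemini-2.5-flash"))
}
```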
1392-1408: Sanitizer is clear; add a safety note re multi-tool-call correlation.
Clearing FunctionCall.ID / FunctionResponse.ID to satisfy Vertex constraints is understandable, but please ensure multi-tool-call flows still correlate correctly (esp. same-name tool calls). A small comment here explaining the correlation strategy would help future maintainers.
transports/bifrost-http/integrations/genai.go (1)
35-56: Good migration to ResponsesRequest, but the route comment is now misleading. The “Chat completions endpoint” comment (Line 27) no longer matches behavior: this route now multiplexes embedding/speech/transcription/responses. Consider renaming to avoid confusion while maintaining stack-wide clarity.
core/providers/gemini/gemini.go (2)
225-251: ChatCompletion should mirror the explicit nil-check pattern used elsewhere.
ChatCompletion passes ToGeminiChatCompletionRequest(request) directly (Line 228) while ChatCompletionStream / Responses* paths guard nil conversions. For consistency (and clearer errors), wrap it the same way.
335-353: errors.Is(doErr, context.Canceled) is likely ineffective with fasthttp.Client.Do.
fasthttp.Client.Do isn’t context-aware; cancellation usually won’t surface as context.Canceled. Consider simplifying error classification (timeout vs other), and rely on the ctx.Done() loop exit (previous comment) for cancellation behavior.
Also applies to: 628-646
core/providers/gemini/responses.go (1)
96-107: Conversion errors are silently dropped (ToGeminiResponsesRequest returns nil). When convertResponsesMessagesToGeminiContents fails, ToGeminiResponsesRequest returns nil (Lines 99-101), losing the underlying error detail (callers only see a generic “could not be converted” elsewhere). Consider restoring an error return ((*GeminiGenerationRequest, error)) to preserve actionable diagnostics across the stack.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/embedding.go (0 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (2 hunks)
- core/providers/gemini/responses.go (14 hunks)
- core/providers/gemini/speech.go (2 hunks)
- core/providers/gemini/transcription.go (0 hunks)
- core/providers/gemini/types.go (1 hunks)
- core/providers/gemini/utils.go (8 hunks)
- core/providers/vertex/vertex.go (13 hunks)
- core/providers/vertex/vertex_test.go (2 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (1 hunks)
- tests/integrations/tests/test_google.py (4 hunks)
- tests/integrations/tests/utils/common.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
- core/providers/gemini/transcription.go
- core/providers/gemini/embedding.go
🚧 Files skipped from review as they are similar to previous changes (5)
- core/providers/vertex/vertex_test.go
- tests/integrations/config.yml
- core/schemas/utils.go
- core/providers/gemini/speech.go
- core/schemas/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/gemini/types.go, core/providers/vertex/vertex.go, core/providers/gemini/gemini_test.go, transports/bifrost-http/integrations/genai.go, tests/integrations/tests/test_google.py, tests/integrations/tests/utils/common.py, core/providers/gemini/utils.go, core/providers/gemini/chat.go, core/providers/gemini/gemini.go, core/providers/gemini/responses.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/gemini/types.go, core/providers/vertex/vertex.go, core/providers/gemini/gemini_test.go, transports/bifrost-http/integrations/genai.go, core/providers/gemini/utils.go, core/providers/gemini/chat.go, core/providers/gemini/gemini.go, core/providers/gemini/responses.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.
Applied to files:
transports/bifrost-http/integrations/genai.go
🧬 Code graph analysis (4)
core/providers/gemini/utils.go (3)
core/providers/gemini/types.go (22)
GeminiGenerationRequest (55-73), GenerationConfig (631-697), ThinkingLevel (915-915), ThinkingLevelUnspecified (918-918), ThinkingLevelLow (919-919), ThinkingLevelHigh (920-920), Type (778-778), Schema (724-775), FinishReason (22-22), FinishReasonStop (28-28), FinishReasonMaxTokens (30-30), FinishReasonSafety (33-33), FinishReasonRecitation (35-35), FinishReasonLanguage (37-37), FinishReasonOther (39-39), FinishReasonBlocklist (41-41), FinishReasonProhibitedContent (43-43), FinishReasonSPII (46-46), FinishReasonMalformedFunctionCall (48-48), FinishReasonImageSafety (50-50), FinishReasonUnexpectedToolCall (52-52), GenerateContentResponseUsageMetadata (1270-1295)
core/schemas/responses.go (7)
ResponsesParameters (87-114), ResponsesParametersReasoning (238-243), ResponsesTextConfig (120-123), ResponsesTextConfigFormat (125-130), ResponsesResponseUsage (258-265), ResponsesResponseOutputTokens (275-285), ResponsesResponseInputTokens (267-273)
core/schemas/chatcompletions.go (4)
ToolFunctionParameters (259-266), BifrostLLMUsage (812-819), ChatPromptTokensDetails (821-827), ChatCompletionTokensDetails (829-840)
core/providers/gemini/chat.go (5)
core/schemas/chatcompletions.go (15)
BifrostChatRequest (12-19), ChatAssistantMessageToolCall (686-692), ChatContentBlock (590-597), ChatReasoningDetails (723-730), ChatAssistantMessageToolCallFunction (695-698), BifrostReasoningDetailsTypeEncrypted (718-718), ChatMessage (469-478), ChatMessageRoleAssistant (461-461), ChatMessageContent (524-527), ChatAssistantMessage (626-632), BifrostResponseChoice (704-712), ChatNonStreamResponseChoice (745-748), BifrostChatResponse (26-41), ChatStreamResponseChoiceDelta (756-763), ChatStreamResponseChoice (751-753)
core/schemas/utils.go (1)
Ptr (16-18)
core/schemas/bifrost.go (1)
BifrostError (358-367)
core/schemas/models.go (1)
Model (109-129)
core/providers/gemini/utils.go (1)
ConvertGeminiFinishReasonToBifrost(227-232)
core/providers/gemini/gemini.go (5)
core/providers/gemini/chat.go (1)
ToGeminiChatCompletionRequest (12-63)
core/schemas/bifrost.go (9)
RequestType (83-83), ChatCompletionRequest (89-89), BifrostStream (323-330), BifrostError (358-367), Gemini (48-48), ModelProvider (32-32), ErrorField (376-383), BifrostErrorExtraFields (425-429), BifrostResponseExtraFields (288-298)
core/providers/utils/utils.go (8)
CheckContextAndGetRequestBody (257-275), GetPathFromContext (212-217), SetExtraHeaders (179-209), ReleaseStreamingResponse (844-851), NewBifrostOperationError (516-527), NewProviderAPIError (531-546), ProcessAndSendResponse (612-642), GetBifrostResponseForStreamResponse (854-882)
core/providers/gemini/types.go (1)
Type (778-778)
core/utils.go (1)
Ptr(51-53)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (14)
GeminiGenerationRequest (55-73), ToolConfig (146-151), GenerationConfig (631-697), SafetySetting (81-89), Part (947-971), Role (13-13), Content (933-941), Type (778-778), FinishReason (22-22), FinishReasonMaxTokens (30-30), FinishReasonSafety (33-33), FinishReasonOther (39-39), FinishReasonStop (28-28), Tool (600-627)
core/schemas/responses.go (21)
BifrostResponsesRequest (32-39), BifrostResponsesResponse (45-85), ResponsesToolMessage (466-486), ResponsesReasoning (735-738), ResponsesStreamResponseTypePing (1365-1365), ResponsesStreamResponseTypeCreated (1367-1367), ResponsesStreamResponseTypeInProgress (1368-1368), ResponsesStreamResponseTypeCompleted (1369-1369), ResponsesStreamResponseTypeFailed (1370-1370), ResponsesMessage (318-331), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-415), ResponsesOutputMessageContentTypeText (396-396), ResponsesToolMessageOutputStruct (535-539), ResponsesInputMessageContentBlockTypeText (392-392), ResponsesOutputMessageContentTypeReasoning (398-398), ResponsesInputMessageContentBlockImage (417-420), ResponsesInputMessageContentBlockAudio (428-431), ResponsesInputMessageContentBlockFile (422-426), ResponsesTool (1034-1049), ResponsesParameters (87-114)
core/schemas/utils.go (4)
ParseModelString (23-36), SafeExtractFromMap (519-525), SafeExtractString (280-297), Ptr (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py
890-890: Unused method argument: test_config
(ARG002)
966-966: Unused method argument: test_config
(ARG002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: Graphite / mergeability_check (×22)
🔇 Additional comments (5)
core/providers/vertex/vertex.go (1)
314-330: Gemini native request shaping + Vertex generateContent routing looks consistent.
Converting via gemini.ToGeminiChatCompletionRequest, setting Model=deployment, and routing to ...:generateContent is aligned with the PR goal of using native Gemini formats.
Also applies to: 403-408
tests/integrations/tests/utils/common.py (2)
225-246: Reasoning prompt fixtures are straightforward and reusable.
The new constants keep integration tests cleaner and avoid duplicating long prompts.
248-256: IMAGE_URL_MESSAGES shape change: ensure all consumers accept the new multimodal block format.
If any integration still expects a string content or the old single-block structure, it will break.
core/providers/gemini/types.go (1)
903-921: ThinkingLevel implementation is correct and follows established codebase patterns. The enum values and omitempty behavior are consistent with other similar enum types in the codebase (FinishReason, FunctionCallingConfigMode, Behavior, etc.). All non-pointer string enum fields serialize their unspecified value explicitly (e.g., "THINKING_LEVEL_UNSPECIFIED"), which is the expected behavior per Gemini API. No changes needed.
transports/bifrost-http/integrations/genai.go (1)
72-84: Ensure “nil-skip” streaming conversion cannot emit blank/invalid SSE frames.
ResponsesStreamResponseConverter returns ("", nil, nil) when gemini.ToGeminiResponsesStreamResponse(resp) is nil (Lines 73-79). Please verify the downstream stream writer treats this as “skip” (no write), not as an empty event/data frame.
Based on learnings, this is especially important because Responses-style streaming relies on typed events, not [DONE] sentinels.
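The expected skip-on-empty contract can be sketched as follows. This is an illustration of the behavior being asked for, not Bifrost's actual SSE writer; writeSSEEvent and its framing are assumptions.

```go
package main

import "fmt"

// writeSSEEvent returns the frame to emit for one converted chunk.
// When the converter produced nothing (empty event type and data),
// it returns "" so the caller writes no frame at all, rather than a
// blank event or a legacy [DONE] sentinel.
func writeSSEEvent(eventType, data string) string {
	if eventType == "" && data == "" {
		return "" // nil conversion: skip silently
	}
	frame := ""
	if eventType != "" {
		frame += "event: " + eventType + "\n"
	}
	frame += "data: " + data + "\n\n"
	return frame
}

func main() {
	fmt.Printf("%q\n", writeSSEEvent("", ""))
	fmt.Printf("%q\n", writeSSEEvent("response.output_text.delta", "{\"delta\":\"hi\"}"))
}
```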
bd1d8c9 to b6f0c52 (Compare)
3154215 to d5b03f0 (Compare)
b6f0c52 to ad08844 (Compare)
ad08844 to e1ce12c (Compare)
d5b03f0 to 7f3e893 (Compare)
Merge activity
7f3e893 to 828c298 (Compare)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/schemas/responses.go (1)
134-142: Fix structural ambiguity inResponsesTextConfigFormatJSONSchema: fields belong at different nesting levels per OpenAI spec.
ResponsesTextConfigFormatJSONSchemaincorrectly mixes wrapper-level fields (name,strict,description) with JSON-Schema-level fields (type,properties,required,additionalProperties). According to the OpenAI Responses API spec, the correct structure is:{ "type": "json_schema", "strict": true, "schema": { "type": "object", "properties": {...}, "required": [...], "additionalProperties": false } }The current struct design will produce incorrect JSON with duplicate/misplaced wrapper fields and a confusing
schema.schemanesting when theSchemafield is populated alongside schema-level fields. TheSchemafield itself is problematic—it should either represent the entire schema object (making other schema fields redundant) or not exist at all.Add a marshal test that validates the exact JSON output against the intended OpenAI shape, then refactor to enforce a single, correct representation: either populate
Schemawith the complete schema object, or use only the individual schema fields—not both.core/providers/gemini/responses.go (1)
1705-1735: ThoughtSignature → EncryptedContent should be base64, not string([]byte). This has the same invalid-UTF8/JSON-corruption risk as the chat streaming path, and it’s inconsistent with other places that base64-encode encrypted content.
- thoughtSig := string(part.ThoughtSignature)
+ thoughtSig := base64.StdEncoding.EncodeToString(part.ThoughtSignature)
  reasoningMsg := schemas.ResponsesMessage{
      ...
      ResponsesReasoning: &schemas.ResponsesReasoning{
          Summary:          []schemas.ResponsesReasoningSummary{},
          EncryptedContent: &thoughtSig,
      },
  }
Then, when converting EncryptedContent back into Part.ThoughtSignature, base64-decode instead of []byte(str) (several call sites in this file currently do the latter).
♻️ Duplicate comments (5)
tests/integrations/tests/test_google.py (2)
512-521: Test 12 intent mismatch (name/docstring vs assertions). Either rename/docstring to “graceful handling of alternate role content” (keeping success assertions) or switch assertions to validate the error path.
889-1054: Fix unused test_config + make streaming “too many chunks” fail loudly.
Suggested diff:
- def test_26_extended_thinking(self, google_client, test_config):
+ def test_26_extended_thinking(self, google_client, _test_config):
- def test_27_extended_thinking_streaming(self, google_client, test_config):
+ def test_27_extended_thinking_streaming(self, google_client, _test_config):
@@
- if chunk_count > 500:
-     break
+ if chunk_count > 500:
+     raise AssertionError(f"Received >500 streaming chunks; possible non-terminating stream (chunk_count={chunk_count})")
Also please verify get_model("google", "chat") resolves to a thinking-capable Gemini model in your test config (otherwise these assertions will be flaky).
#!/bin/bash
set -euo pipefail
# Show what "chat" maps to for google in your integration test config plumbing.
rg -n --hidden -S 'get_model\("google",\s*"chat"\)|model_type:\s*chat|google:\s*$|provider:\s*google|integration:\s*google' tests/integrations -n || true
# If you have a YAML/JSON config for integrations, print likely candidates.
fd -a 'config\.yml|config\.yaml|integrations\.yml|integrations\.yaml' tests/integrations || true
core/providers/vertex/vertex.go (1)
669-743: Vertex Gemini streaming auth: don’t fall back to API key query params; always use OAuth2.
Both Gemini streaming branches build ?alt=sse&key=... and omit Authorization when key.Value is set. For Vertex publisher streaming (:streamGenerateContent), this is very likely to fail (and can leak keys in logs).
Suggested diff pattern (apply to both ChatCompletionStream Gemini + ResponsesStream Gemini):
- // Auth query is used to pass the API key in the query string
- authQuery := ""
- if key.Value != "" {
-     authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
- }
  ...
- if authQuery != "" {
-     completeURL = fmt.Sprintf("%s?alt=sse&%s", completeURL, authQuery)
- } else {
-     completeURL = fmt.Sprintf("%s?alt=sse", completeURL)
- }
+ completeURL = fmt.Sprintf("%s?alt=sse", completeURL)
  ...
- // If no auth query, use OAuth2 token
- if authQuery == "" {
-     tokenSource, err := getAuthTokenSource(key)
-     ...
-     headers["Authorization"] = "Bearer " + token.AccessToken
- }
+ tokenSource, err := getAuthTokenSource(key)
+ ...
+ headers["Authorization"] = "Bearer " + token.AccessToken
Open question worth verifying: does the Vertex AI publisher streamGenerateContent endpoint support API keys (?key=...) at all, or does it require OAuth2 Authorization: Bearer <token> only?
Also applies to: 1134-1226
core/providers/gemini/gemini.go (1)
380-485: Streaming goroutines should stop on ctx.Done() to avoid leaks. Both streaming handlers keep scanning until EOF without an in-loop cancellation check, so canceled callers can leave goroutines/streams running longer than necessary.
At minimum:
  for scanner.Scan() {
+     select {
+     case <-ctx.Done():
+         return
+     default:
+     }
      line := scanner.Text()
      ...
  }
(If you can, also ensure the underlying body stream gets closed on cancellation, otherwise scanner.Scan() may still block.)
Also applies to: 677-789
core/providers/gemini/responses.go (1)
424-433: Fix incorrect nil-check field path for reasoning encrypted content. This condition checks bifrostResp.Item.EncryptedContent but then dereferences bifrostResp.Item.ResponsesReasoning.EncryptedContent. The check should match the dereference:
  case schemas.ResponsesStreamResponseTypeOutputItemAdded:
-     if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
+     if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
          candidate.Content.Parts = append(candidate.Content.Parts, &Part{
              ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
          })
      }
🧹 Nitpick comments (2)
tests/integrations/tests/utils/common.py (1)
225-246: Nice addition: shared Gemini reasoning prompts improve test reuse. Minor: prompts explicitly request “step-by-step reasoning”, which some models/providers may refuse or redact; if you see flakes, consider wording like “explain your reasoning” / “show calculations” instead.
core/providers/vertex/vertex.go (1)
20-21: google/ deployment prefix normalization is a good consistency win. Given this pattern is repeated 4x, consider a tiny helper (e.g., normalizeGoogleDeployment(deployment string) string) to reduce drift.
Also applies to: 283-286, 578-581, 833-837, 1065-1068
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- core/providers/gemini/chat.go (3 hunks)
- core/providers/gemini/embedding.go (0 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/gemini_test.go (2 hunks)
- core/providers/gemini/responses.go (14 hunks)
- core/providers/gemini/speech.go (2 hunks)
- core/providers/gemini/transcription.go (0 hunks)
- core/providers/gemini/types.go (1 hunks)
- core/providers/gemini/utils.go (8 hunks)
- core/providers/vertex/vertex.go (13 hunks)
- core/providers/vertex/vertex_test.go (2 hunks)
- core/schemas/responses.go (1 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (1 hunks)
- tests/integrations/tests/test_google.py (4 hunks)
- tests/integrations/tests/utils/common.py (1 hunks)
- transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
- core/providers/gemini/transcription.go
- core/providers/gemini/embedding.go
🚧 Files skipped from review as they are similar to previous changes (6)
- tests/integrations/config.yml
- core/providers/gemini/gemini_test.go
- core/providers/gemini/types.go
- core/providers/gemini/speech.go
- core/providers/vertex/vertex_test.go
- core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- tests/integrations/tests/utils/common.py
- core/providers/vertex/vertex.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/gemini.go
- core/schemas/responses.go
- tests/integrations/tests/test_google.py
- core/providers/gemini/utils.go
- core/providers/gemini/chat.go
- core/providers/gemini/responses.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/vertex/vertex.go
- transports/bifrost-http/integrations/genai.go
- core/providers/gemini/gemini.go
- core/schemas/responses.go
- core/providers/gemini/utils.go
- core/providers/gemini/chat.go
- core/providers/gemini/responses.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.
Applied to files:
transports/bifrost-http/integrations/genai.go
🧬 Code graph analysis (6)
core/providers/vertex/vertex.go (4)
core/schemas/utils.go (1)
- IsGeminiModel (1052-1054)

core/providers/gemini/chat.go (1)
- ToGeminiChatCompletionRequest (12-63)

core/providers/gemini/types.go (1)
- GeminiGenerationRequest (55-73)

core/providers/gemini/gemini.go (2)
- HandleGeminiChatCompletionStream (304-495)
- HandleGeminiResponsesStream (597-834)
transports/bifrost-http/integrations/genai.go (4)
core/schemas/bifrost.go (1)
- ResponsesRequest (91-91)

transports/bifrost-http/handlers/inference.go (1)
- ResponsesRequest (257-261)

core/schemas/responses.go (2)
- BifrostResponsesResponse (45-85)
- BifrostResponsesStreamResponse (1431-1470)

core/providers/gemini/responses.go (2)
- ToGeminiResponsesResponse (148-342)
- ToGeminiResponsesStreamResponse (344-519)
core/providers/gemini/gemini.go (9)
core/providers/gemini/chat.go (1)
- ToGeminiChatCompletionRequest (12-63)

core/schemas/bifrost.go (10)
- RequestType (83-83)
- ChatCompletionRequest (89-89)
- BifrostStream (323-330)
- BifrostError (358-367)
- Gemini (48-48)
- ModelProvider (32-32)
- ErrorField (376-383)
- BifrostErrorExtraFields (425-429)
- BifrostResponseExtraFields (288-298)
- ResponsesRequest (91-91)

core/schemas/provider.go (2)
- Provider (282-309)
- PostHookRunner (279-279)

core/providers/utils/utils.go (2)
- ShouldSendBackRawResponse (559-564)
- GetPathFromContext (212-217)

core/schemas/chatcompletions.go (2)
- BifrostChatRequest (12-19)
- BifrostChatResponse (26-41)

core/utils.go (1)
- Ptr (51-53)

core/schemas/responses.go (3)
- BifrostResponsesRequest (32-39)
- BifrostResponsesResponse (45-85)
- BifrostResponsesStreamResponse (1431-1470)

transports/bifrost-http/handlers/inference.go (1)
- ResponsesRequest (257-261)

core/providers/gemini/responses.go (1)
- ToGeminiResponsesRequest (71-121)
tests/integrations/tests/test_google.py (2)
tests/integrations/tests/utils/config_loader.py (2)
- get_model (140-161)
- get_model (443-445)

tests/integrations/tests/utils/common.py (1)
- skip_if_no_api_key (1625-1636)
core/providers/gemini/chat.go (4)
core/schemas/chatcompletions.go (15)
- BifrostChatRequest (12-19)
- ChatAssistantMessageToolCall (686-692)
- ChatContentBlock (590-597)
- ChatReasoningDetails (723-730)
- ChatContentBlockTypeText (582-582)
- ChatAssistantMessageToolCallFunction (695-698)
- ChatToolChoiceTypeFunction (373-373)
- ChatMessage (469-478)
- ChatMessageRoleAssistant (461-461)
- ChatMessageContent (524-527)
- ChatAssistantMessage (626-632)
- BifrostResponseChoice (704-712)
- BifrostChatResponse (26-41)
- ChatStreamResponseChoiceDelta (756-763)
- ChatStreamResponseChoice (751-753)

core/providers/gemini/types.go (6)
- GeminiGenerationRequest (55-73)
- GenerationConfig (631-697)
- Content (933-941)
- Type (778-778)
- Role (13-13)
- FinishReason (22-22)

core/schemas/utils.go (1)
- Ptr (16-18)

core/providers/gemini/utils.go (1)
- ConvertGeminiFinishReasonToBifrost (227-232)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (14)
- GeminiGenerationRequest (55-73)
- ToolConfig (146-151)
- GenerationConfig (631-697)
- GenerateContentResponse (1298-1314)
- Part (947-971)
- Role (13-13)
- Content (933-941)
- Type (778-778)
- FunctionCall (1102-1112)
- FunctionResponse (1115-1137)
- GenerateContentResponseUsageMetadata (1270-1295)
- FileData (1091-1099)
- Blob (974-982)
- Tool (600-627)

core/schemas/responses.go (9)
- BifrostResponsesRequest (32-39)
- ResponsesInputMessageRoleAssistant (336-336)
- ResponsesInputMessageRoleUser (337-337)
- ResponsesMessageTypeFunctionCall (299-299)
- ResponsesToolMessage (466-486)
- ResponsesMessageTypeReasoning (311-311)
- BifrostResponsesStreamResponse (1431-1470)
- ResponsesMessage (318-331)
- ResponsesTool (1034-1049)

core/schemas/utils.go (4)
- ParseModelString (23-36)
- SafeExtractFromMap (519-525)
- SafeExtractString (280-297)
- Ptr (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py
890-890: Unused method argument: test_config
(ARG002)
966-966: Unused method argument: test_config
(ARG002)
🔇 Additional comments (7)
tests/integrations/tests/test_google.py (1)
86-89: Imports for Gemini reasoning prompts look correct.

core/providers/gemini/chat.go (1)

24-34: OK: modalities plumbing removed; but double-check speech/responseModalities callers didn’t rely on this. `convertParamsToGenerationConfig(..., []string{})` is fine if chat completion should no longer infer modalities here. Please just verify there are no remaining call sites expecting non-empty modalities for audio/speech on the chat path.

core/providers/gemini/gemini.go (1)
216-251: ChatCompletion path looks coherent end-to-end. One small consistency check: ChatCompletion sets `RawResponse` when enabled, but not `RawRequest`. If other providers include RawRequest for non-streaming chat, consider matching that pattern here too (optional).

core/providers/gemini/utils.go (2)
11-84: Responses parameter mapping is sensible; ThinkingConfig → Reasoning mapping looks OK. Just verify the `Logprobs` → `TopLogProbs` cast and any expected bounds (since this is surfaced as an int pointer now).
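One way to make the bounds explicit before surfacing the value as an int pointer; the 0–20 range mirrors OpenAI's documented `top_logprobs` limit and is an assumption here, not something the diff enforces:

```go
package main

import "fmt"

// clampTopLogProbs bounds a logprobs count to [0, 20] before returning it
// as an int pointer, so out-of-range inputs cannot leak downstream.
func clampTopLogProbs(v int) *int {
	if v < 0 {
		v = 0
	} else if v > 20 {
		v = 20
	}
	return &v
}

func main() {
	fmt.Println(*clampTopLogProbs(25), *clampTopLogProbs(5), *clampTopLogProbs(-3))
}
```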
872-924: `buildOpenAIResponseFormat` return type update looks correct. LGTM: returning `ResponsesTextConfig` for `json_object`/`json_schema` aligns better with Responses API surfaces.

transports/bifrost-http/integrations/genai.go (2)
51-53: Responses API converters are correctly implemented; the SSE writer properly skips lifecycle events returned as (eventName: "", payload: nil, err: nil). The switch to `ResponsesRequest` + `ResponsesResponseConverter` + `ToGeminiResponsesResponse` is consistent with the established pattern. The SSE writer in router.go (lines 809–811) explicitly handles the skip case: when `convertedResponse == nil && err == nil`, it continues without writing, which is exactly what GenAI's `ResponsesStreamResponseConverter` returns for lifecycle events with no Gemini equivalent. The same pattern is used consistently across OpenAI, Anthropic, and Bedrock integrations, confirming this is intentional design rather than a transport hazard.

Also applies to: 60-62
73-80: The implementation is correct as-is. The stream writer already properly handles nil payloads: the router explicitly skips chunks where `convertedResponse == nil && err == nil` (no output is sent). Additionally, GenAI streaming termination is already correctly configured: the [DONE] marker logic explicitly excludes GenAI (`config.Type != RouteConfigTypeGenAI`), allowing it to signal completion by closing the stream rather than sending `[DONE]`, which is the correct behavior for Responses APIs. No changes are needed.

Likely an incorrect or invalid review comment.
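The skip contract described in these comments can be sketched as follows; the `convert` function and event names are illustrative stand-ins, not the actual router symbols:

```go
package main

import "fmt"

// convert mimics a stream-response converter: lifecycle events with no
// Gemini equivalent return (nil, nil), which the writer must skip.
func convert(event string) (*string, error) {
	if event == "response.created" { // lifecycle event, nothing to emit
		return nil, nil
	}
	payload := "data: " + event
	return &payload, nil
}

func main() {
	for _, ev := range []string{"response.created", "response.output_text.delta"} {
		converted, err := convert(ev)
		if converted == nil && err == nil {
			continue // skip without writing, as the router does
		}
		fmt.Println(*converted)
	}
}
```

The stream then terminates by closing, with no `[DONE]` sentinel, matching the Responses API convention.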

Summary
Refactored the Gemini provider to use native Gemini API format instead of OpenAI-compatible endpoints, improving support for Gemini-specific features and fixing issues with tool calls and streaming.
Fixes
#997
Changes
Type of change
Affected areas
How to test
Test Gemini provider with various features:
Breaking changes
Related issues
Fixes issues with Gemini tool calls and streaming responses.
Security considerations
No significant security implications. The code maintains the same authentication patterns.
Checklist