
Conversation


@TejasGhatte TejasGhatte commented Dec 8, 2025

Summary

Refactored the Gemini provider to use the native Gemini API format instead of OpenAI-compatible endpoints, improving support for Gemini-specific features and fixing issues with tool calls and streaming.

Fixes

#997

Changes

  • Replaced OpenAI-compatible endpoints with direct Gemini API calls
  • Implemented native Gemini streaming for both chat and responses APIs
  • Added proper handling for Gemini-specific features like thought signatures and reasoning
  • Fixed tool call handling to properly preserve IDs and correlate function responses
  • Updated Vertex AI provider to use native Gemini format for Gemini models
  • Added support for Gemini transcription in the provider capabilities
  • Improved error handling and response conversion

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Test Gemini provider with various features:

# Run Gemini provider tests
go test ./core/providers/gemini/...

# Test with Vertex AI using Gemini models
go test ./core/providers/vertex/...

# Test GenAI integration
go test ./transports/bifrost-http/integrations/...

Breaking changes

  • Yes
  • No

Related issues

Fixes issues with Gemini tool calls and streaming responses.

Security considerations

No significant security implications. The code maintains the same authentication patterns.

Checklist

  • I added/updated tests where appropriate
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable


coderabbitai bot commented Dec 8, 2025

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Added comprehensive streaming support for Gemini models in chat and responses APIs
    • Integrated Gemini model support with Vertex provider
    • Enabled extended thinking/reasoning capabilities for Gemini models
    • Enhanced JSON schema response formatting with additional configuration options
  • Updates

    • Updated default Gemini models to latest versions (gemini-2.5-flash)
    • Expanded Vertex provider compatibility to support Gemini alongside existing providers
  • Tests

    • Added extended thinking tests for both standard and streaming modes


Walkthrough

Replaces non‑streaming Gemini conversions with streaming‑first implementations: adds SSE streaming handlers and per‑chunk delta assembly for ChatCompletion and Responses, expands Gemini↔Bifrost conversion surfaces, integrates Gemini into Vertex/transport layers, introduces ThinkingLevel, and updates integration tests for reasoning and streaming.

Changes

Cohort / File(s) Summary
Gemini Chat Conversions
core/providers/gemini/chat.go
Removed non‑streaming conversion paths; added streaming entrypoint ToBifrostChatCompletionStream() and delta‑based per‑chunk response assembly with finalization and usage emission on final chunk.
Gemini Provider & Handlers
core/providers/gemini/gemini.go, core/providers/gemini/gemini_test.go
Added ChatCompletion/Responses and streaming entry points (ChatCompletionStream/ResponsesStream), SSE stream handlers (HandleGemini*Stream), chunk parser processGeminiStreamChunk, usage extraction helpers, and updated test models for reasoning.
Gemini Responses & Streaming Engine
core/providers/gemini/responses.go
Large streaming engine: GeminiResponsesStreamState, pooling, per‑part processors (text/thought/function/inline/file), ToBifrostResponsesStream, FinalizeGeminiResponsesStream, and many conversions between Gemini parts and Bifrost Responses stream events.
Gemini Utilities & Conversions
core/providers/gemini/utils.go, core/providers/gemini/types.go
Reworked generation→responses parameter mapping, response text config handling, finish‑reason & usage converters, extractFunctionResponseOutput, and added ThinkingLevel type + field in ThinkingConfig.
Speech / Transcription / Embedding Adjustments
core/providers/gemini/speech.go, core/providers/gemini/transcription.go, core/providers/gemini/embedding.go
Removed Vertex-specific automatic google/ model prefixing; added nil-guard for speech input contents; minor formatting fixes.
Schemas & Helpers
core/schemas/responses.go, core/schemas/utils.go
Extended ResponsesTextConfigFormatJSONSchema with Name, Schema, Description, Strict; added IsGeminiModel(model string) bool helper (duplicate insertion observed).
Vertex Integration
core/providers/vertex/vertex.go, core/providers/vertex/vertex_test.go
Routed Gemini models through Vertex Gemini endpoints for Chat/Responses (streaming & non‑streaming), added stripVertexGeminiUnsupportedFields, and adjusted test model identifiers.
Transport / GenAI Route Configs
transports/bifrost-http/integrations/genai.go
Switched non‑embedding/speech routes from Chat→Responses types using ToBifrostResponsesRequest(), replaced chat response/stream converters with responses converters, and added lifecycle‑skip logic for streaming converters.
Integration Tests & Test Data
tests/integrations/tests/test_google.py, tests/integrations/tests/utils/common.py
Added Gemini reasoning prompts, two extended‑thinking tests (non‑streaming & streaming), updated streaming/error tests, helper to extract Google function calls, and updated image‑message fixtures.
Transports / GenAI -> Gemini Bridge
transports/bifrost-http/integrations/genai.go
Updated converters to call Gemini responses conversion functions and adjusted stream converter behavior to skip lifecycle events when appropriate.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Transport as Bifrost-HTTP
  participant Provider as GeminiProvider
  participant Gemini as Gemini API

  Client->>Transport: Start stream request (ChatCompletionStream / ResponsesStream)
  Transport->>Provider: Forward request + postHookRunner
  Provider->>Provider: Build GeminiGenerationRequest (ToGemini*Request)
  Provider->>Gemini: POST /generateContent_stream (SSE)
  Gemini-->>Provider: SSE chunks (parts)
  loop per SSE chunk
    Provider->>Provider: processGeminiStreamChunk -> GenerateContentResponse
    Provider->>Provider: assemble delta (text/thought/tool/file)
    Provider-->>Transport: emit Bifrost stream chunk (delta + metadata)
    Transport-->>Client: deliver chunk
  end
  Provider->>Provider: on final chunk attach usage/final outputs
  Provider-->>Transport: final event / done
  Transport-->>Client: stream end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

  • Focus areas:
    • core/providers/gemini/responses.go — streaming state machine, per‑part processors, pooling and finalization.
    • core/providers/gemini/gemini.go — SSE lifecycle, chunk parsing, error/cancellation handling, raw payload handling.
    • transports/bifrost-http/integrations/genai.go & core/providers/vertex/vertex.go — converter updates and field stripping.
    • core/schemas/utils.go — duplicate IsGeminiModel insertion to resolve.
    • tests/integrations — validate new streaming/reasoning tests and fixture changes.

Poem

🐰 I hopped through streams of JSON light,

Deltas, thoughts, and tools took flight,
Pooling states snug in chunked night,
Final usage stitched the end just right,
Hooray — the Gemini stream feels bright.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 78.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: the title 'feat: added support for gemini native converters' accurately describes the main refactoring effort, migrating the Gemini provider from OpenAI-compatible endpoints to the native Gemini API format.
  • Description check ✅ Passed: the description is comprehensive, covering the summary and all required sections (changes, type of change, affected areas, how to test, breaking changes, related issues, security, and checklist). It clearly outlines the refactoring objectives and implementation approach.



TejasGhatte commented Dec 8, 2025


github-actions bot commented Dec 8, 2025

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1018

@TejasGhatte TejasGhatte changed the title from "feat: added support for gemini native converterswq" to "feat: added support for gemini native converters" on Dec 8, 2025
@TejasGhatte TejasGhatte marked this pull request as ready for review December 8, 2025 04:27
@TejasGhatte TejasGhatte requested review from Pratham-Mishra04 and removed request for roroghost17 December 8, 2025 04:29
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from 4374efc to 2cc10d8 on December 8, 2025 04:47

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/gemini/chat.go (1)

64-156: Preserve Gemini function-call IDs in non‑streaming chat responses.

In ToBifrostChatResponse, tool calls currently set:

  • Type using ChatToolChoiceTypeFunction (choice type), and
  • ID to function.Name (the function name), ignoring part.FunctionCall.ID.

This differs from the streaming path, which:

  • Uses ChatToolTypeFunction, and
  • Preferentially uses FunctionCall.ID, falling back to Name.

Losing the provider‑supplied ID breaks correlation when Gemini emits multiple calls with the same name.

Consider aligning non‑streaming logic with streaming, e.g.:

-                if part.FunctionCall != nil {
-                    function := schemas.ChatAssistantMessageToolCallFunction{
-                        Name: &part.FunctionCall.Name,
-                    }
-                    if part.FunctionCall.Args != nil {
-                        jsonArgs, err := json.Marshal(part.FunctionCall.Args)
-                        if err != nil {
-                            jsonArgs = []byte(fmt.Sprintf("%v", part.FunctionCall.Args))
-                        }
-                        function.Arguments = string(jsonArgs)
-                    }
-
-                    toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
-                        Index:    uint16(len(toolCalls)),
-                        Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
-                        ID:       function.Name,
-                        Function: function,
-                    })
-                }
+                if part.FunctionCall != nil {
+                    function := schemas.ChatAssistantMessageToolCallFunction{
+                        Name: &part.FunctionCall.Name,
+                    }
+                    if part.FunctionCall.Args != nil {
+                        jsonArgs, err := json.Marshal(part.FunctionCall.Args)
+                        if err != nil {
+                            jsonArgs = []byte(fmt.Sprintf("%v", part.FunctionCall.Args))
+                        }
+                        function.Arguments = string(jsonArgs)
+                    }
+
+                    callID := part.FunctionCall.Name
+                    if part.FunctionCall.ID != "" {
+                        callID = part.FunctionCall.ID
+                    }
+
+                    toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
+                        Index:    uint16(len(toolCalls)),
+                        Type:     schemas.Ptr(string(schemas.ChatToolTypeFunction)),
+                        ID:       &callID,
+                        Function: function,
+                    })
+                }

This keeps IDs consistent across streaming and non‑streaming chat.


🧹 Nitpick comments (2)
core/providers/gemini/utils.go (1)

262-293: Consider lazy initialization of token details structs.

The OutputTokensDetails and InputTokensDetails are initialized unconditionally on lines 272-273, but these allocations are only meaningful when there's data to populate. Consider initializing them only when needed to avoid unnecessary allocations.

 usage := &schemas.ResponsesResponseUsage{
 	TotalTokens:         int(metadata.TotalTokenCount),
 	InputTokens:         int(metadata.PromptTokenCount),
 	OutputTokens:        int(metadata.CandidatesTokenCount),
-	OutputTokensDetails: &schemas.ResponsesResponseOutputTokens{},
-	InputTokensDetails:  &schemas.ResponsesResponseInputTokens{},
 }

 // Add cached tokens if present
 if metadata.CachedContentTokenCount > 0 {
 	usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
 		CachedTokens: int(metadata.CachedContentTokenCount),
 	}
 }

 if metadata.CandidatesTokensDetails != nil {
+	usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
 	for _, detail := range metadata.CandidatesTokensDetails {
 		switch detail.Modality {
 		case "AUDIO":
 			usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
 		}
 	}
 }
core/schemas/responses.go (1)

130-140: JSON schema struct extensions look correct; consider tightening Schema typing.

The added Name, Description, and Strict fields align with richer JSON schema metadata and are safe. The Schema *any field is flexible but very loose; if callers always treat this as raw JSON, consider json.RawMessage or a documented alias type instead to make usage clearer and avoid accidental double-marshalling.
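The suggested tightening can be illustrated with a small sketch: `json.RawMessage` carries pre-serialized schema bytes through marshalling untouched, avoiding the double-encoding risk of a `*any` field. The type names here are illustrative stand-ins, not the actual Bifrost schema types.

```go
package main

import "encoding/json"

// formatRaw uses json.RawMessage for the schema, as the review suggests;
// the field name mirrors the discussion but the type itself is hypothetical.
type formatRaw struct {
	Schema json.RawMessage `json:"schema,omitempty"`
}

// passThrough embeds already-serialized schema bytes and re-marshals the
// wrapper; the RawMessage is emitted verbatim, with no double-encoding.
func passThrough(schema []byte) ([]byte, error) {
	f := formatRaw{Schema: json.RawMessage(schema)}
	return json.Marshal(f)
}
```

With `*any`, callers would instead have to unmarshal the schema into an intermediate value (or risk marshalling a JSON string twice); `json.RawMessage` makes the pass-through intent explicit in the type.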

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffd9580 and 4374efc.

📒 Files selected for processing (12)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (2 hunks)
  • core/providers/gemini/responses.go (9 hunks)
  • core/providers/gemini/speech.go (1 hunks)
  • core/providers/gemini/utils.go (7 hunks)
  • core/providers/vertex/vertex.go (8 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/tests/test_google.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/gemini_test.go
  • core/providers/vertex/vertex.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/utils.go
  • core/schemas/responses.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/speech.go
  • tests/integrations/tests/test_google.py
  • core/providers/gemini/gemini.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
🧬 Code graph analysis (7)
core/providers/vertex/vertex.go (4)
core/schemas/utils.go (1)
  • IsGeminiModel (1052-1054)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/providers/gemini/types.go (3)
  • GeminiGenerationRequest (55-73)
  • FunctionCall (1091-1101)
  • FunctionResponse (1104-1126)
core/providers/gemini/gemini.go (1)
  • HandleGeminiChatCompletionStream (294-485)
core/schemas/responses.go (1)
core/providers/gemini/types.go (1)
  • Schema (724-775)
transports/bifrost-http/integrations/genai.go (3)
transports/bifrost-http/handlers/inference.go (1)
  • ResponsesRequest (193-197)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
transports/bifrost-http/integrations/router.go (3)
  • EmbeddingResponseConverter (102-102)
  • ResponsesResponseConverter (98-98)
  • ResponsesStreamResponseConverter (123-123)
tests/integrations/tests/test_google.py (1)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
core/providers/gemini/gemini.go (6)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/schemas/bifrost.go (8)
  • RequestType (83-83)
  • ChatCompletionRequest (89-89)
  • BifrostStream (318-325)
  • BifrostError (353-362)
  • Gemini (48-48)
  • ModelProvider (32-32)
  • BifrostResponseExtraFields (284-293)
  • ResponsesRequest (91-91)
core/providers/utils/utils.go (7)
  • ShouldSendBackRawResponse (482-487)
  • NewBifrostOperationError (449-460)
  • NewProviderAPIError (464-479)
  • ProcessAndSendBifrostError (571-601)
  • ProcessAndSendResponse (535-565)
  • GetBifrostResponseForStreamResponse (777-805)
  • ProcessAndSendError (607-653)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/gemini/responses.go (1)
  • ToGeminiResponsesRequest (63-102)
core/providers/gemini/utils.go (4)
core/providers/gemini/types.go (16)
  • Type (778-778)
  • Schema (724-775)
  • FinishReason (22-22)
  • FinishReasonStop (28-28)
  • FinishReasonMaxTokens (30-30)
  • FinishReasonSafety (33-33)
  • FinishReasonRecitation (35-35)
  • FinishReasonLanguage (37-37)
  • FinishReasonOther (39-39)
  • FinishReasonBlocklist (41-41)
  • FinishReasonProhibitedContent (43-43)
  • FinishReasonSPII (46-46)
  • FinishReasonMalformedFunctionCall (48-48)
  • FinishReasonImageSafety (50-50)
  • FinishReasonUnexpectedToolCall (52-52)
  • FunctionResponse (1104-1126)
core/schemas/responses.go (4)
  • ResponsesParameters (85-112)
  • ResponsesTextConfig (118-121)
  • ResponsesTextConfigFormat (123-128)
  • ResponsesTextConfigFormatJSONSchema (131-140)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (3)
  • BifrostLLMUsage (640-647)
  • ChatPromptTokensDetails (649-655)
  • ChatCompletionTokensDetails (657-667)
core/providers/gemini/chat.go (5)
core/schemas/chatcompletions.go (10)
  • ChatAssistantMessageToolCall (564-570)
  • ChatContentBlock (505-512)
  • ChatContentBlockTypeText (497-497)
  • ChatAssistantMessageToolCallFunction (573-576)
  • ChatAssistantMessage (541-545)
  • BifrostResponseChoice (582-590)
  • ChatNonStreamResponseChoice (605-608)
  • BifrostChatResponse (26-41)
  • ChatStreamResponseChoiceDelta (616-622)
  • ChatStreamResponseChoice (611-613)
core/providers/gemini/types.go (8)
  • GeminiGenerationRequest (55-73)
  • GenerationConfig (631-697)
  • Content (922-930)
  • Type (778-778)
  • Role (13-13)
  • GenerateContentResponse (1287-1303)
  • FinishReason (22-22)
  • RoleModel (17-17)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/bifrost.go (1)
  • BifrostError (353-362)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (213-218)
🔇 Additional comments (14)
core/providers/vertex/vertex_test.go (1)

26-27: LGTM!

The model name changes align with the PR's native Gemini converter integration. Removing the google/ prefix is consistent with how IsGeminiModel() detects Gemini models via substring matching on "gemini".

core/providers/gemini/utils.go (3)

194-218: Finish reason mapping looks appropriate.

The mapping of Gemini finish reasons to Bifrost format is well-structured. Mapping FinishReasonMalformedFunctionCall and FinishReasonUnexpectedToolCall to "tool_calls" makes sense as they indicate tool-related termination reasons.
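As a rough sketch of such a mapping: only the tool-related cases (`MALFORMED_FUNCTION_CALL`, `UNEXPECTED_TOOL_CALL` → "tool_calls") are confirmed by this review; the other buckets below are plausible assumptions, and the string constants stand in for the typed `FinishReason` values in core/providers/gemini/types.go.

```go
package main

// convertFinishReason is an illustrative mapping from Gemini finish reasons
// to OpenAI-style buckets. Only the tool-call cases are confirmed by the
// review; the safety/length buckets are assumptions for illustration.
func convertFinishReason(reason string) string {
	switch reason {
	case "MAX_TOKENS":
		return "length" // assumed bucket for token-limit termination
	case "SAFETY", "RECITATION", "BLOCKLIST", "PROHIBITED_CONTENT", "SPII", "IMAGE_SAFETY":
		return "content_filter" // assumed bucket for policy terminations
	case "MALFORMED_FUNCTION_CALL", "UNEXPECTED_TOOL_CALL":
		return "tool_calls" // confirmed: tool-related termination reasons
	default:
		return "stop"
	}
}
```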


883-886: Hardcoded Strict: false may not always be appropriate.

The Strict field is always set to false. If the original Gemini schema had strictness requirements, this information is lost. Consider whether this should be configurable or derived from the schema.


932-954: LGTM!

The extractFunctionResponseOutput helper has good defensive nil checks and follows a sensible extraction strategy: prefer the "output" key when present, otherwise marshal the entire response.

core/providers/gemini/gemini_test.go (1)

50-54: Verify the test scenario flag changes are intentional.

The test flags have been flipped: Transcription and TranscriptionStream are now enabled, while SpeechSynthesis and SpeechSynthesisStream are now disabled. Please confirm this reflects the actual capability changes in this PR.

core/providers/gemini/speech.go (1)

110-127: LGTM!

Good defensive nil-check added to prevent potential nil pointer dereference when bifrostReq.Input is nil.

core/schemas/utils.go (1)

1052-1054: LGTM!

The IsGeminiModel function follows the established pattern used by IsAnthropicModel and IsMistralModel, providing consistent model type detection across providers.
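Given the substring-matching behavior described in the Vertex test comment above, the helper is presumably along these lines (a minimal sketch; the real `IsGeminiModel` in core/schemas/utils.go may normalize differently):

```go
package main

import "strings"

// isGeminiModel sketches the substring-based detection described in the
// review: any model identifier containing "gemini" is treated as a Gemini
// model, so both "gemini-2.5-flash" and "google/gemini-..." match.
func isGeminiModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "gemini")
}
```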

tests/integrations/tests/test_google.py (1)

507-516: Verify the change in error handling behavior is intentional.

This test previously expected an exception for invalid roles but now expects a successful response. This represents a significant change in error handling semantics. Please confirm this aligns with the expected behavior of the new native Gemini converters.

core/providers/vertex/vertex.go (3)

297-312: LGTM!

The Gemini branch follows the established pattern from the Anthropic branch: convert to native format, strip unsupported fields, and marshal/unmarshal for the Vertex API map format.


630-703: LGTM!

The Gemini streaming implementation correctly:

  • Uses v1beta1 API version for streaming endpoints
  • Adds the alt=sse parameter for SSE streaming
  • Handles both API key (query param) and OAuth2 authentication paths
  • Delegates to the shared HandleGeminiChatCompletionStream handler

1384-1400: LGTM!

The stripVertexGeminiUnsupportedFields helper cleanly removes the id fields from function_call and function_response objects that aren't supported by Vertex AI's Gemini API.
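Field stripping of this kind, on a request already decoded into `map[string]any`, can be sketched as a recursive walk (a hypothetical helper for illustration, not the provider's exact implementation):

```go
package main

// stripUnsupportedIDs walks a decoded request and deletes the "id" key from
// any function_call / function_response objects, mirroring the behavior the
// review describes for stripVertexGeminiUnsupportedFields.
func stripUnsupportedIDs(v any) {
	switch node := v.(type) {
	case map[string]any:
		for key, child := range node {
			if key == "function_call" || key == "function_response" {
				if obj, ok := child.(map[string]any); ok {
					delete(obj, "id") // Vertex's Gemini API rejects this field
				}
			}
			stripUnsupportedIDs(child) // recurse into nested objects/arrays
		}
	case []any:
		for _, child := range node {
			stripUnsupportedIDs(child)
		}
	}
}
```

Operating on the marshalled map (rather than the typed structs) keeps the native Gemini types unchanged while still satisfying Vertex's stricter schema.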

core/providers/gemini/chat.go (1)

10-62: Verify convertParamsToGenerationConfig defaults when given an empty modalities slice.

The call to convertParamsToGenerationConfig(bifrostReq.Params, []string{}) with an empty modalities slice should be reviewed to ensure it correctly defaults ResponseModalities for chat operations (e.g., to text). Confirm that this change does not regress any existing behavior that relied on non-empty default modalities.

core/providers/gemini/responses.go (1)

63-101: Error handling verification incomplete — unable to assess full context.

The review comment cannot be fully verified without access to:

  1. The actual repository and all call sites of ToGeminiResponsesRequest
  2. The PR stack context (as required by coding guidelines)
  3. Verification that the suggested signature change aligns with all callers across the codebase

To properly rewrite this review, please provide:

  • Access to the repository, or
  • The list of all call sites of ToGeminiResponsesRequest, or
  • The git diff/PR context showing all related changes in the stack
transports/bifrost-http/integrations/genai.go (1)

35-81: GenAI routes correctly switched to Responses API and native Gemini converters.

The RequestConverter and new ResponsesResponseConverter / ResponsesStreamResponseConverter are wired consistently with the router types and Gemini helpers, and the embedding/speech/transcription branches remain intact. Verify that no remaining ChatResponseConverter/ChatStreamResponseConverter usages exist for GenAI Gemini routes in the integration layer.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/gemini/responses.go (1)

2165-2172: FileData must be base64-decoded before casting to bytes.

At line 2169, block.ResponsesInputMessageContentBlockFile.FileData is cast directly to []byte, but the type definition explicitly documents this field as "Base64 encoded file data". The code must decode it first. This mirrors the correct pattern already used in the same file for audio data (lines 2134–2136) and image data (lines 2110–2112).

} else if block.ResponsesInputMessageContentBlockFile.FileData != nil {
+    decodedData, err := base64.StdEncoding.DecodeString(*block.ResponsesInputMessageContentBlockFile.FileData)
+    if err != nil {
+        return nil, fmt.Errorf("failed to decode base64 file data: %w", err)
+    }
     return &Part{
         InlineData: &Blob{
             MIMEType: "application/octet-stream", // default
-            Data:     []byte(*block.ResponsesInputMessageContentBlockFile.FileData),
+            Data:     decodedData,
         },
     }, nil
 }
🧹 Nitpick comments (6)
core/providers/gemini/speech.go (1)

110-127: Consider returning an error when input is empty.

The nil check prevents a potential nil pointer dereference, which is good. However, when bifrostReq.Input is nil or empty, the function returns a GeminiGenerationRequest without any Contents, which may cause unexpected behavior when making API calls.

Consider whether an error should be returned for missing input:

+	if bifrostReq.Input == nil || bifrostReq.Input.Input == "" {
+		return nil, fmt.Errorf("speech input is required")
+	}
-	if bifrostReq.Input != nil && bifrostReq.Input.Input != "" {
-		geminiReq.Contents = []Content{
+	geminiReq.Contents = []Content{
 			{
 				Parts: []*Part{
 					{
 						Text: bifrostReq.Input.Input,
 					},
 				},
 			},
 		}
-		// Add speech config to generation config if voice config is provided
-		if bifrostReq.Params != nil && bifrostReq.Params.VoiceConfig != nil {
-			// Handle both single voice and multi-voice configurations
-			if bifrostReq.Params.VoiceConfig.Voice != nil || len(bifrostReq.Params.VoiceConfig.MultiVoiceConfig) > 0 {
-				addSpeechConfigToGenerationConfig(&geminiReq.GenerationConfig, bifrostReq.Params.VoiceConfig)
-			}
+	// Add speech config to generation config if voice config is provided
+	if bifrostReq.Params != nil && bifrostReq.Params.VoiceConfig != nil {
+		// Handle both single voice and multi-voice configurations
+		if bifrostReq.Params.VoiceConfig.Voice != nil || len(bifrostReq.Params.VoiceConfig.MultiVoiceConfig) > 0 {
+			addSpeechConfigToGenerationConfig(&geminiReq.GenerationConfig, bifrostReq.Params.VoiceConfig)
 		}
 	}
tests/integrations/tests/test_google.py (1)

507-516: Test name may be misleading after behavior change.

The test name test_12_error_handling_invalid_roles and docstring "Error handling for invalid roles" suggest it tests error scenarios, but it now validates successful responses. Consider renaming for clarity:

-    def test_12_error_handling_invalid_roles(self, google_client, test_config):
-        """Test Case 12: Error handling for invalid roles"""
+    def test_12_invalid_roles_handling(self, google_client, test_config):
+        """Test Case 12: Verify API handles invalid roles gracefully"""
core/providers/gemini/utils.go (1)

262-293: Unnecessary initialization of empty detail structs.

Lines 272-273 initialize OutputTokensDetails and InputTokensDetails to empty structs, but then lines 277-280 reassign InputTokensDetails only when cached tokens > 0. This leaves potentially empty structs in the response when there are no details.

Consider initializing the detail structs only when there's actual data:

 usage := &schemas.ResponsesResponseUsage{
     TotalTokens:         int(metadata.TotalTokenCount),
     InputTokens:         int(metadata.PromptTokenCount),
     OutputTokens:        int(metadata.CandidatesTokenCount),
-    OutputTokensDetails: &schemas.ResponsesResponseOutputTokens{},
-    InputTokensDetails:  &schemas.ResponsesResponseInputTokens{},
 }

 // Add cached tokens if present
 if metadata.CachedContentTokenCount > 0 {
     usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
         CachedTokens: int(metadata.CachedContentTokenCount),
     }
 }

 if metadata.CandidatesTokensDetails != nil {
+    usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
     for _, detail := range metadata.CandidatesTokensDetails {
         switch detail.Modality {
         case "AUDIO":
             usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
         }
     }
 }
core/providers/gemini/responses.go (3)

539-556: Pool's New function creates state with non-zero CreatedAt, but acquireGeminiResponsesStreamState resets it.

The sync.Pool's New function initializes CreatedAt with int(time.Now().Unix()), but this value gets overwritten in acquireGeminiResponsesStreamState at line 587. The initialization in New is redundant, though harmless.

Consider simplifying the pool's New function since all fields are reset in acquire anyway:

 var geminiResponsesStreamStatePool = sync.Pool{
     New: func() interface{} {
         return &GeminiResponsesStreamState{
             ItemIDs:             make(map[int]string),
             ToolCallIDs:         make(map[int]string),
             ToolCallNames:       make(map[int]string),
             ToolArgumentBuffers: make(map[int]string),
-            CurrentOutputIndex:   0,
-            CreatedAt:            int(time.Now().Unix()),
-            HasEmittedCreated:    false,
-            HasEmittedInProgress: false,
-            TextItemClosed:       false,
-            HasStartedText:       false,
-            HasStartedToolCall:   false,
         }
     },
 }

604-636: Duplicate code: flush() is identical to the reset logic in acquireGeminiResponsesStreamState().

The flush() method (lines 604-636) duplicates the reset logic in acquireGeminiResponsesStreamState() (lines 561-593). Consider extracting to a single method.

+// reset clears all state fields to prepare for reuse
+func (state *GeminiResponsesStreamState) reset() {
+    if state.ItemIDs == nil {
+        state.ItemIDs = make(map[int]string)
+    } else {
+        clear(state.ItemIDs)
+    }
+    // ... rest of clearing logic
+    state.CreatedAt = int(time.Now().Unix())
+    state.HasEmittedCreated = false
+    // ... remaining fields
+}

 func acquireGeminiResponsesStreamState() *GeminiResponsesStreamState {
     state := geminiResponsesStreamStatePool.Get().(*GeminiResponsesStreamState)
-    // Clear maps
-    if state.ItemIDs == nil {
-    // ... all the duplicate code
+    state.reset()
     return state
 }

-func (state *GeminiResponsesStreamState) flush() {
-    // ... duplicate code
-}
+func (state *GeminiResponsesStreamState) flush() {
+    state.reset()
+}

1961-1965: Role mapping may silently default non-assistant roles to "user".

Lines 1961-1965 map assistant to "model" but default everything else to "user". If there are other roles (e.g., "developer", "tool"), they'll be incorrectly mapped.

Consider explicit handling of known roles:

 if msg.Role != nil {
     if *msg.Role == schemas.ResponsesInputMessageRoleAssistant {
         content.Role = "model"
-    } else {
+    } else if *msg.Role == schemas.ResponsesInputMessageRoleUser {
         content.Role = "user"
+    } else {
+        // Log or handle unexpected role
+        content.Role = "user" // Fallback
     }
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffd9580 and 2cc10d8.

📒 Files selected for processing (12)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (1 hunks)
  • core/providers/gemini/responses.go (9 hunks)
  • core/providers/gemini/speech.go (1 hunks)
  • core/providers/gemini/utils.go (7 hunks)
  • core/providers/vertex/vertex.go (8 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/tests/test_google.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/gemini_test.go
  • core/schemas/responses.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/gemini/speech.go
  • core/providers/gemini/utils.go
  • tests/integrations/tests/test_google.py
  • core/providers/gemini/gemini.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
🧬 Code graph analysis (7)
core/schemas/responses.go (1)
core/providers/gemini/types.go (1)
  • Schema (724-775)
core/providers/gemini/utils.go (5)
core/providers/gemini/types.go (5)
  • GeminiGenerationRequest (55-73)
  • GenerationConfig (631-697)
  • Type (778-778)
  • Schema (724-775)
  • Modality (706-706)
core/schemas/responses.go (7)
  • ResponsesParameters (85-112)
  • ResponsesTextConfig (118-121)
  • ResponsesTextConfigFormat (123-128)
  • ResponsesResponseUsage (255-262)
  • ResponsesResponseOutputTokens (272-282)
  • ResponsesResponseInputTokens (264-270)
  • ResponsesTextConfigFormatJSONSchema (131-140)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/schemas/chatcompletions.go (3)
  • BifrostLLMUsage (640-647)
  • ChatPromptTokensDetails (649-655)
  • ChatCompletionTokensDetails (657-667)
tests/integrations/tests/test_google.py (1)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
core/providers/gemini/gemini.go (5)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/providers/utils/utils.go (4)
  • ShouldSendBackRawResponse (482-487)
  • GetPathFromContext (211-216)
  • ProcessAndSendBifrostError (571-601)
  • ProcessAndSendResponse (535-565)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/gemini/responses.go (1)
  • ToGeminiResponsesRequest (77-127)
transports/bifrost-http/integrations/genai.go (5)
transports/bifrost-http/handlers/inference.go (1)
  • ResponsesRequest (193-197)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
transports/bifrost-http/integrations/router.go (3)
  • EmbeddingResponseConverter (102-102)
  • ResponsesResponseConverter (98-98)
  • ResponsesStreamResponseConverter (123-123)
core/schemas/responses.go (2)
  • BifrostResponsesResponse (45-83)
  • BifrostResponsesStreamResponse (1426-1464)
core/providers/gemini/responses.go (2)
  • ToGeminiResponsesResponse (154-340)
  • ToGeminiResponsesStreamResponse (342-510)
core/providers/gemini/chat.go (3)
core/schemas/chatcompletions.go (12)
  • ChatAssistantMessageToolCall (564-570)
  • ChatContentBlock (505-512)
  • ChatContentBlockTypeText (497-497)
  • ChatAssistantMessageToolCallFunction (573-576)
  • ChatToolChoiceTypeFunction (331-331)
  • ChatMessage (427-436)
  • ChatAssistantMessage (541-545)
  • BifrostResponseChoice (582-590)
  • ChatNonStreamResponseChoice (605-608)
  • BifrostChatResponse (26-41)
  • ChatStreamResponseChoiceDelta (616-622)
  • ChatStreamResponseChoice (611-613)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (13)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • FunctionCallingConfig (92-99)
  • SafetySetting (81-89)
  • Part (936-960)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • FunctionCall (1091-1101)
  • FunctionResponse (1104-1126)
  • FileData (1080-1088)
  • Blob (963-971)
  • Tool (600-627)
core/schemas/utils.go (4)
  • ParseModelString (23-36)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
  • Ptr (16-18)
🔇 Additional comments (21)
core/providers/gemini/gemini_test.go (1)

63-63: LGTM!

Trailing newline addition aligns with POSIX standards for text files.

tests/integrations/tests/test_google.py (1)

518-566: LGTM!

The streaming test correctly uses generate_content_stream and properly handles both response structures (candidates-based and direct text). The validation logic is thorough.

core/schemas/utils.go (1)

1052-1055: LGTM!

The IsGeminiModel function follows the established pattern of IsAnthropicModel and IsMistralModel, providing consistent model detection across the codebase.

core/providers/vertex/vertex_test.go (1)

26-27: No action required—model name resolution works correctly.

The Vertex provider explicitly handles unprefixed Gemini model names. The provider strips the google/ prefix at line 1262 of core/providers/vertex/vertex.go (deployment = strings.TrimPrefix(deployment, "google/")), and IsGeminiModel() correctly identifies Gemini models by checking for the "gemini" substring, which matches both prefixed and unprefixed names. The test configuration change to "gemini-2.0-flash-001" is compatible with existing model resolution logic.

core/schemas/responses.go (1)

132-135: Potential field duplication with parent struct.

The fields Name and Strict added here duplicate the same fields in ResponsesTextConfigFormat (lines 125, 127). This could cause confusion about which field takes precedence during serialization/deserialization.

If both structs need these fields for different serialization contexts (e.g., nested vs. flat JSON representations), consider adding a brief comment clarifying the distinction.

core/providers/vertex/vertex.go (1)

1384-1400: LGTM!

The helper function correctly strips the unsupported ID fields from function_call and function_response parts for Vertex Gemini compatibility.

transports/bifrost-http/integrations/genai.go (1)

50-75: LGTM!

The GenAI integration correctly routes non-embedding/speech/transcription requests through the Responses API path. The converter updates align with the new BifrostResponsesResponse and BifrostResponsesStreamResponse types, and the relevant code snippets from router.go confirm the expected signatures.

core/providers/gemini/chat.go (1)

160-307: LGTM!

The streaming conversion implementation is well-structured with proper handling of:

  • Thought vs regular text content differentiation
  • ThoughtSignature preservation for Gemini 3 Pro compatibility
  • Empty delta filtering to avoid unnecessary chunks
  • Proper finish reason conversion using the shared helper
core/providers/gemini/gemini.go (2)

293-485: LGTM!

The shared streaming handler for chat completion is well-implemented with:

  • Proper resource management via defer statements
  • Comprehensive error handling for cancellation, timeout, HTTP errors, and API errors
  • Appropriate buffer sizing for large streaming responses
  • Correct stream lifecycle management with end indicators

572-809: LGTM!

The responses stream handler properly manages stream state with pooling, handles errors appropriately, and finalizes the stream correctly when the scanner completes. The pattern of setting the stream end indicator for final responses ensures proper lifecycle signaling.

core/providers/gemini/utils.go (5)

11-70: LGTM on the ResponsesParameters conversion.

The conversion logic correctly maps Gemini generation config fields to the new ResponsesParameters structure. The use of ExtraParams for provider-specific fields (top_k, candidate_count, stop_sequences, presence_penalty, frequency_penalty, seed) is appropriate for maintaining flexibility.


194-218: Finish reason mapping looks correct.

The mapping from Gemini finish reasons to Bifrost format is comprehensive. Minor observation: FinishReasonMalformedFunctionCall and FinishReasonUnexpectedToolCall both map to "tool_calls", which loses some fidelity about the error nature, but this is reasonable for a normalized API.



233-260: LGTM on chat usage conversion.

The convertGeminiUsageMetadataToChatUsage function correctly handles nil input, maps token counts, and conditionally populates details for cached and reasoning tokens.


839-890: LGTM on buildOpenAIResponseFormat signature change.

The function now correctly returns *schemas.ResponsesTextConfig and properly handles the three cases: explicit schema, JSON schema from map, and fallback to json_object mode.


932-954: LGTM on extractFunctionResponseOutput helper.

The function safely handles nil checks and has a sensible fallback strategy: try "output" field first, then marshal the entire response. This is consistent with how function responses are handled elsewhere in the codebase.

core/providers/gemini/responses.go (6)

886-957: LGTM on function call part processing.

The processGeminiFunctionCallPart correctly:

  • Closes any open text item first
  • Uses function call ID with fallback to name
  • Emits both OutputItemAdded and FunctionCallArgumentsDone events
  • Properly tracks state

1115-1161: Empty text emitted on output_text.done event.

At line 1124, emptyText := "" is created and passed as the Text field in the OutputTextDone event. This seems intentional but consider whether the accumulated text should be tracked and emitted instead for completeness.

Is emitting an empty text string on output_text.done consistent with the Bifrost responses API specification? Some consumers may expect the complete text in this event.


1539-1577: Function call handling creates separate reasoning message for thought signature.

This approach stores thought signatures in a separate ResponsesReasoning message following the function call message. This is a design choice that works but introduces coupling between adjacent messages.

The approach is documented in the comments and maintains scalability. The reconstruction in convertResponsesMessagesToGeminiContents (lines 2013-2018) correctly looks ahead to find the paired reasoning message.


2063-2085: LGTM on expanded content block type handling.

The switch case now properly handles:

  • Both input and output text types
  • Reasoning content with Thought flag
  • Refusal content converted to plain text

This aligns well with the bidirectional conversion requirements.


959-1035: Inconsistent outputIndex initialization for inline data.

At lines 975-978, if !state.HasStartedText, outputIndex is set to 0, but for other part types (thought, function call), it's set to 1. This inconsistency could cause item ID collisions or ordering issues.

 // Create new output item for the inline data
 outputIndex := state.CurrentOutputIndex + 1
 if !state.HasStartedText {
-    outputIndex = 0
+    outputIndex = 0 // Inline data takes first slot if no text
 }

Consider documenting the indexing strategy or unifying the approach across all part types.

Likely an incorrect or invalid review comment.


738-799: Review comment is incorrect – no nil dereference risk exists.

MessageID is guaranteed to be initialized before processGeminiTextPart accesses it. In ToBifrostResponsesStream, the if !state.HasEmittedCreated block (which initializes MessageID) always executes before processGeminiPart is called. Since HasEmittedCreated starts as false on first invocation and is set to true immediately after initialization, subsequent events reuse the same persisted state with MessageID already set. The defensive nil check is unnecessary.

Likely an incorrect or invalid review comment.

@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from 2cc10d8 to eae8367 on December 8, 2025 05:07
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (2)
core/providers/gemini/chat.go (1)

77-112: Align non‑stream tool call IDs with streaming logic.

Here the non‑stream path always sets ID to the function name:

toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
    Index:    uint16(len(toolCalls)),
    Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
    ID:       function.Name,
    Function: function,
})

But in the streaming path you correctly prefer FunctionCall.ID when present and only fall back to the function name. For consistency (and to preserve provider‑assigned tool call IDs), the non‑stream path should use the same logic.

A minimal fix:

-				function := schemas.ChatAssistantMessageToolCallFunction{
-					Name: &part.FunctionCall.Name,
-				}
+				function := schemas.ChatAssistantMessageToolCallFunction{
+					Name: &part.FunctionCall.Name,
+				}
+
+				// Use ID if available, otherwise use function name (matches streaming path)
+				callID := part.FunctionCall.Name
+				if part.FunctionCall.ID != "" {
+					callID = part.FunctionCall.ID
+				}
...
-					toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
-						Index:    uint16(len(toolCalls)),
-						Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
-						ID:       function.Name,
-						Function: function,
-					})
+					toolCalls = append(toolCalls, schemas.ChatAssistantMessageToolCall{
+						Index:    uint16(len(toolCalls)),
+						Type:     schemas.Ptr(string(schemas.ChatToolChoiceTypeFunction)),
+						ID:       &callID,
+						Function: function,
+					})

This keeps ID behavior consistent across streaming and non‑streaming Gemini chat responses and avoids losing original call IDs.

core/providers/gemini/responses.go (1)

114-123: Guard against nil bifrostReq.Params before accessing ExtraParams.

At line 114, bifrostReq.Params.ExtraParams is accessed without first verifying that bifrostReq.Params is non-nil. The nil-check at line 87 only protects lines 88–99; after that block, Params can still be nil (e.g., if the request has no parameters but does have ExtraParams in a different code path). Dereferencing .ExtraParams on a nil Params will panic.

Apply this diff to guard the access:

-	if bifrostReq.Params.ExtraParams != nil {
+	if bifrostReq.Params != nil && bifrostReq.Params.ExtraParams != nil {
 		if safetySettings, ok := schemas.SafeExtractFromMap(bifrostReq.Params.ExtraParams, "safety_settings"); ok {
🧹 Nitpick comments (3)
core/providers/gemini/gemini_test.go (1)

35-56: Consider enabling reasoning and transcription scenarios to cover new Gemini capabilities

Since this PR adds Gemini-native reasoning and transcription support, it may be worth setting Reasoning, Transcription, and (if implemented) TranscriptionStream to true here (or adding dedicated tests) so the comprehensive suite exercises those paths end-to-end.

tests/integrations/tests/test_google.py (1)

507-517: Clarify intent of test_12_error_handling_invalid_roles.

The test name/docstring still talk about “error handling” but the body now asserts a successful response with candidates. If the new contract is “invalid roles are tolerated/normalized,” consider renaming the test or updating the docstring to reflect that it verifies graceful handling rather than an error path.

core/providers/vertex/vertex_test.go (1)

24-48: Model IDs update LGTM; comment for Reasoning scenario is now stale.

Switching ChatModel/VisionModel to gemini-2.0-flash-001 aligns these tests with the new native Gemini paths. However, the Reasoning: false comment still says “because we are not using native gemini converters”, which is no longer accurate. Consider updating that comment (or adding a TODO) so future readers don’t misinterpret why reasoning is disabled.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 2cc10d8 and eae8367.

📒 Files selected for processing (12)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (1 hunks)
  • core/providers/gemini/responses.go (9 hunks)
  • core/providers/gemini/speech.go (1 hunks)
  • core/providers/gemini/utils.go (7 hunks)
  • core/providers/vertex/vertex.go (8 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/tests/test_google.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/schemas/responses.go
  • core/providers/gemini/speech.go
  • core/schemas/utils.go
  • transports/bifrost-http/integrations/genai.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/vertex/vertex.go
  • core/providers/vertex/vertex_test.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/gemini_test.go
  • tests/integrations/tests/test_google.py
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/responses.go
🧬 Code graph analysis (6)
core/providers/vertex/vertex.go (6)
core/schemas/utils.go (1)
  • IsGeminiModel (1052-1054)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/providers/gemini/types.go (1)
  • GeminiGenerationRequest (55-73)
core/providers/utils/utils.go (1)
  • HandleProviderResponse (343-387)
core/providers/gemini/gemini.go (2)
  • HandleGeminiChatCompletionStream (294-485)
  • HandleGeminiResponsesStream (573-809)
core/providers/gemini/responses.go (1)
  • ToGeminiResponsesRequest (76-126)
core/providers/gemini/utils.go (3)
core/providers/gemini/types.go (7)
  • GeminiGenerationRequest (55-73)
  • GenerationConfig (631-697)
  • Type (778-778)
  • Schema (724-775)
  • GenerateContentResponseUsageMetadata (1259-1284)
  • Modality (706-706)
  • FunctionResponse (1104-1126)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
tests/integrations/tests/test_google.py (1)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
core/providers/gemini/chat.go (4)
core/schemas/chatcompletions.go (12)
  • ChatAssistantMessageToolCall (564-570)
  • ChatContentBlock (505-512)
  • ChatAssistantMessageToolCallFunction (573-576)
  • ChatToolChoiceTypeFunction (331-331)
  • ChatMessage (427-436)
  • ChatMessageRoleAssistant (419-419)
  • ChatAssistantMessage (541-545)
  • BifrostResponseChoice (582-590)
  • BifrostChatResponse (26-41)
  • ChatStreamResponseChoiceDelta (616-622)
  • ChatToolTypeFunction (197-197)
  • ChatStreamResponseChoice (611-613)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/gemini.go (8)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/schemas/bifrost.go (3)
  • ChatCompletionRequest (89-89)
  • Gemini (48-48)
  • ResponsesRequest (91-91)
core/schemas/provider.go (1)
  • Provider (280-307)
core/providers/utils/utils.go (3)
  • ShouldSendBackRawResponse (482-487)
  • GetPathFromContext (211-216)
  • GetBifrostResponseForStreamResponse (777-805)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/schemas/responses.go (2)
  • BifrostResponsesRequest (32-39)
  • BifrostResponsesStreamResponse (1426-1464)
core/providers/gemini/responses.go (2)
  • ToGeminiResponsesRequest (76-126)
  • FinalizeGeminiResponsesStream (1210-1212)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (15)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • FunctionCallingConfig (92-99)
  • SafetySetting (81-89)
  • Part (936-960)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • FunctionCall (1091-1101)
  • FunctionResponse (1104-1126)
  • FileData (1080-1088)
  • Blob (963-971)
  • Tool (600-627)
  • FunctionCallingConfigModeAuto (109-109)
  • FunctionCallingConfigModeNone (116-116)
core/schemas/responses.go (14)
  • BifrostResponsesResponse (45-83)
  • ResponsesInputMessageRoleAssistant (332-332)
  • ResponsesInputMessageRoleUser (333-333)
  • ResponsesMessageTypeFunctionCall (296-296)
  • ResponsesToolMessage (461-481)
  • ResponsesReasoning (730-733)
  • ResponsesMessageTypeFunctionCallOutput (297-297)
  • ResponsesMessageTypeReasoning (308-308)
  • BifrostResponsesStreamResponse (1426-1464)
  • ResponsesMessage (315-327)
  • ResponsesMessageTypeMessage (291-291)
  • ResponsesMessageContent (339-344)
  • ResponsesMessageContentBlock (399-410)
  • ResponsesTool (1029-1044)
core/schemas/utils.go (3)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
  • Ptr (16-18)
🔇 Additional comments (8)
core/providers/gemini/gemini_test.go (1)

63-63: No functional change – formatting only

This change just adds the final newline/closing brace; there’s no behavioral impact and the test setup remains valid.

tests/integrations/tests/test_google.py (1)

518-566: Streaming aggregation logic for test_13_streaming looks solid.

Chunk iteration, content aggregation, and the robot-related assertions align with the Gemini streaming shape (candidates → content.parts → text) and validate both that chunks arrive and that the story is on-topic. No issues from a provider/streaming perspective.

core/providers/vertex/vertex.go (2)

272-305: Vertex Gemini chat paths are wired coherently.

The Gemini branches in ChatCompletion and ChatCompletionStream correctly:

  • Build requests via gemini.ToGeminiChatCompletionRequest,
  • Strip unsupported function IDs with stripVertexGeminiUnsupportedFields,
  • Route to native generateContent / streamGenerateContent?alt=sse endpoints, and
  • Map responses back through Gemini’s converters with consistent ExtraFields (provider, model, latency, raw response).

Nothing blocking here from a Vertex/Gemini integration standpoint.

Also applies to: 386-391, 477-497


1391-1407: stripVertexGeminiUnsupportedFields correctly clears unsupported IDs in-place.

Iterating over requestBody.Contents and mutating part.FunctionCall.ID / part.FunctionResponse.ID via the *Part pointers ensures the underlying request is sanitized for Vertex Gemini without extra allocations. This matches the stated intent of dropping unsupported id fields before hitting Vertex.

core/providers/gemini/chat.go (1)

160-307: Streaming conversion (ToBifrostChatCompletionStream) is well‑structured.

The streaming delta builder:

  • Separates regular text vs. thought content via part.Thought,
  • Properly maps Gemini RoleModel → assistant,
  • Preserves tool calls (with IDs and Gemini thought signatures in ExtraContent), and
  • Emits a final chunk with finish reason and usage only once.

This matches Bifrost’s streaming schema expectations and should interoperate cleanly with the new Gemini/Vertex streaming handlers.

core/providers/gemini/utils.go (1)

11-70: Responses parameter, finish‑reason, usage, and function‑response helpers look consistent.

  • convertGenerationConfigToResponsesParameters cleanly maps Gemini GenerationConfig into ResponsesParameters, using ExtraParams for knobs like top_k, penalties, seed, stop sequences, and response schemas, and wiring ResponseMIMEType into the new ResponsesTextConfig.
  • ConvertGeminiFinishReasonToBifrost and geminiFinishReasonToBifrost cover the expected Gemini finish reasons and sensibly default to the raw provider value.
  • The two usage converters correctly project GenerateContentResponseUsageMetadata into chat and responses usage structs, including cached and reasoning tokens where present.
  • extractFunctionResponseOutput gives a reasonable best‑effort: prefer "output" as a string, otherwise fall back to JSON of the full response map.

Nothing blocking here; these helpers should make request/response conversions much easier to maintain.

Also applies to: 194-293, 839-890, 933-954

core/providers/gemini/gemini.go (2)

209-244: Native Gemini ChatCompletion and streaming wiring look correct.

The new ChatCompletion and ChatCompletionStream:

  • Gate operations via CheckOperationAllowed,
  • Build request bodies with ToGeminiChatCompletionRequest,
  • Use a shared completeRequest for non‑streaming and HandleGeminiChatCompletionStream for SSE streaming,
  • Populate ExtraFields (provider, model, latency, raw response) consistently.

The streaming handler’s SSE loop, chunk indexing, finish detection, and error propagation all align with Bifrost’s streaming contracts.

Also applies to: 246-307


487-526: Responses and ResponsesStream Gemini paths are consistent with chat streaming.

Responses and ResponsesStream mirror the chat flows:

  • Convert via ToGeminiResponsesRequest,
  • Reuse completeRequest for non‑streaming and HandleGeminiResponsesStream for SSE,
  • Maintain per‑chunk sequence indices, latency, and finalization (including FinalizeGeminiResponsesStream and usage attachment),
  • Integrate with post‑hooks and GetBifrostResponseForStreamResponse like other providers.

The shared processGeminiStreamChunk and extractGeminiUsageMetadata helpers centralize parsing and usage extraction, reducing duplicated error‑prone logic. No issues spotted in these paths.

Also applies to: 528-809

@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch 2 times, most recently from e0964ba to 461269a on December 8, 2025 05:56
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
core/providers/gemini/responses.go (1)

76-126: Guard against nil Params and avoid silently swallowing conversion errors in ToGeminiResponsesRequest

Two issues here:

  1. Nil pointer risk on Params (regression of earlier fix).

bifrostReq.Params is optional on BifrostResponsesRequest, but this block unconditionally dereferences it:

if bifrostReq.Params.ExtraParams != nil {
    ...
}

If a caller provides Input without Params, this will panic.

Minimal fix:

- if bifrostReq.Params.ExtraParams != nil {
+ if bifrostReq.Params != nil && bifrostReq.Params.ExtraParams != nil {
     if safetySettings, ok := schemas.SafeExtractFromMap(bifrostReq.Params.ExtraParams, "safety_settings"); ok {
         ...
     }
 }
  1. Conversion errors from convertResponsesMessagesToGeminiContents are lost.

If convertResponsesMessagesToGeminiContents returns an error (e.g., invalid image/audio block), ToGeminiResponsesRequest returns nil, but the callers wrap it in:

func() (any, error) { return ToGeminiResponsesRequest(request), nil }

So CheckContextAndGetRequestBody never sees the error and happily marshals a nil body (null) instead of returning a structured BifrostError.

Consider changing ToGeminiResponsesRequest to return ( *GeminiGenerationRequest, error ) and threading that through the builder, so malformed Responses inputs fail fast with a clear error instead of a confusing provider 4xx.

🧹 Nitpick comments (4)
tests/integrations/tests/test_google.py (1)

507-516: Test name no longer reflects actual behavior.

The test is named test_12_error_handling_invalid_roles but now expects a successful response with valid candidates. The docstring still says "Error handling for invalid roles" but the test asserts success.

If invalid roles no longer cause errors (perhaps Bifrost/Gemini now handles them gracefully), consider renaming the test to reflect its actual purpose, e.g., test_12_invalid_roles_handled_gracefully.

 @skip_if_no_api_key("google")
-def test_12_error_handling_invalid_roles(self, google_client, test_config):
-    """Test Case 12: Error handling for invalid roles"""
+def test_12_invalid_roles_handled_gracefully(self, google_client, test_config):
+    """Test Case 12: Verify invalid roles are handled gracefully without errors"""
     response = google_client.models.generate_content(
         model=get_model("google", "chat"), contents=GENAI_INVALID_ROLE_CONTENT
     )
core/providers/gemini/utils.go (2)

194-218: Fallback for unknown finish reasons may leak raw provider values

ConvertGeminiFinishReasonToBifrost falls back to string(providerReason) when a reason isn’t in geminiFinishReasonToBifrost. That can expose new/raw Gemini enums instead of a stable Bifrost set (stop, length, content_filter, tool_calls, etc.).

Consider normalizing unknown reasons to a generic bucket (e.g. "other" or "stop") and emitting raw strings only via an auxiliary field if you need them.


233-293: Also surface reasoning tokens in Responses usage metadata

convertGeminiUsageMetadataToChatUsage correctly maps ThoughtsTokenCount into ChatCompletionTokensDetails.ReasoningTokens, but convertGeminiUsageMetadataToResponsesUsage ignores it even though ResponsesResponseOutputTokens supports reasoning tokens.

You can preserve that signal like this:

 func convertGeminiUsageMetadataToResponsesUsage(metadata *GenerateContentResponseUsageMetadata) *schemas.ResponsesResponseUsage {
@@
-    usage := &schemas.ResponsesResponseUsage{
-        TotalTokens:         int(metadata.TotalTokenCount),
-        InputTokens:         int(metadata.PromptTokenCount),
-        OutputTokens:        int(metadata.CandidatesTokenCount),
-        OutputTokensDetails: &schemas.ResponsesResponseOutputTokens{},
-        InputTokensDetails:  &schemas.ResponsesResponseInputTokens{},
-    }
+    usage := &schemas.ResponsesResponseUsage{
+        TotalTokens:  int(metadata.TotalTokenCount),
+        InputTokens:  int(metadata.PromptTokenCount),
+        OutputTokens: int(metadata.CandidatesTokenCount),
+        // details structs are allocated lazily below
+    }
@@
-    if metadata.CachedContentTokenCount > 0 {
-        usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
-            CachedTokens: int(metadata.CachedContentTokenCount),
-        }
-    }
+    if metadata.CachedContentTokenCount > 0 {
+        if usage.InputTokensDetails == nil {
+            usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{}
+        }
+        usage.InputTokensDetails.CachedTokens = int(metadata.CachedContentTokenCount)
+    }
@@
-    if metadata.CandidatesTokensDetails != nil {
+    if metadata.CandidatesTokensDetails != nil {
         for _, detail := range metadata.CandidatesTokensDetails {
@@
-            case "AUDIO":
-                usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
+            case "AUDIO":
+                if usage.OutputTokensDetails == nil {
+                    usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
+                }
+                usage.OutputTokensDetails.AudioTokens = int(detail.TokenCount)
             }
         }
     }
+
+    // Also map reasoning tokens from thoughts
+    if metadata.ThoughtsTokenCount > 0 {
+        if usage.OutputTokensDetails == nil {
+            usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{}
+        }
+        usage.OutputTokensDetails.ReasoningTokens = int(metadata.ThoughtsTokenCount)
+    }

This keeps Chat and Responses usage aligned and makes reasoning‑token accounting available to clients.

core/providers/gemini/responses.go (1)

1224-1360: Inline/file output blocks always use input content block types

In convertGeminiContentsToResponsesMessages:

  • Text blocks correctly use input vs output types based on content.Role (modelResponsesOutputMessageContentTypeText, user → input text).
  • For inline data and file data, you always use the input block types (ResponsesInputMessageContentBlockTypeImage/Audio/File), even when the role is assistant/model:
Type: func() schemas.ResponsesMessageContentBlockType {
    if strings.HasPrefix(part.InlineData.MIMEType, "image/") {
        return schemas.ResponsesInputMessageContentBlockTypeImage
    } else if strings.HasPrefix(part.InlineData.MIMEType, "audio/") {
        return schemas.ResponsesInputMessageContentBlockTypeAudio
    }
    return schemas.ResponsesInputMessageContentBlockTypeText
}(),

For consistency with how text/reasoning are handled (and with the Responses schema’s separation of input vs output block types), it may be worth:

  • Selecting output variants when content.Role == "model" for images/audio/files, and
  • Reserving input variants for user‑authored content.

If existing consumers don’t care about the distinction this is cosmetic, but aligning the types now will make downstream interpretation (e.g., UI rendering, analytics) more predictable.
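The role-based selection suggested above can be sketched as follows. The constant names here are illustrative stand-ins; the real block type constants live in core/schemas/responses.go under different names:

```go
package main

import (
	"fmt"
	"strings"
)

// BlockType is a hypothetical stand-in for schemas.ResponsesMessageContentBlockType.
type BlockType string

const (
	InputImage  BlockType = "input_image"
	OutputImage BlockType = "output_image"
	InputAudio  BlockType = "input_audio"
	OutputAudio BlockType = "output_audio"
	InputText   BlockType = "input_text"
	OutputText  BlockType = "output_text"
)

// blockTypeForInlineData selects an output variant when the content was
// authored by the model and an input variant for user-authored content.
func blockTypeForInlineData(mimeType, role string) BlockType {
	fromModel := role == "model"
	switch {
	case strings.HasPrefix(mimeType, "image/"):
		if fromModel {
			return OutputImage
		}
		return InputImage
	case strings.HasPrefix(mimeType, "audio/"):
		if fromModel {
			return OutputAudio
		}
		return InputAudio
	default:
		if fromModel {
			return OutputText
		}
		return InputText
	}
}

func main() {
	fmt.Println(blockTypeForInlineData("image/png", "model")) // output_image
	fmt.Println(blockTypeForInlineData("audio/wav", "user"))  // input_audio
}
```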

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eae8367 and 461269a.

📒 Files selected for processing (12)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (1 hunks)
  • core/providers/gemini/responses.go (9 hunks)
  • core/providers/gemini/speech.go (1 hunks)
  • core/providers/gemini/utils.go (7 hunks)
  • core/providers/vertex/vertex.go (8 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/tests/test_google.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/vertex/vertex_test.go
  • core/schemas/responses.go
  • core/providers/gemini/gemini_test.go
  • core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • tests/integrations/tests/test_google.py
  • core/providers/gemini/speech.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/vertex/vertex.go
  • core/providers/gemini/responses.go
🧬 Code graph analysis (6)
tests/integrations/tests/test_google.py (1)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
transports/bifrost-http/integrations/genai.go (4)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/providers/gemini/embedding.go (1)
  • ToGeminiEmbeddingResponse (57-100)
core/schemas/responses.go (2)
  • BifrostResponsesResponse (45-83)
  • BifrostResponsesStreamResponse (1426-1464)
core/providers/gemini/responses.go (2)
  • ToGeminiResponsesResponse (153-339)
  • ToGeminiResponsesStreamResponse (341-509)
core/providers/gemini/utils.go (5)
core/providers/gemini/types.go (16)
  • Type (778-778)
  • Schema (724-775)
  • FinishReason (22-22)
  • FinishReasonStop (28-28)
  • FinishReasonMaxTokens (30-30)
  • FinishReasonSafety (33-33)
  • FinishReasonRecitation (35-35)
  • FinishReasonLanguage (37-37)
  • FinishReasonOther (39-39)
  • FinishReasonBlocklist (41-41)
  • FinishReasonProhibitedContent (43-43)
  • FinishReasonSPII (46-46)
  • FinishReasonMalformedFunctionCall (48-48)
  • FinishReasonImageSafety (50-50)
  • FinishReasonUnexpectedToolCall (52-52)
  • FunctionResponse (1104-1126)
core/schemas/responses.go (7)
  • ResponsesParameters (85-112)
  • ResponsesTextConfig (118-121)
  • ResponsesTextConfigFormat (123-128)
  • ResponsesResponseUsage (255-262)
  • ResponsesResponseOutputTokens (272-282)
  • ResponsesResponseInputTokens (264-270)
  • ResponsesTextConfigFormatJSONSchema (131-140)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/schemas/chatcompletions.go (3)
  • BifrostLLMUsage (640-647)
  • ChatPromptTokensDetails (649-655)
  • ChatCompletionTokensDetails (657-667)
core/providers/gemini/chat.go (3)
core/schemas/chatcompletions.go (7)
  • ChatAssistantMessageToolCall (564-570)
  • ChatContentBlock (505-512)
  • ChatAssistantMessage (541-545)
  • BifrostResponseChoice (582-590)
  • BifrostChatResponse (26-41)
  • ChatStreamResponseChoiceDelta (616-622)
  • ChatStreamResponseChoice (611-613)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/gemini.go (5)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/providers/utils/utils.go (4)
  • ShouldSendBackRawResponse (482-487)
  • GetPathFromContext (211-216)
  • NewBifrostOperationError (449-460)
  • ProcessAndSendResponse (535-565)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/gemini/responses.go (2)
  • ToGeminiResponsesRequest (76-126)
  • FinalizeGeminiResponsesStream (1220-1222)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (11)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • SafetySetting (81-89)
  • Part (936-960)
  • Content (922-930)
  • Type (778-778)
  • FunctionCall (1091-1101)
  • FunctionResponse (1104-1126)
  • FileData (1080-1088)
  • Blob (963-971)
  • Tool (600-627)
core/schemas/responses.go (9)
  • BifrostResponsesRequest (32-39)
  • ResponsesParameters (85-112)
  • BifrostResponsesResponse (45-83)
  • BifrostResponsesStreamResponse (1426-1464)
  • ResponsesMessage (315-327)
  • ResponsesMessageContent (339-344)
  • ResponsesMessageContentBlock (399-410)
  • ResponsesTool (1029-1044)
  • ResponsesToolChoice (958-961)
core/schemas/utils.go (3)
  • ParseModelString (23-36)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
🔇 Additional comments (31)
core/providers/gemini/speech.go (1)

110-127: LGTM! Proper nil check added to prevent potential panic.

The nil check on line 110 correctly guards against nil pointer dereference when bifrostReq.Input is nil. This is a good defensive programming practice.

One consideration: the voice config handling (lines 121-126) is now inside the if block, meaning if Input is nil or empty, the request will have no contents and no voice config. This appears intentional since a speech request without input text would be invalid anyway.

tests/integrations/tests/test_google.py (1)

519-566: LGTM! Well-implemented streaming test.

The streaming test properly:

  1. Uses the correct Google GenAI SDK streaming method (generate_content_stream)
  2. Handles both the structured response format and fallback to direct text attribute
  3. Validates chunk count and content length
  4. Uses a comprehensive list of robot-related terms for flexible content validation

Good defensive handling of the response structure at lines 537-546.

transports/bifrost-http/integrations/genai.go (2)

49-62: LGTM! Correctly migrated to Responses API.

The changes properly:

  1. Switch from ChatRequest to ResponsesRequest with ToBifrostResponsesRequest() (line 51)
  2. Update converters to use BifrostResponsesResponse and ToGeminiResponsesResponse (lines 60-61)

This aligns with the PR objective to use native Gemini API format instead of OpenAI-compatible endpoints.


72-78: LGTM! Streaming converter updated for Responses API.

The streaming converter is correctly renamed and updated to use BifrostResponsesStreamResponse and ToGeminiResponsesStreamResponse, maintaining consistency with the non-streaming path.

core/providers/vertex/vertex.go (6)

297-312: LGTM! Gemini branch in ChatCompletion properly implemented.

The Gemini model handling correctly:

  1. Uses gemini.ToGeminiChatCompletionRequest() for conversion
  2. Strips unsupported fields via stripVertexGeminiUnsupportedFields()
  3. Sets the deployment model
  4. Follows the same marshal/unmarshal pattern as other branches for Vertex API compatibility

478-499: LGTM! Gemini response handling in ChatCompletion is correct.

The response processing:

  1. Uses the proper GenerateContentResponse type
  2. Sets all required ExtraFields including ModelDeployment when applicable
  3. Handles raw response correctly based on configuration

656-668: Verify API version consistency between streaming and non-streaming paths.

The streaming Gemini URLs use v1beta1:

completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1beta1/projects/%s/locations/%s/publishers/google/models/%s:streamGenerateContent", ...)

However, the non-streaming ChatCompletion path (lines 388-391) uses v1:

completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:generateContent", ...)

If this is intentional (e.g., SSE streaming only available in v1beta1), this is fine. Otherwise, consider aligning the versions.


916-1012: LGTM! Gemini branch in Responses is well-implemented.

The implementation correctly:

  1. Calls stripVertexGeminiUnsupportedFields() (line 927)
  2. Handles auth errors by removing client from pool (lines 984-986)
  3. Sets ModelDeployment when it differs from request.Model (lines 1004-1006)
  4. Uses consistent v1 API endpoint matching the non-streaming ChatCompletion path

All previously raised review concerns have been addressed.


1137-1229: LGTM! ResponsesStream Gemini branch properly implemented.

The implementation:

  1. Correctly calls stripVertexGeminiUnsupportedFields() (line 1159) - addressing the previous review concern
  2. Uses v1beta1 endpoint for streaming (consistent with ChatCompletionStream)
  3. Properly handles authentication via API key or OAuth2
  4. Uses the shared gemini.HandleGeminiResponsesStream handler

1395-1411: LGTM! Helper function correctly strips unsupported fields.

The stripVertexGeminiUnsupportedFields function properly:

  1. Iterates through all contents and parts
  2. Clears the ID field from FunctionCall and FunctionResponse objects
  3. Operates in-place on the request (no allocation overhead)

Note: Setting ID = "" works correctly with omitempty JSON tags, ensuring the field is omitted from the serialized request.
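The in-place stripping can be sketched with cut-down stand-ins for the Gemini request types (the real definitions are in core/providers/gemini/types.go):

```go
package main

import "fmt"

// Minimal stand-ins for the Gemini request types.
type FunctionCall struct{ ID, Name string }
type FunctionResponse struct{ ID, Name string }
type Part struct {
	FunctionCall     *FunctionCall
	FunctionResponse *FunctionResponse
}
type Content struct{ Parts []*Part }
type GeminiGenerationRequest struct{ Contents []Content }

// stripUnsupportedIDs clears the ID field on function calls/responses in
// place; with omitempty JSON tags an empty ID is dropped from the wire format.
func stripUnsupportedIDs(req *GeminiGenerationRequest) {
	for ci := range req.Contents {
		for _, part := range req.Contents[ci].Parts {
			if part.FunctionCall != nil {
				part.FunctionCall.ID = ""
			}
			if part.FunctionResponse != nil {
				part.FunctionResponse.ID = ""
			}
		}
	}
}

func main() {
	req := &GeminiGenerationRequest{Contents: []Content{{Parts: []*Part{
		{FunctionCall: &FunctionCall{ID: "call_1", Name: "get_weather"}},
	}}}}
	stripUnsupportedIDs(req)
	fmt.Println(req.Contents[0].Parts[0].FunctionCall.ID == "") // true
}
```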

core/providers/gemini/chat.go (3)

11-23: LGTM! Signature simplification is appropriate.

The removal of the responseModalities parameter from ToGeminiChatCompletionRequest simplifies the API. The function now internally passes an empty modalities list to convertParamsToGenerationConfig, which is appropriate for chat completions where response modalities are typically not needed.


77-157: LGTM! Non-streaming response handling is well-structured.

The updated ToBifrostChatResponse:

  1. Properly aggregates content blocks and tool calls from all parts
  2. Uses FunctionCall.ID when available, falling back to name (lines 106-109) - addressing the previous review concern
  3. Handles FunctionResponse by extracting output to text content blocks
  4. Correctly structures the response with ChatNonStreamResponseChoice

The aggregation pattern ensures all content from Gemini's multi-part response format is captured in the Bifrost response.
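The ID fallback described above can be sketched as a small helper (toolCallID is a hypothetical name; in the real code the logic is inline in ToBifrostChatResponse):

```go
package main

import "fmt"

// FunctionCall mirrors the relevant fields of the Gemini type (assumed shape).
type FunctionCall struct {
	ID   string
	Name string
}

// toolCallID prefers the Gemini-supplied call ID and falls back to the
// function name so downstream tool-result correlation never sees an empty ID.
func toolCallID(fc FunctionCall) string {
	if fc.ID != "" {
		return fc.ID
	}
	return fc.Name
}

func main() {
	fmt.Println(toolCallID(FunctionCall{ID: "call_abc", Name: "get_weather"})) // call_abc
	fmt.Println(toolCallID(FunctionCall{Name: "get_weather"}))                 // get_weather
}
```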


165-312: LGTM! Well-implemented streaming response conversion.

The ToBifrostChatCompletionStream function:

  1. Correctly handles streaming state - Returns (response, error, isLastChunk) tuple for proper stream lifecycle management
  2. Role mapping - Converts Gemini's "model" role to "assistant" (lines 200-206)
  3. Content separation - Properly distinguishes between regular text, thought/reasoning content, and tool calls (lines 212-265)
  4. ThoughtSignature preservation - Correctly preserves thought_signature in ExtraContent for Gemini 3 Pro compatibility (lines 247-254)
  5. Empty chunk handling - Skips chunks with no meaningful content unless it's the final chunk (lines 283-287)
  6. Finish reason and usage - Only attached on the last chunk (lines 291-294, 307-309)

The implementation correctly converts Gemini's streaming format to Bifrost's streaming format.

core/providers/gemini/utils.go (3)

11-70: GenerationConfig → ResponsesParameters mapping looks correct

The conversion from GenerationConfig to schemas.ResponsesParameters (temperature, top‑p, logprobs, max tokens, stop sequences, penalties, response MIME/schema) is consistent with the Responses schema and keeps provider‑specific knobs under ExtraParams. No functional issues spotted here.


839-890: Structured response_format → ResponsesTextConfig mapping looks solid

buildOpenAIResponseFormat’s conversion of Gemini Schema / responseJsonSchema into ResponsesTextConfig (json_schema vs json_object fallback, lowercasing types, preserving title as Name, and defaulting Strict to false) matches the Responses text config types and should interop cleanly with OpenAI‑style structured outputs.


933-954: Function response extraction provides sensible fallback

extractFunctionResponseOutput preferring a top‑level "output" string and otherwise JSON‑encoding the whole Response map is a good compromise to get both clean text outputs and robust behavior for arbitrary payloads.
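That fallback can be sketched like this, assuming a generic map payload (the real helper operates on FunctionResponse.Response and may differ in detail):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractOutput prefers a top-level "output" string and otherwise
// JSON-encodes the whole response map, returning "" only on failure.
func extractOutput(resp map[string]any) string {
	if resp == nil {
		return ""
	}
	if out, ok := resp["output"].(string); ok {
		return out
	}
	b, err := json.Marshal(resp)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(extractOutput(map[string]any{"output": "72F and sunny"})) // 72F and sunny
	fmt.Println(extractOutput(map[string]any{"temp": 72}))                // {"temp":72}
}
```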

core/providers/gemini/gemini.go (7)

209-244: ChatCompletion now uses native Gemini converters correctly

The non‑streaming ChatCompletion path cleanly switches to ToGeminiChatCompletionRequest + completeRequest + ToBifrostChatResponse, with ExtraFields (provider, model, request type, latency, raw response) populated consistently with other providers. No issues here.


246-291: ChatCompletionStream wiring into shared Gemini SSE handler looks good

The streaming chat entrypoint delegates to HandleGeminiChatCompletionStream with the right URL (/models/<model>:streamGenerateContent?alt=sse via context override), headers, extra headers, and provider/model metadata. This keeps the provider surface small and reuses the common Gemini SSE logic.


293-485: Shared chat streaming handler has robust error and lifecycle handling

HandleGeminiChatCompletionStream correctly:

  • Sets resp.StreamBody = true and uses bufio.Scanner over BodyStream().
  • Handles network errors (cancel, timeout, generic) and HTTP errors with structured BifrostErrors.
  • Parses SSE data: lines into GenerateContentResponse chunks via processGeminiStreamChunk.
  • Tracks responseID/modelName, converts chunks via ToBifrostChatCompletionStream, and annotates ExtraFields (request type, provider, model, chunk index, per‑chunk and total latency, optional raw chunk).

Stream‑end is signaled both on isLastChunk and scanner errors with the appropriate context key and post‑hook processing. No correctness concerns.


487-526: Non‑streaming Responses path correctly reuses core request/response helpers

GeminiProvider.Responses now:

  • Gates via CheckOperationAllowed for Gemini/Responses.
  • Uses ToGeminiResponsesRequest + completeRequest to hit :generateContent.
  • Converts via ToResponsesBifrostResponsesResponse and fills ExtraFields (provider, model, request type, latency, optional raw response).

Assuming ToGeminiResponsesRequest is fixed as noted in responses.go (nil‑Params guard & error surfacing), this implementation looks correct.


528-809: Responses streaming integration aligns with the new Responses state machine

ResponsesStream and HandleGeminiResponsesStream:

  • Share the same SSE plumbing as chat (headers, URL, error handling).
  • Maintain chunkIndex and a global sequenceNumber passed into ToBifrostResponsesStream, so sequence_number is strictly increasing across all events.
  • Track stream state via GeminiResponsesStreamState (acquired/released from pool), keeping tool‑call lifecycle and text items coherent.
  • On normal EOF, finalize with FinalizeGeminiResponsesStream, which now guards against double response.completed / output_item.done via HasEmittedCompleted.

This addresses the earlier double‑terminal‑event issue and keeps the Responses streaming contract intact.


872-1077: Speech streaming reuse of Gemini chunk processing is consistent

SpeechStream now uses processGeminiStreamChunk and extractGeminiUsageMetadata to:

  • Parse SSE chunks into GenerateContentResponse.
  • Aggregate inline audio data from InlineData parts.
  • Attach token usage from the terminal chunk into SpeechUsage.
  • Emit SpeechStreamResponseTypeDelta chunks with per‑chunk latency and optional raw payload, followed by a single SpeechStreamResponseTypeDone with total latency and usage.

Network and API errors are handled analogously to chat/responses. The structure looks sound.


1345-1377: Shared helpers for stream chunk parsing and usage extraction are straightforward

processGeminiStreamChunk cleanly separates error‑payload detection ("error" key) from normal chunks and returns descriptive errors used upstream to differentiate Gemini API errors from decode issues. extractGeminiUsageMetadata centralizes token extraction for reuse across speech/transcription streaming logic. Both are minimal and correct.
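The error-payload split can be sketched as follows; the real processGeminiStreamChunk decodes into GenerateContentResponse rather than a generic map, and the function shape here is an assumption:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseStreamLine strips SSE "data: " framing, decodes the payload, and
// treats a top-level "error" key as a Gemini API error rather than a
// decode failure, so callers can differentiate the two.
func parseStreamLine(line string) (map[string]any, error) {
	payload, ok := strings.CutPrefix(line, "data: ")
	if !ok {
		return nil, fmt.Errorf("not a data line: %q", line)
	}
	var chunk map[string]any
	if err := json.Unmarshal([]byte(payload), &chunk); err != nil {
		return nil, fmt.Errorf("decode error: %w", err)
	}
	if apiErr, hasErr := chunk["error"]; hasErr {
		return nil, fmt.Errorf("gemini API error: %v", apiErr)
	}
	return chunk, nil
}

func main() {
	chunk, err := parseStreamLine(`data: {"candidates":[{"content":{"parts":[{"text":"hi"}]}}]}`)
	fmt.Println(err == nil, chunk != nil) // true true
	_, err = parseStreamLine(`data: {"error":{"code":429}}`)
	fmt.Println(err != nil) // true
}
```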

core/providers/gemini/responses.go (8)

15-74: Provider→Responses request conversion is now safer and more complete

ToBifrostResponsesRequest:

  • Parses request.Model into (provider, model) via ParseModelString.
  • Always initializes params (with ExtraParams map) via convertGenerationConfigToResponsesParameters, then overlays instructions, tools/toolChoice, safety settings, cached_content.
  • Populates bifrostReq.Input from Contents via convertGeminiContentsToResponsesMessages.

This removes previous nil‑Params hazards and ensures all Gemini metadata (safety, cached content) is preserved through the Responses layer.


153-339: BifrostResponsesResponse → Gemini GenerateContentResponse mapping is thorough

ToGeminiResponsesResponse:

  • Copies Model and ID/CreatedAt into ModelVersion, ResponseID, and CreateTime.
  • Groups output messages by role into candidates (role: "model"/"user"), translating:
    • Text and content blocks via convertContentBlockToGeminiPart.
    • Function calls with proper Name, ID (from CallID), arguments JSON, and optional ThoughtSignature from adjacent reasoning messages.
    • Function call outputs into FunctionResponse with Name, ID, and structured Response.
    • Reasoning messages into Thought text parts.

Finish reasons are derived from IncompleteDetails.Reason (mapping max_tokens and content_filter to Gemini enums, defaulting to STOP/OTHER), and usage is reconstructed from bifrostResp.Usage. This looks correct and consistent with the other converters.


341-509: Responses streaming back‑conversion covers the important event types

ToGeminiResponsesStreamResponse:

  • Ignores purely lifecycle events (ping, created, in_progress, etc.) and only materializes events that correspond to Gemini content: text deltas, reasoning deltas, function calls, content parts, completions, failures, refusals.
  • Builds a single Candidate with appropriate Parts and FinishReason, and restores basic usage metadata on Completed events (including cached tokens and audio tokens via CandidatesTokensDetails).

The behavior for FunctionCallArgumentsDelta/ContentPartDone/reasoning‑done events (returning nil and letting the caller rely on the final Completed event) is reasonable, and should be enough for Vertex or other Gemini‑facing integrations that only need a best‑effort reverse mapping.


511-639: Streaming state object and pool correctly enforce single terminal emission

GeminiResponsesStreamState and its pool helpers:

  • Track lifecycle flags (HasEmittedCreated, HasEmittedInProgress, HasEmittedCompleted) plus per‑output indices, item/tool IDs, argument buffers.
  • Initialize and reset all maps/fields in acquireGeminiResponsesStreamState and flush.
  • Use HasEmittedCompleted in closeGeminiOpenItems to guarantee response.completed (and associated output_item.dones) is only emitted once, whether via FinishReason on the last chunk or via finalization.

This resolves the earlier double‑terminal‑event issue while keeping state reusable.
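The single-terminal-emission guard boils down to a check-and-set flag, sketched here with a cut-down state struct (the real GeminiResponsesStreamState carries many more fields):

```go
package main

import "fmt"

// streamState keeps only the terminal-event guard from the real state object.
type streamState struct {
	HasEmittedCompleted bool
}

// emitCompleted reports whether a response.completed event should be sent;
// it flips the flag so that the FinishReason path and stream finalization
// cannot each emit a terminal event.
func (s *streamState) emitCompleted() bool {
	if s.HasEmittedCompleted {
		return false
	}
	s.HasEmittedCompleted = true
	return true
}

func main() {
	s := &streamState{}
	fmt.Println(s.emitCompleted()) // true: first terminal event is emitted
	fmt.Println(s.emitCompleted()) // false: finalization becomes a no-op
}
```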


641-722: Responses stream conversion cleanly orchestrates lifecycle and closing logic

ToBifrostResponsesStream:

  • Issues response.created and response.in_progress once per stream (with synthetic message ID and model/version).
  • Routes each Gemini Part through specialized handlers (processGeminiTextPart, processGeminiThoughtPart, processGeminiFunctionCallPart, inline/file handlers), passing the running sequenceNumber.
  • On candidate.FinishReason != "", calls closeGeminiOpenItems with usage metadata, which in turn emits any remaining output_item.dones and a single response.completed.

Sequence numbers and output indices are computed in a way that remains monotonic and consistent with the Responses streaming schema. No correctness issues here.


1501-1616: Function call / response mapping preserves IDs and thought signatures

In convertGeminiCandidatesToResponsesOutput:

  • Function calls are converted with:

    • Arguments marshalled to JSON.
    • CallID preferring part.FunctionCall.ID and falling back to Name if ID is blank.
    • Name always set from FunctionCall.Name.
    • Optional ThoughtSignature emitted as a sibling reasoning message with EncryptedContent.
  • Function responses use extractFunctionResponseOutput to build a textual payload and set ResponsesToolMessage.CallID from part.FunctionResponse.ID, with Name filled from FunctionResponse.Name when present.

This combination keeps Gemini‑supplied call IDs intact and carries thought signatures through, which should avoid the lossy behavior you previously had.


1928-2069: Responses→Gemini content conversion handles system, messages, and tools comprehensively

convertResponsesMessagesToGeminiContents:

  • Separates system messages into a dedicated systemInstruction Content.
  • Maps assistant/user roles into Gemini Content.Role ("model" vs "user").
  • Converts text/blocks via convertContentBlockToGeminiPart.
  • Translates function calls into FunctionCall parts (including arguments and ID), and function call outputs into FunctionResponse parts with Name/ID and "output" payload.
  • Preserves thought signatures by looking ahead to a following reasoning message with EncryptedContent.

This gives you a full round‑trip between Bifrost Responses and Gemini native contents with tool‑calling and reasoning intact.


2071-2188: ContentBlock→Gemini Part conversion covers all supported modalities

convertContentBlockToGeminiPart correctly:

  • Maps text, reasoning, and refusal blocks into Text/Thought parts (with refusals downgraded to plain text for Gemini).
  • Converts base64 images and audio into InlineData with appropriate MIME types, handling media‑type prefixes and defaults.
  • Converts image/file URLs into FileData with sanitized URIs.
  • Handles file blocks with either URL or inline data as FileData/InlineData.

Error cases are localized to malformed URLs/base64 and bubble up as fmt.Errorf, which is appropriate given how callers treat them. Overall this converter looks solid.

@TejasGhatte TejasGhatte changed the base branch from main to graphite-base/1018 December 8, 2025 07:16
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from 461269a to 41ccb0f Compare December 8, 2025 07:16
@TejasGhatte TejasGhatte changed the base branch from graphite-base/1018 to 12-04-feat_responses_reasoning_fixes December 8, 2025 07:16


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
core/providers/gemini/responses.go (1)

114-123: Guard against nil Params before accessing ExtraParams.

Line 114 accesses bifrostReq.Params.ExtraParams without first checking if bifrostReq.Params is nil. Since Params is only set conditionally at line 87, it can be nil here, causing a panic.

Apply this diff:

-	if bifrostReq.Params.ExtraParams != nil {
+	if bifrostReq.Params != nil && bifrostReq.Params.ExtraParams != nil {
 		if safetySettings, ok := schemas.SafeExtractFromMap(bifrostReq.Params.ExtraParams, "safety_settings"); ok {
 			if settings, ok := safetySettings.([]SafetySetting); ok {
 				geminiReq.SafetySettings = settings
 			}
 		}
 		if cachedContent, ok := schemas.SafeExtractString(bifrostReq.Params.ExtraParams["cached_content"]); ok {
 			geminiReq.CachedContent = cachedContent
 		}
 	}
🧹 Nitpick comments (4)
core/providers/gemini/speech.go (1)

110-127: Consider returning an error when Input is nil or empty.

The nil check prevents a panic, which is good. However, when bifrostReq.Input is nil or Input.Input is empty, the function returns a GeminiGenerationRequest without any Contents, which may result in an invalid API request to Gemini.

Consider returning an error for invalid input to provide clearer feedback:

 // ToGeminiSpeechRequest converts a BifrostSpeechRequest to a GeminiGenerationRequest
 func ToGeminiSpeechRequest(bifrostReq *schemas.BifrostSpeechRequest) (*GeminiGenerationRequest, error) {
 	if bifrostReq == nil {
 		return nil, fmt.Errorf("bifrostReq is nil")
 	}
+	if bifrostReq.Input == nil || bifrostReq.Input.Input == "" {
+		return nil, fmt.Errorf("speech input is required")
+	}
 	// ... rest of validation ...
 	// Convert speech input to Gemini format
-	if bifrostReq.Input != nil && bifrostReq.Input.Input != "" {
+	geminiReq.Contents = []Content{
+		{
+			Parts: []*Part{
+				{
+					Text: bifrostReq.Input.Input,
+				},
+			},
+		},
+	}
core/providers/vertex/vertex.go (2)

951-953: Duplicate raw response assignment in Anthropic Responses path.

Lines 944-946 already set RawResponse when ShouldSendBackRawResponse returns true. Lines 951-953 duplicate this check and assignment, which is unnecessary.

 		if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
 			response.ExtraFields.RawResponse = rawResponse
 		}
 		if request.Model != deployment {
 			response.ExtraFields.ModelDeployment = deployment
 		}
-
-		if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
-			response.ExtraFields.RawResponse = rawResponse
-		}
 
 		return response, nil

1030-1031: Remove debug comment.

The comment "print response body" appears to be a leftover debug note that should be removed.

-		// print response body
 		geminiResponse := &gemini.GenerateContentResponse{}
core/providers/gemini/gemini.go (1)

183-186: Duplicate raw response assignment in listModelsByKey.

Lines 179-181 already handle setting RawResponse. Lines 183-186 duplicate this check unnecessarily.

 	// Set raw response if enabled
 	if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
 		response.ExtraFields.RawResponse = rawResponse
 	}
-
-	if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
-		response.ExtraFields.RawResponse = rawResponse
-	}
 
 	return response, nil
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 461269a and 41ccb0f.

📒 Files selected for processing (12)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (1 hunks)
  • core/providers/gemini/responses.go (9 hunks)
  • core/providers/gemini/speech.go (1 hunks)
  • core/providers/gemini/utils.go (7 hunks)
  • core/providers/vertex/vertex.go (8 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/tests/test_google.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/schemas/responses.go
  • core/providers/gemini/gemini_test.go
  • transports/bifrost-http/integrations/genai.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/vertex/vertex_test.go
  • core/providers/gemini/speech.go
  • core/schemas/utils.go
  • core/providers/vertex/vertex.go
  • tests/integrations/tests/test_google.py
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/responses.go
🧬 Code graph analysis (1)
core/providers/gemini/gemini.go (3)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/providers/utils/utils.go (2)
  • ShouldSendBackRawResponse (533-538)
  • GetPathFromContext (212-217)
core/providers/gemini/responses.go (1)
  • ToGeminiResponsesRequest (76-126)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (18)
core/providers/vertex/vertex_test.go (1)

26-27: Model naming updated to align with native Gemini detection.

The removal of the google/ prefix aligns with the new schemas.IsGeminiModel() function which uses strings.Contains(model, "gemini"). The Vertex provider's getModelDeployment method handles deployment resolution internally, so this change is appropriate.

core/schemas/utils.go (1)

1052-1054: LGTM - Consistent model detection pattern.

The IsGeminiModel function follows the same pattern as IsAnthropicModel and IsMistralModel, providing a simple and consistent way to detect Gemini models by substring matching.
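A sketch matching the described pattern, assuming the same substring check used by the sibling helpers:

```go
package main

import (
	"fmt"
	"strings"
)

// IsGeminiModel reports whether a model identifier refers to a Gemini model,
// using simple substring matching as described in the review.
func IsGeminiModel(model string) bool {
	return strings.Contains(model, "gemini")
}

func main() {
	fmt.Println(IsGeminiModel("gemini-1.5-pro")) // true
	fmt.Println(IsGeminiModel("claude-3-opus"))  // false
}
```

Substring matching keeps the check cheap and prefix-agnostic (it works for both `gemini-1.5-pro` and provider-prefixed names), at the cost of matching any identifier containing "gemini".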

core/providers/vertex/vertex.go (1)

1439-1455: LGTM - Vertex-specific field stripping for Gemini compatibility.

The stripVertexGeminiUnsupportedFields helper correctly removes the ID field from FunctionCall and FunctionResponse parts, which Vertex AI's Gemini API doesn't support. This is properly called in all Gemini paths (ChatCompletion, ChatCompletionStream, Responses, ResponsesStream).

core/providers/gemini/chat.go (3)

11-23: Signature simplification approved.

The removal of the responseModalities parameter from ToGeminiChatCompletionRequest simplifies the API. The empty slice passed to convertParamsToGenerationConfig maintains backward compatibility while preparing for future modality handling.


106-117: Tool call ID handling now consistent with streaming path.

The non-streaming path now properly checks part.FunctionCall.ID first before falling back to the function name, matching the streaming path logic. This addresses the previous review concern.


165-312: Well-structured streaming response conversion.

The ToBifrostChatCompletionStream method cleanly handles:

  • Text and thought/reasoning content separation
  • Tool call extraction with ID preservation
  • ThoughtSignature preservation for Gemini 3 Pro
  • Finish reason conversion
  • Usage metadata on final chunks

The early return for empty delta content (lines 283-287) prevents sending unnecessary empty chunks.

core/providers/gemini/utils.go (3)

194-218: Comprehensive finish reason mapping.

The geminiFinishReasonToBifrost map covers all Gemini finish reasons with sensible mappings:

  • Safety-related reasons → content_filter
  • Token limit → length
  • Tool-related issues → tool_calls

One consideration: FinishReasonMalformedFunctionCall maps to tool_calls, which is semantically questionable since it indicates an error rather than successful tool use. Consider mapping it to an error-specific reason instead; the current mapping may cause confusion when debugging malformed function calls. This is a minor edge case.
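The mapping described above can be sketched as a lookup table with a fallback. The reason strings below follow the public Gemini API enum names; the table is an illustrative subset, and the "stop" fallback is an assumption rather than a confirmed detail of the repository's converter:

```go
package main

import "fmt"

// Illustrative subset of the Gemini-to-Bifrost finish-reason table the
// review summarizes.
var geminiFinishReasonToBifrost = map[string]string{
	"STOP":                    "stop",
	"MAX_TOKENS":              "length",
	"SAFETY":                  "content_filter",
	"RECITATION":              "content_filter",
	"MALFORMED_FUNCTION_CALL": "tool_calls", // debatable mapping, per the review
}

// convertFinishReason returns the mapped reason, falling back to "stop"
// for values not in the table (fallback choice is an assumption).
func convertFinishReason(reason string) string {
	if mapped, ok := geminiFinishReasonToBifrost[reason]; ok {
		return mapped
	}
	return "stop"
}

func main() {
	fmt.Println(convertFinishReason("MAX_TOKENS"))     // length
	fmt.Println(convertFinishReason("UNKNOWN_REASON")) // stop
}
```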


262-293: Good separation of usage metadata conversion.

Having separate functions for chat (convertGeminiUsageMetadataToChatUsage) and responses (convertGeminiUsageMetadataToResponsesUsage) usage metadata is appropriate given the different output structures required by each API.


932-954: Robust function response output extraction.

The extractFunctionResponseOutput function properly:

  1. Handles nil cases
  2. Tries to extract the output key first (common convention)
  3. Falls back to marshaling the entire response
  4. Returns empty string on failure (safe fallback)
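The four-step strategy above can be sketched as a small function. The name and signature here are illustrative; the actual helper in utils.go operates on the provider's own response type:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractOutput sketches the two-tier extraction the review describes for
// extractFunctionResponseOutput: prefer a string under the "output" key,
// otherwise marshal the whole response map, and return "" on any failure.
func extractOutput(response map[string]any) string {
	if response == nil {
		return "" // safe fallback for the nil case
	}
	if out, ok := response["output"].(string); ok {
		return out // common convention: a plain "output" string
	}
	raw, err := json.Marshal(response)
	if err != nil {
		return "" // marshal failure also falls back to empty
	}
	return string(raw)
}

func main() {
	fmt.Println(extractOutput(map[string]any{"output": "72F and sunny"})) // 72F and sunny
	fmt.Println(extractOutput(map[string]any{"temp": 72}))               // {"temp":72}
	fmt.Println(extractOutput(nil) == "")                                // true
}
```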
core/providers/gemini/gemini.go (3)

304-496: Well-implemented streaming handler for chat completions.

The HandleGeminiChatCompletionStream function properly handles:

  • SSE parsing with appropriate buffer sizing (10MB max)
  • Error conditions (context cancellation, timeouts, HTTP errors, API errors)
  • Chunk indexing and latency tracking
  • Response ID and model propagation
  • Post-response conversion hook
  • Final chunk detection via finish reason
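The buffer sizing called out above matters because bufio.Scanner's default token limit is 64KB, which a single SSE event carrying inline audio or image data can exceed. A minimal sketch of the pattern, assuming the same 1MB initial / 10MB maximum figures the review reports (the SSE parsing itself is simplified):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// scanSSE collects the payload of every "data: " line from an SSE body.
// The enlarged buffer prevents bufio.ErrTooLong on oversized chunks.
func scanSSE(body string) []string {
	scanner := bufio.NewScanner(strings.NewReader(body))
	scanner.Buffer(make([]byte, 1024*1024), 10*1024*1024) // 1MB initial, 10MB max
	var payloads []string
	for scanner.Scan() {
		if data, ok := strings.CutPrefix(scanner.Text(), "data: "); ok {
			payloads = append(payloads, data)
		}
	}
	return payloads
}

func main() {
	body := "data: {\"text\":\"hel\"}\n\ndata: {\"text\":\"lo\"}\n"
	fmt.Println(len(scanSSE(body))) // 2
}
```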

583-820: Responses stream handler with proper lifecycle management.

The HandleGeminiResponsesStream implementation correctly manages:

  • Stream state via acquireGeminiResponsesStreamState/releaseGeminiResponsesStreamState (object pooling)
  • Usage metadata tracking across chunks
  • Sequence numbering for responses events
  • Finalization via FinalizeGeminiResponsesStream
  • Multiple responses per SSE event handling (lines 728-774)

The pattern of emitting final events when the scanner completes without error (lines 781-816) ensures proper stream completion even if no explicit "completed" event is received.


501-504: Correct provider in operation check.

The CheckOperationAllowed call now correctly uses schemas.Gemini instead of the previously incorrect schemas.Anthropic. This addresses the past review concern.

core/providers/gemini/responses.go (6)

15-74: LGTM! Params initialization properly addresses previous nil pointer concerns.

The defensive initialization at lines 28-33 ensures params is never nil before field assignments, correctly resolving the past review feedback.


228-253: LGTM! Function call ID preservation correctly implemented.

Lines 238-239 properly preserve the CallID as FunctionCall.ID, keeping IDs distinct through round-trips as intended by the previous fix.


511-639: LGTM! Stream state management with pooling is correctly implemented.

The addition of HasEmittedCompleted (line 516) and proper initialization/reset in pool lifecycle (lines 589, 634) correctly address the double-emission concern from previous reviews.


1166-1222: LGTM! Double-emission prevention correctly implemented.

The guard at lines 1168-1170 and flag setting at line 1214 ensure response.completed and output_item.done events are emitted exactly once, resolving the previous issue.
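The guard-flag idiom can be sketched in isolation. The state struct and event names below mirror what the review describes, but the finalize function is a hypothetical reduction of FinalizeGeminiResponsesStream, not its actual body:

```go
package main

import "fmt"

// streamState carries the double-emission guard the review describes.
type streamState struct {
	HasEmittedCompleted bool
}

// finalize emits terminal events at most once: a second call is a no-op
// because the flag is set after the first emission.
func finalize(state *streamState, emit func(event string)) {
	if state.HasEmittedCompleted {
		return // terminal events were already sent
	}
	emit("response.output_item.done")
	emit("response.completed")
	state.HasEmittedCompleted = true
}

func main() {
	var events []string
	state := &streamState{}
	emit := func(e string) { events = append(events, e) }
	finalize(state, emit)
	finalize(state, emit) // no-op: guard prevents duplicate terminal events
	fmt.Println(len(events)) // 2
}
```

With object pooling, the flag must also be reset when the state is acquired or released, which is what the review verifies at lines 589 and 634.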


1558-1574: LGTM! Function call ID handling correctly preserves Gemini IDs.

Lines 1558-1567 properly use part.FunctionCall.ID for CallID (with Name as fallback) and preserve the function name separately, maintaining ID distinctiveness across conversions.


2083-2097: LGTM! New reasoning and refusal content types properly handled.

Lines 2083-2089 correctly map reasoning content to Gemini's Thought flag, and lines 2091-2097 appropriately handle refusals as text content for Gemini compatibility.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
core/providers/vertex/vertex.go (1)

1030-1031: Remove debug comment.

The comment "print response body" appears to be leftover from debugging.

-		// print response body
 		geminiResponse := &gemini.GenerateContentResponse{}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 461269a and 41ccb0f.

📒 Files selected for processing (12)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (1 hunks)
  • core/providers/gemini/responses.go (9 hunks)
  • core/providers/gemini/speech.go (1 hunks)
  • core/providers/gemini/utils.go (7 hunks)
  • core/providers/vertex/vertex.go (8 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/tests/test_google.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/gemini/speech.go
  • core/schemas/responses.go
  • core/schemas/utils.go
  • core/providers/gemini/gemini_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • transports/bifrost-http/integrations/genai.go
  • tests/integrations/tests/test_google.py
  • core/providers/gemini/chat.go
  • core/providers/vertex/vertex.go
  • core/providers/gemini/utils.go
  • core/providers/vertex/vertex_test.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/responses.go
🧬 Code graph analysis (5)
transports/bifrost-http/integrations/genai.go (6)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
transports/bifrost-http/handlers/inference.go (1)
  • ResponsesRequest (257-261)
transports/bifrost-http/integrations/router.go (3)
  • EmbeddingResponseConverter (102-102)
  • ResponsesResponseConverter (98-98)
  • ResponsesStreamResponseConverter (123-123)
core/providers/gemini/embedding.go (1)
  • ToGeminiEmbeddingResponse (57-100)
core/schemas/responses.go (2)
  • BifrostResponsesResponse (45-84)
  • BifrostResponsesStreamResponse (1430-1469)
core/providers/gemini/responses.go (2)
  • ToGeminiResponsesResponse (153-339)
  • ToGeminiResponsesStreamResponse (341-509)
tests/integrations/tests/test_google.py (1)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
core/providers/gemini/chat.go (3)
core/schemas/chatcompletions.go (9)
  • BifrostChatRequest (12-19)
  • ChatAssistantMessageToolCall (686-692)
  • ChatContentBlock (590-597)
  • ChatContentBlockTypeText (582-582)
  • ChatAssistantMessageToolCallFunction (695-698)
  • ChatToolChoiceTypeFunction (373-373)
  • ChatMessage (469-478)
  • ChatMessageContent (524-527)
  • ChatAssistantMessage (626-632)
core/providers/gemini/types.go (8)
  • GeminiGenerationRequest (55-73)
  • GenerationConfig (631-697)
  • Content (922-930)
  • Type (778-778)
  • Role (13-13)
  • FinishReason (22-22)
  • FinishReasonUnspecified (26-26)
  • RoleModel (17-17)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (213-218)
core/providers/gemini/gemini.go (7)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (11-62)
core/schemas/bifrost.go (12)
  • RequestType (83-83)
  • ChatCompletionRequest (89-89)
  • BifrostStream (321-328)
  • BifrostError (356-365)
  • Gemini (48-48)
  • ChatCompletionStreamRequest (90-90)
  • ModelProvider (32-32)
  • ErrorField (374-381)
  • BifrostErrorExtraFields (423-427)
  • BifrostResponseExtraFields (286-296)
  • ResponsesRequest (91-91)
  • ResponsesStreamRequest (92-92)
core/schemas/provider.go (2)
  • Provider (282-309)
  • PostHookRunner (279-279)
core/providers/utils/utils.go (2)
  • ShouldSendBackRawResponse (533-538)
  • ReleaseStreamingResponse (818-825)
core/utils.go (1)
  • Ptr (51-53)
core/schemas/responses.go (3)
  • BifrostResponsesRequest (32-39)
  • BifrostResponsesResponse (45-84)
  • BifrostResponsesStreamResponse (1430-1469)
core/providers/gemini/responses.go (1)
  • ToGeminiResponsesRequest (76-126)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (11)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • Part (936-960)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • FunctionCall (1091-1101)
  • FunctionResponse (1104-1126)
  • FileData (1080-1088)
  • Blob (963-971)
  • Tool (600-627)
core/schemas/responses.go (13)
  • BifrostResponsesRequest (32-39)
  • BifrostResponsesResponse (45-84)
  • ResponsesToolMessage (465-485)
  • ResponsesReasoning (734-737)
  • ResponsesMessage (317-330)
  • ResponsesMessageContent (342-347)
  • ResponsesMessageContentBlock (402-414)
  • ResponsesToolMessageOutputStruct (534-538)
  • ResponsesInputMessageContentBlockImage (416-419)
  • ResponsesInputMessageContentBlockAudio (427-430)
  • ResponsesInputMessageContentBlockFile (421-425)
  • ResponsesTool (1033-1048)
  • ResponsesOutputMessageContentRefusal (461-463)
core/schemas/utils.go (4)
  • ParseModelString (23-36)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
  • Ptr (16-18)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (32)
core/providers/vertex/vertex_test.go (1)

26-27: LGTM! Model identifiers correctly updated for native Gemini API.

The removal of the google/ prefix aligns with the native Gemini API format. This change is consistent with the broader refactoring to use native Gemini endpoints instead of OpenAI-compatible ones.

tests/integrations/tests/test_google.py (2)

507-516: Test correctly updated to reflect Gemini's error handling behavior.

The test now validates that Gemini handles invalid roles gracefully by returning a successful response rather than raising an exception. This aligns with the native Gemini API behavior.


518-566: Well-structured streaming test with proper chunk handling.

The streaming test correctly:

  • Uses generate_content_stream for SSE streaming
  • Iterates through chunks with defensive hasattr checks
  • Accumulates content from both the candidates structure and direct text attribute
  • Validates both chunk count and content quality
core/providers/vertex/vertex.go (5)

310-326: LGTM! Gemini request conversion properly integrated.

The Gemini branch correctly:

  • Converts Bifrost requests to native Gemini format
  • Strips Vertex-unsupported fields (ID in function_call/function_response)
  • Uses marshal/unmarshal for Vertex API compatibility

497-522: LGTM! Gemini response handling is consistent with other branches.

The response handling correctly:

  • Parses the native Gemini response format
  • Sets all ExtraFields (RequestType, Provider, ModelRequested, ModelDeployment, Latency)
  • Handles raw request/response when configured

660-733: LGTM! Gemini streaming path properly implemented.

The streaming implementation correctly:

  • Constructs URLs with alt=sse parameter for SSE streaming
  • Supports both API key (query string) and OAuth2 (header) authentication
  • Delegates to shared Gemini streaming handler

1181-1273: LGTM! ResponsesStream Gemini branch properly handles streaming.

The implementation correctly:

  • Calls stripVertexGeminiUnsupportedFields to sanitize requests (line 1203)
  • Uses consistent URL construction and auth handling
  • Properly sets up the post-response converter for model metadata

1439-1455: LGTM! Helper function correctly strips Vertex-unsupported fields.

The stripVertexGeminiUnsupportedFields function cleanly removes ID fields from function_call and function_response parts that Vertex AI's Gemini API doesn't support.

transports/bifrost-http/integrations/genai.go (1)

49-75: LGTM! Route configuration correctly updated to use Responses API.

The migration from Chat-based to Responses-based types aligns with the PR's objective to use native Gemini format:

  • Request conversion now uses ToBifrostResponsesRequest()
  • Response converters properly use ToGeminiResponsesResponse and ToGeminiResponsesStreamResponse
  • Type signatures match the new BifrostResponsesResponse and BifrostResponsesStreamResponse schemas
core/providers/gemini/chat.go (4)

10-23: LGTM! Simplified signature with reasonable defaults.

The function signature simplification removes the responseModalities parameter and defaults to an empty slice. This is appropriate since chat completion requests typically don't need explicit modality configuration—speech/transcription requests are handled through separate code paths.


77-157: LGTM! Non-streaming response conversion properly handles content and tool calls.

The implementation correctly:

  • Collects content blocks for text content
  • Handles tool calls with proper ID fallback (prefers FunctionCall.ID, falls back to name)
  • Extracts function response output to text blocks
  • Builds a single assistant message with both content and tool calls

165-312: Well-structured streaming conversion with proper thought signature handling.

The ToBifrostChatCompletionStream method correctly:

  • Returns a 3-tuple with response, error, and isLastChunk flag
  • Separates thought content from regular text content
  • Preserves thought_signature in ExtraContent for Gemini 3 Pro compatibility
  • Only adds usage metadata on the final chunk
  • Skips empty deltas to avoid noise in the stream (lines 283-287)

119-130: No action needed. extractFunctionResponseOutput is properly defined in utils.go (line 936) and handles multiple response formats: it extracts string values from the "output" field, falls back to JSON marshaling for map responses, and includes nil checks for safety.

core/providers/gemini/utils.go (6)

11-70: LGTM! Conversion logic correctly updated for Responses API.

The refactored method properly maps Gemini generation config to ResponsesParameters structure. The mapping of Logprobs to TopLogProbs and the ExtraParams handling align with the new Responses-oriented schema.


73-95: LGTM! Standalone function improves modularity.

Converting from a method to a standalone function is a good refactor that makes the code more reusable and easier to test.


194-218: LGTM! Comprehensive finish reason mapping.

The finish reason mapping provides good coverage of Gemini-specific reasons and includes a sensible fallback for unmapped values.


233-293: LGTM! Thorough usage metadata conversion for both APIs.

Both converters properly handle nil cases, cached tokens, reasoning tokens, and modality-specific details. The separation between Chat and Responses usage structures is appropriate.


839-890: LGTM! Response format properly adapted for Responses API.

The function correctly returns the new ResponsesTextConfig structure with appropriate nesting for JSON schema configuration. The fallback to json_object mode when no schema is provided is sensible.


933-954: LGTM! Clean utility for function response extraction.

The function provides a sensible two-tier extraction strategy: first attempting to extract the "output" field, then falling back to marshaling the entire response. The nil checks are appropriate.

core/providers/gemini/gemini.go (5)

220-255: LGTM! ChatCompletion properly implemented.

The implementation correctly uses the native Gemini API format, populates response metadata, and handles raw response inclusion conditionally. The error handling and field population are appropriate.


257-302: LGTM! ChatCompletionStream correctly configured.

The streaming setup properly configures SSE headers, uses the correct Gemini streaming endpoint :streamGenerateContent?alt=sse, and passes all necessary parameters to the shared streaming handler.


304-496: LGTM! Comprehensive streaming implementation with robust error handling.

The streaming handler properly:

  • Sets up SSE streaming with appropriate buffers
  • Handles cancellation, timeouts, and HTTP errors
  • Processes chunks with API error detection
  • Tracks response metadata (ID, model version)
  • Finalizes streams with proper latency calculation
  • Handles scanner errors gracefully

The buffer sizes (1MB initial, 10MB max) are appropriate for handling large audio/image chunks.


498-537: LGTM! Responses method correctly implemented.

The implementation properly uses native Gemini format, converts responses appropriately, and populates all required metadata fields. The permission check now correctly uses schemas.Gemini.


539-820: LGTM! Responses streaming properly implements lifecycle with state management.

The streaming implementation correctly:

  • Uses pooled stream state for efficient lifecycle tracking
  • Processes chunks with state-aware conversion
  • Handles multiple responses per event
  • Detects completion via response type
  • Finalizes streams with proper cleanup
  • Releases resources in defer statements

The past review concern about double-emitting terminal events has been addressed with the HasEmittedCompleted flag in the stream state (lines 659-660, 783-816).

core/providers/gemini/responses.go (8)

15-74: LGTM! Nil safety properly addressed.

The conversion correctly initializes Params before any field access (lines 28-33), addressing the past review concern about potential nil pointer dereferences. All subsequent Params field assignments are now safe.


76-126: LGTM! ExtraParams nil check properly added.

The conversion now safely checks bifrostReq.Params.ExtraParams != nil at line 114 before accessing it, addressing the past review concern about potential nil pointer dereference.


153-339: LGTM! Comprehensive bidirectional conversion preserves IDs.

The conversion properly:

  • Preserves function call IDs from CallID field (lines 238-240)
  • Handles function responses with correct ID mapping (lines 274-276)
  • Preserves thought signatures for Gemini 3 Pro compatibility (lines 247-249)
  • Groups messages by role appropriately
  • Converts usage metadata correctly

This addresses past review concerns about ID preservation.


511-639: LGTM! Stream state management with proper completion tracking.

The stream state properly:

  • Uses pooling for efficient resource management
  • Includes HasEmittedCompleted flag to prevent double-emission (line 516)
  • Clears all maps and resets flags in acquire/flush methods
  • Initializes all fields to safe defaults

This addresses the past review concern about duplicate terminal events.


641-711: LGTM! Streaming conversion with proper lifecycle management.

The conversion correctly:

  • Emits lifecycle events only once using state flags
  • Processes all candidate parts
  • Closes open items when finish reason is present
  • Relies on closeGeminiOpenItems for completion guard

1166-1217: LGTM! Completion guard prevents duplicate terminal events.

The function properly:

  • Guards against re-execution with HasEmittedCompleted check (lines 1168-1170)
  • Closes all open items (text and tool calls)
  • Emits response.completed exactly once
  • Sets the completion flag (line 1214)

This fully addresses the past review concern about double-emitting terminal events.


1503-1729: LGTM! Function call conversion preserves IDs correctly.

The conversion properly:

  • Extracts CallID with preference for ID field, fallback to Name (lines 1558-1561)
  • Uses distinct variables for CallID and Name (lines 1565-1566)
  • Extracts function response output using utility function (line 1593)
  • Preserves thought signatures for compatibility (lines 1576-1589)

This addresses the past review concern about function call ID preservation and ensures round-trip consistency.


1929-2070: LGTM! Bidirectional conversion maintains ID consistency.

The conversion properly:

  • Maps roles correctly between Responses and Gemini formats
  • Preserves function call IDs from CallID field (lines 2019-2021)
  • Handles thought signature preservation via lookahead (lines 2023-2031)
  • Extracts function response names with proper fallback (lines 2044-2049)
  • Maintains consistency with the reverse conversion

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-04-feat_responses_reasoning_fixes to graphite-base/1018 December 8, 2025 08:24
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from cd07ffe to cc1bd19 Compare December 11, 2025 21:47
@TejasGhatte TejasGhatte changed the base branch from graphite-base/1018 to 12-04-feat_responses_reasoning_fixes December 11, 2025 21:47
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from cc1bd19 to fd14a0e Compare December 12, 2025 06:37
@TejasGhatte TejasGhatte force-pushed the 12-04-feat_responses_reasoning_fixes branch from 63a0c31 to ac6bbd4 Compare December 12, 2025 06:37

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/schemas/responses.go (1)

133-142: Avoid ambiguous JSON shape: enforce “Schema vs inline fields” exclusivity; also reconsider *any.

Right now ResponsesTextConfigFormatJSONSchema can serialize both schema (new Schema *any) and properties/type/required (existing), which can create conflicting/invalid JSON for downstream providers. Consider:

  • Making Schema a map[string]any or sonic.RawMessage/json.RawMessage-like type (instead of *any), and
  • Adding a MarshalJSON (or a validation helper) that errors when Schema != nil and any of Type/Properties/Required/AdditionalProperties are also set.
core/providers/gemini/speech.go (1)

103-121: Prefer returning an error for empty speech input (instead of sending an empty Gemini request).

As-is, ToGeminiSpeechRequest can return a request with ResponseModalities=[AUDIO] but no Contents, which is likely a provider error and harder to diagnose than a local validation error.

♻️ Duplicate comments (4)
tests/integrations/tests/test_google.py (2)

512-521: Test intent mismatch: “error handling” test asserts success.

Either rename the test (or update its docstring) to reflect “graceful handling of alternate role content”, or switch the assertions to validate an error path; as written it reads as contradictory.


889-966: Fix unused test_config args + verify ThinkingConfig API for google-genai==1.0.0.

  • Rename to _test_config (or drop the arg) to satisfy Ruff ARG002.
  • Please verify types.ThinkingConfig(include_thoughts=..., thinking_budget=...) and usage_metadata.thoughts_token_count are valid/returned under google-genai 1.0.0 and your chosen get_model("google","chat") model; otherwise these tests will be flaky.
-    def test_26_extended_thinking(self, google_client, test_config):
+    def test_26_extended_thinking(self, google_client, _test_config):
...
-    def test_27_extended_thinking_streaming(self, google_client, test_config):
+    def test_27_extended_thinking_streaming(self, google_client, _test_config):

Also applies to: 966-1054

core/providers/vertex/vertex.go (1)

500-525: Gemini branches bypass context-aware raw-request/response toggles.

Both Gemini non-streaming branches call HandleProviderResponse(..., provider.sendBackRawRequest, provider.sendBackRawResponse) even though later they check providerUtils.ShouldSendBackRawRequest/Response(...). This prevents per-request ctx overrides and can do unnecessary work.

- rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(resp.Body(), &geminiResponse, jsonBody, provider.sendBackRawRequest, provider.sendBackRawResponse)
+ rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(
+   resp.Body(), &geminiResponse, jsonBody,
+   providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+   providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+ )

Also applies to: 1012-1033

core/providers/gemini/responses.go (1)

428-433: Fix field-path check for encrypted reasoning content (likely wrong / possibly won’t compile).

The condition checks bifrostResp.Item.EncryptedContent but then dereferences bifrostResp.Item.ResponsesReasoning.EncryptedContent. These need to match.

-    if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
+    if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
         candidate.Content.Parts = append(candidate.Content.Parts, &Part{
             ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
         })
     }
🧹 Nitpick comments (6)
tests/integrations/tests/utils/common.py (1)

225-246: Make reasoning prompts less brittle (avoid demanding “step-by-step” if not strictly needed).

If the tests only need “reasoning tokens present + answer quality”, consider rewording to “show your calculations / explain briefly” to reduce variability across Gemini/Vertex routing and future model changes.

core/providers/gemini/speech.go (1)

142-155: Guard ctx.Value(BifrostContextKeyResponseFormat).(string) to avoid panics.

Even though the change is formatting-only here, this block still assumes the context key exists and is a string; consider a safe default to "wav" when unset.

core/providers/vertex/vertex.go (1)

283-286: Deduplicate google/ prefix stripping (one helper).

This same normalization appears in 4 places; a tiny helper (e.g., normalizeVertexGeminiDeployment(string) string) would reduce drift.

Also applies to: 578-582, 834-837, 1065-1068

core/providers/gemini/gemini.go (1)

216-251: Consider aligning ChatCompletion request-body conversion with the explicit nil-check pattern.

ChatCompletionStream, Responses, and ResponsesStream all guard against reqBody == nil, but ChatCompletion does not. It’s probably safe (the converter only returns nil when the request is nil), but adding the same guard would reduce future footguns if the converter gains error-returning paths.

core/providers/gemini/utils.go (1)

11-83: Utilities look solid; reasoning/usage mappings are coherent.

The ThinkingConfig→Reasoning mapping, finish-reason normalization, and usage conversions (including cached/reasoning/audio tokens) look consistent with the new Gemini native + Responses surfaces.

Also applies to: 208-233, 247-311, 873-924, 967-988

core/providers/gemini/responses.go (1)

1262-1308: output_text.done emits an empty string—confirm this matches your stream contract.

closeGeminiTextItem sends output_text.done with Text: "", but you never accumulate the full text in state. If consumers expect the “done” event to include the finalized text (common in evented protocols), this will be surprising.

Options:

  • track an accumulated text buffer in GeminiResponsesStreamState and emit it on output_text.done, or
  • explicitly document that output_text.done is a pure lifecycle marker and must be reconstructed from deltas.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e38b506 and fd14a0e.

📒 Files selected for processing (17)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/embedding.go (0 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (2 hunks)
  • core/providers/gemini/responses.go (14 hunks)
  • core/providers/gemini/speech.go (2 hunks)
  • core/providers/gemini/transcription.go (0 hunks)
  • core/providers/gemini/types.go (1 hunks)
  • core/providers/gemini/utils.go (8 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • core/providers/vertex/vertex_test.go (2 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (1 hunks)
  • tests/integrations/tests/test_google.py (4 hunks)
  • tests/integrations/tests/utils/common.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
  • core/providers/gemini/embedding.go
  • core/providers/gemini/transcription.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/gemini/types.go
  • tests/integrations/config.yml
  • core/schemas/utils.go
  • core/providers/vertex/vertex_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/speech.go
  • core/providers/gemini/gemini_test.go
  • core/providers/vertex/vertex.go
  • core/schemas/responses.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/gemini.go
  • tests/integrations/tests/utils/common.py
  • tests/integrations/tests/test_google.py
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/speech.go
  • core/providers/gemini/gemini_test.go
  • core/providers/vertex/vertex.go
  • core/schemas/responses.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
🧬 Code graph analysis (5)
core/providers/vertex/vertex.go (5)
core/schemas/utils.go (1)
  • IsGeminiModel (1052-1054)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (12-63)
core/providers/gemini/types.go (1)
  • GeminiGenerationRequest (55-73)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (359-445)
  • CheckContextAndGetRequestBody (257-275)
core/providers/gemini/gemini.go (2)
  • HandleGeminiChatCompletionStream (304-495)
  • HandleGeminiResponsesStream (597-834)
core/schemas/responses.go (1)
core/providers/gemini/types.go (1)
  • Schema (724-775)
core/providers/gemini/gemini.go (6)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (12-63)
core/schemas/bifrost.go (9)
  • RequestType (83-83)
  • ChatCompletionRequest (89-89)
  • BifrostStream (323-330)
  • BifrostError (358-367)
  • Gemini (48-48)
  • ModelProvider (32-32)
  • ErrorField (376-383)
  • BifrostErrorExtraFields (425-429)
  • ResponsesRequest (91-91)
core/schemas/provider.go (2)
  • Provider (282-309)
  • PostHookRunner (279-279)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/gemini/responses.go (1)
  • ToGeminiResponsesRequest (71-121)
tests/integrations/tests/test_google.py (2)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
tests/integrations/tests/utils/common.py (1)
  • skip_if_no_api_key (1625-1636)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (13)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • FunctionCallingConfig (92-99)
  • GenerationConfig (631-697)
  • SafetySetting (81-89)
  • GenerateContentResponse (1298-1314)
  • Candidate (1223-1249)
  • Content (933-941)
  • FunctionCall (1102-1112)
  • FunctionResponse (1115-1137)
  • FinishReason (22-22)
  • GenerateContentResponseUsageMetadata (1270-1295)
  • Tool (600-627)
core/schemas/utils.go (3)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
  • Ptr (16-18)
core/schemas/models.go (1)
  • Model (109-129)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py

890-890: Unused method argument: test_config

(ARG002)


966-966: Unused method argument: test_config

(ARG002)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (7)
core/providers/gemini/gemini_test.go (1)

35-57: Reasoning model switch to a preview SKU may make these tests flaky across environments.

If CI/dev environments don’t all have access to "gemini-3-pro-preview", consider gating the Reasoning scenario/model behind an env var (or falling back to a stable reasoning-capable model when unavailable), so the stack stays green.

tests/integrations/tests/test_google.py (1)

86-89: Gemini reasoning prompt wiring looks fine.

core/providers/vertex/vertex.go (3)

20-24: Gemini native integration import is expected for this stack.


314-329: Vertex Gemini native request shaping/endpoint selection looks consistent (generateContent on v1).

Also applies to: 403-408


1392-1408: Likely an incorrect or invalid review comment.

core/providers/gemini/gemini.go (1)

369-395: The SSE parsing implementation is correct for Gemini's documented format.

Gemini's official API documentation confirms that each SSE event contains exactly one data: line with a single JSON object (GenerateContentResponse). The server does not send multi-line events, event: fields, or id: fields. The code's line-by-line parsing with data: prefix extraction correctly implements Gemini's SSE contract.
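
The contract described above can be sketched as a minimal line-oriented parser. This is an illustrative standalone example, not code from the provider; `parseSSELines` and the sample stream are hypothetical:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseSSELines extracts the JSON payload from each "data:" line of an
// SSE stream. Since each Gemini event carries exactly one data: line
// with one complete JSON object, no cross-line buffering is needed.
func parseSSELines(stream string) []string {
	var payloads []string
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "data:") {
			payloads = append(payloads, strings.TrimSpace(strings.TrimPrefix(line, "data:")))
		}
	}
	return payloads
}

func main() {
	stream := "data: {\"chunk\":1}\n\ndata: {\"chunk\":2}\n\n"
	for _, p := range parseSSELines(stream) {
		fmt.Println(p)
	}
}
```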

transports/bifrost-http/integrations/genai.go (1)

35-56: No issues found with callback ordering or nil payload handling.

The PreCallback (extractAndSetModelFromURL) executes before RequestConverter (router.go lines 338–346), properly populating geminiReq.Model, geminiReq.Stream, geminiReq.IsEmbedding, geminiReq.IsSpeech, and geminiReq.IsTranscription before the routing logic branches on these flags.

For streaming, the router explicitly skips nil payloads via the continue statement (router.go lines 809–812) when convertedResponse == nil && err == nil, ensuring no blank SSE frames are emitted for skipped lifecycle events.

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-04-feat_responses_reasoning_fixes to graphite-base/1018 December 12, 2025 08:27
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from fd14a0e to 3154215 Compare December 12, 2025 09:54
@TejasGhatte TejasGhatte changed the base branch from graphite-base/1018 to 12-04-feat_responses_reasoning_fixes December 12, 2025 09:54

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/schemas/utils.go (1)

1047-1055: Add a proper doc comment + make IsGeminiModel case-insensitive.

As an exported helper it should carry a doc comment, and strings.Contains(model, "gemini") is brittle for mixed-case ids such as "Gemini-...".

 // IsMistralModel checks if the model is a Mistral or Codestral model.
 func IsMistralModel(model string) bool {
 	return strings.Contains(model, "mistral") || strings.Contains(model, "codestral")
 }
 
-func IsGeminiModel(model string) bool {
-	return strings.Contains(model, "gemini")
-}
+// IsGeminiModel checks if the model id refers to a Gemini model.
+func IsGeminiModel(model string) bool {
+	return strings.Contains(strings.ToLower(model), "gemini")
+}
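
In isolation, the case-insensitive check suggested above behaves like this (a minimal sketch; the function is reproduced here for demonstration only):

```go
package main

import (
	"fmt"
	"strings"
)

// isGeminiModel lowercases before matching, so mixed-case ids like
// "Gemini-..." are still recognized.
func isGeminiModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "gemini")
}

func main() {
	for _, m := range []string{
		"gemini-2.5-flash",
		"Gemini-3-Pro-Preview",
		"google/gemini-2.0-flash-001",
		"gpt-4o",
	} {
		fmt.Printf("%s => %v\n", m, isGeminiModel(m))
	}
}
```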
core/providers/gemini/utils.go (1)

576-610: If thought_signature is base64 in ExtraContent, decode it here (keep fallback for older payloads).
This should be updated together with the streaming fix in core/providers/gemini/chat.go.

+import "encoding/base64"
...
 if toolCall.ExtraContent != nil {
   if googleData, ok := toolCall.ExtraContent["google"].(map[string]interface{}); ok {
     if thoughtSig, ok := googleData["thought_signature"].(string); ok {
-      part.ThoughtSignature = []byte(thoughtSig)
+      if decoded, err := base64.StdEncoding.DecodeString(thoughtSig); err == nil {
+        part.ThoughtSignature = decoded
+      } else {
+        // Backward-compat: treat as raw string bytes if not base64
+        part.ThoughtSignature = []byte(thoughtSig)
+      }
     }
   }
 }
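
A standalone sketch of the decode-with-fallback path suggested above (function name illustrative). Note the inherent ambiguity the fallback accepts: a legacy raw value that happens to be valid base64 will be decoded rather than passed through:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeThoughtSignature tries base64 first and falls back to raw bytes
// for older payloads. A legacy raw string that is also valid base64
// will be (mis)decoded — the accepted trade-off of the compat path.
func decodeThoughtSignature(s string) []byte {
	if decoded, err := base64.StdEncoding.DecodeString(s); err == nil {
		return decoded
	}
	return []byte(s)
}

func main() {
	fmt.Printf("%x\n", decodeThoughtSignature("3q0="))         // base64 path
	fmt.Printf("%s\n", decodeThoughtSignature("not base64!!")) // fallback path
}
```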
♻️ Duplicate comments (5)
tests/integrations/tests/test_google.py (2)

512-521: Rename this test (or change assertions): it’s not “error handling” anymore.

It still reads like an error-path test but now asserts a successful response. This was already called out in prior review feedback.


889-964: Extended-thinking test: fix unused arg + confirm SDK/model support for thinking_config and thoughts_token_count.

  • Ruff flags test_config as unused (same pattern noted earlier).
  • Please verify the repo’s pinned google-genai SDK version supports types.ThinkingConfig(...) and that the configured model reliably returns usage_metadata.thoughts_token_count.
-    def test_26_extended_thinking(self, google_client, test_config):
+    def test_26_extended_thinking(self, google_client, _test_config):
core/providers/gemini/chat.go (2)

190-193: Stream “last chunk” detection should also consider usage metadata.
Right now isLastChunk depends only on FinishReason, so you can miss emitting the terminal chunk/usage.

- isLastChunk := candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified
+ isLastChunk := (candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified) ||
+   response.UsageMetadata != nil

Also applies to: 333-337


259-266: Critical: don't serialize thought_signature via a raw []byte-to-string cast in streaming tool calls.
This can be invalid UTF-8 and will get corrupted on JSON marshal. Base64 it (and decode when converting back to Gemini requests).

- "thought_signature": string(part.ThoughtSignature),
+ "thought_signature": base64.StdEncoding.EncodeToString(part.ThoughtSignature),
core/providers/vertex/vertex.go (1)

500-525: Use context-aware raw flags in Gemini ChatCompletion response handling (currently bypassed).
HandleProviderResponse(..., provider.sendBackRawRequest, provider.sendBackRawResponse) ignores per-request overrides (unlike other branches in this file).

- rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(resp.Body(), &geminiResponse, jsonBody, provider.sendBackRawRequest, provider.sendBackRawResponse)
+ rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(
+   resp.Body(),
+   &geminiResponse,
+   jsonBody,
+   providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+   providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+ )
🧹 Nitpick comments (3)
tests/integrations/tests/utils/common.py (1)

225-247: New Gemini reasoning prompts are OK; keep an eye on flakiness/cost.

These are long, multi-step prompts; if CI starts timing out, consider shortening them or making “reasoning intensity” configurable (env-driven) in the tests that consume them.

core/providers/vertex/vertex.go (1)

1392-1408: Optional: nil-guard stripVertexGeminiUnsupportedFields for defensive safety.
This helper can safely no-op on nil to reduce future call-site risk.

 func stripVertexGeminiUnsupportedFields(requestBody *gemini.GeminiGenerationRequest) {
+  if requestBody == nil {
+    return
+  }
   for _, content := range requestBody.Contents {
     for _, part := range content.Parts {
       ...
     }
   }
 }
core/providers/gemini/gemini.go (1)

225-251: Add explicit nil-check for non-streaming ChatCompletion request conversion (parity with stream).

 jsonData, err := providerUtils.CheckContextAndGetRequestBody(
   ctx,
   request,
-  func() (any, error) { return ToGeminiChatCompletionRequest(request), nil },
+  func() (any, error) {
+    reqBody := ToGeminiChatCompletionRequest(request)
+    if reqBody == nil {
+      return nil, fmt.Errorf("chat completion request is not provided or could not be converted to Gemini format")
+    }
+    return reqBody, nil
+  },
   provider.GetProviderKey(),
 )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd14a0e and 3154215.

📒 Files selected for processing (17)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/embedding.go (0 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (2 hunks)
  • core/providers/gemini/responses.go (14 hunks)
  • core/providers/gemini/speech.go (2 hunks)
  • core/providers/gemini/transcription.go (0 hunks)
  • core/providers/gemini/types.go (1 hunks)
  • core/providers/gemini/utils.go (8 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • core/providers/vertex/vertex_test.go (2 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (1 hunks)
  • tests/integrations/tests/test_google.py (4 hunks)
  • tests/integrations/tests/utils/common.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
  • core/providers/gemini/transcription.go
  • core/providers/gemini/embedding.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/gemini/gemini_test.go
  • core/providers/gemini/types.go
  • core/providers/gemini/speech.go
  • core/schemas/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/vertex/vertex_test.go
  • core/providers/vertex/vertex.go
  • core/schemas/utils.go
  • tests/integrations/tests/test_google.py
  • tests/integrations/config.yml
  • core/providers/gemini/chat.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/utils.go
  • tests/integrations/tests/utils/common.py
  • core/providers/gemini/responses.go
  • core/providers/gemini/gemini.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/vertex/vertex_test.go
  • core/providers/vertex/vertex.go
  • core/schemas/utils.go
  • core/providers/gemini/chat.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/responses.go
  • core/providers/gemini/gemini.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.

Applied to files:

  • transports/bifrost-http/integrations/genai.go
🧬 Code graph analysis (4)
tests/integrations/tests/test_google.py (2)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
tests/integrations/tests/utils/common.py (1)
  • skip_if_no_api_key (1625-1636)
core/providers/gemini/chat.go (4)
core/schemas/chatcompletions.go (13)
  • BifrostChatRequest (12-19)
  • ChatAssistantMessageToolCall (686-692)
  • ChatContentBlock (590-597)
  • ChatReasoningDetails (723-730)
  • ChatAssistantMessageToolCallFunction (695-698)
  • ChatToolChoiceTypeFunction (373-373)
  • ChatMessage (469-478)
  • ChatAssistantMessage (626-632)
  • BifrostResponseChoice (704-712)
  • ChatNonStreamResponseChoice (745-748)
  • BifrostChatResponse (26-41)
  • ChatStreamResponseChoiceDelta (756-763)
  • ChatStreamResponseChoice (751-753)
core/providers/gemini/types.go (6)
  • GeminiGenerationRequest (55-73)
  • GenerationConfig (631-697)
  • Content (933-941)
  • Type (778-778)
  • Role (13-13)
  • FinishReason (22-22)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (227-232)
core/providers/gemini/utils.go (4)
core/providers/gemini/types.go (22)
  • ThinkingLevel (915-915)
  • ThinkingLevelUnspecified (918-918)
  • ThinkingLevelLow (919-919)
  • ThinkingLevelHigh (920-920)
  • Type (778-778)
  • Schema (724-775)
  • FinishReason (22-22)
  • FinishReasonStop (28-28)
  • FinishReasonMaxTokens (30-30)
  • FinishReasonSafety (33-33)
  • FinishReasonRecitation (35-35)
  • FinishReasonLanguage (37-37)
  • FinishReasonOther (39-39)
  • FinishReasonBlocklist (41-41)
  • FinishReasonProhibitedContent (43-43)
  • FinishReasonSPII (46-46)
  • FinishReasonMalformedFunctionCall (48-48)
  • FinishReasonImageSafety (50-50)
  • FinishReasonUnexpectedToolCall (52-52)
  • GenerateContentResponseUsageMetadata (1270-1295)
  • Modality (706-706)
  • FunctionResponse (1115-1137)
core/schemas/responses.go (5)
  • ResponsesParameters (87-114)
  • ResponsesParametersReasoning (238-243)
  • ResponsesTextConfig (120-123)
  • ResponsesTextConfigFormat (125-130)
  • ResponsesTextConfigFormatJSONSchema (133-142)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (9)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • FunctionCallingConfig (92-99)
  • GenerationConfig (631-697)
  • SafetySetting (81-89)
  • Content (933-941)
  • FunctionCall (1102-1112)
  • FunctionResponse (1115-1137)
  • Tool (600-627)
core/schemas/utils.go (4)
  • ParseModelString (23-36)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
  • Ptr (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py

890-890: Unused method argument: test_config

(ARG002)


966-966: Unused method argument: test_config

(ARG002)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (12)
tests/integrations/config.yml (1)

62-75: Gemini model bump looks consistent; please sanity-check cross-stack expectations.

Switching chat/vision/tools/streaming to gemini-2.5-flash is straightforward and matches the PR’s “native Gemini + streaming-first” direction. Just make sure other PRs in the Graphite stack (and any pinned fixtures) aren’t still asserting gemini-2.0-flash* model ids for these scenarios.

core/providers/vertex/vertex_test.go (2)

25-49: Verify ChatModel/VisionModel naming consistency for Vertex (prefix vs no prefix).

ChatModel is still google/gemini-2.0-flash-001 while VisionModel is now gemini-2.0-flash-001. If Vertex parsing now expects the native Gemini format for Gemini models, consider aligning both (or add a short comment why they must differ).


37-41: Scenario gating for multiple tool calls is clear.

Setting MultipleToolCalls: false matches the documented constraint and should reduce flaky failures in this suite.

tests/integrations/tests/utils/common.py (1)

248-256: IMAGE_URL_MESSAGES structure change is the right direction.

Splitting text and image into separate content blocks is more interoperable across providers/converters than a combined structure.

tests/integrations/tests/test_google.py (2)

37-38: Test plan doc update matches the new extended-thinking coverage.


86-89: Imports for Gemini-specific prompts are fine; ensure they’re only used in Google-keyed tests.

core/providers/vertex/vertex.go (1)

282-286: Good normalization of google/ deployment prefix.
This makes Vertex deployments consistent with Gemini model naming expectations.

core/providers/gemini/chat.go (1)

78-174: Non-stream Gemini→Bifrost assembly looks solid (tool IDs, args serialization, thought signature base64).
Good improvements to preserve IDs and keep signature JSON-safe.

transports/bifrost-http/integrations/genai.go (2)

35-84: GenAI route refactor to Responses is consistent and the nil-skip for lifecycle events is sensible.


72-80: Verify SSE writer behavior when ResponsesStreamResponseConverter returns empty event type.
Given the transport-layer guidance to use typed SSE events for Responses-style streams, confirm that "" doesn't accidentally trigger legacy termination behavior (e.g., [DONE]) or omit required event names. Check the SSE writer implementation to ensure empty event types are handled correctly and align with stream event naming conventions.

core/providers/gemini/utils.go (1)

11-84: ResponsesParameters mapping (incl. ThinkingConfig) looks good.

core/providers/gemini/gemini.go (1)

369-485: SSE parsing is correct for Gemini's format. Gemini sends complete JSON data in individual SSE events (one data: line per complete JSON payload), so the line-oriented scanner correctly processes each chunk without needing to buffer across multiple lines.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/providers/vertex/vertex.go (1)

1272-1275: Add global region handling to embedding endpoint to match other methods.

The embedding endpoint (lines 1273-1274) fails to special-case the global region, unlike all other methods in the file. When region is "global", the current code produces an invalid URL: https://global-aiplatform.googleapis.com/... instead of https://aiplatform.googleapis.com/... with locations/global.

- url := fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:predict",
-     key.VertexKeyConfig.Region, key.VertexKeyConfig.ProjectID, key.VertexKeyConfig.Region, deployment)
+ host := "aiplatform.googleapis.com"
+ location := region
+ if region != "global" {
+     host = fmt.Sprintf("%s-aiplatform.googleapis.com", region)
+ } else {
+     location = "global"
+ }
+ url := fmt.Sprintf("https://%s/v1/projects/%s/locations/%s/publishers/google/models/%s:predict",
+     host, projectID, location, deployment)
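
The host/location selection above can be factored into a small helper; the name and shape here are illustrative only, not the file's actual structure:

```go
package main

import "fmt"

// vertexPredictURL picks the regional host unless the region is
// "global", which must use the bare aiplatform.googleapis.com host
// with locations/global in the path.
func vertexPredictURL(region, projectID, deployment string) string {
	host := "aiplatform.googleapis.com"
	if region != "global" {
		host = region + "-aiplatform.googleapis.com"
	}
	return fmt.Sprintf("https://%s/v1/projects/%s/locations/%s/publishers/google/models/%s:predict",
		host, projectID, region, deployment)
}

func main() {
	fmt.Println(vertexPredictURL("global", "my-project", "text-embedding-004"))
	fmt.Println(vertexPredictURL("us-central1", "my-project", "text-embedding-004"))
}
```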
core/providers/gemini/utils.go (1)

11-84: ExtraParams stores penalty pointers; this can break downstream SafeExtractFloat64.

presence_penalty and frequency_penalty are stored as *float64 (Lines 54-58). Downstream conversions typically expect primitives (float64) for safe extraction/marshalling consistency. Store dereferenced values instead.

@@
-    if config.PresencePenalty != nil {
-        params.ExtraParams["presence_penalty"] = config.PresencePenalty
-    }
+    if config.PresencePenalty != nil {
+        params.ExtraParams["presence_penalty"] = *config.PresencePenalty
+    }
@@
-    if config.FrequencyPenalty != nil {
-        params.ExtraParams["frequency_penalty"] = config.FrequencyPenalty
-    }
+    if config.FrequencyPenalty != nil {
+        params.ExtraParams["frequency_penalty"] = *config.FrequencyPenalty
+    }
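
The failure mode is easy to reproduce: a float64 type assertion on a map value that holds a *float64 fails. A minimal sketch — extractFloat64 stands in for the real SafeExtractFloat64, whose implementation isn't shown here:

```go
package main

import "fmt"

// extractFloat64 mimics a typical safe extractor: it only succeeds
// when the stored value is a plain float64, not a pointer to one.
func extractFloat64(params map[string]any, key string) (float64, bool) {
	v, ok := params[key]
	if !ok {
		return 0, false
	}
	f, ok := v.(float64)
	return f, ok
}

func main() {
	p := 0.5
	withPointer := map[string]any{"presence_penalty": &p} // what the current code stores
	withValue := map[string]any{"presence_penalty": p}    // what the fix stores

	_, ok := extractFloat64(withPointer, "presence_penalty")
	fmt.Println("pointer extract ok:", ok)
	v, ok := extractFloat64(withValue, "presence_penalty")
	fmt.Println("value extract ok:", ok, v)
}
```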
♻️ Duplicate comments (8)
core/providers/vertex/vertex.go (2)

504-505: Use ctx-aware raw flags in Gemini branches (per-request overrides currently bypassed).
Right now Gemini branches pass provider.sendBackRawRequest/Response into HandleProviderResponse, while later gating uses ShouldSendBackRaw*, which can (a) ignore per-request enable/disable, and (b) do unnecessary work.

- rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(resp.Body(), &geminiResponse, jsonBody, provider.sendBackRawRequest, provider.sendBackRawResponse)
+ rawRequest, rawResponse, bifrostErr := providerUtils.HandleProviderResponse(
+     resp.Body(),
+     &geminiResponse,
+     jsonBody,
+     providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest),
+     providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse),
+ )

(Apply similarly to the Responses Gemini branch.)

Also applies to: 1012-1013


689-728: Do not send Vertex Gemini auth via ?key=; always use OAuth2 Bearer + avoid URL secret leakage.
These Gemini publisher streaming URLs append key= and skip Authorization when present. This is both a likely functional bug (Vertex publisher endpoints typically require OAuth2) and a security footgun (API keys in URLs get logged, cached, and propagated).

- // Auth query is used to pass the API key in the query string
- authQuery := ""
- if key.Value != "" {
-     authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
- }
-
  // Construct the URL for Gemini streaming
  var completeURL string
  if region == "global" {
      completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1/projects/%s/locations/global/publishers/google/models/%s:streamGenerateContent", projectID, deployment)
  } else {
      completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:streamGenerateContent", region, projectID, region, deployment)
  }
-
- // Add alt=sse parameter
- if authQuery != "" {
-     completeURL = fmt.Sprintf("%s?alt=sse&%s", completeURL, authQuery)
- } else {
-     completeURL = fmt.Sprintf("%s?alt=sse", completeURL)
- }
+ completeURL = fmt.Sprintf("%s?alt=sse", completeURL)

  headers := map[string]string{
+     "Content-Type":  "application/json",
      "Accept":        "text/event-stream",
      "Cache-Control": "no-cache",
  }

- // If no auth query, use OAuth2 token
- if authQuery == "" {
-     tokenSource, err := getAuthTokenSource(key)
-     ...
-     headers["Authorization"] = "Bearer " + token.AccessToken
- }
+ // Always use OAuth2 token for Vertex publisher endpoints
+ tokenSource, err := getAuthTokenSource(key)
+ if err != nil { ... }
+ token, err := tokenSource.Token()
+ if err != nil { ... }
+ headers["Authorization"] = "Bearer " + token.AccessToken
Please verify against the Vertex AI docs: does `:streamGenerateContent` on publisher endpoints accept API keys via `?key=` / `x-goog-api-key`, or is an OAuth2 Bearer token required?

Also applies to: 1164-1202

tests/integrations/tests/test_google.py (2)

86-89: Fix unused test_config args in new tests (ruff), and align numbering if intended.
Ruff flags test_config unused in test_26_extended_thinking / test_27_extended_thinking_streaming.

- def test_26_extended_thinking(self, google_client, test_config):
+ def test_26_extended_thinking(self, google_client, _test_config):

- def test_27_extended_thinking_streaming(self, google_client, test_config):
+ def test_27_extended_thinking_streaming(self, google_client, _test_config):

Also applies to: 889-1054


512-521: Test name/docstring still say “error handling”, but assertions expect success.
Either rename to “graceful handling” (and keep success assertions) or switch assertions back to validating the error path.

core/providers/gemini/chat.go (2)

176-339: Fix thought signature serialization: don’t cast []byte to string in JSON payloads.

In ToBifrostChatCompletionStream, thought_signature is set via string(part.ThoughtSignature) (Lines 259-266). This will corrupt non-UTF8 bytes on JSON marshal. Base64 encode (same file already uses base64 for reasoningDetails).
This matches a previously raised issue.

@@
-                if len(part.ThoughtSignature) > 0 {
+                if len(part.ThoughtSignature) > 0 {
+                    thoughtSigB64 := base64.StdEncoding.EncodeToString(part.ThoughtSignature)
                     toolCall.ExtraContent = map[string]interface{}{
                         "google": map[string]interface{}{
-                            "thought_signature": string(part.ThoughtSignature),
+                            "thought_signature": thoughtSigB64,
                         },
                     }
                 }

188-193: “Last chunk” detection contradicts the comment and can miss stream termination.

The comment says “finish reason or usage metadata” (Line 190), but isLastChunk only checks FinishReason (Line 191). If Gemini ends with usage metadata but no finish reason, you may never emit the terminal chunk/usage. This matches a previously raised issue and looks regressed here.

-    isLastChunk := candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified
+    isLastChunk := (candidate.FinishReason != "" && candidate.FinishReason != FinishReasonUnspecified) ||
+        response.UsageMetadata != nil

Also applies to: 333-339

core/providers/gemini/responses.go (2)

428-433: Bug: nil-check uses the wrong field path (Item.EncryptedContent).

In ToGeminiResponsesStreamResponse, the condition checks bifrostResp.Item.EncryptedContent but then dereferences bifrostResp.Item.ResponsesReasoning.EncryptedContent (Lines 429-432). This can panic or skip valid signatures.
This matches a previously raised issue.

-    if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
+    if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
         candidate.Content.Parts = append(candidate.Content.Parts, &Part{
             ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
         })
     }

1049-1053: outputIndex collisions likely when streams start with tool calls / non-text items.

Several handlers force outputIndex = 0 when !state.HasStartedText (e.g., function response / inline data / file data). If the stream begins with tool calls, indices can collide and overwrite state.ItemIDs/tool buffers.
This matches a previously raised issue.

@@
 type GeminiResponsesStreamState struct {
@@
-    CurrentOutputIndex int            // Current output index
+    CurrentOutputIndex int            // Last allocated output index (init to -1)
@@
 }
+
+func (state *GeminiResponsesStreamState) nextOutputIndex() int {
+    state.CurrentOutputIndex++
+    return state.CurrentOutputIndex
+}
@@
-    state.CurrentOutputIndex = 0
+    state.CurrentOutputIndex = -1

Then replace ad-hoc calculations like:

-    outputIndex := state.CurrentOutputIndex + 1
-    if !state.HasStartedText { outputIndex = 0 }
-    state.CurrentOutputIndex = outputIndex
+    outputIndex := state.nextOutputIndex()

Also applies to: 1122-1126, 1199-1204
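
The suggested allocator pattern in isolation: a monotonic counter initialized to -1 hands out unique indices regardless of which part type arrives first. Type and field names below are illustrative, not the real GeminiResponsesStreamState:

```go
package main

import "fmt"

// streamState mirrors the suggested fix: the index starts at -1 and
// every item type allocates via nextOutputIndex, so a stream that
// begins with tool calls can never collide on index 0 with text.
type streamState struct {
	currentOutputIndex int
}

func newStreamState() *streamState {
	return &streamState{currentOutputIndex: -1}
}

func (s *streamState) nextOutputIndex() int {
	s.currentOutputIndex++
	return s.currentOutputIndex
}

func main() {
	state := newStreamState()
	for _, item := range []string{"function_call", "function_call", "text"} {
		fmt.Printf("%s => output_index %d\n", item, state.nextOutputIndex())
	}
}
```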

🧹 Nitpick comments (6)
core/providers/vertex/vertex.go (2)

282-287: Deduplicate google/ deployment normalization (helper).
Same prefix-stripping logic is repeated across multiple entrypoints; consider a small helper on VertexProvider to keep this consistent and avoid drift.

Also applies to: 577-582, 833-838, 1064-1069


1392-1408: Sanitizer is clear; add a safety note re multi-tool-call correlation.
Clearing FunctionCall.ID/FunctionResponse.ID to satisfy Vertex constraints is understandable, but please ensure multi-tool-call flows still correlate correctly (esp. same-name tool calls). A small comment here explaining the correlation strategy would help future maintainers.

transports/bifrost-http/integrations/genai.go (1)

35-56: Good migration to ResponsesRequest, but the route comment is now misleading.

The “Chat completions endpoint” comment (Line 27) no longer matches behavior: this route now multiplexes embedding/speech/transcription/responses. Consider updating the comment so it reflects the multiplexed routing and stays clear across the stack.

core/providers/gemini/gemini.go (2)

225-251: ChatCompletion should mirror the explicit nil-check pattern used elsewhere.

ChatCompletion passes ToGeminiChatCompletionRequest(request) directly (Line 228) while ChatCompletionStream/Responses* paths guard nil conversions. For consistency (and clearer errors), wrap it the same way.


335-353: errors.Is(doErr, context.Canceled) is likely ineffective with fasthttp.Client.Do.

fasthttp.Client.Do isn’t context-aware; cancellation usually won’t surface as context.Canceled. Consider simplifying error classification (timeout vs other), and rely on the ctx.Done() loop exit (previous comment) for cancellation behavior.

Also applies to: 628-646

core/providers/gemini/responses.go (1)

96-107: Conversion errors are silently dropped (ToGeminiResponsesRequest returns nil).

When convertResponsesMessagesToGeminiContents fails, ToGeminiResponsesRequest returns nil (Lines 99-101), losing the underlying error detail (callers only see a generic “could not be converted” elsewhere). Consider restoring an error return ((*GeminiGenerationRequest, error)) to preserve actionable diagnostics across the stack.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd14a0e and 3154215.

📒 Files selected for processing (17)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/embedding.go (0 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (2 hunks)
  • core/providers/gemini/responses.go (14 hunks)
  • core/providers/gemini/speech.go (2 hunks)
  • core/providers/gemini/transcription.go (0 hunks)
  • core/providers/gemini/types.go (1 hunks)
  • core/providers/gemini/utils.go (8 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • core/providers/vertex/vertex_test.go (2 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (1 hunks)
  • tests/integrations/tests/test_google.py (4 hunks)
  • tests/integrations/tests/utils/common.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
  • core/providers/gemini/transcription.go
  • core/providers/gemini/embedding.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/vertex/vertex_test.go
  • tests/integrations/config.yml
  • core/schemas/utils.go
  • core/providers/gemini/speech.go
  • core/schemas/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/types.go
  • core/providers/vertex/vertex.go
  • core/providers/gemini/gemini_test.go
  • transports/bifrost-http/integrations/genai.go
  • tests/integrations/tests/test_google.py
  • tests/integrations/tests/utils/common.py
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/responses.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/types.go
  • core/providers/vertex/vertex.go
  • core/providers/gemini/gemini_test.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/responses.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.

Applied to files:

  • transports/bifrost-http/integrations/genai.go
🧬 Code graph analysis (4)
core/providers/gemini/utils.go (3)
core/providers/gemini/types.go (22)
  • GeminiGenerationRequest (55-73)
  • GenerationConfig (631-697)
  • ThinkingLevel (915-915)
  • ThinkingLevelUnspecified (918-918)
  • ThinkingLevelLow (919-919)
  • ThinkingLevelHigh (920-920)
  • Type (778-778)
  • Schema (724-775)
  • FinishReason (22-22)
  • FinishReasonStop (28-28)
  • FinishReasonMaxTokens (30-30)
  • FinishReasonSafety (33-33)
  • FinishReasonRecitation (35-35)
  • FinishReasonLanguage (37-37)
  • FinishReasonOther (39-39)
  • FinishReasonBlocklist (41-41)
  • FinishReasonProhibitedContent (43-43)
  • FinishReasonSPII (46-46)
  • FinishReasonMalformedFunctionCall (48-48)
  • FinishReasonImageSafety (50-50)
  • FinishReasonUnexpectedToolCall (52-52)
  • GenerateContentResponseUsageMetadata (1270-1295)
core/schemas/responses.go (7)
  • ResponsesParameters (87-114)
  • ResponsesParametersReasoning (238-243)
  • ResponsesTextConfig (120-123)
  • ResponsesTextConfigFormat (125-130)
  • ResponsesResponseUsage (258-265)
  • ResponsesResponseOutputTokens (275-285)
  • ResponsesResponseInputTokens (267-273)
core/schemas/chatcompletions.go (4)
  • ToolFunctionParameters (259-266)
  • BifrostLLMUsage (812-819)
  • ChatPromptTokensDetails (821-827)
  • ChatCompletionTokensDetails (829-840)
core/providers/gemini/chat.go (5)
core/schemas/chatcompletions.go (15)
  • BifrostChatRequest (12-19)
  • ChatAssistantMessageToolCall (686-692)
  • ChatContentBlock (590-597)
  • ChatReasoningDetails (723-730)
  • ChatAssistantMessageToolCallFunction (695-698)
  • BifrostReasoningDetailsTypeEncrypted (718-718)
  • ChatMessage (469-478)
  • ChatMessageRoleAssistant (461-461)
  • ChatMessageContent (524-527)
  • ChatAssistantMessage (626-632)
  • BifrostResponseChoice (704-712)
  • ChatNonStreamResponseChoice (745-748)
  • BifrostChatResponse (26-41)
  • ChatStreamResponseChoiceDelta (756-763)
  • ChatStreamResponseChoice (751-753)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/bifrost.go (1)
  • BifrostError (358-367)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (227-232)
core/providers/gemini/gemini.go (5)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (12-63)
core/schemas/bifrost.go (9)
  • RequestType (83-83)
  • ChatCompletionRequest (89-89)
  • BifrostStream (323-330)
  • BifrostError (358-367)
  • Gemini (48-48)
  • ModelProvider (32-32)
  • ErrorField (376-383)
  • BifrostErrorExtraFields (425-429)
  • BifrostResponseExtraFields (288-298)
core/providers/utils/utils.go (8)
  • CheckContextAndGetRequestBody (257-275)
  • GetPathFromContext (212-217)
  • SetExtraHeaders (179-209)
  • ReleaseStreamingResponse (844-851)
  • NewBifrostOperationError (516-527)
  • NewProviderAPIError (531-546)
  • ProcessAndSendResponse (612-642)
  • GetBifrostResponseForStreamResponse (854-882)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/utils.go (1)
  • Ptr (51-53)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (14)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • GenerationConfig (631-697)
  • SafetySetting (81-89)
  • Part (947-971)
  • Role (13-13)
  • Content (933-941)
  • Type (778-778)
  • FinishReason (22-22)
  • FinishReasonMaxTokens (30-30)
  • FinishReasonSafety (33-33)
  • FinishReasonOther (39-39)
  • FinishReasonStop (28-28)
  • Tool (600-627)
core/schemas/responses.go (21)
  • BifrostResponsesRequest (32-39)
  • BifrostResponsesResponse (45-85)
  • ResponsesToolMessage (466-486)
  • ResponsesReasoning (735-738)
  • ResponsesStreamResponseTypePing (1365-1365)
  • ResponsesStreamResponseTypeCreated (1367-1367)
  • ResponsesStreamResponseTypeInProgress (1368-1368)
  • ResponsesStreamResponseTypeCompleted (1369-1369)
  • ResponsesStreamResponseTypeFailed (1370-1370)
  • ResponsesMessage (318-331)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-415)
  • ResponsesOutputMessageContentTypeText (396-396)
  • ResponsesToolMessageOutputStruct (535-539)
  • ResponsesInputMessageContentBlockTypeText (392-392)
  • ResponsesOutputMessageContentTypeReasoning (398-398)
  • ResponsesInputMessageContentBlockImage (417-420)
  • ResponsesInputMessageContentBlockAudio (428-431)
  • ResponsesInputMessageContentBlockFile (422-426)
  • ResponsesTool (1034-1049)
  • ResponsesParameters (87-114)
core/schemas/utils.go (4)
  • ParseModelString (23-36)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
  • Ptr (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py

890-890: Unused method argument: test_config

(ARG002)


966-966: Unused method argument: test_config

(ARG002)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check (×22)
🔇 Additional comments (5)
core/providers/vertex/vertex.go (1)

314-330: Gemini native request shaping + Vertex generateContent routing looks consistent.
Converting via gemini.ToGeminiChatCompletionRequest, setting Model=deployment, and routing to ...:generateContent is aligned with the PR goal of using native Gemini formats.

Also applies to: 403-408

tests/integrations/tests/utils/common.py (2)

225-246: Reasoning prompt fixtures are straightforward and reusable.
The new constants keep integration tests cleaner and avoid duplicating long prompts.


248-256: IMAGE_URL_MESSAGES shape change: ensure all consumers accept the new multimodal block format.
If any integration still expects a string content or the old single-block structure, it will break.

core/providers/gemini/types.go (1)

903-921: ThinkingLevel implementation is correct and follows established codebase patterns.

The enum values and omitempty behavior are consistent with other similar enum types in the codebase (FinishReason, FunctionCallingConfigMode, Behavior, etc.). All non-pointer string enum fields serialize their unspecified value explicitly (e.g., "THINKING_LEVEL_UNSPECIFIED"), which is the expected behavior per Gemini API. No changes needed.

transports/bifrost-http/integrations/genai.go (1)

72-84: Ensure “nil-skip” streaming conversion cannot emit blank/invalid SSE frames.

ResponsesStreamResponseConverter returns ("", nil, nil) when gemini.ToGeminiResponsesStreamResponse(resp) is nil (Lines 73-79). Please verify the downstream stream writer treats this as “skip” (no write), not as an empty event/data frame.
Based on learnings, this is especially important because Responses-style streaming relies on typed events, not [DONE] sentinels.

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-04-feat_responses_reasoning_fixes to graphite-base/1018 December 12, 2025 11:24
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from 3154215 to d5b03f0 Compare December 12, 2025 11:25
@TejasGhatte TejasGhatte changed the base branch from graphite-base/1018 to 12-04-feat_responses_reasoning_fixes December 12, 2025 11:26
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from b6f0c52 to ad08844 Compare December 12, 2025 11:26
@TejasGhatte TejasGhatte force-pushed the 12-04-feat_responses_reasoning_fixes branch from ad08844 to e1ce12c Compare December 12, 2025 11:40
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from d5b03f0 to 7f3e893 Compare December 12, 2025 11:40
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-04-feat_responses_reasoning_fixes to graphite-base/1018 December 12, 2025 12:15
Collaborator

Pratham-Mishra04 commented Dec 12, 2025

Merge activity

  • Dec 12, 1:21 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Dec 12, 1:32 PM UTC: Graphite couldn't merge this PR because it had merge conflicts.
  • Dec 12, 2:43 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Dec 12, 2:44 PM UTC: @Pratham-Mishra04 merged this pull request with Graphite.

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from graphite-base/1018 to main December 12, 2025 13:31
@TejasGhatte TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from 7f3e893 to 828c298 Compare December 12, 2025 13:46
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/schemas/responses.go (1)

134-142: Fix structural ambiguity in ResponsesTextConfigFormatJSONSchema: fields belong at different nesting levels per OpenAI spec.

ResponsesTextConfigFormatJSONSchema incorrectly mixes wrapper-level fields (name, strict, description) with JSON-Schema-level fields (type, properties, required, additionalProperties). According to the OpenAI Responses API spec, the correct structure is:

{
  "type": "json_schema",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {...},
    "required": [...],
    "additionalProperties": false
  }
}

The current struct design will produce incorrect JSON with duplicate/misplaced wrapper fields and a confusing schema.schema nesting when the Schema field is populated alongside schema-level fields. The Schema field itself is problematic—it should either represent the entire schema object (making other schema fields redundant) or not exist at all.

Add a marshal test that validates the exact JSON output against the intended OpenAI shape, then refactor to enforce a single, correct representation: either populate Schema with the complete schema object, or use only the individual schema fields—not both.

core/providers/gemini/responses.go (1)

1705-1735: ThoughtSignature → EncryptedContent should be base64, not string([]byte)

This has the same invalid-UTF8/JSON-corruption risk as the chat streaming path, and it’s inconsistent with other places that base64-encode encrypted content.

- thoughtSig := string(part.ThoughtSignature)
+ thoughtSig := base64.StdEncoding.EncodeToString(part.ThoughtSignature)
  reasoningMsg := schemas.ResponsesMessage{
    ...
    ResponsesReasoning: &schemas.ResponsesReasoning{
      Summary:          []schemas.ResponsesReasoningSummary{},
      EncryptedContent: &thoughtSig,
    },
  }

Then, when converting EncryptedContent back into Part.ThoughtSignature, base64-decode instead of []byte(str) (several call sites in this file currently do the latter).

♻️ Duplicate comments (5)
tests/integrations/tests/test_google.py (2)

512-521: Test 12 intent mismatch (name/docstring vs assertions).

Either rename/docstring to “graceful handling of alternate role content” (keeping success assertions) or switch assertions to validate the error path.


889-1054: Fix unused test_config + make streaming “too many chunks” fail loudly.

Suggested diff:

-    def test_26_extended_thinking(self, google_client, test_config):
+    def test_26_extended_thinking(self, google_client, _test_config):

-    def test_27_extended_thinking_streaming(self, google_client, test_config):
+    def test_27_extended_thinking_streaming(self, google_client, _test_config):
@@
-            if chunk_count > 500:
-                break
+            if chunk_count > 500:
+                raise AssertionError(f"Received >500 streaming chunks; possible non-terminating stream (chunk_count={chunk_count})")

Also please verify get_model("google", "chat") resolves to a thinking-capable Gemini model in your test config (otherwise these assertions will be flaky).

#!/bin/bash
set -euo pipefail

# Show what "chat" maps to for google in your integration test config plumbing.
rg -n --hidden -S 'get_model\("google",\s*"chat"\)|model_type:\s*chat|google:\s*$|provider:\s*google|integration:\s*google' tests/integrations -n || true

# If you have a YAML/JSON config for integrations, print likely candidates.
fd -a 'config\.yml|config\.yaml|integrations\.yml|integrations\.yaml' tests/integrations || true
core/providers/vertex/vertex.go (1)

669-743: Vertex Gemini streaming auth: don’t fall back to API key query params; always use OAuth2.

Both Gemini streaming branches build ?alt=sse&key=... and omit Authorization when key.Value is set. For Vertex publisher streaming (:streamGenerateContent), this is very likely to fail (and can leak keys in logs).

Suggested diff pattern (apply to both ChatCompletionStream Gemini + ResponsesStream Gemini):

- // Auth query is used to pass the API key in the query string
- authQuery := ""
- if key.Value != "" {
-     authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
- }
...
- if authQuery != "" {
-     completeURL = fmt.Sprintf("%s?alt=sse&%s", completeURL, authQuery)
- } else {
-     completeURL = fmt.Sprintf("%s?alt=sse", completeURL)
- }
+ completeURL = fmt.Sprintf("%s?alt=sse", completeURL)
...
- // If no auth query, use OAuth2 token
- if authQuery == "" {
-     tokenSource, err := getAuthTokenSource(key)
-     ...
-     headers["Authorization"] = "Bearer " + token.AccessToken
- }
+ tokenSource, err := getAuthTokenSource(key)
+ ...
+ headers["Authorization"] = "Bearer " + token.AccessToken
Vertex AI publisher `streamGenerateContent` authentication: does it support API keys (`?key=...`) or require OAuth2 `Authorization: Bearer <token>` only?

Also applies to: 1134-1226

core/providers/gemini/gemini.go (1)

380-485: Streaming goroutines should stop on ctx.Done() to avoid leaks.

Both streaming handlers keep scanning until EOF without an in-loop cancellation check, so canceled callers can leave goroutines/streams running longer than necessary.

At minimum:

for scanner.Scan() {
+   select {
+   case <-ctx.Done():
+       return
+   default:
+   }
    line := scanner.Text()
    ...
}

(If you can, also ensure the underlying body stream gets closed on cancellation, otherwise scanner.Scan() may still block.)

Also applies to: 677-789

core/providers/gemini/responses.go (1)

424-433: Fix incorrect nil-check field path for reasoning encrypted content.

This condition checks bifrostResp.Item.EncryptedContent but then dereferences bifrostResp.Item.ResponsesReasoning.EncryptedContent. The check should match the dereference:

case schemas.ResponsesStreamResponseTypeOutputItemAdded:
-   if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
+   if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
        candidate.Content.Parts = append(candidate.Content.Parts, &Part{
            ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
        })
    }
🧹 Nitpick comments (2)
tests/integrations/tests/utils/common.py (1)

225-246: Nice addition: shared Gemini reasoning prompts improve test reuse.

Minor: prompts explicitly request “step-by-step reasoning”, which some models/providers may refuse or redact; if you see flakes, consider wording like “explain your reasoning” / “show calculations” instead.

core/providers/vertex/vertex.go (1)

20-21: google/ deployment prefix normalization is a good consistency win.

Given this pattern is repeated 4x, consider a tiny helper (e.g., normalizeGoogleDeployment(deployment string) string) to reduce drift.

Also applies to: 283-286, 578-581, 833-837, 1065-1068

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3154215 and 828c298.

📒 Files selected for processing (17)
  • core/providers/gemini/chat.go (3 hunks)
  • core/providers/gemini/embedding.go (0 hunks)
  • core/providers/gemini/gemini.go (1 hunks)
  • core/providers/gemini/gemini_test.go (2 hunks)
  • core/providers/gemini/responses.go (14 hunks)
  • core/providers/gemini/speech.go (2 hunks)
  • core/providers/gemini/transcription.go (0 hunks)
  • core/providers/gemini/types.go (1 hunks)
  • core/providers/gemini/utils.go (8 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • core/providers/vertex/vertex_test.go (2 hunks)
  • core/schemas/responses.go (1 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (1 hunks)
  • tests/integrations/tests/test_google.py (4 hunks)
  • tests/integrations/tests/utils/common.py (1 hunks)
  • transports/bifrost-http/integrations/genai.go (2 hunks)
💤 Files with no reviewable changes (2)
  • core/providers/gemini/transcription.go
  • core/providers/gemini/embedding.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/integrations/config.yml
  • core/providers/gemini/gemini_test.go
  • core/providers/gemini/types.go
  • core/providers/gemini/speech.go
  • core/providers/vertex/vertex_test.go
  • core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • tests/integrations/tests/utils/common.py
  • core/providers/vertex/vertex.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/gemini.go
  • core/schemas/responses.go
  • tests/integrations/tests/test_google.py
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/vertex/vertex.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/gemini/gemini.go
  • core/schemas/responses.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.

Applied to files:

  • transports/bifrost-http/integrations/genai.go
🧬 Code graph analysis (6)
core/providers/vertex/vertex.go (4)
core/schemas/utils.go (1)
  • IsGeminiModel (1052-1054)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (12-63)
core/providers/gemini/types.go (1)
  • GeminiGenerationRequest (55-73)
core/providers/gemini/gemini.go (2)
  • HandleGeminiChatCompletionStream (304-495)
  • HandleGeminiResponsesStream (597-834)
transports/bifrost-http/integrations/genai.go (4)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
transports/bifrost-http/handlers/inference.go (1)
  • ResponsesRequest (257-261)
core/schemas/responses.go (2)
  • BifrostResponsesResponse (45-85)
  • BifrostResponsesStreamResponse (1431-1470)
core/providers/gemini/responses.go (2)
  • ToGeminiResponsesResponse (148-342)
  • ToGeminiResponsesStreamResponse (344-519)
core/providers/gemini/gemini.go (9)
core/providers/gemini/chat.go (1)
  • ToGeminiChatCompletionRequest (12-63)
core/schemas/bifrost.go (10)
  • RequestType (83-83)
  • ChatCompletionRequest (89-89)
  • BifrostStream (323-330)
  • BifrostError (358-367)
  • Gemini (48-48)
  • ModelProvider (32-32)
  • ErrorField (376-383)
  • BifrostErrorExtraFields (425-429)
  • BifrostResponseExtraFields (288-298)
  • ResponsesRequest (91-91)
core/schemas/provider.go (2)
  • Provider (282-309)
  • PostHookRunner (279-279)
core/providers/utils/utils.go (2)
  • ShouldSendBackRawResponse (559-564)
  • GetPathFromContext (212-217)
core/schemas/chatcompletions.go (2)
  • BifrostChatRequest (12-19)
  • BifrostChatResponse (26-41)
core/utils.go (1)
  • Ptr (51-53)
core/schemas/responses.go (3)
  • BifrostResponsesRequest (32-39)
  • BifrostResponsesResponse (45-85)
  • BifrostResponsesStreamResponse (1431-1470)
transports/bifrost-http/handlers/inference.go (1)
  • ResponsesRequest (257-261)
core/providers/gemini/responses.go (1)
  • ToGeminiResponsesRequest (71-121)
tests/integrations/tests/test_google.py (2)
tests/integrations/tests/utils/config_loader.py (2)
  • get_model (140-161)
  • get_model (443-445)
tests/integrations/tests/utils/common.py (1)
  • skip_if_no_api_key (1625-1636)
core/providers/gemini/chat.go (4)
core/schemas/chatcompletions.go (15)
  • BifrostChatRequest (12-19)
  • ChatAssistantMessageToolCall (686-692)
  • ChatContentBlock (590-597)
  • ChatReasoningDetails (723-730)
  • ChatContentBlockTypeText (582-582)
  • ChatAssistantMessageToolCallFunction (695-698)
  • ChatToolChoiceTypeFunction (373-373)
  • ChatMessage (469-478)
  • ChatMessageRoleAssistant (461-461)
  • ChatMessageContent (524-527)
  • ChatAssistantMessage (626-632)
  • BifrostResponseChoice (704-712)
  • BifrostChatResponse (26-41)
  • ChatStreamResponseChoiceDelta (756-763)
  • ChatStreamResponseChoice (751-753)
core/providers/gemini/types.go (6)
  • GeminiGenerationRequest (55-73)
  • GenerationConfig (631-697)
  • Content (933-941)
  • Type (778-778)
  • Role (13-13)
  • FinishReason (22-22)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/gemini/utils.go (1)
  • ConvertGeminiFinishReasonToBifrost (227-232)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (14)
  • GeminiGenerationRequest (55-73)
  • ToolConfig (146-151)
  • GenerationConfig (631-697)
  • GenerateContentResponse (1298-1314)
  • Part (947-971)
  • Role (13-13)
  • Content (933-941)
  • Type (778-778)
  • FunctionCall (1102-1112)
  • FunctionResponse (1115-1137)
  • GenerateContentResponseUsageMetadata (1270-1295)
  • FileData (1091-1099)
  • Blob (974-982)
  • Tool (600-627)
core/schemas/responses.go (9)
  • BifrostResponsesRequest (32-39)
  • ResponsesInputMessageRoleAssistant (336-336)
  • ResponsesInputMessageRoleUser (337-337)
  • ResponsesMessageTypeFunctionCall (299-299)
  • ResponsesToolMessage (466-486)
  • ResponsesMessageTypeReasoning (311-311)
  • BifrostResponsesStreamResponse (1431-1470)
  • ResponsesMessage (318-331)
  • ResponsesTool (1034-1049)
core/schemas/utils.go (4)
  • ParseModelString (23-36)
  • SafeExtractFromMap (519-525)
  • SafeExtractString (280-297)
  • Ptr (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_google.py

890-890: Unused method argument: test_config

(ARG002)


966-966: Unused method argument: test_config

(ARG002)

🔇 Additional comments (7)
tests/integrations/tests/test_google.py (1)

86-89: Imports for Gemini reasoning prompts look correct.

core/providers/gemini/chat.go (1)

24-34: OK: modalities plumbing removed; but double-check speech/responseModalities callers didn’t rely on this.

convertParamsToGenerationConfig(..., []string{}) is fine if chat completion should no longer infer modalities here. Please just verify there are no remaining call sites expecting non-empty modalities for audio/speech on the chat path.

core/providers/gemini/gemini.go (1)

216-251: ChatCompletion path looks coherent end-to-end.

One small consistency check: ChatCompletion sets RawResponse when enabled, but not RawRequest. If other providers include RawRequest for non-streaming chat, consider matching that pattern here too (optional).

core/providers/gemini/utils.go (2)

11-84: Responses parameter mapping is sensible; ThinkingConfig → Reasoning mapping looks OK.

Just verify the LogprobsTopLogProbs cast and any expected bounds (since this is surfaced as an int pointer now).


872-924: buildOpenAIResponseFormat return type update looks correct.

LGTM: returning ResponsesTextConfig for json_object/json_schema aligns better with Responses API surfaces.

transports/bifrost-http/integrations/genai.go (2)

51-53: Responses API converters are correctly implemented; the SSE writer properly skips lifecycle events returned as (eventName: "", payload: nil, err: nil).

The switch to ResponsesRequest + ResponsesResponseConverter + ToGeminiResponsesResponse is consistent with the established pattern. The SSE writer in router.go (line 809–811) explicitly handles the skip case: when convertedResponse == nil && err == nil, it continues without writing, which is exactly what GenAI's ResponsesStreamResponseConverter returns for lifecycle events with no Gemini equivalent. The same pattern is used consistently across OpenAI, Anthropic, and Bedrock integrations, confirming this is intentional design rather than a transport hazard.

Also applies to: 60-62


73-80: The implementation is correct as-is. The stream writer already properly handles nil payloads: the router explicitly skips chunks where convertedResponse == nil && err == nil (no output is sent). Additionally, GenAI streaming termination is already correctly configured—the [DONE] marker logic explicitly excludes GenAI (config.Type != RouteConfigTypeGenAI), allowing it to signal completion by closing the stream rather than sending [DONE], which is the correct behavior for Responses APIs. No changes are needed.

Likely an incorrect or invalid review comment.

@Pratham-Mishra04 Pratham-Mishra04 merged commit 04fa69e into main Dec 12, 2025
9 checks passed
@Pratham-Mishra04 Pratham-Mishra04 deleted the 12-08-feat_added_support_for_gemini_native_converterswq branch December 12, 2025 14:44
