Conversation

Collaborator

@TejasGhatte TejasGhatte commented Jan 6, 2026

Summary

Enhanced Gemini provider to support both responseJsonSchema and responseSchema for JSON response formatting, and updated integration tests to improve reliability and coverage.

Changes

  • Added support for responseSchema in Gemini's buildOpenAIResponseFormat function
  • Implemented type normalization for schema conversion
  • Updated integration test configuration:
    • Added Cohere endpoint
    • Changed file model from gpt-5 to gpt-4o
    • Updated embeddings model path
    • Disabled virtual key testing by default
  • Modified test thresholds for embedding similarity to be more lenient (0.7 → 0.6)
  • Added reasoning_summary parameter to Anthropic thinking test
  • Improved content extraction in test utilities to handle Anthropic thinking blocks
  • Fixed test assertions to better match expected content

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Run the integration tests to verify the changes:

# Core/Transports
go test ./core/providers/gemini/...

# Integration tests
cd tests/integrations
python -m pytest tests/test_anthropic.py::TestAnthropicAPI::test_16_extended_thinking_streaming -v
python -m pytest tests/test_openai.py::TestOpenAIAPI::test_23_embedding_similarity_analysis -v

Breaking changes

  • No
  • Yes

Related issues

Improves JSON schema handling for Gemini provider responses

Security considerations

No security implications.

Checklist

  • I added/updated tests where appropriate
  • I verified the CI pipeline passes locally if applicable

Contributor

coderabbitai bot commented Jan 6, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Vertex provider added; Cohere endpoint mapping and provider key included.
  • Updates

    • OpenAI file model set to gpt-4o.
    • Embeddings config revised across Bedrock/Cohere.
    • Response schema handling improved with normalization and fallback behavior.
    • Provider capability listings expanded for Vertex/Gemini/Bedrock.
  • Bug Fixes / Behavior Changes

    • Streaming extended-thinking can include extra request data.
    • Function-call messages no longer expose tool-name field.
  • Tests

    • Lowered embedding similarity thresholds; expanded streaming/reasoning tests and updated test content.


Walkthrough

buildOpenAIResponseFormat gained an optional Schema parameter and now marshals/unmarshals/normalizes response schemas as a fallback to responseJsonSchema; Gemini conversion stops setting tool names on function responses; configs add Vertex/Cohere entries; tests and utilities expand streaming/thinking handling and Anthropic now accepts extra_body for streaming.
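
For readers skimming the diff, the fallback described above amounts to the flow sketched below. This is an illustrative Python sketch only, with assumed dict shapes; the real implementation is Go (buildOpenAIResponseFormat in core/providers/gemini/utils.go).

# Illustrative sketch of the described fallback flow; the real code is Go.
# Field names and dict shapes here are assumptions, not the repo's types.
def build_response_format(response_json_schema=None, response_schema=None):
    def lowercase_types(node):
        # Normalize "type" fields recursively (e.g., "STRING" -> "string")
        if isinstance(node, dict):
            return {
                key: (value.lower() if key == "type" and isinstance(value, str)
                      else lowercase_types(value))
                for key, value in node.items()
            }
        if isinstance(node, list):
            return [lowercase_types(item) for item in node]
        return node

    if isinstance(response_json_schema, dict):
        schema_map = response_json_schema           # preferred source
    elif isinstance(response_schema, dict):
        # The Go code marshals the Gemini Schema struct to a map first;
        # here we assume it is already a plain dict.
        schema_map = lowercase_types(response_schema)
    else:
        return {"format": {"type": "json_object"}}  # no usable schema: plain JSON mode

    return {
        "format": {
            "type": "json_schema",
            "name": schema_map.get("title", "json_response"),
            "schema": schema_map,
        }
    }

An explicit responseJsonSchema map takes precedence; responseSchema is only consulted as a fallback, which is the ordering the review comments below also call out.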

Changes

Core schema handling (core/providers/gemini/utils.go):
buildOpenAIResponseFormat signature changed to (..., responseSchema *Schema); adds marshaling/unmarshaling of responseSchema, lowercase normalization of type fields, title extraction, json_schema construction, and fallback to json_object. Call-site updated to pass config.ResponseSchema.

Gemini response conversion (core/providers/gemini/responses.go):
Removed assignment of the tool name on ResponsesToolMessage for FunctionResponse parts during conversion (behavior change; no exported signature change).

Integration config (tests/integrations/config.yml):
Added bifrost.endpoints.cohere; changed OpenAI model gpt-5 → gpt-4o; added vertex provider block and provider_api_keys.vertex; updated Bedrock/Cohere embeddings to global.cohere.embed-v4:0; added vertex capability flags and provider_scenarios.

Test utilities (tests/integrations/tests/utils/common.py):
Added API key mappings for cohere and vertex; streaming collector now includes response.summary_text.delta; get_content_string_with_summary() enhanced to detect/aggregate Anthropic-style thinking (summary/content), LangChain AIMessage thinking blocks, plain strings, and set has_reasoning_content.

Anthropic tests / client (tests/integrations/tests/test_anthropic.py):
Streaming extended-thinking test now passes extra_body (e.g., {"reasoning_summary":"detailed"}); AnthropicClient.messages.create supports an optional extra_body to include additional request-body fields for streaming.

OpenAI/provider tests (tests/integrations/tests/test_openai.py):
Lowered embedding similarity thresholds (>0.7 → >0.6); broadened embedding usage exclusions to include Bedrock; increased reasoning max_output_tokens; added streaming reasoning tests (test_38a_responses_reasoning_streaming, test_38b_responses_reasoning_streaming_with_summary); updated assertions/content to space/exploration theme.

Manifest / misc (go.mod):
Listed in diff; no detailed behavioral changes summarized.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibbled bytes and found a seam,
Schemas shifted, neat and clean,
New providers hopped into view,
Tests now listen for thinking too,
A tiny hop — but dreams renewed 🌿

🚥 Pre-merge checks: ✅ 3 passed

  • Title check: ✅ Passed. The title accurately captures the main changes: integration test fixes and adding responseSchema support to Gemini JSON response formatting.
  • Description check: ✅ Passed. The description comprehensively covers the PR objectives, changes, affected areas, and includes testing instructions. All major sections of the template are completed with relevant details.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 80.00%, which is sufficient; the required threshold is 80.00%.


Collaborator Author

TejasGhatte commented Jan 6, 2026

@TejasGhatte TejasGhatte changed the title from "fix: integration test fixes" to "fix: support responseSchema in Gemini JSON response format" on Jan 6, 2026
Contributor

github-actions bot commented Jan 6, 2026

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1255

@TejasGhatte TejasGhatte changed the title from "fix: support responseSchema in Gemini JSON response format" to "fix: integration test fixes and support responseSchema in Gemini JSON response format" on Jan 6, 2026
@TejasGhatte TejasGhatte marked this pull request as ready for review January 6, 2026 08:10
@TejasGhatte TejasGhatte force-pushed the 01-06-fix_integration_test_fixes branch from d3ef392 to b2315af on January 6, 2026 08:13
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI Agents
In @tests/integrations/tests/test_openai.py:
- Around line 931-938: The inline comment above the similarity assertions is
inconsistent (it says "> 0.7" but the assertions use "> 0.6"); update the
comment near the similarity checks (around the assertions for similarity_1_2,
similarity_1_3, similarity_2_3 in the test file) to state the actual threshold
being enforced ("> 0.6") or change the assertions to > 0.7 if you intend that
higher threshold—ensure the comment and the assertions (similarity_1_2,
similarity_1_3, similarity_2_3) match.
🧹 Nitpick comments (2)
core/providers/gemini/utils.go (1)

165-165: Response schema handling for Gemini→Responses looks correct; watch for non‑map ResponseJSONSchema

The new buildOpenAIResponseFormat(responseJsonSchema, responseSchema) logic and the updated call:

  • Prefer an explicit ResponseJSONSchema (must be a map[string]interface{}) and only fall back to ResponseSchema when the JSON schema map is absent.
  • When only ResponseSchema is present, marshalling to a map plus convertTypeToLowerCase and buildJSONSchemaFromMap gives you a reasonable OpenAI‑style json_schema format, including lower‑cased "type" fields.
  • If ResponseJSONSchema is ever set to a non‑map value (e.g., a typed struct), this will silently degrade to Format.Type = "json_object" rather than json_schema.

Functionally this adds the desired support for both responseJsonSchema and responseSchema while keeping existing behavior intact; just ensure all writers of ResponseJSONSchema in the stack keep using a plain map[string]interface{} to avoid accidental fallback to json_object.

Also applies to: 1068-1127

tests/integrations/tests/test_openai.py (1)

1018-1021: Update comment to reflect bedrock exclusion.

The code now excludes both gemini and bedrock from usage data checks, but the comment only mentions gemini and openai. Consider updating the comment to clarify why bedrock is excluded.

🔎 Suggested comment fix
-        if provider != "gemini" and provider != "bedrock": # gemini does not return usage data and openai does not return usage data for long text
+        if provider != "gemini" and provider != "bedrock": # gemini and bedrock do not return usage data for long text embeddings
             assert response.usage is not None, "Usage should be reported for longer text"
             assert response.usage.total_tokens > 20, "Longer text should consume more tokens"
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e2abee and b2315af.

📒 Files selected for processing (5)
  • core/providers/gemini/utils.go
  • tests/integrations/config.yml
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_openai.py
  • tests/integrations/tests/utils/common.py
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/utils.go
  • tests/integrations/tests/utils/common.py
  • tests/integrations/config.yml
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_openai.py
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-24T07:38:16.990Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: tests/integrations/tests/test_google.py:2030-2030
Timestamp: 2025-12-24T07:38:16.990Z
Learning: In Python tests under tests/integrations/tests, allow a fixture parameter named test_config in test functions even if unused; do not flag it as unused. This is an internal convention to ensure consistency for integration tests.

Applied to files:

  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_openai.py
🧬 Code graph analysis (1)
core/providers/gemini/utils.go (2)
core/providers/gemini/types.go (2)
  • Schema (734-785)
  • Type (788-788)
core/schemas/responses.go (2)
  • ResponsesTextConfig (120-123)
  • ResponsesTextConfigFormat (125-130)
🔇 Additional comments (5)
tests/integrations/tests/test_anthropic.py (1)

799-801: Streaming thinking test extra_body aligns with non‑streaming path

Using extra_body={"reasoning_summary": "detailed"} here keeps the streaming thinking test consistent with the non‑streaming variant and should exercise the same summary behavior. No changes needed.
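
For reference, the call shape with the anthropic Python SDK looks roughly like the sketch below; the model id and token budgets are placeholders, and forwarding extra_body into the request body depends on the pinned SDK version (a later review comment asks to confirm exactly that).

# Rough sketch only; model id and budgets are placeholders, and extra_body
# forwarding depends on the pinned anthropic SDK version.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

stream = client.messages.create(
    model="claude-sonnet-4-20250514",              # placeholder model id
    max_tokens=2048,
    thinking={"type": "enabled", "budget_tokens": 1024},
    messages=[{"role": "user", "content": "Explain why the sky is blue, step by step."}],
    stream=True,
    extra_body={"reasoning_summary": "detailed"},  # extra request-body field under test
)

for event in stream:
    # thinking/summary deltas and text deltas arrive as distinct event types
    print(event.type)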

tests/integrations/config.yml (1)

13-13: Config updates for cohere/OpenAI/Bedrock look consistent; just confirm model availability

  • Line 13: Adding cohere under bifrost.endpoints matches the new providers.cohere block and the provider_api_keys / get_api_key wiring.
  • Line 40: Pointing openai.file to gpt-4o keeps file operations aligned with the rest of the OpenAI config (chat/vision) and avoids depending on a separate gpt‑5‑only entry.
  • Line 124: Updating Bedrock embeddings to global.cohere.embed-v4:0 is consistent with using Cohere v4 embeddings through Bedrock.

Assuming these model IDs exist in the deployed environment, the config is coherent with the rest of this stack.

Also applies to: 40-40, 124-124

tests/integrations/tests/utils/common.py (2)

1814-1815: Cohere API key wiring matches config and skip helpers

Adding "cohere": "COHERE_API_KEY" to get_api_key keeps the runtime mapping consistent with provider_api_keys in config.yml and lets skip_if_no_api_key("cohere") behave as expected for new cohere scenarios. Just make sure any direct get_api_key("cohere") usages in tests are either wrapped in skip_if_no_api_key or otherwise guarded so missing env vars result in skips rather than hard failures.


2498-2506: Reasoning/thinking extraction in get_content_string_with_summary is aligned with new provider formats

The extended handling in get_content_string_with_summary:

  • Detects Anthropic‑style thinking blocks ({"type": "thinking", "thinking": ...}) and treats them as reasoning content.
  • Supports Gemini‑style reasoning blocks that surface summary or content lists with text fields.
  • Handles plain string items in a LangChain response.content list so mixed content shapes still contribute to the aggregated text.

This should make reasoning tests more robust across Anthropic, Gemini, and LangChain/OpenAI response shapes without regressing existing behavior.

Also applies to: 2523-2525
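
In spirit, the block handling described above amounts to the simplified sketch below; the function name, block shapes, and return convention are illustrative rather than the repo's exact helper.

# Simplified sketch of the block handling; names and shapes are illustrative.
def extract_text_and_reasoning(content):
    """Aggregate text from a mixed content list and flag reasoning content."""
    parts = []
    has_reasoning = False
    for item in content if isinstance(content, list) else [content]:
        if isinstance(item, str):
            parts.append(item)                       # plain string item
        elif isinstance(item, dict):
            if item.get("type") == "thinking":       # Anthropic-style thinking block
                parts.append(item.get("thinking", ""))
                has_reasoning = True
            elif item.get("type") == "reasoning":    # summary/content reasoning block
                for block in item.get("summary", []) + item.get("content", []):
                    parts.append(block.get("text", ""))
                has_reasoning = True
            elif item.get("type") == "text":
                parts.append(item.get("text", ""))
    return "".join(parts), has_reasoning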

tests/integrations/tests/test_openai.py (1)

1238-1241: Test input data is correctly aligned with assertions.

RESPONSES_SIMPLE_TEXT_INPUT contains a space exploration prompt ("Tell me a fun fact about space exploration"), which properly matches the updated keyword assertions expecting space exploration-related content.

@TejasGhatte TejasGhatte force-pushed the 01-06-fix_integration_test_fixes branch from b2315af to e53ced3 on January 6, 2026 09:45
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/integrations/config.yml (1)

699-704: Virtual key still enabled despite PR summary

The PR description says “Disabled virtual key testing by default,” but virtual_key.enabled is still set to true here. If the intent is to stop double‑running cross‑provider tests, this should probably be false.

Proposed config tweak
-virtual_key:
-  enabled: true
-  value: "sk-bf-test-key"
+virtual_key:
+  enabled: false
+  value: "sk-bf-test-key"
🧹 Nitpick comments (2)
tests/integrations/tests/utils/common.py (1)

2480-2526: Reasoning/thinking content extraction for LangChain is sound

The updates to get_content_string_with_summary now:

  • Recognize Anthropic‑style {"type": "thinking", "thinking": ...} blocks and mark has_reasoning_content.
  • Handle {"type": "reasoning", "summary": [...]} and {"type": "reasoning", "content": [...]} blocks, accumulating their text.
  • Accept plain strings in response.content lists.
    This should make reasoning‑aware assertions more reliable across LangChain/Anthropic/Gemini without impacting existing paths.

If you see more response variants later, consider centralizing the per‑block handling into small helper functions to keep this function from growing further.

core/providers/gemini/utils.go (1)

1068-1145: buildOpenAIResponseFormat handles both schema sources with safe fallbacks

The refactored buildOpenAIResponseFormat(responseJsonSchema, responseSchema) behaves sensibly:

  • Prefers responseJsonSchema when non‑nil and a map[string]interface{}.
  • Falls back to json_object if responseJsonSchema is present but not a map.
  • Otherwise, if responseSchema is provided:
    • Marshals/unmarshals it, normalizes "type" fields via convertTypeToLowerCase, and builds a proper json_schema format.
    • On any marshal/unmarshal/type‑assert failure, falls back to json_object.
  • With neither set, defaults to json_object.

This gives robust behavior even with malformed or unexpected schemas, and keeps the OpenAI Responses config API surface consistent.

You might want a brief comment noting that responseJsonSchema intentionally takes precedence over responseSchema when both are set, so future maintainers don’t change that ordering accidentally.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b2315af and e53ced3.

📒 Files selected for processing (5)
  • core/providers/gemini/utils.go
  • tests/integrations/config.yml
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_openai.py
  • tests/integrations/tests/utils/common.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integrations/tests/test_openai.py
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/utils.go
  • tests/integrations/tests/utils/common.py
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/config.yml
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-24T07:38:16.990Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: tests/integrations/tests/test_google.py:2030-2030
Timestamp: 2025-12-24T07:38:16.990Z
Learning: In Python tests under tests/integrations/tests, allow a fixture parameter named test_config in test functions even if unused; do not flag it as unused. This is an internal convention to ensure consistency for integration tests.

Applied to files:

  • tests/integrations/tests/test_anthropic.py
🧬 Code graph analysis (1)
core/providers/gemini/utils.go (2)
core/providers/gemini/types.go (2)
  • Schema (734-785)
  • Type (788-788)
core/schemas/responses.go (2)
  • ResponsesTextConfig (120-123)
  • ResponsesTextConfigFormat (125-130)
🔇 Additional comments (9)
tests/integrations/config.yml (3)

11-18: Cohere endpoint wiring looks consistent

The new bifrost.endpoints.cohere: "cohere" entry is consistent with the cohere provider block and provider_api_keys.cohere in this file, and with get_api_key("cohere") in tests/integrations/tests/utils/common.py. Nothing blocking here.


38-47: Switching openai.file to gpt-4o is reasonable

Using gpt-4o for file-related tests aligns with the rest of the OpenAI capabilities in this config and avoids depending on gpt-5. No additional changes needed here.


116-125: Bedrock embeddings model updated to Cohere – verify availability

Pointing bedrock.embeddings to global.cohere.embed-v4:0 matches the new Cohere focus in this PR, but the actual ARN/name has to exist and be enabled in the target AWS account/region.

Please double‑check in your Bedrock console (or via CLI) that global.cohere.embed-v4:0 is the correct, available embeddings model for your test environment.

tests/integrations/tests/test_anthropic.py (1)

789-802: Streaming thinking test correctly exercises extra_body – confirm SDK support

Passing extra_body={"reasoning_summary": "detailed"} in the streaming extended‑thinking test now mirrors the non‑streaming test and validates the new Anthropic request shape; the loop over stream remains unchanged.

Please confirm the pinned Anthropic Python SDK version in this repo documents the extra_body keyword on messages.create(stream=True) so CI doesn’t fail on an unexpected argument.

tests/integrations/tests/utils/common.py (2)

1805-1815: Cohere API key mapping is consistent

Adding "cohere": "COHERE_API_KEY" to key_map matches provider_api_keys.cohere in tests/integrations/config.yml, so Cohere tests will pick up the correct env var.


2584-2612: Input-tokens validation helper aligns with Anthropic and OpenAI usage

assert_valid_input_tokens_response cleanly separates expectations:

  • library == "google": checks total_tokens (Gemini count endpoint).
  • library == "openai": checks object contains "input_tokens" and validates input_tokens.
  • Else (Anthropic and others): requires a positive input_tokens attribute.
    This matches how the new Anthropic tests call it (library="anthropic"), and should be flexible enough for future providers.
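
A condensed sketch of that branching, with simplified attribute access (the real helper may differ in details):

# Simplified sketch of the per-library branching described above.
def assert_valid_input_tokens_response(response, library):
    if library == "google":
        # Gemini count-tokens style responses expose total_tokens
        assert getattr(response, "total_tokens", 0) > 0, "expected positive total_tokens"
    elif library == "openai":
        # OpenAI-style responses identify themselves via the object field
        assert "input_tokens" in getattr(response, "object", ""), "unexpected object type"
        assert response.input_tokens > 0, "expected positive input_tokens"
    else:
        # Anthropic and other libraries: require a positive input_tokens attribute
        assert getattr(response, "input_tokens", 0) > 0, "expected positive input_tokens"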
core/providers/gemini/utils.go (3)

162-179: Responses parameters now correctly surface response schemas

When ResponseMIMEType is application/json, you now build params.Text via buildOpenAIResponseFormat(config.ResponseJSONSchema, config.ResponseSchema) and also expose response_schema / response_json_schema via ExtraParams. This gives the OpenAI‑style Responses layer both the structured format and the raw Gemini schema knobs, which is exactly what downstream tests expect.


951-1065: JSON-schema normalization helper is appropriate for Responses JSON format

normalizeSchemaTypes and buildJSONSchemaFromMap:

  • Lower‑case "type" fields (and nested properties/items/anyOf/oneOf).
  • Populate Type, Properties, Required, Description, AdditionalProperties, and Name/Title into ResponsesTextConfigFormatJSONSchema.

This matches the OpenAI Responses JSON schema shape, while keeping the original map structure intact for downstream use. No issues spotted.


1279-1309: OpenAI response_format → Gemini schema extraction remains compatible

extractSchemaMapFromResponseFormat continues to:

  • Guard on type == "json_schema".
  • Pull the nested json_schema.schema map.
  • Normalize via normalizeSchemaForGemini.

This is still the right place to adapt OpenAI Response JSON schemas for Gemini, and works nicely with the new buildOpenAIResponseFormat path that turns Gemini’s schema back into an OpenAI‑style ResponsesTextConfigFormatJSONSchema.

@TejasGhatte TejasGhatte force-pushed the 01-06-fix_integration_test_fixes branch from e53ced3 to 0a1551a on January 6, 2026 10:17
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
core/providers/gemini/utils.go (1)

1086-1119: Schema conversion logic is correct but could benefit from error visibility.

The marshal/unmarshal approach to convert responseSchema to a normalized map is a common Go pattern and works correctly. The defensive error handling ensures graceful degradation to json_object mode.

Consider logging the conversion errors before falling back to json_object mode (lines 1090, 1100, 1113). This would aid debugging when schema conversion fails silently in production.

Example: Add debug logging for conversion failures
 		data, err := sonic.Marshal(responseSchema)
 		if err != nil {
+			// Consider: log.Debug("Failed to marshal responseSchema, falling back to json_object", "error", err)
 			// If marshaling fails, fall back to json_object mode
 			return &schemas.ResponsesTextConfig{

Similar logging could be added at lines 1100 and 1113.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e53ced3 and 0a1551a.

📒 Files selected for processing (5)
  • core/providers/gemini/utils.go
  • tests/integrations/config.yml
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_openai.py
  • tests/integrations/tests/utils/common.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/integrations/config.yml
  • tests/integrations/tests/test_openai.py
  • tests/integrations/tests/test_anthropic.py
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/utils.go
  • tests/integrations/tests/utils/common.py
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/utils.go
🧬 Code graph analysis (1)
core/providers/gemini/utils.go (2)
core/providers/gemini/types.go (2)
  • Schema (734-785)
  • Type (788-788)
core/schemas/responses.go (2)
  • ResponsesTextConfig (120-123)
  • ResponsesTextConfigFormat (125-130)
🔇 Additional comments (3)
tests/integrations/tests/utils/common.py (2)

1814-1814: LGTM - Cohere provider mapping added.

The Cohere API key mapping follows the established pattern and aligns with the new Cohere endpoint configuration mentioned in the PR objectives.


2498-2525: Enhanced reasoning content detection across multiple providers.

The additions properly handle thinking/reasoning blocks from various providers (Anthropic, Gemini, OpenAI) with appropriate fallbacks for plain string items. The logic correctly:

  • Detects Anthropic thinking blocks and extracts thinking text
  • Handles reasoning blocks with summary or content fields
  • Sets the has_reasoning_content flag consistently
  • Gracefully handles plain string items in content lists
core/providers/gemini/utils.go (1)

165-165: Function signature extended to support responseSchema parameter.

The addition of the responseSchema *Schema parameter enables fallback support when responseJsonSchema is not provided, implementing the feature described in the PR objectives. The call site at line 165 correctly passes config.ResponseSchema.

Also applies to: 1068-1068

@TejasGhatte TejasGhatte force-pushed the 01-06-fix_integration_test_fixes branch from 0a1551a to 9afa095 on January 6, 2026 16:56
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI Agents
In @tests/integrations/tests/test_openai.py:
- Around line 1674-1835: The test unpacks event_types in
test_38a_responses_reasoning_streaming but never uses it (causing linter
warnings) and both 38a/38b lack a fallback for providers that don't support
streaming/reasoning; rename the unused binding to _event_types (or assert
something lightweight like "assert isinstance(event_types, list)") in
test_38a_responses_reasoning_streaming, and wrap the streaming call +
collect_responses_streaming_content in a try/except that mirrors the existing
fallback pattern used in test_38_responses_reasoning (catch the
provider-not-supported exception and fall back to a non-streaming/simple
response assertion) so a lagging backend won’t fail the entire cross-provider
run.
🧹 Nitpick comments (2)
core/providers/gemini/utils.go (1)

1068-1127: Improve fallback when responseJsonSchema is non-map but responseSchema is available

Right now any non‑map[string]interface{} responseJsonSchema (e.g., a typed struct) causes an immediate fallback to json_object, even if responseSchema is populated. You could make this more robust by letting responseSchema act as a secondary source before giving up, e.g.:

Suggested control-flow tweak
 func buildOpenAIResponseFormat(responseJsonSchema interface{}, responseSchema *Schema) *schemas.ResponsesTextConfig {
   name := "json_response"

-  var schemaMap map[string]interface{}
-
-  // Try to use responseJsonSchema first
-  if responseJsonSchema != nil {
-    // Use responseJsonSchema directly if it's a map
-    var ok bool
-    schemaMap, ok = responseJsonSchema.(map[string]interface{})
-    if !ok {
-      // If not a map, fall back to json_object mode
-      return &schemas.ResponsesTextConfig{
-        Format: &schemas.ResponsesTextConfigFormat{
-          Type: "json_object",
-        },
-      }
-    }
-  } else if responseSchema != nil {
+  var schemaMap map[string]interface{}
+
+  // Prefer responseJsonSchema when it's a usable map
+  if responseJsonSchema != nil {
+    if m, ok := responseJsonSchema.(map[string]interface{}); ok {
+      schemaMap = m
+    } else if responseSchema == nil {
+      // No usable schema at all – fall back
+      return &schemas.ResponsesTextConfig{
+        Format: &schemas.ResponsesTextConfigFormat{
+          Type: "json_object",
+        },
+      }
+    }
+  }
+
+  // If we still don't have a map, try building one from responseSchema
+  if schemaMap == nil && responseSchema != nil {
     // Convert responseSchema to map using JSON marshaling and type normalization
     data, err := sonic.Marshal(responseSchema)
     if err != nil {
       // If marshaling fails, fall back to json_object mode
       return &schemas.ResponsesTextConfig{
         Format: &schemas.ResponsesTextConfigFormat{
           Type: "json_object",
         },
       }
     }
@@
-    normalized := convertTypeToLowerCase(rawMap)
-    var ok bool
-    schemaMap, ok = normalized.(map[string]interface{})
+    normalized := convertTypeToLowerCase(rawMap)
+    m, ok := normalized.(map[string]interface{})
+    schemaMap = m
     if !ok {
       // If type assertion fails, fall back to json_object mode
       return &schemas.ResponsesTextConfig{
         Format: &schemas.ResponsesTextConfigFormat{
           Type: "json_object",
         },
       }
     }
-  } else {
-    // No schema provided - use json_object mode
-    return &schemas.ResponsesTextConfig{
-      Format: &schemas.ResponsesTextConfigFormat{
-        Type: "json_object",
-      },
-    }
   }

This keeps the existing safety net but avoids silently dropping structured schemas when responseJsonSchema is present but not already a plain map.

tests/integrations/tests/test_openai.py (1)

1018-1021: Update long‑text usage comment to match the providers being exempted

The condition now skips the usage assertion for provider == "gemini" and provider == "bedrock", but the comment still mentions Gemini and OpenAI. To avoid confusion, consider updating it to mention Gemini and Bedrock instead, or broaden it to “providers that don’t return usage for this path”.

Minimal comment tweak
-        if provider != "gemini" and provider != "bedrock": # gemini does not return usage data and openai does not return usage data for long text
+        # gemini/bedrock do not return usage data for this long-text embeddings path
+        if provider != "gemini" and provider != "bedrock":
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a1551a and 9afa095.

📒 Files selected for processing (6)
  • core/providers/gemini/responses.go
  • core/providers/gemini/utils.go
  • tests/integrations/config.yml
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_openai.py
  • tests/integrations/tests/utils/common.py
💤 Files with no reviewable changes (1)
  • core/providers/gemini/responses.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integrations/tests/test_anthropic.py
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • tests/integrations/tests/utils/common.py
  • core/providers/gemini/utils.go
  • tests/integrations/config.yml
  • tests/integrations/tests/test_openai.py
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-24T07:38:16.990Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: tests/integrations/tests/test_google.py:2030-2030
Timestamp: 2025-12-24T07:38:16.990Z
Learning: In Python tests under tests/integrations/tests, allow a fixture parameter named test_config in test functions even if unused; do not flag it as unused. This is an internal convention to ensure consistency for integration tests.

Applied to files:

  • tests/integrations/tests/test_openai.py
🧬 Code graph analysis (2)
core/providers/gemini/utils.go (2)
core/providers/gemini/types.go (2)
  • Schema (734-785)
  • Type (788-788)
core/schemas/responses.go (2)
  • ResponsesTextConfig (120-123)
  • ResponsesTextConfigFormat (125-130)
tests/integrations/tests/test_openai.py (2)
tests/integrations/tests/utils/common.py (2)
  • skip_if_no_api_key (1829-1840)
  • collect_responses_streaming_content (1935-1991)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_with_vk_for_scenario (50-101)
  • format_provider_model (126-141)
🪛 Ruff (0.14.10)
tests/integrations/tests/test_openai.py

1676-1676: Unused method argument: test_config

(ARG002)


1692-1692: Unpacked variable event_types is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


1746-1746: Unused method argument: test_config

(ARG002)

🔇 Additional comments (7)
core/providers/gemini/utils.go (1)

162-179: JSON response-format wiring from GenerationConfig looks solid

Passing both ResponseJSONSchema and ResponseSchema into buildOpenAIResponseFormat for application/json while still exposing them via ExtraParams gives Gemini maximum information without changing existing fallbacks. No issues from a correctness standpoint.

tests/integrations/config.yml (1)

13-19: Cohere/Vertex provider configuration is consistent; double‑check vertex routing

  • cohere now has a dedicated endpoint, provider block, API key, and scenarios entry, which line up correctly.
  • vertex is wired through providers, provider_api_keys, and provider_scenarios with capabilities that match what the tests exercise (no batch/files, embeddings/thinking enabled), and bedrock embeddings are updated to the global Cohere ARN.

One thing to verify: there is no vertex entry under bifrost.endpoints, so all vertex‑backed tests must be routed via an existing integration (likely google: genai). If that’s intentional, this config looks good as‑is.

Also applies to: 37-67, 87-157, 116-125, 126-157, 134-135, 148-167, 291-329

tests/integrations/tests/utils/common.py (2)

1804-1816: API key routing for Cohere and Vertex is aligned with provider config

Adding "cohere": "COHERE_API_KEY" and "vertex": "VERTEX_API_KEY" here keeps get_api_key consistent with provider_api_keys in config.yml, so cross‑provider tests can reuse the same helper without special‑casing.


1969-1975: Streaming and reasoning content extraction updates look correct

  • Including response.summary_text.delta in collect_responses_streaming_content ensures reasoning summaries contribute to the assembled content used by tests.
  • The LangChain branch in get_content_string_with_summary now correctly recognizes Anthropic‑style thinking blocks, Gemini‑style reasoning blocks (both summary and content), and plain string items, while flagging has_reasoning_content appropriately.

These changes should make reasoning/summary assertions much more reliable across clients.

Also applies to: 2503-2531

tests/integrations/tests/test_openai.py (3)

1197-1205: File‑input expectations keyed to “hello world” PDF look reasonable

Both the chat and responses file tests now assert that the model’s answer references generic document/file keywords (including “hello”/“world” tied to the embedded PDF). This keeps the assertions flexible but still grounded in the actual fixture content.

Also applies to: 1361-1376


1238-1242: Space‑exploration keyword checks for simple responses are appropriate

The updated keyword list ("space", "exploration", "astronaut", "moon", "mars", etc.) matches the test prompt and still leaves enough flexibility across providers while ensuring the response actually talks about space exploration.


1539-1543: Higher max_output_tokens for reasoning responses is a good safeguard

Bumping max_output_tokens to 1200 for the high‑effort reasoning test reduces the risk of truncating chain‑of‑thought style answers, especially now that you’re also asserting on multiple reasoning/step indicators.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @tests/integrations/tests/test_openai.py:
- Around line 1017-1021: The inline comment and the conditional are
inconsistent: the code currently skips usage checks for providers by evaluating
if provider != "gemini" and provider != "bedrock" but the comment says "gemini
does not return usage data and openai does not return usage data for long text."
Update the comment to correctly reflect the providers being excluded (gemini and
bedrock) or, if you intended to exclude openai instead of bedrock, change the
condition to check provider != "gemini" and provider != "openai"; ensure this
uses the same provider names as the condition that guards the response.usage and
response.usage.total_tokens assertions so the comment and the if-check are
consistent with each other.
🧹 Nitpick comments (1)
core/providers/gemini/utils.go (1)

1068-1145: Consider adding fallback to responseSchema when responseJsonSchema is non-nil but invalid

The function handles the happy path well, but there's a robustness gap: if responseJsonSchema is non-nil but not a map[string]interface{} (which can happen when overridden via ExtraParams["response_json_schema"]), it immediately falls back to json_object mode without attempting to use a non-nil responseSchema.

Since both parameters can coexist, the function should try responseSchema before giving up entirely:

Optional robustness improvement
-    if responseJsonSchema != nil {
-        // Use responseJsonSchema directly if it's a map
-        var ok bool
-        schemaMap, ok = responseJsonSchema.(map[string]interface{})
-        if !ok {
-            // If not a map, fall back to json_object mode
-            return &schemas.ResponsesTextConfig{
-                Format: &schemas.ResponsesTextConfigFormat{
-                    Type: "json_object",
-                },
-            }
-        }
-    } else if responseSchema != nil {
+    if responseJsonSchema != nil {
+        // Use responseJsonSchema directly if it's a map
+        var ok bool
+        schemaMap, ok = responseJsonSchema.(map[string]interface{})
+        if !ok && responseSchema == nil {
+            // If not a map and we have no Gemini schema to fall back to, use json_object
+            return &schemas.ResponsesTextConfig{
+                Format: &schemas.ResponsesTextConfigFormat{
+                    Type: "json_object",
+                },
+            }
+        }
+    }
+
+    if schemaMap == nil && responseSchema != nil {
         // Convert responseSchema to map using JSON marshaling and type normalization
         data, err := sonic.Marshal(responseSchema)
         if err != nil {
             // If marshaling fails, fall back to json_object mode
             return &schemas.ResponsesTextConfig{
                 Format: &schemas.ResponsesTextConfigFormat{
                     Type: "json_object",
                 },
             }
         }
         ...
-    } else {
+    } else if schemaMap == nil {
         // No schema provided - use json_object mode
         return &schemas.ResponsesTextConfig{
             Format: &schemas.ResponsesTextConfigFormat{
                 Type: "json_object",
             },
         }
     }

Not required for correctness today, but it makes the helper more tolerant of mixed inputs, particularly when response_json_schema is provided via ExtraParams with an unexpected type.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9afa095 and 652111d.

📒 Files selected for processing (6)
  • core/providers/gemini/responses.go
  • core/providers/gemini/utils.go
  • tests/integrations/config.yml
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_openai.py
  • tests/integrations/tests/utils/common.py
💤 Files with no reviewable changes (1)
  • core/providers/gemini/responses.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/config.yml
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/utils.go
  • tests/integrations/tests/test_openai.py
  • tests/integrations/tests/utils/common.py
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/utils.go
📚 Learning: 2025-12-24T07:38:16.990Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: tests/integrations/tests/test_google.py:2030-2030
Timestamp: 2025-12-24T07:38:16.990Z
Learning: In Python tests under tests/integrations/tests, allow a fixture parameter named test_config in test functions even if unused; do not flag it as unused. This is an internal convention to ensure consistency for integration tests.

Applied to files:

  • tests/integrations/tests/test_openai.py
🧬 Code graph analysis (2)
core/providers/gemini/utils.go (2)
core/providers/gemini/types.go (2)
  • Schema (734-785)
  • Type (788-788)
core/schemas/responses.go (2)
  • ResponsesTextConfig (120-123)
  • ResponsesTextConfigFormat (125-130)
tests/integrations/tests/test_openai.py (2)
tests/integrations/tests/utils/common.py (2)
  • skip_if_no_api_key (1829-1840)
  • collect_responses_streaming_content (1935-1991)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_with_vk_for_scenario (50-101)
  • format_provider_model (126-141)
🪛 Ruff (0.14.10)
tests/integrations/tests/test_openai.py

1676-1676: Unused method argument: test_config

(ARG002)


1692-1692: Unpacked variable event_types is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


1746-1746: Unused method argument: test_config

(ARG002)

🔇 Additional comments (8)
tests/integrations/tests/utils/common.py (3)

1805-1816: Cohere/Vertex API key mapping is consistent with existing helpers

Adding "cohere"COHERE_API_KEY and "vertex"VERTEX_API_KEY fits the existing pattern and keeps get_api_key aligned with new providers.


1950-1991: Including response.summary_text.delta in streaming aggregation is correct

Capturing response.summary_text.delta alongside response.output_text.delta ensures reasoning summaries from the Responses API are not dropped during streaming collection and matches how tests consume content.
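
A minimal version of that aggregation, with simplified event handling and an illustrative function name:

# Minimal sketch; event shapes are simplified and the function name is illustrative.
def collect_streamed_text(events):
    """Concatenate output-text and reasoning-summary deltas from a Responses stream."""
    chunks = []
    for event in events:
        event_type = getattr(event, "type", "")
        if event_type in ("response.output_text.delta", "response.summary_text.delta"):
            chunks.append(getattr(event, "delta", "") or "")
    return "".join(chunks)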


2485-2531: Expanded reasoning/thinking extraction for LangChain messages looks solid

The added handling for Anthropic-style {"type": "thinking", "thinking": "..."} blocks and plain string items in the content list correctly sets has_reasoning_content and preserves all text content without impacting existing reasoning/summary paths.

core/providers/gemini/utils.go (1)

162-179: Passing both ResponseJSONSchema and ResponseSchema into buildOpenAIResponseFormat matches the new behavior

Wiring config.ResponseJSONSchema and config.ResponseSchema into buildOpenAIResponseFormat in the application/json branch ensures Responses text config is derived from either OpenAI-style JSON schema or Gemini-native Schema, while still exposing both originals via ExtraParams.

tests/integrations/tests/test_openai.py (4)

924-938: Embedding similarity comment now matches the 0.6 threshold

The inline comment for similar-text embeddings has been updated to “> 0.6”, matching the actual assertions for similarity_1_2, similarity_1_3, and similarity_2_3. This keeps the test expectations self‑consistent.
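
The relaxed check boils down to a comparison like the one below; the vectors are made up purely to illustrate the assertion and are not real embeddings.

# Toy illustration of the relaxed similarity assertion; vectors are made up.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

embedding_a = [0.12, 0.80, 0.33]   # hypothetical embedding of one "similar" text
embedding_b = [0.10, 0.78, 0.35]   # hypothetical embedding of a close paraphrase

similarity_1_2 = cosine_similarity(embedding_a, embedding_b)
# Threshold was loosened from > 0.7 to > 0.6 to reduce provider-dependent flakiness
assert similarity_1_2 > 0.6, "Similar texts should have similarity > 0.6"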


1201-1205: PDF document keyword assertions align with actual test fixture

Both the chat completion and Responses API “with file” tests now assert against keywords like "hello", "world", "testing", "pdf", "file", which matches the test PDF's content and should be robust across providers while remaining flexible.

Also applies to: 1371-1376


1238-1242: Space‑exploration keyword checks match the prompt

The keywords list for test_32_responses_simple_text (space, exploration, astronaut, moon, mars, rocket, nasa, satellite) is well aligned with the “fun fact about space exploration” prompt and should work across providers without being overly strict.


1539-1543: Higher max_output_tokens for reasoning responses is reasonable

Increasing max_output_tokens from 800 to 1200 in test_38_responses_reasoning gives reasoning models more room for detailed chains of thought and summaries, consistent with the new streaming variants.

Collaborator

Pratham-Mishra04 commented Jan 7, 2026

Merge activity

  • Jan 7, 12:18 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jan 7, 12:18 PM UTC: @Pratham-Mishra04 merged this pull request with Graphite.

@Pratham-Mishra04 Pratham-Mishra04 merged commit fff4ca8 into main Jan 7, 2026
8 checks passed
@Pratham-Mishra04 Pratham-Mishra04 deleted the 01-06-fix_integration_test_fixes branch January 7, 2026 12:18