Skip to content

Conversation

@Pratham-Mishra04
Copy link
Collaborator

Summary

Fix the conversion between reasoning effort levels and thinking budgets in Gemini, ensuring proper handling across different Gemini model versions (2.5 and 3.0+).

Changes

  • Fixed conversion between thinking level and thinking budget in Gemini provider
  • Added proper model version detection to use appropriate reasoning parameters
  • Implemented priority rules when both max_tokens and effort are provided
  • Added special handling for Gemini Pro models which have limited thinking level support
  • Updated documentation with detailed explanation of Gemini reasoning parameters

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Test with different Gemini models and reasoning parameters:

# Test with Gemini 2.5 models
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Solve this math problem: 5x + 3 = 18"}],
    "reasoning": {"effort": "high"}
  }'

# Test with Gemini 3.0 models
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.0-flash",
    "messages": [{"role": "user", "content": "Solve this math problem: 5x + 3 = 18"}],
    "reasoning": {"max_tokens": 4096}
  }'

# Run tests
go test ./core/providers/gemini/...

Breaking changes

  • Yes
  • No

Related issues

Fixes inconsistent reasoning behavior between Gemini 2.5 and 3.0+ models.

Security considerations

No security implications.

Checklist

  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2026

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1245

Copy link
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Fixed conversion of thinking level to thinking budget in Gemini provider.
  • New Features

    • Enhanced Gemini reasoning support with model version-aware parameter handling for thinking budgets and levels.
  • Documentation

    • Updated Gemini reasoning documentation with model version distinctions, conversion rules, and usage examples.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

This PR implements model version-aware reasoning configuration handling for Gemini. It adds detection for Gemini 3.0+ and refactors conversion logic between Bifrost reasoning parameters (effort/max_tokens) and Gemini's thinking configurations (thinkingLevel/thinkingBudget), with version-specific behavior and updated documentation.

Changes

Cohort / File(s) Summary
Reasoning Configuration Refactoring
core/providers/gemini/responses.go, core/providers/gemini/utils.go
Major logic overhaul in convertParamsToGenerationConfigResponses with nuanced handling for max tokens, thinking budgets, thinking levels, and effort-based reasoning. New functions: isGemini3Plus() for model detection and effortToThinkingLevel() for effort-to-level mapping. Extended convertParamsToGenerationConfig() to accept model parameter and route conversions based on Gemini version.
Type System Updates
core/providers/gemini/types.go
Added public constants: MinReasoningMaxTokens, DefaultCompletionMaxTokens, DefaultReasoningMinBudget, DynamicReasoningBudget. Changed GenerationConfigThinkingConfig.ThinkingLevel from custom enum type to *string. Refactored UnmarshalJSON() to use dual-tag auxiliary struct with camelCase/snake_case variants and sonic.Unmarshal(). Removed ThinkingLevel type and its constants.
Chat Request Handling
core/providers/gemini/chat.go
Modified ToGeminiChatCompletionRequest() to pass model to convertParamsToGenerationConfig(), extending parameter conversion to consider model capabilities.
Documentation Updates
docs/providers/reasoning.mdx
Comprehensive expansion distinguishing Gemini 2.5+ (budget-only) from Gemini 3.0+ (dual support for budget and level). Added conversion guidance, priority rules for overlapping parameters, version-specific behavior notes, examples, and new subsections on model differences and constraint handling.
Changelog Entries
core/changelog.md, transports/changelog.md
Updated changelog entries to document the reasoning conversion fix across both changelog locations.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ChatReq as Chat Request Handler
    participant ModelCheck as Model Capability<br/>(isGemini3Plus)
    participant ConfigConv as Config Converter
    participant GeminiAPI as Gemini API
    
    Client->>ChatReq: Bifrost request with<br/>reasoning.effort or<br/>reasoning.max_tokens
    activate ChatReq
    ChatReq->>ModelCheck: Check model version
    activate ModelCheck
    ModelCheck-->>ChatReq: Gemini 3.0+? / Version
    deactivate ModelCheck
    
    alt Has max_tokens (budget)
        ChatReq->>ConfigConv: Convert to thinkingBudget<br/>(both versions)
    else Only effort provided
        alt Gemini 3.0+
            ChatReq->>ConfigConv: Convert to thinkingLevel
        else Gemini 2.5 or earlier
            ChatReq->>ConfigConv: Convert to thinkingBudget<br/>(via effort lookup)
        end
    end
    
    activate ConfigConv
    ConfigConv->>ConfigConv: Set ThinkingConfig with<br/>appropriate field
    ConfigConv-->>ChatReq: GenerationConfig<br/>(reasoning fields set)
    deactivate ConfigConv
    
    ChatReq->>GeminiAPI: Gemini request with<br/>thinkingLevel or<br/>thinkingBudget
    deactivate ChatReq
    GeminiAPI-->>Client: Gemini response with<br/>reasoning details
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Whiskers twitch with glee,
Model versions dance so free—
Budgets bloom and levels leap,
Thinking deep, conversions steep!
Gemini logic, patch complete,
Reasoning flows so neat! 🌟

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and specifically describes the main change: fixing the conversion between thinking levels and thinking budgets in Gemini across different model versions.
Description check ✅ Passed The PR description is comprehensive and follows the template structure with all key sections completed: Summary, Changes, Type of change, Affected areas, How to test, Breaking changes, Related issues, Security considerations, and Checklist.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 01-05-fix_fix_reasoning_effort_and_budget_convertion_for_gemini

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @docs/providers/reasoning.mdx:
- Around line 840-851: The numeric example budgets in the table (3482, 2330) are
inconsistent with the documented default and the implementation: update either
the example calculations or the explanatory note so they match
GetBudgetTokensFromReasoningEffort and DefaultCompletionMaxTokens; specifically
either recalc the shown budgets using DefaultCompletionMaxTokens = 8192 (and
replace 3482/2330 with those 8192-based values) or change the footnote/text to
state the examples assume max_completion_tokens = 4096 so the current numbers
remain correct.
🧹 Nitpick comments (2)
core/providers/gemini/responses.go (1)

10-13: Responses-side reasoning config logic stays consistent with chat path

The updated convertParamsToGenerationConfigResponses mirrors the chat path correctly:

  • Uses DefaultCompletionMaxTokens / DefaultReasoningMinBudget as fallbacks.
  • Enforces the same priority rule: when both Reasoning.MaxTokens and Reasoning.Effort are set, only thinkingBudget is sent (effort ignored).
  • Handles special cases (effort == "none", max_tokens == 0, max_tokens == -1) and routes effort-only requests to:
    • thinkingLevel on Gemini 3.0+ (isGemini3Plus(r.Model) + effortToThinkingLevel), or
    • a computed thinkingBudget via GetBudgetTokensFromReasoningEffort on older models.

The behavior aligns with the new docs and the chat conversion logic; any error from the estimator is safely ignored as a soft fallback, which is reasonable here.

Also applies to: 2014-2079

core/providers/gemini/utils.go (1)

8-11: Harden Gemini 3+ detection to avoid misclassifying non-numeric model names

The new reasoning logic and helpers are generally solid:

  • Request/response conversion symmetry looks good.
  • Priority rules, special values (0, -1, "none"), and Pro-only level constraints all line up with the docs and with GetReasoningEffortFromBudgetTokens / GetBudgetTokensFromReasoningEffort.

The one fragile spot is isGemini3Plus:

idx := strings.Index(model, "gemini-")
if idx == -1 { return false }
afterPrefix := model[idx+7:]
if len(afterPrefix) == 0 { return false }
firstChar := afterPrefix[0]
return firstChar >= '3'

This will treat any model whose first character after gemini- is non‑numeric (e.g. "gemini-pro", "gemini-exp") as “3+” because 'p' / 'e' are ≥ '3' in ASCII. That can flip older or experimental models into the 3.0+ code path and cause us to send thinkingLevel to models that only support thinkingBudget.

Consider tightening the check to only treat models with an explicit numeric major version ≥ 3 as 3+; for example:

Proposed improvement for isGemini3Plus
-func isGemini3Plus(model string) bool {
-	// Convert to lowercase for case-insensitive comparison
-	model = strings.ToLower(model)
-
-	// Find "gemini-" prefix
-	idx := strings.Index(model, "gemini-")
-	if idx == -1 {
-		return false
-	}
-
-	// Get the part after "gemini-"
-	afterPrefix := model[idx+7:] // len("gemini-") = 7
-	if len(afterPrefix) == 0 {
-		return false
-	}
-
-	// Check first character - if it's '3' or higher, it's 3.0+
-	firstChar := afterPrefix[0]
-	return firstChar >= '3'
-}
+func isGemini3Plus(model string) bool {
+	model = strings.ToLower(model)
+
+	idx := strings.Index(model, "gemini-")
+	if idx == -1 {
+		return false
+	}
+
+	afterPrefix := model[idx+7:]
+	if len(afterPrefix) == 0 {
+		return false
+	}
+
+	// Extract leading numeric major version, e.g. "3" from "3.0-flash"
+	i := 0
+	for i < len(afterPrefix) && afterPrefix[i] >= '0' && afterPrefix[i] <= '9' {
+		i++
+	}
+	if i == 0 {
+		// No numeric prefix → treat as pre‑3.0 for safety
+		return false
+	}
+
+	major := int(afterPrefix[0] - '0')
+	return major >= 3
+}

This keeps the function cheap but avoids accidentally routing non‑versioned model names into the 3.0+ reasoning path.

Also applies to: 13-34, 36-65, 448-537

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6635d4 and 3fba3dd.

📒 Files selected for processing (7)
  • core/changelog.md
  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
  • core/providers/gemini/types.go
  • core/providers/gemini/utils.go
  • docs/providers/reasoning.mdx
  • transports/changelog.md
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/chat.go
  • core/changelog.md
  • docs/providers/reasoning.mdx
  • core/providers/gemini/responses.go
  • core/providers/gemini/utils.go
  • transports/changelog.md
  • core/providers/gemini/types.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/types.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/types.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/responses.go
  • core/providers/gemini/utils.go
  • core/providers/gemini/types.go
📚 Learning: 2025-12-30T05:37:48.365Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1180
File: docs/features/mcp/connecting-to-servers.mdx:452-458
Timestamp: 2025-12-30T05:37:48.365Z
Learning: When reviewing documentation PRs in a Graphite-managed stack, first check related or previous PRs in the stack for feature implementations before flagging documentation as incorrect or unsupported. Documentation MDX files often reference features implemented in earlier stack PRs; verify that the documented behavior exists in earlier changes and that the docs accurately reflect the implemented state before requesting edits.

Applied to files:

  • docs/providers/reasoning.mdx
🧬 Code graph analysis (3)
core/providers/gemini/chat.go (1)
core/providers/gemini/types.go (1)
  • GenerationConfig (641-707)
core/providers/gemini/responses.go (3)
core/providers/gemini/types.go (3)
  • DefaultCompletionMaxTokens (18-18)
  • DefaultReasoningMinBudget (19-19)
  • DynamicReasoningBudget (20-20)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1350-1387)
core/providers/gemini/utils.go (2)
core/providers/gemini/types.go (3)
  • DefaultCompletionMaxTokens (18-18)
  • DefaultReasoningMinBudget (19-19)
  • DynamicReasoningBudget (20-20)
core/providers/utils/utils.go (2)
  • GetReasoningEffortFromBudgetTokens (1308-1345)
  • GetBudgetTokensFromReasoningEffort (1350-1387)
🔇 Additional comments (5)
transports/changelog.md (1)

4-5: Changelog entries align with PR scope

The new bullets accurately describe the XAI docs update and Gemini reasoning conversion fix; nothing else to tweak here.

core/changelog.md (1)

2-3: Core changelog accurately reflects Gemini fix

The added Gemini reasoning conversion entry is consistent with the implementation changes in the provider.

docs/providers/reasoning.mdx (1)

24-25: Gemini reasoning docs match implementation behavior

The updated Gemini sections correctly capture:

  • 2.5+ as budget-only and 3.0+ as budget+level.
  • Priority rules where reasoning.max_tokens wins over effort and only thinkingBudget is sent when both are present.
  • Special handling for 0, -1, and "none", as well as Pro-model level restrictions and version detection.

This lines up with the new model-aware logic in convertParamsToGenerationConfig, convertParamsToGenerationConfigResponses, and effortToThinkingLevel.

Also applies to: 538-607, 613-637, 659-687, 706-717, 840-848, 1464-1483, 1497-1510

core/providers/gemini/chat.go (1)

22-25: Model-aware generation config wiring looks correct

Passing bifrostReq.Model into convertParamsToGenerationConfig is exactly what the new Gemini 2.5 vs 3.0+ reasoning logic needs; nothing else to change here.

core/providers/gemini/types.go (1)

17-21: Reasoning constants and ThinkingLevel pointer semantics look correct

  • The new constants (MinReasoningMaxTokens, DefaultCompletionMaxTokens, DefaultReasoningMinBudget, DynamicReasoningBudget) line up with how Gemini reasoning defaults and special values are used in the converters.
  • Switching GenerationConfigThinkingConfig.ThinkingLevel to *string plus the updated UnmarshalJSON that handles both camelCase and snake_case fields via sonic.Unmarshal gives you the flexibility needed for model-aware logic without breaking JSON interop.

No issues here.

Also applies to: 913-923, 925-963

Copy link
Contributor

akshaydeo commented Jan 5, 2026

Merge activity

  • Jan 5, 10:17 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jan 5, 10:17 AM UTC: @akshaydeo merged this pull request with Graphite.

@akshaydeo akshaydeo merged commit 48dd068 into main Jan 5, 2026
12 checks passed
@akshaydeo akshaydeo deleted the 01-05-fix_fix_reasoning_effort_and_budget_convertion_for_gemini branch January 5, 2026 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants