-
Notifications
You must be signed in to change notification settings - Fork 187
fix: dynamic thinking budget handling for gemini #1158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: dynamic thinking budget handling for gemini #1158
Conversation
📝 WalkthroughWalkthroughHandles dynamic thinking budget sentinel (-1) across multiple provider integrations (Anthropic, Bedrock, Cohere, Gemini): normalizes or explicitly propagates -1 per provider rules, enforces minimum budget where required, and updates docs and changelogs. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/bedrock/responses.go (1)
1313-1328: Guardrequest.InferenceConfigbefore dereferencing in reasoning_config parsingHere you do:
effort := providerUtils.GetReasoningEffortFromBudgetTokens( maxTokens, minBudgetTokens, *request.InferenceConfig.MaxTokens)without checking that
request.InferenceConfig(andMaxTokens) are non-nil. If Bedrock ever sendsreasoning_config.budget_tokenswithout a populatedInferenceConfig, this will panic.Consider defensively:
- Early‑returning or skipping effort derivation if
request.InferenceConfig == nilorrequest.InferenceConfig.MaxTokens == nil, or- Supplying a safe default maxTokens when it’s missing.
🧹 Nitpick comments (4)
core/providers/cohere/responses.go (1)
940-950: Dynamic budget handling looks correct for Cohere.The logic correctly implements Cohere's requirement to omit the token budget field when -1 is provided. Setting
TokenBudgettonilwill omit it from the JSON request due to theomitemptytag, which aligns with the PR objectives.Consider adding input validation.
The code doesn't validate that
MaxTokensis either -1 or >= 1 (per the type definition showingTokenBudgetshould be >= 1). Invalid values like 0, -2, or other negative numbers would be passed through without validation, potentially causing errors downstream.🔎 Proposed validation
if bifrostReq.Params.Reasoning.MaxTokens != nil { thinking := &CohereThinking{ Type: ThinkingTypeEnabled, } if *bifrostReq.Params.Reasoning.MaxTokens == -1 { // cohere does not support dynamic reasoning budget like gemini // setting it to nil thinking.TokenBudget = nil } else { + // Validate that MaxTokens is >= 1 + if *bifrostReq.Params.Reasoning.MaxTokens < 1 { + return nil, fmt.Errorf("reasoning max_tokens must be >= 1 or -1 for dynamic budget, got %d", *bifrostReq.Params.Reasoning.MaxTokens) + } thinking.TokenBudget = bifrostReq.Params.Reasoning.MaxTokens } cohereReq.Thinking = thinkingVerify consistency across the provider stack.
Since this PR is part of a Graphite stack with related changes to Gemini, Anthropic, and Bedrock, ensure that the -1 handling pattern is consistent and correctly documented across all providers.
Based on coding guidelines, stack context is important for this review.
core/providers/gemini/utils.go (2)
46-55: Consider clarifying the redundant assignment for dynamic thinking budget.Line 47 already assigns
MaxTokensfromThinkingBudget, so line 53's reassignment of-1is redundant. While the behavior is correct, this could be simplified for clarity:💡 Optional refactor to reduce redundancy
If the explicit reassignment isn't needed for future-proofing, consider:
if config.ThinkingConfig.ThinkingBudget != nil { params.Reasoning.MaxTokens = schemas.Ptr(int(*config.ThinkingConfig.ThinkingBudget)) switch *config.ThinkingConfig.ThinkingBudget { case 0: params.Reasoning.Effort = schemas.Ptr("none") - case -1: - // dynamic thinking budget - params.Reasoning.MaxTokens = schemas.Ptr(-1) } }However, retaining the explicit case may be intentional for:
- Documentation clarity (making -1 handling visible)
- Consistency with the
case 0pattern- Future-proofing if line 47's behavior changes
46-55: Consider validating negative thinking budget values.The current implementation handles
0,-1, and positive values, but doesn't validate against other negative values (e.g.,-2,-3). Since only-1has a documented meaning (dynamic thinking budget), consider adding validation to reject other negative values.🔎 Suggested validation
if config.ThinkingConfig.ThinkingBudget != nil { + if *config.ThinkingConfig.ThinkingBudget < -1 { + // Invalid negative value - could log warning or treat as 0 + params.Reasoning.MaxTokens = schemas.Ptr(0) + params.Reasoning.Effort = schemas.Ptr("none") + } else { params.Reasoning.MaxTokens = schemas.Ptr(int(*config.ThinkingConfig.ThinkingBudget)) switch *config.ThinkingConfig.ThinkingBudget { case 0: params.Reasoning.Effort = schemas.Ptr("none") case -1: // dynamic thinking budget params.Reasoning.MaxTokens = schemas.Ptr(-1) } + } }This ensures only valid values (
-1,0, or positive) are processed.core/providers/bedrock/responses.go (1)
1467-1514: Bedrock dynamic reasoning budget normalization is coherent across Anthropic & NovaThis block:
- Uses
tokenBudgetas the canonical scalar, remappingreasoning.max_tokens == -1toDefaultCompletionMaxTokensso Bedrock never receives-1.- Enforces Anthropic’s minimum reasoning budget before populating
"reasoning_config": {"type":"enabled","budget_tokens": tokenBudget}.- For Nova, translates
tokenBudgetintomaxReasoningEffortviaGetReasoningEffortFromBudgetTokens, and adjustsinferenceConfigfor"high"effort by clearingMaxTokens,Temperature, andTopP, which aligns with the existing effort-based path below.Behavioral nuance to keep in mind: for Anthropic models,
tokenBudgetis always derived from the Bifrost reasoning budget (or default) and is not clamped againstinferenceConfig.MaxTokens, so callers can request a larger reasoning budget than the completion max. If Bedrock/Anthropic ever enforcebudget_tokens <= max_tokens, that might need revisiting, but the current logic is internally consistent with your parsing side.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
core/changelog.mdcore/providers/anthropic/chat.gocore/providers/anthropic/responses.gocore/providers/bedrock/responses.gocore/providers/cohere/chat.gocore/providers/cohere/responses.gocore/providers/gemini/responses.gocore/providers/gemini/utils.godocs/integrations/genai-sdk/overview.mdxtransports/changelog.md
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/gemini/responses.gocore/providers/gemini/utils.gocore/providers/anthropic/responses.gocore/providers/anthropic/chat.gocore/providers/bedrock/responses.godocs/integrations/genai-sdk/overview.mdxcore/providers/cohere/responses.gotransports/changelog.mdcore/providers/cohere/chat.gocore/changelog.md
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/gemini/responses.gocore/providers/gemini/utils.gocore/providers/anthropic/responses.gocore/providers/anthropic/chat.gocore/providers/bedrock/responses.gocore/providers/cohere/responses.gocore/providers/cohere/chat.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/gemini/responses.gocore/providers/gemini/utils.gocore/providers/anthropic/responses.gocore/providers/anthropic/chat.gocore/providers/bedrock/responses.gocore/providers/cohere/responses.gocore/providers/cohere/chat.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
🧬 Code graph analysis (7)
core/providers/gemini/responses.go (1)
core/schemas/utils.go (1)
Ptr(16-18)
core/providers/gemini/utils.go (2)
core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(56-58)
core/providers/anthropic/responses.go (2)
core/providers/anthropic/types.go (2)
MinimumReasoningMaxTokens(15-15)AnthropicThinking(71-74)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/anthropic/chat.go (5)
core/providers/anthropic/types.go (2)
MinimumReasoningMaxTokens(15-15)AnthropicThinking(71-74)core/providers/bedrock/types.go (1)
MinimumReasoningMaxTokens(12-12)core/providers/cohere/types.go (1)
MinimumReasoningMaxTokens(11-11)core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(56-58)
core/providers/bedrock/responses.go (4)
core/providers/bedrock/types.go (2)
DefaultCompletionMaxTokens(13-13)MinimumReasoningMaxTokens(12-12)core/schemas/utils.go (1)
IsAnthropicModel(1048-1050)core/providers/anthropic/types.go (1)
MinimumReasoningMaxTokens(15-15)core/providers/utils/utils.go (1)
GetReasoningEffortFromBudgetTokens(1308-1345)
core/providers/cohere/responses.go (1)
core/providers/cohere/types.go (2)
CohereThinking(174-177)ThinkingTypeEnabled(183-183)
core/providers/cohere/chat.go (1)
core/providers/cohere/types.go (2)
CohereThinking(174-177)ThinkingTypeEnabled(183-183)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (7)
core/changelog.md (1)
4-5: LGTM!The changelog entries accurately describe the changes introduced in this PR. The distinction between the refactor (line 4) and the fix for dynamic thinking budget handling (line 5) is clear.
transports/changelog.md (1)
5-6: LGTM!The changelog entries are consistent with the core changelog and accurately reflect the transport-layer changes for dynamic thinking budget handling.
core/providers/gemini/responses.go (1)
2024-2034: LGTM! Correct implementation of dynamic thinking budget for Gemini.The logic correctly handles the three cases for reasoning configuration:
MaxTokens = 0: Disables thinkingMaxTokens = -1: Preserves dynamic thinking budget (Gemini native support)- Other values: Sets explicit thinking budget
This implementation aligns with the documentation stating that Gemini preserves
-1for native dynamic thinking support.docs/integrations/genai-sdk/overview.mdx (1)
283-305: Documentation for dynamic thinking budget handling is accurate and implementations are verified.All four provider-specific implementations for
thinkingBudget: -1have been confirmed in the codebase:
- Gemini (core/providers/gemini/utils.go:51-53): Preserves -1 for native dynamic thinking
- Anthropic (core/providers/anthropic/responses.go:1423-1426): Converts to maximum budget (anthropicReq.MaxTokens)
- Bedrock (core/providers/bedrock/responses.go:1472-1476): Converts to maximum budget (DefaultCompletionMaxTokens)
- Cohere (core/providers/cohere/responses.go:943-949): Drops budget field (sets TokenBudget to nil)
The documentation accurately reflects the implementation behavior. No action required.
core/providers/anthropic/chat.go (1)
104-120: Reasoningmax_tokens = -1normalization for Anthropic looks correctThe new
budgetTokenspath cleanly:
- Normalizes
reasoning.max_tokens == -1toanthropicReq.MaxTokens(so Anthropic never sees-1).- Enforces
MinimumReasoningMaxTokenson the resolved value.- Writes
Thinking.BudgetTokensfrom the computed scalar instead of aliasing the original pointer.This matches the intended “dynamic budget” behavior while staying within Anthropic’s static-budget API constraints, and keeps the
Effort-based fallback path unchanged.core/providers/anthropic/responses.go (1)
1420-1435: Consistentmax_tokens = -1handling for Anthropic Responses pathThis mirrors the chat-side logic:
reasoning.max_tokens == -1is resolved toanthropicReq.MaxTokens, validated againstMinimumReasoningMaxTokens, and written viaschemas.Ptr(budgetTokens). That keeps Responses‑based Anthropic requests aligned with the chat path and avoids ever sending a-1budget upstream while preserving existingEfforthandling.core/providers/cohere/chat.go (1)
107-117: Review verified—implementation is correct across all providers.The -1 handling for dynamic reasoning budgets is correctly implemented:
- Cohere (lines 110-117): Correctly sets TokenBudget to nil, omitting the field from the API request
- Anthropic: Converts -1 to the request's max_tokens value (as verified in the code at lines 108-120)
- Gemini: Preserves -1 for native dynamic thinking support (as verified in utils.go line 324+)
The implementation aligns with each provider's API specifications and the explicit struct construction improves clarity. The ExtraParams code path (lines 147-159) operates as a fallback and does not contradict the -1 handling logic for the primary Reasoning.MaxTokens path.
6c8c03c to
6c01240
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/bedrock/responses.go (1)
1471-1493: Model-agnostic default for -1 may be incorrect for Nova models.When
MaxTokensis -1 (line 1472), the code setstokenBudgettoanthropic.MinimumReasoningMaxTokens(1024) for all Bedrock models. However, this Anthropic-specific constant may not be appropriate for Nova models:
- Anthropic models: MinimumReasoningMaxTokens = 1024 ✓
- Nova models: MinimumReasoningMaxTokens = 1
For Nova models, using 1024 as the default when -1 is specified could lead to unintended effort level calculations at line 1493, since
GetReasoningEffortFromBudgetTokens(1024, 1, defaultMaxTokens)may produce a different effort level than intended.Consider model-specific defaults:
🔎 Suggested fix: Use model-specific defaults
tokenBudget := *bifrostReq.Params.Reasoning.MaxTokens if *bifrostReq.Params.Reasoning.MaxTokens == -1 { // bedrock does not support dynamic reasoning budget like gemini - // setting it to default max tokens - tokenBudget = anthropic.MinimumReasoningMaxTokens + // setting it to model-specific default max tokens + if schemas.IsAnthropicModel(bifrostReq.Model) { + tokenBudget = anthropic.MinimumReasoningMaxTokens + } else if schemas.IsNovaModel(bifrostReq.Model) { + // Use a sensible default for Nova, e.g., DefaultCompletionMaxTokens or a Nova-specific constant + tokenBudget = DefaultCompletionMaxTokens + if inferenceConfig.MaxTokens != nil { + tokenBudget = *inferenceConfig.MaxTokens + } + } else { + tokenBudget = MinimumReasoningMaxTokens + } }
🧹 Nitpick comments (3)
core/providers/anthropic/responses.go (1)
1421-1434: Anthropic responses reasoning budget normalization is consistent but duplicatedNormalizing
reasoning.max_tokens == -1toMinimumReasoningMaxTokensand enforcing>= MinimumReasoningMaxTokensis sensible and matches the chat path, but the logic is now duplicated betweenToAnthropicResponsesRequestandToAnthropicChatRequest. Consider extracting a small helper (e.g.,resolveAnthropicBudgetTokens(maxTokens *int) (int, error)) to keep the behavior in one place.core/providers/anthropic/chat.go (1)
104-119: Shared Anthropic reasoning budget logic should be centralizedThis block correctly maps
Reasoning.MaxTokens(including-1) to a validatedbudgetTokensand enables thinking, but it’s effectively identical to the logic inToAnthropicResponsesRequest. Extracting a common helper would reduce divergence risk if Anthropic’s minimum or mapping rules change later.core/providers/bedrock/utils.go (1)
39-95: Clarify-1reasoning budget mapping for Bedrock Anthropic vs Nova and other modelsThe new
tokenBudgetflow makes Bedrock’s behavior aroundreasoning.max_tokensmuch clearer, but a couple of details are worth double‑checking:
- For all Bedrock models, including Nova,
MaxTokens == -1is mapped toanthropic.MinimumReasoningMaxTokens. For Anthropic-on-Bedrock that’s intuitive, but for Nova and other Bedrock models it couples behavior to Anthropic’s minimum rather than this package’sMinimumReasoningMaxTokensor the per-model default/max tokens.- Only the Anthropic branch validates
tokenBudget >= anthropic.MinimumReasoningMaxTokens; Nova and other models will accept very small positivetokenBudgetvalues, while dynamic-1requests get a comparatively large default.If this cross-model coupling is intentional, consider adding a short comment explaining why Anthropic’s minimum is the shared default. Otherwise, it may be cleaner to:
- Use the local
MinimumReasoningMaxTokens(ordefaultMaxTokens) for Nova/other models, and- Optionally normalize/clamp
tokenBudgetthere as well for consistency.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
core/changelog.mdcore/providers/anthropic/chat.gocore/providers/anthropic/responses.gocore/providers/bedrock/responses.gocore/providers/bedrock/utils.gocore/providers/cohere/chat.gocore/providers/cohere/responses.gocore/providers/gemini/responses.gocore/providers/gemini/utils.godocs/integrations/genai-sdk/overview.mdxtransports/changelog.md
✅ Files skipped from review due to trivial changes (1)
- transports/changelog.md
🚧 Files skipped from review as they are similar to previous changes (3)
- core/providers/cohere/responses.go
- docs/integrations/genai-sdk/overview.mdx
- core/providers/gemini/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/bedrock/responses.gocore/providers/anthropic/responses.gocore/providers/cohere/chat.gocore/providers/anthropic/chat.gocore/providers/gemini/utils.gocore/providers/bedrock/utils.gocore/changelog.md
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/bedrock/responses.gocore/providers/anthropic/responses.gocore/providers/cohere/chat.gocore/providers/anthropic/chat.gocore/providers/gemini/utils.gocore/providers/bedrock/utils.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/bedrock/responses.gocore/providers/anthropic/responses.gocore/providers/cohere/chat.gocore/providers/anthropic/chat.gocore/providers/gemini/utils.gocore/providers/bedrock/utils.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
🧬 Code graph analysis (6)
core/providers/bedrock/responses.go (5)
core/providers/anthropic/types.go (1)
MinimumReasoningMaxTokens(15-15)core/providers/bedrock/types.go (1)
MinimumReasoningMaxTokens(12-12)core/providers/cohere/types.go (1)
MinimumReasoningMaxTokens(11-11)core/schemas/utils.go (1)
IsAnthropicModel(1048-1050)core/providers/utils/utils.go (1)
GetReasoningEffortFromBudgetTokens(1308-1345)
core/providers/anthropic/responses.go (2)
core/providers/anthropic/types.go (2)
MinimumReasoningMaxTokens(15-15)AnthropicThinking(71-74)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/cohere/chat.go (1)
core/providers/cohere/types.go (2)
CohereThinking(174-177)ThinkingTypeEnabled(183-183)
core/providers/anthropic/chat.go (5)
core/providers/anthropic/types.go (2)
MinimumReasoningMaxTokens(15-15)AnthropicThinking(71-74)core/providers/bedrock/types.go (1)
MinimumReasoningMaxTokens(12-12)core/providers/cohere/types.go (1)
MinimumReasoningMaxTokens(11-11)core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(56-58)
core/providers/gemini/utils.go (2)
core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(56-58)
core/providers/bedrock/utils.go (4)
core/providers/anthropic/types.go (1)
MinimumReasoningMaxTokens(15-15)core/providers/bedrock/types.go (1)
MinimumReasoningMaxTokens(12-12)core/schemas/utils.go (1)
IsAnthropicModel(1048-1050)core/providers/utils/utils.go (1)
GetReasoningEffortFromBudgetTokens(1308-1345)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (3)
core/providers/gemini/utils.go (1)
46-55: Dynamic reasoning budget sentinel now preserved for GeminiExplicitly setting
params.Reasoning.MaxTokenstoPtr(-1)forThinkingBudget == -1matches the dynamic-budget sentinel semantics and keeps Gemini aligned with other providers’ handling.core/changelog.md (1)
4-5: Changelog entry accurately captures provider reasoning-budget changesThe new entries clearly describe the Vertex/Gemini endpoint refactor and cross-provider dynamic thinking budget handling.
core/providers/cohere/chat.go (1)
104-118: Cohere dynamic budget handling matches intended semanticsTreating
Reasoning.MaxTokens == -1asTokenBudget = nil(while enabling thinking) correctly encodes “no explicit budget” for Cohere, and keeps explicit positive budgets unchanged.
84057b8 to
0d6a716
Compare
Merge activity
|
0d6a716 to
53c4db8
Compare

Handle Dynamic Thinking Budget Across Providers
This PR implements support for dynamic thinking budget (-1) in Gemini and adapts the behavior for other providers that don't natively support this feature.
Changes
Type of change
Affected areas
How to test
Test dynamic thinking budget with different providers:
Breaking changes
Related issues
Fixes issue with dynamic thinking budget handling across providers
Security considerations
No security implications.
Checklist