feat: responses reasoning fixes #1000
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking.
Caution: Review failed. The pull request is closed.

📝 Walkthrough

Summary by CodeRabbit: Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Centralizes Anthropic/Vertex/Azure request-body construction and passthrough logic, expands reasoning/thinking and summary handling across providers and streaming layers, adds custom JSON (un)marshallers and conversions, extends schemas with signatures/summaries, modifies router SSE/DONE behavior, and updates tests and UI surfaces.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant Client
participant Router as "Bifrost Router"
participant Provider
participant Acc as "Framework Accumulator"
Client->>Router: POST /responses (may include raw body or reasoning params)
Router->>Router: evaluate shouldUsePassthrough / isClaudeModel
Router->>Router: build jsonBody via getRequestBodyForResponses / provider-specific helper
Router->>Provider: send request (streaming or non-streaming)
Provider-->>Router: stream SSE/JSON chunks (events, deltas, errors, or multi-event raw)
Router->>Acc: forward chunks/deltas
Acc->>Acc: assemble deltas (content, reasoning summaries, signatures)
Acc-->>Router: emit assembled SSE/data frames (errors formatted via ToAnthropicResponsesStreamError)
    Router-->>Client: stream SSE (DONE marker suppressed for Anthropic/Responses routes)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45–75 minutes
Pre-merge checks and finishing touches

❌ Failed checks (3 warnings, 1 inconclusive)
✅ Passed checks (1 passed)
📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

⛔ Files ignored due to path filters (1)
📒 Files selected for processing (58)
Actionable comments posted: 5
🧹 Nitpick comments (9)
transports/bifrost-http/integrations/anthropic.go (1)
72-77: Non-stream Anthropic `/v1/messages` no longer supports raw-response passthrough; consider confirming intent and cleaning up commented code

The previous behavior short-circuited to `resp.ExtraFields.RawResponse` for Anthropic if present; now that path is commented out and we always go through `anthropic.ToAnthropicResponsesResponse(resp)`. This is a real behavior change for non-stream responses:
- If any callers relied on getting the provider's raw body for `/v1/messages` (non-stream), they will now receive the normalized Anthropic struct instead.
- In streaming, we still support raw passthrough gated by `BifrostContextKeySendBackRawResponse`, so behavior is now asymmetric between streaming and non-stream.

If the intent is to fully normalize non-stream responses (e.g., to ensure reasoning metadata is always passed through via our schemas), this looks fine functionally, but I'd suggest:
- Remove the commented block to avoid dead code, and
- Optionally add a brief comment above the converter clarifying that non‑stream Anthropic responses are intentionally always normalized and that raw passthrough is streaming‑only.
If, instead, non-stream raw passthrough is still desired in some cases, we probably want to reintroduce this logic but gate it similarly to streaming using `BifrostContextKeySendBackRawResponse` for consistency.
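For illustration, a minimal sketch of gating the non-stream converter the same way as streaming (the converter signature and field names follow the snippets quoted elsewhere in this review; treat this as a sketch, not the final implementation):

```go
ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) {
	// Mirror the streaming gate: only return the provider's raw body when the
	// caller explicitly opted in via the context key.
	if enabled, ok := (*ctx).Value(schemas.BifrostContextKeySendBackRawResponse).(bool); ok && enabled {
		if resp.ExtraFields.RawResponse != nil {
			return resp.ExtraFields.RawResponse, nil
		}
	}
	return anthropic.ToAnthropicResponsesResponse(resp), nil
},
```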
core/providers/utils/utils.go (1)
951-960: No urgent security fix needed for this use case, but consider documenting the function's non-security purpose.
`GetRandomString` is used only for generating internal message IDs in Anthropic response parsing (with prefixes like `msg_` and `rs_`), not for authentication tokens or security-sensitive identifiers. While `math/rand` is indeed not cryptographically secure, the current implementation is appropriate for internal message tracking. If you want to prevent future misuse, add a doc comment clarifying this is not suitable for security-sensitive purposes. Alternatively, if you're concerned about consistency with Go best practices for all random generation, using `crypto/rand` is defensible but not critical for this use case.
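A sketch of the suggested doc comment to place above `GetRandomString` (wording illustrative; the function body itself is unchanged):

```go
// GetRandomString returns a pseudo-random lowercase alphanumeric string of the
// given length. It is meant only for internal identifiers (e.g. "msg_"/"rs_"
// prefixed message IDs) and is NOT suitable for security-sensitive values such
// as tokens or keys; use crypto/rand for those.
```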
core/providers/openai/responses.go (2)
57-59: Duplicate condition check.

Line 57 and line 59 both check `len(message.ResponsesReasoning.Summary) > 0`. This is redundant.

```diff
 // If the message has summaries but no content blocks and the model is gpt-oss, then convert the summaries to content blocks
 if len(message.ResponsesReasoning.Summary) > 0 &&
 	strings.Contains(bifrostReq.Model, "gpt-oss") &&
-	len(message.ResponsesReasoning.Summary) > 0 &&
 	message.Content == nil {
```
45-84: Consider extracting model-specific reasoning logic to improve readability.

The nested conditionals handling reasoning content transformation are complex. The logic correctly handles:
- Skipping reasoning blocks without summaries for non-gpt-oss models
- Converting summaries to content blocks for gpt-oss models
- Passing through other messages unchanged
However, using `strings.Contains(bifrostReq.Model, "gpt-oss")` for model detection may be fragile if model naming conventions change. Consider extracting a helper function like `isGptOssModel(model string) bool` for clearer intent and easier maintenance:

```go
func isGptOssModel(model string) bool {
	return strings.Contains(model, "gpt-oss")
}
```

This would make the conditional checks more readable and centralize the model detection logic.
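For example, the duplicated condition flagged above would then read roughly like this (a sketch based on the snippet quoted in that comment):

```go
if len(message.ResponsesReasoning.Summary) > 0 &&
	isGptOssModel(bifrostReq.Model) &&
	message.Content == nil {
	// convert the summaries to reasoning content blocks
}
```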
core/providers/gemini/responses.go (2)
143-146: Consider using `sonic.Marshal` for consistency.

This uses `encoding/json.Marshal` while the rest of the codebase uses `github.com/bytedance/sonic` for JSON operations. For consistency and potential performance benefits, consider using `sonic.Marshal` here.

```diff
-	if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil {
+	if argsBytes, err := sonic.Marshal(part.FunctionCall.Args); err == nil {
 		argumentsStr = string(argsBytes)
 	}
```

You would also need to add the sonic import if not already present via another code path.
263-272: Duplicate ID generation logic.

Lines 264-267 and 269-271 contain duplicate logic for generating `itemID`. The second block (269-271) appears to be redundant, as it only handles the `MessageID == nil` case, which is already covered.

```diff
 // Generate stable ID for text item
 var itemID string
 if state.MessageID == nil {
 	itemID = fmt.Sprintf("item_%d", outputIndex)
 } else {
 	itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex)
 }
-if state.MessageID == nil {
-	itemID = fmt.Sprintf("item_%d", outputIndex)
-}
 state.ItemIDs[outputIndex] = itemID
```

core/providers/cohere/responses.go (1)
263-272: Duplicate ID generation pattern repeated multiple times.

The same ID generation logic with the redundant second `if` block appears in multiple places (lines 263-272, 306-316, and 421-429). This appears to be copy-paste duplication. Consider extracting a helper function and removing the duplicate conditional:

```go
func (state *CohereResponsesStreamState) generateItemID(outputIndex int, prefix string) string {
	if state.MessageID == nil {
		return fmt.Sprintf("%s_%d", prefix, outputIndex)
	}
	return fmt.Sprintf("msg_%s_%s_%d", *state.MessageID, prefix, outputIndex)
}
```

Then use it consistently:

```go
itemID := state.generateItemID(outputIndex, "item")
state.ItemIDs[outputIndex] = itemID
```

Also applies to: 306-316, 421-429
transports/bifrost-http/handlers/inference.go (2)
224-254: Custom `ResponsesRequest` unmarshal aligns with chat pattern; consider guarding against reuse-side effects

The split unmarshal (BifrostParams → `Input` union → `ResponsesParameters`) looks correct and mirrors the `ChatRequest.UnmarshalJSON` pattern, so it should resolve the embedded-struct issues with `ResponsesParameters`' custom unmarshaller. If you ever end up reusing a `ResponsesRequest` instance for multiple decodes, this implementation can leave stale values in fields that are omitted in subsequent payloads (standard `encoding/json` behavior, but now under your control). It's not a problem for the current usage (a fresh `var req ResponsesRequest` per request), but if you want stricter reset semantics you could zero the struct at the start of the method before re-populating it.
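A minimal sketch of that reset, assuming the existing method shape (the rest of the body stays as it is today):

```go
func (r *ResponsesRequest) UnmarshalJSON(data []byte) error {
	// Reset the receiver so fields omitted from this payload don't retain
	// values from a previous decode of the same instance.
	*r = ResponsesRequest{}

	// ...the existing split unmarshal of BifrostParams, the input union,
	// and ResponsesParameters then runs unchanged...
	return nil
}
```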
91-118: `responsesParamsKnownFields` omits `"user"`; likely ends up duplicated in `ExtraParams`

`ResponsesParameters` has a `User *string` field tagged `json:"user,omitempty"`, but `"user"` is not listed in `responsesParamsKnownFields`. That means `/v1/responses` requests with a `user` field will both populate `ResponsesParameters.User` (via `sonic.Unmarshal`) and also be treated as an unknown extra param and forwarded in `ExtraParams`. This is inconsistent with the chat path (where `"user"` is marked as known) and could cause confusing duplication for provider adapters that look at `ExtraParams`.

If `user` is intended to be a first-class, schema-level field for responses (same as chat), consider adding it here so it is not treated as a provider-specific extra:

```diff
 var responsesParamsKnownFields = map[string]bool{
 	"model":       true,
 	"input":       true,
 	"fallbacks":   true,
 	"stream":      true,
@@
 	"top_p":       true,
 	"tool_choice": true,
 	"tools":       true,
-	"truncation":  true,
+	"truncation":  true,
+	"user":        true,
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- core/providers/anthropic/errors.go (2 hunks)
- core/providers/anthropic/types.go (2 hunks)
- core/providers/cohere/responses.go (4 hunks)
- core/providers/gemini/responses.go (2 hunks)
- core/providers/openai/openai.go (1 hunks)
- core/providers/openai/responses.go (2 hunks)
- core/providers/openai/types.go (2 hunks)
- core/providers/utils/utils.go (3 hunks)
- core/schemas/responses.go (3 hunks)
- transports/bifrost-http/handlers/inference.go (1 hunks)
- transports/bifrost-http/integrations/anthropic.go (1 hunks)
- ui/lib/types/logs.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/anthropic/errors.go, transports/bifrost-http/handlers/inference.go, core/providers/openai/types.go, ui/lib/types/logs.ts, core/providers/anthropic/types.go, core/providers/openai/responses.go, transports/bifrost-http/integrations/anthropic.go, core/providers/utils/utils.go, core/providers/gemini/responses.go, core/providers/openai/openai.go, core/schemas/responses.go, core/providers/cohere/responses.go
🧬 Code graph analysis (6)
- core/providers/anthropic/errors.go (2)
  - core/providers/anthropic/types.go (3): AnthropicStreamEvent (328-337), AnthropicStreamEventTypeError (324-324), AnthropicStreamError (399-402)
  - core/providers/anthropic/responses.go (1): ToAnthropicResponsesStreamError (1392-1413)
- transports/bifrost-http/handlers/inference.go (2)
  - core/schemas/bifrost.go (1): ResponsesRequest (91-91)
  - core/schemas/responses.go (1): ResponsesParameters (85-112)
- core/providers/openai/types.go (1)
  - core/schemas/responses.go (1): ResponsesParametersReasoning (232-237)
- ui/lib/types/logs.ts (1)
  - core/schemas/responses.go (2): ResponsesReasoningSummary (743-746), ResponsesReasoning (729-732)
- core/providers/gemini/responses.go (3)
  - ui/lib/types/logs.ts (6): FunctionCall (157-160), ResponsesToolMessage (402-408), ResponsesMessage (422-437), ResponsesMessageContent (399-399), ResponsesReasoning (416-419), ResponsesReasoningSummary (411-414)
  - core/providers/gemini/types.go (4): FunctionCall (1091-1101), Role (13-13), Content (922-930), Type (778-778)
  - core/schemas/responses.go (5): ResponsesToolMessage (460-480), ResponsesMessage (312-325), ResponsesMessageContent (337-342), ResponsesReasoning (729-732), ResponsesReasoningSummary (743-746)
- core/schemas/responses.go (2)
  - core/providers/gemini/types.go (1): Type (778-778)
  - ui/lib/types/logs.ts (2): ResponsesMessageContentBlockType (351-358), ResponsesReasoningSummary (411-414)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (13)
core/providers/anthropic/types.go (2)
143-144: LGTM! Proper addition of redacted thinking support.

The new `redacted_thinking` content block type aligns with Anthropic's extended thinking API, which returns encrypted thinking content when the model's reasoning needs to be redacted. This is consistent with the `Data` field added below for storing the encrypted content.

153-153: Clear documentation for the Data field.

The comment accurately describes the field's purpose for encrypted redacted thinking data.
core/providers/openai/types.go (1)
154-192: Well-structured custom marshaling with clear intent.

The implementation correctly:

- Preserves custom `Input` marshaling via `json.RawMessage`
- Clones `Reasoning` to avoid mutating the original struct
- Always sets `MaxTokens` to `nil`, as OpenAI's API doesn't support this parameter

One minor note: the code mixes `encoding/json` (for `RawMessage`) with `sonic` (for the final marshal), which is intentional but could be documented.

core/providers/openai/responses.go (1)
41-94: Logic correctly handles reasoning content transformation.

The overall transformation logic for handling reasoning content blocks across different OpenAI model variants is sound. The approach of building a new messages slice while selectively transforming or skipping messages based on model capabilities is appropriate.
core/providers/anthropic/errors.go (1)
36-58: The function `ToAnthropicResponsesStreamError` exists only in `core/providers/anthropic/errors.go` and is not duplicated elsewhere in the codebase. There is no duplicate function definition in `responses.go` or any other file. This code can be merged without compilation errors related to duplication.

Likely an incorrect or invalid review comment.
ui/lib/types/logs.ts (1)
411-419: Type rename looks good and aligns with Go schema.

The renaming from `ResponsesReasoningContent` to `ResponsesReasoningSummary` is consistent with the corresponding changes in `core/schemas/responses.go` (lines 742-745). The field structure matches the Go definition.

core/schemas/responses.go (2)
398-401: New `Signature` field addition looks correct.

The `Signature` field is appropriately added as an optional pointer field for carrying content signatures (used for reasoning in Gemini 3 Pro). The JSON tag with `omitempty` is correct for optional fields.

728-746: Type rename and structure updates are consistent.

The `ResponsesReasoning` struct now uses `[]ResponsesReasoningSummary` for the `Summary` field, and the new `ResponsesReasoningSummary` type is properly defined with `Type` and `Text` fields. This aligns with the corresponding TypeScript types in `ui/lib/types/logs.ts`.

core/providers/gemini/responses.go (3)
148-164: Good fix for range loop variable capture issue.

Creating copies of `functionCallID` and `functionCallName` before using them in pointer assignments correctly avoids the Go range loop variable capture issue. This is a proper fix for Go versions prior to 1.22.

166-179: Thought signature preservation looks correct.

The logic correctly creates a separate `ResponsesReasoning` message when a thought signature is present, using an empty `Summary` slice and the encrypted content. This aligns with the updated schema and supports Gemini 3 Pro requirements.
619-627: Look-ahead logic assumes reasoning message immediately follows function call.

The look-ahead for thought signature assumes the reasoning message is at index `i+1`. This may not handle cases where messages are reordered or there are intervening messages. Consider documenting this assumption or adding validation. Verify that the message ordering convention (reasoning message immediately after function call) is consistently maintained across all code paths that produce these messages.
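If validation is added, it could look roughly like this (a sketch; the surrounding conversion loop and field access are assumptions based on the descriptions in this review):

```go
// Only treat the next message as this call's thought signature when it is a
// reasoning message that actually carries encrypted content.
var thoughtSignature *string
if i+1 < len(messages) {
	next := messages[i+1]
	if next.Type != nil && *next.Type == schemas.ResponsesMessageTypeReasoning &&
		next.ResponsesReasoning != nil && next.ResponsesReasoning.EncryptedContent != nil {
		thoughtSignature = next.ResponsesReasoning.EncryptedContent
	}
}
// thoughtSignature (if non-nil) is then attached to the Gemini part's ThoughtSignature.
```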
core/providers/cohere/responses.go (2)
162-765: Streaming conversion implementation is comprehensive.

The `ToBifrostResponsesStream` method handles the full OpenAI-style streaming lifecycle (created, in_progress, output_item.added, deltas, output_item.done, completed) with proper state management. The tool call argument accumulation and tool plan lifecycle handling appear correct.

894-1029: Message conversion logic handles reasoning blocks correctly.

The `ConvertBifrostMessagesToCohereMessages` function properly accumulates pending reasoning blocks and attaches them to assistant messages. The system message extraction and prepending logic is also correct.
a15c48b to d4bfce4 (Compare)
6cf3108 to 4b4a584 (Compare)
Actionable comments posted: 1
♻️ Duplicate comments (4)
core/providers/utils/utils.go (1)
267-267: Previous review comment still applies.

The past review already flagged this change from `sonic.Marshal` to `sonic.MarshalIndent`. The concern about increased payload size for production API requests and the associated debug prints in `openai.go` remains valid. Please address the feedback from the previous review.

core/providers/cohere/responses.go (3)
148-153: Returning empty struct for invalid image block remains unaddressed.

When `ImageURL` is nil, an empty `ResponsesMessageContentBlock{}` is returned with a zero-value `Type` field, which could cause unexpected behavior downstream when processing content blocks. Consider one of the previously suggested approaches:

- Return a text block indicating the missing image
- Return `(schemas.ResponsesMessageContentBlock, bool)` to indicate validity
- Skip invalid blocks at the call site
Based on learnings, this issue was previously flagged but not yet addressed.
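For the ok-bool option, a sketch of what the converter could look like (the text branch and some field names are assumptions based on the snippets quoted elsewhere in this review):

```go
func convertCohereContentBlock(block CohereContentBlock) (schemas.ResponsesMessageContentBlock, bool) {
	switch block.Type {
	case CohereContentBlockTypeText:
		if block.Text == nil {
			return schemas.ResponsesMessageContentBlock{}, false
		}
		return schemas.ResponsesMessageContentBlock{
			Type: schemas.ResponsesInputMessageContentBlockTypeText,
			Text: block.Text,
		}, true
	case CohereContentBlockTypeImage:
		if block.ImageURL == nil {
			return schemas.ResponsesMessageContentBlock{}, false // caller skips invalid blocks
		}
		return schemas.ResponsesMessageContentBlock{
			Type: schemas.ResponsesInputMessageContentBlockTypeImage,
			ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{
				ImageURL: &block.ImageURL.URL,
			},
		}, true
	default:
		return schemas.ResponsesMessageContentBlock{}, false
	}
}
```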
1131-1142: Tool choice "auto" mapping to "required" remains semantically incorrect.

Line 1136 maps `"auto"` to `ToolChoiceRequired`, which changes the semantic meaning. In the Responses API, `"auto"` means the model decides whether to call a tool, while `"required"` forces a tool call. Please verify Cohere's tool choice options and update the mapping:

```bash
#!/bin/bash
# Search for Cohere tool choice type definitions and usage
ast-grep --pattern 'type CohereToolChoice $$$'
ast-grep --pattern 'ToolChoice$_ CohereToolChoice = $_'
```

Based on learnings, this issue was previously flagged but not yet addressed.
1216-1225: Encrypted reasoning content exposure in plain text marker remains unaddressed.

Lines 1219-1224 wrap encrypted content in a plain text marker `[ENCRYPTED_REASONING: ...]`, exposing the encrypted content in an unprotected format. This defeats the purpose of encryption if the content is meant to remain opaque. Consider skipping encrypted content entirely since Cohere doesn't support it:

```diff
 } else if msg.ResponsesReasoning.EncryptedContent != nil {
-	// Cohere doesn't have a direct equivalent to encrypted content,
-	// so we'll store it as a regular thinking block with a special marker
-	encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-	thinkingBlock := CohereContentBlock{
-		Type:     CohereContentBlockTypeThinking,
-		Thinking: &encryptedText,
-	}
-	thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+	// Skip encrypted content as Cohere doesn't support it
+	// The encrypted content should remain opaque and not be sent to other providers
 }
```

Based on learnings, this issue was previously flagged but not yet addressed.
🧹 Nitpick comments (5)
transports/bifrost-http/integrations/anthropic.go (3)
74-78: Remove commented-out code.

This commented-out code block should either be removed or documented with a TODO/reason for keeping it. Leaving dead code in comments reduces readability and maintainability.

```diff
 ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) {
-	// if resp.ExtraFields.Provider == schemas.Anthropic {
-	// 	if resp.ExtraFields.RawResponse != nil {
-	// 		return resp.ExtraFields.RawResponse, nil
-	// 	}
-	// }
 	return anthropic.ToAnthropicResponsesResponse(resp), nil
 },
```
94-97: Use the injected logger instead of stdlib `log`.

The `AnthropicRouter` is initialized with a `schemas.Logger` (line 246), but this closure uses the stdlib `log.Printf`. This inconsistency means errors logged here won't go through the configured logging infrastructure. Consider passing the logger through the route config or using a package-level logger that can be configured.
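One way to do that is to build the converter where the logger is in scope and close over it. A rough sketch (the converter's return shape follows the snippet quoted further down in this review; the logger method name is an assumption):

```go
func anthropicStreamEventConverter(logger schemas.Logger) func(event anthropic.AnthropicStreamEvent) (string, interface{}, error) {
	return func(event anthropic.AnthropicStreamEvent) (string, interface{}, error) {
		data, err := sonic.Marshal(event)
		if err != nil {
			// Route the failure through the configured logger instead of stdlib log.
			logger.Error("failed to marshal Anthropic stream event: " + err.Error())
			return "", nil, nil
		}
		return string(event.Type), string(data), nil
	}
}
```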
103-117: Remove large commented-out code block.

This 15-line commented block should be removed. If this logic might be needed in the future, document the intent in a TODO or track it in an issue rather than leaving dead code.

```diff
 } else {
-	// if resp.ExtraFields.Provider == schemas.Anthropic ||
-	// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-	// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-	// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-	// 	// This is always true in integrations
-	// 	isRawResponseEnabled, ok := (*ctx).Value(schemas.BifrostContextKeySendBackRawResponse).(bool)
-	// 	if ok && isRawResponseEnabled {
-	// 		if resp.ExtraFields.RawResponse != nil {
-	// 			return string(anthropicResponse[len(anthropicResponse)-1].Type), resp.ExtraFields.RawResponse, nil
-	// 		} else {
-	// 			// Explicitly return nil to indicate that no raw response is available (because 1 chunk of anthropic gets converted to multiple bifrost responses chunks)
-	// 			return "", nil, nil
-	// 		}
-	// 	}
-	// }
 	return string(anthropicResponse[0].Type), anthropicResponse[0], nil
 }
```

core/providers/utils/utils.go (1)
950-960: Consider using a package-level random source for better performance.

The current implementation creates a new `rand.Source` on every call, which is inefficient. However, the collision risk from `time.Now().UnixNano()` seeding is minimal in practice since `GetRandomString` is used for generating message IDs in response processing (not in tight loops where nanosecond collisions would occur). For non-security use cases like message identification, consider a simpler optimization using a package-level source with synchronization:

```go
var (
	randMu  sync.Mutex
	randSrc = rand.New(rand.NewSource(time.Now().UnixNano()))
)

func GetRandomString(length int) string {
	letters := []rune("abcdefghijklmnopqrstuvwxyz0123456789")
	b := make([]rune, length)
	randMu.Lock()
	for i := range b {
		b[i] = letters[randSrc.Intn(len(letters))]
	}
	randMu.Unlock()
	return string(b)
}
```

This avoids repeated allocations without the complexity of `crypto/rand`.

core/providers/openai/responses.go (1)
56-81: Duplicate condition on line 59.

The condition `len(message.ResponsesReasoning.Summary) > 0` is checked twice in the same if statement, at lines 57 and 59. Apply this diff to remove the redundant check:

```diff
 // If the message has summaries but no content blocks and the model is gpt-oss, then convert the summaries to content blocks
 if len(message.ResponsesReasoning.Summary) > 0 &&
 	strings.Contains(bifrostReq.Model, "gpt-oss") &&
-	len(message.ResponsesReasoning.Summary) > 0 &&
 	message.Content == nil {
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- core/providers/anthropic/errors.go (2 hunks)
- core/providers/anthropic/types.go (3 hunks)
- core/providers/cohere/responses.go (9 hunks)
- core/providers/gemini/responses.go (2 hunks)
- core/providers/openai/openai.go (1 hunks)
- core/providers/openai/responses.go (2 hunks)
- core/providers/openai/types.go (2 hunks)
- core/providers/utils/utils.go (3 hunks)
- core/schemas/responses.go (5 hunks)
- framework/streaming/responses.go (2 hunks)
- transports/bifrost-http/handlers/inference.go (1 hunks)
- transports/bifrost-http/integrations/anthropic.go (3 hunks)
- ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- core/providers/openai/openai.go
- transports/bifrost-http/handlers/inference.go
- core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
ui/lib/types/logs.ts, core/providers/anthropic/errors.go, core/providers/openai/types.go, core/providers/utils/utils.go, transports/bifrost-http/integrations/anthropic.go, core/schemas/responses.go, core/providers/gemini/responses.go, core/providers/openai/responses.go, core/providers/cohere/responses.go, framework/streaming/responses.go
🧬 Code graph analysis (7)
- ui/lib/types/logs.ts (1)
  - core/schemas/responses.go (2): ResponsesReasoningSummary (744-747), ResponsesReasoning (730-733)
- core/providers/anthropic/errors.go (2)
  - core/providers/anthropic/types.go (3): AnthropicStreamEvent (328-337), AnthropicStreamEventTypeError (324-324), AnthropicStreamError (399-402)
  - core/providers/anthropic/responses.go (1): ToAnthropicResponsesStreamError (1392-1413)
- core/providers/openai/types.go (1)
  - core/schemas/responses.go (1): ResponsesParametersReasoning (233-238)
- transports/bifrost-http/integrations/anthropic.go (1)
  - core/providers/gemini/types.go (1): Type (778-778)
- core/providers/gemini/responses.go (2)
  - core/providers/gemini/types.go (5): FunctionCall (1091-1101), Role (13-13), Content (922-930), Type (778-778), Part (936-960)
  - core/schemas/responses.go (8): ResponsesToolMessage (461-481), ResponsesMessage (313-326), ResponsesInputMessageRoleAssistant (331-331), ResponsesMessageContent (338-343), ResponsesMessageTypeFunctionCall (294-294), ResponsesMessageTypeReasoning (306-306), ResponsesReasoning (730-733), ResponsesReasoningSummary (744-747)
- core/providers/openai/responses.go (2)
  - core/schemas/responses.go (5): ResponsesMessage (313-326), ResponsesReasoning (730-733), ResponsesMessageContentBlock (398-410), ResponsesOutputMessageContentTypeReasoning (393-393), ResponsesMessageContent (338-343)
  - core/providers/openai/types.go (2): OpenAIResponsesRequest (143-152), OpenAIResponsesRequestInput (110-113)
- framework/streaming/responses.go (2)
  - core/schemas/responses.go (8): ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392), ResponsesMessage (313-326), ResponsesInputMessageRoleAssistant (331-331), ResponsesReasoning (730-733), ResponsesReasoningSummary (744-747), ResponsesMessageContent (338-343), ResponsesMessageContentBlock (398-410), ResponsesOutputMessageContentTypeReasoning (393-393)
  - core/schemas/utils.go (1): Ptr (16-18)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (20)
transports/bifrost-http/integrations/anthropic.go (1)
85-119: Streaming response handling logic looks correct.

The three-way branching based on response count handles:
- Empty responses (OpenAI lifecycle events) → returns empty
- Multiple events → concatenates as SSE format
- Single event → returns directly
The SSE format construction at line 99 correctly follows `event: <type>\ndata: <json>\n\n`.
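In Go terms, the frame written there amounts to the following (a sketch; it matches the format string quoted later in this review):

```go
frame := fmt.Sprintf("event: %s\ndata: %s\n\n", eventType, jsonData)
```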
core/providers/openai/types.go (1)
154-192: LGTM - Custom marshaling correctly omits `max_tokens` for OpenAI.

The implementation correctly:

- Preserves the custom `Input` marshaling via `json.RawMessage`
- Copies reasoning fields while explicitly setting `MaxTokens` to nil
- Follows the same pattern as `OpenAIChatRequest.MarshalJSON` above

Please verify that OpenAI's Responses API indeed does not support `reasoning.max_tokens` and should have it omitted from requests.
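For reference, the clone-and-nil approach described above boils down to something like this (a sketch of the idea, not the actual implementation; struct and field names follow the types referenced in this review, and the custom `Input` handling is omitted for brevity):

```go
func (r OpenAIResponsesRequest) MarshalJSON() ([]byte, error) {
	type alias OpenAIResponsesRequest // plain alias type so marshaling doesn't recurse into this method
	a := alias(r)
	if a.Reasoning != nil {
		reasoning := *a.Reasoning // clone so the caller's struct is not mutated
		reasoning.MaxTokens = nil // OpenAI's Responses API does not accept reasoning.max_tokens
		a.Reasoning = &reasoning
	}
	return sonic.Marshal(a)
}
```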
core/providers/openai/responses.go (1)
41-85: Reasoning transformation logic looks correct.

The bidirectional conversion between gpt-oss reasoning content blocks and standard OpenAI summaries + encrypted_content is well-structured. The three branches handle:
- Skip messages with content blocks but no summaries for non-gpt-oss models
- Convert summaries to content blocks for gpt-oss models
- Preserve messages as-is for other cases
The model detection via `strings.Contains(bifrostReq.Model, "gpt-oss")` is fragile. Consider verifying this matches the actual model naming convention and whether a more robust check (e.g., a helper function or constant) would be appropriate.

ui/lib/types/logs.ts (1)
411-433: LGTM - Type rename aligns with backend schema changes.

The rename from `ResponsesReasoningContent` to `ResponsesReasoningSummary` correctly mirrors the backend `ResponsesReasoningSummary` struct in `core/schemas/responses.go` (lines 743-746), maintaining consistency across the codebase.

framework/streaming/responses.go (3)
497-534: LGTM - Reasoning summary streaming accumulation.

The new case for `ResponsesStreamResponseTypeReasoningSummaryTextDelta` correctly:

- Searches for existing reasoning message by `ItemID`
- Creates a new reasoning message with proper type and role if not found
- Delegates to helper methods for delta and signature handling
626-679: Reasoning delta accumulation handles both storage modes correctly.

The helper properly branches on `contentIndex`:

- With index: stores in content blocks as `reasoning_text` type
- Without index: accumulates into `ResponsesReasoning.Summary`

The comment on lines 667-668 acknowledges the current limitation of accumulating into a single summary entry.
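Schematically, the split looks like this (a comment-only sketch of where each delta lands, based on the field names cited in this review):

```go
if contentIndex != nil {
	// Stored as a reasoning_text content block at message.Content.ContentBlocks[*contentIndex]
	// (Type: ResponsesOutputMessageContentTypeReasoning, Text extended with the delta).
} else {
	// Accumulated into message.ResponsesReasoning.Summary, currently a single
	// ResponsesReasoningSummary entry whose Text grows with each delta.
}
```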
681-727: Signature accumulation logic is correct.

Follows the same pattern as delta handling, storing signatures in either:

- `ContentBlock.Signature` when `contentIndex` is provided
- `ResponsesReasoning.EncryptedContent` otherwise

This aligns with the schema design where `EncryptedContent` serves as the reasoning-level signature storage.

core/schemas/responses.go (4)
68-68: LGTM - `StopReason` field addition.

The `StopReason` field appropriately handles non-OpenAI providers that return stop reasons in a different format, with a clear comment noting it's not part of OpenAI's spec.

398-402: LGTM - `Signature` field for content blocks.

Adding the `Signature` field to `ResponsesMessageContentBlock` enables per-block signature storage for reasoning content, which aligns with the streaming accumulation logic in `framework/streaming/responses.go`.

729-747: LGTM - Rename to `ResponsesReasoningSummary`.

The rename from `ResponsesReasoningContent` to `ResponsesReasoningSummary` better reflects the purpose of this struct and maintains consistency with the UI types in `ui/lib/types/logs.ts`.

1439-1441: LGTM - `Signature` field for streaming responses.

Adding `Signature` to `BifrostResponsesStreamResponse` enables streaming signature deltas alongside text deltas, supporting the reasoning accumulation logic.

core/providers/gemini/responses.go (2)
138-179: LGTM! Good handling of function calls and thought signatures.

The implementation correctly:

- Avoids range loop variable capture by creating copies of `functionCallID` and `functionCallName`
- Preserves Gemini's `ThoughtSignature` as encrypted content in a separate reasoning message
- Initializes the `Summary` field as an empty slice, consistent with the new schema structure
609-629: LGTM! Proper bidirectional conversion with safe look-ahead.

The look-ahead mechanism correctly:
- Checks array bounds before accessing the next message
- Validates the next message is a reasoning type with encrypted content
- Preserves the thought signature from the Bifrost reasoning message back to Gemini's format
This maintains consistency with the reverse conversion in `convertGeminiCandidatesToResponsesOutput`.

core/providers/cohere/responses.go (7)
17-17: LGTM! Proper state tracking for reasoning content.

The `ReasoningContentIndices` field is correctly:

- Initialized in the pool's `New` function
- Handled with defensive nil checks in `acquireCohereResponsesStreamState`
- Cleared in the `flush` method, consistent with other map fields

This enables proper tracking of reasoning blocks during streaming conversion.
Also applies to: 34-34, 64-68, 106-110
318-368: LGTM! Proper reasoning content lifecycle handling.

The thinking/reasoning block handling correctly:

- Creates a reasoning message with the appropriate type and empty `Summary` slice
- Tracks the content index in `ReasoningContentIndices` for downstream event emission
- Emits OpenAI-style lifecycle events (`output_item.added`, `content_part.added`)
- Generates stable item IDs consistent with other content types
395-410: LGTM! Correct differentiation between text and reasoning deltas.

The implementation properly emits `reasoning_summary_text.delta` events for thinking content (line 400) instead of `output_text.delta`, ensuring downstream consumers can distinguish between regular text and reasoning updates.
420-449: LGTM! Proper reasoning block cleanup and event emission.

The content end handling correctly:

- Uses `ReasoningContentIndices` to differentiate reasoning from text blocks
- Emits `reasoning_summary_text.done` for reasoning (line 425) vs. `output_text.done` for text (line 454)
- Cleans up the tracking map (line 449) to prevent memory leaks
977-1112: LGTM! Comprehensive message conversion with proper state management.

The conversion function correctly handles:

- Accumulation of reasoning blocks via `pendingReasoningContentBlocks`
- Association of reasoning with assistant messages
- Proper flushing of pending blocks at function end (lines 1090-1100)
- System message collection and prepending (lines 1102-1109)
The state machine logic is complex but appears sound for managing the various message types and their relationships.
850-932: LGTM! Comprehensive request conversion with proper parameter mapping.

The `ToCohereResponsesRequest` function correctly:

- Maps standard parameters (temperature, top_p, max_tokens)
- Extracts Cohere-specific options from `ExtraParams` (top_k, thinking, penalties)
- Converts tools and tool choice using dedicated helper functions
- Delegates message conversion to `ConvertBifrostMessagesToCohereMessages`

The structure is clean and follows established patterns in the codebase.
1370-1383: The reasoning message structure is correct and not redundant. Cohere provides reasoning as `reasoning_text` content blocks (lines 1327-1331), which are correctly placed in `Content.ContentBlocks` while `ResponsesReasoning.Summary` remains empty. This dual-field pattern is intentional: `ResponsesReasoning.Summary` is used by providers that send reasoning summaries (e.g., some OpenAI models), while `Content.ContentBlocks` is used for reasoning_text blocks (Cohere, Bedrock, Anthropic). When converting back to provider format (lines 1207-1209), the code checks `Summary` first, which is empty for content that originated from blocks, and that's correct.

Likely an incorrect or invalid review comment.
Actionable comments posted: 2
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
framework/streaming/responses.go (1)
71-104: New `Signature` field is not preserved in deep copies

`deepCopyResponsesStreamResponse` copies `Delta` and `LogProbs` but never copies the new `Signature` field on `BifrostResponsesStreamResponse` (lines 71-104), so any signature arriving from providers is lost when we stash stream responses in the accumulator. `deepCopyResponsesMessageContentBlock` similarly never copies the new `Signature` (and still ignores `FileID`) on `ResponsesMessageContentBlock` (lines 382-424), so block-level signatures won't survive accumulation either.

Both issues mean the newly added reasoning signature plumbing in `buildCompleteMessageFromResponsesStreamChunks` / `appendReasoningSignatureToResponsesMessage` can never see those signatures. Consider updating both helpers along these lines:

```diff
 func deepCopyResponsesStreamResponse(original *schemas.BifrostResponsesStreamResponse) *schemas.BifrostResponsesStreamResponse {
 	...
 	if original.Delta != nil {
 		copyDelta := *original.Delta
 		copy.Delta = &copyDelta
 	}
+	if original.Signature != nil {
+		copySig := *original.Signature
+		copy.Signature = &copySig
+	}
 	...
 }
```

```diff
 func deepCopyResponsesMessageContentBlock(original schemas.ResponsesMessageContentBlock) schemas.ResponsesMessageContentBlock {
 	copy := schemas.ResponsesMessageContentBlock{
 		Type: original.Type,
 	}
+	if original.FileID != nil {
+		id := *original.FileID
+		copy.FileID = &id
+	}
+	if original.Signature != nil {
+		sig := *original.Signature
+		copy.Signature = &sig
+	}
 	if original.Text != nil {
 		copyText := *original.Text
 		copy.Text = &copyText
 	}
 	...
 }
```

Also applies to: 382-424
♻️ Duplicate comments (4)
core/providers/utils/utils.go (1)
267-267: Revert to `sonic.Marshal` for production performance.

This change from `sonic.Marshal` to `sonic.MarshalIndent` was flagged in a previous review but remains unaddressed. The indented JSON increases payload size and bandwidth for all provider API requests without any documented justification. Provider APIs do not require formatted JSON. Unless there is a specific requirement for indented JSON (which should be documented with a code comment), revert this change.
Apply this diff to revert the change:
```diff
- jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+ jsonBody, err := sonic.Marshal(convertedBody)
```

core/providers/cohere/responses.go (3)
140-172: Empty content block for invalid image URL is fragile

When `cohereBlock.Type == CohereContentBlockTypeImage` and `cohereBlock.ImageURL == nil`, the function returns a zero-value `ResponsesMessageContentBlock{}` (`Type == ""`), which can later end up in `ContentBlocks` and surprise downstream logic that expects a valid `Type`. Consider either:
- Skipping such blocks entirely, or
- Returning a text block indicating an invalid/missing image instead of an empty block.
1127-1145: Cohere tool choice mapping treats `"auto"` and unknown values as `"required"`

`convertBifrostToolChoiceToCohereToolChoice`:

```go
switch *toolChoiceString {
case "none":
	choice := ToolChoiceNone
	return &choice
case "required", "auto", "function":
	choice := ToolChoiceRequired
	return &choice
default:
	choice := ToolChoiceRequired
	return &choice
}
```

Maps both `"auto"` and any unknown string to `ToolChoiceRequired`, which forces a tool call instead of letting the model decide. That's a semantic change from OpenAI-style `"auto"` and may not match Cohere's API either. Consider instead:

- Mapping `"none"` → `ToolChoiceNone`, `"required"`/`"function"` → `ToolChoiceRequired`, and `"auto"` (and other/unknown values) → `nil` to fall back to Cohere defaults, or a dedicated "auto" enum if available.
1193-1225: Encrypted reasoning content is exposed via a plain-text marker

In `convertBifrostReasoningToCohereThinking`, when `EncryptedContent` is present:

```go
encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
thinkingBlock := CohereContentBlock{
	Type:     CohereContentBlockTypeThinking,
	Thinking: &encryptedText,
}
```

This wraps the encrypted payload in a clear-text marker and sends it to Cohere as "thinking" text, which may be contrary to the intent of keeping it opaque and could leak internal details.
Safer options:
- Skip `EncryptedContent` entirely for Cohere (don't send it), or
- Represent only high-level metadata (e.g., "[ENCRYPTED_REASONING_PRESENT]") without including the ciphertext.
🧹 Nitpick comments (4)
core/providers/utils/utils.go (1)
951-960: The duplicate-string concern is overstated; refactoring is optional, not essential.

While the implementation has minor inefficiencies, the actual impact is negligible for this use case:

Duplicates are extremely unlikely: a 50-character random string over the 36-character alphabet used here offers roughly 10^78 possible combinations. The probability of duplicate outputs is vanishingly small, especially for API response message IDs generated during processing.
Inefficiency is minor: creating a new `rand.Source` per call has overhead, but these calls occur during response transformation, not in a tight loop. This is unlikely to be a performance bottleneck.
If performance profiling shows this is a bottleneck, consider refactoring with
sync.Onceto initialize a package-level random source. However, this is not essential for the current usage pattern.transports/bifrost-http/handlers/inference.go (1)
224-254: ResponsesRequest.UnmarshalJSON logic looks solid; fix comment wording and consider deduping with ChatRequestThis implementation correctly mirrors the ChatRequest flow: it protects the embedded
BifrostParamsfrom being shadowed by the customResponsesParameters.UnmarshalJSON, and it cleanly decodes theinputunion and params. No functional issues stand out.Two small nits:
- Line 236: the comment says "Unmarshal messages" but this block unmarshals the
inputfield. Consider updating to avoid confusion.- The structure is now nearly identical to
ChatRequest.UnmarshalJSON; if this pattern spreads further, a shared helper for "unmarshal BifrostParams + specific input + specific params" could reduce duplication, though it's not urgent.core/providers/openai/responses.go (1)
42-84: Reasoning message skip / comment mismatch – please confirm intended behavior
- For non‑
gpt-ossmodels, reasoning messages withResponsesReasoningbut onlyContentBlocks(noSummary, noEncryptedContent) are silently skipped (Lines 47–54). That drops those messages entirely instead of degrading them (e.g., into summaries or plain text). If such inputs can occur cross‑provider, this may be surprising; worth confirming that they can’t, or that dropping them is acceptable.- The comment “convert them to summaries” (Line 43) doesn’t match the code, which instead converts summaries to reasoning content blocks for
gpt-osswhenContent == nil(Lines 56–77). Updating the comment would avoid confusion.core/providers/cohere/responses.go (1)
1304-1383: Reasoning summary content is only attached as blocks, not asSummary
convertSingleCohereMessageToBifrostMessagescollectsCohereContentBlockTypeThinkingblocks intoreasoningContentBlocksand then:
- Prepends a
ResponsesMessageTypeReasoningmessage withContent.ContentBlocks = reasoningContentBlocksand- Initializes
ResponsesReasoning.Summaryas an empty slice.Given the new schema encourages using
ResponsesReasoning.Summaryfor reasoning summaries, this is fine as long as downstream code expects reasoning_text content blocks and not populatedSummaryentries for Cohere outputs. If you intend to surface reasoning summaries uniformly across providers, you might later want to mirror those blocks intoResponsesReasoning.Summaryas well.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- core/providers/anthropic/errors.go (2 hunks)
- core/providers/anthropic/types.go (3 hunks)
- core/providers/cohere/responses.go (9 hunks)
- core/providers/gemini/responses.go (2 hunks)
- core/providers/openai/openai.go (1 hunks)
- core/providers/openai/responses.go (2 hunks)
- core/providers/openai/types.go (2 hunks)
- core/providers/utils/utils.go (3 hunks)
- core/schemas/responses.go (5 hunks)
- framework/streaming/responses.go (2 hunks)
- transports/bifrost-http/handlers/inference.go (1 hunks)
- transports/bifrost-http/integrations/anthropic.go (3 hunks)
- ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- core/providers/anthropic/errors.go
- ui/lib/types/logs.ts
- core/providers/openai/openai.go
- core/providers/openai/types.go
- core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/utils/utils.go, core/providers/openai/responses.go, transports/bifrost-http/handlers/inference.go, transports/bifrost-http/integrations/anthropic.go, core/providers/gemini/responses.go, core/providers/cohere/responses.go, core/schemas/responses.go, framework/streaming/responses.go
🧬 Code graph analysis (6)
- core/providers/openai/responses.go (2)
  - core/schemas/responses.go (5): ResponsesMessage (313-326), ResponsesReasoning (730-733), ResponsesMessageContentBlock (398-410), ResponsesOutputMessageContentTypeReasoning (393-393), ResponsesMessageContent (338-343)
  - core/providers/openai/types.go (2): OpenAIResponsesRequest (143-152), OpenAIResponsesRequestInput (110-113)
- transports/bifrost-http/handlers/inference.go (2)
  - core/schemas/bifrost.go (1): ResponsesRequest (91-91)
  - core/schemas/responses.go (1): ResponsesParameters (86-113)
- core/providers/gemini/responses.go (3)
  - ui/lib/types/logs.ts (6): FunctionCall (157-160), ResponsesToolMessage (402-408), ResponsesMessage (422-437), ResponsesMessageContent (399-399), ResponsesReasoning (416-419), ResponsesReasoningSummary (411-414)
  - core/providers/gemini/types.go (4): FunctionCall (1091-1101), Role (13-13), Content (922-930), Type (778-778)
  - core/schemas/responses.go (5): ResponsesToolMessage (461-481), ResponsesMessage (313-326), ResponsesMessageContent (338-343), ResponsesReasoning (730-733), ResponsesReasoningSummary (744-747)
- core/providers/cohere/responses.go (3)
  - core/providers/cohere/types.go (22): CohereContentBlock (142-156), CohereContentBlockTypeText (134-134), CohereContentBlockTypeImage (135-135), CohereContentBlockTypeThinking (136-136), CohereStreamEvent (387-392), StreamEventMessageStart (372-372), StreamEventContentStart (373-373), StreamEventContentDelta (374-374), StreamEventContentEnd (375-375), StreamEventToolPlanDelta (376-376), StreamEventToolCallStart (377-377), StreamEventToolCallDelta (378-378), StreamEventToolCallEnd (379-379), StreamEventCitationStart (380-380), StreamEventCitationEnd (381-381), StreamEventMessageEnd (382-382), StreamEventDebug (383-383), CohereChatRequest (14-31), CohereMessage (50-56), NewBlocksContent (105-109), NewStringContent (98-102), CohereImageURL (159-161)
  - core/schemas/responses.go (14): BifrostResponsesResponse (45-84), ResponsesStreamResponseTypeCreated (1362-1362), ResponsesStreamResponseTypeInProgress (1363-1363), ResponsesStreamResponseTypeOutputTextDone (1375-1375), ResponsesStreamResponseTypeContentPartDone (1372-1372), ResponsesMessage (313-326), ResponsesStreamResponseTypeOutputItemDone (1369-1369), ResponsesStreamResponseTypeOutputItemAdded (1368-1368), ResponsesStreamResponseTypeContentPartAdded (1371-1371), ResponsesStreamResponseTypeOutputTextDelta (1374-1374), ResponsesStreamResponseTypeFunctionCallArgumentsDelta (1380-1380), ResponsesStreamResponseTypeFunctionCallArgumentsDone (1381-1381), ResponsesResponseUsage (253-260), ResponsesToolChoice (958-961)
  - core/schemas/utils.go (2): SafeExtractIntPointer (486-494), SafeExtractFromMap (519-525)
- core/schemas/responses.go (2)
  - core/providers/gemini/types.go (1): Type (778-778)
  - ui/lib/types/logs.ts (2): ResponsesMessageContentBlockType (351-358), ResponsesReasoningSummary (411-414)
- framework/streaming/responses.go (4)
  - core/schemas/responses.go (9): ResponsesMessage (313-326), ResponsesMessageTypeReasoning (306-306), ResponsesInputMessageRoleAssistant (331-331), ResponsesReasoning (730-733), ResponsesReasoningSummary (744-747), ResponsesMessageContent (338-343), ResponsesMessageContentBlock (398-410), ResponsesOutputMessageContentTypeReasoning (393-393), ResponsesReasoningContentBlockTypeSummaryText (740-740)
  - core/providers/gemini/types.go (3): Type (778-778), Role (13-13), Content (922-930)
  - core/schemas/utils.go (1): Ptr (16-18)
  - core/utils.go (1): Ptr (51-53)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (3)
core/providers/gemini/responses.go (1)
138-179: Function-call ↔ thought-signature round-trip looks consistent

The new FunctionCall conversion:

- Emits a `ResponsesToolMessage` with `CallID`, `Name`, and stringified `Args`.
- Emits a separate reasoning message carrying `EncryptedContent` from `ThoughtSignature`.

And the reverse path:

- Rebuilds `FunctionCall` from `ResponsesToolMessage` and, if the next message is a `reasoning` message with `EncryptedContent`, attaches it as `ThoughtSignature`.

This is internally consistent and nil-safe; just keep in mind the assumption that the reasoning message immediately follows the function-call message when constructing `messages` elsewhere.

Also applies to: 609-655
transports/bifrost-http/integrations/anthropic.go (1)
73-103: Clarify SSE contract for multi-event Anthropic streaming

When `ToAnthropicResponsesStreamResponse` returns more than one event, the converter now:

- Marshals each event to JSON and concatenates them as a single SSE string (`"event: %s\ndata: %s\n\n"`), and
- Returns `("", combinedContent, nil)`.

This assumes the upstream streaming writer treats a non-empty payload with an empty event name as "already formatted SSE" and writes it verbatim. If the writer instead always wraps `(eventName, data)` into its own SSE envelope, this will double-wrap or drop the event type. Please double-check the `StreamConfig` writer path to ensure:

- `event == ""` is indeed interpreted as "raw SSE payload", and
- It's acceptable to skip individual Anthropic events that fail `sonic.Marshal` rather than failing the whole chunk.

core/schemas/responses.go (1)
StopReasononBifrostResponsesResponse,SignatureonResponsesMessageContentBlock,- the new
ResponsesReasoningSummarytype and updatedResponsesReasoning.Summary,- and
Delta/SignatureonBifrostResponsesStreamResponseare structurally consistent with how the rest of the file models union types and streaming events.
The main follow‑up risk is making sure all converters and helpers (deep copies, provider adapters, streaming accumulators) are updated to propagate
Signatureand the newSummaryshape; some of that is already wired up, but a few helpers still need updates (see streaming/cohere comments).Also applies to: 399-410, 729-747, 1439-1442
d4bfce4 to bf9c361 (Compare)
Actionable comments posted: 2
♻️ Duplicate comments (6)
core/providers/utils/utils.go (1)
267-271: Avoid `MarshalIndent` in hot-path request body marshalling.

`CheckContextAndGetRequestBody` is on the request path for all providers; using `sonic.MarshalIndent` here increases allocations and bloats every request payload with whitespace. Unless a specific upstream API strictly requires pretty-printed JSON, it's better to keep the wire format compact and, if needed, pretty-print only for logging. Consider reverting to `sonic.Marshal`:

```diff
- jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+ jsonBody, err := sonic.Marshal(convertedBody)
```

If pretty JSON is truly required for a given provider, please document that requirement and consider making indentation opt-in rather than the default for all providers.
core/providers/cohere/responses.go (5)
140-172: Avoid returning zero-value content block for invalid image URLWhen
cohereBlock.ImageURL == nil, this returnsschemas.ResponsesMessageContentBlock{}with a zeroType, which can confuse downstream consumers that expect a valid type or no block at all. A small sentinel or text fallback is safer.case CohereContentBlockTypeImage: - // For images, create a text block describing the image - if cohereBlock.ImageURL == nil { - // Skip invalid image blocks without ImageURL - return schemas.ResponsesMessageContentBlock{} - } - return schemas.ResponsesMessageContentBlock{ - Type: schemas.ResponsesInputMessageContentBlockTypeImage, - ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{ - ImageURL: &cohereBlock.ImageURL.URL, - }, - } + if cohereBlock.ImageURL == nil || cohereBlock.ImageURL.URL == "" { + // Return a small text sentinel instead of a zero-value block + return schemas.ResponsesMessageContentBlock{ + Type: schemas.ResponsesInputMessageContentBlockTypeText, + Text: schemas.Ptr("[Image block with missing URL]"), + } + } + return schemas.ResponsesMessageContentBlock{ + Type: schemas.ResponsesInputMessageContentBlockTypeImage, + ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{ + ImageURL: &cohereBlock.ImageURL.URL, + }, + }
174-488: Fix nil-dereference when generating reasoning item IDs in streamingIn the
StreamEventContentStarthandler forCohereContentBlockTypeThinking,state.MessageIDis dereferenced before the nil check:// Generate stable ID for reasoning item itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex) if state.MessageID == nil { itemID = fmt.Sprintf("reasoning_%d", outputIndex) }If
state.MessageIDis nil (e.g., no message_start ID), this will panic.A nil-safe branch avoids the panic:
- // Generate stable ID for reasoning item - itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex) - if state.MessageID == nil { - itemID = fmt.Sprintf("reasoning_%d", outputIndex) - } + // Generate stable ID for reasoning item + var itemID string + if state.MessageID != nil { + itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex) + } else { + itemID = fmt.Sprintf("reasoning_%d", outputIndex) + }The rest of the reasoning/text streaming (reasoning_summary_text.{delta,done}, content_part.{added,done}, output_item.{added,done}) looks coherent and matches the intended OpenAI-style lifecycle.
1127-1146: Tool choice"auto"should not be forced toToolChoiceRequiredThe current mapping forces both
"auto"and unknown strings toToolChoiceRequired, which changes semantics and can unintentionally force tool calls:case "required", "auto", "function": choice := ToolChoiceRequired return &choice default: choice := ToolChoiceRequired return &choiceSafer behavior is to only map explicit
"required"/"function"and"none", letting"auto"(and unknown strings) fall back to Cohere’s default behavior:if toolChoiceString != nil { switch *toolChoiceString { case "none": choice := ToolChoiceNone return &choice - case "required", "auto", "function": + case "required", "function": choice := ToolChoiceRequired - return &choice - default: - choice := ToolChoiceRequired - return &choice + return &choice + case "auto": + // Let Cohere use its default "auto" behavior. + return nil + default: + // Unknown strings: fall back to provider defaults. + return nil } }What are the valid `tool_choice` values and semantics for Cohere's Chat API, and how should `"auto"`, `"none"`, and `"required"` be mapped to the Go SDK enums?
1193-1229: Do not expose encrypted reasoning content as plain text

`convertBifrostReasoningToCohereThinking` currently converts encrypted reasoning into a readable string:

```go
} else if msg.ResponsesReasoning.EncryptedContent != nil {
	encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
	thinkingBlock := CohereContentBlock{
		Type:     CohereContentBlockTypeThinking,
		Thinking: &encryptedText,
	}
	thinkingBlocks = append(thinkingBlocks, thinkingBlock)
}
```

This leaks the encrypted payload in clear form, which defeats the purpose of keeping it opaque when forwarding to another provider. Better to skip encrypted reasoning entirely for Cohere:

```diff
-	} else if msg.ResponsesReasoning.EncryptedContent != nil {
-		// Cohere doesn't have a direct equivalent to encrypted content,
-		// so we'll store it as a regular thinking block with a special marker
-		encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-		thinkingBlock := CohereContentBlock{
-			Type:     CohereContentBlockTypeThinking,
-			Thinking: &encryptedText,
-		}
-		thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+	} else if msg.ResponsesReasoning.EncryptedContent != nil {
+		// Cohere doesn't support encrypted reasoning; skip forwarding it so it remains opaque.
 	}
```

The existing handling of `ContentBlocks` and `Summary` already covers non-encrypted reasoning.
1231-1265: Access CallID via embedded struct to avoid nil-pointer panic

`convertBifrostFunctionCallToCohereMessage` reads `msg.CallID` directly:

```go
if msg.CallID != nil {
	toolCall.ID = msg.CallID
}
```

Because `CallID` is promoted from the embedded `*ResponsesToolMessage`, this will panic if `msg.ResponsesToolMessage` is nil. Guard the embedded pointer explicitly:

```diff
-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
-	}
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
+	}
```

The rest of the function already checks `msg.ResponsesToolMessage != nil` for `Arguments` and `Name`.
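To make the failure mode concrete, here is a minimal, self-contained Go sketch (simplified stand-in types, not the real schemas package) showing that a field promoted from a nil embedded pointer panics when read directly:

```go
package main

import "fmt"

// Simplified stand-ins for the real schema types.
type ResponsesToolMessage struct {
	CallID *string
}

type ResponsesMessage struct {
	*ResponsesToolMessage // embedded pointer; CallID is promoted
}

func main() {
	msg := ResponsesMessage{} // embedded pointer left nil

	// Safe: check the embedded pointer before touching promoted fields.
	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
		fmt.Println("call id:", *msg.CallID)
	} else {
		fmt.Println("no tool message / call id")
	}

	// Unsafe: msg.CallID != nil would panic here, because the promoted
	// access dereferences the nil *ResponsesToolMessage first.
	// _ = msg.CallID // uncommenting this line panics with a nil pointer dereference
}
```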
🧹 Nitpick comments (4)
core/providers/openai/responses.go (1)
57-59: Redundant condition check.

`len(message.ResponsesReasoning.Summary) > 0` is checked twice on lines 57 and 59.

```diff
-	if len(message.ResponsesReasoning.Summary) > 0 &&
-		strings.Contains(bifrostReq.Model, "gpt-oss") &&
-		len(message.ResponsesReasoning.Summary) > 0 &&
+	if len(message.ResponsesReasoning.Summary) > 0 &&
+		strings.Contains(bifrostReq.Model, "gpt-oss") &&
 		message.Content == nil {
```

transports/bifrost-http/integrations/anthropic.go (2)
94-98: Use injected logger instead of standard log package.

Using `log.Printf` directly bypasses the structured logger passed to `NewAnthropicRouter`. This can cause inconsistent logging behavior and lose context in production environments. Consider passing the logger to the converter function or using a closure to capture it. If that's not feasible, at minimum document why `log` is used here.
74-78: Remove or clarify commented-out code blocks.Multiple commented-out code blocks are present. If this code is no longer needed, remove it to reduce confusion. If it's temporarily disabled, add a TODO comment explaining when it should be re-enabled.
Also applies to: 103-117
core/providers/cohere/responses.go (1)
1303-1429: Consider setting assistant role on synthesized reasoning messages

`convertSingleCohereMessageToBifrostMessages` builds a separate reasoning `ResponsesMessage` with populated `ResponsesReasoning` and `ContentBlocks`, but it doesn't set a `Role`. For consistency with other providers and with how reasoning is emitted elsewhere, it's useful to mark these as assistant-originated:

```diff
 if len(reasoningContentBlocks) > 0 {
+	role := schemas.ResponsesInputMessageRoleAssistant
 	reasoningMessage := schemas.ResponsesMessage{
 		ID:   schemas.Ptr("rs_" + fmt.Sprintf("%d", time.Now().UnixNano())),
 		Type: schemas.Ptr(schemas.ResponsesMessageTypeReasoning),
+		Role: &role,
 		ResponsesReasoning: &schemas.ResponsesReasoning{
 			Summary: []schemas.ResponsesReasoningSummary{},
 		},
 		Content: &schemas.ResponsesMessageContent{
 			ContentBlocks: reasoningContentBlocks,
 		},
 	}
```

This is a behavioral refinement rather than a correctness fix, but it will likely make downstream consumers' role-based handling more predictable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
core/providers/anthropic/chat.go (1 hunks), core/providers/anthropic/errors.go (2 hunks), core/providers/anthropic/types.go (4 hunks), core/providers/cohere/responses.go (9 hunks), core/providers/gemini/responses.go (2 hunks), core/providers/openai/openai.go (1 hunks), core/providers/openai/responses.go (2 hunks), core/providers/openai/types.go (2 hunks), core/providers/utils/utils.go (3 hunks), core/schemas/responses.go (5 hunks), framework/streaming/responses.go (2 hunks), transports/bifrost-http/handlers/inference.go (1 hunks), transports/bifrost-http/integrations/anthropic.go (3 hunks), ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/openai/openai.go
- ui/lib/types/logs.ts
- core/providers/anthropic/errors.go
- transports/bifrost-http/handlers/inference.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openai/types.go, core/providers/utils/utils.go, transports/bifrost-http/integrations/anthropic.go, core/providers/gemini/responses.go, core/providers/anthropic/chat.go, core/providers/openai/responses.go, framework/streaming/responses.go, core/providers/anthropic/types.go, core/schemas/responses.go, core/providers/cohere/responses.go
🧬 Code graph analysis (5)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
ResponsesParametersReasoning(233-238)ui/lib/types/logs.ts (1)
ResponsesParametersReasoning(512-519)
transports/bifrost-http/integrations/anthropic.go (1)
core/providers/gemini/types.go (1)
Type(778-778)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
FunctionCall(157-160)ResponsesToolMessage(402-408)ResponsesMessage(422-437)ResponsesMessageContent(399-399)ResponsesReasoning(416-419)ResponsesReasoningSummary(411-414)core/providers/gemini/types.go (4)
FunctionCall(1091-1101)Role(13-13)Content(922-930)Type(778-778)core/schemas/responses.go (5)
ResponsesToolMessage(461-481)ResponsesMessage(313-326)ResponsesMessageContent(338-343)ResponsesReasoning(730-733)ResponsesReasoningSummary(744-747)
core/providers/openai/responses.go (2)
core/schemas/responses.go (4)
ResponsesMessage(313-326)ResponsesReasoning(730-733)ResponsesMessageContentBlock(398-410)ResponsesMessageContent(338-343)core/providers/openai/types.go (2)
OpenAIResponsesRequest(143-152)OpenAIResponsesRequestInput(110-113)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
Type(778-778)ui/lib/types/logs.ts (2)
ResponsesMessageContentBlockType(351-358)ResponsesReasoningSummary(411-414)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (23)
core/schemas/responses.go (4)
68-68: LGTM: StopReason field addition.The comment clearly documents that this field is "Not in OpenAI's spec, but sent by other providers", which provides useful context for maintainers.
399-402: LGTM: Signature field addition to ResponsesMessageContentBlock.The
Signaturefield for reasoning content blocks aligns with the streaming updates inBifrostResponsesStreamResponseand the UI type definitions.
730-747: LGTM: ResponsesReasoning and ResponsesReasoningSummary refactoring.The transition from
[]ResponsesReasoningContentto[]ResponsesReasoningSummarywith explicitTypeandTextfields provides clearer semantics. This aligns with the UI typeResponsesReasoningSummaryinui/lib/types/logs.ts:410-413.
1439-1441: LGTM: Streaming response signature support.Adding
SignaturetoBifrostResponsesStreamResponseenables proper signature propagation during streaming, which is essential for reasoning content integrity.core/providers/anthropic/types.go (2)
135-145: LGTM: RedactedThinking content block type.Adding
AnthropicContentBlockTypeRedactedThinkingenables proper handling of redacted thinking content blocks from Anthropic's API.
153-153: LGTM: Data field for redacted thinking.The
Datafield with clear documentation for encrypted data in redacted thinking blocks is appropriate.core/providers/openai/types.go (1)
154-192: LGTM: Custom marshaling to strip MaxTokens for OpenAI.The implementation correctly strips
MaxTokensfrom reasoning parameters before sending to OpenAI, since OpenAI doesn't support this field (it's Anthropic-specific per the schema documentation). The approach using an alias struct andjson.RawMessagefor preserving customInputmarshaling is sound.core/providers/openai/responses.go (1)
42-85: LGTM: Reasoning content handling for OpenAI models.The logic correctly differentiates between:
- gpt-oss models: which use reasoning_text content blocks
- Other OpenAI models: which use summaries + encrypted_content
The transformation ensures proper format compatibility when sending requests to OpenAI.
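For illustration only, here is a small sketch of the two payload shapes being distinguished; the field names follow the Responses conventions discussed in this PR, but the literal values are made up rather than captured from the converter:

```go
package main

import (
	"encoding/json"
	"os"
)

func main() {
	// gpt-oss style: reasoning text carried inline as reasoning_text content blocks.
	gptOSS := map[string]any{
		"type": "reasoning",
		"content": []map[string]any{
			{"type": "reasoning_text", "text": "First, compute 15% of 200..."},
		},
	}

	// Other OpenAI models: reasoning carried as summaries plus encrypted content.
	standard := map[string]any{
		"type":              "reasoning",
		"summary":           []map[string]any{{"type": "summary_text", "text": "Computed 15% of 200."}},
		"encrypted_content": "opaque-provider-blob",
	}

	enc := json.NewEncoder(os.Stdout)
	enc.SetIndent("", "  ")
	_ = enc.Encode(gptOSS)
	_ = enc.Encode(standard)
}
```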
transports/bifrost-http/integrations/anthropic.go (1)
86-119: LGTM: Multi-event SSE aggregation logic.The streaming response handling correctly aggregates multiple Anthropic events into proper SSE format and handles edge cases (empty responses, single events). The error logging without failing allows the stream to continue processing remaining events.
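As a rough sketch of that aggregation step (the event type and helper below are assumptions, not the actual ToAnthropicResponsesStreamResponse output), folding several converted events into one SSE string looks roughly like this:

```go
package main

import (
	"fmt"
	"strings"
)

// streamEvent is a hypothetical stand-in for one converted Anthropic event.
type streamEvent struct {
	Type string // e.g. "content_block_delta"
	Data string // JSON payload, already marshaled
}

// toSSE folds one or more events into a single SSE frame string,
// emitting "event:" and "data:" lines per event.
func toSSE(events []streamEvent) string {
	var b strings.Builder
	for _, ev := range events {
		fmt.Fprintf(&b, "event: %s\n", ev.Type)
		fmt.Fprintf(&b, "data: %s\n\n", ev.Data)
	}
	return b.String()
}

func main() {
	out := toSSE([]streamEvent{
		{Type: "content_block_start", Data: `{"index":0}`},
		{Type: "content_block_delta", Data: `{"delta":{"text":"hi"}}`},
	})
	fmt.Print(out)
}
```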
framework/streaming/responses.go (3)
497-534: LGTM: ReasoningSummaryTextDelta handling.The implementation correctly:
- Searches for existing reasoning messages by
ItemID(reverse iteration for efficiency)- Creates new reasoning messages when needed with proper initialization
- Handles both text deltas and signature deltas in a single pass
The guard condition on line 500 ensures we have at least one payload and a valid
ItemIDbefore processing.
626-679: LGTM: Reasoning delta accumulation with dual-path logic.

The helper correctly handles two accumulation paths:

- With ContentIndex: accumulates into content blocks as `reasoning_text` type
- Without ContentIndex: accumulates into `ResponsesReasoning.Summary`

The TODO comment on lines 667-668 appropriately notes future enhancement potential for multiple summary entries.
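A compact sketch of the dual-path idea under simplified stand-in types (not the real ResponsesMessage schema):

```go
package main

import "fmt"

// Simplified stand-ins for the reasoning shapes used by the accumulator.
type contentBlock struct {
	Type string
	Text string
}

type reasoningMessage struct {
	Summary []string
	Blocks  []contentBlock
}

// appendReasoningDelta routes a text delta either into an indexed
// reasoning_text content block or into the summary, mirroring the
// dual-path logic described above (shapes are illustrative only).
func appendReasoningDelta(msg *reasoningMessage, delta string, contentIndex *int) {
	if contentIndex != nil {
		// Grow the block slice up to the target index, then append text.
		for len(msg.Blocks) <= *contentIndex {
			msg.Blocks = append(msg.Blocks, contentBlock{Type: "reasoning_text"})
		}
		msg.Blocks[*contentIndex].Text += delta
		return
	}
	// No index: accumulate into a single summary entry.
	if len(msg.Summary) == 0 {
		msg.Summary = append(msg.Summary, "")
	}
	msg.Summary[0] += delta
}

func main() {
	var msg reasoningMessage
	idx := 0
	appendReasoningDelta(&msg, "step 1...", &idx)
	appendReasoningDelta(&msg, "summary so far", nil)
	fmt.Printf("%+v\n", msg)
}
```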
681-727: LGTM: Signature accumulation with proper field mapping.The signature helper correctly maps:
- With ContentIndex →
Signaturefield in content block- Without ContentIndex →
EncryptedContentfield inResponsesReasoningThis aligns with the schema design where
EncryptedContentserves as the signature/encrypted data at the reasoning level.core/providers/anthropic/chat.go (1)
608-634: PartialJSON guard condition now emits empty string deltas.The condition changed from
chunk.Delta.PartialJSON != nil && *chunk.Delta.PartialJSON != ""to justchunk.Delta.PartialJSON != nil. This allows empty string partial JSON to be emitted as deltas. Evidence shows this is intentional:responses.go:3069explicitly creates empty PartialJSON deltas, and the accumulation logic (responses.go:470, 478) safely concatenates even empty strings. Validation of non-empty Arguments is deferred to after accumulation completes (as seen in test utilities validating the final assembled result). This change is safe and maintains streaming consistency.core/providers/gemini/responses.go (2)
138-179: Function-call → tool message + reasoning signature path looks solidThe new
FunctionCallbranch builds a properResponsesToolMessage(with JSON-serialized args) and a separate reasoning message carryingSummary(initialized empty) andEncryptedContentfor the thought signature. This cleanly aligns Gemini function calls with the updatedResponsesReasoningschema and avoids range-variable capture issues.
596-629: Reconstruction of Gemini FunctionCall + ThoughtSignature is consistent with emit sideThe FunctionCall reconstruction from
ResponsesToolMessage(includingCallIDand decodedArguments) and the lookahead-basedThoughtSignatureattachment match howconvertGeminiCandidatesToResponsesOutputemits the function-call + reasoning pair. As long as the reasoning message immediately follows the function-call (which this file enforces), this round-trip is coherent.core/providers/cohere/responses.go (8)
13-25: ReasoningContentIndices tracking and reset look correctAdding
ReasoningContentIndicesintoCohereResponsesStreamState, initializing it in the pool, and clearing it in bothacquireCohereResponsesStreamStateandflushensures per-stream tracking of reasoning content indices without leaking state between streams. No issues here.Also applies to: 29-41, 45-77, 89-118
490-567: Streaming tool plan, tool calls, citations, and lifecycle wiring look consistentThe handling of
StreamEventToolPlanDelta, tool call start/delta/end, citation start/end, andStreamEventMessageEndappears internally consistent:
- Tool plan text is emitted as normal
output_text.deltaon a dedicated output index, with proper close-out events before tool calls.- Tool call arguments are buffered per-output-index and finalized with
function_call_arguments.donefollowed byoutput_item.done.- Citations become
OutputTextAnnotationAdded/OutputTextAnnotationDonewith indices wired viaContentIndexToOutputIndex.- Message end emits a single
response.completedwith aggregated usage and stableCreatedAt.No additional correctness issues stand out beyond the reasoning-ID nil-deref already called out.
Also applies to: 612-735, 735-803, 804-848
850-932: Bifrost → Cohere request conversion is aligned with Responses params
ToCohereResponsesRequestcleanly maps core parameters (MaxOutputTokens,Temperature,TopP,top_k, stop sequences, penalties) and thethinkingextra param into the Cohere request, and converts tools/tool choice/messages via the new helpers. The shape looks correct and side-effect free.
935-975: Cohere → Bifrost response conversion is straightforward
ToBifrostResponsesResponsecorrectly maps the CohereID, computesCreatedAtat receipt time, translates usage (including cached tokens), and usesConvertCohereMessagesToBifrostMessages(..., true)for the output message. This path looks correct; only note is thatModelisn’t propagated here, but that’s consistent with the current implementation.
977-1112: Bidirectional message mapping handles system, reasoning, and tools coherently
ConvertBifrostMessagesToCohereMessagesandConvertCohereMessagesToBifrostMessages:
- Separate system content and prepend it as a single system message to Cohere.
- Accumulate reasoning messages into
CohereContentBlockTypeThinkingand attach them to the next assistant message.- Convert function calls and function outputs into Cohere’s
tool_calls/role:"tool"structures and back.The control flow around
currentAssistantMessageandpendingReasoningContentBlockslooks sound and flushes consistently at boundaries.
1150-1191: System and regular message conversions are straightforward
convertBifrostMessageToCohereSystemContentandconvertBifrostMessageToCohereMessagesimply flatten text blocks and map roles, and only emit content when present. Both look correct and side-effect free.
1267-1301: Function call output → Cohererole:"tool"mapping looks correct
convertBifrostFunctionCallOutputToCohereMessageonly emits a tool message whenCallIDis present, reconstructs content from eitherContentor the OpenAI-styleOutputwrapper, and setsToolCallIDappropriately. This matches the Responses schema and Cohere expectations.
1431-1465: Content block → Cohere block conversion covers key cases
convertResponsesMessageContentBlocksToCoherecorrectly:
- Maps both input and output text types to
CohereContentBlockTypeText.- Converts image URL blocks into
CohereContentBlockTypeImage.- Maps reasoning blocks to
CohereContentBlockTypeThinking.This provides a clean, minimal surface for Cohere without surprising behavior.
4b4a584 to a97e22b (Compare)
bf9c361 to bcef5b4 (Compare)
Actionable comments posted: 2
♻️ Duplicate comments (4)
core/providers/anthropic/types.go (1)
356-356: This StopSequence change was already flagged in a previous review.

The change from `*string` with `omitempty` to `string` without `omitempty` breaks API compatibility, as previously noted. Empty strings will serialize as `"stop_sequence": ""` instead of being omitted or representing null.

core/providers/cohere/responses.go (3)
1238-1256: Guard access to promoted CallID field to prevent panics

On line 1244, `msg.CallID` is accessed directly, but `CallID` is promoted from the embedded `*ResponsesToolMessage`. If `msg.ResponsesToolMessage` is nil, accessing `msg.CallID` will panic even inside the `if` condition. Apply this fix:

```diff
-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
 	}
```

The same pattern should be applied to any other accesses of promoted fields from `msg.ResponsesToolMessage` throughout the function.
319-327: Critical: Nil-pointer dereference on state.MessageID

The code dereferences `*state.MessageID` on line 324 before checking if it's nil on line 325, which will cause a panic. Apply this fix:

```diff
-	// Generate stable ID for reasoning item
-	itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
-	if state.MessageID == nil {
-		itemID = fmt.Sprintf("reasoning_%d", outputIndex)
-	}
+	// Generate stable ID for reasoning item
+	var itemID string
+	if state.MessageID != nil {
+		itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
+	} else {
+		itemID = fmt.Sprintf("reasoning_%d", outputIndex)
+	}
```

Note: This issue was flagged in a previous review but appears to remain unaddressed in the current code.
1131-1142: "auto" tool choice should map to ToolChoiceAuto, not ToolChoiceRequired

The "auto" tool choice is incorrectly mapped to `ToolChoiceRequired`. Cohere's API supports three tool choice modes: `NONE`, `REQUIRED`, and `AUTO` (defined as constants in core/providers/cohere/types.go). Mapping "auto" to `ToolChoiceRequired` changes semantics: `auto` allows the model to decide whether to call a tool, while `required` forces a tool call.

```diff
-	case "required", "auto", "function":
-		choice := ToolChoiceRequired
-		return &choice
+	case "required", "function":
+		choice := ToolChoiceRequired
+		return &choice
+	case "auto":
+		choice := ToolChoiceAuto
+		return &choice
```
🧹 Nitpick comments (4)
core/providers/vertex/errors.go (1)
14-28: Centralized body decoding and error classification look correct

Using `providerUtils.CheckAndDecodeBody` and switching all `sonic.Unmarshal` calls to `decodedBody` is a solid improvement: it handles content-encoding consistently and cleanly separates decode failures (`ErrProviderResponseDecode`) from JSON shape issues (`ErrProviderResponseUnmarshal`). The fallback chain for OpenAI/Vertex/VertexValidation error formats remains intact and behaviorally equivalent apart from the improved error typing. I don't see new correctness or panic risks here; this aligns well with the shared decoding utils used in other providers.
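For readers unfamiliar with the pattern, here is a generic decode-then-parse sketch; the helper below is an illustrative stand-in, not the real CheckAndDecodeBody signature:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
)

// decodeBody is an illustrative stand-in: it gunzips the payload when the
// content encoding says so, otherwise returns it unchanged.
func decodeBody(body []byte, contentEncoding string) ([]byte, error) {
	if contentEncoding != "gzip" {
		return body, nil
	}
	zr, err := gzip.NewReader(bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// Simulate a gzip-compressed provider error response.
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(`{"error":{"message":"rate limited"}}`))
	zw.Close()

	decoded, err := decodeBody(buf.Bytes(), "gzip")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	var parsed map[string]any
	if err := json.Unmarshal(decoded, &parsed); err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(parsed)
}
```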
ui/app/workspace/logs/views/logResponsesMessageView.tsx (1)

202-204: Use strict equality (===) for type comparison.

The guard logic is correct, but TypeScript/JavaScript best practice is to use strict equality `===` instead of loose equality `==` for type comparisons.

```diff
-	if (message.type == "reasoning" && (!message.summary || message.summary.length === 0) && !message.encrypted_content) {
+	if (message.type === "reasoning" && (!message.summary || message.summary.length === 0) && !message.encrypted_content) {
 		return null;
 	}
```

core/schemas/responses.go (1)
731-733: Consider using a pointer or omitempty behavior for Summary slice.

The `Summary` field is a non-pointer slice without `omitempty`. In Go, an empty slice `[]ResponsesReasoningSummary{}` will serialize as `"summary": []` rather than being omitted. If the intent is to omit the field when empty (consistent with the UI guard checking `message.summary.length === 0`), consider adding `omitempty`.

```diff
 type ResponsesReasoning struct {
-	Summary          []ResponsesReasoningSummary `json:"summary"`
+	Summary          []ResponsesReasoningSummary `json:"summary,omitempty"`
 	EncryptedContent *string                     `json:"encrypted_content,omitempty"`
 }
```

transports/bifrost-http/integrations/anthropic.go (1)
111-113: Use the structured logger instead of log.Printf.

The router receives a `schemas.Logger` parameter (as seen in `NewAnthropicRouter`), but this error logging uses the standard library's `log.Printf`. For consistency with the codebase's logging practices, use the structured logger. Consider passing the logger to the stream converter or using a context-aware logging approach. If the logger isn't accessible in this closure, the error could be returned or the design adjusted to provide logger access.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
core/providers/anthropic/chat.go (1 hunks), core/providers/anthropic/errors.go (2 hunks), core/providers/anthropic/types.go (4 hunks), core/providers/cohere/responses.go (9 hunks), core/providers/gemini/responses.go (2 hunks), core/providers/openai/responses.go (2 hunks), core/providers/openai/types.go (2 hunks), core/providers/utils/utils.go (4 hunks), core/providers/vertex/errors.go (1 hunks), core/schemas/bifrost.go (1 hunks), core/schemas/responses.go (5 hunks), framework/streaming/responses.go (2 hunks), transports/bifrost-http/handlers/inference.go (1 hunks), transports/bifrost-http/integrations/anthropic.go (4 hunks), transports/bifrost-http/integrations/router.go (3 hunks), ui/app/workspace/logs/views/logResponsesMessageView.tsx (1 hunks), ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks), ui/lib/types/logs.ts (2 hunks)
💤 Files with no reviewable changes (1)
- ui/app/workspace/logs/views/logResponsesOutputView.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
- transports/bifrost-http/handlers/inference.go
- core/providers/anthropic/chat.go
- core/providers/anthropic/errors.go
- core/providers/utils/utils.go
- ui/lib/types/logs.ts
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
ui/app/workspace/logs/views/logResponsesMessageView.tsx, core/schemas/bifrost.go, core/providers/gemini/responses.go, core/providers/vertex/errors.go, core/schemas/responses.go, transports/bifrost-http/integrations/anthropic.go, transports/bifrost-http/integrations/router.go, core/providers/openai/types.go, core/providers/openai/responses.go, core/providers/anthropic/types.go, core/providers/cohere/responses.go, framework/streaming/responses.go
🧬 Code graph analysis (7)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
FunctionCall(157-160)ResponsesToolMessage(403-409)ResponsesMessage(423-438)ResponsesMessageContent(400-400)ResponsesReasoning(417-420)ResponsesReasoningSummary(412-415)core/providers/gemini/types.go (4)
FunctionCall(1091-1101)Role(13-13)Content(922-930)Type(778-778)core/schemas/responses.go (5)
ResponsesToolMessage(461-481)ResponsesMessage(313-326)ResponsesMessageContent(338-343)ResponsesReasoning(730-733)ResponsesReasoningSummary(744-747)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
CheckAndDecodeBody(467-475)NewBifrostOperationError(493-504)core/schemas/provider.go (1)
ErrProviderResponseDecode(29-29)core/providers/vertex/vertex.go (1)
VertexError(25-31)core/providers/vertex/types.go (1)
VertexValidationError(154-161)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
Type(778-778)ui/lib/types/logs.ts (2)
ResponsesMessageContentBlockType(352-359)ResponsesReasoningSummary(412-415)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (6)
Anthropic(37-37)Vertex(40-40)BifrostContextKeyUseRawRequestBody(117-117)BifrostContextKeyExtraHeaders(115-115)BifrostContextKeyURLPath(116-116)BifrostContextKeySkipKeySelection(114-114)core/schemas/utils.go (1)
IsAnthropicModel(1043-1045)core/providers/anthropic/responses.go (1)
ToAnthropicResponsesStreamResponse(834-1232)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
BifrostContextKeyIntegrationType(120-120)core/providers/gemini/types.go (1)
Type(778-778)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
ResponsesParametersReasoning(233-238)ui/lib/types/logs.ts (1)
ResponsesParametersReasoning(513-520)
core/providers/openai/responses.go (2)
core/schemas/responses.go (4)
ResponsesMessage(313-326)ResponsesReasoning(730-733)ResponsesMessageContentBlock(398-410)ResponsesMessageContent(338-343)core/providers/openai/types.go (2)
OpenAIResponsesRequest(143-152)OpenAIResponsesRequestInput(110-113)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (21)
core/schemas/responses.go (4)
743-747: Type rename fromResponsesReasoningContenttoResponsesReasoningSummarylooks correct.The rename aligns with the UI type definition and the semantic meaning of the struct. This is a breaking change for any external consumers using the old type name.
68-68: LGTM - StopReason field addition.The
StopReasonfield appropriately accommodates providers that include stop reason information outside OpenAI's spec.
399-402: LGTM - Signature field addition to content blocks.The
Signaturefield enables per-content signatures for reasoning blocks, consistent with the streaming response structure.
1439-1441: LGTM - Streaming response signature support.The
Signaturefield addition aligns with the content block changes and properly documents that it's "Not in OpenAI's spec, but sent by other providers."core/providers/anthropic/types.go (4)
135-145: LGTM - Addedredacted_thinkingcontent block type.The new constant appropriately extends the enum to support Anthropic's redacted/encrypted thinking content blocks.
153-153: LGTM - Data field for redacted thinking.The
Datafield properly supports encrypted data for redacted thinking blocks, with the comment clarifying that signature does not accompany this field.
350-350: LGTM - Type field made optional.Adding
omitemptyto theTypefield inAnthropicStreamDeltais appropriate as the type may not always be present in delta events.
300-305: The concern aboutomitemptyon cache token fields is technically invalid;omitemptyhas no effect oninttypes in Go.Per Go's encoding/json specification,
omitemptyonly affects pointers, slices, maps, interfaces, and strings—not integer types. Integer zero values are always serialized as0regardless of the tag. The actual change (using a value type instead of pointer forCacheCreation) aligns with Anthropic's API contract, which always includes thecache_creationobject in usage responses, even when empty. The code correctly deserializes and handles these fields per the documented Anthropic API behavior.Likely an incorrect or invalid review comment.
transports/bifrost-http/integrations/anthropic.go (3)
74-81: LGTM - Extended provider check for Vertex with Anthropic models.The condition correctly handles the case where Vertex is the provider but the model is an Anthropic/Claude model, using the
IsAnthropicModelutility function.
106-122: LGTM - SSE aggregation for multiple streaming events.The logic correctly handles the case where
ToAnthropicResponsesStreamResponsereturns multiple events by combining them into a properly formatted SSE string withevent:anddata:lines.
194-206: LGTM - Refined passthrough gating.The updated logic properly:
- Only sets raw request body for Anthropic or unspecified providers
- Conditionally attaches extra headers/URL path only when not using Anthropic API key auth
core/providers/openai/types.go (1)
154-192: Well-structured custom marshaling implementation.

The approach correctly shadows the embedded fields to customize JSON output. The implementation properly:

- Marshals Input first using its custom MarshalJSON method
- Wraps it in `json.RawMessage` to preserve the marshaled output
- Copies Reasoning with `MaxTokens` set to nil

This is correct for the OpenAI Responses API, which does not include a `max_tokens` field in the reasoning parameter. Token limiting is controlled at the request level via `max_output_tokens`, not within the reasoning configuration. The implementation correctly omits this field by setting it to nil.
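A generic sketch of the alias-struct marshaling pattern described here, using made-up field names rather than the actual OpenAIResponsesRequest type:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical reasoning params: MaxTokens is provider-specific and must
// not reach OpenAI.
type reasoningParams struct {
	Effort    string `json:"effort,omitempty"`
	MaxTokens *int   `json:"max_tokens,omitempty"`
}

type request struct {
	Model     string           `json:"model"`
	Reasoning *reasoningParams `json:"reasoning,omitempty"`
}

// MarshalJSON uses an alias type to avoid recursion, then clears the
// unsupported field on a copy so the caller's struct is untouched.
func (r request) MarshalJSON() ([]byte, error) {
	type alias request
	out := alias(r)
	if out.Reasoning != nil {
		cp := *out.Reasoning
		cp.MaxTokens = nil
		out.Reasoning = &cp
	}
	return json.Marshal(out)
}

func main() {
	mt := 2048
	b, _ := json.Marshal(request{Model: "o4-mini", Reasoning: &reasoningParams{Effort: "high", MaxTokens: &mt}})
	fmt.Println(string(b)) // max_tokens is omitted
}
```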
core/schemas/bifrost.go (1)

120-120: LGTM!

The addition of `BifrostContextKeyIntegrationType` follows the existing pattern for context keys and is used appropriately in the router to store integration type information.

transports/bifrost-http/integrations/router.go (3)
312-313: LGTM!Setting the integration type in the context is clean and follows the established pattern for storing request metadata.
709-712: LGTM!

The updated `shouldSendDoneMarker` logic correctly distinguishes between providers that expect `[DONE]` markers and those that don't (Anthropic and the responses API).
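A minimal sketch of that gating decision, with illustrative route/type names rather than the real RouteConfig values:

```go
package main

import (
	"fmt"
	"strings"
)

// shouldSendDoneMarker mirrors the gating described above: OpenAI-style
// chat streams end with a "[DONE]" marker, while Anthropic-style routes
// and the Responses API do not. Route/type names here are illustrative.
func shouldSendDoneMarker(routeType, path string) bool {
	if routeType == "anthropic" || strings.Contains(path, "/responses") {
		return false
	}
	return true
}

func main() {
	fmt.Println(shouldSendDoneMarker("openai", "/v1/chat/completions")) // true
	fmt.Println(shouldSendDoneMarker("anthropic", "/v1/messages"))      // false
	fmt.Println(shouldSendDoneMarker("openai", "/v1/responses"))        // false
}
```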
883-883: LGTM!Expanding the SSE string check to allow both
"data: "and"event: "prefixes properly supports providers like Anthropic that use custom event types in their SSE format.framework/streaming/responses.go (3)
498-534: LGTM!The new
ReasoningSummaryTextDeltahandling correctly creates or finds reasoning messages and delegates to the new helper functions for accumulation. The logic to find existing messages by ItemID is sound.
626-679: LGTM!The
appendReasoningDeltaToResponsesMessagehelper correctly handles both content-block-based reasoning (with ContentIndex) and summary-based reasoning (without ContentIndex). The array bounds checks and initialization logic are appropriate.
681-727: LGTM!The
appendReasoningSignatureToResponsesMessagehelper mirrors the delta logic and correctly handles signatures in both content blocks and encrypted content. The implementation is consistent with the delta handler.core/providers/gemini/responses.go (2)
138-179: LGTM!The function call handling improvements include:
- Proper JSON marshaling of function arguments
- Creating local copies to avoid range loop variable capture issues
- Correctly initializing the new
Summaryfield when emitting reasoning messages for ThoughtSignatureThese changes align with the broader schema updates for reasoning summaries.
609-629: LGTM!The conversion logic correctly:
- Sets the function call name and arguments
- Propagates the CallID when present
- Preserves ThoughtSignature by looking ahead for reasoning messages with encrypted content
This properly handles Gemini 3 Pro's requirement for ThoughtSignature on function calls.
bcef5b4 to 4f289b9 (Compare)
4ab2a0a to 10060d1 (Compare)
4f289b9 to b3244b9 (Compare)
10060d1 to d6466cb (Compare)
b3244b9 to e04023a (Compare)
Actionable comments posted: 2
♻️ Duplicate comments (9)
core/providers/utils/utils.go (2)
267-267: Past concern remains unaddressed: MarshalIndent increases payload size.

A previous review comment already identified that switching to `sonic.MarshalIndent` increases the payload size for all provider API requests without documented justification. The code still uses the indented format, and the concern about debug prints and production performance remains valid.

1000-1009: Past suggestions remain unaddressed: Input validation and efficiency improvements.

A previous review comment already provided detailed suggestions for improving this function, including:

- Adding length validation to prevent panics for `length <= 0`
- Using a `const` string for letters instead of recreating `[]rune` each call
- Building the result into a `[]byte` buffer instead of `[]rune`

The function works correctly for its current use case (generating cosmetic identifiers), but these improvements would harden it for broader use.
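A small sketch incorporating those suggested hardening steps (length guard, const alphabet, byte buffer); this is not the current GetRandomString implementation, and it is still not suitable for secrets:

```go
package main

import (
	"fmt"
	"math/rand"
)

const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// randomString guards non-positive lengths and builds into a byte buffer
// instead of re-allocating a rune slice on every call.
func randomString(length int) string {
	if length <= 0 {
		return ""
	}
	b := make([]byte, length)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println(randomString(12))
	fmt.Println(randomString(0)) // empty, no panic
}
```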
core/providers/anthropic/types.go (1)
356-356: StopSequence should use *string with omitempty for API compatibility.

This concern was raised in a previous review. Changing `StopSequence` from `*string` with `omitempty` to `string` without `omitempty` breaks compatibility with Anthropic's API specification. The API returns `stop_sequence` as either `null` (in initial streaming events) or a string value. Using a non-pointer `string` will serialize empty strings as `"stop_sequence": ""` instead of properly representing the null state. Apply this diff to restore API compatibility:

```diff
-	StopSequence string  `json:"stop_sequence"`
+	StopSequence *string `json:"stop_sequence,omitempty"`
```

transports/bifrost-http/integrations/anthropic.go (1)
94-105: Remove commented-out code.

This dead code was flagged in a previous review. Remove it to improve maintainability.

```diff
 } else {
-	// if resp.ExtraFields.Provider == schemas.Anthropic ||
-	// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-	// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-	// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-	// 	if resp.ExtraFields.RawResponse != nil {
-	// 		var rawResponseJSON anthropic.AnthropicStreamDelta
-	// 		err := sonic.Unmarshal([]byte(resp.ExtraFields.RawResponse.(string)), &rawResponseJSON)
-	// 		if err == nil {
-	// 			return string(rawResponseJSON.Type), resp.ExtraFields.RawResponse, nil
-	// 		}
-	// 	}
-	// }
 	if len(anthropicResponse) > 1 {
```

core/providers/cohere/responses.go (5)
140-172: Empty block returned for invalid image may cause downstream issues.When
ImageURLis nil (line 150-152), an emptyResponsesMessageContentBlock{}with zero-valueTypeis returned. This was flagged in a previous review but the current fix returns an empty block instead of a sentinel value.Consider returning a properly typed block or filtering at the call site.
319-327: Nil-pointer dereference risk in reasoning item ID generation.Line 324 dereferences
*state.MessageIDbefore the nil check on line 325. This was flagged in a previous review and remains unaddressed.- itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex) - if state.MessageID == nil { - itemID = fmt.Sprintf("reasoning_%d", outputIndex) - } + var itemID string + if state.MessageID != nil { + itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex) + } else { + itemID = fmt.Sprintf("reasoning_%d", outputIndex) + }
1131-1142: Tool choice "auto" incorrectly maps to "required".This was flagged in a previous review. The
"auto"tool choice has different semantics than"required"- auto lets the model decide, while required forces a tool call.Verify Cohere's tool choice options and map
"auto"appropriately (possibly tonilfor default behavior).
1216-1225: Encrypted reasoning content exposed in plain text marker.This was flagged in a previous review. Embedding encrypted content in a
[ENCRYPTED_REASONING: ...]marker exposes potentially sensitive data in plain text to Cohere.Consider skipping encrypted content entirely rather than exposing it.
1244-1246: Guard access to embedded CallID to avoid nil panic.Accessing
msg.CallIDwhenmsg.ResponsesToolMessageis nil will panic becauseCallIDis a field on the embedded pointer type.- if msg.CallID != nil { - toolCall.ID = msg.CallID - } + if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil { + toolCall.ID = msg.ResponsesToolMessage.CallID + }
🧹 Nitpick comments (2)
transports/bifrost-http/integrations/router.go (1)
709-712: Consider tightening the/responsespath check.The
strings.Contains(config.Path, "/responses")check is somewhat broad and could match unintended paths (e.g., a hypothetical/api/user_responses). Consider using a more specific check:- if config.Type == RouteConfigTypeAnthropic || strings.Contains(config.Path, "/responses") { + if config.Type == RouteConfigTypeAnthropic || strings.HasSuffix(config.Path, "/responses") || strings.Contains(config.Path, "/responses/") { shouldSendDoneMarker = false }Alternatively, you could add a dedicated flag to
StreamConfigto explicitly control DONE marker behavior.core/providers/gemini/responses.go (1)
143-145: Inconsistent JSON library usage.The code uses
json.Marshalhere while the rest of the file usessonicfor JSON operations. This inconsistency could lead to subtle serialization differences.- if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil { + if argsBytes, err := sonic.Marshal(part.FunctionCall.Args); err == nil { argumentsStr = string(argsBytes) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
core/providers/anthropic/chat.go (1 hunks), core/providers/anthropic/errors.go (2 hunks), core/providers/anthropic/types.go (4 hunks), core/providers/cohere/responses.go (9 hunks), core/providers/gemini/responses.go (2 hunks), core/providers/openai/responses.go (2 hunks), core/providers/openai/types.go (2 hunks), core/providers/utils/utils.go (4 hunks), core/providers/vertex/errors.go (1 hunks), core/schemas/bifrost.go (1 hunks), core/schemas/responses.go (5 hunks), framework/streaming/responses.go (2 hunks), transports/bifrost-http/handlers/inference.go (1 hunks), transports/bifrost-http/integrations/anthropic.go (4 hunks), transports/bifrost-http/integrations/router.go (3 hunks), ui/app/workspace/logs/views/logResponsesMessageView.tsx (1 hunks), ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks), ui/lib/types/logs.ts (2 hunks)
💤 Files with no reviewable changes (1)
- ui/app/workspace/logs/views/logResponsesOutputView.tsx
🚧 Files skipped from review as they are similar to previous changes (6)
- core/providers/vertex/errors.go
- core/providers/openai/responses.go
- ui/app/workspace/logs/views/logResponsesMessageView.tsx
- core/providers/anthropic/errors.go
- ui/lib/types/logs.ts
- transports/bifrost-http/handlers/inference.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/schemas/bifrost.go, core/schemas/responses.go, core/providers/gemini/responses.go, core/providers/anthropic/chat.go, transports/bifrost-http/integrations/anthropic.go, core/providers/anthropic/types.go, core/providers/openai/types.go, transports/bifrost-http/integrations/router.go, framework/streaming/responses.go, core/providers/cohere/responses.go, core/providers/utils/utils.go
🧬 Code graph analysis (7)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
Type(778-778)ui/lib/types/logs.ts (2)
ResponsesMessageContentBlockType(352-359)ResponsesReasoningSummary(412-415)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (5)
FunctionCall(1091-1101)Role(13-13)Content(922-930)Type(778-778)Part(936-960)core/schemas/responses.go (5)
ResponsesToolMessage(461-481)ResponsesMessage(313-326)ResponsesMessageContent(338-343)ResponsesReasoning(730-733)ResponsesReasoningSummary(744-747)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (5)
Anthropic(37-37)Vertex(40-40)BifrostContextKeyUseRawRequestBody(117-117)BifrostContextKeyExtraHeaders(115-115)BifrostContextKeyURLPath(116-116)core/schemas/utils.go (1)
IsAnthropicModel(1043-1045)core/providers/anthropic/responses.go (1)
ToAnthropicResponsesStreamResponse(794-1192)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
ResponsesParametersReasoning(233-238)ui/lib/types/logs.ts (1)
ResponsesParametersReasoning(513-520)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
BifrostContextKeyIntegrationType(120-120)core/providers/gemini/types.go (1)
Type(778-778)
framework/streaming/responses.go (3)
core/schemas/responses.go (6)
ResponsesStreamResponseTypeReasoningSummaryTextDelta(1392-1392)ResponsesMessage(313-326)ResponsesReasoning(730-733)ResponsesReasoningSummary(744-747)ResponsesMessageContent(338-343)ResponsesMessageContentBlock(398-410)core/schemas/utils.go (1)
Ptr(16-18)framework/streaming/accumulator.go (1)
Accumulator(14-30)
core/providers/utils/utils.go (1)
core/schemas/bifrost.go (2)
BifrostError(356-365)ErrorField(374-381)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (26)
core/providers/utils/utils.go (2)
10-10: LGTM: Import supports the new utility function.The
math/randimport is required for theGetRandomStringfunction added below. Since the function generates cosmetic identifiers (not security-sensitive tokens), this import is appropriate.
322-334: LGTM: Good defensive addition for compressed error responses.Adding
CheckAndDecodeBodybefore unmarshaling ensures that compressed (e.g., gzip) error responses from provider APIs are properly decoded. The error handling correctly returns aBifrostErrorwith status code if decoding fails, and the decoded body is appropriately used for subsequent unmarshaling.core/providers/anthropic/types.go (1)
135-145: LGTM - New content block type for redacted thinking.The addition of
AnthropicContentBlockTypeRedactedThinkingaligns with Anthropic's extended thinking feature where thinking blocks may be redacted. TheDatafield addition on line 153 properly supports encrypted data for these redacted thinking blocks.core/schemas/bifrost.go (1)
120-120: LGTM - New context key for integration type.The new
BifrostContextKeyIntegrationTypeconstant enables routing logic to identify the integration type (OpenAI, Anthropic, etc.) in the request context. This supports the conditional DONE marker behavior inhandleStreaming.core/providers/openai/types.go (1)
154-192: LGTM - Custom marshaling excludesMaxTokensfor OpenAI.The
MarshalJSONimplementation correctly stripsMaxTokensfrom theReasoningfield before serialization, as OpenAI's Responses API doesn't support this parameter (it's Anthropic-specific per the schema comment). The approach of manually copying fields ensures the original request remains unchanged.Note: If
ResponsesParametersReasoninggains new fields in the future, this method will need to be updated to copy them as well.transports/bifrost-http/integrations/router.go (2)
312-313: LGTM - Integration type stored in context.Setting the integration type in the context enables downstream logic to conditionally handle provider-specific behaviors like DONE marker emission.
883-885: LGTM - Extended SSE prefix handling.The condition now correctly handles both
"data: "and"event: "prefixed strings, allowing providers that return complete SSE-formatted strings to pass through without double-wrapping.transports/bifrost-http/integrations/anthropic.go (3)
74-82: LGTM - Extended provider check for Vertex with Anthropic models.The condition now correctly handles both direct Anthropic requests and Vertex requests using Anthropic models (claude-*), returning raw responses when available.
106-122: LGTM - Multi-event aggregation for streaming responses.The logic correctly handles cases where
ToAnthropicResponsesStreamResponsereturns multiple events by aggregating them into a single SSE-formatted string with properevent:anddata:lines. Single events are returned directly for more efficient handling.
194-206: Empty provider assumption and OAuth key skipping are correct.The code's assumption that
provider == ""means Anthropic passthrough is reasonable given this is the/anthropic/v1/messagesendpoint. TheBifrostContextKeySkipKeySelectionflag is intentionally set for OAuth flows (detected by the Bearersk-ant-oat*token inisAnthropicAPIKeyAuth), not API key auth. Anthropic is in the allowed list for key skipping (unlike Azure, Bedrock, and Vertex), so passing an empty key to the provider for OAuth flows is the intended behavior and is properly guarded.core/providers/anthropic/chat.go (1)
608-634: No issues found with emptyPartialJSONhandling.The change to emit tool input deltas whenever
PartialJSONis non-nil is safe. Downstream code inframework/streaming/accumulator.goexplicitly handles empty string Arguments through string concatenation (line 267), which safely accumulates empty strings without issues. The accumulator also includes special handling for edge cases like empty braces (line 247-248), confirming the code is prepared for empty Arguments values during streaming aggregation.core/providers/gemini/responses.go (3)
148-164: LGTM - Good defensive copy pattern.The code correctly creates local copies of
functionCallIDandfunctionCallNameto avoid potential issues with range variable capture when these values are used in pointers.
166-179: ThoughtSignature preservation for Gemini 3 Pro looks correct.The logic to emit a separate
ResponsesReasoningmessage whenThoughtSignatureis present ensures the signature can be round-tripped. TheSummaryfield is correctly initialized as an empty slice.
619-627: The look-ahead logic is correct; the reasoning message is always emitted immediately after the function call.In
convertGeminiCandidatesToResponsesOutput, when a function call part with aThoughtSignatureis processed, the reasoning message is appended directly after the function call message within the same case block (lines 167–178). There is no opportunity for intervening messages between them, as the loop processes individual parts and appends complete function-call-plus-reasoning pairs sequentially to themessagesarray.framework/streaming/responses.go (3)
497-534: LGTM - Well-structured reasoning delta handling.The new
ReasoningSummaryTextDeltacase correctly:
- Guards against nil
Delta/SignaturewithItemIDcheck- Searches backwards for existing message by ID
- Creates new reasoning message if not found
- Handles both text delta and signature delta
626-679: Clear dual-path logic for reasoning delta accumulation.The helper correctly branches on
contentIndex:
- With index: accumulates into content blocks (reasoning_text type)
- Without index: accumulates into
ResponsesReasoning.SummaryThe comment on lines 667-668 acknowledges future extensibility for multiple summary entries.
681-727: Signature helper mirrors delta helper pattern.The
appendReasoningSignatureToResponsesMessagefollows the same dual-path logic as the delta helper, storing signatures either in content blocks (Signaturefield) or inResponsesReasoning.EncryptedContent. This is consistent with the schema design.core/schemas/responses.go (4)
68-68: LGTM - StopReason field addition.The
StopReasonfield is properly documented as not part of OpenAI's spec but needed for other providers. Theomitemptytag ensures it won't appear in responses when not set.
399-402: Signature field enables reasoning content signing.The new
Signaturefield onResponsesMessageContentBlocksupports the reasoning signature streaming feature added in the streaming layer. The field ordering and JSON tag are correct.
729-747: ResponsesReasoning schema update aligns with streaming changes.The
Summaryfield now uses[]ResponsesReasoningSummarywith the new struct definition. This aligns with:
- The streaming helper that appends to
Summary[0].Text- The UI type definition (
ResponsesReasoningSummarywithtype: "summary_text")- The Gemini conversion that initializes
Summary: []schemas.ResponsesReasoningSummary{}
1439-1441: Stream response Signature field added.The
Signaturefield onBifrostResponsesStreamResponseenables streaming reasoning signatures, used by theappendReasoningSignatureToResponsesMessagehelper. The comment correctly notes this is not in OpenAI's spec.core/providers/cohere/responses.go (5)
17-17: LGTM - ReasoningContentIndices state tracking.The new
ReasoningContentIndicesmap correctly tracks which content indices are reasoning blocks, enabling proper event emission (reasoning vs text) during streaming. Initialization and cleanup follow the established pattern for other state maps.Also applies to: 34-34, 64-68, 106-110
850-932: ToCohereResponsesRequest implementation looks correct.The conversion properly:
- Maps basic parameters (MaxOutputTokens, Temperature, TopP)
- Extracts extra params (top_k, stop, frequency_penalty, presence_penalty, thinking)
- Converts tools and tool choice
- Delegates message conversion to
ConvertBifrostMessagesToCohereMessages
977-1112: ConvertBifrostMessagesToCohereMessages handles complex message flows.The function correctly:
- Collects system messages separately
- Tracks pending reasoning blocks to attach to assistant messages
- Handles function calls and outputs
- Flushes pending state at message boundaries
The logic for accumulating reasoning blocks before assistant content is particularly well-structured.
1303-1429: convertSingleCohereMessageToBifrostMessages comprehensive conversion.The function properly:
- Separates reasoning blocks from regular content
- Prepends reasoning message to output
- Handles tool calls with nil safety checks (lines 1389-1392)
- Generates stable IDs using timestamps
505-511: Same nil-dereference pattern in tool plan ID generation.Lines 507-511 have the same issue - dereferencing
state.MessageIDbefore checking for nil.- var itemID string - if state.MessageID == nil { - itemID = fmt.Sprintf("item_%d", outputIndex) - } else { - itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex) - } + var itemID string + if state.MessageID != nil { + itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex) + } else { + itemID = fmt.Sprintf("item_%d", outputIndex) + }Likely an incorrect or invalid review comment.
e04023a to bb6c8dc (Compare)
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
core/providers/openai/responses.go (1)
109-137: Add a comment linking to OpenAI's API documentation as the source of truth for supported tools.The
supportedTypesallowlist is accurate (all included tools are supported by the OpenAI Responses API), but lacks documentation. Add a comment with a link to OpenAI's official tools documentation to clarify the source of this list and make it easier to update when OpenAI adds or removes tool support. This also addresses the concern about centralizing the allowlist—if this list is part of a broader tool support matrix across providers, consider extracting it to a shared configuration.transports/bifrost-http/integrations/router.go (1)
879-886: SSE “already formatted” detection is too strict; acceptdata:/event:without requiring a space.
Right now (Line 883-885) a valid SSE line starting withdata:(no space) would be wrapped again, producing malformed output.- if !strings.HasPrefix(sseString, "data: ") && !strings.HasPrefix(sseString, "event: ") { + if !strings.HasPrefix(sseString, "data:") && !strings.HasPrefix(sseString, "event:") { sseString = fmt.Sprintf("data: %s\n\n", sseString) }transports/bifrost-http/handlers/inference.go (1)
91-118: Add missing known fields (notably "user") to responsesParamsKnownFields to avoid polluting ExtraParams.

`schemas.ResponsesParameters` includes `User`, but "user" isn't listed here, so it can end up duplicated into `ExtraParams` and potentially forwarded as an unknown provider param.

```diff
 var responsesParamsKnownFields = map[string]bool{
 	@@
 	"tool_choice": true,
 	"tools":       true,
 	"truncation":  true,
+	"user":        true,
 }
```
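A simplified sketch of how such an allowlist splits a raw request into known params and extra params; the map contents and helper below are illustrative, not the actual handler code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// knownFields is an illustrative allowlist like responsesParamsKnownFields;
// anything outside it is treated as a provider-specific extra param.
var knownFields = map[string]bool{
	"model":       true,
	"temperature": true,
	"tools":       true,
	"user":        true, // the field the comment suggests adding
}

func splitParams(raw []byte) (known, extra map[string]json.RawMessage, err error) {
	var all map[string]json.RawMessage
	if err = json.Unmarshal(raw, &all); err != nil {
		return nil, nil, err
	}
	known = map[string]json.RawMessage{}
	extra = map[string]json.RawMessage{}
	for k, v := range all {
		if knownFields[k] {
			known[k] = v
		} else {
			extra[k] = v
		}
	}
	return known, extra, nil
}

func main() {
	known, extra, _ := splitParams([]byte(`{"model":"gpt-4o","user":"u-1","top_k":40}`))
	fmt.Println("known:", len(known), "extra:", len(extra)) // "user" stays out of extra
}
```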
core/providers/bedrock/bedrock.go (1)

1034-1091: Set stream-end indicator on provider "exception" frames too (not just Decode errors).

In `ResponsesStream`, the new `ctx = context.WithValue(ctx, schemas.BifrostContextKeyStreamEndIndicator, true)` on decode errors (Line 1070) and the EOF-finalization last chunk (Line 1059) is good, but the `msgType != "event"` early-return path (Lines 1078-1090) still terminates without setting the end indicator. Suggested patch:

```diff
 if msgType := msgTypeHeader.String(); msgType != "event" {
 	excType := msgType
 	if excHeader := message.Headers.Get(":exception-type"); excHeader != nil {
 		if v := excHeader.String(); v != "" {
 			excType = v
 		}
 	}
 	errMsg := string(message.Payload)
 	err := fmt.Errorf("%s stream %s: %s", providerName, excType, errMsg)
+	ctx = context.WithValue(ctx, schemas.BifrostContextKeyStreamEndIndicator, true)
 	providerUtils.ProcessAndSendError(ctx, postHookRunner, err, responseChan, schemas.ResponsesStreamRequest, providerName, request.Model, provider.logger)
 	return
 }
```
45-85: Fix doc comment typos + clarify “non-OpenAI spec” fields.
Minor, but these comments are public-facing and currently read “Not is OpenAI’s spec”.
♻️ Duplicate comments (3)
core/internal/testutil/chat_completion_stream.go (1)
535-556: Cerebras prompt references tools but none are configured.The Cerebras-specific prompt instructs the model to "use your tools for this", but the request configuration at lines 544-556 doesn't include any tools. This mismatch will likely cause test failures or unexpected model behavior for Cerebras.
Consider using a reasoning-focused prompt without tool references, consistent with the default mathematical problem:
if testConfig.Provider == schemas.Cerebras { - problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this" + problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?" }core/providers/bedrock/utils.go (1)
115-121: Don’t overwriteAdditionalModelRequestFields(clobbersreasoning_config).
bedrockReq.AdditionalModelRequestFields = orderedFields(Line 119) can discardreasoning_configset earlier (and any other pre-populated fields). Merge instead.Suggested patch:
- if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok { - bedrockReq.AdditionalModelRequestFields = orderedFields - } + if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok { + if bedrockReq.AdditionalModelRequestFields == nil { + bedrockReq.AdditionalModelRequestFields = make(schemas.OrderedMap) + } + for k, v := range orderedFields { + // Preserve already-derived/provider-required fields (e.g., reasoning_config) + if _, exists := bedrockReq.AdditionalModelRequestFields[k]; exists { + continue + } + bedrockReq.AdditionalModelRequestFields[k] = v + } + }core/providers/cohere/responses.go (1)
1284-1302: Don’t embed encrypted reasoning into plain text (reintroduces prior security issue).
This leaksEncryptedContentto Cohere by wrapping it in a readable marker.@@ } else if msg.ResponsesReasoning.EncryptedContent != nil { - // Cohere doesn't have a direct equivalent to encrypted content, - // so we'll store it as a regular thinking block with a special marker - encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent) - thinkingBlock := CohereContentBlock{ - Type: CohereContentBlockTypeThinking, - Thinking: &encryptedText, - } - thinkingBlocks = append(thinkingBlocks, thinkingBlock) + // Cohere doesn't support encrypted reasoning; keep it opaque. + // Intentionally dropping rather than exfiltrating into text. }
🧹 Nitpick comments (9)
core/internal/testutil/reasoning.go (1)
413-419: Consider removing unusedminfunction.This helper function appears unused in the file - length calculations are done inline throughout (e.g., lines 100-103, 146-149, 293-296, 340-343, 366-369). Additionally, Go 1.21+ provides a built-in
minfunction.#!/bin/bash # Verify if min function is used anywhere in this file or if it's imported elsewhere rg -n "min\(" core/internal/testutil/reasoning.go # Check if other files in testutil import or reference this min rg -n "testutil\.min" core/core/providers/openai/types.go (1)
11-13: Central constant is good—ensure it’s used consistently (and justified).
MinMaxCompletionTokensbeing a single source of truth helps; please ensure other request paths don’t reintroduce different minima (esp. in this Graphite stack).core/providers/openai/responses.go (1)
56-80: Avoid partial field-copy when converting summaries → content blocks (risk of silently dropping message fields).
Instead of manually copyingID/Type/Status/RoleintonewMessage, prefernewMessage := message(then overwriteContentand any fields you intentionally want cleared). This reduces the chance of losing fields ifschemas.ResponsesMessageevolves.core/providers/vertex/errors.go (1)
13-28: Good: decode body once; avoidvertexErrshadowing to reduce confusion.
The decoded-body approach is a solid robustness win. Minor readability:vertexErris both[]VertexErrorand laterVertexError(Line 12 vs Line 23), which is easy to misread during maintenance—rename one of them (e.g.,vertexErrList/vertexErrSingle).core/providers/cohere/chat.go (1)
104-128: Consider validating/clamping Thinking.TokenBudget to Cohere's minimum when Reasoning.MaxTokens is provided.

If `bifrostReq.Params.Reasoning.MaxTokens` is set but is `<= 0`, you'll emit an invalid Cohere request (`token_budget` is typically expected to be ≥ 1). A small guard/clamp to `MinimumReasoningMaxTokens` would make this more robust.
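A minimal sketch of the suggested clamp, assuming a stand-in value for the provider minimum:

```go
package main

import "fmt"

// minimumReasoningMaxTokens is a stand-in for the provider constant
// referenced above; the clamp itself is the point of the sketch.
const minimumReasoningMaxTokens = 1024

func clampTokenBudget(requested int) int {
	if requested < minimumReasoningMaxTokens {
		return minimumReasoningMaxTokens
	}
	return requested
}

func main() {
	fmt.Println(clampTokenBudget(0))    // clamped up to the minimum
	fmt.Println(clampTokenBudget(4096)) // left as-is
}
```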
312-314: Schema comment for BifrostContextKeyIntegrationType could be clearer. The schema annotation says // RouteConfigType, but the actual stored value is a string (via string(config.Type) at line 313). Since RouteConfigType is a string type alias and all downstream readers (e.g., anthropic/responses.go) use string comparisons rather than type assertions, consider updating the schema comment to // string or // string (RouteConfigType) for accuracy.
602-629: Avoid emitting empty tool-arguments deltas; use schemas.Ptr for "function".
Right now you'll emit chunks when PartialJSON is "", and the pointer closure is inconsistent with the rest of the file.
@@
 	case AnthropicStreamDeltaTypeInputJSON:
 		// Handle tool use streaming - accumulate partial JSON
-		if chunk.Delta.PartialJSON != nil {
+		if chunk.Delta.PartialJSON != nil && *chunk.Delta.PartialJSON != "" {
@@
 			{
-				Type: func() *string { s := "function"; return &s }(),
+				Type: schemas.Ptr("function"),
 				Function: schemas.ChatAssistantMessageToolCallFunction{
 					Arguments: *chunk.Delta.PartialJSON,
 				},
 			},
230-238: Defensive guard: the ctx pointer can be nil.
(*ctx).Value(...) will panic if a caller ever passes nil (or passes a nil *context.Context).
 func shouldUsePassthrough(ctx *context.Context, provider schemas.ModelProvider, model string, deployment string) bool {
+	if ctx == nil {
+		return false
+	}
 	isClaudeCode := false
 	if userAgent, ok := (*ctx).Value(schemas.BifrostContextKeyUserAgent).(string); ok {
1202-1223: Avoid defaulting unknown tool_choice strings to required.
Forcing tool calls on unknown values is a surprising behavior change; returning nil (provider default) is usually safer.
@@
 	case "auto":
 		choice := ToolChoiceAuto
 		return &choice
 	default:
-		choice := ToolChoiceRequired
-		return &choice
+		return nil
 	}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
ui/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (55)
core/internal/testutil/account.go(1 hunks)core/internal/testutil/chat_completion_stream.go(1 hunks)core/internal/testutil/reasoning.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/tests.go(2 hunks)core/providers/anthropic/anthropic.go(3 hunks)core/providers/anthropic/chat.go(5 hunks)core/providers/anthropic/errors.go(2 hunks)core/providers/anthropic/types.go(8 hunks)core/providers/anthropic/utils.go(2 hunks)core/providers/azure/azure.go(2 hunks)core/providers/azure/utils.go(1 hunks)core/providers/bedrock/bedrock.go(2 hunks)core/providers/bedrock/bedrock_test.go(13 hunks)core/providers/bedrock/types.go(2 hunks)core/providers/bedrock/utils.go(2 hunks)core/providers/cerebras/cerebras_test.go(2 hunks)core/providers/cohere/chat.go(3 hunks)core/providers/cohere/cohere.go(2 hunks)core/providers/cohere/cohere_test.go(1 hunks)core/providers/cohere/responses.go(10 hunks)core/providers/cohere/types.go(1 hunks)core/providers/gemini/responses.go(2 hunks)core/providers/groq/groq_test.go(2 hunks)core/providers/mistral/mistral_test.go(1 hunks)core/providers/openai/chat.go(1 hunks)core/providers/openai/responses.go(2 hunks)core/providers/openai/text.go(1 hunks)core/providers/openai/types.go(3 hunks)core/providers/openai/utils.go(1 hunks)core/providers/openrouter/openrouter_test.go(1 hunks)core/providers/utils/utils.go(4 hunks)core/providers/vertex/errors.go(1 hunks)core/providers/vertex/types.go(0 hunks)core/providers/vertex/utils.go(1 hunks)core/providers/vertex/vertex.go(3 hunks)core/providers/vertex/vertex_test.go(1 hunks)core/schemas/bifrost.go(1 hunks)core/schemas/mux.go(0 hunks)core/schemas/responses.go(5 hunks)core/schemas/utils.go(1 hunks)docs/docs.json(0 hunks)framework/configstore/rdb.go(0 hunks)framework/streaming/chat.go(0 hunks)framework/streaming/responses.go(2 hunks)transports/bifrost-http/handlers/inference.go(1 hunks)transports/bifrost-http/handlers/middlewares.go(1 hunks)transports/bifrost-http/integrations/anthropic.go(5 hunks)transports/bifrost-http/integrations/router.go(3 hunks)ui/app/workspace/logs/views/columns.tsx(1 hunks)ui/app/workspace/logs/views/logDetailsSheet.tsx(1 hunks)ui/app/workspace/logs/views/logResponsesMessageView.tsx(2 hunks)ui/app/workspace/logs/views/logResponsesOutputView.tsx(0 hunks)ui/lib/types/logs.ts(2 hunks)ui/package.json(1 hunks)
💤 Files with no reviewable changes (6)
- core/schemas/mux.go
- ui/app/workspace/logs/views/logResponsesOutputView.tsx
- framework/configstore/rdb.go
- framework/streaming/chat.go
- core/providers/vertex/types.go
- docs/docs.json
✅ Files skipped from review due to trivial changes (1)
- transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (22)
- ui/app/workspace/logs/views/columns.tsx
- ui/app/workspace/logs/views/logResponsesMessageView.tsx
- core/providers/groq/groq_test.go
- core/schemas/utils.go
- core/providers/anthropic/errors.go
- core/providers/vertex/vertex_test.go
- core/providers/openai/text.go
- ui/app/workspace/logs/views/logDetailsSheet.tsx
- core/providers/openai/utils.go
- core/providers/vertex/vertex.go
- core/providers/cohere/cohere_test.go
- core/providers/utils/utils.go
- core/providers/cerebras/cerebras_test.go
- core/providers/mistral/mistral_test.go
- framework/streaming/responses.go
- core/providers/anthropic/utils.go
- core/providers/bedrock/types.go
- core/providers/openrouter/openrouter_test.go
- core/providers/cohere/types.go
- core/internal/testutil/tests.go
- ui/package.json
- core/providers/bedrock/bedrock_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openai/chat.goui/lib/types/logs.tstransports/bifrost-http/integrations/router.gocore/providers/azure/utils.gocore/providers/vertex/utils.gocore/providers/vertex/errors.gocore/providers/openai/responses.gocore/providers/azure/azure.gocore/providers/bedrock/utils.gocore/providers/cohere/chat.gocore/providers/gemini/responses.gocore/schemas/responses.gocore/providers/bedrock/bedrock.gocore/internal/testutil/responses_stream.gocore/internal/testutil/account.gocore/providers/cohere/cohere.gocore/internal/testutil/chat_completion_stream.gocore/providers/anthropic/chat.gotransports/bifrost-http/integrations/anthropic.gocore/internal/testutil/reasoning.gocore/providers/openai/types.gocore/schemas/bifrost.gocore/providers/anthropic/types.gocore/providers/anthropic/anthropic.gocore/providers/cohere/responses.gotransports/bifrost-http/handlers/inference.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/openai/chat.gotransports/bifrost-http/integrations/router.gocore/providers/azure/utils.gocore/providers/vertex/utils.gocore/providers/vertex/errors.gocore/providers/openai/responses.gocore/providers/azure/azure.gocore/providers/bedrock/utils.gocore/providers/cohere/chat.gocore/providers/gemini/responses.gocore/schemas/responses.gocore/providers/bedrock/bedrock.gocore/internal/testutil/responses_stream.gocore/internal/testutil/account.gocore/providers/cohere/cohere.gocore/internal/testutil/chat_completion_stream.gocore/providers/anthropic/chat.gotransports/bifrost-http/integrations/anthropic.gocore/internal/testutil/reasoning.gocore/providers/openai/types.gocore/schemas/bifrost.gocore/providers/anthropic/types.gocore/providers/anthropic/anthropic.gocore/providers/cohere/responses.gotransports/bifrost-http/handlers/inference.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/chat.gocore/providers/openai/responses.gocore/providers/openai/types.go
🧬 Code graph analysis (18)
core/providers/openai/chat.go (5)
core/schemas/chatcompletions.go (1)
ChatParameters(155-184)core/providers/openai/types.go (1)
MinMaxCompletionTokens(12-12)core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(51-53)core/providers/openai/utils.go (1)
SanitizeUserField(51-56)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
ResponsesReasoningSummary(745-748)ResponsesReasoning(731-734)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
BifrostContextKeyIntegrationType(120-120)core/providers/gemini/types.go (1)
Type(778-778)
core/providers/azure/utils.go (6)
core/schemas/responses.go (1)
BifrostResponsesRequest(32-39)core/providers/utils/utils.go (1)
NewBifrostOperationError(516-527)core/schemas/provider.go (2)
ErrRequestBodyConversion(25-25)ErrProviderRequestMarshal(26-26)core/providers/anthropic/types.go (1)
AnthropicDefaultMaxTokens(14-14)core/providers/anthropic/responses.go (1)
ToAnthropicResponsesRequest(1419-1529)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/vertex/utils.go (7)
core/schemas/responses.go (1)
BifrostResponsesRequest(32-39)core/providers/utils/utils.go (1)
NewBifrostOperationError(516-527)core/schemas/provider.go (2)
ErrRequestBodyConversion(25-25)ErrProviderRequestMarshal(26-26)core/providers/anthropic/types.go (1)
AnthropicDefaultMaxTokens(14-14)core/providers/vertex/types.go (1)
DefaultVertexAnthropicVersion(8-8)core/providers/anthropic/responses.go (1)
ToAnthropicResponsesRequest(1419-1529)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/azure/azure.go (4)
core/schemas/bifrost.go (1)
BifrostError(358-367)ui/lib/types/logs.ts (1)
BifrostError(226-232)core/schemas/utils.go (1)
IsAnthropicModel(1043-1045)core/providers/utils/utils.go (1)
CheckContextAndGetRequestBody(257-275)
core/providers/cohere/chat.go (5)
core/providers/cohere/types.go (5)
CohereThinking(174-177)ThinkingTypeEnabled(183-183)DefaultCompletionMaxTokens(12-12)MinimumReasoningMaxTokens(11-11)ThinkingTypeDisabled(184-184)core/providers/bedrock/types.go (2)
DefaultCompletionMaxTokens(13-13)MinimumReasoningMaxTokens(12-12)core/providers/utils/utils.go (1)
GetBudgetTokensFromReasoningEffort(1084-1121)core/providers/anthropic/types.go (1)
MinimumReasoningMaxTokens(15-15)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
FunctionCall(157-160)ResponsesToolMessage(403-409)ResponsesMessage(423-438)ResponsesMessageContent(400-400)ResponsesReasoning(417-420)ResponsesReasoningSummary(412-415)core/providers/gemini/types.go (5)
FunctionCall(1091-1101)Role(13-13)Content(922-930)Type(778-778)Part(936-960)core/schemas/responses.go (5)
ResponsesToolMessage(462-482)ResponsesMessage(314-327)ResponsesMessageContent(339-344)ResponsesReasoning(731-734)ResponsesReasoningSummary(745-748)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
Type(778-778)ui/lib/types/logs.ts (2)
ResponsesMessageContentBlockType(352-359)ResponsesReasoningSummary(412-415)
core/providers/bedrock/bedrock.go (1)
core/schemas/bifrost.go (1)
BifrostContextKeyStreamEndIndicator(113-113)
core/internal/testutil/responses_stream.go (1)
core/schemas/utils.go (1)
Ptr(16-18)
core/providers/cohere/cohere.go (1)
core/providers/cohere/responses.go (1)
ToCohereResponsesRequest(892-1006)
core/providers/anthropic/chat.go (6)
core/providers/anthropic/types.go (2)
AnthropicThinking(69-72)MinimumReasoningMaxTokens(15-15)core/providers/utils/utils.go (1)
GetBudgetTokensFromReasoningEffort(1084-1121)core/providers/bedrock/types.go (1)
MinimumReasoningMaxTokens(12-12)core/providers/cohere/types.go (1)
MinimumReasoningMaxTokens(11-11)core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(51-53)
transports/bifrost-http/integrations/anthropic.go (2)
core/schemas/bifrost.go (2)
BifrostContextKeyUserAgent(123-123)Anthropic(37-37)core/schemas/utils.go (1)
IsAnthropicModel(1043-1045)
core/internal/testutil/reasoning.go (8)
core/internal/testutil/account.go (1)
ComprehensiveTestConfig(47-64)core/schemas/utils.go (1)
Ptr(16-18)core/schemas/bifrost.go (3)
OpenAI(35-35)BifrostError(358-367)ChatCompletionRequest(89-89)core/schemas/chatcompletions.go (5)
ChatMessage(469-478)BifrostChatRequest(12-19)ChatParameters(155-184)BifrostChatResponse(26-41)ChatAssistantMessage(626-632)core/internal/testutil/utils.go (1)
CreateBasicChatMessage(247-254)core/schemas/models.go (1)
Model(109-129)core/internal/testutil/test_retry_framework.go (4)
GetTestRetryConfigForScenario(1116-1148)TestRetryContext(168-173)ChatRetryConfig(186-193)WithChatTestRetry(274-424)core/internal/testutil/validation_presets.go (2)
GetExpectationsForScenario(208-293)ModifyExpectationsForProvider(300-347)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
ChatParameters(155-184)core/schemas/responses.go (1)
ResponsesParametersReasoning(234-239)
core/providers/cohere/responses.go (4)
core/schemas/responses.go (12)
ResponsesMessageContentBlock(399-411)ResponsesMessage(314-327)ResponsesMessageTypeMessage(290-290)ResponsesInputMessageRoleAssistant(332-332)ResponsesMessageContent(339-344)ResponsesMessageTypeReasoning(307-307)ResponsesMessageTypeFunctionCall(295-295)ResponsesResponseUsage(254-261)BifrostResponsesRequest(32-39)ResponsesMessageRoleType(329-329)ResponsesInputMessageRoleUser(333-333)ResponsesInputMessageRoleSystem(334-334)core/schemas/utils.go (3)
Ptr(16-18)SafeExtractStringSlice(437-465)SafeExtractString(280-297)core/providers/utils/utils.go (2)
GetRandomString(1027-1038)GetBudgetTokensFromReasoningEffort(1084-1121)core/providers/bedrock/types.go (2)
DefaultCompletionMaxTokens(13-13)MinimumReasoningMaxTokens(12-12)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
ResponsesRequest(91-91)core/schemas/responses.go (1)
ResponsesParameters(87-114)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (19)
core/internal/testutil/responses_stream.go (1)
440-446: LGTM - Token limit increase is appropriate for reasoning scenarios. The increase from 400 to 1800 MaxOutputTokens aligns with the other reasoning test configurations in this PR stack (chat_completion_stream.go and reasoning.go both use 1800). Reasoning models require more tokens to produce step-by-step explanations.
core/internal/testutil/chat_completion_stream.go (3)
358-384: LGTM - Well-structured reasoning streaming test. The test configuration is consistent with other reasoning tests in the PR stack:
- Uses 1800 MaxCompletionTokens, matching other reasoning scenarios
- Properly configures ChatReasoning with high effort and 1500 max tokens
- Gates execution behind Scenarios.Reasoning and ReasoningModel checks
504-518: Appropriate soft validation for provider variability. The warning-only approach when no reasoning indicators are found is correct. Different providers expose reasoning through different mechanisms (content, details, or tokens), and some may not expose explicit indicators while still performing reasoning internally.
664-679: Good handling of reasoning-only responses. The validation logic correctly handles the case where reasoning indicators are present but no content chunks were received. This accounts for reasoning models that may process internally without streaming traditional content, avoiding false test failures.
core/internal/testutil/reasoning.go (3)
12-25: LGTM - Clear naming distinction between API variants. The rename from RunReasoningTest to RunResponsesReasoningTest and the addition of RunChatCompletionReasoningTest clearly distinguish between the two API paths. The test label change to "ResponsesReasoning" maintains consistency with the function name.
348-382: Field access issue is fixed. The previous review identified that message.ReasoningDetails was incorrectly used instead of message.ChatAssistantMessage.ReasoningDetails. This has been correctly fixed - all accesses now properly go through the ChatAssistantMessage embedded struct with appropriate nil checks.
219-223: Verify whether Groq's reasoning flakiness applies to non-streaming chat completions. The ChatCompletionReasoning test skips only OpenAI, while ChatCompletionStreamWithReasoningValidated in chat_completion_stream.go (lines 529-531) skips both OpenAI and Groq. The skip comments reference different APIs: "chat completions" vs "stream". If Groq's reasoning is also unstable in non-streaming ChatCompletion calls, add it to the skip list in reasoning.go for consistency. If the flakiness is specific to streaming, the current skip patterns are likely correct.
95-104: Add focused tests for clamping and user field validation. User field sanitization is correctly implemented (64-character limit), but MinMaxCompletionTokens = 16 lacks documentation explaining why this value was chosen. Add unit tests covering:
- Token clamping when MaxOutputTokens falls below 16
- User field sanitization when exceeding 64 characters
- User field being dropped (set to nil) when oversized
Verify that MinMaxCompletionTokens = 16 aligns with OpenAI's actual API requirements for the Responses endpoint.
224-254: Custom ResponsesRequest.UnmarshalJSON matches the ChatRequest pattern; looks correct.
Good separation: BifrostParams → Input union → ResponsesParameters with its custom unmarshaller.core/providers/cohere/chat.go (1)
440-454: Streaming thinking → Reasoning/ReasoningDetails mapping looks good.
The nil checks are in place and using schemas.Ptr(thinkingText) avoids referencing ephemeral data.
138-143: Bedrock deployment alias is correctly used in tests. The "claude-4.5-haiku" alias is actively referenced in core/providers/vertex/vertex_test.go line 31, and the model ID format "global.anthropic.claude-haiku-4-5-20251001-v1:0" follows the correct Bedrock convention, consistent with other claude-4.5 entries in the same Deployments map.
921-926: No issue found. The Warn method signature supports variadic arguments. The Logger interface at core/schemas/logger.go defines Warn(msg string, args ...any), which accepts printf-style format arguments. Line 923's usage g.logger.Warn("Failed to write SSE done marker: %v", err) is correct and consistent with this signature and widespread patterns throughout the codebase (e.g., server.go:1079, config.go:341, config.go:551). Likely an incorrect or invalid review comment.
core/providers/cohere/cohere.go (2)
510-518: Good: propagate ToCohereResponsesRequest conversion errors. The updated CheckContextAndGetRequestBody closure now returns (any, error) from ToCohereResponsesRequest (Line 513), which prevents silent bad request construction when conversions fail.
566-582: Good: stream request construction now fails fast and sets Stream=true only when a body exists. Handling the (reqBody, err) return (Lines 570-573) and then conditionally setting reqBody.Stream (Line 575) avoids nil-deref risk while correctly enabling SSE mode.
672-717: Good: centralized request-body construction for Responses. Using getRequestBodyForResponses (Line 677) and passing the same jsonBody through completeRequest + HandleProviderResponse (Lines 683, 692) should reduce mismatches between "what we send" vs "what we record as raw request".
719-756: Good: ResponsesStream now shares the same request-body builder (and fails fast). The streaming path now mirrors the non-streaming body build via getRequestBodyForResponses(..., true) with immediate error return (Lines 726-729), which is cleaner and safer.
35-73: Reasoning budget min-tokens selection looks fixed. Using MinimumReasoningMaxTokens by default and switching to anthropic.MinimumReasoningMaxTokens only when schemas.IsAnthropicModel(bifrostReq.Model) (Lines 58-60) addresses the earlier cross-provider min-token mismatch.
411-435: Type rename + propagation looks consistent.
No issues with the ResponsesReasoningSummary shape or the updated field types.
399-411: Schema migration to ResponsesReasoningSummary is complete and consistent across the stack. No references to the old ResponsesReasoningContent identifier remain. The new ResponsesReasoningSummary type and Signature field are properly integrated across all consumers: backend schemas, provider implementations (Anthropic, Bedrock, Cohere, Gemini), streaming layer, and TypeScript types.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
core/providers/anthropic/types.go (2)
58-139: Fixextra_paramsbeing treated as “unknown” and re-captured intoExtraParams.var anthropicMessageRequestKnownFields = map[string]bool{ "model": true, "max_tokens": true, "messages": true, "metadata": true, "system": true, "temperature": true, "top_p": true, "top_k": true, "stop_sequences": true, "stream": true, "tools": true, "tool_choice": true, "mcp_servers": true, "thinking": true, "output_format": true, + "extra_params": true, "fallbacks": true, }
160-196: Reset the other representation in AnthropicContent.UnmarshalJSON to avoid stale state.
 func (mc *AnthropicContent) UnmarshalJSON(data []byte) error {
+	// defensive reset (in case mc is reused)
+	mc.ContentStr = nil
+	mc.ContentBlocks = nil
+
 	// First, try to unmarshal as a direct string
 	var stringContent string
 	if err := sonic.Unmarshal(data, &stringContent); err == nil {
 		mc.ContentStr = &stringContent
 		return nil
 	}

 	// Try to unmarshal as a direct array of ContentBlock
 	var arrayContent []AnthropicContentBlock
 	if err := sonic.Unmarshal(data, &arrayContent); err == nil {
 		mc.ContentBlocks = arrayContent
 		return nil
 	}
116-121: AdditionalModelRequestFields assignment overwrites reasoning_config. The extraction of additionalModelRequestFieldPaths at line 119 replaces the entire AdditionalModelRequestFields map, discarding any reasoning_config set earlier (lines 40-73). If both reasoning and extra params are provided, reasoning will be lost. Consider merging instead of overwriting:
 if requestFields, exists := bifrostReq.Params.ExtraParams["additionalModelRequestFieldPaths"]; exists {
 	if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
-		bedrockReq.AdditionalModelRequestFields = orderedFields
+		if bedrockReq.AdditionalModelRequestFields == nil {
+			bedrockReq.AdditionalModelRequestFields = make(schemas.OrderedMap)
+		}
+		for k, v := range orderedFields {
+			bedrockReq.AdditionalModelRequestFields[k] = v
+		}
 	}
 }
45-85: Fix typos in spec comments ("Not is" → "Not in").
@@
-	StopReason *string `json:"stop_reason,omitempty"` // Not is OpenAI's spec, but sent by other providers
+	StopReason *string `json:"stop_reason,omitempty"` // Not in OpenAI's spec, but sent by other providers
14-157: Deep-copy must include Signature (streamed signatures will be dropped for non-OpenAI providers).
@@
 	if original.Delta != nil {
 		copyDelta := *original.Delta
 		copy.Delta = &copyDelta
 	}
+
+	if original.Signature != nil {
+		copySignature := *original.Signature
+		copy.Signature = &copySignature
+	}
380-424: Deep-copy content blocks should copy FileID and Signature at minimum (new schema fields).
@@
 func deepCopyResponsesMessageContentBlock(original schemas.ResponsesMessageContentBlock) schemas.ResponsesMessageContentBlock {
 	copy := schemas.ResponsesMessageContentBlock{
 		Type: original.Type,
 	}
+
+	if original.FileID != nil {
+		v := *original.FileID
+		copy.FileID = &v
+	}
+
+	if original.Signature != nil {
+		v := *original.Signature
+		copy.Signature = &v
+	}
@@
 	if original.Text != nil {
 		copyText := *original.Text
 		copy.Text = &copyText
 	}
♻️ Duplicate comments (4)
core/providers/mistral/mistral_test.go (1)
33-50: Reasoning: false + comment is clear and matches the intent (implementation not wired yet).
529-538: Cerebras prompt still requests tool usage but no tools are configured in the test request.
@@
-	if testConfig.Provider == schemas.Cerebras {
-		problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this"
-	}
+	if testConfig.Provider == schemas.Cerebras {
+		problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?"
+	}
1270-1306: Do not forwardEncryptedContentas plaintext payload to Cohere.
Even if the bytes are “encrypted”, embedding them in a regularthinkingblock leaks the opaque blob cross-provider (and into logs/telemetry). Prefer skipping (or strictly gating behind an explicit “allowEncryptedReasoningPassthrough” flag).} else if msg.ResponsesReasoning != nil { if msg.ResponsesReasoning.Summary != nil { ... - } else if msg.ResponsesReasoning.EncryptedContent != nil { - // Cohere doesn't have a direct equivalent to encrypted content, - // so we'll store it as a regular thinking block with a special marker - encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent) - thinkingBlock := CohereContentBlock{ - Type: CohereContentBlockTypeThinking, - Thinking: &encryptedText, - } - thinkingBlocks = append(thinkingBlocks, thinkingBlock) + } else if msg.ResponsesReasoning.EncryptedContent != nil { + // Cohere doesn't support encrypted reasoning; keep it opaque and do not forward. } }
1201-1223: Remove unsupported "auto" case and change default tonil.Cohere's Chat API officially supports only two
tool_choicevalues:REQUIREDandNONE. The "auto" case is not supported by Cohere and should be removed. Defaulting unknown values toToolChoiceRequiredsilently forces tool calls; returnnilto defer to the provider's default behavior.case "auto": - choice := ToolChoiceAuto - return &choice + return nil default: - choice := ToolChoiceRequired - return &choice + return nil
🧹 Nitpick comments (12)
ui/app/workspace/logs/views/columns.tsx (1)
40-45: Transcription logs lose distinguishing context (constant "Audio file").
If log.transcription_input.prompt was a safe label, consider keeping a truncated version (or filename/metadata) so rows aren't all identical; otherwise add a short comment clarifying this is intentional for privacy/standardization.
 } else if (log?.transcription_input) {
-	return "Audio file";
+	const label = log.transcription_input.prompt?.trim();
+	return label ? label.slice(0, 120) : "Audio file";
 }
650-653: Consistent with the Responses method refactoring. The use of getRequestBodyForAnthropicResponses with the streaming flag set to true is appropriate for the streaming endpoint. Error handling with early return is correct.
For consistency across the codebase, consider whether ChatCompletionStream (lines 468-485) should also be refactored to use a similar helper pattern, since it currently uses inline logic for Anthropic request construction.
12-14: Nit: the DefaultCompletionMaxTokens comment suggests "not passed in body"; consider moving it to the budgeting helper instead of provider types.
90-126: Avoid O(n²) string concatenation when combining multiple stream events.
-	if len(anthropicResponse) > 1 {
-		combinedContent := ""
+	if len(anthropicResponse) > 1 {
+		var b strings.Builder
 		for _, event := range anthropicResponse {
 			responseJSON, err := sonic.Marshal(event)
 			if err != nil {
 				// Log JSON marshaling error but continue processing (should not happen)
 				log.Printf("Failed to marshal streaming response: %v", err)
 				continue
 			}
-			combinedContent += fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON)
+			fmt.Fprintf(&b, "event: %s\ndata: %s\n\n", event.Type, responseJSON)
 		}
-		return "", combinedContent, nil
+		return "", b.String(), nil
 	}
(This needs the strings package; the file already imports it.)
12-116: Token budget bump to 1800 in reasoning tests is fine, but consider making it provider/config-driven to limit CI cost.transports/bifrost-http/handlers/inference.go (1)
236-244: Nit: the comment says "Unmarshal messages" but the field is input.
359-521: Consider making the non-validated reasoning stream test explicitly “smoke” (or assert indicators for non-flaky providers).
Right now it only logs when no reasoning indicators are found; if this is meant to catch regressions, consider failing when testConfig.Provider is known-stable.
498-534: ReasoningSummaryTextDelta routing and message creation flow looks reasonable.
Minor: when creating newMessage, consider copying the resp.ItemID value into a fresh pointer (to avoid retaining pointers from pooled/aliased stream objects).
42-84: Add explicit contract documentation and tests for the reasoning block conversion logic. The code assumes that when ResponsesReasoning != nil and content blocks exist, those blocks are reasoning-only for non-oss models. While the OpenAI Responses API documentation describes reasoning output items and content parts with specific types, it does not explicitly guarantee that content blocks are reasoning-only when both reasoning and content coexist. Add inline comments explaining:
- What OpenAI format behavior this code depends on (e.g., that reasoning content must be converted/skipped based on model type)
- Which models support/reject reasoning_text content blocks (oss vs non-oss distinction)
Include unit tests covering both the skip path (no summary, has content blocks, non-oss model) and the convert path (has summary, no content blocks, oss model) to prevent silent breakage if the contract assumption changes.
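As a rough illustration of those two cases, here is a self-contained sketch of such a test; convertReasoningBlocks is a stand-in that only encodes the assumed contract, not the real conversion in responses.go:

package openai_test

import "testing"

// convertReasoningBlocks encodes the assumed contract: when reasoning is
// present, non-oss models drop the content blocks (skip path) while oss
// models keep them as reasoning text (convert path).
func convertReasoningBlocks(hasReasoning, isOSSModel bool, blocks []string) []string {
	if !hasReasoning || isOSSModel {
		return blocks
	}
	return nil
}

func TestReasoningBlockContract(t *testing.T) {
	blocks := []string{"step 1", "step 2"}
	if got := convertReasoningBlocks(true, false, blocks); len(got) != 0 {
		t.Fatalf("skip path: expected blocks to be dropped, got %d", len(got))
	}
	if got := convertReasoningBlocks(true, true, blocks); len(got) != 2 {
		t.Fatalf("convert path: expected blocks to be kept, got %d", len(got))
	}
}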
core/providers/cohere/responses.go (3)
141-173: Fallback content-block mapping likely produces wrong “input_*” types (and odd text payload).
convertCohereContentBlockToBifrost is used for response conversion, but the fallback path returns Type: input_text and sets Text to the block type string (not content). This can violate the output schema / confuse consumers. Suggested tweak (keep behavior but make it clearly "output text" + a readable marker):
 default:
-	// Fallback to text block
+	// Fallback to output text block
 	return schemas.ResponsesMessageContentBlock{
-		Type: schemas.ResponsesInputMessageContentBlockTypeText,
-		Text: schemas.Ptr(string(cohereBlock.Type)),
+		Type: schemas.ResponsesOutputMessageContentTypeText,
+		Text: schemas.Ptr("[unsupported_cohere_block_type:" + string(cohereBlock.Type) + "]"),
 	}
499-506: Duplicate comment line.
Two consecutive // Generate stable ID for text item comments. Drop one for readability.
1314-1323: Use the guarded field for ID assignment (clarity + avoids future footguns).
You guard on msg.ResponsesToolMessage.CallID but assign toolCall.ID = msg.CallID (promoted field). It's equivalent today, but coupling to promotion rules is brittle.
 if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
-	toolCall.ID = msg.CallID
+	toolCall.ID = msg.ResponsesToolMessage.CallID
 }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
ui/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (55)
core/internal/testutil/account.go(1 hunks)core/internal/testutil/chat_completion_stream.go(1 hunks)core/internal/testutil/reasoning.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/tests.go(2 hunks)core/providers/anthropic/anthropic.go(3 hunks)core/providers/anthropic/chat.go(5 hunks)core/providers/anthropic/errors.go(2 hunks)core/providers/anthropic/types.go(8 hunks)core/providers/anthropic/utils.go(2 hunks)core/providers/azure/azure.go(2 hunks)core/providers/azure/utils.go(1 hunks)core/providers/bedrock/bedrock.go(2 hunks)core/providers/bedrock/bedrock_test.go(13 hunks)core/providers/bedrock/types.go(2 hunks)core/providers/bedrock/utils.go(2 hunks)core/providers/cerebras/cerebras_test.go(2 hunks)core/providers/cohere/chat.go(3 hunks)core/providers/cohere/cohere.go(2 hunks)core/providers/cohere/cohere_test.go(1 hunks)core/providers/cohere/responses.go(10 hunks)core/providers/cohere/types.go(1 hunks)core/providers/gemini/responses.go(2 hunks)core/providers/groq/groq_test.go(2 hunks)core/providers/mistral/mistral_test.go(1 hunks)core/providers/openai/chat.go(1 hunks)core/providers/openai/responses.go(2 hunks)core/providers/openai/text.go(1 hunks)core/providers/openai/types.go(3 hunks)core/providers/openai/utils.go(1 hunks)core/providers/openrouter/openrouter_test.go(1 hunks)core/providers/utils/utils.go(4 hunks)core/providers/vertex/errors.go(1 hunks)core/providers/vertex/types.go(0 hunks)core/providers/vertex/utils.go(1 hunks)core/providers/vertex/vertex.go(3 hunks)core/providers/vertex/vertex_test.go(1 hunks)core/schemas/bifrost.go(1 hunks)core/schemas/mux.go(0 hunks)core/schemas/responses.go(5 hunks)core/schemas/utils.go(1 hunks)docs/docs.json(0 hunks)framework/configstore/rdb.go(0 hunks)framework/streaming/chat.go(0 hunks)framework/streaming/responses.go(2 hunks)transports/bifrost-http/handlers/inference.go(1 hunks)transports/bifrost-http/handlers/middlewares.go(1 hunks)transports/bifrost-http/integrations/anthropic.go(5 hunks)transports/bifrost-http/integrations/router.go(3 hunks)ui/app/workspace/logs/views/columns.tsx(1 hunks)ui/app/workspace/logs/views/logDetailsSheet.tsx(1 hunks)ui/app/workspace/logs/views/logResponsesMessageView.tsx(2 hunks)ui/app/workspace/logs/views/logResponsesOutputView.tsx(0 hunks)ui/lib/types/logs.ts(2 hunks)ui/package.json(1 hunks)
💤 Files with no reviewable changes (6)
- framework/streaming/chat.go
- framework/configstore/rdb.go
- core/schemas/mux.go
- core/providers/vertex/types.go
- docs/docs.json
- ui/app/workspace/logs/views/logResponsesOutputView.tsx
✅ Files skipped from review due to trivial changes (1)
- transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (19)
- core/schemas/utils.go
- core/providers/openai/utils.go
- transports/bifrost-http/integrations/router.go
- core/providers/vertex/vertex_test.go
- ui/app/workspace/logs/views/logResponsesMessageView.tsx
- core/internal/testutil/tests.go
- core/providers/vertex/utils.go
- ui/app/workspace/logs/views/logDetailsSheet.tsx
- core/providers/vertex/vertex.go
- core/providers/openai/chat.go
- core/providers/cerebras/cerebras_test.go
- core/providers/cohere/cohere_test.go
- core/providers/bedrock/bedrock.go
- core/providers/anthropic/utils.go
- core/providers/anthropic/errors.go
- core/providers/azure/utils.go
- core/providers/utils/utils.go
- ui/package.json
- core/providers/openai/text.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/internal/testutil/account.gocore/providers/groq/groq_test.gotransports/bifrost-http/handlers/inference.gocore/providers/vertex/errors.gocore/providers/anthropic/anthropic.gocore/providers/bedrock/bedrock_test.gocore/providers/azure/azure.gocore/providers/openai/responses.gotransports/bifrost-http/integrations/anthropic.gocore/providers/openai/types.gocore/providers/cohere/types.gocore/providers/cohere/chat.goframework/streaming/responses.gocore/internal/testutil/chat_completion_stream.gocore/schemas/responses.goui/lib/types/logs.tscore/providers/gemini/responses.gocore/providers/mistral/mistral_test.gocore/providers/anthropic/chat.gocore/internal/testutil/responses_stream.gocore/providers/openrouter/openrouter_test.gocore/providers/bedrock/utils.gocore/providers/anthropic/types.gocore/schemas/bifrost.goui/app/workspace/logs/views/columns.tsxcore/internal/testutil/reasoning.gocore/providers/cohere/responses.gocore/providers/cohere/cohere.gocore/providers/bedrock/types.go
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/internal/testutil/account.gocore/providers/groq/groq_test.gotransports/bifrost-http/handlers/inference.gocore/providers/vertex/errors.gocore/providers/anthropic/anthropic.gocore/providers/bedrock/bedrock_test.gocore/providers/azure/azure.gocore/providers/openai/responses.gotransports/bifrost-http/integrations/anthropic.gocore/providers/openai/types.gocore/providers/cohere/types.gocore/providers/cohere/chat.goframework/streaming/responses.gocore/internal/testutil/chat_completion_stream.gocore/schemas/responses.gocore/providers/gemini/responses.gocore/providers/mistral/mistral_test.gocore/providers/anthropic/chat.gocore/internal/testutil/responses_stream.gocore/providers/openrouter/openrouter_test.gocore/providers/bedrock/utils.gocore/providers/anthropic/types.gocore/schemas/bifrost.gocore/internal/testutil/reasoning.gocore/providers/cohere/responses.gocore/providers/cohere/cohere.gocore/providers/bedrock/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/responses.gocore/providers/openai/types.go
🧬 Code graph analysis (15)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
ResponsesRequest(91-91)core/schemas/responses.go (1)
ResponsesParameters(87-114)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
CheckAndDecodeBody(490-498)NewBifrostOperationError(516-527)core/schemas/provider.go (1)
ErrProviderResponseDecode(29-29)core/providers/vertex/vertex.go (1)
VertexError(25-31)core/providers/vertex/types.go (1)
VertexValidationError(153-160)
core/providers/anthropic/anthropic.go (3)
core/schemas/bifrost.go (1)
ResponsesRequest(91-91)transports/bifrost-http/handlers/inference.go (1)
ResponsesRequest(257-261)core/providers/utils/utils.go (3)
HandleProviderResponse(359-445)ShouldSendBackRawRequest(551-556)ShouldSendBackRawResponse(559-564)
core/providers/azure/azure.go (5)
core/schemas/bifrost.go (1)
BifrostError(358-367)ui/lib/types/logs.ts (1)
BifrostError(226-232)core/schemas/utils.go (1)
IsAnthropicModel(1043-1045)core/providers/utils/utils.go (1)
CheckContextAndGetRequestBody(257-275)core/providers/openai/responses.go (1)
ToOpenAIResponsesRequest(37-107)
core/providers/openai/responses.go (4)
core/schemas/responses.go (6)
ResponsesMessage(314-327)ResponsesReasoning(731-734)ResponsesMessageContentBlock(399-411)ResponsesOutputMessageContentTypeReasoning(394-394)ResponsesMessageContent(339-344)ResponsesParameters(87-114)core/providers/openai/types.go (3)
OpenAIResponsesRequest(180-189)OpenAIResponsesRequestInput(147-150)MinMaxCompletionTokens(12-12)core/schemas/utils.go (1)
Ptr(16-18)core/providers/openai/utils.go (1)
SanitizeUserField(51-56)
transports/bifrost-http/integrations/anthropic.go (2)
core/schemas/bifrost.go (6)
BifrostContextKeyUserAgent(123-123)BifrostContextKeyUseRawRequestBody(117-117)Anthropic(37-37)BifrostContextKeyExtraHeaders(115-115)BifrostContextKeyURLPath(116-116)BifrostContextKeySkipKeySelection(114-114)core/schemas/utils.go (1)
IsAnthropicModel(1043-1045)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
ChatParameters(155-184)core/schemas/responses.go (1)
ResponsesParametersReasoning(234-239)
core/schemas/responses.go (1)
ui/lib/types/logs.ts (2)
ResponsesMessageContentBlockType(352-359)ResponsesReasoningSummary(412-415)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
ResponsesReasoningSummary(745-748)ResponsesReasoning(731-734)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
FunctionCall(157-160)ResponsesToolMessage(403-409)ResponsesMessage(423-438)ResponsesMessageContent(400-400)ResponsesReasoning(417-420)ResponsesReasoningSummary(412-415)core/providers/gemini/types.go (4)
FunctionCall(1091-1101)Role(13-13)Content(922-930)Type(778-778)core/schemas/responses.go (8)
ResponsesToolMessage(462-482)ResponsesMessage(314-327)ResponsesInputMessageRoleAssistant(332-332)ResponsesMessageContent(339-344)ResponsesMessageTypeFunctionCall(295-295)ResponsesMessageTypeReasoning(307-307)ResponsesReasoning(731-734)ResponsesReasoningSummary(745-748)
core/providers/bedrock/utils.go (6)
core/schemas/chatcompletions.go (1)
BifrostChatRequest(12-19)core/providers/bedrock/types.go (4)
BedrockConverseRequest(55-75)DefaultCompletionMaxTokens(13-13)BedrockInferenceConfig(253-258)MinimumReasoningMaxTokens(12-12)core/providers/cohere/types.go (2)
DefaultCompletionMaxTokens(12-12)MinimumReasoningMaxTokens(11-11)core/schemas/utils.go (2)
Ptr(16-18)IsAnthropicModel(1043-1045)core/providers/anthropic/types.go (1)
MinimumReasoningMaxTokens(15-15)core/providers/utils/utils.go (1)
GetBudgetTokensFromReasoningEffort(1084-1121)
core/providers/anthropic/types.go (5)
core/providers/bedrock/types.go (1)
Alias(102-102)core/providers/openai/types.go (2)
Alias(83-83)Alias(194-194)core/schemas/bifrost.go (1)
Alias(388-388)core/schemas/chatcompletions.go (4)
Alias(189-189)Alias(643-643)Alias(770-770)Alias(860-860)core/providers/gemini/types.go (8)
Alias(220-220)Alias(245-245)Alias(828-830)Alias(842-844)Alias(857-859)Alias(871-873)Alias(1306-1306)Alias(1326-1326)
core/internal/testutil/reasoning.go (5)
core/schemas/utils.go (1)
Ptr(16-18)core/schemas/bifrost.go (1)
BifrostError(358-367)core/schemas/chatcompletions.go (5)
ChatMessage(469-478)BifrostChatRequest(12-19)ChatParameters(155-184)BifrostChatResponse(26-41)ChatAssistantMessage(626-632)core/internal/testutil/utils.go (3)
CreateBasicChatMessage(247-254)GetErrorMessage(606-639)GetChatContent(329-356)core/internal/testutil/test_retry_framework.go (5)
GetTestRetryConfigForScenario(1116-1148)TestRetryContext(168-173)ChatRetryConfig(186-193)ChatRetryCondition(126-129)WithChatTestRetry(274-424)
core/providers/cohere/cohere.go (1)
core/providers/cohere/responses.go (1)
ToCohereResponsesRequest(892-1006)
core/providers/bedrock/types.go (5)
core/providers/anthropic/types.go (1)
Alias(103-103)core/providers/openai/types.go (2)
Alias(83-83)Alias(194-194)core/schemas/bifrost.go (2)
Alias(388-388)Alias(405-405)core/schemas/chatcompletions.go (4)
Alias(189-189)Alias(643-643)Alias(770-770)Alias(860-860)core/providers/gemini/types.go (7)
Alias(220-220)Alias(245-245)Alias(828-830)Alias(842-844)Alias(857-859)Alias(871-873)Alias(1306-1306)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (39)
core/providers/openrouter/openrouter_test.go (1)
25-49: The review comment cannot be definitively verified due to repository access limitations. Manual verification by the developer is required to confirm: (1) whether "openai/gpt-oss-120b" is available on the CI's OpenRouter account, (2) whether env override patterns are used elsewhere in the codebase for model configuration, and (3) whether the suggested changes align with existing conventions.
core/providers/azure/azure.go (1)
548-564: Good refactoring to centralize Anthropic request body construction. The delegation to getRequestBodyForAnthropicResponses reduces code duplication and improves maintainability. The conditional logic is clear, and error handling is correct.
However, please verify that the helper function exists and is properly implemented in core/providers/azure/utils.go, particularly confirming its signature matches the expected parameters: ctx, request, deployment, provider.GetProviderKey(), and the streaming boolean flag.
55-55: LGTM: Reasoning test enabled for Groq.Enabling the
Reasoningtest scenario aligns with the PR's objectives to expand reasoning support across providers. This change integrates Groq into the broader reasoning-enabled test flow introduced in this stack.Note: The effectiveness of this test depends on the
ReasoningModel(line 37) being valid and supporting reasoning capabilities. Please verify the model name as flagged in the previous comment.transports/bifrost-http/integrations/anthropic.go (1)
230-245:shouldUsePassthrough+isClaudeModelsplit looks reasonable.core/internal/testutil/reasoning.go (1)
201-411: Chat-completions reasoning validation looks solid (nil-safe, good diagnostics).core/providers/anthropic/types.go (2)
200-227: New redacted_thinking support + data field addition looks consistent with the existing content-block model.
414-423: stop_sequence configuration is correct; the Anthropic streaming API expects the field to always be present as null or a string value, never omitted. Anthropic's Messages API streaming contract specifies that stop_sequence is always emitted (string|null), with null values in message_start events and potentially null or a string in subsequent events. The current pointer type with no omitempty tag correctly produces "stop_sequence": null in JSON when nil, matching the contract. Clients should treat missing fields as null and rely on the final message_stop event for the definitive value.
core/internal/testutil/account.go (1)
138-143: New Bedrock deployment mapping looks fine; please double-check the deployment ID is enabled in the test AWS account.
core/internal/testutil/responses_stream.go (1)
435-446: MaxOutputTokens bump for reasoning streaming makes sense; keep an eye on the responseCount cap (150) if providers emit many small deltas.
core/providers/cohere/types.go (1)
11-13: Constants addition looks fine.
transports/bifrost-http/handlers/inference.go (1)
224-254: ResponsesRequest custom unmarshal mirrors the ChatRequest pattern well (avoids sonic + embedded/custom-unmarshal conflicts).
core/schemas/bifrost.go (1)
120-123: LGTM! New context keys are well-documented. The new BifrostContextKeyIntegrationType and BifrostContextKeyUserAgent constants follow the established naming conventions and include appropriate type annotations in comments. These additions support the broader streaming/context propagation changes in this PR stack.
core/providers/anthropic/anthropic.go (2)
677-692: Good refactor: Centralized request body construction. The Responses method now uses the centralized getRequestBodyForResponses helper, improving consistency with the streaming path and reducing code duplication. The variable rename from jsonData to jsonBody aligns with conventions used elsewhere in the codebase.
726-729: LGTM! Consistent streaming request body construction. The ResponsesStream method properly uses the centralized helper with isStreaming=true and correctly propagates any errors from request body construction.
core/providers/vertex/errors.go (1)
13-27: Good improvement: Proper response body decoding before unmarshalling. Adding CheckAndDecodeBody ensures gzip-encoded error responses are properly decoded before attempting to unmarshal. The early return on decode failure prevents confusing unmarshal errors when the underlying issue is compression handling. All subsequent unmarshal calls correctly use decodedBody.
core/providers/cohere/cohere.go (2)
513-513: LGTM! Updated to match the new function signature. The Responses method correctly adapts to the updated ToCohereResponsesRequest signature that now returns (*CohereChatRequest, error), enabling proper error propagation from the conversion layer.
570-577: LGTM! Proper error handling and nil-safety in the streaming path. The ResponsesStream method now correctly captures and propagates errors from ToCohereResponsesRequest. The nil check on reqBody before setting the Stream flag is a good defensive practice.
core/providers/cohere/chat.go (3)
7-7: LGTM! Import alias added for provider utilities. The providerUtils alias is consistent with other provider implementations and provides access to GetBudgetTokensFromReasoningEffort.
104-129: LGTM! Reasoning budget calculation is now safe and robust. The refactored reasoning handling correctly addresses the previous nil pointer concern by using DefaultCompletionMaxTokens as a fallback when MaxCompletionTokens is not provided. The error propagation from GetBudgetTokensFromReasoningEffort ensures invalid effort values are properly surfaced to callers.
440-453: LGTM! Clean extraction and reuse of thinking text. Extracting thinkingText once and reusing it for both Reasoning and ReasoningDetails.Text is cleaner than repeated dereferences. The mapping correctly populates both fields for streaming reasoning content.
core/providers/bedrock/utils.go (1)
39-73: Reasoning config logic addresses past review feedback. The implementation now correctly uses model-specific minimum budget tokens: Bedrock's MinimumReasoningMaxTokens for non-Anthropic models and anthropic.MinimumReasoningMaxTokens for Anthropic models (lines 57-60). This resolves the previously flagged concern.
core/providers/gemini/responses.go (3)
166-179: ThoughtSignature preservation creates a separate reasoning message. The implementation correctly stores the thought signature in a separate ResponsesReasoning message with an empty Summary slice and the signature in EncryptedContent. This aligns with the schema definition in core/schemas/responses.go (lines 730-733).
609-629: Look-ahead logic for thought signature restoration is correct. The implementation properly looks ahead to find the next reasoning message and extracts the encrypted content to restore the ThoughtSignature. The bounds check at line 621 prevents index out-of-range errors.
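A minimal sketch of that look-ahead pattern, using stand-in types (the real code walks []schemas.ResponsesMessage and reads ResponsesReasoning.EncryptedContent):

package gemini_sketch

// msg is an illustrative stand-in for schemas.ResponsesMessage.
type msg struct {
	isReasoning bool
	encrypted   *string
}

// nextSignature returns the encrypted content of the first reasoning message
// after index i, or nil if none exists; the loop condition keeps it in bounds.
func nextSignature(msgs []msg, i int) *string {
	for j := i + 1; j < len(msgs); j++ {
		if msgs[j].isReasoning {
			return msgs[j].encrypted
		}
	}
	return nil
}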
148-156: Range loop variable capture comment needs context on the Go module version. The comment about avoiding "range loop variable capture" is relevant only if the project targets Go versions before 1.22. Starting with Go 1.22, loop variables are created per-iteration by default; however, this behavior only applies to modules declaring go 1.22 (or later) in go.mod. If the project declares an older Go version, the copies are necessary and correct. If it declares Go 1.22+, these copies are redundant. Verify the Go version in go.mod (go directive) to determine if the copies should be removed.
core/providers/openai/types.go (3)
108-139: Custom UnmarshalJSON correctly handles an embedded struct with a custom unmarshaller. The implementation properly separates base field unmarshalling from ChatParameters unmarshalling, avoiding the issue where ChatParameters.UnmarshalJSON would hijack the entire process. This is a clean solution for Go's embedded struct JSON handling limitation.
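For readers unfamiliar with the pitfall, here is a minimal, self-contained sketch of the two-pass pattern; types are simplified and encoding/json is used instead of the real Bifrost structs and sonic:

package main

import (
	"encoding/json"
	"fmt"
)

// Params has its own UnmarshalJSON, which would normally hijack decoding of
// any struct that embeds it.
type Params struct {
	Temperature float64 `json:"temperature"`
}

func (p *Params) UnmarshalJSON(data []byte) error {
	type alias Params // alias drops the method set to avoid recursion
	var a alias
	if err := json.Unmarshal(data, &a); err != nil {
		return err
	}
	*p = Params(a)
	return nil
}

type Request struct {
	Model string `json:"model"`
	Params
}

// UnmarshalJSON decodes the base fields and the embedded Params separately,
// so Params.UnmarshalJSON cannot swallow the whole payload.
func (r *Request) UnmarshalJSON(data []byte) error {
	var base struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(data, &base); err != nil {
		return err
	}
	r.Model = base.Model
	return json.Unmarshal(data, &r.Params) // second pass for the embedded struct
}

func main() {
	var r Request
	_ = json.Unmarshal([]byte(`{"model":"gpt-x","temperature":0.2}`), &r)
	fmt.Println(r.Model, r.Temperature) // gpt-x 0.2
}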
191-229: MarshalJSON correctly strips MaxTokens from reasoning for OpenAI. The implementation creates a shallow copy of ResponsesParametersReasoning with MaxTokens explicitly set to nil before marshalling. This ensures OpenAI doesn't receive the max_tokens field, which it doesn't support in reasoning parameters, while preserving Effort, GenerateSummary, and Summary.
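A small, self-contained sketch of the copy-and-nil approach with simplified types (field names are illustrative; the real code operates on ResponsesParametersReasoning):

package main

import (
	"encoding/json"
	"fmt"
)

type Reasoning struct {
	Effort    *string `json:"effort,omitempty"`
	MaxTokens *int    `json:"max_tokens,omitempty"`
}

type Request struct {
	Model     string     `json:"model"`
	Reasoning *Reasoning `json:"reasoning,omitempty"`
}

// MarshalJSON emits a shallow copy whose reasoning block has MaxTokens nilled
// out, so the provider never sees the unsupported field.
func (r Request) MarshalJSON() ([]byte, error) {
	type alias Request // alias avoids recursing into this method
	out := alias(r)
	if r.Reasoning != nil {
		stripped := *r.Reasoning // copy, so the caller's struct is untouched
		stripped.MaxTokens = nil
		out.Reasoning = &stripped
	}
	return json.Marshal(out)
}

func main() {
	effort, budget := "high", 1500
	b, _ := json.Marshal(Request{Model: "o4-mini", Reasoning: &Reasoning{Effort: &effort, MaxTokens: &budget}})
	fmt.Println(string(b)) // {"model":"o4-mini","reasoning":{"effort":"high"}}
}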
11-13: New constant MinMaxCompletionTokens appears unused in this file. The constant is declared but not referenced within this file. Verify it's used elsewhere or if this is leftover from refactoring.
core/providers/anthropic/chat.go (3)
100-121: Reasoning conversion logic is well-structured with proper priority. The three-branch approach correctly prioritizes:
- Explicit MaxTokens → direct use
- Effort (non-"none") → computed budget via GetBudgetTokensFromReasoningEffort
- Fallback → disabled
Error handling for budget computation is properly propagated. The use of anthropic.MinimumReasoningMaxTokens is correct for the Anthropic provider.
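A self-contained sketch of that priority order; the effort-to-fraction mapping and the 1024 floor are assumptions standing in for GetBudgetTokensFromReasoningEffort and MinimumReasoningMaxTokens:

package main

import "fmt"

const minimumReasoningMaxTokens = 1024 // illustrative floor, not the real constant

// thinkingBudget resolves the budget with the same priority order:
// explicit max tokens first, then a budget derived from effort, else disabled (0).
func thinkingBudget(maxTokens *int, effort string, completionMax int) (int, error) {
	if maxTokens != nil {
		return *maxTokens, nil
	}
	fractions := map[string]float64{"low": 0.2, "medium": 0.5, "high": 0.8} // assumed mapping
	if effort != "" && effort != "none" {
		f, ok := fractions[effort]
		if !ok {
			return 0, fmt.Errorf("unknown reasoning effort %q", effort)
		}
		budget := int(float64(completionMax) * f)
		if budget < minimumReasoningMaxTokens {
			budget = minimumReasoningMaxTokens
		}
		return budget, nil
	}
	return 0, nil // thinking disabled
}

func main() {
	explicit := 2048
	b1, _ := thinkingBudget(&explicit, "high", 8000)
	b2, _ := thinkingBudget(nil, "medium", 8000)
	b3, _ := thinkingBudget(nil, "", 8000)
	fmt.Println(b1, b2, b3) // 2048 4000 0
}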
604-604: Guard condition for PartialJSON is appropriate. The nil check before accessing *chunk.Delta.PartialJSON prevents a potential nil pointer dereference.
630-657: Thinking delta handling correctly extracts to a local variable. Extracting thinkingText avoids repeated dereferences and improves readability. The use of schemas.Ptr(thinkingText) for both Reasoning and Text fields is consistent.
ui/lib/types/logs.ts (2)
411-419: Type rename aligns with backend schema changes. The rename from ResponsesReasoningContent to ResponsesReasoningSummary maintains consistency with the backend schema in core/schemas/responses.go (lines 744-747). The field structure (type: "summary_text", text: string) correctly mirrors the Go struct.
434-434: ResponsesMessage.summary type updated consistently. The optional summary field now uses ResponsesReasoningSummary[], maintaining type consistency across the interface hierarchy.
core/providers/openai/responses.go (2)
95-104: MaxOutputTokens clamp + User sanitization look good.
Nice guardrails before hitting the provider.
109-137: Verify tool_choice consistency when filtering unsupported tools. The function filters req.Tools but doesn't validate or adjust req.ToolChoice if it references a removed tool type, which could make the API request invalid. Confirm whether ToolChoice is validated elsewhere or needs to be handled in this function. Also verify that the supportedTypes list matches the OpenAI Responses API's current specification.
core/internal/testutil/chat_completion_stream.go (1)
576-707: Validated reasoning streaming test structure looks solid (retry + indicator checks).
core/schemas/responses.go (2)
397-411: Signature on ResponsesMessageContentBlock is a good extension point for reasoning provenance.
730-748: Reasoning summary type rename (ResponsesReasoningSummary) looks consistent and clearer than the old name.
framework/streaming/responses.go (1)
626-727: Helper split for reasoning delta vs signature is a net readability win.
core/providers/bedrock/bedrock_test.go (1)
482-934: Bedrock request-to-bifrost mapping expectations (Type field + tool output string) updates look consistent.
ac6bbd4 to 3a4d9a5 (Compare)
abdcda6 to 1a511ff (Compare)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
transports/bifrost-http/handlers/inference.go (1)
91-118: Add user to responsesParamsKnownFields to avoid duplicating it into ExtraParams.
schemas.ResponsesParametersincludesUser, but since it’s not “known” here,extractExtraParamswill also capture it, risking double-send/override downstream.var responsesParamsKnownFields = map[string]bool{ @@ "truncation": true, + "user": true, }core/providers/cohere/responses.go (1)
187-742: Tool-call args: avoid initializingArgumentsto""unless you’re sure downstream can handle “empty JSON”.In
StreamEventToolCallStart, you set:Arguments: schemas.Ptr(""), // Arguments will be filled by deltasIf a tool call ends up emitting no arg deltas, this leaves
Arguments == "", which is the same class of replay/unmarshal hazard you’ve hit elsewhere.Safer pattern:
- keep
Argumentsnil until first delta arrives, or- ensure
ToolCallEndemits a done event that sets a valid JSON object ({}) for empty args, rather than empty string.core/providers/anthropic/types.go (1)
58-140: InitializeExtraParamsonly when capturing unknown fields to preserveomitemptybehavior.
UnmarshalJSONunconditionally initializesmr.ExtraParamsto a non-nil map, which defeats theomitemptytag. In Go, non-nil empty maps serialize as{}regardless ofomitempty. When no unknown fields are present,ExtraParamsshould remainnilso the field is omitted from JSON output.Apply the suggested fix: reset
ExtraParamstonil, collect unknowns in a local map, and only assign iflen(unknown) > 0. This aligns with the pattern already used inhandlers/inference.go:extractExtraParams().Additionally, consider returning an error instead of silently skipping unmarshal failures (
continue), as this could mask malformed payloads that should be rejected.
♻️ Duplicate comments (15)
ui/app/workspace/logs/views/logDetailsSheet.tsx (1)
187-237: Boolean/number rendering issues remain unaddressed.The previous review already flagged this:
reasoning.generate_summaryis a boolean and won't render as text inside the Badge (line 227). Also, truthiness checks likereasoning.max_tokens && ...(line 232) will hide the field when the value is0.core/internal/testutil/chat_completion_stream.go (1)
535-538: Cerebras prompt mismatch with test configuration.The Cerebras-specific prompt asks to "use your tools for this" but the test request doesn't configure any tools. This was flagged in a previous review.
transports/bifrost-http/integrations/anthropic.go (3)
78-85: Align/v1/messagesRawResponse passthrough gating withshouldUsePassthrough(avoid accidental raw payload exposure / inconsistency).
Right now/v1/completeand streaming requireshouldUsePassthrough(...), but/v1/messagesnon-stream returnsRawResponsefor any Claude-like model regardless of claude-cli detection.- ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) { - if isClaudeModel(resp.ExtraFields.ModelRequested, resp.ExtraFields.ModelDeployment, string(resp.ExtraFields.Provider)) { + ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) { + if shouldUsePassthrough(ctx, resp.ExtraFields.Provider, resp.ExtraFields.ModelRequested, resp.ExtraFields.ModelDeployment) { if resp.ExtraFields.RawResponse != nil { return resp.ExtraFields.RawResponse, nil } } return anthropic.ToAnthropicResponsesResponse(resp), nil },
240-245: Reconfirmprovider == ""behavior inisClaudeModel(and document intent).
As written, provider-less requests can be treated as Anthropic based onschemas.IsAnthropicModel(model). If this is intentional for backwards compatibility, add an explicit comment explaining why; otherwise, tighten the condition.
90-127: Don’t drop passthrough stream events on JSON parse failure; avoid stdliblog.Printf; avoidstring +=in loop.
IfRawResponseexists but can’t be parsed intoAnthropicStreamEvent, returning("", nil, nil)can stall/lose frames. Also,log.Printfbypasses your structured logger, andcombinedContent += ...is quadratic.@@ if shouldUsePassthrough(ctx, resp.ExtraFields.Provider, resp.ExtraFields.ModelRequested, resp.ExtraFields.ModelDeployment) { if resp.ExtraFields.RawResponse != nil { raw, ok := resp.ExtraFields.RawResponse.(string) if !ok { return "", nil, fmt.Errorf("expected RawResponse string, got %T", resp.ExtraFields.RawResponse) } var rawResponseJSON anthropic.AnthropicStreamEvent if err := sonic.Unmarshal([]byte(raw), &rawResponseJSON); err == nil { return string(rawResponseJSON.Type), raw, nil } + // Fallback: forward raw payload even if we can't classify the event type. + return "", raw, nil } return "", nil, nil } @@ - if len(anthropicResponse) > 1 { - combinedContent := "" + if len(anthropicResponse) > 1 { + var combinedContent strings.Builder for _, event := range anthropicResponse { responseJSON, err := sonic.Marshal(event) if err != nil { - // Log JSON marshaling error but continue processing (should not happen) - log.Printf("Failed to marshal streaming response: %v", err) + // Prefer router/provider logger (plumb it here) rather than stdlib log. continue } - combinedContent += fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON) + combinedContent.WriteString(fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON)) } - return "", combinedContent, nil + return "", combinedContent.String(), nil } else if len(anthropicResponse) == 1 { return string(anthropicResponse[0].Type), anthropicResponse[0], nil } else { return "", nil, nil }core/providers/anthropic/utils.go (1)
30-76: Past concern aboutstreamfield override remains unaddressed.The past review flagged that when
isStreamingis true, the code always setsrequestBody["stream"] = true(line 48), potentially overriding an explicitstream: falsein the raw body. This concern appears unaddressed in the current code.The suggested fix from the past review was:
// Add stream if not present if isStreaming { - requestBody["stream"] = true + if _, exists := requestBody["stream"]; !exists { + requestBody["stream"] = true + } }Also, as noted previously, consider adding function-level documentation explaining the dual-mode behavior (raw vs. conversion) and the security assumptions around raw request body handling.
core/providers/openai/types.go (2)
108-139: Regression tests still needed for the UnmarshalJSON fix.The implementation correctly addresses the ChatParameters unmarshalling hijack issue by using a two-phase approach. However, as noted in a past review, regression tests should be added to verify that
Model,Messages,Stream,MaxTokens, andFallbacksare properly preserved alongsideChatParametersfields.
191-229: Unit tests still needed for OpenAIResponsesRequest.MarshalJSON().The implementation correctly shadows
reasoning.max_tokens(setting it to nil) and preserves custom Input marshalling. As noted in a past review, add tests to verify thatmax_tokensis absent from the JSON output and thatinputhandles both string and array forms correctly.core/providers/bedrock/utils.go (1)
118-123: Don’t overwriteAdditionalModelRequestFields(it can erasereasoning_config).
bedrockReq.AdditionalModelRequestFields = orderedFieldscan drop the earlier"reasoning_config"entry (and any other previously-set fields). Merge instead.- if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok { - bedrockReq.AdditionalModelRequestFields = orderedFields - } + if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok { + if bedrockReq.AdditionalModelRequestFields == nil { + bedrockReq.AdditionalModelRequestFields = make(schemas.OrderedMap) + } + // Preserve required/computed fields (e.g. reasoning_config) by default. + for k, v := range orderedFields { + if _, exists := bedrockReq.AdditionalModelRequestFields[k]; !exists { + bedrockReq.AdditionalModelRequestFields[k] = v + } + } + }core/providers/azure/utils.go (1)
13-60: Normalize"stream"in raw-body mode and add a nilrequestguard.
- If raw JSON already has
"stream": trueandisStreaming=false, you’ll still send a streaming request.requestcan be nil → panic.func getRequestBodyForAnthropicResponses(ctx context.Context, request *schemas.BifrostResponsesRequest, deployment string, providerName schemas.ModelProvider, isStreaming bool) ([]byte, *schemas.BifrostError) { var jsonBody []byte var err error + + if request == nil { + return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("request is nil"), providerName) + } // Check if raw request body should be used if useRawBody, ok := ctx.Value(schemas.BifrostContextKeyUseRawRequestBody).(bool); ok && useRawBody { jsonBody = request.GetRawRequestBody() @@ // Add stream if not present if isStreaming { requestBody["stream"] = true + } else { + delete(requestBody, "stream") }core/providers/vertex/utils.go (1)
13-42: Normalize"stream"in raw-body mode and add a nilrequestguard.
Same issue as Azure: a raw body containing"stream": truewill keep streaming enabled even whenisStreaming=false. Also,request == nilcan panic.func getRequestBodyForAnthropicResponses(ctx context.Context, request *schemas.BifrostResponsesRequest, deployment string, providerName schemas.ModelProvider, isStreaming bool) ([]byte, *schemas.BifrostError) { var jsonBody []byte var err error + + if request == nil { + return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("request is nil"), providerName) + } @@ // Add stream if not present if isStreaming { requestBody["stream"] = true + } else { + delete(requestBody, "stream") }core/providers/gemini/responses.go (2)
138-179: Don’t emit emptyArguments, and base64-encodeThoughtSignaturebefore putting into JSON.This is the same issue previously flagged (and noted as “handled in different pr”), but the final code shown here still:
- always sets
ResponsesToolMessage.Arguments: &argumentsStreven when empty (can breaksonic.Unmarshalon replay), and- stores
ThoughtSignatureviastring([]byte)which can corrupt non-UTF8 bytes.Suggested patch:
@@ - argumentsStr := "" + var argumentsStr *string if part.FunctionCall.Args != nil { if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil { - argumentsStr = string(argsBytes) + s := string(argsBytes) + if strings.TrimSpace(s) != "" { + argumentsStr = &s + } } } @@ toolMsg := &schemas.ResponsesToolMessage{ CallID: &functionCallID, Name: &functionCallName, - Arguments: &argumentsStr, + Arguments: argumentsStr, } @@ if len(part.ThoughtSignature) > 0 { - thoughtSig := string(part.ThoughtSignature) + thoughtSig := base64.RawStdEncoding.EncodeToString(part.ThoughtSignature) reasoningMsg := schemas.ResponsesMessage{ @@ ResponsesReasoning: &schemas.ResponsesReasoning{ Summary: []schemas.ResponsesReasoningSummary{}, EncryptedContent: &thoughtSig, }, } messages = append(messages, reasoningMsg) }Also, since this is a Graphite stack: please ensure the “other PR” with this fix is upstack of #1000, otherwise this PR reintroduces the bug.
609-630: Decode base64EncryptedContentback intoThoughtSignature(don’t treat it as raw bytes).If
EncryptedContentis base64 (as it should be), this line will currently produce the wrong bytes:
part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent).@@ if nextMsg.Type != nil && *nextMsg.Type == schemas.ResponsesMessageTypeReasoning && nextMsg.ResponsesReasoning != nil && nextMsg.ResponsesReasoning.EncryptedContent != nil { - part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent) + sigBytes, err := base64.RawStdEncoding.DecodeString(*nextMsg.ResponsesReasoning.EncryptedContent) + if err != nil { + return nil, nil, fmt.Errorf("failed to decode thought signature: %w", err) + } + part.ThoughtSignature = sigBytes }core/providers/cohere/responses.go (2)
1294-1330: Regression risk: encrypted reasoning is embedded into plain text marker.This is the same concern raised earlier: the fallback branch still constructs:
[ENCRYPTED_REASONING: ...].Even if most callers set
Summary: []...{}(thus skipping the branch), this remains a footgun for any message whereSummaryis nil butEncryptedContentis set.Recommendation: always drop
EncryptedContentfor Cohere (or store it in non-user-visible metadata if you have a safe place), never embed it in text.
742-823: DeleteAnnotationIndexToContentIndexentries on citation end to prevent unbounded growth.
StreamEventCitationEndlooks up the mapping but never removes it. Add:contentIndex, exists := state.AnnotationIndexToContentIndex[*chunk.Index] if !exists { contentIndex = *chunk.Index } + delete(state.AnnotationIndexToContentIndex, *chunk.Index)
🧹 Nitpick comments (16)
core/schemas/utils.go (2)
1042-1045: Comment tweak looks good; consider labeling this as heuristic detection. The implementation is a substring check (not authoritative provider parsing), so “checks if … is an Anthropic model” can be read as stronger than what it does.
1047-1050: Comment tweak looks good; same “heuristic” wording suggestion applies. This is also a substring-based check and may match unexpected strings.core/providers/cohere/types.go (2)
69-97: Inconsistent JSON library usage betweenCohereMessageContentand streaming structs.
CohereMessageContentusesencoding/jsonfor marshal/unmarshal (lines 72-75, 84-91), while the streaming structs (CohereStreamToolCallStruct,CohereStreamContentStruct,CohereStreamCitationStruct) usegithub.com/bytedance/sonic. Consider usingsonicconsistently throughout for performance alignment, or document why standardjsonis preferred here.func (c *CohereMessageContent) MarshalJSON() ([]byte, error) { if c.StringContent != nil { - return json.Marshal(*c.StringContent) + return sonic.Marshal(*c.StringContent) } if c.BlocksContent != nil { - return json.Marshal(c.BlocksContent) + return sonic.Marshal(c.BlocksContent) } return []byte("null"), nil } func (c *CohereMessageContent) UnmarshalJSON(data []byte) error { var str string - if err := json.Unmarshal(data, &str); err == nil { + if err := sonic.Unmarshal(data, &str); err == nil { c.StringContent = &str return nil } var blocks []CohereContentBlock - if err := json.Unmarshal(data, &blocks); err == nil { + if err := sonic.Unmarshal(data, &blocks); err == nil { c.BlocksContent = blocks return nil } return fmt.Errorf("content must be either string or array of content blocks") }
425-433: Inconsistent null marshaling between structs.
CohereMessageContent.MarshalJSONreturns[]byte("null"), nilwhen both fields are nil, but the streaming structs returnsonic.Marshal(nil). While functionally equivalent, consider using the same approach for consistency:func (c *CohereStreamToolCallStruct) MarshalJSON() ([]byte, error) { if c.CohereToolCallObject != nil { return sonic.Marshal(c.CohereToolCallObject) } if c.CohereToolCallArray != nil { return sonic.Marshal(c.CohereToolCallArray) } - return sonic.Marshal(nil) + return []byte("null"), nil }core/internal/testutil/chat_completion_stream.go (1)
583-693: Consider extracting shared streaming validation logic.The validation callback in
ChatCompletionStreamWithReasoningValidated(lines 583-693) shares significant logic withChatCompletionStreamWithReasoning(lines 422-497). Consider extracting the common streaming read loop and reasoning detection into a helper function to reduce duplication.core/providers/bedrock/types.go (1)
128-137: Silent skip of unmarshal errors may hide data issues.When an unknown field fails to unmarshal (line 133), it's silently skipped. Consider logging a warning to aid debugging when unexpected field types are encountered:
for key, value := range rawData { if !bedrockConverseRequestKnownFields[key] { var v interface{} if err := sonic.Unmarshal(value, &v); err != nil { - continue // Skip fields that can't be unmarshaled + // Log but don't fail - allows forward compatibility + // Consider: log.Printf("bedrock: failed to unmarshal extra param %q: %v", key, err) + continue } r.ExtraParams[key] = v } }transports/bifrost-http/integrations/router.go (2)
709-712: DONE marker gating is brittle when keyed offconfig.Pathsubstring.
strings.Contains(config.Path, "/responses")can accidentally affect non-OpenAI integrations if they ever have “/responses” in route paths; prefer gating by route/integration type + request kind, or pass a boolean fromhandleStreamingRequestwhen the request isResponsesStream.- shouldSendDoneMarker := true - if config.Type == RouteConfigTypeAnthropic || strings.Contains(config.Path, "/responses") { + shouldSendDoneMarker := true + if config.Type == RouteConfigTypeAnthropic || + (config.Type == RouteConfigTypeOpenAI && strings.Contains(config.Path, "/responses")) { shouldSendDoneMarker = false }
879-886: Raw SSE string detection should accept more valid SSE prefixes (and “event:” without requiring a space).
Current check can double-wrap valid SSE strings likeid: .../retry: ...orevent:(no space), producing invalid output.- } else if sseString, ok := convertedResponse.(string); ok { + } else if sseString, ok := convertedResponse.(string); ok { // CUSTOM SSE FORMAT: The converter returned a complete SSE string // This is used by providers like Anthropic that need custom event types // Example: "event: content_block_delta\ndata: {...}\n\n" - if !strings.HasPrefix(sseString, "data: ") && !strings.HasPrefix(sseString, "event: ") { + // Accept common SSE fields; allow optional space after ':' per SSE conventions. + if !(strings.HasPrefix(sseString, "data:") || + strings.HasPrefix(sseString, "event:") || + strings.HasPrefix(sseString, "id:") || + strings.HasPrefix(sseString, "retry:")) { sseString = fmt.Sprintf("data: %s\n\n", sseString) } if _, err := fmt.Fprint(w, sseString); err != nil { cancel() // Client disconnected (write error), cancel upstream stream return }core/providers/openai/types_test.go (1)
204-331: Make the error-case assertion more specific (improves debuggability).
Right now the “both reasoning and reasoning_effort should error” case only checkserr != nil; consider asserting the error mentionsreasoning_effort(or whichever invariant you enforce).if tt.expectError { if err == nil { t.Error("Expected error but got none") } + // Optional: tighten to a stable substring if available. + // if err != nil && !strings.Contains(err.Error(), "reasoning_effort") { ... } return }core/providers/openai/responses.go (1)
114-142: Consider documenting how to maintain the supported tools list.The
filterUnsupportedToolsmethod uses a hardcoded map of supported tool types. While the implementation is correct, this creates a maintenance burden: when new tool types are added toschemas.ResponsesToolType, this list must also be updated.Consider adding a comment noting this, or alternatively, use a deny-list approach (filter out specifically unsupported types) if that list is shorter and more stable.
core/providers/bedrock/utils.go (1)
35-76: Reasoning budget derivation looks good; consider making"disabled"consistent type-wise.
The split betweenMaxTokensvsEffort(and Bedrock-vs-Anthropic min budget selection) is a solid improvement. One small consistency tweak:"disabled"usesmap[string]stringwhile enabled usesmap[string]any; consider usingmap[string]anyfor both to keep downstream handling uniform.core/providers/vertex/utils.go (1)
53-78: Optional: remove the “set model then delete model” step (clarity).
reqBody.Model = deploymenthas no effect on the final payload since"model"is removed before sending to Vertex; consider dropping it to reduce confusion (unless some downstream logic depends on it prior to the delete).core/schemas/responses.go (1)
400-404: Signature field addition looks reasonable; ensure it’s populated/forwarded consistently across the stack.
This adds a new surface area (signature) that UI/stream aggregators will likely rely on; worth double-checking all provider adapters in this PR stack either set it or intentionally leave it nil.core/providers/bedrock/bedrock_test.go (1)
1497-1513: Structural comparison for Bedrock response conversion is fine; considerrequire.NotNilon nested pointers to avoid panic on regressions.If
actual.Output / actual.Output.Message ever regresses to nil, these assertions would panic. A couple of require.NotNil(t, actual.Output) / require.NotNil(t, actual.Output.Message) calls would make failures cleaner.
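Something along these lines (field names taken from the assertions quoted above) would fail the test cleanly instead of panicking:

```go
require.NotNil(t, actual.Output)
require.NotNil(t, actual.Output.Message)
// Structural comparisons below can now dereference safely.
```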
1225-1247: ToolChoice default-to-required is risky for unknown strings.If
tool_choice contains an unrecognized value, defaulting to required can force tool calls unexpectedly. Consider returning nil (let Cohere default) for the default: case.
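A hedged sketch of that mapping; the accepted strings should be checked against Cohere's API, and the helper name is hypothetical:

```go
func mapToolChoice(tc *string) *string {
	if tc == nil {
		return nil
	}
	switch *tc {
	case "none", "required": // values the converter knowingly forwards
		return schemas.Ptr(*tc)
	default:
		return nil // unrecognized: fall back to the provider default
	}
}
```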
1345-1347: Minor: usemsg.ResponsesToolMessage.CallIDinstead of promotedmsg.CallIDfor clarity.Current code is safe due to the nil-guard, but accessing through
msg.ResponsesToolMessage.CallIDavoids confusion about promoted fields.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
ui/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (58)
core/internal/testutil/account.go(1 hunks)core/internal/testutil/chat_completion_stream.go(1 hunks)core/internal/testutil/reasoning.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/tests.go(2 hunks)core/providers/anthropic/anthropic.go(3 hunks)core/providers/anthropic/chat.go(5 hunks)core/providers/anthropic/errors.go(2 hunks)core/providers/anthropic/types.go(8 hunks)core/providers/anthropic/utils.go(2 hunks)core/providers/azure/azure.go(2 hunks)core/providers/azure/utils.go(1 hunks)core/providers/bedrock/bedrock.go(2 hunks)core/providers/bedrock/bedrock_test.go(13 hunks)core/providers/bedrock/types.go(2 hunks)core/providers/bedrock/utils.go(2 hunks)core/providers/cerebras/cerebras_test.go(2 hunks)core/providers/cohere/chat.go(3 hunks)core/providers/cohere/cohere.go(2 hunks)core/providers/cohere/cohere_test.go(1 hunks)core/providers/cohere/responses.go(7 hunks)core/providers/cohere/types.go(1 hunks)core/providers/gemini/responses.go(2 hunks)core/providers/groq/groq_test.go(2 hunks)core/providers/mistral/mistral_test.go(1 hunks)core/providers/openai/chat.go(1 hunks)core/providers/openai/responses.go(2 hunks)core/providers/openai/responses_marshal_test.go(1 hunks)core/providers/openai/responses_test.go(1 hunks)core/providers/openai/text.go(1 hunks)core/providers/openai/types.go(3 hunks)core/providers/openai/types_test.go(1 hunks)core/providers/openai/utils.go(1 hunks)core/providers/openrouter/openrouter_test.go(1 hunks)core/providers/utils/utils.go(4 hunks)core/providers/vertex/errors.go(1 hunks)core/providers/vertex/types.go(0 hunks)core/providers/vertex/utils.go(1 hunks)core/providers/vertex/vertex.go(3 hunks)core/providers/vertex/vertex_test.go(1 hunks)core/schemas/bifrost.go(1 hunks)core/schemas/mux.go(0 hunks)core/schemas/responses.go(5 hunks)core/schemas/utils.go(1 hunks)docs/docs.json(0 hunks)framework/configstore/rdb.go(0 hunks)framework/streaming/chat.go(0 hunks)framework/streaming/responses.go(2 hunks)transports/bifrost-http/handlers/inference.go(1 hunks)transports/bifrost-http/handlers/middlewares.go(1 hunks)transports/bifrost-http/integrations/anthropic.go(5 hunks)transports/bifrost-http/integrations/router.go(3 hunks)ui/app/workspace/logs/views/columns.tsx(1 hunks)ui/app/workspace/logs/views/logDetailsSheet.tsx(1 hunks)ui/app/workspace/logs/views/logResponsesMessageView.tsx(2 hunks)ui/app/workspace/logs/views/logResponsesOutputView.tsx(0 hunks)ui/lib/types/logs.ts(2 hunks)ui/package.json(1 hunks)
💤 Files with no reviewable changes (6)
- core/schemas/mux.go
- docs/docs.json
- ui/app/workspace/logs/views/logResponsesOutputView.tsx
- core/providers/vertex/types.go
- framework/configstore/rdb.go
- framework/streaming/chat.go
🚧 Files skipped from review as they are similar to previous changes (20)
- core/providers/cohere/cohere_test.go
- ui/app/workspace/logs/views/logResponsesMessageView.tsx
- ui/lib/types/logs.ts
- ui/app/workspace/logs/views/columns.tsx
- core/providers/anthropic/chat.go
- ui/package.json
- core/providers/openai/chat.go
- core/providers/openrouter/openrouter_test.go
- core/internal/testutil/responses_stream.go
- core/providers/azure/azure.go
- core/providers/groq/groq_test.go
- core/providers/cerebras/cerebras_test.go
- core/providers/cohere/cohere.go
- core/internal/testutil/account.go
- core/internal/testutil/tests.go
- core/providers/anthropic/anthropic.go
- core/providers/utils/utils.go
- core/providers/openai/text.go
- transports/bifrost-http/handlers/middlewares.go
- core/providers/bedrock/bedrock.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openai/utils.gocore/providers/openai/responses.gocore/providers/vertex/utils.gocore/providers/openai/types_test.gocore/providers/openai/responses_test.gocore/providers/vertex/errors.gocore/providers/openai/responses_marshal_test.goui/app/workspace/logs/views/logDetailsSheet.tsxcore/internal/testutil/chat_completion_stream.gocore/providers/vertex/vertex.gocore/schemas/responses.gocore/providers/gemini/responses.gocore/providers/anthropic/utils.gocore/providers/cohere/chat.gocore/providers/bedrock/utils.gotransports/bifrost-http/integrations/anthropic.gotransports/bifrost-http/handlers/inference.gocore/schemas/utils.gocore/providers/anthropic/errors.gocore/providers/cohere/types.gocore/providers/azure/utils.gocore/providers/mistral/mistral_test.gocore/schemas/bifrost.gocore/providers/openai/types.goframework/streaming/responses.gocore/internal/testutil/reasoning.gotransports/bifrost-http/integrations/router.gocore/providers/bedrock/types.gocore/providers/cohere/responses.gocore/providers/anthropic/types.gocore/providers/bedrock/bedrock_test.gocore/providers/vertex/vertex_test.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/openai/utils.gocore/providers/openai/responses.gocore/providers/vertex/utils.gocore/providers/openai/types_test.gocore/providers/openai/responses_test.gocore/providers/vertex/errors.gocore/providers/openai/responses_marshal_test.gocore/internal/testutil/chat_completion_stream.gocore/providers/vertex/vertex.gocore/schemas/responses.gocore/providers/gemini/responses.gocore/providers/anthropic/utils.gocore/providers/cohere/chat.gocore/providers/bedrock/utils.gotransports/bifrost-http/integrations/anthropic.gotransports/bifrost-http/handlers/inference.gocore/schemas/utils.gocore/providers/anthropic/errors.gocore/providers/cohere/types.gocore/providers/azure/utils.gocore/providers/mistral/mistral_test.gocore/schemas/bifrost.gocore/providers/openai/types.goframework/streaming/responses.gocore/internal/testutil/reasoning.gotransports/bifrost-http/integrations/router.gocore/providers/bedrock/types.gocore/providers/cohere/responses.gocore/providers/anthropic/types.gocore/providers/bedrock/bedrock_test.gocore/providers/vertex/vertex_test.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/utils.gocore/providers/openai/responses.gocore/providers/openai/types_test.gocore/providers/openai/responses_test.gocore/providers/openai/responses_marshal_test.gocore/providers/openai/types.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.
Applied to files:
transports/bifrost-http/integrations/anthropic.gotransports/bifrost-http/handlers/inference.gotransports/bifrost-http/integrations/router.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
🧬 Code graph analysis (16)
core/providers/openai/responses.go (4)
core/schemas/responses.go (2)
ResponsesMessage(314-327)ResponsesParameters(87-114)core/schemas/utils.go (1)
Ptr(16-18)core/providers/openai/types.go (3)
OpenAIResponsesRequest(180-189)OpenAIResponsesRequestInput(147-150)MinMaxCompletionTokens(12-12)core/providers/openai/utils.go (1)
SanitizeUserField(51-56)
core/providers/openai/types_test.go (2)
core/providers/openai/types.go (1)
OpenAIChatRequest(46-59)core/schemas/chatcompletions.go (2)
ChatMessageRoleUser(462-462)ChatMessageRoleSystem(463-463)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
CheckAndDecodeBody(490-498)NewBifrostOperationError(516-527)core/schemas/provider.go (1)
ErrProviderResponseDecode(29-29)core/providers/vertex/vertex.go (1)
VertexError(25-31)core/providers/vertex/types.go (1)
VertexValidationError(153-160)
core/providers/openai/responses_marshal_test.go (2)
core/providers/openai/types.go (2)
OpenAIResponsesRequest(180-189)OpenAIResponsesRequestInput(147-150)core/schemas/responses.go (3)
ResponsesParameters(87-114)ResponsesParametersReasoning(234-239)ResponsesMessage(314-327)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
Type(778-778)ui/lib/types/logs.ts (2)
ResponsesMessageContentBlockType(352-359)ResponsesReasoningSummary(412-415)
core/providers/anthropic/utils.go (7)
core/schemas/responses.go (1)
BifrostResponsesRequest(32-39)core/schemas/bifrost.go (3)
ModelProvider(32-32)BifrostError(358-367)BifrostContextKeyUseRawRequestBody(117-117)core/providers/utils/utils.go (1)
NewBifrostOperationError(516-527)core/schemas/provider.go (2)
ErrRequestBodyConversion(25-25)ErrProviderRequestMarshal(26-26)core/providers/anthropic/types.go (1)
AnthropicDefaultMaxTokens(14-14)core/providers/anthropic/responses.go (1)
ToAnthropicResponsesRequest(1419-1532)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/cohere/chat.go (5)
core/providers/cohere/types.go (5)
CohereThinking(174-177)ThinkingTypeEnabled(183-183)DefaultCompletionMaxTokens(12-12)MinimumReasoningMaxTokens(11-11)ThinkingTypeDisabled(184-184)core/providers/bedrock/types.go (2)
DefaultCompletionMaxTokens(13-13)MinimumReasoningMaxTokens(12-12)core/providers/utils/utils.go (1)
GetBudgetTokensFromReasoningEffort(1084-1121)core/schemas/utils.go (1)
Ptr(16-18)core/schemas/chatcompletions.go (4)
ChatStreamResponseChoice(751-753)ChatStreamResponseChoiceDelta(756-763)ChatReasoningDetails(723-730)BifrostReasoningDetailsTypeText(719-719)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
ResponsesRequest(91-91)core/schemas/responses.go (1)
ResponsesParameters(87-114)
core/providers/anthropic/errors.go (2)
core/schemas/bifrost.go (1)
BifrostError(358-367)ui/lib/types/logs.ts (1)
BifrostError(226-232)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
ChatParameters(155-184)core/schemas/responses.go (1)
ResponsesParametersReasoning(234-239)
framework/streaming/responses.go (3)
core/schemas/responses.go (8)
ResponsesMessage(314-327)ResponsesMessageTypeReasoning(307-307)ResponsesInputMessageRoleAssistant(332-332)ResponsesReasoning(731-734)ResponsesReasoningSummary(745-748)ResponsesMessageContent(339-344)ResponsesMessageContentBlock(399-411)ResponsesOutputMessageContentTypeReasoning(394-394)core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(51-53)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
BifrostContextKeyIntegrationType(120-120)core/providers/gemini/types.go (1)
Type(778-778)
core/providers/bedrock/types.go (5)
core/providers/anthropic/types.go (1)
Alias(104-104)core/providers/openai/types.go (2)
Alias(83-83)Alias(194-194)core/schemas/bifrost.go (2)
Alias(388-388)Alias(405-405)core/schemas/chatcompletions.go (4)
Alias(189-189)Alias(643-643)Alias(770-770)Alias(860-860)core/providers/gemini/types.go (7)
Alias(220-220)Alias(245-245)Alias(828-830)Alias(842-844)Alias(857-859)Alias(871-873)Alias(1306-1306)
core/providers/cohere/responses.go (4)
core/providers/cohere/types.go (2)
CohereContentBlock(146-160)CohereStreamEvent(405-410)core/schemas/responses.go (5)
ResponsesMessageContentBlock(399-411)BifrostResponsesStreamResponse(1427-1466)ResponsesMessage(314-327)ResponsesMessageContent(339-344)ResponsesOutputMessageContentTextAnnotation(438-449)core/schemas/utils.go (1)
Ptr(16-18)core/providers/utils/utils.go (2)
GetRandomString(1027-1038)GetBudgetTokensFromReasoningEffort(1084-1121)
core/providers/anthropic/types.go (3)
core/providers/bedrock/types.go (1)
Alias(104-104)core/providers/openai/types.go (2)
Alias(83-83)Alias(194-194)core/schemas/chatcompletions.go (4)
Alias(189-189)Alias(643-643)Alias(770-770)Alias(860-860)
core/providers/bedrock/bedrock_test.go (3)
core/schemas/utils.go (1)
Ptr(16-18)core/schemas/responses.go (3)
ResponsesMessageTypeMessage(290-290)ResponsesToolMessage(462-482)ResponsesToolMessageOutputStruct(531-535)core/schemas/chatcompletions.go (1)
OrderedMap(268-268)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (36)
core/providers/vertex/vertex_test.go (1)
31-31: No changes needed. The Vertex provider intentionally supports Anthropic Claude models through explicit model routing logic. TheIsAnthropicModel()function in schemas/utils.go recognizes any model string containing "claude" and the Vertex provider has dedicated handling to route these models via the Anthropic API format (addinganthropic_versionto the request). SettingReasoningModel: "claude-4.5-haiku"in the test configuration is correct and consistent with how other providers (Bedrock, Anthropic, Azure) configure reasoning models.core/providers/cohere/types.go (1)
11-12: LGTM! Constants align with other providers.These reasoning token budget constants are consistent with the same values introduced in
core/providers/bedrock/types.go, establishing uniform semantics across providers.core/providers/vertex/errors.go (2)
14-17: LGTM! Proper response decoding before unmarshalling.Good addition of
CheckAndDecodeBodyto handle content-encoded responses (e.g., gzip). The early return on decode failure withErrProviderResponseDecodeis appropriate.
19-43: Multi-format error parsing looks correct.The fallback chain (OpenAI → []VertexError → VertexError → VertexValidationError) correctly uses
decodedBodythroughout, and the error propagation paths are consistent with the relevant code snippets fromcore/providers/vertex/vertex.goandtypes.go.core/internal/testutil/chat_completion_stream.go (1)
361-521: LGTM! Well-structured reasoning streaming test.The test properly checks for multiple reasoning indicators (content, details, tokens) and handles the fact that different providers may expose reasoning differently. The 200-second timeout is appropriate for reasoning-heavy models, and the warning-only approach for missing indicators is reasonable given provider variability.
core/providers/bedrock/types.go (2)
12-13: LGTM! Constants consistent with Cohere provider.These reasoning token budget constants match
core/providers/cohere/types.go, maintaining consistency across providers.
82-98: Known fields map now correctly includes all struct fields.The previous review comments about missing
serviceTierandextra_paramshave been addressed. The map now correctly includes all JSON-tagged fields fromBedrockConverseRequest.core/internal/testutil/reasoning.go (3)
12-51: Responses reasoning test updates look coherent (token bump + include reasoning.encrypted_content).
No concerns with the updated request construction / validation flow.Also applies to: 87-116
201-310: Chat Completions reasoning test: sensible provider skip + retry wiring.
Looks fine; the OpenAI skip is a pragmatic flake-reduction lever.
314-411: Validator is nil-safe and non-fatal; logging is bounded.
No issues spotted in the new reasoning detection heuristics.core/providers/mistral/mistral_test.go (1)
49-50: Commented Reasoning=false is clear and implementation-accurate.
Good clarification that this is a bifrost integration gap, not a provider capability claim.core/providers/openai/utils.go (1)
47-56: User-field sanitization is a good defensive guard (64-char cap).
Returningnilto omit the field is reasonable and avoids upstream request failures. Based on learnings, this matches an observed OpenAI constraint.transports/bifrost-http/handlers/inference.go (1)
224-254: CustomResponsesRequest.UnmarshalJSONlooks correct (mirrors ChatRequest workaround).transports/bifrost-http/integrations/router.go (1)
312-314: Context: integration type propagation looks correct; consider using it (instead of path heuristics) for stream behavior.
Storing string(config.Type) under schemas.BifrostContextKeyIntegrationType is a good stack-level hook; it would also let you avoid strings.Contains(config.Path, ...) for stream quirks.
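A sketch of what that gating could look like when keyed off the context value instead of the path; the integration-type strings here are assumptions:

```go
func shouldSendDoneMarker(ctx context.Context, isResponsesStream bool) bool {
	integration, _ := ctx.Value(schemas.BifrostContextKeyIntegrationType).(string)
	if integration == "anthropic" {
		return false // Anthropic SSE ends with typed events, not [DONE]
	}
	if isResponsesStream {
		return false // Responses API streams end with response.* events
	}
	return true
}
```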
120-123: New context keys look good and align with router propagation.
BifrostContextKeyIntegrationTypeandBifrostContextKeyUserAgentare reasonable additions to the shared context surface.core/providers/vertex/vertex.go (2)
841-893: Streaming path now shares the same helper; double-check it sets streaming-specific fields for Vertex Anthropic (streamRawPredict).
You’re calling the helper withisStreaming=trueand passing the returned bytes toanthropic.HandleAnthropicResponsesStream; please confirm the helper sets/keeps any required streaming flags/fields and doesn’t regress SSE framing expectations across the PR stack.
703-707: Helper correctly preserves raw-body passthrough and all Anthropic-on-Vertex semantics.The
getRequestBodyForAnthropicResponseshelper incore/providers/vertex/utils.goproperly implements both the raw-body passthrough path (via context flagBifrostContextKeyUseRawRequestBody) and the standard Anthropic conversion path. It correctly handles:
- Raw body retrieval and processing when flagged, with required transformations (anthropic_version injection, max_tokens, stream flag)
- Removal of model/region fields for Vertex compatibility
- Proper streaming flag assignment (false for responses, true for stream)
- Deployment-to-model mapping
- Consistent error handling
The refactored calls at lines 703 and ~841 are correct and preserve prior semantics.
core/providers/openai/types_test.go (2)
1-203: Good coverage for base-field + ChatParameters preservation.
These cases should prevent accidental regressions inOpenAIChatRequestcustom unmarshalling.
333-462: Presence + value assertion tests look clean and complementary.
Nice separation between “field present” and “exact value” validation.core/providers/anthropic/errors.go (1)
3-58: No redeclaration exists forToAnthropicResponsesStreamError—only one definition is present in the codebase.The primary concern about package-level redeclaration is unfounded; there is exactly one definition at
core/providers/anthropic/errors.go:43and one active usage attransports/bifrost-http/integrations/anthropic.go:129.The silent return of
""onjson.Marshalfailure is a minor consideration. In practice, marshaling the constructedAnthropicMessageErroris extremely unlikely to fail since it uses well-defined, serializable types. If error resilience is needed, a minimal fallback (e.g.,"event: error\ndata: {}\n\n") could be considered, but the current implementation is acceptable for this use case.Likely an incorrect or invalid review comment.
core/providers/openai/types.go (1)
11-13: LGTM! Constant defined for minimum token clamping.The
MinMaxCompletionTokens = 16constant provides a sensible minimum floor for output tokens, used inresponses.goto clampMaxOutputTokens.core/providers/openai/responses.go (3)
47-59: LGTM! Excellent documentation of the reasoning-only message skip constraint.The detailed comment (lines 47-52) clearly explains why reasoning-only messages are skipped for non-gpt-oss models. This addresses the past review request for documenting this API format constraint. The logic correctly checks all required conditions before skipping.
71-78: LGTM! Pointer-to-range-variable bug correctly fixed.Using
schemas.Ptr(summary.Text) (line 76) correctly creates a new pointer for each iteration, avoiding the aliasing issue where all Text pointers would reference the same memory location. This addresses the critical bug flagged in past reviews.
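A self-contained illustration of the aliasing pitfall the fix avoids, independent of bifrost's types:

```go
type summary struct{ Text string }

func textPointers(summaries []summary) []*string {
	out := make([]*string, 0, len(summaries))
	for _, s := range summaries {
		// Buggy pre-Go 1.22 version: out = append(out, &s.Text)
		// Every entry would point at the single loop variable, i.e. the last element.
		t := s.Text // fresh copy per iteration, the same idea as schemas.Ptr(summary.Text)
		out = append(out, &t)
	}
	return out
}
```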
100-109: LGTM! Proper parameter sanitization and clamping.The code correctly:
- Clamps
MaxOutputTokensto a minimum of 16 tokens (lines 102-104)- Sanitizes the
Userfield to respect OpenAI's 64-character limit (line 106)- Filters out unsupported tool types (line 108)
core/providers/openai/responses_test.go (2)
10-193: LGTM! Comprehensive tests for reasoning-only message skip logic.These tests thoroughly cover the documented OpenAI responses format constraint. The test cases validate:
- Reasoning-only messages are skipped for non-gpt-oss models (when Summary is empty, ContentBlocks are non-empty, and EncryptedContent is nil)
- Messages with Summary, EncryptedContent, empty ContentBlocks, or nil Content are preserved
- gpt-oss models preserve reasoning-only messages
This directly addresses the past review request for unit tests to lock in this behavior.
195-346: LGTM! Thorough tests for Summary-to-ContentBlocks conversion.The tests properly validate that:
- gpt-oss models convert Summary to ContentBlocks when Content is nil
- Existing Content is preserved (not overwritten)
- Variant model names containing "gpt-oss" also trigger conversion
- Original message fields (ID, Type, Status, Role) are preserved during transformation
core/providers/cohere/chat.go (2)
104-129: LGTM! Reasoning budget calculation with proper error handling.The updated logic correctly:
- Prioritizes explicit
MaxTokenswhen provided (lines 106-110)- Falls back to effort-based budget calculation using
GetBudgetTokensFromReasoningEffort(lines 111-123)- Properly handles the error return from the budget calculation (lines 117-119)
- Uses
DefaultCompletionMaxTokensas fallback whenMaxCompletionTokensis nil (lines 112-115)- Disables thinking when neither MaxTokens nor valid Effort is provided (lines 124-128)
This addresses the past review concern about nil pointer dereference on
MaxCompletionTokens.
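The nil-safe fallback reads roughly like this (a fragment with illustrative names; the computed ceiling then feeds the effort-to-budget helper):

```go
maxTokens := DefaultCompletionMaxTokens // package constant introduced in this PR
if params.MaxCompletionTokens != nil {
	maxTokens = *params.MaxCompletionTokens // only dereferenced when actually set
}
// maxTokens then feeds the effort-to-budget calculation.
```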
439-463: LGTM! Clean handling of thinking text in streaming response.The code correctly:
- Extracts the thinking text once (line 440) to avoid repeated dereferencing
- Sets both
ReasoningandReasoningDetails[0].Textto point to the thinking content (lines 448, 453)- Uses
schemas.Ptr(thinkingText)to create proper pointerscore/providers/openai/responses_marshal_test.go (1)
12-480: Good coverage for custom marshaling/round-trip edge cases.
These tests should help stabilize the “field shadowing + omit reasoning.max_tokens” behavior and input union encoding.core/schemas/responses.go (1)
730-748:ResponsesReasoning.Summaryis consistently initialized to empty slice across all providers; the null-marshaling concern doesn't apply in practice.The
Summaryfield lacksomitemptyand will marshal as an empty array[]rather thannull. However, this is the intended behavior—every provider (OpenAI, Anthropic, Bedrock, Cohere, Gemini) initializesSummaryto[]schemas.ResponsesReasoningSummary{}when creatingResponsesReasoning, ensuring it never remains nil. If you want to make this behavior more explicit or prevent future regressions where Summary might be left uninitialized, consider addingomitemptytag; otherwise, the current pattern is sound.core/providers/anthropic/types.go (3)
161-197: Sonic-based AnthropicContent (un)marshal looks good; just ensure callers don't set both fields (incl. empty slices). The "both ContentStr and ContentBlocks set" guard is correct, but note that ContentBlocks != nil will be true for empty-but-non-nil slices, so any code that default-initializes ContentBlocks: []AnthropicContentBlock{} and later sets ContentStr will now error.
ContentBlocks != nilwill be true for empty-but-non-nil slices—so any code that default-initializesContentBlocks: []AnthropicContentBlock{}and later setsContentStrwill now error.If that pattern exists, you may want to normalize empty slices to nil at construction sites (not necessarily here).
201-228:redacted_thinking+Datafield addition is consistent with Anthropic’s newer redaction shapes.No concerns with the enum expansion and
Data *string.
365-424: Streaming delta tags:StopSequence *string \json:"stop_sequence"`preservesnull` semantics.This matches the requirement to faithfully roundtrip
nullvs string forstop_sequence.core/providers/bedrock/bedrock_test.go (2)
485-935: Test expectations updated cleanly forResponsesMessage.Typeand new tool-output representation.The added
Typefields and migration toResponsesToolCallOutputStralign the tests with the updated schemas and should reduce brittleness.
1683-1717: Nice: nil-guards before pointer deref in assertions.This avoids test panics and gives clearer failures.
core/providers/cohere/responses.go (1)
14-151: State pooling/reset additions are solid (maps are cleared, non-map fields reset).This is the right direction for avoiding cross-request contamination in streaming conversions.
3a4d9a5 to bd1d8c9 (Compare)
1a511ff to 6951a77 (Compare)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
transports/bifrost-http/integrations/router.go (1)
820-907: Bug risk: don’t emitevent:line before determining output mode (Bedrock eventstream / preformatted SSE string / JSON SSE).
Right noweventType(Line 820+) is written before the Bedrock branch and before the “preformatted SSE string” branch—this can corrupt Bedrock streams and can double-emitevent:for converters that already return fully formatted SSE strings.- if eventType != "" { - // OPENAI RESPONSES FORMAT: Use event: and data: lines for OpenAI responses API compatibility - if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil { - cancel() // Client disconnected (write error), cancel upstream stream - return - } - } - - // Handle Bedrock Event Stream format - if config.Type == RouteConfigTypeBedrock && eventStreamEncoder != nil { + // Handle Bedrock Event Stream format first (NOT SSE) + if config.Type == RouteConfigTypeBedrock && eventStreamEncoder != nil { // We need to cast to BedrockStreamEvent to determine event type and structure if bedrockEvent, ok := convertedResponse.(*bedrock.BedrockStreamEvent); ok { // Convert to sequence of specific Bedrock events events := bedrockEvent.ToEncodedEvents() @@ // Continue to next chunk (we handled flushing internally) continue - } else if sseString, ok := convertedResponse.(string); ok { + } + + // CUSTOM SSE FORMAT: converter returned a complete SSE string (should own event/data lines) + if sseString, ok := convertedResponse.(string); ok { // CUSTOM SSE FORMAT: The converter returned a complete SSE string // This is used by providers like Anthropic that need custom event types // Example: "event: content_block_delta\ndata: {...}\n\n" if !strings.HasPrefix(sseString, "data: ") && !strings.HasPrefix(sseString, "event: ") { sseString = fmt.Sprintf("data: %s\n\n", sseString) } if _, err := fmt.Fprint(w, sseString); err != nil { cancel() // Client disconnected (write error), cancel upstream stream return } } else { + // STANDARD SSE FORMAT: emit eventType only for object->JSON SSE + if eventType != "" { + if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil { + cancel() + return + } + } // STANDARD SSE FORMAT: The converter returned an object // This will be JSON marshaled and wrapped as "data: {json}\n\n" // Used by most providers (OpenAI chat/completions, Google, etc.) responseJSON, err := sonic.Marshal(convertedResponse) if err != nil { // Log JSON marshaling error but continue processing log.Printf("Failed to marshal streaming response: %v", err) continue }transports/bifrost-http/handlers/inference.go (2)
91-118: Add"user"toresponsesParamsKnownFieldsto prevent it from being duplicated into ExtraParams.Right now
"user"can be treated as “unknown”, get copied intoResponsesParameters.ExtraParams, and potentially conflict withResponsesParameters.Userhandling/sanitization downstream.var responsesParamsKnownFields = map[string]bool{ @@ "truncation": true, + "user": true, }Also applies to: 612-617
580-660: The responses handler should populateRawRequestBodywhen raw passthrough is enabled. Currently,ctx.PostBody()is parsed for JSON but never preserved viaSetRawRequestBody(). While integration routers handle this at lines 355-356 inrouter.go, the direct handler path ininference.golacks this mechanism entirely. Add logic to detect raw passthrough conditions (similar to integration handlers) and callbifrostResponsesReq.SetRawRequestBody(ctx.PostBody())when appropriate to align with the router-based implementation.core/providers/openai/responses.go (1)
114-142: RemoveResponsesToolTypeWebSearchPreviewfrom the allowlist; it is a legacy preview variant.The code currently allows both
web_searchandweb_search_preview, but OpenAI's Responses API has movedweb_searchto general availability with improved features (domain filtering, external web access control). The preview variant (web_search_preview) lacks these capabilities and should not be actively supported. Keep onlyweb_searchin the supported types map.The
customtool type andlocal_shell(mapped to shell) are both currently supported by OpenAI and correctly included. The rest of the allowlist aligns with documented OpenAI Responses API tool types.
♻️ Duplicate comments (12)
ui/app/workspace/logs/views/logDetailsSheet.tsx (1)
187-237: Previous review feedback not addressed: boolean/number rendering bugs remain.The issues flagged in the earlier review still exist:
Critical: Line 227 renders
reasoning.generate_summary(a boolean) directly in the Badge. In JSX, booleans are not rendered as text, so the Badge will appear visually empty even when the value istrue.Major: Line 232's truthiness check (
reasoning.max_tokens && ...) will hide the field whenmax_tokensis0, which is a valid value that should be displayed in a debugging/logging UI.Lines 199, 210, 221 use truthiness checks that could also hide falsy-but-valid values.
Apply the same fix as previously suggested:
- {reasoning.effort && ( + {typeof reasoning.effort === "string" && ( <LogEntryDetailsView className="w-full" label="Effort" value={ <Badge variant="secondary" className="uppercase"> {reasoning.effort} </Badge> } /> )} - {reasoning.summary && ( + {typeof reasoning.summary === "string" && ( <LogEntryDetailsView className="w-full" label="Summary" value={ <Badge variant="secondary" className="uppercase"> {reasoning.summary} </Badge> } /> )} - {reasoning.generate_summary && ( + {typeof reasoning.generate_summary === "boolean" && ( <LogEntryDetailsView className="w-full" label="Generate Summary" value={ <Badge variant="secondary" className="uppercase"> - {reasoning.generate_summary} + {String(reasoning.generate_summary)} </Badge> } /> )} - {reasoning.max_tokens && <LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} />} + {typeof reasoning.max_tokens === "number" && ( + <LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} /> + )}core/providers/azure/utils.go (2)
45-46: Use aschemas.Err...constant instead of string literal.The string-literal error code
"request body is not provided"won't be handled consistently with other error codes. Consider using an existing constant likeschemas.ErrRequestBodyConversionwith a more specific wrapped error, or add a new constant if needed.if reqBody == nil { - return nil, providerUtils.NewBifrostOperationError("request body is not provided", nil, providerName) + return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("request body is nil after conversion"), providerName) }
31-34: Normalizestreamfield whenisStreaming=falsein raw-body path.When
useRawRequestBody=trueand the raw JSON contains"stream": true, this code won't reset it whenisStreaming=false. This creates a protocol mismatch: the raw body forces streaming even when the caller expects non-streaming responses.// Add stream if not present - if isStreaming { - requestBody["stream"] = true - } + if isStreaming { + requestBody["stream"] = true + } else { + delete(requestBody, "stream") + }core/providers/utils/utils.go (2)
256-275: Avoidsonic.MarshalIndentfor provider request bodies (payload bloat + regression).
CheckContextAndGetRequestBodyreturns the JSON that gets sent to providers; indenting increases bytes/CPU for every call.- jsonBody, err := sonic.MarshalIndent(convertedBody, "", " ") + jsonBody, err := sonic.Marshal(convertedBody)
1026-1038:GetRandomString: revert to the already-reviewed, lower-allocation implementation.Current code rebuilds a
[]runealphabet and allocates[]runeoutput each call.func GetRandomString(length int) string { if length <= 0 { return "" } randomSource := rand.New(rand.NewSource(time.Now().UnixNano())) - letters := []rune("abcdefghijklmnopqrstuvwxyz0123456789") - b := make([]rune, length) + const letters = "abcdefghijklmnopqrstuvwxyz0123456789" + b := make([]byte, length) for i := range b { - b[i] = letters[randomSource.Intn(len(letters))] + b[i] = letters[randomSource.Intn(len(letters))] } return string(b) }core/providers/vertex/utils.go (1)
13-43: Raw-body path must normalize "stream" when isStreaming=false (avoid accidental streaming).
If caller provides raw JSON with "stream": true but isStreaming is false, the request still streams.

- // Add stream if not present
- if isStreaming {
-     requestBody["stream"] = true
- }
+ if isStreaming {
+     requestBody["stream"] = true
+ } else {
+     delete(requestBody, "stream")
+ }

core/schemas/responses.go (1)
69-70: Fix "Not is OpenAI's spec" typo (appears twice).

- StopReason *string `json:"stop_reason,omitempty"` // Not is OpenAI's spec, but sent by other providers
+ StopReason *string `json:"stop_reason,omitempty"` // Not in OpenAI's spec, but sent by other providers
@@
- Signature *string `json:"signature,omitempty"` // Not is OpenAI's spec, but sent by other providers
+ Signature *string `json:"signature,omitempty"` // Not in OpenAI's spec, but sent by other providers

Also applies to: 399-411, 731-748, 1440-1442
core/providers/gemini/responses.go (1)
138-179: Fix unsafe tool-arguments pointer + ThoughtSignature encoding/decoding (still present in this PR branch).
This reintroduces the same failure modes:Arguments: &argumentsStreven when empty, andstring(part.ThoughtSignature)(non‑UTF8 risk). Also, the lookahead path should decode from base64.@@ - argumentsStr := "" + var argumentsStr *string if part.FunctionCall.Args != nil { if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil { - argumentsStr = string(argsBytes) + s := string(argsBytes) + // Avoid empty/whitespace/"null" which breaks replay Unmarshal. + if ts := strings.TrimSpace(s); ts != "" && ts != "null" { + argumentsStr = &s + } } } @@ toolMsg := &schemas.ResponsesToolMessage{ CallID: &functionCallID, Name: &functionCallName, - Arguments: &argumentsStr, + Arguments: argumentsStr, } @@ if len(part.ThoughtSignature) > 0 { - thoughtSig := string(part.ThoughtSignature) + thoughtSig := base64.RawStdEncoding.EncodeToString(part.ThoughtSignature) reasoningMsg := schemas.ResponsesMessage{ @@ ResponsesReasoning: &schemas.ResponsesReasoning{ Summary: []schemas.ResponsesReasoningSummary{}, EncryptedContent: &thoughtSig, }, } messages = append(messages, reasoningMsg) } @@ if i+1 < len(messages) { nextMsg := messages[i+1] if nextMsg.Type != nil && *nextMsg.Type == schemas.ResponsesMessageTypeReasoning && nextMsg.ResponsesReasoning != nil && nextMsg.ResponsesReasoning.EncryptedContent != nil { - part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent) + sigStr := *nextMsg.ResponsesReasoning.EncryptedContent + if sigBytes, err := base64.RawStdEncoding.DecodeString(sigStr); err == nil { + part.ThoughtSignature = sigBytes + } else { + // Backward-compat fallback if older logs stored raw string bytes. + part.ThoughtSignature = []byte(sigStr) + } } }If this is “handled in a different PR in the stack”, please ensure this PR is rebased onto that fix before merging the stack (otherwise the bug remains here).
Also applies to: 609-626
core/internal/testutil/chat_completion_stream.go (1)
535-538: Cerebras prompt requires tools, but the request config provides none.
This is likely to fail/flap; either remove "use your tools" wording or configure tools for Cerebras.

@@
- if testConfig.Provider == schemas.Cerebras {
-     problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this"
- }
+ if testConfig.Provider == schemas.Cerebras {
+     // Keep it reasoning-focused (no tools configured in this test).
+     problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?"
+ }

Also applies to: 544-556
core/providers/anthropic/utils.go (1)
30-53: Raw mode may override user's explicit stream: false.
When isStreaming is true, line 48 unconditionally sets requestBody["stream"] = true, potentially overriding a user-provided stream: false. This differs from the max_tokens handling which checks for existence first (line 43). Apply this diff to respect user-provided stream values:

  // Add stream if not present
  if isStreaming {
-     requestBody["stream"] = true
+     if _, exists := requestBody["stream"]; !exists {
+         requestBody["stream"] = true
+     }
  }

core/providers/bedrock/utils.go (1)
119-124:additionalModelRequestFieldPathscan overwritereasoning_configset earlier.This overwrites
AdditionalModelRequestFieldsentirely, potentially discarding thereasoning_configset in lines 43-75. Consider merging instead:if requestFields, exists := bifrostReq.Params.ExtraParams["additionalModelRequestFieldPaths"]; exists { if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok { - bedrockReq.AdditionalModelRequestFields = orderedFields + if bedrockReq.AdditionalModelRequestFields == nil { + bedrockReq.AdditionalModelRequestFields = orderedFields + } else { + for key, value := range orderedFields { + bedrockReq.AdditionalModelRequestFields[key] = value + } + } } }transports/bifrost-http/integrations/anthropic.go (1)
208-208: Make claude-cli detection case-insensitive for header values.
The user-agent header name comparison uses EqualFold (line 201) for case-insensitivity, but the value comparison at lines 208 and 233 uses Contains, which is case-sensitive. A user-agent like "Claude-CLI/1.0" or "CLAUDE-CLI" won't be detected. Apply this pattern for consistency:

- if strings.Contains(userAgent[0], "claude-cli") {
+ if strings.Contains(strings.ToLower(userAgent[0]), "claude-cli") {
      *bifrostCtx = context.WithValue(*bifrostCtx, schemas.BifrostContextKeyUserAgent, "claude-cli")
  }

And similarly at line 233:

- if strings.Contains(userAgent, "claude-cli") {
+ if strings.Contains(strings.ToLower(userAgent), "claude-cli") {
      isClaudeCode = true
  }

🤖 Prompt for AI Agents
In transports/bifrost-http/integrations/anthropic.go at lines 208 and 233, the code checks the User-Agent header value with strings.Contains which is case-sensitive; update both locations to use strings.Contains(strings.ToLower(userAgent), "claude-cli") or strings.Contains(strings.ToLower(userAgent[0]), "claude-cli") respectively, so the check matches "claude-cli", "Claude-CLI", "CLAUDE-CLI", etc.Also applies to: 233-233
🧹 Nitpick comments (11)
transports/bifrost-http/integrations/router.go (2)
709-712: Prefer capability-driven DONE-marker gating over strings.Contains(config.Path, "/responses") (path check is brittle).
This matches the intended behavior (no data: [DONE] for Responses/Anthropic), but tying semantics to a path substring can silently break if routes are renamed/aliased or other providers reuse /responses. Consider driving this off route type + request kind (e.g., ResponsesRequest) or a RouteConfig flag. Based on learnings, this is the right direction (typed SSE events; no legacy DONE).
Also applies to: 916-927
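For illustration only, a minimal sketch of deciding the marker from the request kind rather than the path. RouteConfigType mirrors the router's concept; the isResponsesRequest flag and its plumbing are assumptions, not the existing API:

```go
package main

import "fmt"

type RouteConfigType string

const (
	RouteConfigTypeOpenAI    RouteConfigType = "openai"
	RouteConfigTypeAnthropic RouteConfigType = "anthropic"
)

// shouldSendDoneMarker returns true only for legacy Chat/Text Completions
// streams; Responses and Anthropic streams terminate with typed events
// (response.completed, message_stop), never with "data: [DONE]".
func shouldSendDoneMarker(routeType RouteConfigType, isResponsesRequest bool) bool {
	return routeType != RouteConfigTypeAnthropic && !isResponsesRequest
}

func main() {
	fmt.Println(shouldSendDoneMarker(RouteConfigTypeOpenAI, false))    // true
	fmt.Println(shouldSendDoneMarker(RouteConfigTypeOpenAI, true))     // false
	fmt.Println(shouldSendDoneMarker(RouteConfigTypeAnthropic, false)) // false
}
```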
883-885: Preformatted SSE detection is too narrow (it might wrap valid SSE starting with id:, retry:, or comment lines).
If any converter emits those fields (or multi-line SSE where the first line isn't data:/event:), this will incorrectly wrap and break the stream. Consider treating "already contains \n" as preformatted, or accept the standard SSE field prefixes.
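A rough sketch of the broader check, assuming the router only needs a yes/no answer for whether the converter output is already a complete SSE frame (the helper name is made up):

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikePreformattedSSE accepts every standard SSE field name plus the
// comment prefix ":", instead of only "data:"/"event:".
func looksLikePreformattedSSE(chunk string) bool {
	for _, p := range []string{"data:", "event:", "id:", "retry:", ":"} {
		if strings.HasPrefix(chunk, p) {
			return true
		}
	}
	// Multi-line output that already carries its own field separators is
	// almost certainly a pre-built frame as well.
	return strings.Contains(chunk, "\n")
}

func main() {
	fmt.Println(looksLikePreformattedSSE("id: 1\ndata: {}\n\n")) // true
	fmt.Println(looksLikePreformattedSSE(`{"delta":"hi"}`))      // false
}
```

core/providers/openrouter/openrouter_test.go (1)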
25-49: Model slug is valid, but consider env override for test resilience.
openai/gpt-oss-120b is a real OpenRouter model with reasoning support (released Aug 2025). The slug format is correct. However, since this is an external dependency, consider making ReasoningModel overridable via environment variable (defaulting to this value) so CI/dev can quickly swap models without code changes if entitlements or availability changes.
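One way to keep the current default while allowing overrides, sketched with a hypothetical OPENROUTER_REASONING_MODEL variable (not an existing env var in this repo):

```go
package main

import (
	"fmt"
	"os"
)

// reasoningModel falls back to the hard-coded slug unless an override is set.
func reasoningModel() string {
	if m := os.Getenv("OPENROUTER_REASONING_MODEL"); m != "" {
		return m
	}
	return "openai/gpt-oss-120b" // current default in the test config
}

func main() {
	fmt.Println(reasoningModel())
}
```

core/providers/utils/utils.go (2)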
318-346: Keep the "copy body" invariant when decoding compressed provider errors.
You start with a defensive copy (body := append([]byte(nil), resp.Body()...)) but then overwrite it with decodedBody (which can be resp.Body() for non-gzip). Consider copying decodedBody back into body before unmarshalling to avoid any future reuse/lifetime surprises if callers release resp.

- body = decodedBody
+ body = append([]byte(nil), decodedBody...)

Also applies to: 489-498
1040-1121: Reasoning-effort ↔ budget mapping: pin edge-case semantics with tests (esp. maxTokens <= minBudgetTokens).
The "zero/negative range" paths return "high" (effort) or minBudgetTokens (budget). That may be correct, but it's easy to regress, so it's worth locking with unit tests for the following cases (a minimal table-driven sketch follows below):
- budgetTokens <= 0
- maxTokens <= 0
- maxTokens == minBudgetTokens
- unknown effort strings
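A table-driven skeleton for those cases; the signature used here (effort string, maxTokens, minBudget int) is only an assumption for illustration and should be replaced with a direct call to the real helpers in core/providers/utils:

```go
package utils_test

import "testing"

func TestReasoningBudgetEdgeCases(t *testing.T) {
	// budgetFor stands in for providerUtils.GetBudgetTokensFromReasoningEffort;
	// swap in the real call (and its actual signature) when adopting this.
	budgetFor := func(effort string, maxTokens, minBudget int) int {
		return minBudget
	}

	const minBudget = 1024
	cases := []struct {
		name      string
		effort    string
		maxTokens int
	}{
		{"zero max tokens", "low", 0},
		{"negative max tokens", "medium", -1},
		{"max equals minimum budget", "medium", minBudget},
		{"unknown effort string", "extreme", 4096},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := budgetFor(tc.effort, tc.maxTokens, minBudget); got < minBudget {
				t.Fatalf("budget %d fell below the provider minimum %d", got, minBudget)
			}
		})
	}
}
```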
core/internal/testutil/reasoning.go (1)
201-223: ChatCompletionReasoningTest: skip logic is fine; confirm provider capability split is intentional.
This runs reasoning via Chat Completions (and skips OpenAI). Just confirmtestConfig.Scenarios.Reasoningmeans “reasoning supported via both APIs”, or adjust to separate flags (Responses vs Chat) if you start seeing provider-specific failures.Also applies to: 232-245
core/internal/testutil/chat_completion_stream.go (1)
403-420: Use a timeout ctx for the request, not just for reading the channel.
Right nowstreamCtxonly affects the reader; the provider stream can keep running after timeout.@@ - responseChannel, err := WithStreamRetry(t, retryConfig, retryContext, func() (chan *schemas.BifrostStream, *schemas.BifrostError) { - return client.ChatCompletionStreamRequest(ctx, request) - }) + reqCtx, reqCancel := context.WithTimeout(ctx, 200*time.Second) + defer reqCancel() + responseChannel, err := WithStreamRetry(t, retryConfig, retryContext, func() (chan *schemas.BifrostStream, *schemas.BifrostError) { + return client.ChatCompletionStreamRequest(reqCtx, request) + }) @@ - streamCtx, cancel := context.WithTimeout(ctx, 200*time.Second) + streamCtx, cancel := context.WithTimeout(reqCtx, 200*time.Second) defer cancel()(Apply similarly in the “Validated” subtest.)
Also applies to: 591-599
core/providers/vertex/vertex.go (1)
777-785: DuplicateModelRequestedassignment.
response.ExtraFields.ModelRequestedis set twice - once within the struct initialization (line 780) and again at line 784. The second assignment is redundant.Apply this diff to remove the redundant assignment:
response.ExtraFields = schemas.BifrostResponseExtraFields{ RequestType: schemas.ResponsesRequest, Provider: providerName, ModelRequested: request.Model, Latency: latency.Milliseconds(), } - response.ExtraFields.ModelRequested = request.Model - // Set raw request if enabledcore/providers/bedrock/types.go (1)
100-140: Custom UnmarshalJSON implementation is correct but silently ignores malformed unknown fields.
The implementation correctly:
- Uses the alias pattern to avoid infinite recursion
- Captures all unregistered fields into ExtraParams
- Includes all struct fields in bedrockConverseRequestKnownFields
However, line 133 silently continues when an unknown field fails to unmarshal. Consider logging this for debugging:
for key, value := range rawData { if !bedrockConverseRequestKnownFields[key] { var v interface{} if err := sonic.Unmarshal(value, &v); err != nil { - continue // Skip fields that can't be unmarshaled + // Log for debugging but continue processing other fields + continue } r.ExtraParams[key] = v } }transports/bifrost-http/integrations/anthropic.go (2)
7-7: Use structured logger instead of stdlib log.Printf for streaming errors.
The log package is imported (line 7) and used at line 115 for logging JSON marshaling errors. Based on learnings, you should use the existing structured logger instead of stdlib logging.
Consider passing the router/handler logger through the converter context or using a logger from the request context, similar to how other handlers in this codebase log errors.
🤖 Prompt for AI Agents
In transports/bifrost-http/integrations/anthropic.go at line 115, replace the stdlib log.Printf call with a structured logger; either pass the logger through the converter function signature (e.g., add a logger parameter to ResponsesStreamResponseConverter) or extract it from the context (*ctx), then use logger.Error or logger.Warn to log the JSON marshaling error with structured fields (e.g., logger.Error("failed to marshal streaming response", "error", err)) instead of log.Printf.Also applies to: 115-115
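A generic sketch of the context-based option; the Logger interface and context key below are illustrative stand-ins, not the project's actual types:

```go
package main

import (
	"context"
	"fmt"
)

// Logger is a minimal structured-logging interface for this sketch.
type Logger interface {
	Error(msg string, args ...any)
}

type stdoutLogger struct{}

func (stdoutLogger) Error(msg string, args ...any) { fmt.Println("ERROR:", msg, args) }

type ctxKey string

const loggerKey ctxKey = "logger"

// loggerFrom never returns nil, so streaming handlers can log unconditionally.
func loggerFrom(ctx context.Context) Logger {
	if l, ok := ctx.Value(loggerKey).(Logger); ok {
		return l
	}
	return stdoutLogger{}
}

func main() {
	ctx := context.WithValue(context.Background(), loggerKey, Logger(stdoutLogger{}))
	loggerFrom(ctx).Error("failed to marshal streaming response", "error", "example")
}
```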
110-120: Consider using strings.Builder for SSE event concatenation.
When concatenating multiple streaming events (lines 110-120), the code uses combinedContent += fmt.Sprintf(...) which creates a new string on each iteration. For many events, this can be inefficient due to repeated allocations. Using strings.Builder would be more efficient:

- combinedContent := ""
+ var combinedContent strings.Builder
  for _, event := range anthropicResponse {
      responseJSON, err := sonic.Marshal(event)
      if err != nil {
          log.Printf("Failed to marshal streaming response: %v", err)
          continue
      }
-     combinedContent += fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON)
+     combinedContent.WriteString(fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON))
  }
- return "", combinedContent, nil
+ return "", combinedContent.String(), nil

🤖 Prompt for AI Agents
In transports/bifrost-http/integrations/anthropic.go around lines 110 to 120, replace the string concatenation (combinedContent += ...) with a strings.Builder for better performance when handling multiple events; declare var combinedContent strings.Builder, use combinedContent.WriteString instead of +=, and return combinedContent.String() at the end.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
ui/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (58)
core/internal/testutil/account.go(1 hunks)core/internal/testutil/chat_completion_stream.go(1 hunks)core/internal/testutil/reasoning.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/tests.go(2 hunks)core/providers/anthropic/anthropic.go(3 hunks)core/providers/anthropic/chat.go(5 hunks)core/providers/anthropic/errors.go(2 hunks)core/providers/anthropic/types.go(8 hunks)core/providers/anthropic/utils.go(2 hunks)core/providers/azure/azure.go(2 hunks)core/providers/azure/utils.go(1 hunks)core/providers/bedrock/bedrock.go(2 hunks)core/providers/bedrock/bedrock_test.go(13 hunks)core/providers/bedrock/types.go(2 hunks)core/providers/bedrock/utils.go(2 hunks)core/providers/cerebras/cerebras_test.go(2 hunks)core/providers/cohere/chat.go(3 hunks)core/providers/cohere/cohere.go(2 hunks)core/providers/cohere/cohere_test.go(1 hunks)core/providers/cohere/responses.go(7 hunks)core/providers/cohere/types.go(1 hunks)core/providers/gemini/responses.go(2 hunks)core/providers/groq/groq_test.go(2 hunks)core/providers/mistral/mistral_test.go(1 hunks)core/providers/openai/chat.go(1 hunks)core/providers/openai/responses.go(2 hunks)core/providers/openai/responses_marshal_test.go(1 hunks)core/providers/openai/responses_test.go(1 hunks)core/providers/openai/text.go(1 hunks)core/providers/openai/types.go(3 hunks)core/providers/openai/types_test.go(1 hunks)core/providers/openai/utils.go(1 hunks)core/providers/openrouter/openrouter_test.go(1 hunks)core/providers/utils/utils.go(4 hunks)core/providers/vertex/errors.go(1 hunks)core/providers/vertex/types.go(0 hunks)core/providers/vertex/utils.go(1 hunks)core/providers/vertex/vertex.go(3 hunks)core/providers/vertex/vertex_test.go(1 hunks)core/schemas/bifrost.go(1 hunks)core/schemas/mux.go(0 hunks)core/schemas/responses.go(5 hunks)core/schemas/utils.go(1 hunks)docs/docs.json(0 hunks)framework/configstore/rdb.go(0 hunks)framework/streaming/chat.go(0 hunks)framework/streaming/responses.go(2 hunks)transports/bifrost-http/handlers/inference.go(1 hunks)transports/bifrost-http/handlers/middlewares.go(1 hunks)transports/bifrost-http/integrations/anthropic.go(5 hunks)transports/bifrost-http/integrations/router.go(3 hunks)ui/app/workspace/logs/views/columns.tsx(1 hunks)ui/app/workspace/logs/views/logDetailsSheet.tsx(1 hunks)ui/app/workspace/logs/views/logResponsesMessageView.tsx(2 hunks)ui/app/workspace/logs/views/logResponsesOutputView.tsx(0 hunks)ui/lib/types/logs.ts(2 hunks)ui/package.json(1 hunks)
💤 Files with no reviewable changes (6)
- core/providers/vertex/types.go
- framework/streaming/chat.go
- ui/app/workspace/logs/views/logResponsesOutputView.tsx
- docs/docs.json
- framework/configstore/rdb.go
- core/schemas/mux.go
🚧 Files skipped from review as they are similar to previous changes (18)
- ui/lib/types/logs.ts
- core/providers/openai/utils.go
- core/providers/anthropic/errors.go
- core/schemas/utils.go
- core/schemas/bifrost.go
- ui/app/workspace/logs/views/columns.tsx
- core/internal/testutil/account.go
- ui/app/workspace/logs/views/logResponsesMessageView.tsx
- core/providers/anthropic/anthropic.go
- core/providers/vertex/vertex_test.go
- ui/package.json
- core/internal/testutil/tests.go
- transports/bifrost-http/handlers/middlewares.go
- core/providers/openai/chat.go
- core/providers/cerebras/cerebras_test.go
- core/internal/testutil/responses_stream.go
- core/providers/azure/azure.go
- framework/streaming/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openrouter/openrouter_test.gocore/providers/anthropic/chat.gocore/providers/vertex/errors.gocore/providers/anthropic/utils.gocore/providers/bedrock/types.gocore/providers/groq/groq_test.gocore/providers/openai/types.gocore/providers/openai/responses_test.gocore/providers/bedrock/utils.gocore/providers/vertex/vertex.gocore/providers/openai/text.gotransports/bifrost-http/integrations/anthropic.gotransports/bifrost-http/handlers/inference.goui/app/workspace/logs/views/logDetailsSheet.tsxcore/providers/cohere/chat.gocore/schemas/responses.gocore/providers/azure/utils.gocore/internal/testutil/chat_completion_stream.gocore/providers/bedrock/bedrock_test.gocore/providers/vertex/utils.gocore/providers/gemini/responses.gocore/providers/openai/types_test.gocore/providers/cohere/cohere_test.gocore/providers/cohere/cohere.gocore/internal/testutil/reasoning.gocore/providers/bedrock/bedrock.gocore/providers/openai/responses.gocore/providers/anthropic/types.gocore/providers/cohere/responses.gocore/providers/utils/utils.gocore/providers/openai/responses_marshal_test.gocore/providers/cohere/types.gocore/providers/mistral/mistral_test.gotransports/bifrost-http/integrations/router.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/openrouter/openrouter_test.gocore/providers/anthropic/chat.gocore/providers/vertex/errors.gocore/providers/anthropic/utils.gocore/providers/bedrock/types.gocore/providers/groq/groq_test.gocore/providers/openai/types.gocore/providers/openai/responses_test.gocore/providers/bedrock/utils.gocore/providers/vertex/vertex.gocore/providers/openai/text.gotransports/bifrost-http/integrations/anthropic.gotransports/bifrost-http/handlers/inference.gocore/providers/cohere/chat.gocore/schemas/responses.gocore/providers/azure/utils.gocore/internal/testutil/chat_completion_stream.gocore/providers/bedrock/bedrock_test.gocore/providers/vertex/utils.gocore/providers/gemini/responses.gocore/providers/openai/types_test.gocore/providers/cohere/cohere_test.gocore/providers/cohere/cohere.gocore/internal/testutil/reasoning.gocore/providers/bedrock/bedrock.gocore/providers/openai/responses.gocore/providers/anthropic/types.gocore/providers/cohere/responses.gocore/providers/utils/utils.gocore/providers/openai/responses_marshal_test.gocore/providers/cohere/types.gocore/providers/mistral/mistral_test.gotransports/bifrost-http/integrations/router.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.gocore/providers/openai/responses_test.gocore/providers/openai/text.gocore/providers/openai/types_test.gocore/providers/openai/responses.gocore/providers/openai/responses_marshal_test.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.
Applied to files:
transports/bifrost-http/integrations/anthropic.gotransports/bifrost-http/handlers/inference.gotransports/bifrost-http/integrations/router.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
🧬 Code graph analysis (22)
core/providers/anthropic/chat.go (6)
core/providers/anthropic/types.go (2)
MinimumReasoningMaxTokens(15-15)AnthropicThinking(69-72)core/providers/bedrock/types.go (1)
MinimumReasoningMaxTokens(12-12)core/providers/cohere/types.go (1)
MinimumReasoningMaxTokens(11-11)core/providers/utils/utils.go (1)
GetBudgetTokensFromReasoningEffort(1084-1121)core/schemas/utils.go (1)
Ptr(16-18)core/utils.go (1)
Ptr(51-53)
core/providers/anthropic/utils.go (6)
core/schemas/responses.go (1)
BifrostResponsesRequest(32-39)core/providers/utils/utils.go (1)
NewBifrostOperationError(516-527)core/schemas/provider.go (2)
ErrRequestBodyConversion(25-25)ErrProviderRequestMarshal(26-26)core/providers/anthropic/types.go (1)
AnthropicDefaultMaxTokens(14-14)core/providers/anthropic/responses.go (1)
ToAnthropicResponsesRequest(1419-1532)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/bedrock/types.go (4)
core/providers/anthropic/types.go (1)
Alias(104-104)core/providers/openai/types.go (2)
Alias(83-83)Alias(194-194)core/schemas/bifrost.go (2)
Alias(388-388)Alias(405-405)core/schemas/chatcompletions.go (4)
Alias(189-189)Alias(643-643)Alias(770-770)Alias(860-860)
core/providers/openai/types.go (4)
core/schemas/chatcompletions.go (5)
ChatParameters(155-184)Alias(189-189)Alias(643-643)Alias(770-770)Alias(860-860)core/providers/anthropic/types.go (1)
Alias(104-104)core/schemas/responses.go (1)
ResponsesParametersReasoning(234-239)ui/lib/types/logs.ts (1)
ResponsesParametersReasoning(513-520)
core/providers/openai/responses_test.go (3)
core/schemas/responses.go (7)
ResponsesMessage(314-327)ResponsesReasoning(731-734)ResponsesReasoningSummary(745-748)ResponsesMessageContent(339-344)ResponsesMessageContentBlock(399-411)ResponsesOutputMessageContentTypeReasoning(394-394)BifrostResponsesRequest(32-39)core/schemas/utils.go (1)
Ptr(16-18)core/providers/openai/responses.go (1)
ToOpenAIResponsesRequest(37-112)
core/providers/bedrock/utils.go (4)
core/schemas/utils.go (2)
IsAnthropicModel(1043-1045)Ptr(16-18)core/providers/anthropic/types.go (1)
MinimumReasoningMaxTokens(15-15)core/providers/cohere/types.go (2)
MinimumReasoningMaxTokens(11-11)DefaultCompletionMaxTokens(12-12)core/providers/utils/utils.go (1)
GetBudgetTokensFromReasoningEffort(1084-1121)
core/providers/openai/text.go (2)
core/schemas/textcompletions.go (1)
TextCompletionParameters(120-140)core/providers/openai/utils.go (1)
SanitizeUserField(51-56)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (5)
BifrostContextKeyUserAgent(123-123)Anthropic(37-37)BifrostContextKeyExtraHeaders(115-115)BifrostContextKeyURLPath(116-116)BifrostContextKeySkipKeySelection(114-114)core/providers/anthropic/types.go (1)
AnthropicStreamEvent(395-404)core/schemas/utils.go (1)
IsAnthropicModel(1043-1045)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
ResponsesRequest(91-91)core/schemas/responses.go (1)
ResponsesParameters(87-114)
ui/app/workspace/logs/views/logDetailsSheet.tsx (3)
ui/components/ui/separator.tsx (1)
DottedSeparator(43-43)ui/app/workspace/logs/views/logEntryDetailsView.tsx (1)
LogEntryDetailsView(15-49)ui/components/ui/badge.tsx (1)
Badge(37-37)
core/providers/cohere/chat.go (5)
core/providers/cohere/types.go (5)
CohereThinking(174-177)ThinkingTypeEnabled(183-183)DefaultCompletionMaxTokens(12-12)MinimumReasoningMaxTokens(11-11)ThinkingTypeDisabled(184-184)core/providers/bedrock/types.go (2)
DefaultCompletionMaxTokens(13-13)MinimumReasoningMaxTokens(12-12)core/providers/utils/utils.go (1)
GetBudgetTokensFromReasoningEffort(1084-1121)core/providers/anthropic/types.go (1)
MinimumReasoningMaxTokens(15-15)core/schemas/utils.go (1)
Ptr(16-18)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
Type(778-778)ui/lib/types/logs.ts (2)
ResponsesMessageContentBlockType(352-359)ResponsesReasoningSummary(412-415)
core/providers/azure/utils.go (6)
core/schemas/responses.go (1)
BifrostResponsesRequest(32-39)core/providers/utils/utils.go (1)
NewBifrostOperationError(516-527)core/schemas/provider.go (2)
ErrRequestBodyConversion(25-25)ErrProviderRequestMarshal(26-26)core/providers/anthropic/types.go (1)
AnthropicDefaultMaxTokens(14-14)core/providers/anthropic/responses.go (1)
ToAnthropicResponsesRequest(1419-1532)core/schemas/utils.go (1)
Ptr(16-18)
core/internal/testutil/chat_completion_stream.go (5)
core/schemas/chatcompletions.go (6)
ChatMessage(469-478)BifrostChatRequest(12-19)ChatParameters(155-184)ChatReasoning(223-226)BifrostChatResponse(26-41)BifrostReasoningDetailsTypeText(719-719)core/internal/testutil/utils.go (1)
CreateBasicChatMessage(247-254)core/schemas/utils.go (1)
Ptr(16-18)core/internal/testutil/test_retry_framework.go (5)
StreamingRetryConfig(819-839)TestRetryContext(168-173)WithStreamRetry(580-688)WithChatStreamValidationRetry(2339-2485)ChatStreamValidationResult(2328-2335)core/schemas/bifrost.go (2)
BifrostStream(323-330)BifrostError(358-367)
core/providers/bedrock/bedrock_test.go (3)
core/schemas/utils.go (1)
Ptr(16-18)core/schemas/responses.go (4)
ResponsesMessageTypeMessage(290-290)ResponsesMessageTypeFunctionCallOutput(296-296)ResponsesToolMessage(462-482)ResponsesToolMessageOutputStruct(531-535)core/schemas/chatcompletions.go (1)
OrderedMap(268-268)
core/providers/openai/types_test.go (2)
core/providers/openai/types.go (1)
OpenAIChatRequest(46-59)core/schemas/chatcompletions.go (2)
ChatMessageRoleUser(462-462)ChatMessageRoleSystem(463-463)
core/providers/cohere/cohere.go (1)
core/providers/cohere/responses.go (1)
ToCohereResponsesRequest(916-1030)
core/internal/testutil/reasoning.go (8)
core/internal/testutil/account.go (1)
ComprehensiveTestConfig(47-64)core/schemas/bifrost.go (2)
OpenAI(35-35)BifrostError(358-367)ui/lib/types/logs.ts (2)
ChatMessage(108-117)BifrostError(226-232)core/schemas/chatcompletions.go (5)
ChatMessage(469-478)BifrostChatRequest(12-19)ChatParameters(155-184)ChatReasoning(223-226)BifrostChatResponse(26-41)core/internal/testutil/utils.go (2)
CreateBasicChatMessage(247-254)GetChatContent(329-356)core/schemas/models.go (1)
Model(109-129)core/internal/testutil/test_retry_framework.go (1)
WithChatTestRetry(274-424)core/internal/testutil/validation_presets.go (1)
GetExpectationsForScenario(208-293)
core/providers/bedrock/bedrock.go (1)
core/schemas/bifrost.go (1)
BifrostContextKeyStreamEndIndicator(113-113)
core/providers/utils/utils.go (2)
core/schemas/bifrost.go (2)
BifrostError(358-367)ErrorField(376-383)ui/lib/types/logs.ts (2)
BifrostError(226-232)ErrorField(217-224)
core/providers/openai/responses_marshal_test.go (3)
core/providers/openai/types.go (2)
OpenAIResponsesRequest(180-189)OpenAIResponsesRequestInput(147-150)core/schemas/utils.go (1)
Ptr(16-18)core/schemas/responses.go (3)
ResponsesParameters(87-114)ResponsesParametersReasoning(234-239)ResponsesMessage(314-327)
transports/bifrost-http/integrations/router.go (1)
core/schemas/bifrost.go (1)
BifrostContextKeyIntegrationType(120-120)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Graphite / mergeability_check
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
core/providers/vertex/vertex_test.go (1)
31-47: Clarify the reasoning model configuration.The test sets
ReasoningModel: "claude-4.5-haiku"but keepsReasoning: falsewith a comment stating "Not supported right now because we are not using native gemini converters." This is inconsistent:
- If reasoning isn't supported, why specify a Claude reasoning model?
- The comment mentions Gemini converters, but the model is Claude
Either:
- Remove or comment out the
ReasoningModelfield if reasoning is truly unsupported, or- Enable reasoning (
Reasoning: true) and update the comment to reflect Claude-based reasoning support via VertexThis will prevent confusion about the actual reasoning capabilities being tested.
core/providers/openai/responses.go (1)
114-142: Remove unsupported tool types that will cause OpenAI API failures.The
supportedTypesmap is over-inclusive and contradicts OpenAI's official May 2025 Responses API specification. OpenAI only supports:file_search,web_search,code_interpreter,image_generation, andmcp.The map currently marks these as supported (but OpenAI doesn't support them):
function— not a tool type in OpenAI's APIcomputer_use_preview— Anthropic-specific toollocal_shell— Anthropic-specific tool (bash)custom— not in OpenAI's APIweb_search_preview— not in official OpenAI specThese unsupported types will pass through the filter and be sent to OpenAI, causing API failures. Update the map to only allow the five tool types OpenAI actually supports.
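For reference, a sketch of the narrowed map under that reading of the spec (the supportedTypes name mirrors the code under review; verify the exact set against OpenAI's current documentation before adopting it):

```go
package main

import "fmt"

// supportedTypes keeps only the tool types the OpenAI Responses API accepts
// per the list above; everything else should be filtered out before the call.
var supportedTypes = map[string]bool{
	"file_search":      true,
	"web_search":       true,
	"code_interpreter": true,
	"image_generation": true,
	"mcp":              true,
}

func main() {
	for _, tool := range []string{"mcp", "local_shell", "computer_use_preview"} {
		fmt.Printf("%-22s allowed=%v\n", tool, supportedTypes[tool])
	}
}
```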
framework/streaming/responses.go (1)
14-157: Deep-copy is missing newly-addedSignature(and other scalar pointers likeFileID).
This defeats the “prevent shared mutation” goal once Signature starts being appended during accumulation.func deepCopyResponsesStreamResponse(original *schemas.BifrostResponsesStreamResponse) *schemas.BifrostResponsesStreamResponse { @@ if original.Delta != nil { copyDelta := *original.Delta copy.Delta = ©Delta } + + if original.Signature != nil { + copySig := *original.Signature + copy.Signature = ©Sig + } @@ } func deepCopyResponsesMessageContentBlock(original schemas.ResponsesMessageContentBlock) schemas.ResponsesMessageContentBlock { copy := schemas.ResponsesMessageContentBlock{ Type: original.Type, } + + if original.FileID != nil { + fid := *original.FileID + copy.FileID = &fid + } if original.Text != nil { copyText := *original.Text copy.Text = ©Text } + + if original.Signature != nil { + sig := *original.Signature + copy.Signature = &sig + } @@ return copy }Also applies to: 381-424
♻️ Duplicate comments (17)
ui/app/workspace/logs/views/logDetailsSheet.tsx (1)
187-237: Fix boolean/0-value rendering in “Reasoning Parameters” (currently hides valid data / renders empty Badge).
This block still uses truthiness checks, and rendersreasoning.generate_summarydirectly inside a<Badge>(booleans don’t render as text), plus hidesmax_tokens: 0.Suggested patch:
{(() => { const params = log.params as any; const reasoning = params?.reasoning; - if (!reasoning || typeof reasoning !== "object" || Object.keys(reasoning).length === 0) { + if ( + !reasoning || + typeof reasoning !== "object" || + Array.isArray(reasoning) || + Object.keys(reasoning).length === 0 + ) { return null; } return ( <> <DottedSeparator /> <div className="space-y-4"> <BlockHeader title="Reasoning Parameters" icon={<FileText className="h-5 w-5 text-gray-600" />} /> <div className="grid w-full grid-cols-3 items-center justify-between gap-4"> - {reasoning.effort && ( + {typeof reasoning.effort === "string" && ( <LogEntryDetailsView className="w-full" label="Effort" value={ <Badge variant="secondary" className="uppercase"> {reasoning.effort} </Badge> } /> )} - {reasoning.summary && ( + {typeof reasoning.summary === "string" && ( <LogEntryDetailsView className="w-full" label="Summary" value={ <Badge variant="secondary" className="uppercase"> {reasoning.summary} </Badge> } /> )} - {reasoning.generate_summary && ( + {typeof reasoning.generate_summary === "boolean" && ( <LogEntryDetailsView className="w-full" label="Generate Summary" value={ <Badge variant="secondary" className="uppercase"> - {reasoning.generate_summary} + {String(reasoning.generate_summary)} </Badge> } /> )} - {reasoning.max_tokens && <LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} />} + {typeof reasoning.max_tokens === "number" && ( + <LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} /> + )} </div> </div> </> ); })()}Follow-up (stack-aware): consider typing
LogEntry["params"]to includereasoninginstead ofas any, since the stack is standardizing reasoning/summaries across providers.core/providers/anthropic/errors.go (1)
3-10: Re-check package-level duplication + considersonic.Marshalfor consistency.Given prior duplicate-definition issues for
ToAnthropicResponsesStreamError, please re-verify it exists in exactly one file in packageanthropic.transports/bifrost-http/integrations/router.go (1)
709-712: Avoid path-substring checks for stream termination behavior; gate by request type instead.Using
strings.Contains(config.Path, "/responses")is fragile (aliases/versioned paths can bypass it). Since this is about Responses vs non-Responses streaming semantics (and per learnings, Responses should not use[DONE]), consider decidingshouldSendDoneMarkerearlier inhandleStreamingRequestbased onbifrostReq.ResponsesRequest != nil, and pass it intohandleStreaming(...). Based on learnings, this reduces the chance of accidentally emitting[DONE]for Responses routes.Proposed direction (sketch):
-func (g *GenericRouter) handleStreaming(ctx *fasthttp.RequestCtx, bifrostCtx *context.Context, config RouteConfig, streamChan chan *schemas.BifrostStream, cancel context.CancelFunc) { +func (g *GenericRouter) handleStreaming(ctx *fasthttp.RequestCtx, bifrostCtx *context.Context, config RouteConfig, streamChan chan *schemas.BifrostStream, cancel context.CancelFunc, shouldSendDoneMarker bool) { ... - shouldSendDoneMarker := true - if config.Type == RouteConfigTypeAnthropic || strings.Contains(config.Path, "/responses") { - shouldSendDoneMarker = false - }core/providers/anthropic/utils.go (1)
30-76: Raw-body branch should normalize"stream"based onisStreaming(and handle empty raw body).
Right now, non-streaming calls can accidentally keep a user-provided"stream": true, and streaming calls always overwrite user intent. At minimum, make behavior explicit and consistent.if useRawBody, ok := ctx.Value(schemas.BifrostContextKeyUseRawRequestBody).(bool); ok && useRawBody { jsonBody = request.GetRawRequestBody() + if len(jsonBody) == 0 { + return nil, providerUtils.NewBifrostOperationError("request body is not provided", nil, providerName) + } // Unmarshal and check if model and region are present var requestBody map[string]interface{} if err := sonic.Unmarshal(jsonBody, &requestBody); err != nil { return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("failed to unmarshal request body: %w", err), providerName) } @@ - // Add stream if not present - if isStreaming { - requestBody["stream"] = true - } + // Normalize stream to match the actual transport + if isStreaming { + requestBody["stream"] = true + } else { + delete(requestBody, "stream") + }Also please document the security/contract assumptions of
bifrost-use-raw-request-body(who is allowed to set it, and what validations are intentionally bypassed).core/providers/vertex/utils.go (1)
13-86: Raw-body branch should delete"stream"whenisStreaming == falseto prevent accidental SSE.// Add stream if not present if isStreaming { requestBody["stream"] = true + } else { + delete(requestBody, "stream") }(Also consider sharing the same normalization behavior across
core/providers/vertex/utils.goandcore/providers/anthropic/utils.goso the stack has one consistent raw-body contract.)core/providers/bedrock/utils.go (1)
119-124:additionalModelRequestFieldPathsoverwritesreasoning_configset earlier.When both
Reasoningconfig (lines 35-77) andadditionalModelRequestFieldPathsinExtraParamsare provided, the assignment at line 122 overwritesAdditionalModelRequestFieldsentirely, losing thereasoning_configthat was set earlier.Consider merging instead of overwriting:
if requestFields, exists := bifrostReq.Params.ExtraParams["additionalModelRequestFieldPaths"]; exists { if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok { - bedrockReq.AdditionalModelRequestFields = orderedFields + if bedrockReq.AdditionalModelRequestFields == nil { + bedrockReq.AdditionalModelRequestFields = orderedFields + } else { + // Merge orderedFields into existing, preserving reasoning_config + for key, value := range orderedFields { + bedrockReq.AdditionalModelRequestFields[key] = value + } + } } }core/providers/azure/utils.go (2)
31-34: Raw-body path should normalizestreamfield whenisStreaming=false.When
useRawRequestBody=trueand the raw JSON already contains"stream": true, this code won't reset it whenisStreaming=false. This creates a protocol mismatch risk where the raw body forces streaming even when the caller expects non-streaming responses.// Add stream if not present if isStreaming { requestBody["stream"] = true + } else { + delete(requestBody, "stream") }
45-47: Avoid string-literal error codes inNewBifrostOperationError.Using
"request body is not provided"as a string literal won't be handled consistently with other error codes. Prefer aschemas.Err...constant.if reqBody == nil { - return nil, providerUtils.NewBifrostOperationError("request body is not provided", nil, providerName) + return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyMissing, nil, providerName) }Note: You may need to define
ErrRequestBodyMissingin the schemas package if it doesn't exist.core/internal/testutil/chat_completion_stream.go (1)
523-708: Cerebras prompt requires tools but the test doesn’t configure any tools.
This is the same issue previously raised; it will likely cause failures/non-determinism for Cerebras. Consider using the same reasoning/math prompt as other providers, or wire tool configuration for Cerebras in this scenario.- if testConfig.Provider == schemas.Cerebras { - problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this" - } + if testConfig.Provider == schemas.Cerebras { + problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?" + }core/schemas/responses.go (2)
69-70: Fix comment typo (“Not in OpenAI’s spec”).- StopReason *string `json:"stop_reason,omitempty"` // Not is OpenAI's spec, but sent by other providers + StopReason *string `json:"stop_reason,omitempty"` // Not in OpenAI's spec, but sent by other providers
1440-1443: Fix comment typo (“Not in OpenAI’s spec”).- Signature *string `json:"signature,omitempty"` // Not is OpenAI's spec, but sent by other providers + Signature *string `json:"signature,omitempty"` // Not in OpenAI's spec, but sent by other providersframework/streaming/responses.go (2)
498-534: Don’t mutate a non-reasoning message when matching byItemIDonly.
This is the same issue previously raised: strengthen the lookup to ensureType == reasoningbefore appending deltas/signatures.- for i := len(messages) - 1; i >= 0; i-- { - if messages[i].ID != nil && *messages[i].ID == *resp.ItemID { + for i := len(messages) - 1; i >= 0; i-- { + if messages[i].ID != nil && *messages[i].ID == *resp.ItemID && + messages[i].Type != nil && *messages[i].Type == schemas.ResponsesMessageTypeReasoning { targetMessage = &messages[i] break } }
626-727: Reasoning summary delta/signature should consistently populateResponsesReasoning(notContentBlocks) + guard negativeContentIndex.
Current logic writes into content blocks whencontentIndex != nil, but this event type is “reasoning_summary_text.delta”; it should map toResponsesReasoning.Summary(+ signature intoEncryptedContent) regardless of ContentIndex. Also,make([]T, *contentIndex+1)will panic for negative values.func (a *Accumulator) appendReasoningDeltaToResponsesMessage(message *schemas.ResponsesMessage, delta string, contentIndex *int) { - // If we have a content index, this is reasoning content in content blocks - if contentIndex != nil { - ... - } else { - // No content index - this is reasoning summary accumulation - if message.ResponsesReasoning == nil { - message.ResponsesReasoning = &schemas.ResponsesReasoning{ - Summary: []schemas.ResponsesReasoningSummary{}, - } - } - ... - } + if message.ResponsesReasoning == nil { + message.ResponsesReasoning = &schemas.ResponsesReasoning{Summary: []schemas.ResponsesReasoningSummary{}} + } + idx := 0 + if contentIndex != nil && *contentIndex >= 0 { + idx = *contentIndex + } + for len(message.ResponsesReasoning.Summary) <= idx { + message.ResponsesReasoning.Summary = append(message.ResponsesReasoning.Summary, schemas.ResponsesReasoningSummary{ + Type: schemas.ResponsesReasoningContentBlockTypeSummaryText, + Text: "", + }) + } + message.ResponsesReasoning.Summary[idx].Text += delta } func (a *Accumulator) appendReasoningSignatureToResponsesMessage(message *schemas.ResponsesMessage, signature string, contentIndex *int) { - if contentIndex != nil { - ... - } else { - ... - } + if message.ResponsesReasoning == nil { + message.ResponsesReasoning = &schemas.ResponsesReasoning{Summary: []schemas.ResponsesReasoningSummary{}} + } + if message.ResponsesReasoning.EncryptedContent == nil { + message.ResponsesReasoning.EncryptedContent = &signature + } else { + *message.ResponsesReasoning.EncryptedContent += signature + } }transports/bifrost-http/integrations/anthropic.go (4)
109-120: Avoid stdliblog.Printfin request path; use structured logger (and considerstrings.Builder).
196-238: Make claude-cli detection case-insensitive for header values.- if strings.Contains(userAgent[0], "claude-cli") { + if strings.Contains(strings.ToLower(userAgent[0]), "claude-cli") { *bifrostCtx = context.WithValue(*bifrostCtx, schemas.BifrostContextKeyUserAgent, "claude-cli") } @@ - if strings.Contains(userAgent, "claude-cli") { + if strings.Contains(strings.ToLower(userAgent), "claude-cli") { isClaudeCode = true }
240-245: Re-checkprovider == ""fallback inisClaudeModelto avoid misclassification when Provider is omitted.
90-126: Don’t drop passthrough stream events when type parse fails (can stall the stream).
Same as previously noted: fall back to forwarding the raw payload even if you can’t classify the event type.- var rawResponseJSON anthropic.AnthropicStreamEvent - if err := sonic.Unmarshal([]byte(raw), &rawResponseJSON); err == nil { - return string(rawResponseJSON.Type), raw, nil - } + var rawResponseJSON anthropic.AnthropicStreamEvent + if err := sonic.Unmarshal([]byte(raw), &rawResponseJSON); err == nil { + return string(rawResponseJSON.Type), raw, nil + } + // Fallback: forward raw event payload even if we can't classify its type. + return "", raw, nil
🧹 Nitpick comments (6)
core/providers/bedrock/types.go (1)
129-137: Consider logging or tracking skipped fields during unmarshaling.The
continuestatement on line 133 silently drops fields that cannot be unmarshaled intoExtraParams. This could mask malformed data or unexpected field types without providing visibility into what's being skipped.Consider adding logging or metrics to track when fields are skipped:
// Extract unknown fields for key, value := range rawData { if !bedrockConverseRequestKnownFields[key] { var v interface{} if err := sonic.Unmarshal(value, &v); err != nil { - continue // Skip fields that can't be unmarshaled + // Log or track skipped fields for observability + // For now, continue to maintain backward compatibility + continue } r.ExtraParams[key] = v } }core/providers/utils/utils.go (1)
318-335: Consider optimizing body handling to avoid unnecessary allocation.The code copies
resp.Body()intobodyat line 320, then immediately callsCheckAndDecodeBodywhich may decompress the response. If decoding succeeds, the originalbodycopy is replaced (line 334), making the initial copy wasteful.Consider this optimization:
func HandleProviderAPIError(resp *fasthttp.Response, errorResp any) *schemas.BifrostError { statusCode := resp.StatusCode() - body := append([]byte(nil), resp.Body()...) // decode body decodedBody, err := CheckAndDecodeBody(resp) if err != nil { return &schemas.BifrostError{ IsBifrostError: false, StatusCode: &statusCode, Error: &schemas.ErrorField{ Message: err.Error(), }, } } - body = decodedBody + body := decodedBodyThis avoids the unnecessary initial copy when decoding succeeds (the common case).
transports/bifrost-http/integrations/router.go (1)
312-314: Prefer storingRouteConfigType(or a dedicated enum) in context instead ofstring(config.Type).If downstream code can accept it, store
config.Typedirectly to avoid stringly-typed comparisons (and future typo bugs).core/providers/openai/responses_test.go (1)
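A small sketch of the typed-key approach (names are simplified stand-ins for the router's real types and context key):

```go
package main

import (
	"context"
	"fmt"
)

type RouteConfigType string

const RouteConfigTypeAnthropic RouteConfigType = "anthropic"

type ctxKey string

const integrationTypeKey ctxKey = "integration-type"

func main() {
	// Store the typed value rather than string(config.Type).
	ctx := context.WithValue(context.Background(), integrationTypeKey, RouteConfigTypeAnthropic)

	// Downstream reads stay type-checked; a bare string stored by mistake
	// would fail this assertion instead of silently comparing unequal.
	if t, ok := ctx.Value(integrationTypeKey).(RouteConfigType); ok {
		fmt.Println("integration:", t)
	}
}
```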
10-193: Nice coverage for skip/preserve rules; consider asserting exact counts, not just presence.
Right now the test reduces to “included vs not”; if future logic accidentally adds two messages, this won’t catch it.core/internal/testutil/chat_completion_stream.go (1)
359-521: Reasoning stream test is permissive; consider asserting at least one “reasoning indicator” in non-flaky providers.
Right now the test logs indicators but doesn’t fail if none are present (which makes the test less useful as a regression detector). If this is intentional due to provider variability, OK—otherwise consider failing for providers/models where reasoning is expected to surface.core/schemas/responses.go (1)
730-748: LGTM:ResponsesReasoningSummaryaligns with UI shape (type: "summary_text", text).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
ui/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (58)
core/internal/testutil/account.go(1 hunks)core/internal/testutil/chat_completion_stream.go(1 hunks)core/internal/testutil/reasoning.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/tests.go(2 hunks)core/providers/anthropic/anthropic.go(3 hunks)core/providers/anthropic/chat.go(5 hunks)core/providers/anthropic/errors.go(2 hunks)core/providers/anthropic/types.go(8 hunks)core/providers/anthropic/utils.go(2 hunks)core/providers/azure/azure.go(2 hunks)core/providers/azure/utils.go(1 hunks)core/providers/bedrock/bedrock.go(2 hunks)core/providers/bedrock/bedrock_test.go(13 hunks)core/providers/bedrock/types.go(2 hunks)core/providers/bedrock/utils.go(2 hunks)core/providers/cerebras/cerebras_test.go(2 hunks)core/providers/cohere/chat.go(3 hunks)core/providers/cohere/cohere.go(2 hunks)core/providers/cohere/cohere_test.go(1 hunks)core/providers/cohere/responses.go(7 hunks)core/providers/cohere/types.go(1 hunks)core/providers/gemini/responses.go(2 hunks)core/providers/groq/groq_test.go(2 hunks)core/providers/mistral/mistral_test.go(1 hunks)core/providers/openai/chat.go(1 hunks)core/providers/openai/responses.go(2 hunks)core/providers/openai/responses_marshal_test.go(1 hunks)core/providers/openai/responses_test.go(1 hunks)core/providers/openai/text.go(1 hunks)core/providers/openai/types.go(3 hunks)core/providers/openai/types_test.go(1 hunks)core/providers/openai/utils.go(1 hunks)core/providers/openrouter/openrouter_test.go(1 hunks)core/providers/utils/utils.go(4 hunks)core/providers/vertex/errors.go(1 hunks)core/providers/vertex/types.go(0 hunks)core/providers/vertex/utils.go(1 hunks)core/providers/vertex/vertex.go(3 hunks)core/providers/vertex/vertex_test.go(1 hunks)core/schemas/bifrost.go(1 hunks)core/schemas/mux.go(0 hunks)core/schemas/responses.go(5 hunks)core/schemas/utils.go(1 hunks)docs/docs.json(0 hunks)framework/configstore/rdb.go(0 hunks)framework/streaming/chat.go(0 hunks)framework/streaming/responses.go(2 hunks)transports/bifrost-http/handlers/inference.go(1 hunks)transports/bifrost-http/handlers/middlewares.go(1 hunks)transports/bifrost-http/integrations/anthropic.go(5 hunks)transports/bifrost-http/integrations/router.go(3 hunks)ui/app/workspace/logs/views/columns.tsx(1 hunks)ui/app/workspace/logs/views/logDetailsSheet.tsx(1 hunks)ui/app/workspace/logs/views/logResponsesMessageView.tsx(2 hunks)ui/app/workspace/logs/views/logResponsesOutputView.tsx(0 hunks)ui/lib/types/logs.ts(2 hunks)ui/package.json(1 hunks)
💤 Files with no reviewable changes (6)
- ui/app/workspace/logs/views/logResponsesOutputView.tsx
- core/providers/vertex/types.go
- core/schemas/mux.go
- framework/streaming/chat.go
- framework/configstore/rdb.go
- docs/docs.json
🚧 Files skipped from review as they are similar to previous changes (16)
- ui/app/workspace/logs/views/logResponsesMessageView.tsx
- core/schemas/bifrost.go
- ui/lib/types/logs.ts
- core/providers/cohere/types.go
- ui/package.json
- core/providers/azure/azure.go
- core/schemas/utils.go
- core/providers/vertex/vertex.go
- core/providers/bedrock/bedrock.go
- transports/bifrost-http/handlers/middlewares.go
- core/providers/openai/text.go
- core/internal/testutil/tests.go
- core/providers/cerebras/cerebras_test.go
- ui/app/workspace/logs/views/columns.tsx
- core/internal/testutil/account.go
- core/providers/openai/types_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
transports/bifrost-http/handlers/inference.gocore/providers/vertex/vertex_test.gocore/providers/anthropic/errors.gocore/providers/openai/utils.gocore/providers/openrouter/openrouter_test.gocore/providers/vertex/errors.gocore/providers/openai/responses.gocore/providers/anthropic/chat.gocore/providers/anthropic/utils.gocore/providers/openai/types.gocore/internal/testutil/chat_completion_stream.gocore/providers/vertex/utils.gocore/providers/azure/utils.gocore/providers/utils/utils.gocore/providers/cohere/cohere_test.gocore/providers/openai/responses_marshal_test.goframework/streaming/responses.gocore/providers/anthropic/types.gocore/providers/bedrock/types.gocore/providers/bedrock/utils.gocore/schemas/responses.gocore/providers/gemini/responses.gocore/internal/testutil/responses_stream.gocore/internal/testutil/reasoning.gocore/providers/mistral/mistral_test.gocore/providers/groq/groq_test.gocore/providers/cohere/cohere.gocore/providers/openai/responses_test.gocore/providers/openai/chat.gotransports/bifrost-http/integrations/anthropic.gocore/providers/bedrock/bedrock_test.gocore/providers/anthropic/anthropic.gotransports/bifrost-http/integrations/router.gocore/providers/cohere/chat.goui/app/workspace/logs/views/logDetailsSheet.tsxcore/providers/cohere/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
transports/bifrost-http/handlers/inference.gocore/providers/vertex/vertex_test.gocore/providers/anthropic/errors.gocore/providers/openai/utils.gocore/providers/openrouter/openrouter_test.gocore/providers/vertex/errors.gocore/providers/openai/responses.gocore/providers/anthropic/chat.gocore/providers/anthropic/utils.gocore/providers/openai/types.gocore/internal/testutil/chat_completion_stream.gocore/providers/vertex/utils.gocore/providers/azure/utils.gocore/providers/utils/utils.gocore/providers/cohere/cohere_test.gocore/providers/openai/responses_marshal_test.goframework/streaming/responses.gocore/providers/anthropic/types.gocore/providers/bedrock/types.gocore/providers/bedrock/utils.gocore/schemas/responses.gocore/providers/gemini/responses.gocore/internal/testutil/responses_stream.gocore/internal/testutil/reasoning.gocore/providers/mistral/mistral_test.gocore/providers/groq/groq_test.gocore/providers/cohere/cohere.gocore/providers/openai/responses_test.gocore/providers/openai/chat.gotransports/bifrost-http/integrations/anthropic.gocore/providers/bedrock/bedrock_test.gocore/providers/anthropic/anthropic.gotransports/bifrost-http/integrations/router.gocore/providers/cohere/chat.gocore/providers/cohere/responses.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.
Applied to files:
transports/bifrost-http/handlers/inference.gotransports/bifrost-http/integrations/anthropic.gotransports/bifrost-http/integrations/router.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/utils.gocore/providers/openai/responses.gocore/providers/openai/types.gocore/providers/openai/responses_marshal_test.gocore/providers/openai/responses_test.gocore/providers/openai/chat.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach:
- Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields.
- Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals.
Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
🧬 Code graph analysis (25)
transports/bifrost-http/handlers/inference.go (1)
  core/schemas/responses.go (1): ResponsesParameters (87-114)
core/providers/anthropic/errors.go (2)
  core/schemas/bifrost.go (1): BifrostError (358-367)
  ui/lib/types/logs.ts (1): BifrostError (226-232)
core/providers/vertex/errors.go (4)
  core/providers/utils/utils.go (2): CheckAndDecodeBody (490-498), NewBifrostOperationError (516-527)
  core/schemas/provider.go (1): ErrProviderResponseDecode (29-29)
  core/providers/vertex/vertex.go (1): VertexError (25-31)
  core/providers/vertex/types.go (1): VertexValidationError (153-160)
core/providers/openai/responses.go (4)
  core/schemas/responses.go (6): ResponsesMessage (314-327), ResponsesReasoning (731-734), ResponsesMessageContentBlock (399-411), ResponsesOutputMessageContentTypeReasoning (394-394), ResponsesMessageContent (339-344), ResponsesParameters (87-114)
  core/schemas/utils.go (1): Ptr (16-18)
  core/providers/openai/types.go (3): OpenAIResponsesRequest (180-189), OpenAIResponsesRequestInput (147-150), MinMaxCompletionTokens (12-12)
  core/providers/openai/utils.go (1): SanitizeUserField (51-56)
core/providers/anthropic/chat.go (6)
  core/providers/anthropic/types.go (2): MinimumReasoningMaxTokens (15-15), AnthropicThinking (69-72)
  core/providers/bedrock/types.go (1): MinimumReasoningMaxTokens (12-12)
  core/providers/cohere/types.go (1): MinimumReasoningMaxTokens (11-11)
  core/providers/utils/utils.go (1): GetBudgetTokensFromReasoningEffort (1084-1121)
  core/schemas/utils.go (1): Ptr (16-18)
  core/utils.go (1): Ptr (51-53)
core/providers/anthropic/utils.go (6)
  core/schemas/responses.go (1): BifrostResponsesRequest (32-39)
  core/providers/utils/utils.go (1): NewBifrostOperationError (516-527)
  core/schemas/provider.go (1): ErrRequestBodyConversion (25-25)
  core/providers/anthropic/types.go (1): AnthropicDefaultMaxTokens (14-14)
  core/providers/anthropic/responses.go (1): ToAnthropicResponsesRequest (1419-1532)
  core/schemas/utils.go (1): Ptr (16-18)
core/providers/openai/types.go (3)
  core/schemas/chatcompletions.go (1): ChatParameters (155-184)
  core/schemas/responses.go (1): ResponsesParametersReasoning (234-239)
  ui/lib/types/logs.ts (1): ResponsesParametersReasoning (513-520)
core/internal/testutil/chat_completion_stream.go (5)
  core/schemas/chatcompletions.go (5): ChatMessage (469-478), BifrostChatRequest (12-19), ChatParameters (155-184), ChatReasoning (223-226), BifrostChatResponse (26-41)
  core/internal/testutil/utils.go (1): CreateBasicChatMessage (247-254)
  core/schemas/models.go (1): Model (109-129)
  core/internal/testutil/test_retry_framework.go (4): StreamingRetryConfig (819-839), TestRetryContext (168-173), WithStreamRetry (580-688), ChatStreamValidationResult (2328-2335)
  core/schemas/bifrost.go (2): BifrostStream (323-330), BifrostError (358-367)
core/providers/vertex/utils.go (7)
  core/schemas/responses.go (1): BifrostResponsesRequest (32-39)
  core/providers/utils/utils.go (1): NewBifrostOperationError (516-527)
  core/schemas/provider.go (2): ErrRequestBodyConversion (25-25), ErrProviderRequestMarshal (26-26)
  core/providers/anthropic/types.go (1): AnthropicDefaultMaxTokens (14-14)
  core/providers/vertex/types.go (1): DefaultVertexAnthropicVersion (8-8)
  core/providers/anthropic/responses.go (1): ToAnthropicResponsesRequest (1419-1532)
  core/schemas/utils.go (1): Ptr (16-18)
core/providers/utils/utils.go (2)
  core/schemas/bifrost.go (2): BifrostError (358-367), ErrorField (376-383)
  ui/lib/types/logs.ts (2): BifrostError (226-232), ErrorField (217-224)
core/providers/openai/responses_marshal_test.go (3)
  core/providers/openai/types.go (2): OpenAIResponsesRequest (180-189), OpenAIResponsesRequestInput (147-150)
  core/schemas/utils.go (1): Ptr (16-18)
  core/schemas/responses.go (5): ResponsesParameters (87-114), ResponsesParametersReasoning (234-239), ResponsesMessage (314-327), ResponsesInputMessageRoleUser (333-333), ResponsesMessageContent (339-344)
framework/streaming/responses.go (3)
  core/schemas/responses.go (8): ResponsesStreamResponseTypeReasoningSummaryTextDelta (1393-1393), ResponsesMessage (314-327), ResponsesReasoning (731-734), ResponsesReasoningSummary (745-748), ResponsesMessageContent (339-344), ResponsesMessageContentBlock (399-411), ResponsesOutputMessageContentTypeReasoning (394-394), ResponsesReasoningContentBlockTypeSummaryText (741-741)
  core/schemas/utils.go (1): Ptr (16-18)
  core/utils.go (1): Ptr (51-53)
core/providers/bedrock/types.go (5)
  core/providers/anthropic/types.go (1): Alias (104-104)
  core/providers/openai/types.go (2): Alias (83-83), Alias (194-194)
  core/schemas/bifrost.go (2): Alias (388-388), Alias (405-405)
  core/schemas/chatcompletions.go (4): Alias (189-189), Alias (643-643), Alias (770-770), Alias (860-860)
  core/providers/gemini/types.go (7): Alias (220-220), Alias (245-245), Alias (828-830), Alias (842-844), Alias (857-859), Alias (871-873), Alias (1306-1306)
core/providers/bedrock/utils.go (6)
  core/schemas/chatcompletions.go (1): BifrostChatRequest (12-19)
  core/providers/bedrock/types.go (4): BedrockConverseRequest (55-75), MinimumReasoningMaxTokens (12-12), DefaultCompletionMaxTokens (13-13), BedrockInferenceConfig (255-260)
  core/schemas/utils.go (2): IsAnthropicModel (1043-1045), Ptr (16-18)
  core/providers/anthropic/types.go (1): MinimumReasoningMaxTokens (15-15)
  core/providers/cohere/types.go (2): MinimumReasoningMaxTokens (11-11), DefaultCompletionMaxTokens (12-12)
  core/providers/utils/utils.go (1): GetBudgetTokensFromReasoningEffort (1084-1121)
core/schemas/responses.go (2)
  core/providers/gemini/types.go (1): Type (778-778)
  ui/lib/types/logs.ts (2): ResponsesMessageContentBlockType (352-359), ResponsesReasoningSummary (412-415)
core/internal/testutil/responses_stream.go (1)
  core/schemas/utils.go (1): Ptr (16-18)
core/internal/testutil/reasoning.go (5)
  core/schemas/utils.go (1): Ptr (16-18)
  core/schemas/bifrost.go (1): OpenAI (35-35)
  core/internal/testutil/utils.go (3): CreateBasicChatMessage (247-254), GetErrorMessage (606-639), GetChatContent (329-356)
  core/internal/testutil/test_retry_framework.go (1): WithChatTestRetry (274-424)
  core/internal/testutil/validation_presets.go (2): GetExpectationsForScenario (208-293), ModifyExpectationsForProvider (300-347)
core/providers/cohere/cohere.go (1)
  core/providers/cohere/responses.go (1): ToCohereResponsesRequest (916-1030)
core/providers/openai/responses_test.go (1)
  core/providers/openai/responses.go (1): ToOpenAIResponsesRequest (37-112)
core/providers/openai/chat.go (5)
  core/schemas/chatcompletions.go (1): ChatParameters (155-184)
  core/providers/openai/types.go (1): MinMaxCompletionTokens (12-12)
  core/schemas/utils.go (1): Ptr (16-18)
  core/utils.go (1): Ptr (51-53)
  core/providers/openai/utils.go (1): SanitizeUserField (51-56)
transports/bifrost-http/integrations/anthropic.go (3)
  core/schemas/bifrost.go (5): BifrostContextKeyUserAgent (123-123), Anthropic (37-37), BifrostContextKeyExtraHeaders (115-115), BifrostContextKeyURLPath (116-116), BifrostContextKeySkipKeySelection (114-114)
  core/providers/anthropic/types.go (1): AnthropicStreamEvent (395-404)
  core/schemas/utils.go (1): IsAnthropicModel (1043-1045)
core/providers/bedrock/bedrock_test.go (3)
  core/schemas/utils.go (1): Ptr (16-18)
  core/schemas/responses.go (4): ResponsesMessageTypeMessage (290-290), ResponsesMessageTypeFunctionCallOutput (296-296), ResponsesToolMessage (462-482), ResponsesToolMessageOutputStruct (531-535)
  core/schemas/chatcompletions.go (1): OrderedMap (268-268)
core/providers/anthropic/anthropic.go (3)
  core/schemas/bifrost.go (1): ResponsesRequest (91-91)
  transports/bifrost-http/handlers/inference.go (1): ResponsesRequest (257-261)
  core/providers/utils/utils.go (3): HandleProviderResponse (359-445), ShouldSendBackRawRequest (551-556), ShouldSendBackRawResponse (559-564)
transports/bifrost-http/integrations/router.go (2)
  core/schemas/bifrost.go (1): BifrostContextKeyIntegrationType (120-120)
  core/providers/gemini/types.go (1): Type (778-778)
ui/app/workspace/logs/views/logDetailsSheet.tsx (3)
  ui/components/ui/separator.tsx (1): DottedSeparator (43-43)
  ui/app/workspace/logs/views/logEntryDetailsView.tsx (1): LogEntryDetailsView (15-49)
  ui/components/ui/badge.tsx (1): Badge (37-37)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (51)
core/providers/mistral/mistral_test.go (1)
49-49: LGTM: Clear explanation of implementation limitation. The addition of Reasoning: false with the explanatory comment correctly reflects bifrost's current state. The comment effectively clarifies that this is an implementation limitation (lack of native Mistral converters) rather than a provider capability gap, addressing the concern raised in the previous review.
core/providers/groq/groq_test.go (1)
55-55: LGTM, assuming the reasoning model is valid. Enabling reasoning tests for Groq aligns with the PR's objective to expand reasoning support across providers. This change is consistent with similar updates to other provider tests (Cerebras, Cohere, etc.).
The approval is conditional on verifying that the ReasoningModel configured at line 37 is valid and supports reasoning capabilities.
core/providers/bedrock/types.go (3)
3-8: LGTM! The added imports support the custom JSON unmarshaling implementation: encoding/json provides json.RawMessage, sonic provides the unmarshaling capability, and schemas provides internal types for the struct fields.
70-74: LGTM! The added fields appropriately extend BedrockConverseRequest:
- ExtraParams captures unknown fields during JSON unmarshaling
- Fallbacks provides Bifrost-specific configuration
This pattern aligns with similar implementations in other providers (Anthropic, OpenAI).
82-98: LGTM! The known-fields map correctly includes all struct fields, including serviceTier, extra_params, and fallbacks, which were previously flagged in past reviews. This ensures proper field recognition during custom unmarshaling.
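For readers unfamiliar with the pattern, here is a minimal sketch of a known-fields map feeding an ExtraParams catch-all. It uses encoding/json and a trimmed-down field set; the actual BedrockConverseRequest unmarshaller uses sonic and the full struct.

```go
package sketch

import "encoding/json"

// Simplified request: unrecognized JSON keys end up in ExtraParams.
type converseRequest struct {
	ModelID     string                 `json:"modelId"`
	ServiceTier string                 `json:"serviceTier,omitempty"`
	ExtraParams map[string]interface{} `json:"-"`
}

// knownFields must mirror every json tag declared on the struct above.
var knownFields = map[string]bool{
	"modelId":     true,
	"serviceTier": true,
}

func (r *converseRequest) UnmarshalJSON(data []byte) error {
	// Decode typed fields through an alias so this method is not invoked recursively.
	type alias converseRequest
	var a alias
	if err := json.Unmarshal(data, &a); err != nil {
		return err
	}
	*r = converseRequest(a)

	// Decode again into a generic map and keep anything not in knownFields.
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	for k, v := range raw {
		if knownFields[k] {
			continue
		}
		var val interface{}
		if err := json.Unmarshal(v, &val); err != nil {
			return err
		}
		if r.ExtraParams == nil {
			r.ExtraParams = map[string]interface{}{}
		}
		r.ExtraParams[k] = val
	}
	return nil
}
```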
core/providers/vertex/errors.go (1)
10-44: LGTM! The decoding logic is well-implemented:
- Properly handles gzip-encoded responses via CheckAndDecodeBody
- Returns appropriate errors on decode failure
- Consistently uses decodedBody for all subsequent unmarshal attempts
- Maintains all existing error fallback paths
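A rough sketch of what a gzip-aware decode step looks like, assuming the body may or may not be compressed; bifrost's CheckAndDecodeBody has its own signature and error wrapping, so treat the helper below as illustrative only.

```go
package sketch

import (
	"bytes"
	"compress/gzip"
	"io"
)

// decodeBody returns the plain bytes of a provider error body, transparently
// un-gzipping when the payload starts with the gzip magic bytes (0x1f 0x8b).
func decodeBody(body []byte) ([]byte, error) {
	if len(body) >= 2 && body[0] == 0x1f && body[1] == 0x8b {
		zr, err := gzip.NewReader(bytes.NewReader(body))
		if err != nil {
			return nil, err
		}
		defer zr.Close()
		return io.ReadAll(zr)
	}
	return body, nil // already plain JSON/text
}
```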
core/internal/testutil/reasoning.go (3)
12-117: LGTM! The rename to RunResponsesReasoningTest clearly distinguishes this as a Responses API-specific test, and the token increase to 1800 provides adequate capacity for complex reasoning scenarios.
201-310: LGTM! The new Chat Completions reasoning test is well-structured:
- Mirrors the Responses API test pattern consistently
- Includes a documented OpenAI skip due to known flakiness
- Properly configures reasoning parameters (effort: "high", MaxTokens: 1800)
- Integrates with the retry framework and validation expectations
312-411: LGTM! The validation logic is comprehensive and correct:
- Properly accesses reasoning fields via message.ChatAssistantMessage
- Checks multiple reasoning indicators (content, details array, token usage)
- Provides detailed logging for each indicator type
- Correctly iterates over ReasoningDetails with proper type checking
core/internal/testutil/responses_stream.go (1)
440-440: LGTM! The token increase to 1800 is consistent with reasoning test adjustments across the codebase and provides adequate capacity for reasoning-heavy streaming scenarios.
core/providers/openai/chat.go (1)
31-38: LGTM! The token clamping and user field sanitization logic is well-implemented:
- Properly clamps MaxCompletionTokens to the minimum (16)
- Sanitizes the User field to respect the 64-character limit
- Clean integration with the utility functions
Note: A past review comment requested test coverage for these guards and was marked as addressed.
core/providers/openai/utils.go (1)
47-56: LGTM! The user field sanitization is well-implemented:
- The 64-character limit is properly documented as an OpenAI enforcement
- SanitizeUserField correctly returns nil when the limit is exceeded
- Simple, clear implementation
Note: A past review confirmed this limit through actual OpenAI errors, even though it's not in their public documentation.
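Taken together, the two guards discussed in the chat.go and utils.go comments above amount to something like the following sketch; constants and signatures are illustrative stand-ins, not the exact bifrost helpers.

```go
package sketch

const (
	minMaxCompletionTokens = 16 // illustrative stand-in for MinMaxCompletionTokens
	maxUserFieldLength     = 64 // OpenAI's observed limit on the user field
)

// clampMaxCompletionTokens raises too-small values to the provider minimum.
func clampMaxCompletionTokens(v *int) *int {
	if v != nil && *v < minMaxCompletionTokens {
		clamped := minMaxCompletionTokens
		return &clamped
	}
	return v
}

// sanitizeUserField drops the identifier entirely rather than send a value
// the provider would reject.
func sanitizeUserField(user *string) *string {
	if user != nil && len(*user) > maxUserFieldLength {
		return nil
	}
	return user
}
```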
transports/bifrost-http/handlers/inference.go (1)
224-254: LGTM! The custom unmarshaller for ResponsesRequest is well-implemented:
- Follows the same pattern as ChatRequest.UnmarshalJSON (lines 169-199)
- Properly handles the embedded BifrostParams struct using an alias
- Separately unmarshals Input and ResponsesParameters to avoid conflicts
- Includes clear documentation explaining the need for custom unmarshalling
The implementation is consistent and correct.
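The alias technique the comment refers to can be sketched as follows; the field names are simplified stand-ins for the real handler types.

```go
package sketch

import "encoding/json"

// Illustrative stand-ins; the real structs carry many more fields.
type requestParams struct {
	Temperature *float64 `json:"temperature,omitempty"`
}

type responsesRequestSketch struct {
	Model  string          `json:"model"`
	Input  json.RawMessage `json:"input"`
	Params requestParams   `json:"-"`
}

func (r *responsesRequestSketch) UnmarshalJSON(data []byte) error {
	// The alias sheds this UnmarshalJSON method, so the base fields decode normally.
	type alias responsesRequestSketch
	var a alias
	if err := json.Unmarshal(data, &a); err != nil {
		return err
	}
	*r = responsesRequestSketch(a)
	// Second pass: decode the parameter struct from the same payload so its own
	// unmarshalling still runs without clobbering the base fields.
	return json.Unmarshal(data, &r.Params)
}
```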
core/providers/cohere/cohere_test.go (1)
25-50: Ensure the configured CohereReasoningModel is actually available in CI/accounts (model-name drift risk). Hardcoding "command-a-reasoning-08-2025" can make provider tests flaky if that model isn't enabled for the key used in CI. Consider reading this from env (similar to API key gating) or falling back to ChatModel when unavailable.
core/providers/anthropic/anthropic.go (2)
677-705: Centralized Responses body construction looks like a net win (less drift between modes). Please sanity-check that getRequestBodyForResponses(..., stream=false) does not accidentally leave stream=true (or equivalent) set from reused structs, and that the marshaled jsonBody matches what Anthropic expects for /v1/messages.
725-729: Good: streaming path now reuses the same body builder and properly propagates conversion errors.
core/providers/cohere/cohere.go (2)
510-517: Good: propagate ToCohereResponsesRequest conversion errors instead of forcing nil error.
569-577: Good: early-return on conversion error before mutating/stream-enabling request. Just confirm ToCohereResponsesRequest returns a concrete type with .Stream (not any), otherwise this won't compile.
core/providers/openai/types.go (3)
11-13: Confirm the MinMaxCompletionTokens value is correct for all OpenAI(-compatible) backends. Hard-coding 16 is fine, but please double-check this minimum is actually required/valid across the providers routed through this OpenAI path.
108-139: Unmarshal split (base fields + schemas.ChatParameters) is the right fix for the "hijack" problem. This avoids losing Model/Messages/Stream/MaxTokens/Fallbacks while still allowing schemas.ChatParameters.UnmarshalJSON to run.
191-229: Good shadowing approach: preserves Input's custom JSON and reliably omits reasoning.max_tokens. The clone prevents accidental mutation of r.Reasoning while ensuring the outgoing payload matches the intended OpenAI schema.
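The shadowing approach described above looks roughly like this sketch, with simplified types and encoding/json standing in for the project's marshaller.

```go
package sketch

import "encoding/json"

type reasoningParams struct {
	Effort    *string `json:"effort,omitempty"`
	MaxTokens *int    `json:"max_tokens,omitempty"`
}

type openAIResponsesSketch struct {
	Model     string           `json:"model"`
	Reasoning *reasoningParams `json:"reasoning,omitempty"`
}

func (r openAIResponsesSketch) MarshalJSON() ([]byte, error) {
	type alias openAIResponsesSketch // alias drops this MarshalJSON, preventing recursion
	shadow := struct {
		alias
		Reasoning *reasoningParams `json:"reasoning,omitempty"` // shallower field shadows the embedded one
	}{alias: alias(r)}
	if r.Reasoning != nil {
		clone := *r.Reasoning // copy so the caller's struct is never mutated
		clone.MaxTokens = nil // the Responses API does not accept reasoning.max_tokens
		shadow.Reasoning = &clone
	}
	return json.Marshal(shadow)
}
```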
core/providers/anthropic/chat.go (2)
605-632: Good: don't require non-empty PartialJSON to stream tool args. Allowing PartialJSON != nil matches real-world deltas where empty-string chunks occur.
633-660: Good: the thinkingText local avoids accidental aliasing and keeps pointer fields consistent. This is a safer pattern for constructing Reasoning and ReasoningDetails[].Text.
core/providers/openai/responses_test.go (1)
195-346: Add a regression test for the "ResponsesReasoning != nil ⇒ content blocks treated as reasoning" constraint. This behavior was called out as a contract; locking it with a focused test will prevent future drift. Based on learnings, this should be enforced and verified for downstream consumers.
core/providers/bedrock/utils.go (2)
11-11: LGTM! The providerUtils import alias is consistent with the pattern used in other providers in this PR.
39-76: Reasoning config logic correctly addresses model-aware minimum tokens. The implementation now properly uses Bedrock's MinimumReasoningMaxTokens (1) by default, and only uses Anthropic's higher minimum (1024) when the model is detected as an Anthropic model via IsAnthropicModel(). This resolves the concern from the previous review about spurious errors for Bedrock configs.
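In sketch form, the model-aware minimum reduces to a selection like the one below. The constant values mirror the description above, and the substring check is an illustrative substitute for bifrost's IsAnthropicModel helper.

```go
package sketch

import "strings"

const (
	bedrockMinReasoningTokens   = 1    // Bedrock accepts very small thinking budgets
	anthropicMinReasoningTokens = 1024 // Anthropic models require a larger minimum budget
)

// minReasoningTokensFor picks the floor to clamp a reasoning budget against.
func minReasoningTokensFor(model string) int {
	m := strings.ToLower(model)
	if strings.Contains(m, "anthropic") || strings.Contains(m, "claude") {
		return anthropicMinReasoningTokens
	}
	return bedrockMinReasoningTokens
}
```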
core/providers/openai/responses.go (3)
3-7: LGTM! Import changes are appropriate for the added string-based model checks.
42-89: Message normalization logic correctly implemented with proper documentation. The code properly:
- Documents the OpenAI responses format constraint (lines 47-52)
- Uses schemas.Ptr(summary.Text) to avoid pointer-to-range-variable aliasing (line 76)
- Handles both gpt-oss and non-gpt-oss models appropriately
Based on learnings, this aligns with the confirmed OpenAI responses format behavior.
100-109: LGTM! Good defensive parameter handling:
- Clamping MaxOutputTokens to the minimum ensures valid requests
- Sanitizing the User field respects OpenAI's 64-character limit
- Filtering unsupported tools prevents API errors
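The aliasing point is worth spelling out: under pre-Go 1.22 loop semantics, taking the address of a range variable's field makes every pointer refer to the same storage, which is exactly what a Ptr-style copy avoids. A minimal illustration, with a stand-in type rather than the real schema struct:

```go
package sketch

type reasoningSummary struct{ Text string }

// ptr copies its argument and returns the address of the copy, like schemas.Ptr.
func ptr[T any](v T) *T { return &v }

func collectSummaryTexts(summaries []reasoningSummary) []*string {
	out := make([]*string, 0, len(summaries))
	for _, s := range summaries {
		// Risky before Go 1.22: append(out, &s.Text) would alias the single loop
		// variable, so every entry would end up pointing at the last summary.
		out = append(out, ptr(s.Text))
	}
	return out
}
```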
core/providers/gemini/responses.go (1)
619-627: Ensure consistent encoding/decoding of ThoughtSignature. Line 625 converts EncryptedContent directly to bytes: part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent). If the outbound path (line 169) is updated to base64-encode the ThoughtSignature, this inbound path will need corresponding base64 decoding. Ensure both directions are updated consistently.
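If base64 is adopted, the two directions should mirror each other, roughly as in this sketch; the helper names are illustrative and the fallback behavior is a suggestion, not the current implementation.

```go
package sketch

import "encoding/base64"

// Outbound: the signature bytes become a base64 string in EncryptedContent.
func encodeThoughtSignature(sig []byte) string {
	return base64.StdEncoding.EncodeToString(sig)
}

// Inbound: decode the same representation back to bytes before reattaching it
// to the Gemini part; falling back to raw bytes keeps older payloads working.
func decodeThoughtSignature(encrypted string) []byte {
	if decoded, err := base64.StdEncoding.DecodeString(encrypted); err == nil {
		return decoded
	}
	return []byte(encrypted)
}
```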
core/providers/cohere/chat.go (3)
7-7: LGTM! The providerUtils import alias is consistent with other providers in this PR.
104-129: Reasoning budget logic correctly handles nil checks and uses local constants. The implementation properly:
- Checks MaxCompletionTokens != nil before dereferencing (lines 113-115)
- Uses DefaultCompletionMaxTokens as fallback when MaxCompletionTokens is not provided
- Propagates errors from GetBudgetTokensFromReasoningEffort (lines 117-119)
This addresses the nil pointer dereference concern from the previous review.
439-463: LGTM! The streaming thinking content handling properly extracts the thinking text into a local variable and uses schemas.Ptr() to create safe pointers for both Reasoning and ReasoningDetails fields.
core/schemas/responses.go (1)
399-411: The Signature addition is fine; ensure downstream streaming deep-copy/accumulators preserve it. The schema change looks consistent, but it increases the importance of updating stream deep-copy and accumulation paths (see the framework/streaming/responses.go comments).
core/providers/openai/responses_marshal_test.go (3)
12-129: Good coverage for "omit reasoning.max_tokens" while preserving other reasoning fields.
131-313: Input union tests look solid (string vs array forms).
315-480: Round-trip test is good; verify it matches the current OpenAIResponsesRequest (un)marshal contract. This is tightly coupled to OpenAIResponsesRequest.MarshalJSON/UnmarshalJSON behavior; worth confirming it's stable across the stack PRs.
core/providers/anthropic/types.go (4)
58-140: Unknown-field capture into ExtraParams looks correct (and avoids duplicating extra_params).
161-197: Sonic-based union (string vs blocks) marshaling for AnthropicContent looks good.
199-228: Redacted thinking support (redacted_thinking + data) is a clean extension.
365-424: Usage/cache fields and stop_sequence nullability look consistent with the stated API behavior.
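The string-or-blocks union noted in the 161-197 comment follows a common Go pattern. A simplified sketch with encoding/json (the real code uses sonic and the full AnthropicContent block set):

```go
package sketch

import "encoding/json"

type contentBlock struct {
	Type string `json:"type"`
	Text string `json:"text,omitempty"`
}

// content is either a bare string or an array of blocks on the wire.
type content struct {
	Str    *string
	Blocks []contentBlock
}

func (c content) MarshalJSON() ([]byte, error) {
	if c.Str != nil {
		return json.Marshal(*c.Str)
	}
	return json.Marshal(c.Blocks)
}

func (c *content) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err == nil {
		c.Str = &s
		return nil
	}
	return json.Unmarshal(data, &c.Blocks)
}
```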
core/providers/bedrock/bedrock_test.go (4)
487-487: LGTM: Type field additions align with schema evolution. The test cases correctly populate the Type field with ResponsesMessageTypeMessage for standard message inputs, consistent with the new schema requirement. The pattern is applied uniformly across all test scenarios. Also applies to: 527-527, 539-539, 579-579, 624-624, 688-688, 758-758.
823-823: LGTM: Tool call representations updated correctly. The test expectations now use Status: "completed" for function calls and ResponsesToolCallOutputStr for tool outputs, aligning with the updated schema definitions. These changes are applied consistently across all relevant test scenarios. Also applies to: 865-865, 915-915, 927-927.
1498-1513: LGTM: Structural comparison follows documented testing strategy. The field-by-field comparison approach correctly handles runtime-generated data (IDs, timestamps) while validating the essential conversion logic. This aligns with the documented layered testing strategy where unit tests focus on structure and separate integration tests validate detailed content. Based on learnings, this pattern is intentional and appropriate.
1684-1717: LGTM: Nil guards added for safe test assertions. The require.NotNil checks before dereferencing pointer fields ensure tests fail cleanly with clear diagnostics rather than panicking. The comparison logic for Output, Usage, and nested structures is now properly guarded and handles runtime-generated data appropriately.
core/providers/cohere/responses.go (6)
14-151: LGTM: State tracking comprehensively manages streaming conversions. The expanded CohereResponsesStreamState, with reasoning indices, annotation mappings, tool plan tracking, and lifecycle flags, provides robust state management for the streaming conversion pipeline. The pool management correctly initializes and clears all new fields, preventing state leakage between requests. The getOrCreateOutputIndex method handles nil content indices defensively.
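The pooling caveat generalizes: every new state field needs a matching reset when the object goes back to the pool. A minimal sketch of that discipline; the field names are illustrative, not the actual CohereResponsesStreamState.

```go
package sketch

import (
	"strings"
	"sync"
)

type streamState struct {
	reasoningIndex  int
	annotationIndex map[int]int
	toolPlanBuffer  strings.Builder
}

var statePool = sync.Pool{New: func() any { return &streamState{} }}

func acquireState() *streamState { return statePool.Get().(*streamState) }

// releaseState must reset every field, including ones added later, or a pooled
// object will leak state from one request into the next.
func releaseState(s *streamState) {
	s.reasoningIndex = 0
	s.annotationIndex = nil
	s.toolPlanBuffer.Reset()
	statePool.Put(s)
}
```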
153-185: LGTM: Content block conversion handles all Cohere block types. The helper correctly converts text, image, and thinking blocks to Bifrost format. The nil check for ImageURL (line 163) prevents invalid image blocks, and based on previous review feedback, call sites filter out any zero-value blocks with empty Type fields.
187-867: LGTM: Streaming event handlers comprehensively map Cohere to Bifrost format. The ToBifrostResponsesStream method correctly handles all Cohere streaming event types, including:
- Lifecycle events (created, in_progress, completed) with proper state tracking
- Content deltas with distinction between regular text and reasoning summaries
- Tool call lifecycle with argument buffering and proper status transitions
- Citation pairing using annotation-to-content-index mapping (addresses previous feedback)
- Tool plan handling with automatic cleanup before subsequent events
The state management and event sequencing align with the Responses API streaming specification.
869-1030: LGTM: Request conversion correctly maps Bifrost to Cohere format. The ToCohereResponsesRequest function comprehensively converts Bifrost Responses requests to Cohere's chat format, including:
- Response format mapping with proper JSON schema handling
- Reasoning budget calculation with both explicit MaxTokens and effort-based paths
- Safe nil handling for MaxOutputTokens (lines 946-949) using a default value
- Proper error propagation from GetBudgetTokensFromReasoningEffort
- Tools and tool choice conversion via dedicated helpers
Based on previous review feedback, the nil pointer issues have been addressed.
1032-1247: LGTM: Bidirectional message conversion handles all Responses types. The conversion functions correctly map between Bifrost Responses and Cohere formats:
- ToBifrostResponsesResponse properly extracts usage and output messages
- ConvertBifrostMessagesToCohereMessages handles system messages, reasoning blocks, function calls, and regular content with proper accumulation and flushing logic
- convertBifrostToolChoiceToCohereToolChoice now correctly maps "auto" to ToolChoiceAuto (addresses previous feedback)
The reasoning block attachment to assistant messages and system message aggregation align with Cohere's API requirements.
1249-1569: LGTM: Message-level conversion helpers handle all content and tool types. The helper functions provide comprehensive bidirectional conversion:
- System, regular, reasoning, and tool messages all handled appropriately
- Nil guards prevent panics when accessing nested fields (e.g., line 1345 checks ResponsesToolMessage != nil)
- Invalid content blocks filtered out (line 1436 checks for non-empty Type)
- Function call outputs correctly extract from either Output or Content fields
- Content block conversion handles text, image, and reasoning types
Summary
Briefly explain the purpose of this PR and the problem it solves.
Changes
Type of change
Affected areas
How to test
Describe the steps to validate this change. Include commands and expected outcomes.
If adding new configs or environment variables, document them here.
Screenshots/Recordings
If UI changes, add before/after screenshots or short clips.
Breaking changes
If yes, describe impact and migration instructions.
Related issues
Link related issues and discussions. Example: Closes #123
Security considerations
Note any security implications (auth, secrets, PII, sandboxing, etc.).
Checklist
docs/contributing/README.md and followed the guidelines