
Conversation

@Pratham-Mishra04
Collaborator

Summary

Briefly explain the purpose of this PR and the problem it solves.

Changes

  • What was changed and why
  • Any notable design decisions or trade-offs

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Describe the steps to validate this change. Include commands and expected outcomes.

# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build

If adding new configs or environment variables, document them here.

Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

Breaking changes

  • Yes
  • No

If yes, describe impact and migration instructions.

Related issues

Link related issues and discussions. Example: Closes #123

Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

Collaborator Author

Pratham-Mishra04 commented Dec 4, 2025

@coderabbitai
Contributor

coderabbitai bot commented Dec 4, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Added reasoning capabilities across multiple AI providers with configurable effort and token budgets.
    • Introduced support for redacted thinking and reasoning summaries in responses.
  • Improvements

    • Enhanced streaming responses with reasoning deltas and signatures.
    • Improved parameter validation and request handling across providers.
    • Added UI controls to display reasoning parameters in log details.
  • Bug Fixes

    • Fixed passthrough response handling for improved provider integration.
    • Improved error decoding and logging for better debugging.
  • Tests

    • Expanded test coverage for reasoning features and response marshaling scenarios.


Walkthrough

Centralizes Anthropic/Vertex/Azure request-body construction and passthrough logic, expands reasoning/thinking and summary handling across providers and streaming layers, adds custom JSON (un)marshallers and conversions, extends schemas with signatures/summaries, modifies router SSE/DONE behavior, and updates tests and UI surfaces.

Changes

  • Anthropic provider
    Files: core/providers/anthropic/errors.go, core/providers/anthropic/types.go, core/providers/anthropic/chat.go, core/providers/anthropic/utils.go, core/providers/anthropic/anthropic.go
    Summary: Added the SSE error formatter ToAnthropicResponsesStreamError; introduced ExtraParams, Data, and redacted-thinking support plus sonic-based (un)marshallers; added the central getRequestBodyForResponses helper; adjusted thinking/reasoning mapping and stream-delta guards; reworked passthrough/raw-response handling and request-body usage.
  • Vertex / Azure Anthropic bridge
    Files: core/providers/vertex/utils.go, core/providers/vertex/vertex.go, core/providers/vertex/errors.go, core/providers/azure/utils.go, core/providers/azure/azure.go
    Summary: Added getRequestBodyForAnthropicResponses-style helpers (vertex and azure variants); replaced inline Anthropic request construction with the centralized helper; response-body decoding now uses the decoded payload for error handling.
  • Cohere provider
    Files: core/providers/cohere/responses.go, core/providers/cohere/chat.go, core/providers/cohere/types.go, core/providers/cohere/cohere.go
    Summary: Large bidirectional conversion and streaming helpers between Cohere and Bifrost; unified thinking/reasoning budget handling; ToCohereResponsesRequest now returns (any, error); streaming state/lifecycle and mapping enhancements.
  • OpenAI provider
    Files: core/providers/openai/responses.go, core/providers/openai/types.go, core/providers/openai/chat.go, core/providers/openai/text.go, core/providers/openai/utils.go, core/providers/openai/responses_marshal_test.go, core/providers/openai/responses_test.go, core/providers/openai/types_test.go
    Summary: Added custom marshal/unmarshal for Responses/Chat requests, a MinMaxCompletionTokens constant with token clamping, SanitizeUserField (64-char max), and filtering of unsupported tools; normalization logic for messages; comprehensive unit tests for marshal/unmarshal and conversion.
  • Bedrock provider & types/tests
    Files: core/providers/bedrock/types.go, core/providers/bedrock/utils.go, core/providers/bedrock/bedrock.go, core/providers/bedrock/bedrock_test.go, core/internal/testutil/account.go
    Summary: Unified reasoning budget logic (MaxTokens vs Effort → budget_tokens); added ExtraParams/Fallbacks and UnmarshalJSON to the Bedrock request; stream-end signaling adjustments; test updates (deployments and tool-output representation).
  • Framework streaming
    Files: framework/streaming/responses.go, framework/streaming/chat.go
    Summary: Added handling for ReasoningSummaryTextDelta and helpers to append reasoning deltas/signatures into content blocks or reasoning summaries; removed legacy TextCompletionResponseChoice handling; added nil-safety guards.
  • Core schemas
    Files: core/schemas/responses.go, core/schemas/bifrost.go, core/schemas/mux.go, core/schemas/utils.go
    Summary: Added StopReason and Signature fields; renamed ResponsesReasoningContent to ResponsesReasoningSummary and introduced the summary type; added BifrostContextKeyIntegrationType; minor comment edits.
  • HTTP transport & Anthropic router
    Files: transports/bifrost-http/handlers/inference.go, transports/bifrost-http/integrations/anthropic.go, transports/bifrost-http/integrations/router.go, transports/bifrost-http/handlers/middlewares.go
    Summary: Added ResponsesRequest.UnmarshalJSON; centralized passthrough/raw-response handling for Anthropic (multi-event SSE aggregation) via shouldUsePassthrough and isClaudeModel; propagated the integration type into context; adjusted DONE-marker emission and preformatted SSE handling; minor whitespace fix.
  • Utilities
    Files: core/providers/utils/utils.go
    Summary: Added GetRandomString and reasoning-effort ↔ token mapping helpers; switched some JSON uses to sonic; improved provider error-body decoding (duplicate insertions observed).
  • Frontend UI & types
    Files: ui/lib/types/logs.ts, ui/app/workspace/logs/views/logResponsesMessageView.tsx, ui/app/workspace/logs/views/logResponsesOutputView.tsx, ui/app/workspace/logs/views/logDetailsSheet.tsx, ui/app/workspace/logs/views/columns.tsx, ui/package.json
    Summary: Renamed the reasoning content type to ResponsesReasoningSummary; updated typings and UI guards (skip empty reasoning); added a Reasoning Parameters UI block and a break-word fix for long content; removed a large message-view component; relaxed the lucide-react version constraint.
  • Transports / router misc
    Files: transports/...
    Summary: Custom UnmarshalJSON for Responses requests, preformatted SSE handling changes, DONE-marker gating changes, minor middleware whitespace removal.
  • Tests & testutil
    Files: core/internal/testutil/*, core/providers/*_test.go, provider test files
    Summary: Added separate Responses and ChatCompletion reasoning tests; added ReasoningModel/Reasoning fields in test configs; increased streaming token budgets; many test updates reflecting schema and behavior changes.
  • Misc / Config / small edits
    Files: framework/configstore/rdb.go, core/schemas/bifrost.go, core/providers/vertex/types.go, core/schemas/mux.go
    Summary: Removed a debug log; added BifrostContextKeyIntegrationType; formatting/comment-only edits.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Router as "Bifrost Router"
  participant Provider
  participant Acc as "Framework Accumulator"

  Client->>Router: POST /responses (may include raw body or reasoning params)
  Router->>Router: evaluate shouldUsePassthrough / isClaudeModel
  Router->>Router: build jsonBody via getRequestBodyForResponses / provider-specific helper
  Router->>Provider: send request (streaming or non-streaming)
  Provider-->>Router: stream SSE/JSON chunks (events, deltas, errors, or multi-event raw)
  Router->>Acc: forward chunks/deltas
  Acc->>Acc: assemble deltas (content, reasoning summaries, signatures)
  Acc-->>Router: emit assembled SSE/data frames (errors formatted via ToAnthropicResponsesStreamError)
  Router-->>Client: stream SSE (DONE marker suppressed for Anthropic/Responses routes)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–75 minutes

  • Areas needing extra attention:
    • Cohere conversions & streaming state (core/providers/cohere/responses.go)
    • Streaming assembly and reasoning signature handling (framework/streaming/responses.go, framework/streaming/chat.go)
    • Anthropic passthrough, multi-event SSE aggregation, and router DONE-marker behavior (transports/bifrost-http/integrations/anthropic.go, transports/bifrost-http/integrations/router.go)
    • Centralized Anthropic/Vertex request helpers and cross-provider reuse (core/providers/anthropic/utils.go, core/providers/vertex/utils.go, core/providers/azure/utils.go)
    • OpenAI custom JSON marshal/unmarshal and SanitizeUserField implications plus new tests (core/providers/openai/types.go, core/providers/openai/utils.go, tests)
    • Bedrock types and tests (ExtraParams, tool-output representation) (core/providers/bedrock/*, bedrock tests)
    • Duplicate utility function insertions in core/providers/utils/utils.go (possible dedup)
    • UI removal of LogResponsesMessageView — check for dangling imports/usages

Poem

🐇
I hop through deltas, signatures in tow,
Summaries nest where streaming breezes blow.
Passthrough whispers, converters hum along,
Tokens counted, thinking stitched to song.
A cheerful nibble — changes snug and strong!

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings, 1 inconclusive)
  • Description check (⚠️ Warning): The PR description is entirely templated, with no concrete information filled in. All sections contain only placeholder text and unchecked checkboxes, providing no details about what was changed or why. Resolution: complete the description with the specific changes made to reasoning handling, the affected providers/modules, the testing approach, any breaking changes, and related issue links; fill in the Summary and Changes sections with actual implementation details.
  • Linked Issues check (⚠️ Warning): The linked issue #123 is about Files API support for OpenAI/Anthropic, which is unrelated to this changeset's focus on responses reasoning features; the PR does not address file upload/management requirements. Resolution: link issues that describe the responses reasoning features being implemented, or clarify why issue #123 is relevant, so that linked issues match the actual code changes.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 66.25%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
  • Title check (❓ Inconclusive): The title "feat: responses reasoning fixes" is vague and generic, using non-descriptive terms like "fixes" that don't convey the specific changes being made. Resolution: revise the title to name the reasoning work being done, e.g. "feat: add reasoning support to responses API" or "feat: implement reasoning parameters handling across providers".

✅ Passed checks (1 passed)

  • Out of Scope Changes check (✅ Passed): The changeset implements extensive reasoning support (reasoning budgets, effort mappings, streaming reasoning deltas, reasoning signatures) across multiple providers (OpenAI, Anthropic, Cohere, Bedrock, Vertex). These align with the responses reasoning objectives, though they extend significantly beyond typical bug fixes.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd1d8c9 and cccf60c.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (58)
  • core/internal/testutil/account.go (1 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/reasoning.go (4 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/internal/testutil/tests.go (2 hunks)
  • core/providers/anthropic/anthropic.go (3 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (8 hunks)
  • core/providers/anthropic/utils.go (2 hunks)
  • core/providers/azure/azure.go (2 hunks)
  • core/providers/azure/utils.go (1 hunks)
  • core/providers/bedrock/bedrock.go (2 hunks)
  • core/providers/bedrock/bedrock_test.go (13 hunks)
  • core/providers/bedrock/types.go (2 hunks)
  • core/providers/bedrock/utils.go (2 hunks)
  • core/providers/cerebras/cerebras_test.go (2 hunks)
  • core/providers/cohere/chat.go (3 hunks)
  • core/providers/cohere/cohere.go (2 hunks)
  • core/providers/cohere/cohere_test.go (1 hunks)
  • core/providers/cohere/responses.go (7 hunks)
  • core/providers/cohere/types.go (1 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/groq/groq_test.go (2 hunks)
  • core/providers/mistral/mistral_test.go (1 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/responses_marshal_test.go (1 hunks)
  • core/providers/openai/responses_test.go (1 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/types_test.go (1 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/openrouter/openrouter_test.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/providers/vertex/types.go (0 hunks)
  • core/providers/vertex/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (3 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/mux.go (0 hunks)
  • core/schemas/responses.go (5 hunks)
  • core/schemas/utils.go (1 hunks)
  • docs/docs.json (0 hunks)
  • framework/configstore/rdb.go (0 hunks)
  • framework/streaming/chat.go (0 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (5 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (9)
transports/bifrost-http/integrations/anthropic.go (1)

72-77: Non‑stream Anthropic /v1/messages no longer supports raw‑response passthrough; consider confirming intent and cleaning up commented code

The previous behavior short‑circuited to resp.ExtraFields.RawResponse for Anthropic if present; now that path is commented out and we always go through anthropic.ToAnthropicResponsesResponse(resp). This is a real behavior change for non‑stream responses:

  • If any callers relied on getting the provider’s raw body for /v1/messages (non‑stream), they will now receive the normalized Anthropic struct instead.
  • In streaming, we still support raw passthrough gated by BifrostContextKeySendBackRawResponse, so behavior is now asymmetric between streaming vs non‑stream.

If the intent is to fully normalize non‑stream responses (e.g., to ensure reasoning metadata is always passed through via our schemas), this looks fine functionally, but I’d suggest:

  1. Remove the commented block to avoid dead code, and
  2. Optionally add a brief comment above the converter clarifying that non‑stream Anthropic responses are intentionally always normalized and that raw passthrough is streaming‑only.

If, instead, non‑stream raw passthrough is still desired in some cases, we probably want to reintroduce this logic but gate it similarly to streaming using BifrostContextKeySendBackRawResponse for consistency.

core/providers/utils/utils.go (1)

951-960: No urgent security fix needed for this use case, but consider documenting the function's non-security purpose.

GetRandomString is used only for generating internal message IDs in Anthropic response parsing (with prefixes like msg_ and rs_), not for authentication tokens or security-sensitive identifiers. While math/rand is indeed not cryptographically secure, the current implementation is appropriate for internal message tracking.

If you want to prevent future misuse, add a doc comment clarifying this is not suitable for security-sensitive purposes. Alternatively, if you're concerned about consistency with Go best practices for all random generation, using crypto/rand is defensible but not critical for this use case.
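A minimal sketch of what such a documented helper could look like. The function name, alphabet, and body here are illustrative, not the actual implementation in core/providers/utils/utils.go; the point is the doc comment that states the non-security contract up front.

```go
package main

import (
	"fmt"
	"math/rand"
)

const idAlphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// getRandomString returns a pseudo-random alphanumeric string of length n,
// intended only for internal, non-security-sensitive identifiers such as
// "msg_" or "rs_" prefixed message IDs. It uses math/rand, which is
// deterministic and predictable; it MUST NOT be used for tokens, secrets,
// or anything requiring unpredictability (use crypto/rand for those).
func getRandomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = idAlphabet[rand.Intn(len(idAlphabet))]
	}
	return string(b)
}

func main() {
	id := "msg_" + getRandomString(12)
	fmt.Println(len(id)) // 16: the 4-byte "msg_" prefix plus 12 random characters
}
```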

core/providers/openai/responses.go (2)

57-59: Duplicate condition check.

Line 57 and line 59 both check len(message.ResponsesReasoning.Summary) > 0. This is redundant.

 			// If the message has summaries but no content blocks and the model is gpt-oss, then convert the summaries to content blocks
 			if len(message.ResponsesReasoning.Summary) > 0 &&
 				strings.Contains(bifrostReq.Model, "gpt-oss") &&
-				len(message.ResponsesReasoning.Summary) > 0 &&
 				message.Content == nil {

45-84: Consider extracting model-specific reasoning logic to improve readability.

The nested conditionals handling reasoning content transformation are complex. The logic correctly handles:

  1. Skipping reasoning blocks without summaries for non-gpt-oss models
  2. Converting summaries to content blocks for gpt-oss models
  3. Passing through other messages unchanged

However, using strings.Contains(bifrostReq.Model, "gpt-oss") for model detection may be fragile if model naming conventions change.

Consider extracting a helper function like isGptOssModel(model string) bool for clearer intent and easier maintenance:

func isGptOssModel(model string) bool {
    return strings.Contains(model, "gpt-oss")
}

This would make the conditional checks more readable and centralize the model detection logic.

core/providers/gemini/responses.go (2)

143-146: Consider using sonic.Marshal for consistency.

This uses encoding/json.Marshal while the rest of the codebase uses github.com/bytedance/sonic for JSON operations. For consistency and potential performance benefits, consider using sonic.Marshal here.

-				if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil {
+				if argsBytes, err := sonic.Marshal(part.FunctionCall.Args); err == nil {
 					argumentsStr = string(argsBytes)
 				}

You would also need to add the sonic import if not already present via another code path.


263-272: Duplicate ID generation logic.

Lines 264-267 and 269-271 contain duplicate logic for generating itemID. The second block (269-271) appears to be redundant as it only handles the MessageID == nil case which is already covered.

 			// Generate stable ID for text item
 			var itemID string
 			if state.MessageID == nil {
 				itemID = fmt.Sprintf("item_%d", outputIndex)
 			} else {
 				itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex)
 			}
-			if state.MessageID == nil {
-				itemID = fmt.Sprintf("item_%d", outputIndex)
-			}
 			state.ItemIDs[outputIndex] = itemID
core/providers/cohere/responses.go (1)

263-272: Duplicate ID generation pattern repeated multiple times.

The same ID generation logic with the redundant second if block appears in multiple places (lines 263-272, 306-316, and 421-429). This appears to be copy-paste duplication.

Consider extracting a helper function and removing the duplicate conditional:

func (state *CohereResponsesStreamState) generateItemID(outputIndex int, prefix string) string {
    if state.MessageID == nil {
        return fmt.Sprintf("%s_%d", prefix, outputIndex)
    }
    return fmt.Sprintf("msg_%s_%s_%d", *state.MessageID, prefix, outputIndex)
}

Then use it consistently:

itemID := state.generateItemID(outputIndex, "item")
state.ItemIDs[outputIndex] = itemID

Also applies to: 306-316, 421-429

transports/bifrost-http/handlers/inference.go (2)

224-254: Custom ResponsesRequest unmarshal aligns with chat pattern; consider guarding against reuse-side effects

The split unmarshal (BifrostParams → Input union → ResponsesParameters) looks correct and mirrors the ChatRequest.UnmarshalJSON pattern, so it should resolve the embedded‐struct issues with ResponsesParameters’ custom unmarshaller.

If you ever end up reusing a ResponsesRequest instance for multiple decodes, this implementation can leave stale values in fields that are omitted in subsequent payloads (standard encoding/json behavior, but now under your control). It’s not a problem for the current usage (fresh var req ResponsesRequest per request), but if you want stricter reset semantics you could zero the struct at the start of the method before re-populating it.


91-118: responsesParamsKnownFields omits "user"; likely ends up duplicated in ExtraParams

ResponsesParameters has a User *string field tagged json:"user,omitempty", but "user" is not listed in responsesParamsKnownFields. That means /v1/responses requests with a user field will both populate ResponsesParameters.User (via sonic.Unmarshal) and also be treated as an unknown extra param and forwarded in ExtraParams. This is inconsistent with the chat path (where "user" is marked as known) and could cause confusing duplication for provider adapters that look at ExtraParams.

If user is intended to be a first-class, schema-level field for responses (same as chat), consider adding it here so it is not treated as a provider-specific extra:

 var responsesParamsKnownFields = map[string]bool{
   "model":                true,
   "input":                true,
   "fallbacks":            true,
   "stream":               true,
@@
   "top_p":                true,
   "tool_choice":          true,
   "tools":                true,
   "truncation":           true,
+  "user":                 true,
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6cf3108 and a15c48b.

📒 Files selected for processing (12)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/cohere/responses.go (4 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (3 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (1 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/anthropic/errors.go
  • transports/bifrost-http/handlers/inference.go
  • core/providers/openai/types.go
  • ui/lib/types/logs.ts
  • core/providers/anthropic/types.go
  • core/providers/openai/responses.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/utils/utils.go
  • core/providers/gemini/responses.go
  • core/providers/openai/openai.go
  • core/schemas/responses.go
  • core/providers/cohere/responses.go
🧬 Code graph analysis (6)
core/providers/anthropic/errors.go (2)
core/providers/anthropic/types.go (3)
  • AnthropicStreamEvent (328-337)
  • AnthropicStreamEventTypeError (324-324)
  • AnthropicStreamError (399-402)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamError (1392-1413)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (85-112)
core/providers/openai/types.go (1)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (232-237)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
  • ResponsesReasoningSummary (743-746)
  • ResponsesReasoning (729-732)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (402-408)
  • ResponsesMessage (422-437)
  • ResponsesMessageContent (399-399)
  • ResponsesReasoning (416-419)
  • ResponsesReasoningSummary (411-414)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (460-480)
  • ResponsesMessage (312-325)
  • ResponsesMessageContent (337-342)
  • ResponsesReasoning (729-732)
  • ResponsesReasoningSummary (743-746)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (351-358)
  • ResponsesReasoningSummary (411-414)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: Graphite / mergeability_check (×15)
🔇 Additional comments (13)
core/providers/anthropic/types.go (2)

143-144: LGTM! Proper addition of redacted thinking support.

The new redacted_thinking content block type aligns with Anthropic's extended thinking API, which returns encrypted thinking content when the model's reasoning needs to be redacted. This is consistent with the Data field added below for storing the encrypted content.


153-153: Clear documentation for the Data field.

The comment accurately describes the field's purpose for encrypted redacted thinking data.

core/providers/openai/types.go (1)

154-192: Well-structured custom marshaling with clear intent.

The implementation correctly:

  1. Preserves custom Input marshaling via json.RawMessage
  2. Clones Reasoning to avoid mutating the original struct
  3. Always sets MaxTokens to nil as OpenAI's API doesn't support this parameter

One minor note: the code mixes encoding/json (for RawMessage) with sonic (for final marshal), which is intentional but could be documented.

core/providers/openai/responses.go (1)

41-94: Logic correctly handles reasoning content transformation.

The overall transformation logic for handling reasoning content blocks across different OpenAI model variants is sound. The approach of building a new messages slice while selectively transforming or skipping messages based on model capabilities is appropriate.

core/providers/anthropic/errors.go (1)

36-58: The function ToAnthropicResponsesStreamError exists only in core/providers/anthropic/errors.go and is not duplicated elsewhere in the codebase. There is no duplicate function definition in responses.go or any other file. This code can be merged without compilation errors related to duplication.

Likely an incorrect or invalid review comment.

ui/lib/types/logs.ts (1)

411-419: Type rename looks good and aligns with Go schema.

The renaming from ResponsesReasoningContent to ResponsesReasoningSummary is consistent with the corresponding changes in core/schemas/responses.go (lines 742-745). The field structure matches the Go definition.

core/schemas/responses.go (2)

398-401: New Signature field addition looks correct.

The Signature field is appropriately added as an optional pointer field for carrying content signatures (used for reasoning in Gemini 3 Pro). The JSON tag with omitempty is correct for optional fields.


728-746: Type rename and structure updates are consistent.

The ResponsesReasoning struct now uses []ResponsesReasoningSummary for the Summary field, and the new ResponsesReasoningSummary type is properly defined with Type and Text fields. This aligns with the corresponding TypeScript types in ui/lib/types/logs.ts.

core/providers/gemini/responses.go (3)

148-164: Good fix for range loop variable capture issue.

Creating copies of functionCallID and functionCallName before using them in pointer assignments correctly avoids the Go range loop variable capture issue. This is a proper fix for Go versions prior to 1.22.


166-179: Thought signature preservation looks correct.

The logic correctly creates a separate ResponsesReasoning message when a thought signature is present, using an empty Summary slice and the encrypted content. This aligns with the updated schema and supports Gemini 3 Pro requirements.


619-627: Look-ahead logic assumes reasoning message immediately follows function call.

The look-ahead for thought signature assumes the reasoning message is at index i+1. This may not handle cases where messages are reordered or there are intervening messages. Consider documenting this assumption or adding validation.

Verify that the message ordering convention (reasoning message immediately after function call) is consistently maintained across all code paths that produce these messages.
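One way to make the assumption explicit is to guard the look-ahead rather than read messages[i+1] unconditionally. This is a sketch with simplified stand-in types (the real code uses the schemas package), shown only to illustrate the validation shape:

```go
package main

import "fmt"

// Minimal stand-ins for the real schema types; names are illustrative.
type responsesMessage struct {
	Type             string
	ThoughtSignature *string
}

// signatureAfterCall checks the "reasoning message immediately follows the
// function call" convention before consuming the signature at i+1, returning
// ok=false when the convention does not hold (reordered or missing messages).
func signatureAfterCall(msgs []responsesMessage, i int) (string, bool) {
	if i+1 < len(msgs) && msgs[i+1].Type == "reasoning" && msgs[i+1].ThoughtSignature != nil {
		return *msgs[i+1].ThoughtSignature, true
	}
	return "", false
}

func main() {
	sig := "sig_abc"
	msgs := []responsesMessage{
		{Type: "function_call"},
		{Type: "reasoning", ThoughtSignature: &sig},
	}
	if s, ok := signatureAfterCall(msgs, 0); ok {
		fmt.Println(s) // sig_abc
	}
	// The guard also makes an out-of-range look-ahead safe:
	_, ok := signatureAfterCall(msgs, 1)
	fmt.Println(ok) // false
}
```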

core/providers/cohere/responses.go (2)

162-765: Streaming conversion implementation is comprehensive.

The ToBifrostResponsesStream method handles the full OpenAI-style streaming lifecycle (created, in_progress, output_item.added, deltas, output_item.done, completed) with proper state management. The tool call argument accumulation and tool plan lifecycle handling appear correct.


894-1029: Message conversion logic handles reasoning blocks correctly.

The ConvertBifrostMessagesToCohereMessages function properly accumulates pending reasoning blocks and attaches them to assistant messages. The system message extraction and prepending logic is also correct.

@Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from a15c48b to d4bfce4 on December 4, 2025 at 15:55
@Pratham-Mishra04 force-pushed the 12-04-feat_raw_response_accumulation_for_streaming branch from 6cf3108 to 4b4a584 on December 4, 2025 at 15:55
@coderabbitai coderabbitai bot requested a review from TejasGhatte December 4, 2025 16:12
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (4)
core/providers/utils/utils.go (1)

267-267: Previous review comment still applies.

The past review already flagged this change from sonic.Marshal to sonic.MarshalIndent. The concern about increased payload size for production API requests and the associated debug prints in openai.go remains valid. Please address the feedback from the previous review.

core/providers/cohere/responses.go (3)

148-153: Returning empty struct for invalid image block remains unaddressed.

When ImageURL is nil, an empty ResponsesMessageContentBlock{} is returned with a zero-value Type field, which could cause unexpected behavior downstream when processing content blocks.

Consider one of the previously suggested approaches:

  • Return a text block indicating the missing image
  • Return (schemas.ResponsesMessageContentBlock, bool) to indicate validity
  • Skip invalid blocks at the call site

Based on learnings, this issue was previously flagged but not yet addressed.
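The `(block, bool)` option could look roughly like this, with hypothetical stand-in types in place of the real Cohere/Bifrost schemas:

```go
package main

import "fmt"

// cohereBlock and contentBlock are illustrative stand-ins, not the real types.
type cohereBlock struct {
	Type     string
	ImageURL *string
}

type contentBlock struct {
	Type string
	URL  string
}

// convertImageBlock reports validity instead of returning a zero-value block,
// so callers can skip invalid input explicitly rather than propagating a
// block with an empty Type field.
func convertImageBlock(b cohereBlock) (contentBlock, bool) {
	if b.ImageURL == nil {
		return contentBlock{}, false // caller should skip this block
	}
	return contentBlock{Type: "input_image", URL: *b.ImageURL}, true
}

func main() {
	url := "https://example.com/x.png"
	if cb, ok := convertImageBlock(cohereBlock{Type: "image", ImageURL: &url}); ok {
		fmt.Println(cb.Type)
	}
}
```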


1131-1142: Tool choice "auto" mapping to "required" remains semantically incorrect.

Line 1136 maps "auto" to ToolChoiceRequired, which changes the semantic meaning. In the Responses API, "auto" means the model decides whether to call a tool, while "required" forces a tool call.

Please verify Cohere's tool choice options and update the mapping:

#!/bin/bash
# Search for Cohere tool choice type definitions and usage
ast-grep --pattern 'type CohereToolChoice $$$'
ast-grep --pattern 'ToolChoice$_ CohereToolChoice = $_'

Based on learnings, this issue was previously flagged but not yet addressed.


1216-1225: Encrypted reasoning content exposure in plain text marker remains unaddressed.

Lines 1219-1224 wrap encrypted content in a plain text marker [ENCRYPTED_REASONING: ...], exposing the encrypted content in an unprotected format. This defeats the purpose of encryption if the content is meant to remain opaque.

Consider skipping encrypted content entirely since Cohere doesn't support it:

 		} else if msg.ResponsesReasoning.EncryptedContent != nil {
-			// Cohere doesn't have a direct equivalent to encrypted content,
-			// so we'll store it as a regular thinking block with a special marker
-			encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-			thinkingBlock := CohereContentBlock{
-				Type:     CohereContentBlockTypeThinking,
-				Thinking: &encryptedText,
-			}
-			thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+			// Skip encrypted content as Cohere doesn't support it
+			// The encrypted content should remain opaque and not be sent to other providers
 		}

Based on learnings, this issue was previously flagged but not yet addressed.

🧹 Nitpick comments (5)
transports/bifrost-http/integrations/anthropic.go (3)

74-78: Remove commented-out code.

This commented-out code block should either be removed or documented with a TODO/reason for keeping it. Leaving dead code in comments reduces readability and maintainability.

 			ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) {
-				// if resp.ExtraFields.Provider == schemas.Anthropic {
-				// 	if resp.ExtraFields.RawResponse != nil {
-				// 		return resp.ExtraFields.RawResponse, nil
-				// 	}
-				// }
 				return anthropic.ToAnthropicResponsesResponse(resp), nil
 			},

94-97: Use the injected logger instead of stdlib log.

The AnthropicRouter is initialized with a schemas.Logger (line 246), but this closure uses the stdlib log.Printf. This inconsistency means errors logged here won't go through the configured logging infrastructure.

Consider passing the logger through the route config or using a package-level logger that can be configured.


103-117: Remove large commented-out code block.

This 15-line commented block should be removed. If this logic might be needed in the future, document the intent in a TODO or track it in an issue rather than leaving dead code.

 					} else {
-						// if resp.ExtraFields.Provider == schemas.Anthropic ||
-						// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-						// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-						// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-						// 	// This is always true in integrations
-						// 	isRawResponseEnabled, ok := (*ctx).Value(schemas.BifrostContextKeySendBackRawResponse).(bool)
-						// 	if ok && isRawResponseEnabled {
-						// 		if resp.ExtraFields.RawResponse != nil {
-						// 			return string(anthropicResponse[len(anthropicResponse)-1].Type), resp.ExtraFields.RawResponse, nil
-						// 		} else {
-						// 			// Explicitly return nil to indicate that no raw response is available (because 1 chunk of anthropic gets converted to multiple bifrost responses chunks)
-						// 			return "", nil, nil
-						// 		}
-						// 	}
-						// }
 						return string(anthropicResponse[0].Type), anthropicResponse[0], nil
 					}
core/providers/utils/utils.go (1)

950-960: Consider using a package-level random source for better performance.

The current implementation creates a new rand.Source on every call, which is inefficient. However, the collision risk from time.Now().UnixNano() seeding is minimal in practice since GetRandomString is used for generating message IDs in response processing (not in tight loops where nanosecond collisions would occur).

For non-security use cases like message identification, consider a simpler optimization using a package-level source with synchronization:

var (
	randMu  sync.Mutex
	randSrc = rand.New(rand.NewSource(time.Now().UnixNano()))
)

func GetRandomString(length int) string {
	letters := []rune("abcdefghijklmnopqrstuvwxyz0123456789")
	b := make([]rune, length)
	randMu.Lock()
	for i := range b {
		b[i] = letters[randSrc.Intn(len(letters))]
	}
	randMu.Unlock()
	return string(b)
}

This avoids repeated allocations without the complexity of crypto/rand.

core/providers/openai/responses.go (1)

56-81: Duplicate condition on line 59.

The condition len(message.ResponsesReasoning.Summary) > 0 is checked twice in the same if statement at lines 57 and 59.

Apply this diff to remove the redundant check:

 			// If the message has summaries but no content blocks and the model is gpt-oss, then convert the summaries to content blocks
 			if len(message.ResponsesReasoning.Summary) > 0 &&
 				strings.Contains(bifrostReq.Model, "gpt-oss") &&
-				len(message.ResponsesReasoning.Summary) > 0 &&
 				message.Content == nil {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a15c48b and d4bfce4.

📒 Files selected for processing (13)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (3 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (3 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/providers/openai/openai.go
  • transports/bifrost-http/handlers/inference.go
  • core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • ui/lib/types/logs.ts
  • core/providers/anthropic/errors.go
  • core/providers/openai/types.go
  • core/providers/utils/utils.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/providers/openai/responses.go
  • core/providers/cohere/responses.go
  • framework/streaming/responses.go
🧬 Code graph analysis (7)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
  • ResponsesReasoningSummary (744-747)
  • ResponsesReasoning (730-733)
core/providers/anthropic/errors.go (2)
core/providers/anthropic/types.go (3)
  • AnthropicStreamEvent (328-337)
  • AnthropicStreamEventTypeError (324-324)
  • AnthropicStreamError (399-402)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamError (1392-1413)
core/providers/openai/types.go (1)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
transports/bifrost-http/integrations/anthropic.go (1)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (5)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • Part (936-960)
core/schemas/responses.go (8)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageTypeFunctionCall (294-294)
  • ResponsesMessageTypeReasoning (306-306)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/openai/responses.go (2)
core/schemas/responses.go (5)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
framework/streaming/responses.go (2)
core/schemas/responses.go (8)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392)
  • ResponsesMessage (313-326)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
core/schemas/utils.go (1)
  • Ptr (16-18)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (20)
transports/bifrost-http/integrations/anthropic.go (1)

85-119: Streaming response handling logic looks correct.

The three-way branching based on response count handles:

  • Empty responses (OpenAI lifecycle events) → returns empty
  • Multiple events → concatenates as SSE format
  • Single event → returns directly

The SSE format construction at line 99 correctly follows event: <type>\ndata: <json>\n\n.
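The framing and concatenation described above can be sketched independently of the router code (function names here are illustrative, not the actual helpers in anthropic.go):

```go
package main

import (
	"fmt"
	"strings"
)

// formatSSE renders one server-sent event in the
// "event: <type>\ndata: <json>\n\n" shape; a multi-event chunk is just the
// concatenation of such frames.
func formatSSE(eventType, data string) string {
	return fmt.Sprintf("event: %s\ndata: %s\n\n", eventType, data)
}

// concatSSE joins multiple (type, payload) pairs into one SSE chunk.
func concatSSE(events [][2]string) string {
	var sb strings.Builder
	for _, ev := range events {
		sb.WriteString(formatSSE(ev[0], ev[1]))
	}
	return sb.String()
}

func main() {
	fmt.Print(formatSSE("response.created", `{"id":"1"}`))
}
```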

core/providers/openai/types.go (1)

154-192: LGTM - Custom marshaling correctly omits max_tokens for OpenAI.

The implementation correctly:

  1. Preserves the custom Input marshaling via json.RawMessage
  2. Copies reasoning fields while explicitly setting MaxTokens to nil
  3. Follows the same pattern as OpenAIChatRequest.MarshalJSON above

Please verify that OpenAI's Responses API indeed does not support reasoning.max_tokens and should have it omitted from requests.

core/providers/openai/responses.go (1)

41-85: Reasoning transformation logic looks correct.

The bidirectional conversion between gpt-oss reasoning content blocks and standard OpenAI summaries+encrypted_content is well-structured. The three branches handle:

  1. Skip messages with content blocks but no summaries for non-gpt-oss models
  2. Convert summaries to content blocks for gpt-oss models
  3. Preserve messages as-is for other cases

The model detection via strings.Contains(bifrostReq.Model, "gpt-oss") is fragile. Consider verifying this matches the actual model naming convention and whether a more robust check (e.g., a helper function or constant) would be appropriate.
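One shape such a helper could take; the exact matching convention (substring vs. prefix, provider-prefixed names like "openai/gpt-oss-120b") is an assumption that would need to be confirmed against real model identifiers:

```go
package main

import (
	"fmt"
	"strings"
)

// isGPTOSSModel is a hypothetical helper centralizing the model check, so the
// naming convention lives in one documented place instead of ad-hoc
// strings.Contains calls scattered through conversion code.
func isGPTOSSModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "gpt-oss")
}

func main() {
	fmt.Println(isGPTOSSModel("openai/gpt-oss-120b"))
}
```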

ui/lib/types/logs.ts (1)

411-433: LGTM - Type rename aligns with backend schema changes.

The rename from ResponsesReasoningContent to ResponsesReasoningSummary correctly mirrors the backend ResponsesReasoningSummary struct in core/schemas/responses.go (lines 743-746), maintaining consistency across the codebase.

framework/streaming/responses.go (3)

497-534: LGTM - Reasoning summary streaming accumulation.

The new case for ResponsesStreamResponseTypeReasoningSummaryTextDelta correctly:

  1. Searches for existing reasoning message by ItemID
  2. Creates a new reasoning message with proper type and role if not found
  3. Delegates to helper methods for delta and signature handling

626-679: Reasoning delta accumulation handles both storage modes correctly.

The helper properly branches on contentIndex:

  • With index: Stores in content blocks as reasoning_text type
  • Without index: Accumulates into ResponsesReasoning.Summary

The comment on lines 667-668 acknowledges the current limitation of accumulating into a single summary entry.


681-727: Signature accumulation logic is correct.

Follows the same pattern as delta handling, storing signatures in either:

  • ContentBlock.Signature when contentIndex is provided
  • ResponsesReasoning.EncryptedContent otherwise

This aligns with the schema design where EncryptedContent serves as the reasoning-level signature storage.

core/schemas/responses.go (4)

68-68: LGTM - StopReason field addition.

The StopReason field appropriately handles non-OpenAI providers that return stop reasons in a different format, with a clear comment noting it's not part of OpenAI's spec.


398-402: LGTM - Signature field for content blocks.

Adding the Signature field to ResponsesMessageContentBlock enables per-block signature storage for reasoning content, which aligns with the streaming accumulation logic in framework/streaming/responses.go.


729-747: LGTM - Rename to ResponsesReasoningSummary.

The rename from ResponsesReasoningContent to ResponsesReasoningSummary better reflects the purpose of this struct and maintains consistency with the UI types in ui/lib/types/logs.ts.


1439-1441: LGTM - Signature field for streaming responses.

Adding Signature to BifrostResponsesStreamResponse enables streaming signature deltas alongside text deltas, supporting the reasoning accumulation logic.

core/providers/gemini/responses.go (2)

138-179: LGTM! Good handling of function calls and thought signatures.

The implementation correctly:

  • Avoids range loop variable capture by creating copies of functionCallID and functionCallName
  • Preserves Gemini's ThoughtSignature as encrypted content in a separate reasoning message
  • Initializes the Summary field as an empty slice, consistent with the new schema structure

609-629: LGTM! Proper bidirectional conversion with safe look-ahead.

The look-ahead mechanism correctly:

  • Checks array bounds before accessing the next message
  • Validates the next message is a reasoning type with encrypted content
  • Preserves the thought signature from the Bifrost reasoning message back to Gemini's format

This maintains consistency with the reverse conversion in convertGeminiCandidatesToResponsesOutput.

core/providers/cohere/responses.go (7)

17-17: LGTM! Proper state tracking for reasoning content.

The ReasoningContentIndices field is correctly:

  • Initialized in the pool's New function
  • Handled with defensive nil checks in acquireCohereResponsesStreamState
  • Cleared in the flush method, consistent with other map fields

This enables proper tracking of reasoning blocks during streaming conversion.

Also applies to: 34-34, 64-68, 106-110


318-368: LGTM! Proper reasoning content lifecycle handling.

The thinking/reasoning block handling correctly:

  • Creates a reasoning message with the appropriate type and empty Summary slice
  • Tracks the content index in ReasoningContentIndices for downstream event emission
  • Emits OpenAI-style lifecycle events (output_item.added, content_part.added)
  • Generates stable item IDs consistent with other content types

395-410: LGTM! Correct differentiation between text and reasoning deltas.

The implementation properly emits reasoning_summary_text.delta events for thinking content (line 400) instead of output_text.delta, ensuring downstream consumers can distinguish between regular text and reasoning updates.


420-449: LGTM! Proper reasoning block cleanup and event emission.

The content end handling correctly:

  • Uses ReasoningContentIndices to differentiate reasoning from text blocks
  • Emits reasoning_summary_text.done for reasoning (line 425) vs. output_text.done for text (line 454)
  • Cleans up the tracking map (line 449) to prevent memory leaks

977-1112: LGTM! Comprehensive message conversion with proper state management.

The conversion function correctly handles:

  • Accumulation of reasoning blocks via pendingReasoningContentBlocks
  • Association of reasoning with assistant messages
  • Proper flushing of pending blocks at function end (lines 1090-1100)
  • System message collection and prepending (lines 1102-1109)

The state machine logic is complex but appears sound for managing the various message types and their relationships.


850-932: LGTM! Comprehensive request conversion with proper parameter mapping.

The ToCohereResponsesRequest function correctly:

  • Maps standard parameters (temperature, top_p, max_tokens)
  • Extracts Cohere-specific options from ExtraParams (top_k, thinking, penalties)
  • Converts tools and tool choice using dedicated helper functions
  • Delegates message conversion to ConvertBifrostMessagesToCohereMessages

The structure is clean and follows established patterns in the codebase.


1370-1383: The reasoning message structure is correct and not redundant. Cohere provides reasoning as reasoning_text content blocks (lines 1327-1331), which are correctly placed in Content.ContentBlocks while ResponsesReasoning.Summary remains empty. This dual-field pattern is intentional: ResponsesReasoning.Summary is used by providers that send reasoning summaries (e.g., some OpenAI models), while Content.ContentBlocks is used for reasoning_text blocks (Cohere, Bedrock, Anthropic). When converting back to provider format (lines 1207-1209), the code checks Summary first, which is empty for content that originated from blocks, and that is correct.

Likely an incorrect or invalid review comment.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
framework/streaming/responses.go (1)

71-104: New Signature field is not preserved in deep copies

  • deepCopyResponsesStreamResponse copies Delta and LogProbs but never copies the new Signature field on BifrostResponsesStreamResponse (Lines 71–104), so any signature arriving from providers is lost when we stash stream responses in the accumulator.
  • deepCopyResponsesMessageContentBlock similarly never copies the new Signature (and still ignores FileID) on ResponsesMessageContentBlock (Lines 382–424), so block‑level signatures won’t survive accumulation either.

Both issues mean the newly added reasoning signature plumbing in buildCompleteMessageFromResponsesStreamChunks / appendReasoningSignatureToResponsesMessage can never see those signatures.

Consider updating both helpers along these lines:

func deepCopyResponsesStreamResponse(original *schemas.BifrostResponsesStreamResponse) *schemas.BifrostResponsesStreamResponse {
    ...
-   if original.Delta != nil {
-       copyDelta := *original.Delta
-       copy.Delta = &copyDelta
-   }
+   if original.Delta != nil {
+       copyDelta := *original.Delta
+       copy.Delta = &copyDelta
+   }
+   if original.Signature != nil {
+       copySig := *original.Signature
+       copy.Signature = &copySig
+   }
    ...
}
func deepCopyResponsesMessageContentBlock(original schemas.ResponsesMessageContentBlock) schemas.ResponsesMessageContentBlock {
-   copy := schemas.ResponsesMessageContentBlock{
-       Type: original.Type,
-   }
+   copy := schemas.ResponsesMessageContentBlock{
+       Type: original.Type,
+   }
+   if original.FileID != nil {
+       id := *original.FileID
+       copy.FileID = &id
+   }
+   if original.Signature != nil {
+       sig := *original.Signature
+       copy.Signature = &sig
+   }
    if original.Text != nil {
        copyText := *original.Text
        copy.Text = &copyText
    }
    ...
}

Also applies to: 382-424

♻️ Duplicate comments (4)
core/providers/utils/utils.go (1)

267-267: Revert to sonic.Marshal for production performance.

This change from sonic.Marshal to sonic.MarshalIndent was flagged in a previous review but remains unaddressed. The indented JSON increases payload size and bandwidth for all provider API requests without any documented justification. Provider APIs do not require formatted JSON.

Unless there is a specific requirement for indented JSON (which should be documented with a code comment), revert this change.

Apply this diff to revert the change:

-		jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+		jsonBody, err := sonic.Marshal(convertedBody)
core/providers/cohere/responses.go (3)

140-172: Empty content block for invalid image URL is fragile

When cohereBlock.Type == CohereContentBlockTypeImage and cohereBlock.ImageURL == nil, the function returns a zero‑value ResponsesMessageContentBlock{} (Type == ""), which can later end up in ContentBlocks and surprise downstream logic that expects a valid Type.

Consider either:

  • Skipping such blocks entirely, or
  • Returning a text block indicating an invalid/missing image instead of an empty block.

1127-1145: Cohere tool choice mapping treats "auto" and unknown values as "required"

convertBifrostToolChoiceToCohereToolChoice:

switch *toolChoiceString {
case "none":
    choice := ToolChoiceNone
    return &choice
case "required", "auto", "function":
    choice := ToolChoiceRequired
    return &choice
default:
    choice := ToolChoiceRequired
    return &choice
}

Maps both "auto" and any unknown string to ToolChoiceRequired, which forces a tool call instead of letting the model decide. That’s a semantic change from OpenAI‑style "auto" and may not match Cohere’s API either.

Consider instead:

  • Mapping "none"ToolChoiceNone,
  • "required" / "function"ToolChoiceRequired,
  • "auto" (and other/unknown values) → nil to fall back to Cohere defaults, or a dedicated “auto” enum if available.
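The suggested mapping could be sketched as follows; the enum names and string values are illustrative, not Cohere's actual API surface:

```go
package main

import "fmt"

type cohereToolChoice string

const (
	toolChoiceNone     cohereToolChoice = "NONE"
	toolChoiceRequired cohereToolChoice = "REQUIRED"
)

// mapToolChoice lets "auto" and unknown values fall through to nil so the
// provider default (model decides) applies, instead of forcing a tool call
// as the current code does.
func mapToolChoice(choice string) *cohereToolChoice {
	switch choice {
	case "none":
		c := toolChoiceNone
		return &c
	case "required", "function":
		c := toolChoiceRequired
		return &c
	default: // "auto" and anything unrecognized: leave unset
		return nil
	}
}

func main() {
	fmt.Println(mapToolChoice("auto") == nil) // provider default applies
}
```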

1193-1225: Encrypted reasoning content is exposed via a plain-text marker

In convertBifrostReasoningToCohereThinking, when EncryptedContent is present:

encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
thinkingBlock := CohereContentBlock{
    Type:     CohereContentBlockTypeThinking,
    Thinking: &encryptedText,
}

This wraps the encrypted payload in a clear‑text marker and sends it to Cohere as “thinking” text, which may be contrary to the intent of keeping it opaque and could leak internal details.

Safer options:

  • Skip EncryptedContent entirely for Cohere (don’t send it), or
  • Represent only high‑level metadata (e.g., “[ENCRYPTED_REASONING_PRESENT]”) without including the ciphertext.
🧹 Nitpick comments (4)
core/providers/utils/utils.go (1)

951-960: The duplicate-string concern is overstated; refactoring is optional, not essential.

While the implementation has minor inefficiencies, the actual impact is negligible for this use case:

  1. Duplicates are extremely unlikely: A 50-character random string from a 37-character alphabet offers ~10^80 possible combinations. The probability of duplicate outputs is vanishingly small, especially for API response message IDs generated during processing.

  2. Inefficiency is minor: Creating a new rand.Source per call has overhead, but these calls occur during response transformation—not in a tight loop. This is unlikely to be a performance bottleneck.

  3. Cryptographic security not required: These are internal message IDs, not authentication tokens or security-sensitive values.

If performance profiling shows this is a bottleneck, consider refactoring with sync.Once to initialize a package-level random source. However, this is not essential for the current usage pattern.

transports/bifrost-http/handlers/inference.go (1)

224-254: ResponsesRequest.UnmarshalJSON logic looks solid; fix comment wording and consider deduping with ChatRequest

This implementation correctly mirrors the ChatRequest flow: it protects the embedded BifrostParams from being shadowed by the custom ResponsesParameters.UnmarshalJSON, and it cleanly decodes the input union and params. No functional issues stand out.

Two small nits:

  • Line 236: the comment says "Unmarshal messages" but this block unmarshals the input field. Consider updating to avoid confusion.
  • The structure is now nearly identical to ChatRequest.UnmarshalJSON; if this pattern spreads further, a shared helper for "unmarshal BifrostParams + specific input + specific params" could reduce duplication, though it's not urgent.
core/providers/openai/responses.go (1)

42-84: Reasoning message skip / comment mismatch – please confirm intended behavior

  • For non‑gpt-oss models, reasoning messages with ResponsesReasoning but only ContentBlocks (no Summary, no EncryptedContent) are silently skipped (Lines 47–54). That drops those messages entirely instead of degrading them (e.g., into summaries or plain text). If such inputs can occur cross‑provider, this may be surprising; worth confirming that they can’t, or that dropping them is acceptable.
  • The comment “convert them to summaries” (Line 43) doesn’t match the code, which instead converts summaries to reasoning content blocks for gpt-oss when Content == nil (Lines 56–77). Updating the comment would avoid confusion.
core/providers/cohere/responses.go (1)

1304-1383: Reasoning summary content is only attached as blocks, not as Summary

convertSingleCohereMessageToBifrostMessages collects CohereContentBlockTypeThinking blocks into reasoningContentBlocks and then:

  • Prepends a ResponsesMessageTypeReasoning message with Content.ContentBlocks = reasoningContentBlocks and
  • Initializes ResponsesReasoning.Summary as an empty slice.

Given the new schema encourages using ResponsesReasoning.Summary for reasoning summaries, this is fine as long as downstream code expects reasoning_text content blocks and not populated Summary entries for Cohere outputs. If you intend to surface reasoning summaries uniformly across providers, you might later want to mirror those blocks into ResponsesReasoning.Summary as well.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a15c48b and d4bfce4.

📒 Files selected for processing (13)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (3 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (3 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/anthropic/errors.go
  • ui/lib/types/logs.ts
  • core/providers/openai/openai.go
  • core/providers/openai/types.go
  • core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/utils/utils.go
  • core/providers/openai/responses.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/gemini/responses.go
  • core/providers/cohere/responses.go
  • core/schemas/responses.go
  • framework/streaming/responses.go
🧬 Code graph analysis (6)
core/providers/openai/responses.go (2)
core/schemas/responses.go (5)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (86-113)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (402-408)
  • ResponsesMessage (422-437)
  • ResponsesMessageContent (399-399)
  • ResponsesReasoning (416-419)
  • ResponsesReasoningSummary (411-414)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/cohere/responses.go (3)
core/providers/cohere/types.go (22)
  • CohereContentBlock (142-156)
  • CohereContentBlockTypeText (134-134)
  • CohereContentBlockTypeImage (135-135)
  • CohereContentBlockTypeThinking (136-136)
  • CohereStreamEvent (387-392)
  • StreamEventMessageStart (372-372)
  • StreamEventContentStart (373-373)
  • StreamEventContentDelta (374-374)
  • StreamEventContentEnd (375-375)
  • StreamEventToolPlanDelta (376-376)
  • StreamEventToolCallStart (377-377)
  • StreamEventToolCallDelta (378-378)
  • StreamEventToolCallEnd (379-379)
  • StreamEventCitationStart (380-380)
  • StreamEventCitationEnd (381-381)
  • StreamEventMessageEnd (382-382)
  • StreamEventDebug (383-383)
  • CohereChatRequest (14-31)
  • CohereMessage (50-56)
  • NewBlocksContent (105-109)
  • NewStringContent (98-102)
  • CohereImageURL (159-161)
core/schemas/responses.go (14)
  • BifrostResponsesResponse (45-84)
  • ResponsesStreamResponseTypeCreated (1362-1362)
  • ResponsesStreamResponseTypeInProgress (1363-1363)
  • ResponsesStreamResponseTypeOutputTextDone (1375-1375)
  • ResponsesStreamResponseTypeContentPartDone (1372-1372)
  • ResponsesMessage (313-326)
  • ResponsesStreamResponseTypeOutputItemDone (1369-1369)
  • ResponsesStreamResponseTypeOutputItemAdded (1368-1368)
  • ResponsesStreamResponseTypeContentPartAdded (1371-1371)
  • ResponsesStreamResponseTypeOutputTextDelta (1374-1374)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDelta (1380-1380)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1381-1381)
  • ResponsesResponseUsage (253-260)
  • ResponsesToolChoice (958-961)
core/schemas/utils.go (2)
  • SafeExtractIntPointer (486-494)
  • SafeExtractFromMap (519-525)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (351-358)
  • ResponsesReasoningSummary (411-414)
framework/streaming/responses.go (4)
core/schemas/responses.go (9)
  • ResponsesMessage (313-326)
  • ResponsesMessageTypeReasoning (306-306)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesReasoningContentBlockTypeSummaryText (740-740)
core/providers/gemini/types.go (3)
  • Type (778-778)
  • Role (13-13)
  • Content (922-930)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (3)
core/providers/gemini/responses.go (1)

138-179: Function-call ↔ thought-signature round-trip looks consistent

The new FunctionCall conversion:

  • Emits a ResponsesToolMessage with CallID, Name, and stringified Args.
  • Emits a separate reasoning message carrying EncryptedContent from ThoughtSignature.

And the reverse path:

  • Rebuilds FunctionCall from ResponsesToolMessage and, if the next message is a reasoning message with EncryptedContent, attaches it as ThoughtSignature.

This is internally consistent and nil‑safe; just keep in mind the assumption that the reasoning message immediately follows the function-call message when constructing messages elsewhere.

Also applies to: 609-655

transports/bifrost-http/integrations/anthropic.go (1)

73-103: Clarify SSE contract for multi-event Anthropic streaming

When ToAnthropicResponsesStreamResponse returns more than one event, the converter now:

  • Marshals each event to JSON and concatenates them as a single SSE string ("event: %s\ndata: %s\n\n"), and
  • Returns ("", combinedContent, nil).

This assumes the upstream streaming writer treats a non-empty payload with an empty event name as “already formatted SSE” and writes it verbatim. If the writer instead always wraps (eventName, data) into its own SSE envelope, this will double‑wrap or drop the event type.

Please double‑check the StreamConfig writer path to ensure:

  • event == "" is indeed interpreted as “raw SSE payload”, and
  • It’s acceptable to skip individual Anthropic events that fail sonic.Marshal rather than failing the whole chunk.
core/schemas/responses.go (1)

45-84: Schema extensions for stop reason, reasoning summaries, and signatures look coherent

The additions:

  • StopReason on BifrostResponsesResponse,
  • Signature on ResponsesMessageContentBlock,
  • the new ResponsesReasoningSummary type and updated ResponsesReasoning.Summary,
  • and Delta/Signature on BifrostResponsesStreamResponse

are structurally consistent with how the rest of the file models union types and streaming events.

The main follow‑up risk is making sure all converters and helpers (deep copies, provider adapters, streaming accumulators) are updated to propagate Signature and the new Summary shape; some of that is already wired up, but a few helpers still need updates (see streaming/cohere comments).

Also applies to: 399-410, 729-747, 1439-1442

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from d4bfce4 to bf9c361 on December 4, 2025 17:58
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (6)
core/providers/utils/utils.go (1)

267-271: Avoid MarshalIndent in hot-path request body marshalling.

CheckContextAndGetRequestBody is on the request path for all providers; using sonic.MarshalIndent here increases allocations and bloats every request payload with whitespace. Unless a specific upstream API strictly requires pretty-printed JSON, it’s better to keep the wire format compact and, if needed, pretty-print only for logging.

Consider reverting to sonic.Marshal:

-		jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+		jsonBody, err := sonic.Marshal(convertedBody)

If pretty JSON is truly required for a given provider, please document that requirement and consider making indentation opt‑in rather than the default for all providers.
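A small sketch of the suggested split, compact bytes on the wire and indentation only for debug logging, using stdlib `encoding/json` (sonic exposes the same `Marshal`/`MarshalIndent` signatures); `payload` and `marshalBody` are illustrative names, not the utils package's actual API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type payload struct {
	Model string `json:"model"`
	Max   int    `json:"max_tokens"`
}

// marshalBody keeps the wire format compact; pretty-printing is
// reserved for an opt-in debug path and never sent upstream.
func marshalBody(v any, debug bool) ([]byte, error) {
	body, err := json.Marshal(v) // compact bytes go on the wire
	if err != nil {
		return nil, err
	}
	if debug {
		pretty, _ := json.MarshalIndent(v, "", "  ")
		fmt.Printf("request body:\n%s\n", pretty)
	}
	return body, nil
}

func main() {
	b, _ := marshalBody(payload{Model: "m", Max: 1}, false)
	fmt.Println(string(b))
}
```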

core/providers/cohere/responses.go (5)

140-172: Avoid returning zero-value content block for invalid image URL

When cohereBlock.ImageURL == nil, this returns schemas.ResponsesMessageContentBlock{} with a zero Type, which can confuse downstream consumers that expect a valid type or no block at all. A small sentinel or text fallback is safer.

 case CohereContentBlockTypeImage:
-	// For images, create a text block describing the image
-	if cohereBlock.ImageURL == nil {
-		// Skip invalid image blocks without ImageURL
-		return schemas.ResponsesMessageContentBlock{}
-	}
-	return schemas.ResponsesMessageContentBlock{
-		Type: schemas.ResponsesInputMessageContentBlockTypeImage,
-		ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{
-			ImageURL: &cohereBlock.ImageURL.URL,
-		},
-	}
+	if cohereBlock.ImageURL == nil || cohereBlock.ImageURL.URL == "" {
+		// Return a small text sentinel instead of a zero-value block
+		return schemas.ResponsesMessageContentBlock{
+			Type: schemas.ResponsesInputMessageContentBlockTypeText,
+			Text: schemas.Ptr("[Image block with missing URL]"),
+		}
+	}
+	return schemas.ResponsesMessageContentBlock{
+		Type: schemas.ResponsesInputMessageContentBlockTypeImage,
+		ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{
+			ImageURL: &cohereBlock.ImageURL.URL,
+		},
+	}

174-488: Fix nil-dereference when generating reasoning item IDs in streaming

In the StreamEventContentStart handler for CohereContentBlockTypeThinking, state.MessageID is dereferenced before the nil check:

// Generate stable ID for reasoning item
itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
if state.MessageID == nil {
    itemID = fmt.Sprintf("reasoning_%d", outputIndex)
}

If state.MessageID is nil (e.g., no message_start ID), this will panic.

A nil-safe branch avoids the panic:

-				// Generate stable ID for reasoning item
-				itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
-				if state.MessageID == nil {
-					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
-				}
+				// Generate stable ID for reasoning item
+				var itemID string
+				if state.MessageID != nil {
+					itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
+				} else {
+					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
+				}

The rest of the reasoning/text streaming (reasoning_summary_text.{delta,done}, content_part.{added,done}, output_item.{added,done}) looks coherent and matches the intended OpenAI-style lifecycle.


1127-1146: Tool choice "auto" should not be forced to ToolChoiceRequired

The current mapping forces both "auto" and unknown strings to ToolChoiceRequired, which changes semantics and can unintentionally force tool calls:

case "required", "auto", "function":
    choice := ToolChoiceRequired
    return &choice
default:
    choice := ToolChoiceRequired
    return &choice

Safer behavior is to only map explicit "required"/"function" and "none", letting "auto" (and unknown strings) fall back to Cohere’s default behavior:

	if toolChoiceString != nil {
		switch *toolChoiceString {
		case "none":
			choice := ToolChoiceNone
			return &choice
-		case "required", "auto", "function":
+		case "required", "function":
			choice := ToolChoiceRequired
-			return &choice
-		default:
-			choice := ToolChoiceRequired
-			return &choice
+			return &choice
+		case "auto":
+			// Let Cohere use its default "auto" behavior.
+			return nil
+		default:
+			// Unknown strings: fall back to provider defaults.
+			return nil
		}
	}

1193-1229: Do not expose encrypted reasoning content as plain text

convertBifrostReasoningToCohereThinking currently converts encrypted reasoning into a readable string:

} else if msg.ResponsesReasoning.EncryptedContent != nil {
    encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
    thinkingBlock := CohereContentBlock{
        Type:     CohereContentBlockTypeThinking,
        Thinking: &encryptedText,
    }
    thinkingBlocks = append(thinkingBlocks, thinkingBlock)
}

This leaks the encrypted payload in clear form, which defeats the purpose of keeping it opaque when forwarding to another provider.

Better to skip encrypted reasoning entirely for Cohere:

-	} else if msg.ResponsesReasoning.EncryptedContent != nil {
-		// Cohere doesn't have a direct equivalent to encrypted content,
-		// so we'll store it as a regular thinking block with a special marker
-		encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-		thinkingBlock := CohereContentBlock{
-			Type:     CohereContentBlockTypeThinking,
-			Thinking: &encryptedText,
-		}
-		thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+	} else if msg.ResponsesReasoning.EncryptedContent != nil {
+		// Cohere doesn't support encrypted reasoning; skip forwarding it so it remains opaque.
 	}

The existing handling of ContentBlocks and Summary already covers non-encrypted reasoning.


1231-1265: Access CallID via embedded struct to avoid nil-pointer panic

convertBifrostFunctionCallToCohereMessage reads msg.CallID directly:

if msg.CallID != nil {
    toolCall.ID = msg.CallID
}

Because CallID is promoted from the embedded *ResponsesToolMessage, this will panic if msg.ResponsesToolMessage is nil.

Guard the embedded pointer explicitly:

-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
-	}
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
+	}

The rest of the function already checks msg.ResponsesToolMessage != nil for Arguments and Name.
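The panic mode here is a general Go pitfall worth seeing in isolation: accessing a field promoted through a nil embedded pointer dereferences that pointer. The sketch below mirrors the schema's shape with simplified stand-in types:

```go
package main

import "fmt"

type ResponsesToolMessage struct {
	CallID *string
}

type ResponsesMessage struct {
	*ResponsesToolMessage // embedded pointer: CallID is promoted
}

func main() {
	msg := ResponsesMessage{} // embedded ResponsesToolMessage is nil

	// `msg.CallID != nil` compiles, but evaluating it is shorthand for
	// msg.ResponsesToolMessage.CallID and panics on the nil pointer.
	// Guard the embedded pointer first:
	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
		fmt.Println("call id:", *msg.CallID)
	} else {
		fmt.Println("no call id")
	}
}
```

The short-circuiting `&&` makes the guarded form safe: the promoted field is only read after the embedded pointer is known to be non-nil.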

🧹 Nitpick comments (4)
core/providers/openai/responses.go (1)

57-59: Redundant condition check.

len(message.ResponsesReasoning.Summary) > 0 is checked twice on lines 57 and 59.

-			if len(message.ResponsesReasoning.Summary) > 0 &&
-				strings.Contains(bifrostReq.Model, "gpt-oss") &&
-				len(message.ResponsesReasoning.Summary) > 0 &&
+			if len(message.ResponsesReasoning.Summary) > 0 &&
+				strings.Contains(bifrostReq.Model, "gpt-oss") &&
 				message.Content == nil {
transports/bifrost-http/integrations/anthropic.go (2)

94-98: Use injected logger instead of standard log package.

Using log.Printf directly bypasses the structured logger passed to NewAnthropicRouter. This can cause inconsistent logging behavior and lose context in production environments.

Consider passing the logger to the converter function or using a closure to capture it. If that's not feasible, at minimum document why log is used here.


74-78: Remove or clarify commented-out code blocks.

Multiple commented-out code blocks are present. If this code is no longer needed, remove it to reduce confusion. If it's temporarily disabled, add a TODO comment explaining when it should be re-enabled.

Also applies to: 103-117

core/providers/cohere/responses.go (1)

1303-1429: Consider setting assistant role on synthesized reasoning messages

convertSingleCohereMessageToBifrostMessages builds a separate reasoning ResponsesMessage with populated ResponsesReasoning and ContentBlocks, but it doesn’t set a Role. For consistency with other providers and with how reasoning is emitted elsewhere, it’s useful to mark these as assistant-originated:

	if len(reasoningContentBlocks) > 0 {
+		role := schemas.ResponsesInputMessageRoleAssistant
 		reasoningMessage := schemas.ResponsesMessage{
 			ID:   schemas.Ptr("rs_" + fmt.Sprintf("%d", time.Now().UnixNano())),
 			Type: schemas.Ptr(schemas.ResponsesMessageTypeReasoning),
+			Role: &role,
 			ResponsesReasoning: &schemas.ResponsesReasoning{
 				Summary: []schemas.ResponsesReasoningSummary{},
 			},
 			Content: &schemas.ResponsesMessageContent{
 				ContentBlocks: reasoningContentBlocks,
 			},
 		}

This is a behavioral refinement rather than a correctness fix, but it will likely make downstream consumers’ role-based handling more predictable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d4bfce4 and bf9c361.

📒 Files selected for processing (14)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (3 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/openai/openai.go
  • ui/lib/types/logs.ts
  • core/providers/anthropic/errors.go
  • transports/bifrost-http/handlers/inference.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/types.go
  • core/providers/utils/utils.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/gemini/responses.go
  • core/providers/anthropic/chat.go
  • core/providers/openai/responses.go
  • framework/streaming/responses.go
  • core/providers/anthropic/types.go
  • core/schemas/responses.go
  • core/providers/cohere/responses.go
🧬 Code graph analysis (5)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (512-519)
transports/bifrost-http/integrations/anthropic.go (1)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (402-408)
  • ResponsesMessage (422-437)
  • ResponsesMessageContent (399-399)
  • ResponsesReasoning (416-419)
  • ResponsesReasoningSummary (411-414)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/openai/responses.go (2)
core/schemas/responses.go (4)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (351-358)
  • ResponsesReasoningSummary (411-414)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (23)
core/schemas/responses.go (4)

68-68: LGTM: StopReason field addition.

The comment clearly documents that this field is "Not in OpenAI's spec, but sent by other providers", which provides useful context for maintainers.


399-402: LGTM: Signature field addition to ResponsesMessageContentBlock.

The Signature field for reasoning content blocks aligns with the streaming updates in BifrostResponsesStreamResponse and the UI type definitions.


730-747: LGTM: ResponsesReasoning and ResponsesReasoningSummary refactoring.

The transition from []ResponsesReasoningContent to []ResponsesReasoningSummary with explicit Type and Text fields provides clearer semantics. This aligns with the UI type ResponsesReasoningSummary in ui/lib/types/logs.ts:410-413.


1439-1441: LGTM: Streaming response signature support.

Adding Signature to BifrostResponsesStreamResponse enables proper signature propagation during streaming, which is essential for reasoning content integrity.

core/providers/anthropic/types.go (2)

135-145: LGTM: RedactedThinking content block type.

Adding AnthropicContentBlockTypeRedactedThinking enables proper handling of redacted thinking content blocks from Anthropic's API.


153-153: LGTM: Data field for redacted thinking.

The Data field with clear documentation for encrypted data in redacted thinking blocks is appropriate.

core/providers/openai/types.go (1)

154-192: LGTM: Custom marshaling to strip MaxTokens for OpenAI.

The implementation correctly strips MaxTokens from reasoning parameters before sending to OpenAI, since OpenAI doesn't support this field (it's Anthropic-specific per the schema documentation). The approach using an alias struct and json.RawMessage for preserving custom Input marshaling is sound.

core/providers/openai/responses.go (1)

42-85: LGTM: Reasoning content handling for OpenAI models.

The logic correctly differentiates between:

  1. gpt-oss models: which use reasoning_text content blocks
  2. Other OpenAI models: which use summaries + encrypted_content

The transformation ensures proper format compatibility when sending requests to OpenAI.

transports/bifrost-http/integrations/anthropic.go (1)

86-119: LGTM: Multi-event SSE aggregation logic.

The streaming response handling correctly aggregates multiple Anthropic events into proper SSE format and handles edge cases (empty responses, single events). The error logging without failing allows the stream to continue processing remaining events.

framework/streaming/responses.go (3)

497-534: LGTM: ReasoningSummaryTextDelta handling.

The implementation correctly:

  1. Searches for existing reasoning messages by ItemID (reverse iteration for efficiency)
  2. Creates new reasoning messages when needed with proper initialization
  3. Handles both text deltas and signature deltas in a single pass

The guard condition on line 500 ensures we have at least one payload and a valid ItemID before processing.


626-679: LGTM: Reasoning delta accumulation with dual-path logic.

The helper correctly handles two accumulation paths:

  1. With ContentIndex: Accumulates into content blocks as reasoning_text type
  2. Without ContentIndex: Accumulates into ResponsesReasoning.Summary

The TODO comment on lines 667-668 appropriately notes future enhancement potential for multiple summary entries.


681-727: LGTM: Signature accumulation with proper field mapping.

The signature helper correctly maps:

  • With ContentIndex → Signature field in content block
  • Without ContentIndex → EncryptedContent field in ResponsesReasoning

This aligns with the schema design where EncryptedContent serves as the signature/encrypted data at the reasoning level.

core/providers/anthropic/chat.go (1)

608-634: PartialJSON guard condition now emits empty string deltas.

The condition changed from chunk.Delta.PartialJSON != nil && *chunk.Delta.PartialJSON != "" to just chunk.Delta.PartialJSON != nil. This allows empty string partial JSON to be emitted as deltas. Evidence shows this is intentional: responses.go:3069 explicitly creates empty PartialJSON deltas, and the accumulation logic (responses.go:470, 478) safely concatenates even empty strings. Validation of non-empty Arguments is deferred to after accumulation completes (as seen in test utilities validating the final assembled result). This change is safe and maintains streaming consistency.

core/providers/gemini/responses.go (2)

138-179: Function-call → tool message + reasoning signature path looks solid

The new FunctionCall branch builds a proper ResponsesToolMessage (with JSON-serialized args) and a separate reasoning message carrying Summary (initialized empty) and EncryptedContent for the thought signature. This cleanly aligns Gemini function calls with the updated ResponsesReasoning schema and avoids range-variable capture issues.


596-629: Reconstruction of Gemini FunctionCall + ThoughtSignature is consistent with emit side

The FunctionCall reconstruction from ResponsesToolMessage (including CallID and decoded Arguments) and the lookahead-based ThoughtSignature attachment match how convertGeminiCandidatesToResponsesOutput emits the function-call + reasoning pair. As long as the reasoning message immediately follows the function-call (which this file enforces), this round-trip is coherent.

core/providers/cohere/responses.go (8)

13-25: ReasoningContentIndices tracking and reset look correct

Adding ReasoningContentIndices into CohereResponsesStreamState, initializing it in the pool, and clearing it in both acquireCohereResponsesStreamState and flush ensures per-stream tracking of reasoning content indices without leaking state between streams. No issues here.

Also applies to: 29-41, 45-77, 89-118


490-567: Streaming tool plan, tool calls, citations, and lifecycle wiring look consistent

The handling of StreamEventToolPlanDelta, tool call start/delta/end, citation start/end, and StreamEventMessageEnd appears internally consistent:

  • Tool plan text is emitted as normal output_text.delta on a dedicated output index, with proper close-out events before tool calls.
  • Tool call arguments are buffered per-output-index and finalized with function_call_arguments.done followed by output_item.done.
  • Citations become OutputTextAnnotationAdded/OutputTextAnnotationDone with indices wired via ContentIndexToOutputIndex.
  • Message end emits a single response.completed with aggregated usage and stable CreatedAt.

No additional correctness issues stand out beyond the reasoning-ID nil-deref already called out.

Also applies to: 612-735, 735-803, 804-848


850-932: Bifrost → Cohere request conversion is aligned with Responses params

ToCohereResponsesRequest cleanly maps core parameters (MaxOutputTokens, Temperature, TopP, top_k, stop sequences, penalties) and the thinking extra param into the Cohere request, and converts tools/tool choice/messages via the new helpers. The shape looks correct and side-effect free.


935-975: Cohere → Bifrost response conversion is straightforward

ToBifrostResponsesResponse correctly maps the Cohere ID, computes CreatedAt at receipt time, translates usage (including cached tokens), and uses ConvertCohereMessagesToBifrostMessages(..., true) for the output message. This path looks correct; the only caveat is that Model isn’t propagated here, but that’s consistent with the current implementation.


977-1112: Bidirectional message mapping handles system, reasoning, and tools coherently

ConvertBifrostMessagesToCohereMessages and ConvertCohereMessagesToBifrostMessages:

  • Separate system content and prepend it as a single system message to Cohere.
  • Accumulate reasoning messages into CohereContentBlockTypeThinking and attach them to the next assistant message.
  • Convert function calls and function outputs into Cohere’s tool_calls / role:"tool" structures and back.

The control flow around currentAssistantMessage and pendingReasoningContentBlocks looks sound and flushes consistently at boundaries.


1150-1191: System and regular message conversions are straightforward

convertBifrostMessageToCohereSystemContent and convertBifrostMessageToCohereMessage simply flatten text blocks and map roles, and only emit content when present. Both look correct and side-effect free.


1267-1301: Function call output → Cohere role:"tool" mapping looks correct

convertBifrostFunctionCallOutputToCohereMessage only emits a tool message when CallID is present, reconstructs content from either Content or the OpenAI-style Output wrapper, and sets ToolCallID appropriately. This matches the Responses schema and Cohere expectations.


1431-1465: Content block → Cohere block conversion covers key cases

convertResponsesMessageContentBlocksToCohere correctly:

  • Maps both input and output text types to CohereContentBlockTypeText.
  • Converts image URL blocks into CohereContentBlockTypeImage.
  • Maps reasoning blocks to CohereContentBlockTypeThinking.

This provides a clean, minimal surface for Cohere without surprising behavior.

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-04-feat_raw_response_accumulation_for_streaming to graphite-base/1000 on December 5, 2025 14:01
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from bf9c361 to bcef5b4 on December 5, 2025 14:01
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from graphite-base/1000 to 12-05-feat_send_back_raw_request_support on December 5, 2025 14:02
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
core/providers/anthropic/types.go (1)

356-356: This StopSequence change was already flagged in a previous review.

The change from *string with omitempty to string without omitempty breaks API compatibility, as previously noted. Empty strings will serialize as "stop_sequence": "" instead of being omitted or representing null.

core/providers/cohere/responses.go (3)

1238-1256: Guard access to promoted CallID field to prevent panics

On line 1244, msg.CallID is accessed directly, but CallID is promoted from the embedded *ResponsesToolMessage. If msg.ResponsesToolMessage is nil, accessing msg.CallID will panic even inside the if condition.

Apply this fix:

-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
 	}

The same pattern should be applied to any other accesses of promoted fields from msg.ResponsesToolMessage throughout the function.


319-327: Critical: Nil-pointer dereference on state.MessageID

The code dereferences *state.MessageID on line 324 before checking if it's nil on line 325, which will cause a panic.

Apply this fix:

-			// Generate stable ID for reasoning item
-			itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
-			if state.MessageID == nil {
-				itemID = fmt.Sprintf("reasoning_%d", outputIndex)
-			}
+			// Generate stable ID for reasoning item
+			var itemID string
+			if state.MessageID != nil {
+				itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
+			} else {
+				itemID = fmt.Sprintf("reasoning_%d", outputIndex)
+			}

Note: This issue was flagged in a previous review but appears to remain unaddressed in the current code.


1131-1142: "auto" tool choice should map to ToolChoiceAuto, not ToolChoiceRequired

The "auto" tool choice is incorrectly mapped to ToolChoiceRequired. Cohere's API supports three tool choice modes: NONE, REQUIRED, and AUTO (defined as constants in core/providers/cohere/types.go). Mapping "auto" to ToolChoiceRequired changes semantics—auto allows the model to decide whether to call a tool, while required forces a tool call.

-		case "required", "auto", "function":
-			choice := ToolChoiceRequired
-			return &choice
+		case "required", "function":
+			choice := ToolChoiceRequired
+			return &choice
+		case "auto":
+			choice := ToolChoiceAuto
+			return &choice
🧹 Nitpick comments (4)
core/providers/vertex/errors.go (1)

14-28: Centralized body decoding and error classification look correct

Using providerUtils.CheckAndDecodeBody and switching all sonic.Unmarshal calls to decodedBody is a solid improvement: it handles content‑encoding consistently and cleanly separates decode failures (ErrProviderResponseDecode) from JSON shape issues (ErrProviderResponseUnmarshal). The fallback chain for OpenAI/Vertex/VertexValidation error formats remains intact and behaviorally equivalent apart from the improved error typing. I don’t see new correctness or panic risks here; this aligns well with the shared decoding utils used in other providers.

ui/app/workspace/logs/views/logResponsesMessageView.tsx (1)

202-204: Use strict equality (===) for type comparison.

The guard logic is correct, but TypeScript/JavaScript best practice is to use strict equality === instead of loose equality == for type comparisons.

-	if (message.type == "reasoning" && (!message.summary || message.summary.length === 0) && !message.encrypted_content) {
+	if (message.type === "reasoning" && (!message.summary || message.summary.length === 0) && !message.encrypted_content) {
 		return null;
 	}
core/schemas/responses.go (1)

731-733: Consider using a pointer or omitempty behavior for Summary slice.

The Summary field is a non-pointer slice without omitempty. In Go, an empty slice []ResponsesReasoningSummary{} will serialize as "summary": [] rather than being omitted. If the intent is to omit the field when empty (consistent with the UI guard checking message.summary.length === 0), consider adding omitempty.

 type ResponsesReasoning struct {
-	Summary          []ResponsesReasoningSummary `json:"summary"`
+	Summary          []ResponsesReasoningSummary `json:"summary,omitempty"`
 	EncryptedContent *string                     `json:"encrypted_content,omitempty"`
 }
transports/bifrost-http/integrations/anthropic.go (1)

111-113: Use the structured logger instead of log.Printf.

The router receives a schemas.Logger parameter (as seen in NewAnthropicRouter), but this error logging uses the standard library's log.Printf. For consistency with the codebase's logging practices, use the structured logger.

Consider passing the logger to the stream converter or using a context-aware logging approach. If the logger isn't accessible in this closure, the error could be returned or the design adjusted to provide logger access.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf9c361 and bcef5b4.

📒 Files selected for processing (18)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (4 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
💤 Files with no reviewable changes (1)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
  • transports/bifrost-http/handlers/inference.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/errors.go
  • core/providers/utils/utils.go
  • ui/lib/types/logs.ts
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/schemas/bifrost.go
  • core/providers/gemini/responses.go
  • core/providers/vertex/errors.go
  • core/schemas/responses.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/openai/types.go
  • core/providers/openai/responses.go
  • core/providers/anthropic/types.go
  • core/providers/cohere/responses.go
  • framework/streaming/responses.go
🧬 Code graph analysis (7)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (403-409)
  • ResponsesMessage (423-438)
  • ResponsesMessageContent (400-400)
  • ResponsesReasoning (417-420)
  • ResponsesReasoningSummary (412-415)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
  • CheckAndDecodeBody (467-475)
  • NewBifrostOperationError (493-504)
core/schemas/provider.go (1)
  • ErrProviderResponseDecode (29-29)
core/providers/vertex/vertex.go (1)
  • VertexError (25-31)
core/providers/vertex/types.go (1)
  • VertexValidationError (154-161)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (6)
  • Anthropic (37-37)
  • Vertex (40-40)
  • BifrostContextKeyUseRawRequestBody (117-117)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
  • BifrostContextKeySkipKeySelection (114-114)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamResponse (834-1232)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (513-520)
core/providers/openai/responses.go (2)
core/schemas/responses.go (4)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (21)
core/schemas/responses.go (4)

743-747: Type rename from ResponsesReasoningContent to ResponsesReasoningSummary looks correct.

The rename aligns with the UI type definition and the semantic meaning of the struct. This is a breaking change for any external consumers using the old type name.


68-68: LGTM - StopReason field addition.

The StopReason field appropriately accommodates providers that include stop reason information outside OpenAI's spec.


399-402: LGTM - Signature field addition to content blocks.

The Signature field enables per-content signatures for reasoning blocks, consistent with the streaming response structure.


1439-1441: LGTM - Streaming response signature support.

The Signature field addition aligns with the content block changes and properly documents that it's "Not in OpenAI's spec, but sent by other providers."

core/providers/anthropic/types.go (4)

135-145: LGTM - Added redacted_thinking content block type.

The new constant appropriately extends the enum to support Anthropic's redacted/encrypted thinking content blocks.


153-153: LGTM - Data field for redacted thinking.

The Data field properly supports encrypted data for redacted thinking blocks, with the comment clarifying that signature does not accompany this field.


350-350: LGTM - Type field made optional.

Adding omitempty to the Type field in AnthropicStreamDelta is appropriate as the type may not always be present in delta events.


300-305: The omitempty tags on the cache token fields affect marshaling only, not parsing.

Per Go's encoding/json documentation, omitempty drops a field whose value is empty (false, 0, a nil pointer or interface, or an empty array, slice, map, or string), so a zero-valued integer is omitted on marshal rather than serialized as 0. That only matters if Bifrost re-serializes the usage struct; it has no effect when decoding Anthropic responses. The actual change (using a value type instead of a pointer for CacheCreation) aligns with Anthropic's API contract, which always includes the cache_creation object in usage responses, even when empty. The code correctly deserializes and handles these fields per the documented Anthropic API behavior.

Likely an incorrect or invalid review comment.

transports/bifrost-http/integrations/anthropic.go (3)

74-81: LGTM - Extended provider check for Vertex with Anthropic models.

The condition correctly handles the case where Vertex is the provider but the model is an Anthropic/Claude model, using the IsAnthropicModel utility function.


106-122: LGTM - SSE aggregation for multiple streaming events.

The logic correctly handles the case where ToAnthropicResponsesStreamResponse returns multiple events by combining them into a properly formatted SSE string with event: and data: lines.
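The aggregation pattern can be sketched with a hypothetical event type and helper (this is not the actual ToAnthropicResponsesStreamResponse output, just the SSE framing it produces):

```go
package main

import (
	"fmt"
	"strings"
)

// sseEvent is a hypothetical stand-in for one Anthropic stream event.
type sseEvent struct {
	Type string
	Data string // JSON payload, already serialized
}

// combineSSE joins multiple events into one SSE-formatted string,
// emitting an "event:" line followed by a "data:" line per event,
// with a blank line terminating each frame.
func combineSSE(events []sseEvent) string {
	var b strings.Builder
	for _, e := range events {
		fmt.Fprintf(&b, "event: %s\ndata: %s\n\n", e.Type, e.Data)
	}
	return b.String()
}

func main() {
	out := combineSSE([]sseEvent{
		{Type: "content_block_start", Data: `{"index":0}`},
		{Type: "content_block_delta", Data: `{"delta":"hi"}`},
	})
	fmt.Print(out)
}
```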


194-206: LGTM - Refined passthrough gating.

The updated logic properly:

  1. Only sets raw request body for Anthropic or unspecified providers
  2. Conditionally attaches extra headers/URL path only when not using Anthropic API key auth
core/providers/openai/types.go (1)

154-192: Well-structured custom marshaling implementation.

The approach correctly shadows the embedded fields to customize JSON output. The implementation properly:

  1. Marshals Input first using its custom MarshalJSON method
  2. Wraps it in json.RawMessage to preserve the marshaled output
  3. Copies Reasoning with MaxTokens set to nil

This is correct for the OpenAI Responses API, which does not include a max_tokens field in the reasoning parameter. Token limiting is controlled at the request level via max_output_tokens, not within the reasoning configuration. The implementation correctly omits this field by setting it to nil.

core/schemas/bifrost.go (1)

120-120: LGTM!

The addition of BifrostContextKeyIntegrationType follows the existing pattern for context keys and is used appropriately in the router to store integration type information.

transports/bifrost-http/integrations/router.go (3)

312-313: LGTM!

Setting the integration type in the context is clean and follows the established pattern for storing request metadata.


709-712: LGTM!

The updated shouldSendDoneMarker logic correctly distinguishes between providers that expect [DONE] markers and those that don't (Anthropic and the responses API).


883-883: LGTM!

Expanding the SSE string check to allow both "data: " and "event: " prefixes properly supports providers like Anthropic that use custom event types in their SSE format.

framework/streaming/responses.go (3)

498-534: LGTM!

The new ReasoningSummaryTextDelta handling correctly creates or finds reasoning messages and delegates to the new helper functions for accumulation. The logic to find existing messages by ItemID is sound.


626-679: LGTM!

The appendReasoningDeltaToResponsesMessage helper correctly handles both content-block-based reasoning (with ContentIndex) and summary-based reasoning (without ContentIndex). The array bounds checks and initialization logic are appropriate.


681-727: LGTM!

The appendReasoningSignatureToResponsesMessage helper mirrors the delta logic and correctly handles signatures in both content blocks and encrypted content. The implementation is consistent with the delta handler.

core/providers/gemini/responses.go (2)

138-179: LGTM!

The function call handling improvements include:

  1. Proper JSON marshaling of function arguments
  2. Creating local copies to avoid range loop variable capture issues
  3. Correctly initializing the new Summary field when emitting reasoning messages for ThoughtSignature

These changes align with the broader schema updates for reasoning summaries.


609-629: LGTM!

The conversion logic correctly:

  1. Sets the function call name and arguments
  2. Propagates the CallID when present
  3. Preserves ThoughtSignature by looking ahead for reasoning messages with encrypted content

This properly handles Gemini 3 Pro's requirement for ThoughtSignature on function calls.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from bcef5b4 to 4f289b9 on December 5, 2025 14:25
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch 2 times, most recently from 4ab2a0a to 10060d1 on December 5, 2025 14:29
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from 4f289b9 to b3244b9 on December 5, 2025 14:29
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from 10060d1 to d6466cb on December 6, 2025 10:05
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from b3244b9 to e04023a on December 6, 2025 10:05
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (9)
core/providers/utils/utils.go (2)

267-267: Past concern remains unaddressed: MarshalIndent increases payload size.

A previous review comment already identified that switching to sonic.MarshalIndent increases the payload size for all provider API requests without documented justification. The code still uses the indented format, and the concern about debug prints and production performance remains valid.


1000-1009: Past suggestions remain unaddressed: Input validation and efficiency improvements.

A previous review comment already provided detailed suggestions for improving this function, including:

  • Adding length validation to prevent panics for length <= 0
  • Using a const string for letters instead of recreating []rune each call
  • Building the result into a []byte buffer instead of []rune

The function works correctly for its current use case (generating cosmetic identifiers), but these improvements would harden it for broader use.
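A hardened variant along the lines the review suggests could look like this (a sketch; randomString is a hypothetical replacement name, and it keeps math/rand since the generated IDs are cosmetic, not security-sensitive):

```go
package main

import (
	"fmt"
	"math/rand"
)

// letters is a package-level const, so no per-call allocation of the alphabet.
const letters = "abcdefghijklmnopqrstuvwxyz0123456789"

// randomString returns a random identifier of length n drawn from letters.
// It guards against non-positive lengths and builds into a []byte buffer
// instead of a []rune, which is sufficient for an ASCII alphabet.
func randomString(n int) string {
	if n <= 0 {
		return ""
	}
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println(len(randomString(8)))
	fmt.Printf("%q\n", randomString(-3)) // guarded: returns ""
}
```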

core/providers/anthropic/types.go (1)

356-356: StopSequence should use *string with omitempty for API compatibility.

This concern was raised in a previous review. Changing StopSequence from *string with omitempty to string without omitempty breaks compatibility with Anthropic's API specification. The API returns stop_sequence as either null (in initial streaming events) or a string value. Using a non-pointer string will serialize empty strings as "stop_sequence": "" instead of properly representing the null state.

Apply this diff to restore API compatibility:

-	StopSequence string                   `json:"stop_sequence"`
+	StopSequence *string                  `json:"stop_sequence,omitempty"`
transports/bifrost-http/integrations/anthropic.go (1)

94-105: Remove commented-out code.

This dead code was flagged in a previous review. Remove it to improve maintainability.

 				} else {
-					// if resp.ExtraFields.Provider == schemas.Anthropic ||
-					// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-					// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-					// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-					// 	if resp.ExtraFields.RawResponse != nil {
-					// 		var rawResponseJSON anthropic.AnthropicStreamDelta
-					// 		err := sonic.Unmarshal([]byte(resp.ExtraFields.RawResponse.(string)), &rawResponseJSON)
-					// 		if err == nil {
-					// 			return string(rawResponseJSON.Type), resp.ExtraFields.RawResponse, nil
-					// 		}
-					// 	}
-					// }
 					if len(anthropicResponse) > 1 {
core/providers/cohere/responses.go (5)

140-172: Empty block returned for invalid image may cause downstream issues.

When ImageURL is nil (lines 150-152), an empty ResponsesMessageContentBlock{} with a zero-value Type is returned. This was flagged in a previous review, but the current fix returns an empty block instead of a sentinel value.

Consider returning a properly typed block or filtering at the call site.


319-327: Nil-pointer dereference risk in reasoning item ID generation.

Line 324 dereferences *state.MessageID before the nil check on line 325. This was flagged in a previous review and remains unaddressed.

-				itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
-				if state.MessageID == nil {
-					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
-				}
+				var itemID string
+				if state.MessageID != nil {
+					itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
+				} else {
+					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
+				}

1131-1142: Tool choice "auto" incorrectly maps to "required".

This was flagged in a previous review. The "auto" tool choice has different semantics than "required" - auto lets the model decide, while required forces a tool call.

Verify Cohere's tool choice options and map "auto" appropriately (possibly to nil for default behavior).


1216-1225: Encrypted reasoning content exposed in plain text marker.

This was flagged in a previous review. Embedding encrypted content in a [ENCRYPTED_REASONING: ...] marker exposes potentially sensitive data in plain text to Cohere.

Consider skipping encrypted content entirely rather than exposing it.


1244-1246: Guard access to embedded CallID to avoid nil panic.

Accessing msg.CallID when msg.ResponsesToolMessage is nil will panic because CallID is a field on the embedded pointer type.

-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
-	}
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
+	}
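The underlying Go gotcha: a field promoted through a nil embedded pointer panics on access, even though the selector compiles. A standalone sketch with toy types (not the actual ResponsesToolMessage):

```go
package main

import "fmt"

type toolMessage struct {
	CallID *string
}

type message struct {
	*toolMessage // embedded pointer; may be nil
}

func main() {
	m := message{} // embedded *toolMessage is nil
	defer func() {
		// m.CallID compiles fine, but evaluating the promoted field
		// dereferences the nil embedded pointer and panics at runtime.
		fmt.Println("panicked:", recover() != nil)
	}()
	_ = m.CallID
}
```

This is why the guard must check the embedded pointer itself (msg.ResponsesToolMessage != nil) before touching any of its promoted fields.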
🧹 Nitpick comments (2)
transports/bifrost-http/integrations/router.go (1)

709-712: Consider tightening the /responses path check.

709-712: The strings.Contains(config.Path, "/responses") check is somewhat broad and could match unintended paths (e.g., a hypothetical /v1/responses_beta). Consider using a more specific check:

-		if config.Type == RouteConfigTypeAnthropic || strings.Contains(config.Path, "/responses") {
+		if config.Type == RouteConfigTypeAnthropic || strings.HasSuffix(config.Path, "/responses") || strings.Contains(config.Path, "/responses/") {
			shouldSendDoneMarker = false
		}

Alternatively, you could add a dedicated flag to StreamConfig to explicitly control DONE marker behavior.
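The difference between the two checks is visible on a few sample paths (a standalone sketch; the path values and helper names are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// broadMatch mirrors the current check: any occurrence of "/responses".
func broadMatch(path string) bool {
	return strings.Contains(path, "/responses")
}

// strictMatch mirrors the suggested check: the path must end with
// "/responses" or contain it as a full segment followed by more path.
func strictMatch(path string) bool {
	return strings.HasSuffix(path, "/responses") || strings.Contains(path, "/responses/")
}

func main() {
	for _, p := range []string{"/v1/responses", "/v1/responses/123", "/v1/responses_beta"} {
		fmt.Printf("%-22s broad=%-5v strict=%v\n", p, broadMatch(p), strictMatch(p))
	}
}
```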

core/providers/gemini/responses.go (1)

143-145: Inconsistent JSON library usage.

The code uses json.Marshal here while the rest of the file uses sonic for JSON operations. This inconsistency could lead to subtle serialization differences.

-				if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil {
+				if argsBytes, err := sonic.Marshal(part.FunctionCall.Args); err == nil {
 					argumentsStr = string(argsBytes)
 				}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bcef5b4 and e04023a.

📒 Files selected for processing (18)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (4 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
💤 Files with no reviewable changes (1)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
🚧 Files skipped from review as they are similar to previous changes (6)
  • core/providers/vertex/errors.go
  • core/providers/openai/responses.go
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/providers/anthropic/errors.go
  • ui/lib/types/logs.ts
  • transports/bifrost-http/handlers/inference.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/schemas/bifrost.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/providers/anthropic/chat.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/anthropic/types.go
  • core/providers/openai/types.go
  • transports/bifrost-http/integrations/router.go
  • framework/streaming/responses.go
  • core/providers/cohere/responses.go
  • core/providers/utils/utils.go
🧬 Code graph analysis (7)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (5)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • Part (936-960)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (5)
  • Anthropic (37-37)
  • Vertex (40-40)
  • BifrostContextKeyUseRawRequestBody (117-117)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamResponse (794-1192)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (513-520)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/gemini/types.go (1)
  • Type (778-778)
framework/streaming/responses.go (3)
core/schemas/responses.go (6)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
core/schemas/utils.go (1)
  • Ptr (16-18)
framework/streaming/accumulator.go (1)
  • Accumulator (14-30)
core/providers/utils/utils.go (1)
core/schemas/bifrost.go (2)
  • BifrostError (356-365)
  • ErrorField (374-381)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (26)
core/providers/utils/utils.go (2)

10-10: LGTM: Import supports the new utility function.

The math/rand import is required for the GetRandomString function added below. Since the function generates cosmetic identifiers (not security-sensitive tokens), this import is appropriate.


322-334: LGTM: Good defensive addition for compressed error responses.

Adding CheckAndDecodeBody before unmarshaling ensures that compressed (e.g., gzip) error responses from provider APIs are properly decoded. The error handling correctly returns a BifrostError with status code if decoding fails, and the decoded body is appropriately used for subsequent unmarshaling.

core/providers/anthropic/types.go (1)

135-145: LGTM - New content block type for redacted thinking.

The addition of AnthropicContentBlockTypeRedactedThinking aligns with Anthropic's extended thinking feature where thinking blocks may be redacted. The Data field addition on line 153 properly supports encrypted data for these redacted thinking blocks.

core/schemas/bifrost.go (1)

120-120: LGTM - New context key for integration type.

The new BifrostContextKeyIntegrationType constant enables routing logic to identify the integration type (OpenAI, Anthropic, etc.) in the request context. This supports the conditional DONE marker behavior in handleStreaming.

core/providers/openai/types.go (1)

154-192: LGTM - Custom marshaling excludes MaxTokens for OpenAI.

The MarshalJSON implementation correctly strips MaxTokens from the Reasoning field before serialization, as OpenAI's Responses API doesn't support this parameter (it's Anthropic-specific per the schema comment). The approach of manually copying fields ensures the original request remains unchanged.

Note: If ResponsesParametersReasoning gains new fields in the future, this method will need to be updated to copy them as well.

transports/bifrost-http/integrations/router.go (2)

312-313: LGTM - Integration type stored in context.

Setting the integration type in the context enables downstream logic to conditionally handle provider-specific behaviors like DONE marker emission.


883-885: LGTM - Extended SSE prefix handling.

The condition now correctly handles both "data: " and "event: " prefixed strings, allowing providers that return complete SSE-formatted strings to pass through without double-wrapping.

transports/bifrost-http/integrations/anthropic.go (3)

74-82: LGTM - Extended provider check for Vertex with Anthropic models.

The condition now correctly handles both direct Anthropic requests and Vertex requests using Anthropic models (claude-*), returning raw responses when available.


106-122: LGTM - Multi-event aggregation for streaming responses.

The logic correctly handles cases where ToAnthropicResponsesStreamResponse returns multiple events by aggregating them into a single SSE-formatted string with proper event: and data: lines. Single events are returned directly for more efficient handling.


194-206: Empty provider assumption and OAuth key skipping are correct.

The code's assumption that provider == "" means Anthropic passthrough is reasonable given this is the /anthropic/v1/messages endpoint. The BifrostContextKeySkipKeySelection flag is intentionally set for OAuth flows (detected by the Bearer sk-ant-oat* token in isAnthropicAPIKeyAuth), not API key auth. Anthropic is in the allowed list for key skipping (unlike Azure, Bedrock, and Vertex), so passing an empty key to the provider for OAuth flows is the intended behavior and is properly guarded.

core/providers/anthropic/chat.go (1)

608-634: No issues found with empty PartialJSON handling.

The change to emit tool input deltas whenever PartialJSON is non-nil is safe. Downstream code in framework/streaming/accumulator.go explicitly handles empty string Arguments through string concatenation (line 267), which safely accumulates empty strings without issues. The accumulator also includes special handling for edge cases like empty braces (line 247-248), confirming the code is prepared for empty Arguments values during streaming aggregation.

core/providers/gemini/responses.go (3)

148-164: LGTM - Good defensive copy pattern.

The code correctly creates local copies of functionCallID and functionCallName to avoid potential issues with range variable capture when these values are used in pointers.


166-179: ThoughtSignature preservation for Gemini 3 Pro looks correct.

The logic to emit a separate ResponsesReasoning message when ThoughtSignature is present ensures the signature can be round-tripped. The Summary field is correctly initialized as an empty slice.


619-627: The look-ahead logic is correct; the reasoning message is always emitted immediately after the function call.

In convertGeminiCandidatesToResponsesOutput, when a function call part with a ThoughtSignature is processed, the reasoning message is appended directly after the function call message within the same case block (lines 167–178). There is no opportunity for intervening messages between them, as the loop processes individual parts and appends complete function-call-plus-reasoning pairs sequentially to the messages array.

framework/streaming/responses.go (3)

497-534: LGTM - Well-structured reasoning delta handling.

The new ReasoningSummaryTextDelta case correctly:

  1. Guards against nil Delta/Signature with ItemID check
  2. Searches backwards for existing message by ID
  3. Creates new reasoning message if not found
  4. Handles both text delta and signature delta

626-679: Clear dual-path logic for reasoning delta accumulation.

The helper correctly branches on contentIndex:

  • With index: accumulates into content blocks (reasoning_text type)
  • Without index: accumulates into ResponsesReasoning.Summary

The comment on lines 667-668 acknowledges future extensibility for multiple summary entries.


681-727: Signature helper mirrors delta helper pattern.

The appendReasoningSignatureToResponsesMessage follows the same dual-path logic as the delta helper, storing signatures either in content blocks (Signature field) or in ResponsesReasoning.EncryptedContent. This is consistent with the schema design.

core/schemas/responses.go (4)

68-68: LGTM - StopReason field addition.

The StopReason field is properly documented as not part of OpenAI's spec but needed for other providers. The omitempty tag ensures it won't appear in responses when not set.


399-402: Signature field enables reasoning content signing.

The new Signature field on ResponsesMessageContentBlock supports the reasoning signature streaming feature added in the streaming layer. The field ordering and JSON tag are correct.


729-747: ResponsesReasoning schema update aligns with streaming changes.

The Summary field now uses []ResponsesReasoningSummary with the new struct definition. This aligns with:

  • The streaming helper that appends to Summary[0].Text
  • The UI type definition (ResponsesReasoningSummary with type: "summary_text")
  • The Gemini conversion that initializes Summary: []schemas.ResponsesReasoningSummary{}

1439-1441: Stream response Signature field added.

The Signature field on BifrostResponsesStreamResponse enables streaming reasoning signatures, used by the appendReasoningSignatureToResponsesMessage helper. The comment correctly notes this is not in OpenAI's spec.

core/providers/cohere/responses.go (5)

17-17: LGTM - ReasoningContentIndices state tracking.

The new ReasoningContentIndices map correctly tracks which content indices are reasoning blocks, enabling proper event emission (reasoning vs text) during streaming. Initialization and cleanup follow the established pattern for other state maps.

Also applies to: 34-34, 64-68, 106-110


850-932: ToCohereResponsesRequest implementation looks correct.

The conversion properly:

  • Maps basic parameters (MaxOutputTokens, Temperature, TopP)
  • Extracts extra params (top_k, stop, frequency_penalty, presence_penalty, thinking)
  • Converts tools and tool choice
  • Delegates message conversion to ConvertBifrostMessagesToCohereMessages

977-1112: ConvertBifrostMessagesToCohereMessages handles complex message flows.

The function correctly:

  • Collects system messages separately
  • Tracks pending reasoning blocks to attach to assistant messages
  • Handles function calls and outputs
  • Flushes pending state at message boundaries

The logic for accumulating reasoning blocks before assistant content is particularly well-structured.


1303-1429: convertSingleCohereMessageToBifrostMessages comprehensive conversion.

The function properly:

  • Separates reasoning blocks from regular content
  • Prepends reasoning message to output
  • Handles tool calls with nil safety checks (lines 1389-1392)
  • Generates stable IDs using timestamps

505-511: Same nil-dereference pattern in tool plan ID generation.

Lines 507-511 have the same issue - dereferencing state.MessageID before checking for nil.

-				var itemID string
-				if state.MessageID == nil {
-					itemID = fmt.Sprintf("item_%d", outputIndex)
-				} else {
-					itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex)
-				}
+				var itemID string
+				if state.MessageID != nil {
+					itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex)
+				} else {
+					itemID = fmt.Sprintf("item_%d", outputIndex)
+				}

Likely an incorrect or invalid review comment.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from e04023a to bb6c8dc on December 8, 2025 08:24
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 15

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
core/providers/openai/responses.go (1)

109-137: Add a comment linking to OpenAI's API documentation as the source of truth for supported tools.

The supportedTypes allowlist is accurate (all included tools are supported by the OpenAI Responses API), but lacks documentation. Add a comment with a link to OpenAI's official tools documentation to clarify the source of this list and make it easier to update when OpenAI adds or removes tool support. This also addresses the concern about centralizing the allowlist—if this list is part of a broader tool support matrix across providers, consider extracting it to a shared configuration.

transports/bifrost-http/integrations/router.go (1)

879-886: SSE “already formatted” detection is too strict; accept data:/event: without requiring a space.
Right now (lines 883-885) a valid SSE line starting with data: (no space) would be wrapped again, producing malformed output — the SSE spec treats the space after the field name as optional.

-					if !strings.HasPrefix(sseString, "data: ") && !strings.HasPrefix(sseString, "event: ") {
+					if !strings.HasPrefix(sseString, "data:") && !strings.HasPrefix(sseString, "event:") {
 						sseString = fmt.Sprintf("data: %s\n\n", sseString)
 					}
transports/bifrost-http/handlers/inference.go (1)

91-118: Add missing known fields (notably "user") to responsesParamsKnownFields to avoid polluting ExtraParams.
schemas.ResponsesParameters includes User, but "user" isn’t listed here—so it can end up duplicated into ExtraParams and potentially forwarded as an unknown provider param.

 var responsesParamsKnownFields = map[string]bool{
@@
 	"tool_choice":          true,
 	"tools":                true,
 	"truncation":           true,
+	"user":                 true,
 }
core/providers/bedrock/bedrock.go (1)

1034-1091: Set stream-end indicator on provider “exception” frames too (not just Decode errors).

In ResponsesStream, the new ctx = context.WithValue(ctx, schemas.BifrostContextKeyStreamEndIndicator, true) on decode errors (Line 1070) and the EOF-finalization last chunk (Line 1059) is good, but the msgType != "event" early-return path (Lines 1078-1090) still terminates without setting the end indicator.

Suggested patch:

 					if msgType := msgTypeHeader.String(); msgType != "event" {
 						excType := msgType
 						if excHeader := message.Headers.Get(":exception-type"); excHeader != nil {
 							if v := excHeader.String(); v != "" {
 								excType = v
 							}
 						}
 						errMsg := string(message.Payload)
 						err := fmt.Errorf("%s stream %s: %s", providerName, excType, errMsg)
+						ctx = context.WithValue(ctx, schemas.BifrostContextKeyStreamEndIndicator, true)
 						providerUtils.ProcessAndSendError(ctx, postHookRunner, err, responseChan, schemas.ResponsesStreamRequest, providerName, request.Model, provider.logger)
 						return
 					}
core/schemas/responses.go (1)

45-85: Fix doc comment typos + clarify “non-OpenAI spec” fields.
Minor, but these comments are public-facing and currently read “Not is OpenAI’s spec”.

♻️ Duplicate comments (3)
core/internal/testutil/chat_completion_stream.go (1)

535-556: Cerebras prompt references tools but none are configured.

The Cerebras-specific prompt instructs the model to "use your tools for this", but the request configuration at lines 544-556 doesn't include any tools. This mismatch will likely cause test failures or unexpected model behavior for Cerebras.

Consider using a reasoning-focused prompt without tool references, consistent with the default mathematical problem:

 			if testConfig.Provider == schemas.Cerebras {
-				problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this"
+				problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?"
 			}
core/providers/bedrock/utils.go (1)

115-121: Don’t overwrite AdditionalModelRequestFields (clobbers reasoning_config).

bedrockReq.AdditionalModelRequestFields = orderedFields (Line 119) can discard reasoning_config set earlier (and any other pre-populated fields). Merge instead.

Suggested patch:

-				if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
-					bedrockReq.AdditionalModelRequestFields = orderedFields
-				}
+				if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
+					if bedrockReq.AdditionalModelRequestFields == nil {
+						bedrockReq.AdditionalModelRequestFields = make(schemas.OrderedMap)
+					}
+					for k, v := range orderedFields {
+						// Preserve already-derived/provider-required fields (e.g., reasoning_config)
+						if _, exists := bedrockReq.AdditionalModelRequestFields[k]; exists {
+							continue
+						}
+						bedrockReq.AdditionalModelRequestFields[k] = v
+					}
+				}
core/providers/cohere/responses.go (1)

1284-1302: Don’t embed encrypted reasoning into plain text (reintroduces prior security issue).
This leaks EncryptedContent to Cohere by wrapping it in a readable marker.

@@
 		} else if msg.ResponsesReasoning.EncryptedContent != nil {
-			// Cohere doesn't have a direct equivalent to encrypted content,
-			// so we'll store it as a regular thinking block with a special marker
-			encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-			thinkingBlock := CohereContentBlock{
-				Type:     CohereContentBlockTypeThinking,
-				Thinking: &encryptedText,
-			}
-			thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+			// Cohere doesn't support encrypted reasoning; keep it opaque.
+			// Intentionally dropping rather than exfiltrating into text.
 		}
🧹 Nitpick comments (9)
core/internal/testutil/reasoning.go (1)

413-419: Consider removing unused min function.

This helper function appears unused in the file - length calculations are done inline throughout (e.g., lines 100-103, 146-149, 293-296, 340-343, 366-369). Additionally, Go 1.21+ provides a built-in min function.

#!/bin/bash
# Verify if min function is used anywhere in this file or if it's imported elsewhere
rg -n "min\(" core/internal/testutil/reasoning.go
# Check if other files in testutil import or reference this min
rg -n "testutil\.min" core/
core/providers/openai/types.go (1)

11-13: Central constant is good—ensure it’s used consistently (and justified).
MinMaxCompletionTokens being a single source of truth helps; please ensure other request paths don’t reintroduce different minima (esp. in this Graphite stack).

core/providers/openai/responses.go (1)

56-80: Avoid partial field-copy when converting summaries → content blocks (risk of silently dropping message fields).
Instead of manually copying ID/Type/Status/Role into newMessage, prefer newMessage := message (then overwrite Content and any fields you intentionally want cleared). This reduces the chance of losing fields if schemas.ResponsesMessage evolves.

core/providers/vertex/errors.go (1)

13-28: Good: decode body once; avoid vertexErr shadowing to reduce confusion.
The decoded-body approach is a solid robustness win. Minor readability: vertexErr is both []VertexError and later VertexError (Line 12 vs Line 23), which is easy to misread during maintenance—rename one of them (e.g., vertexErrList / vertexErrSingle).

core/providers/cohere/chat.go (1)

104-128: Consider validating/clamping Thinking.TokenBudget to Cohere’s minimum when Reasoning.MaxTokens is provided.
If bifrostReq.Params.Reasoning.MaxTokens is set but is <= 0, you’ll emit an invalid Cohere request (token_budget is typically expected to be ≥ 1). A small guard/clamp to MinimumReasoningMaxTokens would make this more robust.

transports/bifrost-http/integrations/router.go (1)

312-314: Schema comment for BifrostContextKeyIntegrationType could be clearer.

The schema annotation says // RouteConfigType, but the actual stored value is a string (via string(config.Type) at line 313). Since RouteConfigType is a string type alias and all downstream readers (e.g., anthropic/responses.go) use string comparisons rather than type assertions, consider updating the schema comment to // string or // string (RouteConfigType) for accuracy.

core/providers/anthropic/chat.go (1)

602-629: Avoid emitting empty tool-arguments deltas; use schemas.Ptr for "function".
Right now you’ll emit chunks when PartialJSON is "", and the pointer closure is inconsistent with the rest of the file.

@@
 			case AnthropicStreamDeltaTypeInputJSON:
 				// Handle tool use streaming - accumulate partial JSON
-				if chunk.Delta.PartialJSON != nil {
+				if chunk.Delta.PartialJSON != nil && *chunk.Delta.PartialJSON != "" {
@@
 											{
-												Type: func() *string { s := "function"; return &s }(),
+												Type: schemas.Ptr("function"),
 												Function: schemas.ChatAssistantMessageToolCallFunction{
 													Arguments: *chunk.Delta.PartialJSON,
 												},
 											},
transports/bifrost-http/integrations/anthropic.go (1)

230-238: Defensive guard: ctx pointer can be nil.
(*ctx).Value(...) will panic if a caller ever passes nil (or passes a nil *context.Context).

 func shouldUsePassthrough(ctx *context.Context, provider schemas.ModelProvider, model string, deployment string) bool {
+	if ctx == nil {
+		return false
+	}
 	isClaudeCode := false
 	if userAgent, ok := (*ctx).Value(schemas.BifrostContextKeyUserAgent).(string); ok {
core/providers/cohere/responses.go (1)

1202-1223: Avoid defaulting unknown tool_choice strings to required.
Forcing tool calls on unknown values is a surprising behavior change; returning nil (provider default) is usually safer.

@@
 		case "auto":
 			choice := ToolChoiceAuto
 			return &choice
 		default:
-			choice := ToolChoiceRequired
-			return &choice
+			return nil
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63a0c31 and ac6bbd4.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (55)
  • core/internal/testutil/account.go (1 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/reasoning.go (4 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/internal/testutil/tests.go (2 hunks)
  • core/providers/anthropic/anthropic.go (3 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (8 hunks)
  • core/providers/anthropic/utils.go (2 hunks)
  • core/providers/azure/azure.go (2 hunks)
  • core/providers/azure/utils.go (1 hunks)
  • core/providers/bedrock/bedrock.go (2 hunks)
  • core/providers/bedrock/bedrock_test.go (13 hunks)
  • core/providers/bedrock/types.go (2 hunks)
  • core/providers/bedrock/utils.go (2 hunks)
  • core/providers/cerebras/cerebras_test.go (2 hunks)
  • core/providers/cohere/chat.go (3 hunks)
  • core/providers/cohere/cohere.go (2 hunks)
  • core/providers/cohere/cohere_test.go (1 hunks)
  • core/providers/cohere/responses.go (10 hunks)
  • core/providers/cohere/types.go (1 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/groq/groq_test.go (2 hunks)
  • core/providers/mistral/mistral_test.go (1 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/openrouter/openrouter_test.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/providers/vertex/types.go (0 hunks)
  • core/providers/vertex/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (3 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/mux.go (0 hunks)
  • core/schemas/responses.go (5 hunks)
  • core/schemas/utils.go (1 hunks)
  • docs/docs.json (0 hunks)
  • framework/configstore/rdb.go (0 hunks)
  • framework/streaming/chat.go (0 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (5 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)
💤 Files with no reviewable changes (6)
  • core/schemas/mux.go
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
  • framework/configstore/rdb.go
  • framework/streaming/chat.go
  • core/providers/vertex/types.go
  • docs/docs.json
✅ Files skipped from review due to trivial changes (1)
  • transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (22)
  • ui/app/workspace/logs/views/columns.tsx
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/providers/groq/groq_test.go
  • core/schemas/utils.go
  • core/providers/anthropic/errors.go
  • core/providers/vertex/vertex_test.go
  • core/providers/openai/text.go
  • ui/app/workspace/logs/views/logDetailsSheet.tsx
  • core/providers/openai/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/cohere/cohere_test.go
  • core/providers/utils/utils.go
  • core/providers/cerebras/cerebras_test.go
  • core/providers/mistral/mistral_test.go
  • framework/streaming/responses.go
  • core/providers/anthropic/utils.go
  • core/providers/bedrock/types.go
  • core/providers/openrouter/openrouter_test.go
  • core/providers/cohere/types.go
  • core/internal/testutil/tests.go
  • ui/package.json
  • core/providers/bedrock/bedrock_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/chat.go
  • ui/lib/types/logs.ts
  • transports/bifrost-http/integrations/router.go
  • core/providers/azure/utils.go
  • core/providers/vertex/utils.go
  • core/providers/vertex/errors.go
  • core/providers/openai/responses.go
  • core/providers/azure/azure.go
  • core/providers/bedrock/utils.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/responses.go
  • core/schemas/responses.go
  • core/providers/bedrock/bedrock.go
  • core/internal/testutil/responses_stream.go
  • core/internal/testutil/account.go
  • core/providers/cohere/cohere.go
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/anthropic/chat.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/internal/testutil/reasoning.go
  • core/providers/openai/types.go
  • core/schemas/bifrost.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/anthropic.go
  • core/providers/cohere/responses.go
  • transports/bifrost-http/handlers/inference.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/chat.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/azure/utils.go
  • core/providers/vertex/utils.go
  • core/providers/vertex/errors.go
  • core/providers/openai/responses.go
  • core/providers/azure/azure.go
  • core/providers/bedrock/utils.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/responses.go
  • core/schemas/responses.go
  • core/providers/bedrock/bedrock.go
  • core/internal/testutil/responses_stream.go
  • core/internal/testutil/account.go
  • core/providers/cohere/cohere.go
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/anthropic/chat.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/internal/testutil/reasoning.go
  • core/providers/openai/types.go
  • core/schemas/bifrost.go
  • core/providers/anthropic/types.go
  • core/providers/anthropic/anthropic.go
  • core/providers/cohere/responses.go
  • transports/bifrost-http/handlers/inference.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/chat.go
  • core/providers/openai/responses.go
  • core/providers/openai/types.go
🧬 Code graph analysis (18)
core/providers/openai/chat.go (5)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/providers/openai/types.go (1)
  • MinMaxCompletionTokens (12-12)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
  • ResponsesReasoningSummary (745-748)
  • ResponsesReasoning (731-734)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/azure/utils.go (6)
core/schemas/responses.go (1)
  • BifrostResponsesRequest (32-39)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (2)
  • ErrRequestBodyConversion (25-25)
  • ErrProviderRequestMarshal (26-26)
core/providers/anthropic/types.go (1)
  • AnthropicDefaultMaxTokens (14-14)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesRequest (1419-1529)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/vertex/utils.go (7)
core/schemas/responses.go (1)
  • BifrostResponsesRequest (32-39)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (2)
  • ErrRequestBodyConversion (25-25)
  • ErrProviderRequestMarshal (26-26)
core/providers/anthropic/types.go (1)
  • AnthropicDefaultMaxTokens (14-14)
core/providers/vertex/types.go (1)
  • DefaultVertexAnthropicVersion (8-8)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesRequest (1419-1529)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/azure/azure.go (4)
core/schemas/bifrost.go (1)
  • BifrostError (358-367)
ui/lib/types/logs.ts (1)
  • BifrostError (226-232)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/utils/utils.go (1)
  • CheckContextAndGetRequestBody (257-275)
core/providers/cohere/chat.go (5)
core/providers/cohere/types.go (5)
  • CohereThinking (174-177)
  • ThinkingTypeEnabled (183-183)
  • DefaultCompletionMaxTokens (12-12)
  • MinimumReasoningMaxTokens (11-11)
  • ThinkingTypeDisabled (184-184)
core/providers/bedrock/types.go (2)
  • DefaultCompletionMaxTokens (13-13)
  • MinimumReasoningMaxTokens (12-12)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/providers/anthropic/types.go (1)
  • MinimumReasoningMaxTokens (15-15)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (403-409)
  • ResponsesMessage (423-438)
  • ResponsesMessageContent (400-400)
  • ResponsesReasoning (417-420)
  • ResponsesReasoningSummary (412-415)
core/providers/gemini/types.go (5)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • Part (936-960)
core/schemas/responses.go (5)
  • ResponsesToolMessage (462-482)
  • ResponsesMessage (314-327)
  • ResponsesMessageContent (339-344)
  • ResponsesReasoning (731-734)
  • ResponsesReasoningSummary (745-748)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
core/providers/bedrock/bedrock.go (1)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStreamEndIndicator (113-113)
core/internal/testutil/responses_stream.go (1)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/cohere/cohere.go (1)
core/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (892-1006)
core/providers/anthropic/chat.go (6)
core/providers/anthropic/types.go (2)
  • AnthropicThinking (69-72)
  • MinimumReasoningMaxTokens (15-15)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/providers/bedrock/types.go (1)
  • MinimumReasoningMaxTokens (12-12)
core/providers/cohere/types.go (1)
  • MinimumReasoningMaxTokens (11-11)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
transports/bifrost-http/integrations/anthropic.go (2)
core/schemas/bifrost.go (2)
  • BifrostContextKeyUserAgent (123-123)
  • Anthropic (37-37)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/internal/testutil/reasoning.go (8)
core/internal/testutil/account.go (1)
  • ComprehensiveTestConfig (47-64)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/bifrost.go (3)
  • OpenAI (35-35)
  • BifrostError (358-367)
  • ChatCompletionRequest (89-89)
core/schemas/chatcompletions.go (5)
  • ChatMessage (469-478)
  • BifrostChatRequest (12-19)
  • ChatParameters (155-184)
  • BifrostChatResponse (26-41)
  • ChatAssistantMessage (626-632)
core/internal/testutil/utils.go (1)
  • CreateBasicChatMessage (247-254)
core/schemas/models.go (1)
  • Model (109-129)
core/internal/testutil/test_retry_framework.go (4)
  • GetTestRetryConfigForScenario (1116-1148)
  • TestRetryContext (168-173)
  • ChatRetryConfig (186-193)
  • WithChatTestRetry (274-424)
core/internal/testutil/validation_presets.go (2)
  • GetExpectationsForScenario (208-293)
  • ModifyExpectationsForProvider (300-347)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (234-239)
core/providers/cohere/responses.go (4)
core/schemas/responses.go (12)
  • ResponsesMessageContentBlock (399-411)
  • ResponsesMessage (314-327)
  • ResponsesMessageTypeMessage (290-290)
  • ResponsesInputMessageRoleAssistant (332-332)
  • ResponsesMessageContent (339-344)
  • ResponsesMessageTypeReasoning (307-307)
  • ResponsesMessageTypeFunctionCall (295-295)
  • ResponsesResponseUsage (254-261)
  • BifrostResponsesRequest (32-39)
  • ResponsesMessageRoleType (329-329)
  • ResponsesInputMessageRoleUser (333-333)
  • ResponsesInputMessageRoleSystem (334-334)
core/schemas/utils.go (3)
  • Ptr (16-18)
  • SafeExtractStringSlice (437-465)
  • SafeExtractString (280-297)
core/providers/utils/utils.go (2)
  • GetRandomString (1027-1038)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/providers/bedrock/types.go (2)
  • DefaultCompletionMaxTokens (13-13)
  • MinimumReasoningMaxTokens (12-12)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (87-114)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (19)
core/internal/testutil/responses_stream.go (1)

440-446: LGTM - Token limit increase is appropriate for reasoning scenarios.

The increase from 400 to 1800 MaxOutputTokens aligns with the other reasoning test configurations in this PR stack (chat_completion_stream.go and reasoning.go both use 1800). Reasoning models require more tokens to produce step-by-step explanations.

core/internal/testutil/chat_completion_stream.go (3)

358-384: LGTM - Well-structured reasoning streaming test.

The test configuration is consistent with other reasoning tests in the PR stack:

  • Uses 1800 MaxCompletionTokens matching other reasoning scenarios
  • Properly configures ChatReasoning with high effort and 1500 max tokens
  • Gates execution behind Scenarios.Reasoning and ReasoningModel checks

504-518: Appropriate soft validation for provider variability.

The warning-only approach when no reasoning indicators are found is correct. Different providers expose reasoning through different mechanisms (content, details, or tokens), and some may not expose explicit indicators while still performing reasoning internally.


664-679: Good handling of reasoning-only responses.

The validation logic correctly handles the case where reasoning indicators are present but no content chunks were received. This accounts for reasoning models that may process internally without streaming traditional content, avoiding false test failures.

core/internal/testutil/reasoning.go (3)

12-25: LGTM - Clear naming distinction between API variants.

The rename from RunReasoningTest to RunResponsesReasoningTest and the addition of RunChatCompletionReasoningTest clearly distinguish between the two API paths. The test label change to "ResponsesReasoning" maintains consistency with the function name.


348-382: Field access issue is fixed.

The previous review identified that message.ReasoningDetails was incorrectly used instead of message.ChatAssistantMessage.ReasoningDetails. This has been correctly fixed - all accesses now properly go through the ChatAssistantMessage embedded struct with appropriate nil checks.


219-223: Verify whether Groq's reasoning flakiness applies to non-streaming chat completions.

The ChatCompletionReasoning test skips only OpenAI, while ChatCompletionStreamWithReasoningValidated in chat_completion_stream.go (lines 529-531) skips both OpenAI and Groq. The skip comments reference different APIs: "chat completions" vs "stream". If Groq's reasoning is also unstable in non-streaming ChatCompletion calls, add it to the skip list in reasoning.go for consistency. If the flakiness is specific to streaming, the current skip patterns are likely correct.

core/providers/openai/responses.go (1)

95-104: Add focused tests for clamping and user field validation.

User field sanitization is correctly implemented (64-character limit), but MinMaxCompletionTokens = 16 lacks documentation explaining why this value was chosen. Add unit tests covering:

  • Token clamping when MaxOutputTokens falls below 16
  • User field sanitization when exceeding 64 characters
  • User field being dropped (set to nil) when oversized

Verify that MinMaxCompletionTokens = 16 aligns with OpenAI's actual API requirements for the Responses endpoint.

transports/bifrost-http/handlers/inference.go (1)

224-254: Custom ResponsesRequest.UnmarshalJSON matches the ChatRequest pattern; looks correct.
Good separation: BifrostParams → Input union → ResponsesParameters with its custom unmarshaller.

core/providers/cohere/chat.go (1)

440-454: Streaming thinking → Reasoning/ReasoningDetails mapping looks good.
The nil checks are in place and using schemas.Ptr(thinkingText) avoids referencing ephemeral data.

core/internal/testutil/account.go (1)

138-143: Bedrock deployment alias is correctly used in tests.

The "claude-4.5-haiku" alias is actively referenced in core/providers/vertex/vertex_test.go line 31, and the model ID format "global.anthropic.claude-haiku-4-5-20251001-v1:0" follows the correct Bedrock convention, consistent with other claude-4.5 entries in the same Deployments map.

transports/bifrost-http/integrations/router.go (1)

921-926: No issue found. The Warn method signature supports variadic arguments.

The Logger interface at core/schemas/logger.go defines Warn(msg string, args ...any), which accepts printf-style format arguments. Line 923's usage g.logger.Warn("Failed to write SSE done marker: %v", err) is correct and consistent with this signature and widespread patterns throughout the codebase (e.g., server.go:1079, config.go:341, config.go:551).

Likely an incorrect or invalid review comment.

core/providers/cohere/cohere.go (2)

510-518: Good: propagate ToCohereResponsesRequest conversion errors.

The updated CheckContextAndGetRequestBody closure now returns (any, error) from ToCohereResponsesRequest (Line 513), which prevents silent bad request construction when conversions fail.


566-582: Good: stream request construction now fails fast + sets Stream=true only when body exists.

Handling the (reqBody, err) return (Lines 570-573) and then conditionally setting reqBody.Stream (Line 575) avoids nil-deref risk while correctly enabling SSE mode.

core/providers/anthropic/anthropic.go (2)

672-717: Good: centralized request-body construction for Responses.

Using getRequestBodyForResponses (Line 677) and passing the same jsonBody through completeRequest + HandleProviderResponse (Lines 683, 692) should reduce mismatches between “what we send” vs “what we record as raw request”.


719-756: Good: ResponsesStream now shares the same request-body builder (and fails fast).

The streaming path now mirrors the non-streaming body build via getRequestBodyForResponses(..., true) with immediate error return (Lines 726-729), which is cleaner and safer.

core/providers/bedrock/utils.go (1)

35-73: Reasoning budget min-tokens selection looks fixed.

Using MinimumReasoningMaxTokens by default and switching to anthropic.MinimumReasoningMaxTokens only when schemas.IsAnthropicModel(bifrostReq.Model) (Lines 58-60) addresses the earlier cross-provider min-token mismatch.

ui/lib/types/logs.ts (1)

411-435: Type rename + propagation looks consistent.
No issues with the ResponsesReasoningSummary shape or the updated field types.

core/schemas/responses.go (1)

399-411: Schema migration to ResponsesReasoningSummary is complete and consistent across the stack.

No references to the old ResponsesReasoningContent identifier remain. The new ResponsesReasoningSummary type and Signature field are properly integrated across all consumers: backend schemas, provider implementations (Anthropic, Bedrock, Cohere, Gemini), streaming layer, and TypeScript types.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
core/providers/anthropic/types.go (2)

58-139: Fix extra_params being treated as “unknown” and re-captured into ExtraParams.

 var anthropicMessageRequestKnownFields = map[string]bool{
   "model":          true,
   "max_tokens":     true,
   "messages":       true,
   "metadata":       true,
   "system":         true,
   "temperature":    true,
   "top_p":          true,
   "top_k":          true,
   "stop_sequences": true,
   "stream":         true,
   "tools":          true,
   "tool_choice":    true,
   "mcp_servers":    true,
   "thinking":       true,
   "output_format":  true,
+  "extra_params":   true,
   "fallbacks":      true,
 }

160-196: Reset the other representation in AnthropicContent.UnmarshalJSON to avoid stale state.

 func (mc *AnthropicContent) UnmarshalJSON(data []byte) error {
+    // defensive reset (in case mc is reused)
+    mc.ContentStr = nil
+    mc.ContentBlocks = nil
+
     // First, try to unmarshal as a direct string
     var stringContent string
     if err := sonic.Unmarshal(data, &stringContent); err == nil {
         mc.ContentStr = &stringContent
        return nil
     }

     // Try to unmarshal as a direct array of ContentBlock
     var arrayContent []AnthropicContentBlock
     if err := sonic.Unmarshal(data, &arrayContent); err == nil {
         mc.ContentBlocks = arrayContent
         return nil
     }
core/providers/bedrock/utils.go (1)

116-121: AdditionalModelRequestFields assignment overwrites reasoning_config.

The extraction of additionalModelRequestFieldPaths at line 119 replaces the entire AdditionalModelRequestFields map, discarding any reasoning_config set earlier (lines 40-73). If both reasoning and extra params are provided, reasoning will be lost.

Consider merging instead of overwriting:

 			if requestFields, exists := bifrostReq.Params.ExtraParams["additionalModelRequestFieldPaths"]; exists {
 				if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
-					bedrockReq.AdditionalModelRequestFields = orderedFields
+					if bedrockReq.AdditionalModelRequestFields == nil {
+						bedrockReq.AdditionalModelRequestFields = make(schemas.OrderedMap)
+					}
+					for k, v := range orderedFields {
+						bedrockReq.AdditionalModelRequestFields[k] = v
+					}
 				}
 			}
core/schemas/responses.go (1)

45-85: Fix typos in spec comments (“Not is” → “Not in”).

@@
-	StopReason         *string                             `json:"stop_reason,omitempty"` // Not is OpenAI's spec, but sent by other providers
+	StopReason         *string                             `json:"stop_reason,omitempty"` // Not in OpenAI's spec, but sent by other providers
framework/streaming/responses.go (2)

14-157: Deep-copy must include Signature (streamed signatures will be dropped for non-OpenAI providers).

@@
 	if original.Delta != nil {
 		copyDelta := *original.Delta
 		copy.Delta = &copyDelta
 	}
+
+	if original.Signature != nil {
+		copySignature := *original.Signature
+		copy.Signature = &copySignature
+	}

380-424: Deep-copy content blocks should copy FileID and Signature at minimum (new schema fields).

@@
 func deepCopyResponsesMessageContentBlock(original schemas.ResponsesMessageContentBlock) schemas.ResponsesMessageContentBlock {
 	copy := schemas.ResponsesMessageContentBlock{
 		Type: original.Type,
 	}
+
+	if original.FileID != nil {
+		v := *original.FileID
+		copy.FileID = &v
+	}
+
+	if original.Signature != nil {
+		v := *original.Signature
+		copy.Signature = &v
+	}
@@
 	if original.Text != nil {
 		copyText := *original.Text
 		copy.Text = &copyText
 	}
♻️ Duplicate comments (4)
core/providers/mistral/mistral_test.go (1)

33-50: Reasoning: false + comment is clear and matches the intent (implementation not wired yet).

core/internal/testutil/chat_completion_stream.go (1)

529-538: Cerebras prompt still requests tool usage but no tools are configured in the test request.

@@
-			if testConfig.Provider == schemas.Cerebras {
-				problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this"
-			}
+			if testConfig.Provider == schemas.Cerebras {
+				problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?"
+			}
core/providers/cohere/responses.go (2)

1270-1306: Do not forward EncryptedContent as plaintext payload to Cohere.
Even if the bytes are “encrypted”, embedding them in a regular thinking block leaks the opaque blob cross-provider (and into logs/telemetry). Prefer skipping (or strictly gating behind an explicit “allowEncryptedReasoningPassthrough” flag).

 } else if msg.ResponsesReasoning != nil {
   if msg.ResponsesReasoning.Summary != nil {
     ...
-  } else if msg.ResponsesReasoning.EncryptedContent != nil {
-    // Cohere doesn't have a direct equivalent to encrypted content,
-    // so we'll store it as a regular thinking block with a special marker
-    encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-    thinkingBlock := CohereContentBlock{
-      Type:     CohereContentBlockTypeThinking,
-      Thinking: &encryptedText,
-    }
-    thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+  } else if msg.ResponsesReasoning.EncryptedContent != nil {
+    // Cohere doesn't support encrypted reasoning; keep it opaque and do not forward.
   }
 }

1201-1223: Remove unsupported "auto" case and change default to nil.

Cohere's Chat API officially supports only two tool_choice values: REQUIRED and NONE. The "auto" case is not supported by Cohere and should be removed. Defaulting unknown values to ToolChoiceRequired silently forces tool calls; return nil to defer to the provider's default behavior.

   case "auto":
-    choice := ToolChoiceAuto
-    return &choice
+    return nil
   default:
-    choice := ToolChoiceRequired
-    return &choice
+    return nil
🧹 Nitpick comments (12)
ui/app/workspace/logs/views/columns.tsx (1)

40-45: Transcription logs lose distinguishing context (constant “Audio file”).
If log.transcription_input.prompt was a safe label, consider keeping a truncated version (or filename/metadata) so rows aren’t all identical; otherwise add a short comment clarifying this is intentional for privacy/standardization.

 } else if (log?.transcription_input) {
-  return "Audio file";
+  const label = log.transcription_input.prompt?.trim();
+  return label ? label.slice(0, 120) : "Audio file";
 }
core/providers/azure/azure.go (1)

650-653: Consistent with the Responses method refactoring.

The use of getRequestBodyForAnthropicResponses with the streaming flag set to true is appropriate for the streaming endpoint. Error handling with early return is correct.

For consistency across the codebase, consider whether ChatCompletionStream (lines 468-485) should also be refactored to use a similar helper pattern, since it currently uses inline logic for Anthropic request construction.

core/providers/bedrock/types.go (1)

12-14: Nit: the DefaultCompletionMaxTokens comment suggests it is "not passed in body"; consider moving that note to the budgeting helper rather than the provider types.

transports/bifrost-http/integrations/anthropic.go (1)

90-126: Avoid O(n²) string concatenation when combining multiple stream events.

-                        if len(anthropicResponse) > 1 {
-                            combinedContent := ""
+                        if len(anthropicResponse) > 1 {
+                            var b strings.Builder
                             for _, event := range anthropicResponse {
                                 responseJSON, err := sonic.Marshal(event)
                                 if err != nil {
                                     // Log JSON marshaling error but continue processing (should not happen)
                                     log.Printf("Failed to marshal streaming response: %v", err)
                                     continue
                                 }
-                                combinedContent += fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON)
+                                fmt.Fprintf(&b, "event: %s\ndata: %s\n\n", event.Type, responseJSON)
                             }
-                            return "", combinedContent, nil
+                            return "", b.String(), nil
                         }

(This needs the strings import, which the file already has.)

core/internal/testutil/reasoning.go (1)

12-116: Token budget bump to 1800 in reasoning tests is fine, but consider making it provider/config-driven to limit CI cost.

transports/bifrost-http/handlers/inference.go (1)

236-244: Nit: comment says “Unmarshal messages” but the field is input.

core/internal/testutil/chat_completion_stream.go (1)

359-521: Consider making the non-validated reasoning stream test explicitly “smoke” (or assert indicators for non-flaky providers).
Right now it only logs when no reasoning indicators are found; if this is meant to catch regressions, consider failing when testConfig.Provider is known-stable.

framework/streaming/responses.go (1)

498-534: ReasoningSummaryTextDelta routing and message creation flow looks reasonable.
Minor: when creating newMessage, consider copying resp.ItemID value into a fresh pointer (to avoid retaining pointers from pooled/aliased stream objects).

core/providers/openai/responses.go (1)

42-84: Add explicit contract documentation and tests for the reasoning block conversion logic.

The code assumes that when ResponsesReasoning != nil and content blocks exist, those blocks are reasoning-only for non-oss models. While the OpenAI Responses API documentation describes reasoning output items and content parts with specific types, it does not explicitly guarantee that content blocks are reasoning-only when both reasoning and content coexist. Add inline comments explaining:

  1. What OpenAI format behavior this code depends on (e.g., that reasoning content must be converted/skipped based on model type)
  2. Which models support/reject reasoning_text content blocks (oss vs non-oss distinction)

Include unit tests covering both the skip path (no summary, has content blocks, non-oss model) and the convert path (has summary, no content blocks, oss model) to prevent silent breakage if the contract assumption changes.

core/providers/cohere/responses.go (3)

141-173: Fallback content-block mapping likely produces wrong “input_*” types (and odd text payload).
convertCohereContentBlockToBifrost is used for response conversion, but the fallback path returns Type: input_text and sets Text to the block type string (not content). This can violate the output schema / confuse consumers.

Suggested tweak (keep behavior but make it clearly “output text” + a readable marker):

 default:
-    // Fallback to text block
+    // Fallback to output text block
     return schemas.ResponsesMessageContentBlock{
-        Type: schemas.ResponsesInputMessageContentBlockTypeText,
-        Text: schemas.Ptr(string(cohereBlock.Type)),
+        Type: schemas.ResponsesOutputMessageContentTypeText,
+        Text: schemas.Ptr("[unsupported_cohere_block_type:" + string(cohereBlock.Type) + "]"),
     }

499-506: Duplicate comment line.
Two consecutive // Generate stable ID for text item comments. Drop one for readability.


1314-1323: Use the guarded field for ID assignment (clarity + avoids future footguns).
You guard on msg.ResponsesToolMessage.CallID but assign toolCall.ID = msg.CallID (promoted field). It’s equivalent today, but coupling to promotion rules is brittle.

 if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
-  toolCall.ID = msg.CallID
+  toolCall.ID = msg.ResponsesToolMessage.CallID
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63a0c31 and ac6bbd4.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (55)
  • core/internal/testutil/account.go (1 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/reasoning.go (4 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/internal/testutil/tests.go (2 hunks)
  • core/providers/anthropic/anthropic.go (3 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (8 hunks)
  • core/providers/anthropic/utils.go (2 hunks)
  • core/providers/azure/azure.go (2 hunks)
  • core/providers/azure/utils.go (1 hunks)
  • core/providers/bedrock/bedrock.go (2 hunks)
  • core/providers/bedrock/bedrock_test.go (13 hunks)
  • core/providers/bedrock/types.go (2 hunks)
  • core/providers/bedrock/utils.go (2 hunks)
  • core/providers/cerebras/cerebras_test.go (2 hunks)
  • core/providers/cohere/chat.go (3 hunks)
  • core/providers/cohere/cohere.go (2 hunks)
  • core/providers/cohere/cohere_test.go (1 hunks)
  • core/providers/cohere/responses.go (10 hunks)
  • core/providers/cohere/types.go (1 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/groq/groq_test.go (2 hunks)
  • core/providers/mistral/mistral_test.go (1 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/openrouter/openrouter_test.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/providers/vertex/types.go (0 hunks)
  • core/providers/vertex/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (3 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/mux.go (0 hunks)
  • core/schemas/responses.go (5 hunks)
  • core/schemas/utils.go (1 hunks)
  • docs/docs.json (0 hunks)
  • framework/configstore/rdb.go (0 hunks)
  • framework/streaming/chat.go (0 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (5 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)
💤 Files with no reviewable changes (6)
  • framework/streaming/chat.go
  • framework/configstore/rdb.go
  • core/schemas/mux.go
  • core/providers/vertex/types.go
  • docs/docs.json
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
✅ Files skipped from review due to trivial changes (1)
  • transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (19)
  • core/schemas/utils.go
  • core/providers/openai/utils.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/vertex/vertex_test.go
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/internal/testutil/tests.go
  • core/providers/vertex/utils.go
  • ui/app/workspace/logs/views/logDetailsSheet.tsx
  • core/providers/vertex/vertex.go
  • core/providers/openai/chat.go
  • core/providers/cerebras/cerebras_test.go
  • core/providers/cohere/cohere_test.go
  • core/providers/bedrock/bedrock.go
  • core/providers/anthropic/utils.go
  • core/providers/anthropic/errors.go
  • core/providers/azure/utils.go
  • core/providers/utils/utils.go
  • ui/package.json
  • core/providers/openai/text.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/internal/testutil/account.go
  • core/providers/groq/groq_test.go
  • transports/bifrost-http/handlers/inference.go
  • core/providers/vertex/errors.go
  • core/providers/anthropic/anthropic.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/azure/azure.go
  • core/providers/openai/responses.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/openai/types.go
  • core/providers/cohere/types.go
  • core/providers/cohere/chat.go
  • framework/streaming/responses.go
  • core/internal/testutil/chat_completion_stream.go
  • core/schemas/responses.go
  • ui/lib/types/logs.ts
  • core/providers/gemini/responses.go
  • core/providers/mistral/mistral_test.go
  • core/providers/anthropic/chat.go
  • core/internal/testutil/responses_stream.go
  • core/providers/openrouter/openrouter_test.go
  • core/providers/bedrock/utils.go
  • core/providers/anthropic/types.go
  • core/schemas/bifrost.go
  • ui/app/workspace/logs/views/columns.tsx
  • core/internal/testutil/reasoning.go
  • core/providers/cohere/responses.go
  • core/providers/cohere/cohere.go
  • core/providers/bedrock/types.go
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/internal/testutil/account.go
  • core/providers/groq/groq_test.go
  • transports/bifrost-http/handlers/inference.go
  • core/providers/vertex/errors.go
  • core/providers/anthropic/anthropic.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/azure/azure.go
  • core/providers/openai/responses.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/openai/types.go
  • core/providers/cohere/types.go
  • core/providers/cohere/chat.go
  • framework/streaming/responses.go
  • core/internal/testutil/chat_completion_stream.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/providers/mistral/mistral_test.go
  • core/providers/anthropic/chat.go
  • core/internal/testutil/responses_stream.go
  • core/providers/openrouter/openrouter_test.go
  • core/providers/bedrock/utils.go
  • core/providers/anthropic/types.go
  • core/schemas/bifrost.go
  • core/internal/testutil/reasoning.go
  • core/providers/cohere/responses.go
  • core/providers/cohere/cohere.go
  • core/providers/bedrock/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/responses.go
  • core/providers/openai/types.go
🧬 Code graph analysis (15)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (87-114)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
  • CheckAndDecodeBody (490-498)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (1)
  • ErrProviderResponseDecode (29-29)
core/providers/vertex/vertex.go (1)
  • VertexError (25-31)
core/providers/vertex/types.go (1)
  • VertexValidationError (153-160)
core/providers/anthropic/anthropic.go (3)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
transports/bifrost-http/handlers/inference.go (1)
  • ResponsesRequest (257-261)
core/providers/utils/utils.go (3)
  • HandleProviderResponse (359-445)
  • ShouldSendBackRawRequest (551-556)
  • ShouldSendBackRawResponse (559-564)
core/providers/azure/azure.go (5)
core/schemas/bifrost.go (1)
  • BifrostError (358-367)
ui/lib/types/logs.ts (1)
  • BifrostError (226-232)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/utils/utils.go (1)
  • CheckContextAndGetRequestBody (257-275)
core/providers/openai/responses.go (1)
  • ToOpenAIResponsesRequest (37-107)
core/providers/openai/responses.go (4)
core/schemas/responses.go (6)
  • ResponsesMessage (314-327)
  • ResponsesReasoning (731-734)
  • ResponsesMessageContentBlock (399-411)
  • ResponsesOutputMessageContentTypeReasoning (394-394)
  • ResponsesMessageContent (339-344)
  • ResponsesParameters (87-114)
core/providers/openai/types.go (3)
  • OpenAIResponsesRequest (180-189)
  • OpenAIResponsesRequestInput (147-150)
  • MinMaxCompletionTokens (12-12)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
transports/bifrost-http/integrations/anthropic.go (2)
core/schemas/bifrost.go (6)
  • BifrostContextKeyUserAgent (123-123)
  • BifrostContextKeyUseRawRequestBody (117-117)
  • Anthropic (37-37)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
  • BifrostContextKeySkipKeySelection (114-114)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (234-239)
core/schemas/responses.go (1)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
  • ResponsesReasoningSummary (745-748)
  • ResponsesReasoning (731-734)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (403-409)
  • ResponsesMessage (423-438)
  • ResponsesMessageContent (400-400)
  • ResponsesReasoning (417-420)
  • ResponsesReasoningSummary (412-415)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (8)
  • ResponsesToolMessage (462-482)
  • ResponsesMessage (314-327)
  • ResponsesInputMessageRoleAssistant (332-332)
  • ResponsesMessageContent (339-344)
  • ResponsesMessageTypeFunctionCall (295-295)
  • ResponsesMessageTypeReasoning (307-307)
  • ResponsesReasoning (731-734)
  • ResponsesReasoningSummary (745-748)
core/providers/bedrock/utils.go (6)
core/schemas/chatcompletions.go (1)
  • BifrostChatRequest (12-19)
core/providers/bedrock/types.go (4)
  • BedrockConverseRequest (55-75)
  • DefaultCompletionMaxTokens (13-13)
  • BedrockInferenceConfig (253-258)
  • MinimumReasoningMaxTokens (12-12)
core/providers/cohere/types.go (2)
  • DefaultCompletionMaxTokens (12-12)
  • MinimumReasoningMaxTokens (11-11)
core/schemas/utils.go (2)
  • Ptr (16-18)
  • IsAnthropicModel (1043-1045)
core/providers/anthropic/types.go (1)
  • MinimumReasoningMaxTokens (15-15)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/providers/anthropic/types.go (5)
core/providers/bedrock/types.go (1)
  • Alias (102-102)
core/providers/openai/types.go (2)
  • Alias (83-83)
  • Alias (194-194)
core/schemas/bifrost.go (1)
  • Alias (388-388)
core/schemas/chatcompletions.go (4)
  • Alias (189-189)
  • Alias (643-643)
  • Alias (770-770)
  • Alias (860-860)
core/providers/gemini/types.go (8)
  • Alias (220-220)
  • Alias (245-245)
  • Alias (828-830)
  • Alias (842-844)
  • Alias (857-859)
  • Alias (871-873)
  • Alias (1306-1306)
  • Alias (1326-1326)
core/internal/testutil/reasoning.go (5)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/bifrost.go (1)
  • BifrostError (358-367)
core/schemas/chatcompletions.go (5)
  • ChatMessage (469-478)
  • BifrostChatRequest (12-19)
  • ChatParameters (155-184)
  • BifrostChatResponse (26-41)
  • ChatAssistantMessage (626-632)
core/internal/testutil/utils.go (3)
  • CreateBasicChatMessage (247-254)
  • GetErrorMessage (606-639)
  • GetChatContent (329-356)
core/internal/testutil/test_retry_framework.go (5)
  • GetTestRetryConfigForScenario (1116-1148)
  • TestRetryContext (168-173)
  • ChatRetryConfig (186-193)
  • ChatRetryCondition (126-129)
  • WithChatTestRetry (274-424)
core/providers/cohere/cohere.go (1)
core/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (892-1006)
core/providers/bedrock/types.go (5)
core/providers/anthropic/types.go (1)
  • Alias (103-103)
core/providers/openai/types.go (2)
  • Alias (83-83)
  • Alias (194-194)
core/schemas/bifrost.go (2)
  • Alias (388-388)
  • Alias (405-405)
core/schemas/chatcompletions.go (4)
  • Alias (189-189)
  • Alias (643-643)
  • Alias (770-770)
  • Alias (860-860)
core/providers/gemini/types.go (7)
  • Alias (220-220)
  • Alias (245-245)
  • Alias (828-830)
  • Alias (842-844)
  • Alias (857-859)
  • Alias (871-873)
  • Alias (1306-1306)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Graphite / mergeability_check (same check listed 14 times)
🔇 Additional comments (39)
core/providers/openrouter/openrouter_test.go (1)

25-49: The review comment cannot be definitively verified due to repository access limitations. Manual verification by the developer is required to confirm: (1) whether "openai/gpt-oss-120b" is available on the CI's OpenRouter account, (2) whether env override patterns are used elsewhere in the codebase for model configuration, and (3) whether the suggested changes align with existing conventions.

core/providers/azure/azure.go (1)

548-564: Good refactoring to centralize Anthropic request body construction.

The delegation to getRequestBodyForAnthropicResponses reduces code duplication and improves maintainability. The conditional logic is clear, and error handling is correct.

However, please verify that the helper function exists and is properly implemented in core/providers/azure/utils.go, particularly confirming its signature matches the expected parameters: ctx, request, deployment, provider.GetProviderKey(), and the streaming boolean flag.

core/providers/groq/groq_test.go (1)

55-55: LGTM: Reasoning test enabled for Groq.

Enabling the Reasoning test scenario aligns with the PR's objectives to expand reasoning support across providers. This change integrates Groq into the broader reasoning-enabled test flow introduced in this stack.

Note: The effectiveness of this test depends on the ReasoningModel (line 37) being valid and supporting reasoning capabilities. Please verify the model name as flagged in the previous comment.

transports/bifrost-http/integrations/anthropic.go (1)

230-245: shouldUsePassthrough + isClaudeModel split looks reasonable.

core/internal/testutil/reasoning.go (1)

201-411: Chat-completions reasoning validation looks solid (nil-safe, good diagnostics).

core/providers/anthropic/types.go (2)

200-227: New redacted_thinking support + data field addition looks consistent with the existing content-block model.


414-423: stop_sequence configuration is correct — the Anthropic streaming API expects the field to always be present as null or a string value, never omitted.

Anthropic's Messages API streaming contract specifies that stop_sequence is always emitted (string|null), with null values in message_start events and potentially null or a string in subsequent events. The current pointer type with no omitempty tag correctly produces "stop_sequence": null in JSON when nil, matching the contract. Clients should treat missing fields as null and rely on the final message_stop event for the definitive value.

core/internal/testutil/account.go (1)

138-143: New Bedrock deployment mapping looks fine; please double-check the deployment ID is enabled in the test AWS account.

core/internal/testutil/responses_stream.go (1)

435-446: MaxOutputTokens bump for reasoning streaming makes sense; keep an eye on the responseCount cap (150) if providers emit many small deltas.

core/providers/cohere/types.go (1)

11-13: Constants addition looks fine.

transports/bifrost-http/handlers/inference.go (1)

224-254: ResponsesRequest custom unmarshal mirrors ChatRequest pattern well (avoids sonic + embedded/custom-unmarshal conflicts).

core/schemas/bifrost.go (1)

120-123: LGTM! New context keys are well-documented.

The new BifrostContextKeyIntegrationType and BifrostContextKeyUserAgent constants follow the established naming conventions and include appropriate type annotations in comments. These additions support the broader streaming/context propagation changes in this PR stack.

core/providers/anthropic/anthropic.go (2)

677-692: Good refactor: Centralized request body construction.

The Responses method now uses the centralized getRequestBodyForResponses helper, improving consistency with the streaming path and reducing code duplication. The variable rename from jsonData to jsonBody aligns with conventions used elsewhere in the codebase.


726-729: LGTM! Consistent streaming request body construction.

The ResponsesStream method properly uses the centralized helper with isStreaming=true and correctly propagates any errors from request body construction.

core/providers/vertex/errors.go (1)

13-27: Good improvement: Proper response body decoding before unmarshalling.

Adding CheckAndDecodeBody ensures gzip-encoded error responses are properly decoded before attempting to unmarshal. The early return on decode failure prevents confusing unmarshal errors when the underlying issue is compression handling. All subsequent unmarshal calls correctly use decodedBody.

core/providers/cohere/cohere.go (2)

513-513: LGTM! Updated to match new function signature.

The Responses method correctly adapts to the updated ToCohereResponsesRequest signature that now returns (*CohereChatRequest, error), enabling proper error propagation from the conversion layer.


570-577: LGTM! Proper error handling and nil-safety in streaming path.

The ResponsesStream method now correctly captures and propagates errors from ToCohereResponsesRequest. The nil check on reqBody before setting the Stream flag is a good defensive practice.

core/providers/cohere/chat.go (3)

7-7: LGTM! Import alias added for provider utilities.

The providerUtils alias is consistent with other provider implementations and provides access to GetBudgetTokensFromReasoningEffort.


104-129: LGTM! Reasoning budget calculation is now safe and robust.

The refactored reasoning handling correctly addresses the previous nil pointer concern by using DefaultCompletionMaxTokens as a fallback when MaxCompletionTokens is not provided. The error propagation from GetBudgetTokensFromReasoningEffort ensures invalid effort values are properly surfaced to callers.


440-453: LGTM! Clean extraction and reuse of thinking text.

Extracting thinkingText once and reusing it for both Reasoning and ReasoningDetails.Text is cleaner than repeated dereferences. The mapping correctly populates both fields for streaming reasoning content.

core/providers/bedrock/utils.go (1)

39-73: Reasoning config logic addresses past review feedback.

The implementation now correctly uses model-specific minimum budget tokens: Bedrock's MinimumReasoningMaxTokens for non-Anthropic models and anthropic.MinimumReasoningMaxTokens for Anthropic models (lines 57-60). This resolves the previously flagged concern.

core/providers/gemini/responses.go (3)

166-179: ThoughtSignature preservation creates separate reasoning message.

The implementation correctly stores the thought signature in a separate ResponsesReasoning message with an empty Summary slice and the signature in EncryptedContent. This aligns with the schema definition in core/schemas/responses.go (lines 730-733).


609-629: Look-ahead logic for thought signature restoration is correct.

The implementation properly looks ahead to find the next reasoning message and extracts the encrypted content to restore the ThoughtSignature. The bounds check at line 621 prevents index out-of-range errors.


148-156: Range loop variable capture comment needs context on Go module version.

The comment about avoiding "range loop variable capture" is relevant only if the project targets Go versions before 1.22. Starting with Go 1.22, each range-loop iteration gets its own copy of the loop variable, but only for files in modules whose go.mod declares go 1.22 (or later). If the project declares an older Go version, the copies are necessary and correct. If it declares Go 1.22+, these copies are redundant.

Verify the Go version in go.mod (go directive) to determine if the copies should be removed.

core/providers/openai/types.go (3)

108-139: Custom UnmarshalJSON correctly handles embedded struct with custom unmarshaller.

The implementation properly separates base field unmarshalling from ChatParameters unmarshalling, avoiding the issue where ChatParameters.UnmarshalJSON would hijack the entire process. This is a clean solution for Go's embedded struct JSON handling limitation.


191-229: MarshalJSON correctly strips MaxTokens from reasoning for OpenAI.

The implementation creates a shallow copy of ResponsesParametersReasoning with MaxTokens explicitly set to nil before marshalling. This ensures OpenAI doesn't receive the max_tokens field which it doesn't support in reasoning parameters, while preserving Effort, GenerateSummary, and Summary.


11-13: New constant MinMaxCompletionTokens appears unused in this file.

The constant is declared but not referenced within this file. Verify it's used elsewhere or if this is leftover from refactoring.

core/providers/anthropic/chat.go (3)

100-121: Reasoning conversion logic is well-structured with proper priority.

The three-branch approach correctly prioritizes:

  1. Explicit MaxTokens → direct use
  2. Effort (non-"none") → computed budget via GetBudgetTokensFromReasoningEffort
  3. Fallback → disabled

Error handling for budget computation is properly propagated. The use of anthropic.MinimumReasoningMaxTokens is correct for Anthropic provider.


604-604: Guard condition for PartialJSON is appropriate.

The nil check before accessing *chunk.Delta.PartialJSON prevents potential nil pointer dereference.


630-657: Thinking delta handling correctly extracts to local variable.

Extracting thinkingText avoids repeated dereferences and improves readability. The use of schemas.Ptr(thinkingText) for both Reasoning and Text fields is consistent.

ui/lib/types/logs.ts (2)

411-419: Type rename aligns with backend schema changes.

The rename from ResponsesReasoningContent to ResponsesReasoningSummary maintains consistency with the backend schema in core/schemas/responses.go (lines 744-747). The field structure (type: "summary_text", text: string) correctly mirrors the Go struct.


434-434: ResponsesMessage.summary type updated consistently.

The optional summary field now uses ResponsesReasoningSummary[], maintaining type consistency across the interface hierarchy.

core/providers/openai/responses.go (2)

95-104: MaxOutputTokens clamp + User sanitization look good.
Nice guardrails before hitting the provider.


109-137: Verify tool_choice consistency when filtering unsupported tools.
The function filters req.Tools but doesn't validate or adjust req.ToolChoice if it references a removed tool type, which could make the API request invalid. Confirm whether ToolChoice is validated elsewhere or needs to be handled in this function. Also verify the supportedTypes list matches OpenAI Responses API's current specifications.

core/internal/testutil/chat_completion_stream.go (1)

576-707: Validated reasoning streaming test structure looks solid (retry + indicator checks).

core/schemas/responses.go (2)

397-411: Signature on ResponsesMessageContentBlock is a good extension point for reasoning provenance.


730-748: Reasoning summary type rename (ResponsesReasoningSummary) looks consistent and clearer than the old name.

framework/streaming/responses.go (1)

626-727: Helper split for reasoning delta vs signature is a net readability win.

core/providers/bedrock/bedrock_test.go (1)

482-934: Bedrock request-to-bifrost mapping expectations (Type field + tool output string) updates look consistent.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from ac6bbd4 to 3a4d9a5 Compare December 12, 2025 08:27
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from abdcda6 to 1a511ff Compare December 12, 2025 08:27

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
transports/bifrost-http/handlers/inference.go (1)

91-118: Add user to responsesParamsKnownFields to avoid duplicating it into ExtraParams.
schemas.ResponsesParameters includes User, but since it’s not “known” here, extractExtraParams will also capture it, risking double-send/override downstream.

 var responsesParamsKnownFields = map[string]bool{
@@
     "truncation":           true,
+    "user":                 true,
 }
core/providers/cohere/responses.go (1)

187-742: Tool-call args: avoid initializing Arguments to "", since the empty string is not valid JSON and downstream unmarshalling of it will fail.

In StreamEventToolCallStart, you set:

Arguments: schemas.Ptr(""), // Arguments will be filled by deltas

If a tool call ends up emitting no arg deltas, this leaves Arguments == "", which is the same class of replay/unmarshal hazard you’ve hit elsewhere.

Safer pattern:

  • keep Arguments nil until first delta arrives, or
  • ensure ToolCallEnd emits a done event that sets a valid JSON object ({}) for empty args, rather than empty string.
core/providers/anthropic/types.go (1)

58-140: Initialize ExtraParams only when capturing unknown fields.

UnmarshalJSON unconditionally initializes mr.ExtraParams to a non-nil map. Note that under encoding/json semantics (which sonic mirrors), omitempty omits any empty map, nil or non-nil, so the serialized output may be unaffected; the remaining costs are a needless allocation on every unmarshal and a divergence from the nil-until-needed convention used elsewhere. When no unknown fields are present, ExtraParams should remain nil.

Apply the suggested fix: reset ExtraParams to nil, collect unknowns in a local map, and only assign if len(unknown) > 0. This aligns with the pattern already used in handlers/inference.go:extractExtraParams().

Additionally, consider returning an error instead of silently skipping unmarshal failures (continue), as this could mask malformed payloads that should be rejected.

♻️ Duplicate comments (15)
ui/app/workspace/logs/views/logDetailsSheet.tsx (1)

187-237: Boolean/number rendering issues remain unaddressed.

The previous review already flagged this: reasoning.generate_summary is a boolean and won't render as text inside the Badge (line 227). Also, truthiness checks like reasoning.max_tokens && ... (line 232) will hide the field when the value is 0.

core/internal/testutil/chat_completion_stream.go (1)

535-538: Cerebras prompt mismatch with test configuration.

The Cerebras-specific prompt asks to "use your tools for this" but the test request doesn't configure any tools. This was flagged in a previous review.

transports/bifrost-http/integrations/anthropic.go (3)

78-85: Align /v1/messages RawResponse passthrough gating with shouldUsePassthrough (avoid accidental raw payload exposure / inconsistency).
Right now /v1/complete and streaming require shouldUsePassthrough(...), but /v1/messages non-stream returns RawResponse for any Claude-like model regardless of claude-cli detection.

-           ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) {
-               if isClaudeModel(resp.ExtraFields.ModelRequested, resp.ExtraFields.ModelDeployment, string(resp.ExtraFields.Provider)) {
+           ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) {
+               if shouldUsePassthrough(ctx, resp.ExtraFields.Provider, resp.ExtraFields.ModelRequested, resp.ExtraFields.ModelDeployment) {
                    if resp.ExtraFields.RawResponse != nil {
                        return resp.ExtraFields.RawResponse, nil
                    }
                }
                return anthropic.ToAnthropicResponsesResponse(resp), nil
            },

240-245: Reconfirm provider == "" behavior in isClaudeModel (and document intent).
As written, provider-less requests can be treated as Anthropic based on schemas.IsAnthropicModel(model). If this is intentional for backwards compatibility, add an explicit comment explaining why; otherwise, tighten the condition.


90-127: Don’t drop passthrough stream events on JSON parse failure; avoid stdlib log.Printf; avoid string += in loop.
If RawResponse exists but can’t be parsed into AnthropicStreamEvent, returning ("", nil, nil) can stall/lose frames. Also, log.Printf bypasses your structured logger, and combinedContent += ... is quadratic.

@@
                if shouldUsePassthrough(ctx, resp.ExtraFields.Provider, resp.ExtraFields.ModelRequested, resp.ExtraFields.ModelDeployment) {
                    if resp.ExtraFields.RawResponse != nil {
                        raw, ok := resp.ExtraFields.RawResponse.(string)
                        if !ok {
                            return "", nil, fmt.Errorf("expected RawResponse string, got %T", resp.ExtraFields.RawResponse)
                        }
                        var rawResponseJSON anthropic.AnthropicStreamEvent
                        if err := sonic.Unmarshal([]byte(raw), &rawResponseJSON); err == nil {
                            return string(rawResponseJSON.Type), raw, nil
                        }
+                       // Fallback: forward raw payload even if we can't classify the event type.
+                       return "", raw, nil
                    }
                    return "", nil, nil
                }
@@
-                       if len(anthropicResponse) > 1 {
-                           combinedContent := ""
+                       if len(anthropicResponse) > 1 {
+                           var combinedContent strings.Builder
                            for _, event := range anthropicResponse {
                                responseJSON, err := sonic.Marshal(event)
                                if err != nil {
-                                   // Log JSON marshaling error but continue processing (should not happen)
-                                   log.Printf("Failed to marshal streaming response: %v", err)
+                                   // Prefer router/provider logger (plumb it here) rather than stdlib log.
                                    continue
                                }
-                               combinedContent += fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON)
+                               combinedContent.WriteString(fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON))
                            }
-                           return "", combinedContent, nil
+                           return "", combinedContent.String(), nil
                        } else if len(anthropicResponse) == 1 {
                            return string(anthropicResponse[0].Type), anthropicResponse[0], nil
                        } else {
                            return "", nil, nil
                        }
core/providers/anthropic/utils.go (1)

30-76: Past concern about stream field override remains unaddressed.

The past review flagged that when isStreaming is true, the code always sets requestBody["stream"] = true (line 48), potentially overriding an explicit stream: false in the raw body. This concern appears unaddressed in the current code.

The suggested fix from the past review was:

 		// Add stream if not present
 		if isStreaming {
-			requestBody["stream"] = true
+			if _, exists := requestBody["stream"]; !exists {
+				requestBody["stream"] = true
+			}
 		}

Also, as noted previously, consider adding function-level documentation explaining the dual-mode behavior (raw vs. conversion) and the security assumptions around raw request body handling.

core/providers/openai/types.go (2)

108-139: Regression tests still needed for the UnmarshalJSON fix.

The implementation correctly addresses the ChatParameters unmarshalling hijack issue by using a two-phase approach. However, as noted in a past review, regression tests should be added to verify that Model, Messages, Stream, MaxTokens, and Fallbacks are properly preserved alongside ChatParameters fields.


191-229: Unit tests still needed for OpenAIResponsesRequest.MarshalJSON().

The implementation correctly shadows reasoning.max_tokens (setting it to nil) and preserves custom Input marshalling. As noted in a past review, add tests to verify that max_tokens is absent from the JSON output and that input handles both string and array forms correctly.

core/providers/bedrock/utils.go (1)

118-123: Don’t overwrite AdditionalModelRequestFields (it can erase reasoning_config).
bedrockReq.AdditionalModelRequestFields = orderedFields can drop the earlier "reasoning_config" entry (and any other previously-set fields). Merge instead.

-				if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
-					bedrockReq.AdditionalModelRequestFields = orderedFields
-				}
+				if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
+					if bedrockReq.AdditionalModelRequestFields == nil {
+						bedrockReq.AdditionalModelRequestFields = make(schemas.OrderedMap)
+					}
+					// Preserve required/computed fields (e.g. reasoning_config) by default.
+					for k, v := range orderedFields {
+						if _, exists := bedrockReq.AdditionalModelRequestFields[k]; !exists {
+							bedrockReq.AdditionalModelRequestFields[k] = v
+						}
+					}
+				}
core/providers/azure/utils.go (1)

13-60: Normalize "stream" in raw-body mode and add a nil request guard.

  1. If raw JSON already has "stream": true and isStreaming=false, you’ll still send a streaming request.
  2. request can be nil → panic.
 func getRequestBodyForAnthropicResponses(ctx context.Context, request *schemas.BifrostResponsesRequest, deployment string, providerName schemas.ModelProvider, isStreaming bool) ([]byte, *schemas.BifrostError) {
 	var jsonBody []byte
 	var err error
+
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("request is nil"), providerName)
+	}
 
 	// Check if raw request body should be used
 	if useRawBody, ok := ctx.Value(schemas.BifrostContextKeyUseRawRequestBody).(bool); ok && useRawBody {
 		jsonBody = request.GetRawRequestBody()
@@
 		// Add stream if not present
 		if isStreaming {
 			requestBody["stream"] = true
+		} else {
+			delete(requestBody, "stream")
 		}
core/providers/vertex/utils.go (1)

13-42: Normalize "stream" in raw-body mode and add a nil request guard.
Same issue as Azure: a raw body containing "stream": true will keep streaming enabled even when isStreaming=false. Also, request == nil can panic.

 func getRequestBodyForAnthropicResponses(ctx context.Context, request *schemas.BifrostResponsesRequest, deployment string, providerName schemas.ModelProvider, isStreaming bool) ([]byte, *schemas.BifrostError) {
 	var jsonBody []byte
 	var err error
+
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("request is nil"), providerName)
+	}
@@
 		// Add stream if not present
 		if isStreaming {
 			requestBody["stream"] = true
+		} else {
+			delete(requestBody, "stream")
 		}
core/providers/gemini/responses.go (2)

138-179: Don’t emit empty Arguments, and base64-encode ThoughtSignature before putting into JSON.

This is the same issue previously flagged (and noted as “handled in different pr”), but the final code shown here still:

  • always sets ResponsesToolMessage.Arguments: &argumentsStr even when empty (can break sonic.Unmarshal on replay), and
  • stores ThoughtSignature via string([]byte) which can corrupt non-UTF8 bytes.

Suggested patch:

@@
-				argumentsStr := ""
+				var argumentsStr *string
 				if part.FunctionCall.Args != nil {
 					if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil {
-						argumentsStr = string(argsBytes)
+						s := string(argsBytes)
+						if strings.TrimSpace(s) != "" {
+							argumentsStr = &s
+						}
 					}
 				}
@@
 				toolMsg := &schemas.ResponsesToolMessage{
 					CallID:    &functionCallID,
 					Name:      &functionCallName,
-					Arguments: &argumentsStr,
+					Arguments: argumentsStr,
 				}
@@
 				if len(part.ThoughtSignature) > 0 {
-					thoughtSig := string(part.ThoughtSignature)
+					thoughtSig := base64.RawStdEncoding.EncodeToString(part.ThoughtSignature)
 					reasoningMsg := schemas.ResponsesMessage{
@@
 						ResponsesReasoning: &schemas.ResponsesReasoning{
 							Summary:          []schemas.ResponsesReasoningSummary{},
 							EncryptedContent: &thoughtSig,
 						},
 					}
 					messages = append(messages, reasoningMsg)
 				}

Also, since this is a Graphite stack: please ensure the “other PR” with this fix is upstack of #1000, otherwise this PR reintroduces the bug.


609-630: Decode base64 EncryptedContent back into ThoughtSignature (don’t treat it as raw bytes).

If EncryptedContent is base64 (as it should be), this line will currently produce the wrong bytes:
part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent).

@@
 						if nextMsg.Type != nil && *nextMsg.Type == schemas.ResponsesMessageTypeReasoning &&
 							nextMsg.ResponsesReasoning != nil && nextMsg.ResponsesReasoning.EncryptedContent != nil {
-							part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent)
+							sigBytes, err := base64.RawStdEncoding.DecodeString(*nextMsg.ResponsesReasoning.EncryptedContent)
+							if err != nil {
+								return nil, nil, fmt.Errorf("failed to decode thought signature: %w", err)
+							}
+							part.ThoughtSignature = sigBytes
 						}
core/providers/cohere/responses.go (2)

1294-1330: Regression risk: encrypted reasoning is embedded into plain text marker.

This is the same concern raised earlier: the fallback branch still constructs:
[ENCRYPTED_REASONING: ...].

Even if most callers set Summary: []...{} (thus skipping the branch), this remains a footgun for any message where Summary is nil but EncryptedContent is set.

Recommendation: always drop EncryptedContent for Cohere (or store it in non-user-visible metadata if you have a safe place), never embed it in text.


742-823: Delete AnnotationIndexToContentIndex entries on citation end to prevent unbounded growth.

StreamEventCitationEnd looks up the mapping but never removes it. Add:

 			contentIndex, exists := state.AnnotationIndexToContentIndex[*chunk.Index]
 			if !exists {
 				contentIndex = *chunk.Index
 			}
+			delete(state.AnnotationIndexToContentIndex, *chunk.Index)
🧹 Nitpick comments (16)
core/schemas/utils.go (2)

1042-1045: Comment tweak looks good; consider labeling this as heuristic detection. The implementation is a substring check (not authoritative provider parsing), so “checks if … is an Anthropic model” can be read as stronger than what it does.


1047-1050: Comment tweak looks good; same “heuristic” wording suggestion applies. This is also a substring-based check and may match unexpected strings.

core/providers/cohere/types.go (2)

69-97: Inconsistent JSON library usage between CohereMessageContent and streaming structs.

CohereMessageContent uses encoding/json for marshal/unmarshal (lines 72-75, 84-91), while the streaming structs (CohereStreamToolCallStruct, CohereStreamContentStruct, CohereStreamCitationStruct) use github.com/bytedance/sonic. Consider using sonic consistently throughout for performance alignment, or document why standard json is preferred here.

 func (c *CohereMessageContent) MarshalJSON() ([]byte, error) {
 	if c.StringContent != nil {
-		return json.Marshal(*c.StringContent)
+		return sonic.Marshal(*c.StringContent)
 	}
 	if c.BlocksContent != nil {
-		return json.Marshal(c.BlocksContent)
+		return sonic.Marshal(c.BlocksContent)
 	}
 	return []byte("null"), nil
 }

 func (c *CohereMessageContent) UnmarshalJSON(data []byte) error {
 	var str string
-	if err := json.Unmarshal(data, &str); err == nil {
+	if err := sonic.Unmarshal(data, &str); err == nil {
 		c.StringContent = &str
 		return nil
 	}

 	var blocks []CohereContentBlock
-	if err := json.Unmarshal(data, &blocks); err == nil {
+	if err := sonic.Unmarshal(data, &blocks); err == nil {
 		c.BlocksContent = blocks
 		return nil
 	}

 	return fmt.Errorf("content must be either string or array of content blocks")
 }

425-433: Inconsistent null marshaling between structs.

CohereMessageContent.MarshalJSON returns []byte("null"), nil when both fields are nil, but the streaming structs return sonic.Marshal(nil). While functionally equivalent, consider using the same approach for consistency:

 func (c *CohereStreamToolCallStruct) MarshalJSON() ([]byte, error) {
 	if c.CohereToolCallObject != nil {
 		return sonic.Marshal(c.CohereToolCallObject)
 	}
 	if c.CohereToolCallArray != nil {
 		return sonic.Marshal(c.CohereToolCallArray)
 	}
-	return sonic.Marshal(nil)
+	return []byte("null"), nil
 }
core/internal/testutil/chat_completion_stream.go (1)

583-693: Consider extracting shared streaming validation logic.

The validation callback in ChatCompletionStreamWithReasoningValidated (lines 583-693) shares significant logic with ChatCompletionStreamWithReasoning (lines 422-497). Consider extracting the common streaming read loop and reasoning detection into a helper function to reduce duplication.

core/providers/bedrock/types.go (1)

128-137: Silent skip of unmarshal errors may hide data issues.

When an unknown field fails to unmarshal (line 133), it's silently skipped. Consider logging a warning to aid debugging when unexpected field types are encountered:

 for key, value := range rawData {
 	if !bedrockConverseRequestKnownFields[key] {
 		var v interface{}
 		if err := sonic.Unmarshal(value, &v); err != nil {
-			continue // Skip fields that can't be unmarshaled
+			// Log but don't fail - allows forward compatibility
+			// Consider: log.Printf("bedrock: failed to unmarshal extra param %q: %v", key, err)
+			continue
 		}
 		r.ExtraParams[key] = v
 	}
 }
transports/bifrost-http/integrations/router.go (2)

709-712: DONE marker gating is brittle when keyed off config.Path substring.
strings.Contains(config.Path, "/responses") can accidentally affect non-OpenAI integrations if they ever have “/responses” in route paths; prefer gating by route/integration type + request kind, or pass a boolean from handleStreamingRequest when the request is ResponsesStream.

-        shouldSendDoneMarker := true
-        if config.Type == RouteConfigTypeAnthropic || strings.Contains(config.Path, "/responses") {
+        shouldSendDoneMarker := true
+        if config.Type == RouteConfigTypeAnthropic ||
+           (config.Type == RouteConfigTypeOpenAI && strings.Contains(config.Path, "/responses")) {
             shouldSendDoneMarker = false
         }

879-886: Raw SSE string detection should accept more valid SSE prefixes (and “event:” without requiring a space).
Current check can double-wrap valid SSE strings like id: ... / retry: ... or event: (no space), producing invalid output.

-                } else if sseString, ok := convertedResponse.(string); ok {
+                } else if sseString, ok := convertedResponse.(string); ok {
                     // CUSTOM SSE FORMAT: The converter returned a complete SSE string
                     // This is used by providers like Anthropic that need custom event types
                     // Example: "event: content_block_delta\ndata: {...}\n\n"
-                    if !strings.HasPrefix(sseString, "data: ") && !strings.HasPrefix(sseString, "event: ") {
+                    // Accept common SSE fields; allow optional space after ':' per SSE conventions.
+                    if !(strings.HasPrefix(sseString, "data:") ||
+                        strings.HasPrefix(sseString, "event:") ||
+                        strings.HasPrefix(sseString, "id:") ||
+                        strings.HasPrefix(sseString, "retry:")) {
                         sseString = fmt.Sprintf("data: %s\n\n", sseString)
                     }
                     if _, err := fmt.Fprint(w, sseString); err != nil {
                         cancel() // Client disconnected (write error), cancel upstream stream
                         return
                     }
core/providers/openai/types_test.go (1)

204-331: Make the error-case assertion more specific (improves debuggability).
Right now the “both reasoning and reasoning_effort should error” case only checks err != nil; consider asserting the error mentions reasoning_effort (or whichever invariant you enforce).

         if tt.expectError {
             if err == nil {
                 t.Error("Expected error but got none")
             }
+            // Optional: tighten to a stable substring if available.
+            // if err != nil && !strings.Contains(err.Error(), "reasoning_effort") { ... }
             return
         }
core/providers/openai/responses.go (1)

114-142: Consider documenting how to maintain the supported tools list.

The filterUnsupportedTools method uses a hardcoded map of supported tool types. While the implementation is correct, this creates a maintenance burden: when new tool types are added to schemas.ResponsesToolType, this list must also be updated.

Consider adding a comment noting this, or alternatively, use a deny-list approach (filter out specifically unsupported types) if that list is shorter and more stable.

core/providers/bedrock/utils.go (1)

35-76: Reasoning budget derivation looks good; consider making "disabled" consistent type-wise.
The split between MaxTokens vs Effort (and Bedrock-vs-Anthropic min budget selection) is a solid improvement. One small consistency tweak: "disabled" uses map[string]string while enabled uses map[string]any; consider using map[string]any for both to keep downstream handling uniform.

core/providers/vertex/utils.go (1)

53-78: Optional: remove the “set model then delete model” step (clarity).
reqBody.Model = deployment has no effect on the final payload since "model" is removed before sending to Vertex; consider dropping it to reduce confusion (unless some downstream logic depends on it prior to the delete).

core/schemas/responses.go (1)

400-404: Signature field addition looks reasonable; ensure it’s populated/forwarded consistently across the stack.
This adds a new surface area (signature) that UI/stream aggregators will likely rely on; worth double-checking all provider adapters in this PR stack either set it or intentionally leave it nil.

core/providers/bedrock/bedrock_test.go (1)

1497-1513: Structural comparison for Bedrock response conversion is fine; consider require.NotNil on nested pointers to avoid panic on regressions.

If actual.Output / actual.Output.Message ever regresses to nil, these assertions would panic. A couple require.NotNil(t, actual.Output) / require.NotNil(t, actual.Output.Message) would make failures cleaner.

core/providers/cohere/responses.go (2)

1225-1247: ToolChoice default-to-required is risky for unknown strings.

If tool_choice contains an unrecognized value, defaulting to required can force tool calls unexpectedly. Consider returning nil (let Cohere default) for the default: case.


1345-1347: Minor: use msg.ResponsesToolMessage.CallID instead of promoted msg.CallID for clarity.

Current code is safe due to the nil-guard, but accessing through msg.ResponsesToolMessage.CallID avoids confusion about promoted fields.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac6bbd4 and 3a4d9a5.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (58)
  • core/internal/testutil/account.go (1 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/reasoning.go (4 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/internal/testutil/tests.go (2 hunks)
  • core/providers/anthropic/anthropic.go (3 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (8 hunks)
  • core/providers/anthropic/utils.go (2 hunks)
  • core/providers/azure/azure.go (2 hunks)
  • core/providers/azure/utils.go (1 hunks)
  • core/providers/bedrock/bedrock.go (2 hunks)
  • core/providers/bedrock/bedrock_test.go (13 hunks)
  • core/providers/bedrock/types.go (2 hunks)
  • core/providers/bedrock/utils.go (2 hunks)
  • core/providers/cerebras/cerebras_test.go (2 hunks)
  • core/providers/cohere/chat.go (3 hunks)
  • core/providers/cohere/cohere.go (2 hunks)
  • core/providers/cohere/cohere_test.go (1 hunks)
  • core/providers/cohere/responses.go (7 hunks)
  • core/providers/cohere/types.go (1 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/groq/groq_test.go (2 hunks)
  • core/providers/mistral/mistral_test.go (1 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/responses_marshal_test.go (1 hunks)
  • core/providers/openai/responses_test.go (1 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/types_test.go (1 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/openrouter/openrouter_test.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/providers/vertex/types.go (0 hunks)
  • core/providers/vertex/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (3 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/mux.go (0 hunks)
  • core/schemas/responses.go (5 hunks)
  • core/schemas/utils.go (1 hunks)
  • docs/docs.json (0 hunks)
  • framework/configstore/rdb.go (0 hunks)
  • framework/streaming/chat.go (0 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (5 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)
💤 Files with no reviewable changes (6)
  • core/schemas/mux.go
  • docs/docs.json
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
  • core/providers/vertex/types.go
  • framework/configstore/rdb.go
  • framework/streaming/chat.go
🚧 Files skipped from review as they are similar to previous changes (20)
  • core/providers/cohere/cohere_test.go
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • ui/lib/types/logs.ts
  • ui/app/workspace/logs/views/columns.tsx
  • core/providers/anthropic/chat.go
  • ui/package.json
  • core/providers/openai/chat.go
  • core/providers/openrouter/openrouter_test.go
  • core/internal/testutil/responses_stream.go
  • core/providers/azure/azure.go
  • core/providers/groq/groq_test.go
  • core/providers/cerebras/cerebras_test.go
  • core/providers/cohere/cohere.go
  • core/internal/testutil/account.go
  • core/internal/testutil/tests.go
  • core/providers/anthropic/anthropic.go
  • core/providers/utils/utils.go
  • core/providers/openai/text.go
  • transports/bifrost-http/handlers/middlewares.go
  • core/providers/bedrock/bedrock.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/utils.go
  • core/providers/openai/responses.go
  • core/providers/vertex/utils.go
  • core/providers/openai/types_test.go
  • core/providers/openai/responses_test.go
  • core/providers/vertex/errors.go
  • core/providers/openai/responses_marshal_test.go
  • ui/app/workspace/logs/views/logDetailsSheet.tsx
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/vertex/vertex.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/providers/anthropic/utils.go
  • core/providers/cohere/chat.go
  • core/providers/bedrock/utils.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/handlers/inference.go
  • core/schemas/utils.go
  • core/providers/anthropic/errors.go
  • core/providers/cohere/types.go
  • core/providers/azure/utils.go
  • core/providers/mistral/mistral_test.go
  • core/schemas/bifrost.go
  • core/providers/openai/types.go
  • framework/streaming/responses.go
  • core/internal/testutil/reasoning.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/bedrock/types.go
  • core/providers/cohere/responses.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/vertex/vertex_test.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/utils.go
  • core/providers/openai/responses.go
  • core/providers/vertex/utils.go
  • core/providers/openai/types_test.go
  • core/providers/openai/responses_test.go
  • core/providers/vertex/errors.go
  • core/providers/openai/responses_marshal_test.go
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/vertex/vertex.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/providers/anthropic/utils.go
  • core/providers/cohere/chat.go
  • core/providers/bedrock/utils.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/handlers/inference.go
  • core/schemas/utils.go
  • core/providers/anthropic/errors.go
  • core/providers/cohere/types.go
  • core/providers/azure/utils.go
  • core/providers/mistral/mistral_test.go
  • core/schemas/bifrost.go
  • core/providers/openai/types.go
  • framework/streaming/responses.go
  • core/internal/testutil/reasoning.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/bedrock/types.go
  • core/providers/cohere/responses.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/vertex/vertex_test.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/utils.go
  • core/providers/openai/responses.go
  • core/providers/openai/types_test.go
  • core/providers/openai/responses_test.go
  • core/providers/openai/responses_marshal_test.go
  • core/providers/openai/types.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.

Applied to files:

  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
🧬 Code graph analysis (16)
core/providers/openai/responses.go (4)
core/schemas/responses.go (2)
  • ResponsesMessage (314-327)
  • ResponsesParameters (87-114)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/openai/types.go (3)
  • OpenAIResponsesRequest (180-189)
  • OpenAIResponsesRequestInput (147-150)
  • MinMaxCompletionTokens (12-12)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
core/providers/openai/types_test.go (2)
core/providers/openai/types.go (1)
  • OpenAIChatRequest (46-59)
core/schemas/chatcompletions.go (2)
  • ChatMessageRoleUser (462-462)
  • ChatMessageRoleSystem (463-463)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
  • CheckAndDecodeBody (490-498)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (1)
  • ErrProviderResponseDecode (29-29)
core/providers/vertex/vertex.go (1)
  • VertexError (25-31)
core/providers/vertex/types.go (1)
  • VertexValidationError (153-160)
core/providers/openai/responses_marshal_test.go (2)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (180-189)
  • OpenAIResponsesRequestInput (147-150)
core/schemas/responses.go (3)
  • ResponsesParameters (87-114)
  • ResponsesParametersReasoning (234-239)
  • ResponsesMessage (314-327)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
core/providers/anthropic/utils.go (7)
core/schemas/responses.go (1)
  • BifrostResponsesRequest (32-39)
core/schemas/bifrost.go (3)
  • ModelProvider (32-32)
  • BifrostError (358-367)
  • BifrostContextKeyUseRawRequestBody (117-117)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (2)
  • ErrRequestBodyConversion (25-25)
  • ErrProviderRequestMarshal (26-26)
core/providers/anthropic/types.go (1)
  • AnthropicDefaultMaxTokens (14-14)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesRequest (1419-1532)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/cohere/chat.go (5)
core/providers/cohere/types.go (5)
  • CohereThinking (174-177)
  • ThinkingTypeEnabled (183-183)
  • DefaultCompletionMaxTokens (12-12)
  • MinimumReasoningMaxTokens (11-11)
  • ThinkingTypeDisabled (184-184)
core/providers/bedrock/types.go (2)
  • DefaultCompletionMaxTokens (13-13)
  • MinimumReasoningMaxTokens (12-12)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (4)
  • ChatStreamResponseChoice (751-753)
  • ChatStreamResponseChoiceDelta (756-763)
  • ChatReasoningDetails (723-730)
  • BifrostReasoningDetailsTypeText (719-719)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (87-114)
core/providers/anthropic/errors.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (358-367)
ui/lib/types/logs.ts (1)
  • BifrostError (226-232)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (234-239)
framework/streaming/responses.go (3)
core/schemas/responses.go (8)
  • ResponsesMessage (314-327)
  • ResponsesMessageTypeReasoning (307-307)
  • ResponsesInputMessageRoleAssistant (332-332)
  • ResponsesReasoning (731-734)
  • ResponsesReasoningSummary (745-748)
  • ResponsesMessageContent (339-344)
  • ResponsesMessageContentBlock (399-411)
  • ResponsesOutputMessageContentTypeReasoning (394-394)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/bedrock/types.go (5)
core/providers/anthropic/types.go (1)
  • Alias (104-104)
core/providers/openai/types.go (2)
  • Alias (83-83)
  • Alias (194-194)
core/schemas/bifrost.go (2)
  • Alias (388-388)
  • Alias (405-405)
core/schemas/chatcompletions.go (4)
  • Alias (189-189)
  • Alias (643-643)
  • Alias (770-770)
  • Alias (860-860)
core/providers/gemini/types.go (7)
  • Alias (220-220)
  • Alias (245-245)
  • Alias (828-830)
  • Alias (842-844)
  • Alias (857-859)
  • Alias (871-873)
  • Alias (1306-1306)
core/providers/cohere/responses.go (4)
core/providers/cohere/types.go (2)
  • CohereContentBlock (146-160)
  • CohereStreamEvent (405-410)
core/schemas/responses.go (5)
  • ResponsesMessageContentBlock (399-411)
  • BifrostResponsesStreamResponse (1427-1466)
  • ResponsesMessage (314-327)
  • ResponsesMessageContent (339-344)
  • ResponsesOutputMessageContentTextAnnotation (438-449)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/utils/utils.go (2)
  • GetRandomString (1027-1038)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/providers/anthropic/types.go (3)
core/providers/bedrock/types.go (1)
  • Alias (104-104)
core/providers/openai/types.go (2)
  • Alias (83-83)
  • Alias (194-194)
core/schemas/chatcompletions.go (4)
  • Alias (189-189)
  • Alias (643-643)
  • Alias (770-770)
  • Alias (860-860)
core/providers/bedrock/bedrock_test.go (3)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (3)
  • ResponsesMessageTypeMessage (290-290)
  • ResponsesToolMessage (462-482)
  • ResponsesToolMessageOutputStruct (531-535)
core/schemas/chatcompletions.go (1)
  • OrderedMap (268-268)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (36)
core/providers/vertex/vertex_test.go (1)

31-31: No changes needed. The Vertex provider intentionally supports Anthropic Claude models through explicit model routing logic. The IsAnthropicModel() function in schemas/utils.go recognizes any model string containing "claude" and the Vertex provider has dedicated handling to route these models via the Anthropic API format (adding anthropic_version to the request). Setting ReasoningModel: "claude-4.5-haiku" in the test configuration is correct and consistent with how other providers (Bedrock, Anthropic, Azure) configure reasoning models.

core/providers/cohere/types.go (1)

11-12: LGTM! Constants align with other providers.

These reasoning token budget constants are consistent with the same values introduced in core/providers/bedrock/types.go, establishing uniform semantics across providers.

core/providers/vertex/errors.go (2)

14-17: LGTM! Proper response decoding before unmarshalling.

Good addition of CheckAndDecodeBody to handle content-encoded responses (e.g., gzip). The early return on decode failure with ErrProviderResponseDecode is appropriate.


19-43: Multi-format error parsing looks correct.

The fallback chain (OpenAI → []VertexError → VertexError → VertexValidationError) correctly uses decodedBody throughout, and the error propagation paths are consistent with the relevant code snippets from core/providers/vertex/vertex.go and types.go.

core/internal/testutil/chat_completion_stream.go (1)

361-521: LGTM! Well-structured reasoning streaming test.

The test properly checks for multiple reasoning indicators (content, details, tokens) and handles the fact that different providers may expose reasoning differently. The 200-second timeout is appropriate for reasoning-heavy models, and the warning-only approach for missing indicators is reasonable given provider variability.

core/providers/bedrock/types.go (2)

12-13: LGTM! Constants consistent with Cohere provider.

These reasoning token budget constants match core/providers/cohere/types.go, maintaining consistency across providers.


82-98: Known fields map now correctly includes all struct fields.

The previous review comments about missing serviceTier and extra_params have been addressed. The map now correctly includes all JSON-tagged fields from BedrockConverseRequest.

core/internal/testutil/reasoning.go (3)

12-51: Responses reasoning test updates look coherent (token bump + include reasoning.encrypted_content).
No concerns with the updated request construction / validation flow.

Also applies to: 87-116


201-310: Chat Completions reasoning test: sensible provider skip + retry wiring.
Looks fine; the OpenAI skip is a pragmatic flake-reduction lever.


314-411: Validator is nil-safe and non-fatal; logging is bounded.
No issues spotted in the new reasoning detection heuristics.

core/providers/mistral/mistral_test.go (1)

49-50: Commented Reasoning=false is clear and implementation-accurate.
Good clarification that this is a bifrost integration gap, not a provider capability claim.

core/providers/openai/utils.go (1)

47-56: User-field sanitization is a good defensive guard (64-char cap).
Returning nil to omit the field is reasonable and avoids upstream request failures. Based on learnings, this matches an observed OpenAI constraint.

transports/bifrost-http/handlers/inference.go (1)

224-254: Custom ResponsesRequest.UnmarshalJSON looks correct (mirrors ChatRequest workaround).

transports/bifrost-http/integrations/router.go (1)

312-314: Context: integration type propagation looks correct; consider using it (instead of path heuristics) for stream behavior.
Storing string(config.Type) under schemas.BifrostContextKeyIntegrationType is a good stack-level hook; it would also let you avoid strings.Contains(config.Path, ...) for stream quirks.

core/schemas/bifrost.go (1)

120-123: New context keys look good and align with router propagation.
BifrostContextKeyIntegrationType and BifrostContextKeyUserAgent are reasonable additions to the shared context surface.

core/providers/vertex/vertex.go (2)

841-893: Streaming path now shares the same helper; double-check it sets streaming-specific fields for Vertex Anthropic (streamRawPredict).
You’re calling the helper with isStreaming=true and passing the returned bytes to anthropic.HandleAnthropicResponsesStream; please confirm the helper sets/keeps any required streaming flags/fields and doesn’t regress SSE framing expectations across the PR stack.


703-707: Helper correctly preserves raw-body passthrough and all Anthropic-on-Vertex semantics.

The getRequestBodyForAnthropicResponses helper in core/providers/vertex/utils.go properly implements both the raw-body passthrough path (via context flag BifrostContextKeyUseRawRequestBody) and the standard Anthropic conversion path. It correctly handles:

  • Raw body retrieval and processing when flagged, with required transformations (anthropic_version injection, max_tokens, stream flag)
  • Removal of model/region fields for Vertex compatibility
  • Proper streaming flag assignment (false for responses, true for stream)
  • Deployment-to-model mapping
  • Consistent error handling

The refactored calls at lines 703 and ~841 are correct and preserve prior semantics.

core/providers/openai/types_test.go (2)

1-203: Good coverage for base-field + ChatParameters preservation.
These cases should prevent accidental regressions in OpenAIChatRequest custom unmarshalling.


333-462: Presence + value assertion tests look clean and complementary.
Nice separation between “field present” and “exact value” validation.

core/providers/anthropic/errors.go (1)

3-58: No redeclaration exists for ToAnthropicResponsesStreamError—only one definition is present in the codebase.

The primary concern about package-level redeclaration is unfounded; there is exactly one definition at core/providers/anthropic/errors.go:43 and one active usage at transports/bifrost-http/integrations/anthropic.go:129.

The silent return of "" on json.Marshal failure is a minor consideration. In practice, marshaling the constructed AnthropicMessageError is extremely unlikely to fail since it uses well-defined, serializable types. If error resilience is needed, a minimal fallback (e.g., "event: error\ndata: {}\n\n") could be considered, but the current implementation is acceptable for this use case.

Likely an incorrect or invalid review comment.

core/providers/openai/types.go (1)

11-13: LGTM! Constant defined for minimum token clamping.

The MinMaxCompletionTokens = 16 constant provides a sensible minimum floor for output tokens, used in responses.go to clamp MaxOutputTokens.

core/providers/openai/responses.go (3)

47-59: LGTM! Excellent documentation of the reasoning-only message skip constraint.

The detailed comment (lines 47-52) clearly explains why reasoning-only messages are skipped for non-gpt-oss models. This addresses the past review request for documenting this API format constraint. The logic correctly checks all required conditions before skipping.


71-78: LGTM! Pointer-to-range-variable bug correctly fixed.

Using schemas.Ptr(summary.Text) (line 76) correctly creates a new pointer for each iteration, avoiding the aliasing issue where all Text pointers would reference the same memory location. This addresses the critical bug flagged in past reviews.
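The aliasing bug this fixes is easy to reproduce in isolation; a standalone sketch (note that Go 1.22 changed loop-variable scoping, so taking `&s.Text` directly only aliases on earlier toolchains — the copy-then-point pattern is safe everywhere):

```go
package main

import "fmt"

type summary struct{ Text string }

// ptr copies its argument and returns a pointer to the copy,
// mirroring what schemas.Ptr does in bifrost.
func ptr(s string) *string { return &s }

// textPtrs collects one distinct *string per summary.
func textPtrs(summaries []summary) []*string {
	out := make([]*string, 0, len(summaries))
	for _, s := range summaries {
		// Safe: ptr(s.Text) always copies. On a pre-1.22 toolchain,
		// appending &s.Text here would leave every element pointing
		// at the single loop variable's final value.
		out = append(out, ptr(s.Text))
	}
	return out
}

func main() {
	got := textPtrs([]summary{{"a"}, {"b"}, {"c"}})
	fmt.Println(*got[0], *got[1], *got[2]) // a b c
}
```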


100-109: LGTM! Proper parameter sanitization and clamping.

The code correctly:

  • Clamps MaxOutputTokens to a minimum of 16 tokens (lines 102-104)
  • Sanitizes the User field to respect OpenAI's 64-character limit (line 106)
  • Filters out unsupported tool types (line 108)
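The clamping step can be sketched as a small helper (the constant matches MinMaxCompletionTokens = 16; the function name is illustrative):

```go
package main

import "fmt"

const minMaxCompletionTokens = 16

// clampMaxOutputTokens raises a too-small requested budget to the
// minimum the API accepts, leaving nil (unset) and valid values alone.
func clampMaxOutputTokens(v *int) *int {
	if v != nil && *v < minMaxCompletionTokens {
		clamped := minMaxCompletionTokens
		return &clamped
	}
	return v
}

func main() {
	five := 5
	fmt.Println(*clampMaxOutputTokens(&five))     // 16
	fmt.Println(clampMaxOutputTokens(nil) == nil) // true: unset stays unset
}
```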
core/providers/openai/responses_test.go (2)

10-193: LGTM! Comprehensive tests for reasoning-only message skip logic.

These tests thoroughly cover the documented OpenAI responses format constraint. The test cases validate:

  • Reasoning-only messages are skipped for non-gpt-oss models (when Summary is empty, ContentBlocks are non-empty, and EncryptedContent is nil)
  • Messages with Summary, EncryptedContent, empty ContentBlocks, or nil Content are preserved
  • gpt-oss models preserve reasoning-only messages

This directly addresses the past review request for unit tests to lock in this behavior.


195-346: LGTM! Thorough tests for Summary-to-ContentBlocks conversion.

The tests properly validate that:

  • gpt-oss models convert Summary to ContentBlocks when Content is nil
  • Existing Content is preserved (not overwritten)
  • Variant model names containing "gpt-oss" also trigger conversion
  • Original message fields (ID, Type, Status, Role) are preserved during transformation
core/providers/cohere/chat.go (2)

104-129: LGTM! Reasoning budget calculation with proper error handling.

The updated logic correctly:

  1. Prioritizes explicit MaxTokens when provided (lines 106-110)
  2. Falls back to effort-based budget calculation using GetBudgetTokensFromReasoningEffort (lines 111-123)
  3. Properly handles the error return from the budget calculation (lines 117-119)
  4. Uses DefaultCompletionMaxTokens as fallback when MaxCompletionTokens is nil (lines 112-115)
  5. Disables thinking when neither MaxTokens nor valid Effort is provided (lines 124-128)

This addresses the past review concern about nil pointer dereference on MaxCompletionTokens.
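The priority order above can be sketched roughly as follows; the signatures, effort-to-fraction mapping, and the 4096 default are simplified stand-ins for the actual bifrost helpers (GetBudgetTokensFromReasoningEffort, DefaultCompletionMaxTokens), not their real values:

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-in default; the real DefaultCompletionMaxTokens may differ.
const defaultCompletionMaxTokens = 4096

// budgetFromEffort is a stand-in for GetBudgetTokensFromReasoningEffort:
// it derives a thinking budget as a fraction of the completion budget.
func budgetFromEffort(effort string, maxCompletionTokens int) (int, error) {
	switch effort {
	case "low":
		return maxCompletionTokens / 4, nil
	case "medium":
		return maxCompletionTokens / 2, nil
	case "high":
		return (maxCompletionTokens * 3) / 4, nil
	default:
		return 0, errors.New("unknown effort: " + effort)
	}
}

// thinkingBudget mirrors the fallback chain: explicit MaxTokens wins,
// then effort-based derivation (nil-safe on maxCompletionTokens),
// otherwise thinking stays disabled (budget 0).
func thinkingBudget(maxTokens *int, effort *string, maxCompletionTokens *int) (int, error) {
	if maxTokens != nil {
		return *maxTokens, nil
	}
	if effort != nil {
		completion := defaultCompletionMaxTokens
		if maxCompletionTokens != nil {
			completion = *maxCompletionTokens
		}
		return budgetFromEffort(*effort, completion)
	}
	return 0, nil // neither provided: disable thinking
}

func main() {
	explicit := 1024
	b, _ := thinkingBudget(&explicit, nil, nil)
	fmt.Println(b) // 1024

	effort := "medium"
	b, _ = thinkingBudget(nil, &effort, nil)
	fmt.Println(b) // 2048
}
```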


439-463: LGTM! Clean handling of thinking text in streaming response.

The code correctly:

  1. Extracts the thinking text once (line 440) to avoid repeated dereferencing
  2. Sets both Reasoning and ReasoningDetails[0].Text to point to the thinking content (lines 448, 453)
  3. Uses schemas.Ptr(thinkingText) to create proper pointers
core/providers/openai/responses_marshal_test.go (1)

12-480: Good coverage for custom marshaling/round-trip edge cases.
These tests should help stabilize the “field shadowing + omit reasoning.max_tokens” behavior and input union encoding.

core/schemas/responses.go (1)

730-748: ResponsesReasoning.Summary is consistently initialized to empty slice across all providers; the null-marshaling concern doesn't apply in practice.

The Summary field lacks omitempty and will marshal as an empty array [] rather than null. However, this is the intended behavior: every provider (OpenAI, Anthropic, Bedrock, Cohere, Gemini) initializes Summary to []schemas.ResponsesReasoningSummary{} when creating ResponsesReasoning, ensuring it never remains nil. If you want to make this behavior more explicit or prevent future regressions where Summary might be left uninitialized, consider adding an omitempty tag; otherwise, the current pattern is sound.
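The distinction in play here: without omitempty, a nil slice marshals as null while an initialized empty slice marshals as [], which is why the universal empty-slice initialization matters:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type reasoning struct {
	Summary []string `json:"summary"` // no omitempty, as in ResponsesReasoning
}

func marshal(r reasoning) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(reasoning{Summary: nil}))        // {"summary":null}
	fmt.Println(marshal(reasoning{Summary: []string{}})) // {"summary":[]}
}
```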

core/providers/anthropic/types.go (3)

161-197: Sonic-based AnthropicContent (un)marshal looks good; just ensure callers don’t set both fields (incl. empty slices).

The “both ContentStr and ContentBlocks set” guard is correct, but note ContentBlocks != nil will be true for empty-but-non-nil slices—so any code that default-initializes ContentBlocks: []AnthropicContentBlock{} and later sets ContentStr will now error.

If that pattern exists, you may want to normalize empty slices to nil at construction sites (not necessarily here).
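The subtlety is that Go distinguishes nil from empty-but-non-nil slices, so a != nil guard fires for an initialized-empty slice just as it does for a populated one; a minimal illustration (field names borrowed from the struct above, guard logic simplified):

```go
package main

import "fmt"

type content struct {
	ContentStr    *string
	ContentBlocks []string
}

// bothSet reproduces the guard described above: an empty-but-non-nil
// ContentBlocks still counts as "set".
func bothSet(c content) bool {
	return c.ContentStr != nil && c.ContentBlocks != nil
}

func main() {
	s := "hello"
	fmt.Println(bothSet(content{ContentStr: &s, ContentBlocks: nil}))        // false: nil blocks
	fmt.Println(bothSet(content{ContentStr: &s, ContentBlocks: []string{}})) // true: empty slice trips the guard
}
```

Normalizing empty slices to nil at construction sites, as suggested, keeps such guards from firing on default-initialized values.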


201-228: redacted_thinking + Data field addition is consistent with Anthropic’s newer redaction shapes.

No concerns with the enum expansion and Data *string.


365-424: Streaming delta tags: StopSequence *string `json:"stop_sequence"` preserves null semantics.

This matches the requirement to faithfully roundtrip null vs string for stop_sequence.

core/providers/bedrock/bedrock_test.go (2)

485-935: Test expectations updated cleanly for ResponsesMessage.Type and new tool-output representation.

The added Type fields and migration to ResponsesToolCallOutputStr align the tests with the updated schemas and should reduce brittleness.


1683-1717: Nice: nil-guards before pointer deref in assertions.

This avoids test panics and gives clearer failures.

core/providers/cohere/responses.go (1)

14-151: State pooling/reset additions are solid (maps are cleared, non-map fields reset).

This is the right direction for avoiding cross-request contamination in streaming conversions.

@TejasGhatte TejasGhatte force-pushed the 12-04-feat_responses_reasoning_fixes branch from 3a4d9a5 to bd1d8c9 Compare December 12, 2025 09:53
@TejasGhatte TejasGhatte force-pushed the 12-05-feat_send_back_raw_request_support branch from 1a511ff to 6951a77 Compare December 12, 2025 09:54
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
transports/bifrost-http/integrations/router.go (1)

820-907: Bug risk: don’t emit event: line before determining output mode (Bedrock eventstream / preformatted SSE string / JSON SSE).
Right now eventType (Line 820+) is written before the Bedrock branch and before the “preformatted SSE string” branch—this can corrupt Bedrock streams and can double-emit event: for converters that already return fully formatted SSE strings.

-               if eventType != "" {
-                   // OPENAI RESPONSES FORMAT: Use event: and data: lines for OpenAI responses API compatibility
-                   if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil {
-                       cancel() // Client disconnected (write error), cancel upstream stream
-                       return
-                   }
-               }
-
-               // Handle Bedrock Event Stream format
-               if config.Type == RouteConfigTypeBedrock && eventStreamEncoder != nil {
+               // Handle Bedrock Event Stream format first (NOT SSE)
+               if config.Type == RouteConfigTypeBedrock && eventStreamEncoder != nil {
                    // We need to cast to BedrockStreamEvent to determine event type and structure
                    if bedrockEvent, ok := convertedResponse.(*bedrock.BedrockStreamEvent); ok {
                        // Convert to sequence of specific Bedrock events
                        events := bedrockEvent.ToEncodedEvents()
@@
                    // Continue to next chunk (we handled flushing internally)
                    continue
-               } else if sseString, ok := convertedResponse.(string); ok {
+               }
+
+               // CUSTOM SSE FORMAT: converter returned a complete SSE string (should own event/data lines)
+               if sseString, ok := convertedResponse.(string); ok {
                    // CUSTOM SSE FORMAT: The converter returned a complete SSE string
                    // This is used by providers like Anthropic that need custom event types
                    // Example: "event: content_block_delta\ndata: {...}\n\n"
                    if !strings.HasPrefix(sseString, "data: ") && !strings.HasPrefix(sseString, "event: ") {
                        sseString = fmt.Sprintf("data: %s\n\n", sseString)
                    }
                    if _, err := fmt.Fprint(w, sseString); err != nil {
                        cancel() // Client disconnected (write error), cancel upstream stream
                        return
                    }
                } else {
+                   // STANDARD SSE FORMAT: emit eventType only for object->JSON SSE
+                   if eventType != "" {
+                       if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil {
+                           cancel()
+                           return
+                       }
+                   }
                    // STANDARD SSE FORMAT: The converter returned an object
                    // This will be JSON marshaled and wrapped as "data: {json}\n\n"
                    // Used by most providers (OpenAI chat/completions, Google, etc.)
                    responseJSON, err := sonic.Marshal(convertedResponse)
                    if err != nil {
                        // Log JSON marshaling error but continue processing
                        log.Printf("Failed to marshal streaming response: %v", err)
                        continue
                    }
transports/bifrost-http/handlers/inference.go (2)

91-118: Add "user" to responsesParamsKnownFields to prevent it from being duplicated into ExtraParams.

Right now "user" can be treated as “unknown”, get copied into ResponsesParameters.ExtraParams, and potentially conflict with ResponsesParameters.User handling/sanitization downstream.

 var responsesParamsKnownFields = map[string]bool{
@@
 	"truncation":           true,
+	"user":                 true,
 }

Also applies to: 612-617


580-660: The responses handler should populate RawRequestBody when raw passthrough is enabled. Currently, ctx.PostBody() is parsed for JSON but never preserved via SetRawRequestBody(). While integration routers handle this at lines 355-356 in router.go, the direct handler path in inference.go lacks this mechanism entirely. Add logic to detect raw passthrough conditions (similar to integration handlers) and call bifrostResponsesReq.SetRawRequestBody(ctx.PostBody()) when appropriate to align with the router-based implementation.

core/providers/openai/responses.go (1)

114-142: Remove ResponsesToolTypeWebSearchPreview from the allowlist; it is a legacy preview variant.

The code currently allows both web_search and web_search_preview, but OpenAI's Responses API has moved web_search to general availability with improved features (domain filtering, external web access control). The preview variant (web_search_preview) lacks these capabilities and should not be actively supported. Keep only web_search in the supported types map.

The custom tool type and local_shell (mapped to shell) are both currently supported by OpenAI and correctly included. The rest of the allowlist aligns with documented OpenAI Responses API tool types.

♻️ Duplicate comments (12)
ui/app/workspace/logs/views/logDetailsSheet.tsx (1)

187-237: Previous review feedback not addressed: boolean/number rendering bugs remain.

The issues flagged in the earlier review still exist:

  1. Critical: Line 227 renders reasoning.generate_summary (a boolean) directly in the Badge. In JSX, booleans are not rendered as text, so the Badge will appear visually empty even when the value is true.

  2. Major: Line 232's truthiness check (reasoning.max_tokens && ...) will hide the field when max_tokens is 0, which is a valid value that should be displayed in a debugging/logging UI.

  3. Lines 199, 210, 221 use truthiness checks that could also hide falsy-but-valid values.

Apply the same fix as previously suggested:

-												{reasoning.effort && (
+												{typeof reasoning.effort === "string" && (
 													<LogEntryDetailsView
 														className="w-full"
 														label="Effort"
 														value={
 															<Badge variant="secondary" className="uppercase">
 																{reasoning.effort}
 															</Badge>
 														}
 													/>
 												)}
-												{reasoning.summary && (
+												{typeof reasoning.summary === "string" && (
 													<LogEntryDetailsView
 														className="w-full"
 														label="Summary"
 														value={
 															<Badge variant="secondary" className="uppercase">
 																{reasoning.summary}
 															</Badge>
 														}
 													/>
 												)}
-												{reasoning.generate_summary && (
+												{typeof reasoning.generate_summary === "boolean" && (
 													<LogEntryDetailsView
 														className="w-full"
 														label="Generate Summary"
 														value={
 															<Badge variant="secondary" className="uppercase">
-																{reasoning.generate_summary}
+																{String(reasoning.generate_summary)}
 															</Badge>
 														}
 													/>
 												)}
-												{reasoning.max_tokens && <LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} />}
+												{typeof reasoning.max_tokens === "number" && (
+													<LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} />
+												)}
core/providers/azure/utils.go (2)

45-46: Use a schemas.Err... constant instead of string literal.

The string-literal error code "request body is not provided" won't be handled consistently with other error codes. Consider using an existing constant like schemas.ErrRequestBodyConversion with a more specific wrapped error, or add a new constant if needed.

 		if reqBody == nil {
-			return nil, providerUtils.NewBifrostOperationError("request body is not provided", nil, providerName)
+			return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("request body is nil after conversion"), providerName)
 		}

31-34: Normalize stream field when isStreaming=false in raw-body path.

When useRawRequestBody=true and the raw JSON contains "stream": true, this code won't reset it when isStreaming=false. This creates a protocol mismatch: the raw body forces streaming even when the caller expects non-streaming responses.

 		// Add stream if not present
-		if isStreaming {
-			requestBody["stream"] = true
-		}
+		if isStreaming {
+			requestBody["stream"] = true
+		} else {
+			delete(requestBody, "stream")
+		}
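The normalization this comment (and the matching Vertex one) asks for can be sketched as a small helper; the names here are illustrative, not the actual provider code:

```go
package main

import "fmt"

// normalizeStream applies the suggested rule: streaming callers always get
// "stream": true, and non-streaming callers have any stale "stream" key
// stripped, so a raw body cannot force a protocol mismatch.
func normalizeStream(body map[string]any, isStreaming bool) {
	if isStreaming {
		body["stream"] = true
	} else {
		delete(body, "stream")
	}
}

func main() {
	raw := map[string]any{"model": "gpt-4o", "stream": true}
	normalizeStream(raw, false)
	_, hasStream := raw["stream"]
	fmt.Println(hasStream) // false: the stale flag was removed
}
```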
core/providers/utils/utils.go (2)

256-275: Avoid sonic.MarshalIndent for provider request bodies (payload bloat + regression).

CheckContextAndGetRequestBody returns the JSON that gets sent to providers; indenting increases bytes/CPU for every call.

-		jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+		jsonBody, err := sonic.Marshal(convertedBody)

1026-1038: GetRandomString: revert to the already-reviewed, lower-allocation implementation.

Current code rebuilds a []rune alphabet and allocates []rune output each call.

 func GetRandomString(length int) string {
 	if length <= 0 {
 		return ""
 	}
 	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	letters := []rune("abcdefghijklmnopqrstuvwxyz0123456789")
-	b := make([]rune, length)
+	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
+	b := make([]byte, length)
 	for i := range b {
 		b[i] = letters[randomSource.Intn(len(letters))]
 	}
 	return string(b)
 }
core/providers/vertex/utils.go (1)

13-43: Raw-body path must normalize "stream" when isStreaming=false (avoid accidental streaming).

If caller provides raw JSON with "stream": true but isStreaming is false, the request still streams.

-		// Add stream if not present
-		if isStreaming {
-			requestBody["stream"] = true
-		}
+		if isStreaming {
+			requestBody["stream"] = true
+		} else {
+			delete(requestBody, "stream")
+		}
core/schemas/responses.go (1)

69-70: Fix “Not is OpenAI’s spec” typo (appears twice).

-	StopReason         *string                             `json:"stop_reason,omitempty"` // Not is OpenAI's spec, but sent by other providers
+	StopReason         *string                             `json:"stop_reason,omitempty"` // Not in OpenAI's spec, but sent by other providers
@@
-	Signature *string                                    `json:"signature,omitempty"` // Not is OpenAI's spec, but sent by other providers
+	Signature *string                                    `json:"signature,omitempty"` // Not in OpenAI's spec, but sent by other providers

Also applies to: 399-411, 731-748, 1440-1442

core/providers/gemini/responses.go (1)

138-179: Fix unsafe tool-arguments pointer + ThoughtSignature encoding/decoding (still present in this PR branch).
This reintroduces the same failure modes: Arguments: &argumentsStr even when empty, and string(part.ThoughtSignature) (non‑UTF8 risk). Also, the lookahead path should decode from base64.

@@
-               argumentsStr := ""
+               var argumentsStr *string
                if part.FunctionCall.Args != nil {
                    if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil {
-                       argumentsStr = string(argsBytes)
+                       s := string(argsBytes)
+                       // Avoid empty/whitespace/"null" which breaks replay Unmarshal.
+                       if ts := strings.TrimSpace(s); ts != "" && ts != "null" {
+                           argumentsStr = &s
+                       }
                    }
                }
@@
                toolMsg := &schemas.ResponsesToolMessage{
                    CallID:    &functionCallID,
                    Name:      &functionCallName,
-                   Arguments: &argumentsStr,
+                   Arguments: argumentsStr,
                }
@@
                if len(part.ThoughtSignature) > 0 {
-                   thoughtSig := string(part.ThoughtSignature)
+                   thoughtSig := base64.RawStdEncoding.EncodeToString(part.ThoughtSignature)
                    reasoningMsg := schemas.ResponsesMessage{
@@
                        ResponsesReasoning: &schemas.ResponsesReasoning{
                            Summary:          []schemas.ResponsesReasoningSummary{},
                            EncryptedContent: &thoughtSig,
                        },
                    }
                    messages = append(messages, reasoningMsg)
                }
@@
                    if i+1 < len(messages) {
                        nextMsg := messages[i+1]
                        if nextMsg.Type != nil && *nextMsg.Type == schemas.ResponsesMessageTypeReasoning &&
                            nextMsg.ResponsesReasoning != nil && nextMsg.ResponsesReasoning.EncryptedContent != nil {
-                           part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent)
+                           sigStr := *nextMsg.ResponsesReasoning.EncryptedContent
+                           if sigBytes, err := base64.RawStdEncoding.DecodeString(sigStr); err == nil {
+                               part.ThoughtSignature = sigBytes
+                           } else {
+                               // Backward-compat fallback if older logs stored raw string bytes.
+                               part.ThoughtSignature = []byte(sigStr)
+                           }
                        }
                    }

If this is “handled in a different PR in the stack”, please ensure this PR is rebased onto that fix before merging the stack (otherwise the bug remains here).

Also applies to: 609-626

core/internal/testutil/chat_completion_stream.go (1)

535-538: Cerebras prompt requires tools, but the request config provides none.
This is likely to fail/flap; either remove “use your tools” wording or configure tools for Cerebras.

@@
-            if testConfig.Provider == schemas.Cerebras {
-                problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this"
-            }
+            if testConfig.Provider == schemas.Cerebras {
+                // Keep it reasoning-focused (no tools configured in this test).
+                problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?"
+            }

Also applies to: 544-556

core/providers/anthropic/utils.go (1)

30-53: Raw mode may override user's explicit stream: false.

When isStreaming is true, line 48 unconditionally sets requestBody["stream"] = true, potentially overriding a user-provided stream: false. This differs from the max_tokens handling which checks for existence first (line 43).

Apply this diff to respect user-provided stream values:

 		// Add stream if not present
 		if isStreaming {
-			requestBody["stream"] = true
+			if _, exists := requestBody["stream"]; !exists {
+				requestBody["stream"] = true
+			}
 		}
core/providers/bedrock/utils.go (1)

119-124: additionalModelRequestFieldPaths can overwrite reasoning_config set earlier.

This overwrites AdditionalModelRequestFields entirely, potentially discarding the reasoning_config set in lines 43-75. Consider merging instead:

 if requestFields, exists := bifrostReq.Params.ExtraParams["additionalModelRequestFieldPaths"]; exists {
 	if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
-		bedrockReq.AdditionalModelRequestFields = orderedFields
+		if bedrockReq.AdditionalModelRequestFields == nil {
+			bedrockReq.AdditionalModelRequestFields = orderedFields
+		} else {
+			for key, value := range orderedFields {
+				bedrockReq.AdditionalModelRequestFields[key] = value
+			}
+		}
 	}
 }
transports/bifrost-http/integrations/anthropic.go (1)

208-208: Make claude-cli detection case-insensitive for header values.

The user-agent header name comparison uses EqualFold (line 201) for case-insensitivity, but the value comparison at lines 208 and 233 uses Contains, which is case-sensitive. A user-agent like "Claude-CLI/1.0" or "CLAUDE-CLI" won't be detected.

Apply this pattern for consistency:

-   if strings.Contains(userAgent[0], "claude-cli") {
+   if strings.Contains(strings.ToLower(userAgent[0]), "claude-cli") {
        *bifrostCtx = context.WithValue(*bifrostCtx, schemas.BifrostContextKeyUserAgent, "claude-cli")
    }

And similarly at line 233:

-   if strings.Contains(userAgent, "claude-cli") {
+   if strings.Contains(strings.ToLower(userAgent), "claude-cli") {
        isClaudeCode = true
    }
🤖 Prompt for AI Agents
In transports/bifrost-http/integrations/anthropic.go at lines 208 and 233, the
code checks the User-Agent header value with strings.Contains which is
case-sensitive; update both locations to use strings.Contains(strings.ToLower(userAgent),
"claude-cli") or strings.Contains(strings.ToLower(userAgent[0]), "claude-cli")
respectively, so the check matches "claude-cli", "Claude-CLI", "CLAUDE-CLI",
etc.

Also applies to: 233-233

🧹 Nitpick comments (11)
transports/bifrost-http/integrations/router.go (2)

709-712: Prefer capability-driven DONE-marker gating over strings.Contains(config.Path, "/responses") (path check is brittle).
This matches the intended behavior (no data: [DONE] for Responses/Anthropic), but tying semantics to a path substring can silently break if routes are renamed/aliased or other providers reuse /responses. Consider driving this off route type + request kind (e.g., ResponsesRequest) or a RouteConfig flag. Based on learnings, this is the right direction (typed SSE events; no legacy DONE).

Also applies to: 916-927


883-885: Preformatted SSE detection is too narrow (might wrap valid SSE starting with id: / retry: / comments).
If any converter emits those fields (or multi-line SSE where the first line isn't data:/event:), this will incorrectly wrap and break the stream. Consider treating "already contains \n" as preformatted, or accepting the standard SSE field prefixes.
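One way to widen the check is to accept every field name the SSE spec defines plus comment lines; this is a sketch, not the router's actual detection logic:

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeSSE reports whether a chunk already starts with a standard SSE
// field ("data", "event", "id", "retry") or a comment line, meaning it
// should be passed through rather than wrapped again.
func looksLikeSSE(chunk string) bool {
	for _, prefix := range []string{"data:", "event:", "id:", "retry:", ":"} {
		if strings.HasPrefix(chunk, prefix) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksLikeSSE("id: 42\ndata: {}\n\n")) // true
	fmt.Println(looksLikeSSE(`{"delta":"hello"}`))    // false
}
```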

core/providers/openrouter/openrouter_test.go (1)

25-49: Model slug is valid, but consider env override for test resilience.

openai/gpt-oss-120b is a real OpenRouter model with reasoning support (released Aug 2025). The slug format is correct. However, since this is an external dependency, consider making ReasoningModel overridable via environment variable (defaulting to this value) so CI/dev can quickly swap models without code changes if entitlements or availability changes.

core/providers/utils/utils.go (2)

318-346: Keep the “copy body” invariant when decoding compressed provider errors.

You start with a defensive copy (body := append([]byte(nil), resp.Body()...)) but then overwrite it with decodedBody (which can be resp.Body() for non-gzip). Consider copying decodedBody back into body before unmarshalling to avoid any future reuse/lifetime surprises if callers release resp.

-	body = decodedBody
+	body = append([]byte(nil), decodedBody...)

Also applies to: 489-498


1040-1121: Reasoning-effort ↔ budget mapping: pin edge-case semantics with tests (esp. maxTokens <= minBudgetTokens).

The “zero/negative range” paths return "high" (effort) or minBudgetTokens (budget). That may be correct, but it’s easy to regress—worth locking with unit tests for:

  • budgetTokens <= 0
  • maxTokens <= 0
  • maxTokens == minBudgetTokens
  • unknown effort strings
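Those cases fit naturally into a table-driven test. The stand-in below uses hypothetical clamping rules (budgetFromEffort is not the real GetBudgetTokensFromReasoningEffort) purely to show the shape:

```go
package main

import "fmt"

// budgetFromEffort is a stand-in for GetBudgetTokensFromReasoningEffort; the
// clamping rules here are assumptions for illustration, not the real mapping.
func budgetFromEffort(effort string, maxTokens, minBudget int) int {
	if maxTokens <= minBudget {
		return minBudget
	}
	span := maxTokens - minBudget
	switch effort {
	case "low":
		return minBudget + span/4
	case "medium":
		return minBudget + span/2
	default: // "high" and unknown effort strings
		return maxTokens
	}
}

func main() {
	cases := []struct {
		effort          string
		maxTokens, want int
	}{
		{"high", 0, 1024},     // maxTokens <= 0 clamps to the minimum budget
		{"low", 1024, 1024},   // maxTokens == minBudget clamps too
		{"bogus", 4096, 4096}, // unknown effort falls through to "high"
	}
	for _, c := range cases {
		got := budgetFromEffort(c.effort, c.maxTokens, 1024)
		fmt.Println(c.effort, got == c.want)
	}
}
```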
core/internal/testutil/reasoning.go (1)

201-223: ChatCompletionReasoningTest: skip logic is fine; confirm provider capability split is intentional.
This runs reasoning via Chat Completions (and skips OpenAI). Just confirm testConfig.Scenarios.Reasoning means “reasoning supported via both APIs”, or adjust to separate flags (Responses vs Chat) if you start seeing provider-specific failures.

Also applies to: 232-245

core/internal/testutil/chat_completion_stream.go (1)

403-420: Use a timeout ctx for the request, not just for reading the channel.
Right now streamCtx only affects the reader; the provider stream can keep running after timeout.

@@
-            responseChannel, err := WithStreamRetry(t, retryConfig, retryContext, func() (chan *schemas.BifrostStream, *schemas.BifrostError) {
-                return client.ChatCompletionStreamRequest(ctx, request)
-            })
+            reqCtx, reqCancel := context.WithTimeout(ctx, 200*time.Second)
+            defer reqCancel()
+            responseChannel, err := WithStreamRetry(t, retryConfig, retryContext, func() (chan *schemas.BifrostStream, *schemas.BifrostError) {
+                return client.ChatCompletionStreamRequest(reqCtx, request)
+            })
@@
-            streamCtx, cancel := context.WithTimeout(ctx, 200*time.Second)
+            streamCtx, cancel := context.WithTimeout(reqCtx, 200*time.Second)
             defer cancel()

(Apply similarly in the “Validated” subtest.)

Also applies to: 591-599

core/providers/vertex/vertex.go (1)

777-785: Duplicate ModelRequested assignment.

response.ExtraFields.ModelRequested is set twice: once within the struct initialization (line 780) and again at line 784. The second assignment is redundant.

Apply this diff to remove the redundant assignment:

 		response.ExtraFields = schemas.BifrostResponseExtraFields{
 			RequestType:    schemas.ResponsesRequest,
 			Provider:       providerName,
 			ModelRequested: request.Model,
 			Latency:        latency.Milliseconds(),
 		}
 
-		response.ExtraFields.ModelRequested = request.Model
-
 		// Set raw request if enabled
core/providers/bedrock/types.go (1)

100-140: Custom UnmarshalJSON implementation is correct but silently ignores malformed unknown fields.

The implementation correctly:

  1. Uses the alias pattern to avoid infinite recursion
  2. Captures all unregistered fields into ExtraParams
  3. Includes all struct fields in bedrockConverseRequestKnownFields

However, line 133 silently continues when an unknown field fails to unmarshal. Consider logging this for debugging:

 for key, value := range rawData {
 	if !bedrockConverseRequestKnownFields[key] {
 		var v interface{}
 		if err := sonic.Unmarshal(value, &v); err != nil {
-			continue // Skip fields that can't be unmarshaled
+			// Malformed unknown field: skipped, but this is where a
+			// debug log via the provider's logger would surface it.
+			continue
 		}
 		r.ExtraParams[key] = v
 	}
 }
transports/bifrost-http/integrations/anthropic.go (2)

7-7: Use structured logger instead of stdlib log.Printf for streaming errors.

The log package is imported (line 7) and used at line 115 for logging JSON marshaling errors. Based on learnings, you should use the existing structured logger instead of stdlib logging.

Consider passing the router/handler logger through the converter context or using a logger from the request context, similar to how other handlers in this codebase log errors.

🤖 Prompt for AI Agents
In transports/bifrost-http/integrations/anthropic.go at line 115, replace the
stdlib log.Printf call with a structured logger; either pass the logger through
the converter function signature (e.g., add a logger parameter to
ResponsesStreamResponseConverter) or extract it from the context (*ctx), then
use logger.Error or logger.Warn to log the JSON marshaling error with
structured fields (e.g., logger.Error("failed to marshal streaming response",
"error", err)) instead of log.Printf.

Also applies to: 115-115


110-120: Consider using strings.Builder for SSE event concatenation.

When concatenating multiple streaming events (lines 110-120), the code uses combinedContent += fmt.Sprintf(...) which creates a new string on each iteration. For many events, this can be inefficient due to repeated allocations.

Using strings.Builder would be more efficient:

-   combinedContent := ""
+   var combinedContent strings.Builder
    for _, event := range anthropicResponse {
        responseJSON, err := sonic.Marshal(event)
        if err != nil {
            log.Printf("Failed to marshal streaming response: %v", err)
            continue
        }
-       combinedContent += fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON)
+       combinedContent.WriteString(fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON))
    }
-   return "", combinedContent, nil
+   return "", combinedContent.String(), nil
🤖 Prompt for AI Agents
In transports/bifrost-http/integrations/anthropic.go around lines 110 to 120,
replace the string concatenation (combinedContent += ...) with a strings.Builder
for better performance when handling multiple events; declare var combinedContent
strings.Builder, use combinedContent.WriteString instead of +=, and return
combinedContent.String() at the end.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a4d9a5 and bd1d8c9.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (58)
  • core/internal/testutil/account.go (1 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/reasoning.go (4 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/internal/testutil/tests.go (2 hunks)
  • core/providers/anthropic/anthropic.go (3 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (8 hunks)
  • core/providers/anthropic/utils.go (2 hunks)
  • core/providers/azure/azure.go (2 hunks)
  • core/providers/azure/utils.go (1 hunks)
  • core/providers/bedrock/bedrock.go (2 hunks)
  • core/providers/bedrock/bedrock_test.go (13 hunks)
  • core/providers/bedrock/types.go (2 hunks)
  • core/providers/bedrock/utils.go (2 hunks)
  • core/providers/cerebras/cerebras_test.go (2 hunks)
  • core/providers/cohere/chat.go (3 hunks)
  • core/providers/cohere/cohere.go (2 hunks)
  • core/providers/cohere/cohere_test.go (1 hunks)
  • core/providers/cohere/responses.go (7 hunks)
  • core/providers/cohere/types.go (1 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/groq/groq_test.go (2 hunks)
  • core/providers/mistral/mistral_test.go (1 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/responses_marshal_test.go (1 hunks)
  • core/providers/openai/responses_test.go (1 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/types_test.go (1 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/openrouter/openrouter_test.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/providers/vertex/types.go (0 hunks)
  • core/providers/vertex/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (3 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/mux.go (0 hunks)
  • core/schemas/responses.go (5 hunks)
  • core/schemas/utils.go (1 hunks)
  • docs/docs.json (0 hunks)
  • framework/configstore/rdb.go (0 hunks)
  • framework/streaming/chat.go (0 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (5 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)
💤 Files with no reviewable changes (6)
  • core/providers/vertex/types.go
  • framework/streaming/chat.go
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
  • docs/docs.json
  • framework/configstore/rdb.go
  • core/schemas/mux.go
🚧 Files skipped from review as they are similar to previous changes (18)
  • ui/lib/types/logs.ts
  • core/providers/openai/utils.go
  • core/providers/anthropic/errors.go
  • core/schemas/utils.go
  • core/schemas/bifrost.go
  • ui/app/workspace/logs/views/columns.tsx
  • core/internal/testutil/account.go
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/providers/anthropic/anthropic.go
  • core/providers/vertex/vertex_test.go
  • ui/package.json
  • core/internal/testutil/tests.go
  • transports/bifrost-http/handlers/middlewares.go
  • core/providers/openai/chat.go
  • core/providers/cerebras/cerebras_test.go
  • core/internal/testutil/responses_stream.go
  • core/providers/azure/azure.go
  • framework/streaming/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openrouter/openrouter_test.go
  • core/providers/anthropic/chat.go
  • core/providers/vertex/errors.go
  • core/providers/anthropic/utils.go
  • core/providers/bedrock/types.go
  • core/providers/groq/groq_test.go
  • core/providers/openai/types.go
  • core/providers/openai/responses_test.go
  • core/providers/bedrock/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/openai/text.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/handlers/inference.go
  • ui/app/workspace/logs/views/logDetailsSheet.tsx
  • core/providers/cohere/chat.go
  • core/schemas/responses.go
  • core/providers/azure/utils.go
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/vertex/utils.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types_test.go
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/cohere.go
  • core/internal/testutil/reasoning.go
  • core/providers/bedrock/bedrock.go
  • core/providers/openai/responses.go
  • core/providers/anthropic/types.go
  • core/providers/cohere/responses.go
  • core/providers/utils/utils.go
  • core/providers/openai/responses_marshal_test.go
  • core/providers/cohere/types.go
  • core/providers/mistral/mistral_test.go
  • transports/bifrost-http/integrations/router.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openrouter/openrouter_test.go
  • core/providers/anthropic/chat.go
  • core/providers/vertex/errors.go
  • core/providers/anthropic/utils.go
  • core/providers/bedrock/types.go
  • core/providers/groq/groq_test.go
  • core/providers/openai/types.go
  • core/providers/openai/responses_test.go
  • core/providers/bedrock/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/openai/text.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/handlers/inference.go
  • core/providers/cohere/chat.go
  • core/schemas/responses.go
  • core/providers/azure/utils.go
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/vertex/utils.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types_test.go
  • core/providers/cohere/cohere_test.go
  • core/providers/cohere/cohere.go
  • core/internal/testutil/reasoning.go
  • core/providers/bedrock/bedrock.go
  • core/providers/openai/responses.go
  • core/providers/anthropic/types.go
  • core/providers/cohere/responses.go
  • core/providers/utils/utils.go
  • core/providers/openai/responses_marshal_test.go
  • core/providers/cohere/types.go
  • core/providers/mistral/mistral_test.go
  • transports/bifrost-http/integrations/router.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
  • core/providers/openai/responses_test.go
  • core/providers/openai/text.go
  • core/providers/openai/types_test.go
  • core/providers/openai/responses.go
  • core/providers/openai/responses_marshal_test.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.

Applied to files:

  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
🧬 Code graph analysis (22)
core/providers/anthropic/chat.go (6)
core/providers/anthropic/types.go (2)
  • MinimumReasoningMaxTokens (15-15)
  • AnthropicThinking (69-72)
core/providers/bedrock/types.go (1)
  • MinimumReasoningMaxTokens (12-12)
core/providers/cohere/types.go (1)
  • MinimumReasoningMaxTokens (11-11)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/anthropic/utils.go (6)
core/schemas/responses.go (1)
  • BifrostResponsesRequest (32-39)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (2)
  • ErrRequestBodyConversion (25-25)
  • ErrProviderRequestMarshal (26-26)
core/providers/anthropic/types.go (1)
  • AnthropicDefaultMaxTokens (14-14)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesRequest (1419-1532)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/bedrock/types.go (4)
core/providers/anthropic/types.go (1)
  • Alias (104-104)
core/providers/openai/types.go (2)
  • Alias (83-83)
  • Alias (194-194)
core/schemas/bifrost.go (2)
  • Alias (388-388)
  • Alias (405-405)
core/schemas/chatcompletions.go (4)
  • Alias (189-189)
  • Alias (643-643)
  • Alias (770-770)
  • Alias (860-860)
core/providers/openai/types.go (4)
core/schemas/chatcompletions.go (5)
  • ChatParameters (155-184)
  • Alias (189-189)
  • Alias (643-643)
  • Alias (770-770)
  • Alias (860-860)
core/providers/anthropic/types.go (1)
  • Alias (104-104)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (234-239)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (513-520)
core/providers/openai/responses_test.go (3)
core/schemas/responses.go (7)
  • ResponsesMessage (314-327)
  • ResponsesReasoning (731-734)
  • ResponsesReasoningSummary (745-748)
  • ResponsesMessageContent (339-344)
  • ResponsesMessageContentBlock (399-411)
  • ResponsesOutputMessageContentTypeReasoning (394-394)
  • BifrostResponsesRequest (32-39)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/openai/responses.go (1)
  • ToOpenAIResponsesRequest (37-112)
core/providers/bedrock/utils.go (4)
core/schemas/utils.go (2)
  • IsAnthropicModel (1043-1045)
  • Ptr (16-18)
core/providers/anthropic/types.go (1)
  • MinimumReasoningMaxTokens (15-15)
core/providers/cohere/types.go (2)
  • MinimumReasoningMaxTokens (11-11)
  • DefaultCompletionMaxTokens (12-12)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/providers/openai/text.go (2)
core/schemas/textcompletions.go (1)
  • TextCompletionParameters (120-140)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (5)
  • BifrostContextKeyUserAgent (123-123)
  • Anthropic (37-37)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
  • BifrostContextKeySkipKeySelection (114-114)
core/providers/anthropic/types.go (1)
  • AnthropicStreamEvent (395-404)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (87-114)
ui/app/workspace/logs/views/logDetailsSheet.tsx (3)
ui/components/ui/separator.tsx (1)
  • DottedSeparator (43-43)
ui/app/workspace/logs/views/logEntryDetailsView.tsx (1)
  • LogEntryDetailsView (15-49)
ui/components/ui/badge.tsx (1)
  • Badge (37-37)
core/providers/cohere/chat.go (5)
core/providers/cohere/types.go (5)
  • CohereThinking (174-177)
  • ThinkingTypeEnabled (183-183)
  • DefaultCompletionMaxTokens (12-12)
  • MinimumReasoningMaxTokens (11-11)
  • ThinkingTypeDisabled (184-184)
core/providers/bedrock/types.go (2)
  • DefaultCompletionMaxTokens (13-13)
  • MinimumReasoningMaxTokens (12-12)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/providers/anthropic/types.go (1)
  • MinimumReasoningMaxTokens (15-15)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
core/providers/azure/utils.go (6)
core/schemas/responses.go (1)
  • BifrostResponsesRequest (32-39)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (2)
  • ErrRequestBodyConversion (25-25)
  • ErrProviderRequestMarshal (26-26)
core/providers/anthropic/types.go (1)
  • AnthropicDefaultMaxTokens (14-14)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesRequest (1419-1532)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/internal/testutil/chat_completion_stream.go (5)
core/schemas/chatcompletions.go (6)
  • ChatMessage (469-478)
  • BifrostChatRequest (12-19)
  • ChatParameters (155-184)
  • ChatReasoning (223-226)
  • BifrostChatResponse (26-41)
  • BifrostReasoningDetailsTypeText (719-719)
core/internal/testutil/utils.go (1)
  • CreateBasicChatMessage (247-254)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/internal/testutil/test_retry_framework.go (5)
  • StreamingRetryConfig (819-839)
  • TestRetryContext (168-173)
  • WithStreamRetry (580-688)
  • WithChatStreamValidationRetry (2339-2485)
  • ChatStreamValidationResult (2328-2335)
core/schemas/bifrost.go (2)
  • BifrostStream (323-330)
  • BifrostError (358-367)
core/providers/bedrock/bedrock_test.go (3)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (4)
  • ResponsesMessageTypeMessage (290-290)
  • ResponsesMessageTypeFunctionCallOutput (296-296)
  • ResponsesToolMessage (462-482)
  • ResponsesToolMessageOutputStruct (531-535)
core/schemas/chatcompletions.go (1)
  • OrderedMap (268-268)
core/providers/openai/types_test.go (2)
core/providers/openai/types.go (1)
  • OpenAIChatRequest (46-59)
core/schemas/chatcompletions.go (2)
  • ChatMessageRoleUser (462-462)
  • ChatMessageRoleSystem (463-463)
core/providers/cohere/cohere.go (1)
core/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (916-1030)
core/internal/testutil/reasoning.go (8)
core/internal/testutil/account.go (1)
  • ComprehensiveTestConfig (47-64)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostError (358-367)
ui/lib/types/logs.ts (2)
  • ChatMessage (108-117)
  • BifrostError (226-232)
core/schemas/chatcompletions.go (5)
  • ChatMessage (469-478)
  • BifrostChatRequest (12-19)
  • ChatParameters (155-184)
  • ChatReasoning (223-226)
  • BifrostChatResponse (26-41)
core/internal/testutil/utils.go (2)
  • CreateBasicChatMessage (247-254)
  • GetChatContent (329-356)
core/schemas/models.go (1)
  • Model (109-129)
core/internal/testutil/test_retry_framework.go (1)
  • WithChatTestRetry (274-424)
core/internal/testutil/validation_presets.go (1)
  • GetExpectationsForScenario (208-293)
core/providers/bedrock/bedrock.go (1)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStreamEndIndicator (113-113)
core/providers/utils/utils.go (2)
core/schemas/bifrost.go (2)
  • BifrostError (358-367)
  • ErrorField (376-383)
ui/lib/types/logs.ts (2)
  • BifrostError (226-232)
  • ErrorField (217-224)
core/providers/openai/responses_marshal_test.go (3)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (180-189)
  • OpenAIResponsesRequestInput (147-150)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (3)
  • ResponsesParameters (87-114)
  • ResponsesParametersReasoning (234-239)
  • ResponsesMessage (314-327)
transports/bifrost-http/integrations/router.go (1)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Graphite / mergeability_check

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/providers/vertex/vertex_test.go (1)

31-47: Clarify the reasoning model configuration.

The test sets ReasoningModel: "claude-4.5-haiku" but keeps Reasoning: false with a comment stating "Not supported right now because we are not using native gemini converters." This is inconsistent:

  1. If reasoning isn't supported, why specify a Claude reasoning model?
  2. The comment mentions Gemini converters, but the model is Claude

Either:

  • Remove or comment out the ReasoningModel field if reasoning is truly unsupported, or
  • Enable reasoning (Reasoning: true) and update the comment to reflect Claude-based reasoning support via Vertex

This will prevent confusion about the actual reasoning capabilities being tested.

core/providers/openai/responses.go (1)

114-142: Remove unsupported tool types that will cause OpenAI API failures.

The supportedTypes map is over-inclusive and contradicts OpenAI's official May 2025 Responses API specification. OpenAI only supports: file_search, web_search, code_interpreter, image_generation, and mcp.

The map currently marks these as supported (but OpenAI doesn't support them):

  • function — not a tool type in OpenAI's API
  • computer_use_preview — Anthropic-specific tool
  • local_shell — Anthropic-specific tool (bash)
  • custom — not in OpenAI's API
  • web_search_preview — not in official OpenAI spec

These unsupported types will pass through the filter and be sent to OpenAI, causing API failures. Update the map to only allow the five tool types OpenAI actually supports.
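
A self-contained sketch of the proposed allow-list and filter (the `filterToolTypes` helper and the exact map shape are illustrative; the real `supportedTypes` map lives in core/providers/openai/responses.go):

```go
package main

import "fmt"

// Allow-list restricted to the five tool types OpenAI's Responses API accepts.
var supportedTypes = map[string]bool{
	"file_search":      true,
	"web_search":       true,
	"code_interpreter": true,
	"image_generation": true,
	"mcp":              true,
}

// filterToolTypes keeps only supported tool types and drops everything else
// before the request is sent to OpenAI.
func filterToolTypes(types []string) []string {
	kept := make([]string, 0, len(types))
	for _, t := range types {
		if supportedTypes[t] {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	fmt.Println(filterToolTypes([]string{"file_search", "local_shell", "custom", "mcp"}))
}
```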

framework/streaming/responses.go (1)

14-157: Deep-copy is missing newly-added Signature (and other scalar pointers like FileID).
This defeats the “prevent shared mutation” goal once Signature starts being appended during accumulation.

 func deepCopyResponsesStreamResponse(original *schemas.BifrostResponsesStreamResponse) *schemas.BifrostResponsesStreamResponse {
@@
 	if original.Delta != nil {
 		copyDelta := *original.Delta
 		copy.Delta = &copyDelta
 	}
+
+	if original.Signature != nil {
+		copySig := *original.Signature
+		copy.Signature = &copySig
+	}
@@
 }
 
 func deepCopyResponsesMessageContentBlock(original schemas.ResponsesMessageContentBlock) schemas.ResponsesMessageContentBlock {
 	copy := schemas.ResponsesMessageContentBlock{
 		Type: original.Type,
 	}
+
+	if original.FileID != nil {
+		fid := *original.FileID
+		copy.FileID = &fid
+	}
 
 	if original.Text != nil {
 		copyText := *original.Text
 		copy.Text = &copyText
 	}
+
+	if original.Signature != nil {
+		sig := *original.Signature
+		copy.Signature = &sig
+	}
@@
 	return copy
 }

Also applies to: 381-424

♻️ Duplicate comments (17)
ui/app/workspace/logs/views/logDetailsSheet.tsx (1)

187-237: Fix boolean/0-value rendering in “Reasoning Parameters” (currently hides valid data / renders empty Badge).
This block still uses truthiness checks, and renders reasoning.generate_summary directly inside a <Badge> (booleans don’t render as text), plus hides max_tokens: 0.

Suggested patch:

 {(() => {
   const params = log.params as any;
   const reasoning = params?.reasoning;
-  if (!reasoning || typeof reasoning !== "object" || Object.keys(reasoning).length === 0) {
+  if (
+    !reasoning ||
+    typeof reasoning !== "object" ||
+    Array.isArray(reasoning) ||
+    Object.keys(reasoning).length === 0
+  ) {
     return null;
   }
   return (
     <>
       <DottedSeparator />
       <div className="space-y-4">
         <BlockHeader title="Reasoning Parameters" icon={<FileText className="h-5 w-5 text-gray-600" />} />
         <div className="grid w-full grid-cols-3 items-center justify-between gap-4">
-          {reasoning.effort && (
+          {typeof reasoning.effort === "string" && (
             <LogEntryDetailsView
               className="w-full"
               label="Effort"
               value={
                 <Badge variant="secondary" className="uppercase">
                   {reasoning.effort}
                 </Badge>
               }
             />
           )}
-          {reasoning.summary && (
+          {typeof reasoning.summary === "string" && (
             <LogEntryDetailsView
               className="w-full"
               label="Summary"
               value={
                 <Badge variant="secondary" className="uppercase">
                   {reasoning.summary}
                 </Badge>
               }
             />
           )}
-          {reasoning.generate_summary && (
+          {typeof reasoning.generate_summary === "boolean" && (
             <LogEntryDetailsView
               className="w-full"
               label="Generate Summary"
               value={
                 <Badge variant="secondary" className="uppercase">
-                  {reasoning.generate_summary}
+                  {String(reasoning.generate_summary)}
                 </Badge>
               }
             />
           )}
-          {reasoning.max_tokens && <LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} />}
+          {typeof reasoning.max_tokens === "number" && (
+            <LogEntryDetailsView className="w-full" label="Max Tokens" value={reasoning.max_tokens} />
+          )}
         </div>
       </div>
     </>
   );
 })()}

Follow-up (stack-aware): consider typing LogEntry["params"] to include reasoning instead of as any, since the stack is standardizing reasoning/summaries across providers.
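
The truthiness pitfall can be shown in isolation (helper names are illustrative, not part of the component):

```typescript
// Truthy guards hide valid values: 0 and false are falsy, so
// `max_tokens: 0` and `generate_summary: false` would never render.
function truthyGuard(v: unknown): boolean {
  return Boolean(v);
}

// Type guards render any value of the expected type, including 0 and false.
function typeGuard(v: unknown): boolean {
  return typeof v === "number" || typeof v === "boolean" || typeof v === "string";
}

console.log(truthyGuard(0), typeGuard(0));         // false true
console.log(truthyGuard(false), typeGuard(false)); // false true
```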

core/providers/anthropic/errors.go (1)

3-10: Re-check package-level duplication + consider sonic.Marshal for consistency.

Given prior duplicate-definition issues for ToAnthropicResponsesStreamError, please re-verify it exists in exactly one file in package anthropic.

transports/bifrost-http/integrations/router.go (1)

709-712: Avoid path-substring checks for stream termination behavior; gate by request type instead.

Using strings.Contains(config.Path, "/responses") is fragile (aliases/versioned paths can bypass it). Since this is about Responses vs non-Responses streaming semantics (and per learnings, Responses should not use [DONE]), consider deciding shouldSendDoneMarker earlier in handleStreamingRequest based on bifrostReq.ResponsesRequest != nil, and pass it into handleStreaming(...). Based on learnings, this reduces the chance of accidentally emitting [DONE] for Responses routes.

Proposed direction (sketch):

-func (g *GenericRouter) handleStreaming(ctx *fasthttp.RequestCtx, bifrostCtx *context.Context, config RouteConfig, streamChan chan *schemas.BifrostStream, cancel context.CancelFunc) {
+func (g *GenericRouter) handleStreaming(ctx *fasthttp.RequestCtx, bifrostCtx *context.Context, config RouteConfig, streamChan chan *schemas.BifrostStream, cancel context.CancelFunc, shouldSendDoneMarker bool) {
   ...
-  shouldSendDoneMarker := true
-  if config.Type == RouteConfigTypeAnthropic || strings.Contains(config.Path, "/responses") {
-    shouldSendDoneMarker = false
-  }
core/providers/anthropic/utils.go (1)

30-76: Raw-body branch should normalize "stream" based on isStreaming (and handle empty raw body).
Right now, non-streaming calls can accidentally keep a user-provided "stream": true, and streaming calls always overwrite user intent. At minimum, make behavior explicit and consistent.

 	if useRawBody, ok := ctx.Value(schemas.BifrostContextKeyUseRawRequestBody).(bool); ok && useRawBody {
 		jsonBody = request.GetRawRequestBody()
+		if len(jsonBody) == 0 {
+			return nil, providerUtils.NewBifrostOperationError("request body is not provided", nil, providerName)
+		}
 		// Unmarshal and check if model and region are present
 		var requestBody map[string]interface{}
 		if err := sonic.Unmarshal(jsonBody, &requestBody); err != nil {
 			return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyConversion, fmt.Errorf("failed to unmarshal request body: %w", err), providerName)
 		}
@@
-		// Add stream if not present
-		if isStreaming {
-			requestBody["stream"] = true
-		}
+		// Normalize stream to match the actual transport
+		if isStreaming {
+			requestBody["stream"] = true
+		} else {
+			delete(requestBody, "stream")
+		}

Also please document the security/contract assumptions of bifrost-use-raw-request-body (who is allowed to set it, and what validations are intentionally bypassed).

core/providers/vertex/utils.go (1)

13-86: Raw-body branch should delete "stream" when isStreaming == false to prevent accidental SSE.

 		// Add stream if not present
 		if isStreaming {
 			requestBody["stream"] = true
+		} else {
+			delete(requestBody, "stream")
 		}

(Also consider sharing the same normalization behavior across core/providers/vertex/utils.go and core/providers/anthropic/utils.go so the stack has one consistent raw-body contract.)
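
The shared raw-body contract could be captured in one small helper (`normalizeStream` is a hypothetical name; a sketch of the behavior suggested in the diff above):

```go
package main

import "fmt"

// normalizeStream makes the raw-body "stream" flag match the actual transport:
// set it for streaming calls, and drop any user-supplied value otherwise.
func normalizeStream(body map[string]interface{}, isStreaming bool) {
	if isStreaming {
		body["stream"] = true
	} else {
		delete(body, "stream")
	}
}

func main() {
	body := map[string]interface{}{"model": "claude-3", "stream": true}
	normalizeStream(body, false) // non-streaming call: stale "stream": true is removed
	_, present := body["stream"]
	fmt.Println(present)
}
```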

core/providers/bedrock/utils.go (1)

119-124: additionalModelRequestFieldPaths overwrites reasoning_config set earlier.

When both Reasoning config (lines 35-77) and additionalModelRequestFieldPaths in ExtraParams are provided, the assignment at line 122 overwrites AdditionalModelRequestFields entirely, losing the reasoning_config that was set earlier.

Consider merging instead of overwriting:

 		if requestFields, exists := bifrostReq.Params.ExtraParams["additionalModelRequestFieldPaths"]; exists {
 			if orderedFields, ok := schemas.SafeExtractOrderedMap(requestFields); ok {
-				bedrockReq.AdditionalModelRequestFields = orderedFields
+				if bedrockReq.AdditionalModelRequestFields == nil {
+					bedrockReq.AdditionalModelRequestFields = orderedFields
+				} else {
+					// Merge orderedFields into existing, preserving reasoning_config
+					for key, value := range orderedFields {
+						bedrockReq.AdditionalModelRequestFields[key] = value
+					}
+				}
 			}
 		}
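
The merge semantics, in isolation (`mergeFields` is a hypothetical helper mirroring the suggested behavior):

```go
package main

import "fmt"

// mergeFields overlays user-supplied fields onto an existing map without
// discarding entries set earlier (e.g. reasoning_config).
func mergeFields(existing, incoming map[string]interface{}) map[string]interface{} {
	if existing == nil {
		return incoming
	}
	for k, v := range incoming {
		existing[k] = v
	}
	return existing
}

func main() {
	existing := map[string]interface{}{
		"reasoning_config": map[string]interface{}{"budget": 1024},
	}
	merged := mergeFields(existing, map[string]interface{}{"custom_field": "x"})
	_, kept := merged["reasoning_config"]
	fmt.Println(kept, merged["custom_field"])
}
```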
core/providers/azure/utils.go (2)

31-34: Raw-body path should normalize stream field when isStreaming=false.

When useRawRequestBody=true and the raw JSON already contains "stream": true, this code won't reset it when isStreaming=false. This creates a protocol mismatch risk where the raw body forces streaming even when the caller expects non-streaming responses.

 		// Add stream if not present
 		if isStreaming {
 			requestBody["stream"] = true
+		} else {
+			delete(requestBody, "stream")
 		}

45-47: Avoid string-literal error codes in NewBifrostOperationError.

Using "request body is not provided" as a string literal won't be handled consistently with other error codes. Prefer a schemas.Err... constant.

 		if reqBody == nil {
-			return nil, providerUtils.NewBifrostOperationError("request body is not provided", nil, providerName)
+			return nil, providerUtils.NewBifrostOperationError(schemas.ErrRequestBodyMissing, nil, providerName)
 		}

Note: You may need to define ErrRequestBodyMissing in the schemas package if it doesn't exist.

core/internal/testutil/chat_completion_stream.go (1)

523-708: Cerebras prompt requires tools but the test doesn’t configure any tools.
This is the same issue previously raised; it will likely cause failures/non-determinism for Cerebras. Consider using the same reasoning/math prompt as other providers, or wire tool configuration for Cerebras in this scenario.

-			if testConfig.Provider == schemas.Cerebras {
-				problemPrompt = "Hello how are you, can you search hackernews news regarding maxim ai for me? use your tools for this"
-			}
+			if testConfig.Provider == schemas.Cerebras {
+				problemPrompt = "Explain step by step: What is 15% of 200, then multiply that result by 3?"
+			}
core/schemas/responses.go (2)

69-70: Fix comment typo (“Not in OpenAI’s spec”).

-	StopReason         *string                             `json:"stop_reason,omitempty"` // Not is OpenAI's spec, but sent by other providers
+	StopReason         *string                             `json:"stop_reason,omitempty"` // Not in OpenAI's spec, but sent by other providers

1440-1443: Fix comment typo (“Not in OpenAI’s spec”).

-	Signature *string                                    `json:"signature,omitempty"` // Not is OpenAI's spec, but sent by other providers
+	Signature *string                                    `json:"signature,omitempty"` // Not in OpenAI's spec, but sent by other providers
framework/streaming/responses.go (2)

498-534: Don’t mutate a non-reasoning message when matching by ItemID only.
This is the same issue previously raised: strengthen the lookup to ensure Type == reasoning before appending deltas/signatures.

-				for i := len(messages) - 1; i >= 0; i-- {
-					if messages[i].ID != nil && *messages[i].ID == *resp.ItemID {
+				for i := len(messages) - 1; i >= 0; i-- {
+					if messages[i].ID != nil && *messages[i].ID == *resp.ItemID &&
+						messages[i].Type != nil && *messages[i].Type == schemas.ResponsesMessageTypeReasoning {
 						targetMessage = &messages[i]
 						break
 					}
 				}

626-727: Reasoning summary delta/signature should consistently populate ResponsesReasoning (not ContentBlocks) + guard negative ContentIndex.
Current logic writes into content blocks when contentIndex != nil, but this event type is “reasoning_summary_text.delta”; it should map to ResponsesReasoning.Summary (+ signature into EncryptedContent) regardless of ContentIndex. Also, make([]T, *contentIndex+1) will panic for negative values.

 func (a *Accumulator) appendReasoningDeltaToResponsesMessage(message *schemas.ResponsesMessage, delta string, contentIndex *int) {
-	// If we have a content index, this is reasoning content in content blocks
-	if contentIndex != nil {
-		...
-	} else {
-		// No content index - this is reasoning summary accumulation
-		if message.ResponsesReasoning == nil {
-			message.ResponsesReasoning = &schemas.ResponsesReasoning{
-				Summary: []schemas.ResponsesReasoningSummary{},
-			}
-		}
-		...
-	}
+	if message.ResponsesReasoning == nil {
+		message.ResponsesReasoning = &schemas.ResponsesReasoning{Summary: []schemas.ResponsesReasoningSummary{}}
+	}
+	idx := 0
+	if contentIndex != nil && *contentIndex >= 0 {
+		idx = *contentIndex
+	}
+	for len(message.ResponsesReasoning.Summary) <= idx {
+		message.ResponsesReasoning.Summary = append(message.ResponsesReasoning.Summary, schemas.ResponsesReasoningSummary{
+			Type: schemas.ResponsesReasoningContentBlockTypeSummaryText,
+			Text: "",
+		})
+	}
+	message.ResponsesReasoning.Summary[idx].Text += delta
 }
 
 func (a *Accumulator) appendReasoningSignatureToResponsesMessage(message *schemas.ResponsesMessage, signature string, contentIndex *int) {
-	if contentIndex != nil {
-		...
-	} else {
-		...
-	}
+	if message.ResponsesReasoning == nil {
+		message.ResponsesReasoning = &schemas.ResponsesReasoning{Summary: []schemas.ResponsesReasoningSummary{}}
+	}
+	if message.ResponsesReasoning.EncryptedContent == nil {
+		message.ResponsesReasoning.EncryptedContent = &signature
+	} else {
+		*message.ResponsesReasoning.EncryptedContent += signature
+	}
 }
transports/bifrost-http/integrations/anthropic.go (4)

109-120: Avoid stdlib log.Printf in request path; use structured logger (and consider strings.Builder).


196-238: Make claude-cli detection case-insensitive for header values.

-			if strings.Contains(userAgent[0], "claude-cli") {
+			if strings.Contains(strings.ToLower(userAgent[0]), "claude-cli") {
 				*bifrostCtx = context.WithValue(*bifrostCtx, schemas.BifrostContextKeyUserAgent, "claude-cli")
 			}
@@
-		if strings.Contains(userAgent, "claude-cli") {
+		if strings.Contains(strings.ToLower(userAgent), "claude-cli") {
 			isClaudeCode = true
 		}
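
A minimal demonstration of why the lowercase normalization matters (`isClaudeCLI` is an illustrative helper, not the transport's actual function):

```go
package main

import (
	"fmt"
	"strings"
)

// isClaudeCLI matches the user agent case-insensitively, so headers like
// "Claude-CLI/1.0" are still detected.
func isClaudeCLI(userAgent string) bool {
	return strings.Contains(strings.ToLower(userAgent), "claude-cli")
}

func main() {
	fmt.Println(isClaudeCLI("Claude-CLI/1.0 (darwin)")) // a case-sensitive Contains would miss this
	fmt.Println(isClaudeCLI("curl/8.0"))
}
```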

240-245: Re-check provider == "" fallback in isClaudeModel to avoid misclassification when Provider is omitted.


90-126: Don’t drop passthrough stream events when type parse fails (can stall the stream).
Same as previously noted: fall back to forwarding the raw payload even if you can’t classify the event type.

-							var rawResponseJSON anthropic.AnthropicStreamEvent
-							if err := sonic.Unmarshal([]byte(raw), &rawResponseJSON); err == nil {
-								return string(rawResponseJSON.Type), raw, nil
-							}
+							var rawResponseJSON anthropic.AnthropicStreamEvent
+							if err := sonic.Unmarshal([]byte(raw), &rawResponseJSON); err == nil {
+								return string(rawResponseJSON.Type), raw, nil
+							}
+							// Fallback: forward raw event payload even if we can't classify its type.
+							return "", raw, nil
🧹 Nitpick comments (6)
core/providers/bedrock/types.go (1)

129-137: Consider logging or tracking skipped fields during unmarshaling.

The continue statement on line 133 silently drops fields that cannot be unmarshaled into ExtraParams. This could mask malformed data or unexpected field types without providing visibility into what's being skipped.

Consider adding logging or metrics to track when fields are skipped:

 // Extract unknown fields
 for key, value := range rawData {
   if !bedrockConverseRequestKnownFields[key] {
     var v interface{}
     if err := sonic.Unmarshal(value, &v); err != nil {
-      continue // Skip fields that can't be unmarshaled
+      // Log or track skipped fields for observability
+      // For now, continue to maintain backward compatibility
+      continue
     }
     r.ExtraParams[key] = v
   }
 }
core/providers/utils/utils.go (1)

318-335: Consider optimizing body handling to avoid unnecessary allocation.

The code copies resp.Body() into body at line 320, then immediately calls CheckAndDecodeBody which may decompress the response. If decoding succeeds, the original body copy is replaced (line 334), making the initial copy wasteful.

Consider this optimization:

 func HandleProviderAPIError(resp *fasthttp.Response, errorResp any) *schemas.BifrostError {
 	statusCode := resp.StatusCode()
-	body := append([]byte(nil), resp.Body()...)
 
 	// decode body
 	decodedBody, err := CheckAndDecodeBody(resp)
 	if err != nil {
 		return &schemas.BifrostError{
 			IsBifrostError: false,
 			StatusCode:     &statusCode,
 			Error: &schemas.ErrorField{
 				Message: err.Error(),
 			},
 		}
 	}
 
-	body = decodedBody
+	body := decodedBody

This avoids the unnecessary initial copy when decoding succeeds (the common case).

transports/bifrost-http/integrations/router.go (1)

312-314: Prefer storing RouteConfigType (or a dedicated enum) in context instead of string(config.Type).

If downstream code can accept it, store config.Type directly to avoid stringly-typed comparisons (and future typo bugs).

core/providers/openai/responses_test.go (1)

10-193: Nice coverage for skip/preserve rules; consider asserting exact counts, not just presence.
Right now the test only checks inclusion vs. exclusion; if future logic accidentally emitted a message twice, a presence check would not catch it.

core/internal/testutil/chat_completion_stream.go (1)

359-521: Reasoning stream test is permissive; consider asserting at least one “reasoning indicator” in non-flaky providers.
Right now the test logs indicators but doesn't fail if none are present, which makes it less useful as a regression detector. If this is intentional due to provider variability, that's fine; otherwise, consider failing for providers/models where reasoning is expected to surface.

core/schemas/responses.go (1)

730-748: LGTM: ResponsesReasoningSummary aligns with UI shape (type: "summary_text", text).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a4d9a5 and bd1d8c9.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (58)
  • core/internal/testutil/account.go (1 hunks)
  • core/internal/testutil/chat_completion_stream.go (1 hunks)
  • core/internal/testutil/reasoning.go (4 hunks)
  • core/internal/testutil/responses_stream.go (1 hunks)
  • core/internal/testutil/tests.go (2 hunks)
  • core/providers/anthropic/anthropic.go (3 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (8 hunks)
  • core/providers/anthropic/utils.go (2 hunks)
  • core/providers/azure/azure.go (2 hunks)
  • core/providers/azure/utils.go (1 hunks)
  • core/providers/bedrock/bedrock.go (2 hunks)
  • core/providers/bedrock/bedrock_test.go (13 hunks)
  • core/providers/bedrock/types.go (2 hunks)
  • core/providers/bedrock/utils.go (2 hunks)
  • core/providers/cerebras/cerebras_test.go (2 hunks)
  • core/providers/cohere/chat.go (3 hunks)
  • core/providers/cohere/cohere.go (2 hunks)
  • core/providers/cohere/cohere_test.go (1 hunks)
  • core/providers/cohere/responses.go (7 hunks)
  • core/providers/cohere/types.go (1 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/groq/groq_test.go (2 hunks)
  • core/providers/mistral/mistral_test.go (1 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/responses_marshal_test.go (1 hunks)
  • core/providers/openai/responses_test.go (1 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/types_test.go (1 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/openrouter/openrouter_test.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/providers/vertex/types.go (0 hunks)
  • core/providers/vertex/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (3 hunks)
  • core/providers/vertex/vertex_test.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/mux.go (0 hunks)
  • core/schemas/responses.go (5 hunks)
  • core/schemas/utils.go (1 hunks)
  • docs/docs.json (0 hunks)
  • framework/configstore/rdb.go (0 hunks)
  • framework/streaming/chat.go (0 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (5 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)
💤 Files with no reviewable changes (6)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
  • core/providers/vertex/types.go
  • core/schemas/mux.go
  • framework/streaming/chat.go
  • framework/configstore/rdb.go
  • docs/docs.json
🚧 Files skipped from review as they are similar to previous changes (16)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/schemas/bifrost.go
  • ui/lib/types/logs.ts
  • core/providers/cohere/types.go
  • ui/package.json
  • core/providers/azure/azure.go
  • core/schemas/utils.go
  • core/providers/vertex/vertex.go
  • core/providers/bedrock/bedrock.go
  • transports/bifrost-http/handlers/middlewares.go
  • core/providers/openai/text.go
  • core/internal/testutil/tests.go
  • core/providers/cerebras/cerebras_test.go
  • ui/app/workspace/logs/views/columns.tsx
  • core/internal/testutil/account.go
  • core/providers/openai/types_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • transports/bifrost-http/handlers/inference.go
  • core/providers/vertex/vertex_test.go
  • core/providers/anthropic/errors.go
  • core/providers/openai/utils.go
  • core/providers/openrouter/openrouter_test.go
  • core/providers/vertex/errors.go
  • core/providers/openai/responses.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/utils.go
  • core/providers/openai/types.go
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/vertex/utils.go
  • core/providers/azure/utils.go
  • core/providers/utils/utils.go
  • core/providers/cohere/cohere_test.go
  • core/providers/openai/responses_marshal_test.go
  • framework/streaming/responses.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/types.go
  • core/providers/bedrock/utils.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/internal/testutil/responses_stream.go
  • core/internal/testutil/reasoning.go
  • core/providers/mistral/mistral_test.go
  • core/providers/groq/groq_test.go
  • core/providers/cohere/cohere.go
  • core/providers/openai/responses_test.go
  • core/providers/openai/chat.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/anthropic.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/cohere/chat.go
  • ui/app/workspace/logs/views/logDetailsSheet.tsx
  • core/providers/cohere/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • transports/bifrost-http/handlers/inference.go
  • core/providers/vertex/vertex_test.go
  • core/providers/anthropic/errors.go
  • core/providers/openai/utils.go
  • core/providers/openrouter/openrouter_test.go
  • core/providers/vertex/errors.go
  • core/providers/openai/responses.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/utils.go
  • core/providers/openai/types.go
  • core/internal/testutil/chat_completion_stream.go
  • core/providers/vertex/utils.go
  • core/providers/azure/utils.go
  • core/providers/utils/utils.go
  • core/providers/cohere/cohere_test.go
  • core/providers/openai/responses_marshal_test.go
  • framework/streaming/responses.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/types.go
  • core/providers/bedrock/utils.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/internal/testutil/responses_stream.go
  • core/internal/testutil/reasoning.go
  • core/providers/mistral/mistral_test.go
  • core/providers/groq/groq_test.go
  • core/providers/cohere/cohere.go
  • core/providers/openai/responses_test.go
  • core/providers/openai/chat.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/anthropic.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/cohere/chat.go
  • core/providers/cohere/responses.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.

Applied to files:

  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/router.go
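The learning above can be sketched minimally: distinguish typed Responses API SSE events from the legacy `data: [DONE]` sentinel that only the older Chat/Text Completions streams emit. Event names and the parsing shape below are illustrative, not the repo's actual router code.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// handleSSE routes typed Responses API events (response.created,
// response.output_text.delta, response.completed, ...) separately from the
// legacy "data: [DONE]" sentinel, which must not be treated as a Responses
// API terminator.
func handleSSE(stream string) []string {
	var seen []string
	var event string
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "event: "):
			event = strings.TrimPrefix(line, "event: ")
		case strings.HasPrefix(line, "data: "):
			data := strings.TrimPrefix(line, "data: ")
			if data == "[DONE]" {
				// Legacy sentinel: Chat/Text Completions streams only.
				seen = append(seen, "legacy-done")
				continue
			}
			// Typed events carry their own terminator (e.g. response.completed).
			seen = append(seen, event)
		}
	}
	return seen
}

func main() {
	s := "event: response.created\ndata: {}\n\nevent: response.completed\ndata: {}\n"
	fmt.Println(handleSSE(s))
}
```

A parser structured this way terminates Responses streams on `response.completed` (or `response.done`) rather than waiting for a `[DONE]` line that never arrives.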
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/utils.go
  • core/providers/openai/responses.go
  • core/providers/openai/types.go
  • core/providers/openai/responses_marshal_test.go
  • core/providers/openai/responses_test.go
  • core/providers/openai/chat.go
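A minimal sketch of the type-guard contract described above, with simplified stand-in types (the real shapes live in core/schemas): when `ResponsesReasoning` is non-nil, untyped content blocks default to reasoning blocks.

```go
package main

import "fmt"

// Hypothetical, simplified shapes for illustration only.
type ContentBlock struct{ Type, Text string }

type Message struct {
	ResponsesReasoning *struct{} // non-nil marks a reasoning message
	Blocks             []ContentBlock
}

// normalizeBlocks enforces the documented constraint: when the message
// carries ResponsesReasoning, content blocks are treated as reasoning
// blocks by default.
func normalizeBlocks(m Message) []ContentBlock {
	out := make([]ContentBlock, len(m.Blocks))
	copy(out, m.Blocks)
	if m.ResponsesReasoning == nil {
		return out
	}
	for i := range out {
		if out[i].Type == "" || out[i].Type == "text" {
			out[i].Type = "reasoning"
		}
	}
	return out
}

func main() {
	m := Message{
		ResponsesReasoning: &struct{}{},
		Blocks:             []ContentBlock{{Type: "text", Text: "chain of thought"}},
	}
	fmt.Println(normalizeBlocks(m)[0].Type)
}
```

A focused unit test on exactly this branch is what locks the contract for downstream consumers.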
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
🧬 Code graph analysis (25)
transports/bifrost-http/handlers/inference.go (1)
core/schemas/responses.go (1)
  • ResponsesParameters (87-114)
core/providers/anthropic/errors.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (358-367)
ui/lib/types/logs.ts (1)
  • BifrostError (226-232)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
  • CheckAndDecodeBody (490-498)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (1)
  • ErrProviderResponseDecode (29-29)
core/providers/vertex/vertex.go (1)
  • VertexError (25-31)
core/providers/vertex/types.go (1)
  • VertexValidationError (153-160)
core/providers/openai/responses.go (4)
core/schemas/responses.go (6)
  • ResponsesMessage (314-327)
  • ResponsesReasoning (731-734)
  • ResponsesMessageContentBlock (399-411)
  • ResponsesOutputMessageContentTypeReasoning (394-394)
  • ResponsesMessageContent (339-344)
  • ResponsesParameters (87-114)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/openai/types.go (3)
  • OpenAIResponsesRequest (180-189)
  • OpenAIResponsesRequestInput (147-150)
  • MinMaxCompletionTokens (12-12)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
core/providers/anthropic/chat.go (6)
core/providers/anthropic/types.go (2)
  • MinimumReasoningMaxTokens (15-15)
  • AnthropicThinking (69-72)
core/providers/bedrock/types.go (1)
  • MinimumReasoningMaxTokens (12-12)
core/providers/cohere/types.go (1)
  • MinimumReasoningMaxTokens (11-11)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/anthropic/utils.go (6)
core/schemas/responses.go (1)
  • BifrostResponsesRequest (32-39)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (1)
  • ErrRequestBodyConversion (25-25)
core/providers/anthropic/types.go (1)
  • AnthropicDefaultMaxTokens (14-14)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesRequest (1419-1532)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/openai/types.go (3)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (234-239)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (513-520)
core/internal/testutil/chat_completion_stream.go (5)
core/schemas/chatcompletions.go (5)
  • ChatMessage (469-478)
  • BifrostChatRequest (12-19)
  • ChatParameters (155-184)
  • ChatReasoning (223-226)
  • BifrostChatResponse (26-41)
core/internal/testutil/utils.go (1)
  • CreateBasicChatMessage (247-254)
core/schemas/models.go (1)
  • Model (109-129)
core/internal/testutil/test_retry_framework.go (4)
  • StreamingRetryConfig (819-839)
  • TestRetryContext (168-173)
  • WithStreamRetry (580-688)
  • ChatStreamValidationResult (2328-2335)
core/schemas/bifrost.go (2)
  • BifrostStream (323-330)
  • BifrostError (358-367)
core/providers/vertex/utils.go (7)
core/schemas/responses.go (1)
  • BifrostResponsesRequest (32-39)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (516-527)
core/schemas/provider.go (2)
  • ErrRequestBodyConversion (25-25)
  • ErrProviderRequestMarshal (26-26)
core/providers/anthropic/types.go (1)
  • AnthropicDefaultMaxTokens (14-14)
core/providers/vertex/types.go (1)
  • DefaultVertexAnthropicVersion (8-8)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesRequest (1419-1532)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/utils/utils.go (2)
core/schemas/bifrost.go (2)
  • BifrostError (358-367)
  • ErrorField (376-383)
ui/lib/types/logs.ts (2)
  • BifrostError (226-232)
  • ErrorField (217-224)
core/providers/openai/responses_marshal_test.go (3)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (180-189)
  • OpenAIResponsesRequestInput (147-150)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (5)
  • ResponsesParameters (87-114)
  • ResponsesParametersReasoning (234-239)
  • ResponsesMessage (314-327)
  • ResponsesInputMessageRoleUser (333-333)
  • ResponsesMessageContent (339-344)
framework/streaming/responses.go (3)
core/schemas/responses.go (8)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1393-1393)
  • ResponsesMessage (314-327)
  • ResponsesReasoning (731-734)
  • ResponsesReasoningSummary (745-748)
  • ResponsesMessageContent (339-344)
  • ResponsesMessageContentBlock (399-411)
  • ResponsesOutputMessageContentTypeReasoning (394-394)
  • ResponsesReasoningContentBlockTypeSummaryText (741-741)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/bedrock/types.go (5)
core/providers/anthropic/types.go (1)
  • Alias (104-104)
core/providers/openai/types.go (2)
  • Alias (83-83)
  • Alias (194-194)
core/schemas/bifrost.go (2)
  • Alias (388-388)
  • Alias (405-405)
core/schemas/chatcompletions.go (4)
  • Alias (189-189)
  • Alias (643-643)
  • Alias (770-770)
  • Alias (860-860)
core/providers/gemini/types.go (7)
  • Alias (220-220)
  • Alias (245-245)
  • Alias (828-830)
  • Alias (842-844)
  • Alias (857-859)
  • Alias (871-873)
  • Alias (1306-1306)
core/providers/bedrock/utils.go (6)
core/schemas/chatcompletions.go (1)
  • BifrostChatRequest (12-19)
core/providers/bedrock/types.go (4)
  • BedrockConverseRequest (55-75)
  • MinimumReasoningMaxTokens (12-12)
  • DefaultCompletionMaxTokens (13-13)
  • BedrockInferenceConfig (255-260)
core/schemas/utils.go (2)
  • IsAnthropicModel (1043-1045)
  • Ptr (16-18)
core/providers/anthropic/types.go (1)
  • MinimumReasoningMaxTokens (15-15)
core/providers/cohere/types.go (2)
  • MinimumReasoningMaxTokens (11-11)
  • DefaultCompletionMaxTokens (12-12)
core/providers/utils/utils.go (1)
  • GetBudgetTokensFromReasoningEffort (1084-1121)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
core/internal/testutil/responses_stream.go (1)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/internal/testutil/reasoning.go (5)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
core/internal/testutil/utils.go (3)
  • CreateBasicChatMessage (247-254)
  • GetErrorMessage (606-639)
  • GetChatContent (329-356)
core/internal/testutil/test_retry_framework.go (1)
  • WithChatTestRetry (274-424)
core/internal/testutil/validation_presets.go (2)
  • GetExpectationsForScenario (208-293)
  • ModifyExpectationsForProvider (300-347)
core/providers/cohere/cohere.go (1)
core/providers/cohere/responses.go (1)
  • ToCohereResponsesRequest (916-1030)
core/providers/openai/responses_test.go (1)
core/providers/openai/responses.go (1)
  • ToOpenAIResponsesRequest (37-112)
core/providers/openai/chat.go (5)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/providers/openai/types.go (1)
  • MinMaxCompletionTokens (12-12)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (5)
  • BifrostContextKeyUserAgent (123-123)
  • Anthropic (37-37)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
  • BifrostContextKeySkipKeySelection (114-114)
core/providers/anthropic/types.go (1)
  • AnthropicStreamEvent (395-404)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/bedrock/bedrock_test.go (3)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (4)
  • ResponsesMessageTypeMessage (290-290)
  • ResponsesMessageTypeFunctionCallOutput (296-296)
  • ResponsesToolMessage (462-482)
  • ResponsesToolMessageOutputStruct (531-535)
core/schemas/chatcompletions.go (1)
  • OrderedMap (268-268)
core/providers/anthropic/anthropic.go (3)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
transports/bifrost-http/handlers/inference.go (1)
  • ResponsesRequest (257-261)
core/providers/utils/utils.go (3)
  • HandleProviderResponse (359-445)
  • ShouldSendBackRawRequest (551-556)
  • ShouldSendBackRawResponse (559-564)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/app/workspace/logs/views/logDetailsSheet.tsx (3)
ui/components/ui/separator.tsx (1)
  • DottedSeparator (43-43)
ui/app/workspace/logs/views/logEntryDetailsView.tsx (1)
  • LogEntryDetailsView (15-49)
ui/components/ui/badge.tsx (1)
  • Badge (37-37)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (51)
core/providers/mistral/mistral_test.go (1)

49-49: LGTM: Clear explanation of implementation limitation.

The addition of Reasoning: false with the explanatory comment correctly reflects bifrost's current state. The comment effectively clarifies that this is an implementation limitation (lack of native Mistral converters) rather than a provider capability gap, addressing the concern raised in the previous review.

core/providers/groq/groq_test.go (1)

55-55: LGTM, assuming the reasoning model is valid.

Enabling reasoning tests for Groq aligns with the PR's objective to expand reasoning support across providers. This change is consistent with similar updates to other provider tests (Cerebras, Cohere, etc.).

The approval is conditional on verifying that the ReasoningModel configured at line 37 is valid and supports reasoning capabilities.

core/providers/bedrock/types.go (3)

3-8: LGTM!

The added imports support the custom JSON unmarshaling implementation. encoding/json provides json.RawMessage, sonic provides the unmarshaling capability, and schemas provides internal types for the struct fields.


70-74: LGTM!

The added fields appropriately extend BedrockConverseRequest:

  • ExtraParams captures unknown fields during JSON unmarshaling
  • Fallbacks provides Bifrost-specific configuration

This pattern aligns with similar implementations in other providers (Anthropic, OpenAI).


82-98: LGTM!

The known-fields map correctly includes all struct fields, including serviceTier, extra_params, and fallbacks which were previously flagged in past reviews. This ensures proper field recognition during custom unmarshaling.

core/providers/vertex/errors.go (1)

10-44: LGTM!

The decoding logic is well-implemented:

  • Properly handles gzip-encoded responses via CheckAndDecodeBody
  • Returns appropriate errors on decode failure
  • Consistently uses decodedBody for all subsequent unmarshal attempts
  • Maintains all existing error fallback paths
core/internal/testutil/reasoning.go (3)

12-117: LGTM!

The rename to RunResponsesReasoningTest clearly distinguishes this as a Responses API-specific test, and the token increase to 1800 provides adequate capacity for complex reasoning scenarios.


201-310: LGTM!

The new Chat Completions reasoning test is well-structured:

  • Mirrors the Responses API test pattern consistently
  • Includes a documented OpenAI skip due to known flakiness
  • Properly configures reasoning parameters (effort: "high", MaxTokens: 1800)
  • Integrates with the retry framework and validation expectations

312-411: LGTM!

The validation logic is comprehensive and correct:

  • Properly accesses reasoning fields via message.ChatAssistantMessage
  • Checks multiple reasoning indicators (content, details array, token usage)
  • Provides detailed logging for each indicator type
  • Correctly iterates over ReasoningDetails with proper type checking
core/internal/testutil/responses_stream.go (1)

440-440: LGTM!

The token increase to 1800 is consistent with reasoning test adjustments across the codebase and provides adequate capacity for reasoning-heavy streaming scenarios.

core/providers/openai/chat.go (1)

31-38: LGTM!

The token clamping and user field sanitization logic is well-implemented:

  • Properly clamps MaxCompletionTokens to the minimum (16)
  • Sanitizes the User field to respect the 64-character limit
  • Clean integration with the utility functions

Note: A past review comment requested test coverage for these guards and was marked as addressed.

core/providers/openai/utils.go (1)

47-56: LGTM!

The user field sanitization is well-implemented:

  • The 64-character limit is properly documented as an OpenAI enforcement
  • SanitizeUserField correctly returns nil when the limit is exceeded
  • Simple, clear implementation

Note: A past review confirmed this limit through actual OpenAI errors, even though it's not in their public documentation.
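The two guards discussed above can be sketched together; the constants and helper names below are illustrative of the described behavior (a 16-token floor and a 64-character user-field cap), not copies of the repo's code.

```go
package main

import "fmt"

const (
	minMaxCompletionTokens = 16 // floor described in the review
	maxUserFieldLen        = 64 // limit confirmed via actual OpenAI errors
)

// clampMaxCompletionTokens raises too-small values to the provider minimum.
func clampMaxCompletionTokens(n int) int {
	if n < minMaxCompletionTokens {
		return minMaxCompletionTokens
	}
	return n
}

// sanitizeUserField drops the user field entirely when it exceeds the
// length limit, rather than truncating it.
func sanitizeUserField(user *string) *string {
	if user != nil && len(*user) > maxUserFieldLen {
		return nil
	}
	return user
}

func main() {
	u := "short-user"
	fmt.Println(clampMaxCompletionTokens(5), *sanitizeUserField(&u))
}
```

Dropping (not truncating) an over-long user field is the safer choice here: a truncated identifier could silently collide with another user's value.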

transports/bifrost-http/handlers/inference.go (1)

224-254: LGTM!

The custom unmarshaller for ResponsesRequest is well-implemented:

  • Follows the same pattern as ChatRequest.UnmarshalJSON (lines 169-199)
  • Properly handles the embedded BifrostParams struct using an alias
  • Separately unmarshals Input and ResponsesParameters to avoid conflicts
  • Includes clear documentation explaining the need for custom unmarshalling

The implementation is consistent and correct.

core/providers/cohere/cohere_test.go (1)

25-50: Ensure the configured Cohere ReasoningModel is actually available in CI/accounts (model-name drift risk).

Hardcoding "command-a-reasoning-08-2025" can make provider tests flaky if that model isn’t enabled for the key used in CI. Consider reading this from env (similar to API key gating) or falling back to ChatModel when unavailable.

core/providers/anthropic/anthropic.go (2)

677-705: Centralized Responses body construction looks like a net win (less drift between modes).

Please sanity-check that getRequestBodyForResponses(..., stream=false) does not accidentally leave stream=true (or equivalent) set from reused structs, and that the marshaled jsonBody matches what Anthropic expects for /v1/messages.


725-729: Good: streaming path now reuses the same body builder and properly propagates conversion errors.

core/providers/cohere/cohere.go (2)

510-517: Good: propagate ToCohereResponsesRequest conversion errors instead of forcing nil error.


569-577: Good: early-return on conversion error before mutating/stream-enabling request.

Just confirm ToCohereResponsesRequest returns a concrete type with .Stream (not any), otherwise this won’t compile.

core/providers/openai/types.go (3)

11-13: Confirm MinMaxCompletionTokens value is correct for all OpenAI(-compatible) backends.
Hard-coding 16 is fine, but please double-check this minimum is actually required/valid across the providers routed through this OpenAI path.


108-139: Unmarshal split (base fields + schemas.ChatParameters) is the right fix for the “hijack” problem.
This avoids losing Model/Messages/Stream/MaxTokens/Fallbacks while still allowing schemas.ChatParameters.UnmarshalJSON to run.


191-229: Good shadowing approach: preserves Input’s custom JSON and reliably omits reasoning.max_tokens.
The clone prevents accidental mutation of r.Reasoning while ensuring the outgoing payload matches the intended OpenAI schema.

core/providers/anthropic/chat.go (2)

605-632: Good: don’t require non-empty PartialJSON to stream tool args.
Allowing PartialJSON != nil matches real-world deltas where empty-string chunks occur.


633-660: Good: thinkingText local avoids accidental aliasing and keeps pointer fields consistent.
This is a safer pattern for constructing Reasoning and ReasoningDetails[].Text.

core/providers/openai/responses_test.go (1)

195-346: Add a regression test for the “ResponsesReasoning != nil ⇒ content blocks treated as reasoning” constraint.
This behavior was called out as a contract; locking it with a focused test will prevent future drift. Based on learnings, this should be enforced and verified for downstream consumers.

core/providers/bedrock/utils.go (2)

11-11: LGTM!

The providerUtils import alias is consistent with the pattern used in other providers in this PR.


39-76: Reasoning config logic correctly addresses model-aware minimum tokens.

The implementation now properly uses Bedrock's MinimumReasoningMaxTokens (1) by default, and only uses Anthropic's higher minimum (1024) when the model is detected as an Anthropic model via IsAnthropicModel(). This resolves the concern from the previous review about spurious errors for Bedrock configs.

core/providers/openai/responses.go (3)

3-7: LGTM!

Import changes are appropriate for the added string-based model checks.


42-89: Message normalization logic correctly implemented with proper documentation.

The code properly:

  1. Documents the OpenAI responses format constraint (lines 47-52)
  2. Uses schemas.Ptr(summary.Text) to avoid pointer-to-range-variable aliasing (line 76)
  3. Handles both gpt-oss and non-gpt-oss models appropriately

Based on learnings, this aligns with the confirmed OpenAI responses format behavior.
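The `schemas.Ptr(summary.Text)` idiom called out in point 2 is worth illustrating with a generic helper of the same shape (the helper below is a stand-in, not the repo's): taking a pointer to a fresh copy means pointers collected inside a loop never alias the loop variable, a real hazard on Go versions before 1.22 and still the clearer pattern afterward.

```go
package main

import "fmt"

// Ptr copies its argument and returns a pointer to the copy, so each
// pointer taken inside a loop owns independent storage.
func Ptr[T any](v T) *T { return &v }

type Summary struct{ Text string }

// collect gathers per-summary text pointers; with Ptr, no two entries
// can end up pointing at the same (last-written) value.
func collect(summaries []Summary) []*string {
	var out []*string
	for _, s := range summaries {
		out = append(out, Ptr(s.Text))
	}
	return out
}

func main() {
	ptrs := collect([]Summary{{"a"}, {"b"}})
	fmt.Println(*ptrs[0], *ptrs[1])
}
```

The buggy alternative, `out = append(out, &s.Text)`, compiles fine, which is why this aliasing class of bug tends to surface only in review.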


100-109: LGTM!

Good defensive parameter handling:

  • Clamping MaxOutputTokens to minimum ensures valid requests
  • Sanitizing User field respects OpenAI's 64-character limit
  • Filtering unsupported tools prevents API errors
core/providers/gemini/responses.go (1)

619-627: Ensure consistent encoding/decoding of ThoughtSignature.

Line 625 converts EncryptedContent directly to bytes:

part.ThoughtSignature = []byte(*nextMsg.ResponsesReasoning.EncryptedContent)

If the outbound path (line 169) is updated to base64-encode the ThoughtSignature, this inbound path will need corresponding base64 decoding. Ensure both directions are updated consistently.

core/providers/cohere/chat.go (3)

7-7: LGTM!

The providerUtils import alias is consistent with other providers in this PR.


104-129: Reasoning budget logic correctly handles nil checks and uses local constants.

The implementation properly:

  1. Checks MaxCompletionTokens != nil before dereferencing (lines 113-115)
  2. Uses DefaultCompletionMaxTokens as fallback when MaxCompletionTokens is not provided
  3. Propagates errors from GetBudgetTokensFromReasoningEffort (lines 117-119)

This addresses the nil pointer dereference concern from the previous review.


439-463: LGTM!

The streaming thinking content handling properly extracts the thinking text into a local variable and uses schemas.Ptr() to create safe pointers for both Reasoning and ReasoningDetails fields.

core/schemas/responses.go (1)

399-411: Signature addition is fine; ensure downstream streaming deep-copy/accumulators preserve it.
Schema change looks consistent, but it increases the importance of updating stream deep-copy and accumulation paths (see framework/streaming/responses.go comments).

core/providers/openai/responses_marshal_test.go (3)

12-129: Good coverage for “omit reasoning.max_tokens” while preserving other reasoning fields.


131-313: Input union tests look solid (string vs array forms).


315-480: Round-trip test is good; verify it matches the current OpenAIResponsesRequest (un)marshal contract.
This is tightly coupled to OpenAIResponsesRequest.MarshalJSON/UnmarshalJSON behavior—worth confirming it’s stable across the stack PRs.

core/providers/anthropic/types.go (4)

58-140: Unknown-field capture into ExtraParams looks correct (and avoids duplicating extra_params).


161-197: Sonic-based union (string vs blocks) marshaling for AnthropicContent looks good.


199-228: Redacted thinking support (redacted_thinking + data) is a clean extension.


365-424: Usage/cache fields + stop_sequence nullability look consistent with the stated API behavior.

core/providers/bedrock/bedrock_test.go (4)

487-487: LGTM: Type field additions align with schema evolution.

The test cases correctly populate the Type field with ResponsesMessageTypeMessage for standard message inputs, consistent with the new schema requirement. The pattern is applied uniformly across all test scenarios.

Also applies to: 527-527, 539-539, 579-579, 624-624, 688-688, 758-758


823-823: LGTM: Tool call representations updated correctly.

The test expectations now use Status: "completed" for function calls and ResponsesToolCallOutputStr for tool outputs, aligning with the updated schema definitions. These changes are applied consistently across all relevant test scenarios.

Also applies to: 865-865, 915-915, 927-927


1498-1513: LGTM: Structural comparison follows documented testing strategy.

The field-by-field comparison approach correctly handles runtime-generated data (IDs, timestamps) while validating the essential conversion logic. This aligns with the documented layered testing strategy where unit tests focus on structure and separate integration tests validate detailed content.

Based on learnings, this pattern is intentional and appropriate.


1684-1717: LGTM: Nil guards added for safe test assertions.

The require.NotNil checks before dereferencing pointer fields ensure tests fail cleanly with clear diagnostics rather than panicking. The comparison logic for Output, Usage, and nested structures is now properly guarded and handles runtime-generated data appropriately.

core/providers/cohere/responses.go (6)

14-151: LGTM: State tracking comprehensively manages streaming conversions.

The expanded CohereResponsesStreamState with reasoning indices, annotation mappings, tool plan tracking, and lifecycle flags provides robust state management for the streaming conversion pipeline. The pool management correctly initializes and clears all new fields, preventing state leakage between requests. The getOrCreateOutputIndex method handles nil content indices defensively.


153-185: LGTM: Content block conversion handles all Cohere block types.

The helper correctly converts text, image, and thinking blocks to Bifrost format. The nil check for ImageURL (line 163) prevents invalid image blocks, and based on previous review feedback, call sites filter out any zero-value blocks with empty Type fields.


187-867: LGTM: Streaming event handlers comprehensively map Cohere to Bifrost format.

The ToBifrostResponsesStream method correctly handles all Cohere streaming event types, including:

  • Lifecycle events (created, in_progress, completed) with proper state tracking
  • Content deltas with distinction between regular text and reasoning summaries
  • Tool call lifecycle with argument buffering and proper status transitions
  • Citation pairing using annotation-to-content-index mapping (addresses previous feedback)
  • Tool plan handling with automatic cleanup before subsequent events

The state management and event sequencing align with the Responses API streaming specification.


869-1030: LGTM: Request conversion correctly maps Bifrost to Cohere format.

The ToCohereResponsesRequest function comprehensively converts Bifrost Responses requests to Cohere's chat format, including:

  • Response format mapping with proper JSON schema handling
  • Reasoning budget calculation with both explicit MaxTokens and effort-based paths
  • Safe nil handling for MaxOutputTokens (lines 946-949) using a default value
  • Proper error propagation from GetBudgetTokensFromReasoningEffort
  • Tools and tool choice conversion via dedicated helpers

Based on previous review feedback, the nil pointer issues have been addressed.


1032-1247: LGTM: Bidirectional message conversion handles all Responses types.

The conversion functions correctly map between Bifrost Responses and Cohere formats:

  • ToBifrostResponsesResponse properly extracts usage and output messages
  • ConvertBifrostMessagesToCohereMessages handles system messages, reasoning blocks, function calls, and regular content with proper accumulation and flushing logic
  • convertBifrostToolChoiceToCohereToolChoice now correctly maps "auto" to ToolChoiceAuto (addresses previous feedback)

The reasoning block attachment to assistant messages and system message aggregation align with Cohere's API requirements.


1249-1569: LGTM: Message-level conversion helpers handle all content and tool types.

The helper functions provide comprehensive bidirectional conversion:

  • System, regular, reasoning, and tool messages all handled appropriately
  • Nil guards prevent panics when accessing nested fields (e.g., line 1345 checks ResponsesToolMessage != nil)
  • Invalid content blocks filtered out (line 1436 checks for non-empty Type)
  • Function call outputs correctly extract from either Output or Content fields
  • Content block conversion handles text, image, and reasoning types

Based on previous review feedback, all identified issues have been addressed.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from bd1d8c9 to 76853f9 Compare December 12, 2025 11:24
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from 6951a77 to 2d213a4 Compare December 12, 2025 11:24
@TejasGhatte TejasGhatte force-pushed the 12-04-feat_responses_reasoning_fixes branch from 76853f9 to b6f0c52 Compare December 12, 2025 11:25
@TejasGhatte TejasGhatte force-pushed the 12-05-feat_send_back_raw_request_support branch from 2d213a4 to a22dc43 Compare December 12, 2025 11:25
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from b6f0c52 to ad08844 Compare December 12, 2025 11:26
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from a22dc43 to 37e01b9 Compare December 12, 2025 11:26
@TejasGhatte TejasGhatte force-pushed the 12-05-feat_send_back_raw_request_support branch from 37e01b9 to 9551b34 Compare December 12, 2025 11:40
@TejasGhatte TejasGhatte force-pushed the 12-04-feat_responses_reasoning_fixes branch from ad08844 to e1ce12c Compare December 12, 2025 11:40
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from 9551b34 to 26864dc Compare December 12, 2025 12:15
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from e1ce12c to 1806088 Compare December 12, 2025 12:15
Collaborator Author

Pratham-Mishra04 commented Dec 12, 2025

Merge activity

  • Dec 12, 1:21 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Dec 12, 1:31 PM UTC: Graphite rebased this pull request as part of a merge.
  • Dec 12, 1:31 PM UTC: @Pratham-Mishra04 merged this pull request with Graphite.

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-05-feat_send_back_raw_request_support to graphite-base/1000 December 12, 2025 13:28
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from graphite-base/1000 to main December 12, 2025 13:29
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from 1806088 to cccf60c Compare December 12, 2025 13:30
@Pratham-Mishra04 Pratham-Mishra04 merged commit 294f6dd into main Dec 12, 2025
8 of 9 checks passed
@Pratham-Mishra04 Pratham-Mishra04 deleted the 12-04-feat_responses_reasoning_fixes branch December 12, 2025 13:31
@coderabbitai coderabbitai bot mentioned this pull request Dec 12, 2025
18 tasks


Development

Successfully merging this pull request may close these issues.

add support for anthropic passthrough (claude code support) for azure and bedrock
Files API Support
