fix: reroute requests incoming from integrations for full compatibility in custom providers #1030

TejasGhatte · 2025-12-09T10:23:51Z

Summary

Implement cross-endpoint fallback for integration requests, allowing chat completion requests to fall back to responses endpoints and vice versa when a specific endpoint is disabled.

Changes

Added fallback mechanism between chat completion and responses endpoints for integration requests
When an endpoint is disabled but the request comes from an integration, the system now attempts to convert and route the request to the alternative endpoint
Implemented fallback for both streaming and non-streaming requests
Added context flags to track when fallbacks are being used
Modified providers (OpenAI, Anthropic, Bedrock, Cohere) to support this fallback behavior

Type of change

Feature
Refactor

Affected areas

Core (Go)
Transports (HTTP)
Providers/Integrations

How to test

Test integration requests when specific endpoints are disabled:

# Test with disabled chat completion endpoint
go run main.go --disable-chat-completion
# Make an integration request to chat completion endpoint

# Test with disabled responses endpoint
go run main.go --disable-responses
# Make an integration request to responses endpoint

# Verify both streaming and non-streaming requests
go test ./...

Breaking changes

No

Related issues

Improves integration flexibility by allowing requests to succeed even when specific endpoints are disabled.

Security considerations

No additional security implications as this only affects request routing for integration requests.

Checklist

I added/updated tests where appropriate
I verified builds succeed (Go and UI)
I verified the CI pipeline passes locally if applicable

coderabbitai · 2025-12-09T10:24:09Z

Warning

Rate limit exceeded

@TejasGhatte has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 34 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between dc29eae and 509b38e.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (2 hunks)
core/providers/bedrock/bedrock.go (2 hunks)
core/providers/cohere/cohere.go (2 hunks)
core/providers/gemini/gemini.go (2 hunks)
core/providers/openai/openai.go (2 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

📝 Walkthrough

Walkthrough

Providers add integration-aware fallbacks: when Responses/ResponsesStream is disallowed during an integration request, providers may delegate to ChatCompletion/ChatCompletionStream, set context flags to mark fallback state, and use a utility to avoid repeated fallback attempts.

Changes

Cohort / File(s)	Summary
Provider fallback implementations `core/providers/anthropic/anthropic.go`, `core/providers/bedrock/bedrock.go`, `core/providers/cohere/cohere.go`, `core/providers/openai/openai.go`, `core/providers/gemini/gemini.go`	Add integration-aware fallback paths for `Responses` / `ResponsesStream`: on disallowed operation, check/attempt integration fallback, delegate to `ChatCompletion` / `ChatCompletionStream` using `request.ToChatRequest()` (or equivalent), convert chat results to `BifrostResponsesResponse`, and populate `ExtraFields` (`RequestType=ResponsesRequest`, `Provider`, `ModelRequested`). Streaming fallbacks set a context flag to route to chat streaming.
Context keys / schema `core/schemas/bifrost.go`	Add and standardize public context keys related to integrations/fallbacks: `BifrostContextKeyIntegrationRequest`, `BifrostContextKeyIntegrationFallbackAttempted`, `BifrostContextKeyIntegrationType`, `BifrostContextKeyIsResponsesToChatCompletionFallback`, `BifrostContextKeyStructuredOutputToolName`, `BifrostContextKeyUserAgent`, alongside existing bifrost-* keys.
Integration router `transports/bifrost-http/integrations/router.go`	Mark incoming routed requests as integration requests by setting `BifrostContextKeyIntegrationRequest = true` in `GenericRouter.createHandler` before choosing streaming vs non-streaming handlers.
Fallback gating utility `core/providers/utils/utils.go`	Add `ShouldAttemptIntegrationFallback(ctx context.Context) (context.Context, bool)` which returns false when not an integration request or if a fallback was already attempted; otherwise sets the attempted flag in the returned context and returns true.
Flag propagation & minor flow updates `...`	Propagate and check new context flags across provider methods to coordinate streaming fallbacks, prevent repeated fallback attempts, and preserve original error paths when fallback is not attempted or fails.

sequenceDiagram
    participant Router as HTTP Router
    participant Provider as Provider (OpenAI/Cohere/...)
    participant Utils as Utils (ShouldAttemptIntegrationFallback)
    participant Chat as ChatCompletion/Stream

    Router->>Provider: Responses / ResponsesStream (ctx includes IntegrationRequest=true)
    Provider->>Provider: Check operation allowed?
    alt Allowed
        Provider-->>Router: Return normal Responses response/stream
    else Not allowed
        Provider->>Utils: ShouldAttemptIntegrationFallback(ctx)
        alt Utils returns (ctx', true)
            Provider->>Chat: Delegate to ChatCompletion/ChatCompletionStream with converted ChatRequest (ctx')
            Chat-->>Provider: Return chat response/stream
            Provider->>Provider: Convert to Responses shape, set ExtraFields (RequestType=ResponsesRequest, Provider, ModelRequested) and mark fallback
            Provider-->>Router: Return fallback Responses response/stream
        else Utils returns (ctx, false)
            Provider-->>Router: Return operation-not-allowed error
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Inspect ShouldAttemptIntegrationFallback for correct context immutability and idempotence.
Verify accurate conversion between Responses ↔ ChatCompletion across providers.
Review streaming fallback flag propagation and stream lifecycle/error handling.
Confirm ExtraFields consistency and that fallback markers are only set on successful fallback.

Poem

🐰 I hopped where gates said "not today",
I nudged the chat to find another way.
I left a tiny flag to say "tried once",
So loops won't spin and fallbacks dance.
Hop, code, fall back — bunny-approved!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: implementing a fallback mechanism to reroute integration requests to alternative endpoints for full compatibility.
Description check	✅ Passed	The PR description is mostly complete with summary, changes, type of change, affected areas, testing steps, breaking changes, and a security section. All major template sections are addressed.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

TejasGhatte · 2025-12-09T10:24:12Z

fix: reroute requests incoming from integrations for full compatibility in custom providers #1030 👈 (View in Graphite)
feat: added support for gemini native converters #1018 : 2 other dependent PRs (#1031 , #1066 )
feat: responses reasoning fixes #1000
feat: send back raw request in extra fields #1010
feat: support raw response accumulation in stream accumulator #999
feat: add reasoning support for chat completions #978 : 1 other dependent PR (#979 )
refactor: extract error parsing into dedicated files for providers #977
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

github-actions · 2025-12-09T10:24:17Z

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1030

coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (2)

core/providers/anthropic/anthropic.go (1)
374-377: Consider extracting common fallback logic.

The fallback pattern is duplicated across OpenAI, Anthropic, Bedrock, and Cohere providers with identical structure. Consider extracting a helper function in providerUtils to reduce duplication and ensure consistent behavior (including the recursion guard).

Example signature:
func HandleIntegrationFallback[T any, U any](
    ctx context.Context,
    fallbackFn func(context.Context) (T, *schemas.BifrostError),
    converter func(T) U,
    requestType schemas.RequestType,
    provider schemas.ModelProvider,
    model string,
) (U, *schemas.BifrostError, bool)
Also applies to: 679-691, 740-748
core/providers/cohere/cohere.go (1)
289-305: Consider adding logging for fallback events.

When fallback routing occurs, there's no explicit logging to indicate that a request was rerouted from one endpoint to another. Adding debug or info-level logging would improve observability and help diagnose routing behavior in production.

Example for non-streaming methods:
 		if _, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); ok {
+			provider.logger.Debug(fmt.Sprintf("Falling back from ChatCompletion to Responses for integration request"))
 			responsesResponse, err := provider.Responses(ctx, key, request.ToResponsesRequest())
Also applies to: 351-358, 521-538, 583-596

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dda0b09 and 8dca6ec.

📒 Files selected for processing (6)

core/providers/anthropic/anthropic.go (4 hunks)
core/providers/bedrock/bedrock.go (4 hunks)
core/providers/cohere/cohere.go (4 hunks)
core/providers/openai/openai.go (4 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

transports/bifrost-http/integrations/router.go
core/providers/openai/openai.go
core/providers/cohere/cohere.go
core/providers/anthropic/anthropic.go
core/providers/bedrock/bedrock.go
core/schemas/bifrost.go

🧬 Code graph analysis (4)

transports/bifrost-http/integrations/router.go (1)

core/schemas/bifrost.go (1)

BifrostContextKeyIntegrationRequest (109-109)

core/providers/openai/openai.go (1)

core/schemas/bifrost.go (6)

BifrostContextKeyIntegrationRequest (109-109)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (121-121)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (120-120)

core/providers/cohere/cohere.go (2)

core/schemas/bifrost.go (6)

BifrostContextKeyIntegrationRequest (109-109)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (121-121)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (120-120)

core/schemas/provider.go (1)

Provider (280-307)

core/providers/bedrock/bedrock.go (1)

core/schemas/bifrost.go (6)

BifrostContextKeyIntegrationRequest (109-109)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (121-121)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (120-120)

🔇 Additional comments (3)

transports/bifrost-http/integrations/router.go (1)

375-377: LGTM! Integration request flag is correctly set.

The context flag is appropriately placed after context conversion and before request dispatch, enabling downstream providers to detect integration-origin requests and apply fallback routing logic.

core/schemas/bifrost.go (1)

105-121: LGTM! New context keys are well-defined.

The new context keys for integration request tracking and fallback indicators follow existing naming conventions and are properly documented with type annotations. These enable the cross-endpoint fallback mechanism across providers.

core/providers/cohere/cohere.go (1)

297-300: LGTM: ExtraFields properly preserve original request metadata.

The fallback logic correctly populates ExtraFields with the original RequestType, ensuring observability is maintained even when requests are rerouted. This allows tracking of which endpoint was requested versus which was actually used for fulfillment.

Also applies to: 530-533

core/providers/anthropic/anthropic.go

core/providers/bedrock/bedrock.go

core/providers/cohere/cohere.go

core/providers/openai/openai.go

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

core/providers/cohere/cohere.go (1)

281-295: Cohere integration fallbacks can still recurse infinitely when both operations are disallowed

In Cohere, the Chat↔Responses and ChatStream↔ResponsesStream fallbacks are still keyed only on BifrostContextKeyIntegrationRequest and do not use ShouldAttemptIntegrationFallback. If a custom provider configuration disables both ChatCompletionRequest and ResponsesRequest (or their streaming equivalents) for an integration request, the methods will call each other in a loop until stack overflow.

To align with the other providers and make fallbacks one-shot per request, consider switching to ShouldAttemptIntegrationFallback here. For example:

func (provider *CohereProvider) ChatCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostChatRequest) (*schemas.BifrostChatResponse, *schemas.BifrostError) {
	// Check if chat completion is allowed
	if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ChatCompletionRequest); err != nil {
-		if _, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); ok {
+		if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
			responsesResponse, err := provider.Responses(ctx, key, request.ToResponsesRequest())
			if err != nil {
				return nil, err
			}

func (provider *CohereProvider) ChatCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostChatRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
	// Check if chat completion stream is allowed
	if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ChatCompletionStreamRequest); err != nil {
-		if _, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); ok {
-			ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsChatCompletionToResponsesStreamFallback, true)
-			return provider.ResponsesStream(ctx, postHookRunner, key, request.ToResponsesRequest())
-		}
+		if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
+			ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsChatCompletionToResponsesStreamFallback, true)
+			return provider.ResponsesStream(ctx, postHookRunner, key, request.ToResponsesRequest())
+		}
		return nil, err
	}

func (provider *CohereProvider) Responses(ctx context.Context, key schemas.Key, request *schemas.BifrostResponsesRequest) (*schemas.BifrostResponsesResponse, *schemas.BifrostError) {
	// Check if chat completion is allowed
	if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ResponsesRequest); err != nil {
-		if _, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); ok {
+		if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
			chatResponse, err := provider.ChatCompletion(ctx, key, request.ToChatRequest())
			if err != nil {
				return nil, err
			}

func (provider *CohereProvider) ResponsesStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostResponsesRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
	// Check if responses stream is allowed
	if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ResponsesStreamRequest); err != nil {
-		if _, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); ok {
-			ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsResponsesToChatCompletionFallback, true)
-			return provider.ChatCompletionStream(
+		if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
+			ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsResponsesToChatCompletionFallback, true)
+			return provider.ChatCompletionStream(
				ctx,
				postHookRunner,
				key,
				request.ToChatRequest(),
			)
		}
		return nil, err
	}

This makes Cohere’s behavior consistent with OpenAI/Anthropic/Bedrock and removes the mutual-recursion hazard.

Also applies to: 348-355, 523-538, 590-603

🧹 Nitpick comments (2)

core/providers/utils/utils.go (1)
1027-1046: Integration fallback guard looks correct; consider checking the bool value explicitly

The one-shot guard using BifrostContextKeyIntegrationFallbackAttempted is a solid pattern and matches how providers use it to avoid mutual recursion. For extra clarity and robustness, you might also check the bool value of BifrostContextKeyIntegrationRequest rather than just its presence:
-func ShouldAttemptIntegrationFallback(ctx context.Context) (context.Context, bool) {
-	// Check if this is an integration request
-	if _, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); !ok {
+func ShouldAttemptIntegrationFallback(ctx context.Context) (context.Context, bool) {
+	// Check if this is an integration request
+	if isIntegration, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); !ok || !isIntegration {
 		return ctx, false
 	}
This keeps behavior the same for current callers (which set the flag to true) but avoids surprising behavior if the flag is ever present and false.
core/schemas/bifrost.go (1)

123-124: Consider aligning naming conventions for stream fallback keys.

The naming is slightly inconsistent:

BifrostContextKeyIsResponsesToChatCompletionFallback (no "Stream" suffix)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (has "Stream" suffix)

Both are used for streaming fallbacks (ResponsesStream → ChatCompletionStream and ChatCompletionStream → ResponsesStream). Consider either:

Adding "Stream" to the first key: BifrostContextKeyIsResponsesToChatCompletionStreamFallback

Removing "Stream" from the second for consistency

This is a minor naming nit that won't affect functionality.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8dca6ec and e2f10ba.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (4 hunks)
core/providers/bedrock/bedrock.go (4 hunks)
core/providers/cohere/cohere.go (4 hunks)
core/providers/gemini/gemini.go (4 hunks)
core/providers/openai/openai.go (4 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

transports/bifrost-http/integrations/router.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

core/providers/utils/utils.go
core/providers/cohere/cohere.go
core/providers/bedrock/bedrock.go
core/providers/gemini/gemini.go
core/providers/anthropic/anthropic.go
core/schemas/bifrost.go
core/providers/openai/openai.go

🧬 Code graph analysis (2)

core/providers/utils/utils.go (1)

core/schemas/bifrost.go (2)

BifrostContextKeyIntegrationRequest (109-109)

BifrostContextKeyIntegrationFallbackAttempted (110-110)

core/providers/cohere/cohere.go (3)

core/schemas/bifrost.go (5)

BifrostContextKeyIntegrationRequest (109-109)

RequestType (83-83)

ChatCompletionRequest (89-89)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/schemas/provider.go (1)

Provider (282-309)

transports/bifrost-http/handlers/inference.go (1)

ResponsesRequest (257-261)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)

GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check

🔇 Additional comments (8)

core/providers/anthropic/anthropic.go (1)

311-325: Anthropic integration fallbacks now use shared guard and avoid recursion

The use of providerUtils.ShouldAttemptIntegrationFallback in both Chat↔Responses and stream variants cleanly limits integration fallbacks to a single attempt and prevents the earlier mutual-recursion risk when both operations are disallowed. The extra-fields remapping on converted responses also looks consistent with the rest of the provider.

Also applies to: 378-383, 690-705, 756-767

core/providers/openai/openai.go (1)

613-627: OpenAI integration fallbacks correctly centralized and recursion-safe

These changes route integration-only fallbacks through ShouldAttemptIntegrationFallback, which cleanly prevents mutual recursion between Chat and Responses (and stream variants) when both operations are disallowed. The pattern is consistent with other providers and keeps fallback behavior scoped to integration requests.

Also applies to: 733-739, 1115-1129, 1233-1241

core/providers/bedrock/bedrock.go (1)

679-692: Bedrock integration fallbacks correctly use shared helper and mirror other providers

The Bedrock Chat↔Responses and stream fallbacks now route through ShouldAttemptIntegrationFallback, so only integration requests get a single cross-endpoint fallback and mutual recursion is avoided when both operations are gated off. The additional stream context flags are preserved for downstream handling. This is consistent with the patterns in OpenAI and Anthropic.

Also applies to: 763-767, 935-948, 1020-1030

core/providers/gemini/gemini.go (4)

272-275: Stream fallback correctly sets context flag for response type tracking.

Setting BifrostContextKeyIsChatCompletionToResponsesStreamFallback enables proper response type handling downstream when the ResponsesStream output needs to be converted/returned as ChatCompletionStream format.

516-528: Responses → ChatCompletion fallback mirrors the reverse pattern correctly.

The fallback logic properly converts the request/response and sets RequestType to ResponsesRequest to maintain API contract consistency for the caller.

575-578: Stream fallback correctly delegates to ChatCompletionStream.

The BifrostContextKeyIsResponsesToChatCompletionFallback flag enables the ChatCompletionStream handler to properly format responses for the Responses API contract if needed.

220-232: Verify infinite loop prevention in ShouldAttemptIntegrationFallback.

The ChatCompletion → Responses fallback pattern looks correct: it checks operation eligibility via ShouldAttemptIntegrationFallback, converts requests/responses appropriately, and preserves the original RequestType in ExtraFields. However, confirm that ShouldAttemptIntegrationFallback sets BifrostContextKeyIntegrationFallbackAttempted in the returned context to prevent ChatCompletion → Responses → ChatCompletion loops when both endpoints are disabled, and verify this flag is checked consistently across all four fallback paths (ChatCompletion→Responses, ChatCompletionStream→ResponsesStream, Responses→ChatCompletion, ResponsesStream→ChatCompletionStream).

core/schemas/bifrost.go (1)

109-110: New integration fallback context keys are well-structured.

BifrostContextKeyIntegrationRequest marks integration-sourced requests, and BifrostContextKeyIntegrationFallbackAttempted acts as a circuit breaker to prevent infinite fallback loops between ChatCompletion and Responses endpoints.

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (5)

core/providers/cohere/cohere.go (4)
350-353: Recursion guard incomplete: ChatCompletionStream must check flag before fallback.

The past reviewer explicitly stated: "❌ But no method checks for the flag to prevent recursion" and requested that each streaming method check the complementary flag at entry to short-circuit the fallback chain.

Currently, this method sets a flag before calling ResponsesStream, but if ResponsesStream operation is also disallowed and attempts to fall back, this method doesn't check whether it's already in a fallback from ResponsesStream.

Add the check for the complementary flag:
 if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ChatCompletionStreamRequest); err != nil {
-	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
+	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback && ctx.Value(schemas.BifrostContextKeyIsResponsesToChatCompletionFallback) == nil {
 		ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsChatCompletionToResponsesStreamFallback, true)
 		return provider.ResponsesStream(ctx, postHookRunner, key, request.ToResponsesRequest())
Based on past review comments.

283-295: Recursion guard incomplete: ChatCompletion must check and set fallback flag.

Per the detailed past review feedback, the fallback implementation is incomplete. When both ChatCompletionRequest and ResponsesRequest are disabled, the current code allows infinite recursion: ChatCompletion → Responses → ChatCompletion → loop continues.

The missing safeguards:

Check if BifrostContextKeyIsResponsesToChatCompletionFallback is already set (indicating we're in a fallback from Responses)

Set BifrostContextKeyIsChatCompletionToResponsesStreamFallback before calling provider.Responses

Apply this pattern:
 if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ChatCompletionRequest); err != nil {
-	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
+	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback && ctx.Value(schemas.BifrostContextKeyIsResponsesToChatCompletionFallback) == nil {
+		ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsChatCompletionToResponsesStreamFallback, true)
 		responsesResponse, err := provider.Responses(ctx, key, request.ToResponsesRequest())
Based on past review comments and the pattern required across all four methods.

526-538: Recursion guard incomplete: Responses must check and set fallback flag.

This method has the same infinite recursion vulnerability as ChatCompletion. When both operations are disabled, Responses → ChatCompletion → Responses → loop continues.

The missing safeguards:

Check if BifrostContextKeyIsChatCompletionToResponsesStreamFallback is already set (indicating we're in a fallback from ChatCompletion)

Set BifrostContextKeyIsResponsesToChatCompletionFallback before calling provider.ChatCompletion

Apply this pattern:
 if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ResponsesRequest); err != nil {
-	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
+	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback && ctx.Value(schemas.BifrostContextKeyIsChatCompletionToResponsesStreamFallback) == nil {
+		ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsResponsesToChatCompletionFallback, true)
 		chatResponse, err := provider.ChatCompletion(ctx, key, request.ToChatRequest())
Based on past review comments.

593-601: Recursion guard incomplete: ResponsesStream must check flag before fallback.

As noted in past reviews, this method sets a flag but doesn't check whether it's already in a fallback from ChatCompletionStream, creating a potential infinite loop.

Add the check for the complementary flag:
 if err := providerUtils.CheckOperationAllowed(schemas.Cohere, provider.customProviderConfig, schemas.ResponsesStreamRequest); err != nil {
-	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
+	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback && ctx.Value(schemas.BifrostContextKeyIsChatCompletionToResponsesStreamFallback) == nil {
 		ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsResponsesToChatCompletionFallback, true)
 		return provider.ChatCompletionStream(
Based on past review comments.
core/providers/bedrock/bedrock.go (1)
936-948: Recursion guard incomplete: Responses must check and set fallback flag.

This method has the same mutual recursion vulnerability. When both operations are disabled, the methods call each other indefinitely.

The missing safeguards:

Check if BifrostContextKeyIsChatCompletionToResponsesStreamFallback is already set

Set BifrostContextKeyIsResponsesToChatCompletionFallback before calling provider.ChatCompletion

Apply this pattern:
 if err := providerUtils.CheckOperationAllowed(schemas.Bedrock, provider.customProviderConfig, schemas.ResponsesRequest); err != nil {
-	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback {
+	if ctx, shouldFallback := providerUtils.ShouldAttemptIntegrationFallback(ctx); shouldFallback && ctx.Value(schemas.BifrostContextKeyIsChatCompletionToResponsesStreamFallback) == nil {
+		ctx = context.WithValue(ctx, schemas.BifrostContextKeyIsResponsesToChatCompletionFallback, true)
 		chatResponse, err := provider.ChatCompletion(ctx, key, request.ToChatRequest())
Based on past review comments.

🧹 Nitpick comments (2)

core/schemas/bifrost.go (1)

103-125: Context keys look consistent; consider updating the stale comment.

The new integration-related context keys fit well with the usage in providers/utils and provider fallbacks. The only nit is that the preceding comment (BifrostContextKeyRequestType is a context key for the request type.) no longer matches the contents of this const block and can be removed or generalized to avoid confusion.
core/providers/utils/utils.go (1)
1028-1046: Integration fallback guard is correct; minor robustness tweak possible.

The helper correctly:

Restricts fallback to integration flows (by key presence), and

Prevents repeated fallbacks on the same context using BifrostContextKeyIntegrationFallbackAttempted.

If you ever intend BifrostContextKeyIntegrationRequest to act as an on/off toggle, you might tighten the check to also require true, e.g.:
-func ShouldAttemptIntegrationFallback(ctx context.Context) (context.Context, bool) {
-	// Check if this is an integration request
-	if _, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); !ok {
+func ShouldAttemptIntegrationFallback(ctx context.Context) (context.Context, bool) {
+	// Check if this is an integration request
+	if isIntegration, ok := ctx.Value(schemas.BifrostContextKeyIntegrationRequest).(bool); !ok || !isIntegration {
 		return ctx, false
 	}
Not required for correctness today, but makes the semantics future‑proof.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e2f10ba and d1090eb.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (4 hunks)
core/providers/bedrock/bedrock.go (4 hunks)
core/providers/cohere/cohere.go (4 hunks)
core/providers/gemini/gemini.go (4 hunks)
core/providers/openai/openai.go (4 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

transports/bifrost-http/integrations/router.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

core/providers/cohere/cohere.go
core/providers/utils/utils.go
core/providers/gemini/gemini.go
core/providers/anthropic/anthropic.go
core/providers/openai/openai.go
core/schemas/bifrost.go
core/providers/bedrock/bedrock.go

🧬 Code graph analysis (3)

core/providers/utils/utils.go (1)

core/schemas/bifrost.go (2)

BifrostContextKeyIntegrationRequest (109-109)

BifrostContextKeyIntegrationFallbackAttempted (110-110)

core/providers/gemini/gemini.go (2)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (5)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (124-124)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/providers/openai/openai.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (5)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (124-124)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/schemas/provider.go (1)

Provider (282-309)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check

🔇 Additional comments (3)

core/providers/anthropic/anthropic.go (1)

311-324: Integration-only cross-endpoint fallback is wired correctly and avoids recursion.

For both chat↔responses and their streaming variants you now:

Gate on CheckOperationAllowed.

Only attempt fallback when ShouldAttemptIntegrationFallback returns true (i.e., integration request and no prior fallback).

Rebind ctx with the updated value before delegating so the second endpoint sees BifrostContextKeyIntegrationFallbackAttempted and does not recurse.

Adjust ExtraFields on the synthesized response to reflect the original request type/provider/model.

This addresses the previous mutual-recursion risk without affecting non-integration traffic.

Also applies to: 378-383, 690-703, 756-765

core/providers/openai/openai.go (1)

613-627: OpenAI integration fallbacks are consistent and recursion-safe.

The OpenAI provider now mirrors the shared integration-fallback design:

Non-streaming: on disallowed Chat or Responses ops, you attempt a single integration-only fallback to the opposite endpoint, convert the result (ToBifrostChatResponse / ToBifrostResponsesResponse), and reset ExtraFields (RequestType, Provider, ModelRequested) to the original request.

Streaming: similar logic, but you tag the context with the appropriate stream-fallback key before delegating, enabling downstream handling while ShouldAttemptIntegrationFallback’s sentinel prevents mutual recursion.

This is a clean resolution of the earlier recursion concerns while keeping behavior scoped to integration requests.

Also applies to: 733-739, 1115-1129, 1235-1240

core/providers/gemini/gemini.go (1)

217-232: Gemini fallbacks match the shared integration pattern and are safe.

These changes bring Gemini in line with the other providers:

For non-streaming, a disallowed Chat/Responses op causes a single integration-scoped fallback to the opposite endpoint, with the response converted and ExtraFields reset to the original request type/provider/model.

For streaming, you set the relevant fallback context key and delegate to the opposite stream method; any second attempt to fallback is blocked by ShouldAttemptIntegrationFallback.

This keeps the behavior consistent across providers and avoids mutual recursion when custom configs gate multiple endpoints.

Also applies to: 269-275, 515-528, 571-578

core/providers/bedrock/bedrock.go

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d1090eb and dcf76b0.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (4 hunks)
core/providers/bedrock/bedrock.go (4 hunks)
core/providers/cohere/cohere.go (4 hunks)
core/providers/gemini/gemini.go (4 hunks)
core/providers/openai/openai.go (4 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (4)

transports/bifrost-http/integrations/router.go
core/providers/cohere/cohere.go
core/providers/utils/utils.go
core/schemas/bifrost.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

core/providers/anthropic/anthropic.go
core/providers/gemini/gemini.go
core/providers/bedrock/bedrock.go
core/providers/openai/openai.go

🧬 Code graph analysis (3)

core/providers/gemini/gemini.go (2)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (5)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (124-124)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/providers/bedrock/bedrock.go (4)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (5)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (124-124)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/schemas/provider.go (1)

Provider (282-309)

core/schemas/models.go (1)

Model (109-129)

core/providers/openai/openai.go (2)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (5)

RequestType (83-83)

ChatCompletionRequest (89-89)

BifrostContextKeyIsChatCompletionToResponsesStreamFallback (124-124)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check

🔇 Additional comments (15)

core/providers/bedrock/bedrock.go (4)

680-692: LGTM: ChatCompletion fallback correctly implemented.

The fallback from ChatCompletion to Responses is well-structured:

ShouldAttemptIntegrationFallback provides sufficient recursion protection (confirmed in past review discussion)

Request/response conversion properly handled via ToResponsesRequest() and ToBifrostChatResponse()

ExtraFields correctly populated to indicate the original request type

The implementation is consistent with the broader stack pattern across all providers.

764-767: LGTM: ChatCompletionStream fallback correctly implemented.

The streaming fallback is properly structured:

Delegates directly to ResponsesStream while maintaining streaming semantics

Sets BifrostContextKeyIsChatCompletionToResponsesStreamFallback to track conversion direction for response handling

Recursion protection via ShouldAttemptIntegrationFallback is sufficient (as confirmed in past reviews)

936-947: LGTM: Responses fallback correctly implemented.

The fallback from Responses to ChatCompletion mirrors the ChatCompletion fallback pattern:

Proper request/response conversion via ToChatRequest() and ToBifrostResponsesResponse()

ExtraFields consistently populated

Recursion protection in place

The symmetric implementation across ChatCompletion↔Responses ensures predictable fallback behavior.

1022-1030: LGTM: ResponsesStream fallback correctly implemented.

The streaming fallback completes the symmetric fallback implementation:

Delegates to ChatCompletionStream with proper request conversion

Sets BifrostContextKeyIsResponsesToChatCompletionFallback for response handling

Maintains streaming semantics through direct channel return

All four fallback paths (ChatCompletion↔Responses and their streaming variants) are consistently implemented, providing robust cross-endpoint compatibility for integration requests.

core/providers/gemini/gemini.go (4)

272-275: LGTM: Streaming fallback correctly delegates to ResponsesStream.

The context flag BifrostContextKeyIsChatCompletionToResponsesStreamFallback is appropriately set to mark the fallback path for downstream processing. The pattern is consistent with the non-streaming fallback approach.

516-528: LGTM: Responses-to-ChatCompletion fallback mirrors the inverse pattern.

The fallback logic is consistent with the ChatCompletion implementation, properly converting responses and maintaining metadata. The symmetry between the two directions is well-implemented.

575-578: LGTM: ResponsesStream fallback completes the bidirectional fallback implementation.

The context flag BifrostContextKeyIsResponsesToChatCompletionFallback properly marks the fallback direction. The implementation is consistent with the other three fallback paths.

220-232: LGTM: Integration fallback correctly implemented with proper recursion guard.

The ChatCompletion-to-Responses fallback follows the established pattern across providers: it checks via ShouldAttemptIntegrationFallback whether fallback should proceed, delegates to the alternative Responses endpoint, converts the response using ToBifrostChatResponse(), and preserves metadata in ExtraFields. The guard prevents infinite recursion, and the implementation mirrors OpenAI, Cohere, Bedrock, and Anthropic.

core/providers/openai/openai.go (3)

615-627: LGTM! Recursion guard correctly implemented.

The fallback mechanism properly uses ShouldAttemptIntegrationFallback which marks the fallback as attempted before returning. When the inner Responses call also fails its operation check, ShouldAttemptIntegrationFallback will return false (due to BifrostContextKeyIntegrationFallbackAttempted being set), preventing infinite recursion.

1117-1129: LGTM! Consistent with the ChatCompletion fallback implementation.

The fallback from Responses to ChatCompletion follows the same pattern with proper recursion guarding via ShouldAttemptIntegrationFallback.

1236-1239: LGTM! Streaming fallback with response conversion handling.

The fallback correctly sets BifrostContextKeyIsResponsesToChatCompletionFallback, which is checked in HandleOpenAIChatCompletionStreaming (lines 785-792) to enable conversion of chat streaming responses to the responses stream format (lines 954-1008).

core/providers/anthropic/anthropic.go (4)

311-325: Chat→Responses non-stream fallback looks correct and integration-scoped

The fallback is cleanly gated on ShouldAttemptIntegrationFallback, delegates via request.ToResponsesRequest(), and then re-normalizes ExtraFields to Chat semantics, which matches the cross-endpoint design without obvious recursion or data loss in this layer.

378-383: Streaming Chat→Responses fallback wiring is consistent

The streaming fallback correctly reuses ResponsesStream, tags the context with BifrostContextKeyIsChatCompletionToResponsesStreamFallback, and is gated by ShouldAttemptIntegrationFallback, which keeps the behavior integration-specific and ready for downstream converters.

690-704: Responses→Chat non-stream fallback is symmetrical and semantically correct

This block mirrors the Chat fallback: integration-only delegation via ShouldAttemptIntegrationFallback, request conversion with request.ToChatRequest(), and re-normalization of ExtraFields to ResponsesRequest, which keeps external behavior consistent.

756-765: Streaming Responses→Chat fallback mirrors the Chat path appropriately

The streaming Responses fallback is the expected inverse of the Chat path: it’s gated on ShouldAttemptIntegrationFallback, marks the context with BifrostContextKeyIsResponsesToChatCompletionFallback, and reuses ChatCompletionStream, which is aligned with the stack-wide integration-fallback design.

core/providers/openai/openai.go

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

core/schemas/bifrost.go (1)
103-104: Consider updating the block comment.

The comment states "BifrostContextKeyRequestType is a context key for the request type," but this const block now contains many diverse context keys beyond just request type (virtual key, request IDs, integration flags, etc.). A more general description would better reflect the block's contents.
-// BifrostContextKeyRequestType is a context key for the request type.
+// BifrostContextKey constants define context keys used throughout Bifrost for request metadata, integration handling, and response control.
 const (

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dcf76b0 and 42528f8.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (2 hunks)
core/providers/bedrock/bedrock.go (2 hunks)
core/providers/cohere/cohere.go (2 hunks)
core/providers/gemini/gemini.go (2 hunks)
core/providers/openai/openai.go (2 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

transports/bifrost-http/integrations/router.go
core/providers/cohere/cohere.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

core/providers/utils/utils.go
core/providers/openai/openai.go
core/providers/gemini/gemini.go
core/providers/anthropic/anthropic.go
core/providers/bedrock/bedrock.go
core/schemas/bifrost.go

🧠 Learnings (1)

📚 Learning: 2025-12-09T17:07:42.007Z

Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

core/providers/utils/utils.go
core/providers/openai/openai.go
core/providers/gemini/gemini.go
core/providers/anthropic/anthropic.go
core/providers/bedrock/bedrock.go
core/schemas/bifrost.go

🧬 Code graph analysis (4)

core/providers/utils/utils.go (1)

core/schemas/bifrost.go (2)

BifrostContextKeyIntegrationRequest (109-109)

BifrostContextKeyIntegrationFallbackAttempted (110-110)

core/providers/openai/openai.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

transports/bifrost-http/handlers/inference.go (1)

ResponsesRequest (257-261)

core/providers/anthropic/anthropic.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (2)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/schemas/provider.go (1)

Provider (282-309)

core/providers/bedrock/bedrock.go (5)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1032-1046)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

transports/bifrost-http/handlers/inference.go (1)

ResponsesRequest (257-261)

core/schemas/provider.go (1)

Provider (282-309)

core/schemas/models.go (1)

Model (109-129)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check

🔇 Additional comments (11)

core/providers/openai/openai.go (2)

1099-1112: Integration fallback for Responses is correctly scoped and preserves metadata.

The guarded use of ShouldAttemptIntegrationFallback plus request.ToChatRequest() and ToBifrostResponsesResponse() gives a single, integration-only fallback path while keeping provider/model and latency from the chat path, and avoids recursion via the fallback-attempt flag.

1218-1222: Streaming fallback wiring aligns with existing chat→responses conversion.

Delegating to ChatCompletionStream with BifrostContextKeyIsResponsesToChatCompletionFallback set matches the pattern used in HandleOpenAIChatCompletionStreaming, which already converts chat chunks into ResponsesStream chunks under this flag; this looks consistent and safe.

core/providers/anthropic/anthropic.go (2)

673-686: Non‑streaming Responses→ChatCompletion fallback is straightforward and safe.

The integration-only fallback uses ShouldAttemptIntegrationFallback and request.ToChatRequest(), then converts back via ToBifrostResponsesResponse() while restoring ExtraFields for responses; recursion is prevented by the shared fallback-attempt flag.

739-747: Streaming integration fallback is wired, but verify response shape expectations.

Responses-stream integration fallbacks now delegate to ChatCompletionStream with the BifrostContextKeyIsResponsesToChatCompletionFallback flag set; ensure downstream streaming handlers or HTTP integration code recognize this flag and, where required, emit responses-style chunks for /responses integrations rather than raw chat chunks.

core/providers/bedrock/bedrock.go (2)

918-931: Non‑streaming Responses fallback for Bedrock is consistent with other providers.

Using ShouldAttemptIntegrationFallback to gate a single integration-only call into ChatCompletion and mapping back via ToBifrostResponsesResponse() with corrected ExtraFields gives the intended cross-endpoint behavior without introducing recursion.

1005-1013: Streaming Responses→ChatCompletion fallback is correctly guarded but relies on downstream handling.

The integration-only fallback with ShouldAttemptIntegrationFallback plus the BifrostContextKeyIsResponsesToChatCompletionFallback marker is sound; confirm that Bedrock's chat streaming path (or integration handlers) treat this flag appropriately so /responses integrations still see responses-stream semantics.

core/providers/utils/utils.go (1)

1028-1045: ShouldAttemptIntegrationFallback cleanly centralizes integration-only, single-attempt fallback gating.

Logic is minimal and correct: it restricts fallback to integration requests and uses a context flag to guarantee only one attempt per request; existing call sites correctly reassign the returned context before delegating.

core/providers/gemini/gemini.go (2)

502-514: Non‑streaming Responses→ChatCompletion fallback for Gemini is consistent and safe.

The guarded integration-only fallback to ChatCompletion plus conversion via ToBifrostResponsesResponse() and corrected ExtraFields mirrors other providers and avoids recursion through the shared fallback-attempt context flag.

561-564: Confirm that Gemini's ChatCompletionStream preserves the /responses streaming contract when BifrostContextKeyIsResponsesToChatCompletionFallback is set.

When a ResponsesStream request falls back to ChatCompletionStream, verify that:

The Gemini streaming path or HTTP integration layer inspects the BifrostContextKeyIsResponsesToChatCompletionFallback flag

Responses-style stream chunks are emitted (not raw chat completion chunks) when this flag is true

The streaming contract for /responses integrations is maintained through the fallback

core/schemas/bifrost.go (2)

109-110: LGTM on integration context keys.

The new BifrostContextKeyIntegrationRequest and BifrostContextKeyIntegrationFallbackAttempted keys appropriately support the cross-endpoint fallback feature. The IntegrationFallbackAttempted flag provides a clean mechanism to prevent infinite fallback loops.

123-123: Verify: Is a reverse-direction fallback key needed for chat completion → responses?

The PR objectives mention bidirectional fallback support (chat completion ↔ responses), but only BifrostContextKeyIsResponsesToChatCompletionFallback is defined. If the reverse direction (chat completion → responses fallback) requires special handling (e.g., response transformation, logging, or plugin logic differentiation), a corresponding BifrostContextKeyIsChatCompletionToResponsesFallback key should be added.

If the reverse direction uses generic fallback handling without direction-specific context flagging, the single key is sufficient.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

core/providers/cohere/cohere.go (1)

505-519: Integration fallback from Responses to ChatCompletion looks correct; only nit is the comment text.

The new integration-aware fallback path is wired cleanly:

Uses providerUtils.CheckOperationAllowed(..., schemas.ResponsesRequest) and only falls back when disallowed.

Gates fallback via providerUtils.ShouldAttemptIntegrationFallback(ctx), so it only triggers for integration requests and at most once per context (no recursion risk even if both operations are disabled).

Converts the chat response to a BifrostResponsesResponse and explicitly sets ExtraFields.RequestType, Provider, and ModelRequested, keeping telemetry consistent with direct Responses calls.

Minor clarity nit: the inline comment on Line 505 still says “Check if chat completion is allowed” even though the check is for schemas.ResponsesRequest; consider updating it to “responses” to avoid confusion during future maintenance.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 42528f8 and 5edcf23.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (2 hunks)
core/providers/bedrock/bedrock.go (2 hunks)
core/providers/cohere/cohere.go (2 hunks)
core/providers/gemini/gemini.go (2 hunks)
core/providers/openai/openai.go (2 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

core/schemas/bifrost.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

core/providers/openai/openai.go
core/providers/anthropic/anthropic.go
core/providers/bedrock/bedrock.go
transports/bifrost-http/integrations/router.go
core/providers/gemini/gemini.go
core/providers/cohere/cohere.go
core/providers/utils/utils.go

🧠 Learnings (2)

📚 Learning: 2025-12-09T17:07:42.007Z

Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

core/providers/openai/openai.go
core/providers/anthropic/anthropic.go
core/providers/bedrock/bedrock.go
transports/bifrost-http/integrations/router.go
core/providers/gemini/gemini.go
core/providers/cohere/cohere.go
core/providers/utils/utils.go

📚 Learning: 2025-12-11T11:58:25.307Z

Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

core/providers/openai/openai.go

🧬 Code graph analysis (5)

core/providers/openai/openai.go (4)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1118-1132)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

ui/lib/types/config.ts (1)

RequestType (122-134)

transports/bifrost-http/handlers/inference.go (1)

ResponsesRequest (257-261)

core/providers/anthropic/anthropic.go (5)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1118-1132)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

ui/lib/types/config.ts (1)

RequestType (122-134)

transports/bifrost-http/handlers/inference.go (1)

ResponsesRequest (257-261)

core/schemas/provider.go (1)

Provider (282-309)

transports/bifrost-http/integrations/router.go (1)

core/schemas/bifrost.go (1)

BifrostContextKeyIntegrationRequest (109-109)

core/providers/gemini/gemini.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1118-1132)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/schemas/provider.go (1)

Provider (282-309)

core/providers/utils/utils.go (1)

core/schemas/bifrost.go (2)

BifrostContextKeyIntegrationRequest (109-109)

BifrostContextKeyIntegrationFallbackAttempted (110-110)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check

🔇 Additional comments (8)

transports/bifrost-http/integrations/router.go (1)

378-380: Integration origin flag is set in the right place

Setting BifrostContextKeyIntegrationRequest on the bifrost context here cleanly scopes “integration” semantics to all downstream provider calls and enables the new ShouldAttemptIntegrationFallback helper without affecting non-integration paths.

core/providers/gemini/gemini.go (1)

501-515: Gemini Responses→ChatCompletion integration fallback is well-scoped and non-recursive

The new Responses/ResponsesStream fallbacks correctly:

Gate on CheckOperationAllowed and ShouldAttemptIntegrationFallback, so they only run for integration requests and only once per context.

Delegate to ChatCompletion / ChatCompletionStream using request.ToChatRequest() without introducing a reverse fallback, avoiding recursion.

Normalize ExtraFields back to ResponsesRequest semantics after the non-streaming fallback.

This is consistent with the cross-provider integration-fallback pattern introduced in this stack.

Also applies to: 565-569

core/providers/utils/utils.go (1)

1114-1132: ShouldAttemptIntegrationFallback cleanly centralizes integration-only fallback gating

This helper correctly:

Limits fallbacks to requests explicitly marked as integration-origin.

Ensures only a single fallback attempt per context via BifrostContextKeyIntegrationFallbackAttempted.

It’s a good abstraction point for the provider-level fallback logic added in this stack.

core/providers/bedrock/bedrock.go (1)

918-932: Bedrock Responses fallbacks align with the shared integration-fallback pattern

For Bedrock:

Non-streaming Responses now uses ShouldAttemptIntegrationFallback and a one-way Responses → ChatCompletion conversion with ToBifrostResponsesResponse, then resets ExtraFields to ResponsesRequest/provider/model semantics.

Streaming ResponsesStream mirrors this by gating with ShouldAttemptIntegrationFallback, tagging the context with BifrostContextKeyIsResponsesToChatCompletionFallback, and delegating to ChatCompletionStream.

Because Bedrock’s chat methods don’t themselves fallback to responses, this removes the earlier mutual-recursion risk while still providing integration-only cross-endpoint compatibility.

Also applies to: 1004-1014

core/providers/openai/openai.go (1)

1096-1112: OpenAI Responses fallbacks are correctly integration-scoped and reuse existing streaming adapters

The new OpenAI fallbacks:

Non-streaming: On disallowed ResponsesRequest, ShouldAttemptIntegrationFallback gates an integration-only, single-shot Responses → ChatCompletion fallback, with ToBifrostResponsesResponse() and ExtraFields rewritten to ResponsesRequest/provider/model semantics.

Streaming: On disallowed ResponsesStreamRequest, the same helper gates a ResponsesStream → ChatCompletionStream fallback and sets BifrostContextKeyIsResponsesToChatCompletionFallback, which HandleOpenAIChatCompletionStreaming already uses to emit BifrostResponsesStreamResponse chunks instead of chat chunks.

With ChatCompletion/ChatCompletionStream themselves no longer attempting reciprocal fallbacks, this removes the prior infinite-recursion risk while preserving the intended integration behavior.

Also applies to: 1218-1222

core/providers/cohere/cohere.go (1)

573-583: Streaming ResponsesStream → ChatCompletionStream fallback is safely guarded and consistent with non-streaming.

The streaming fallback mirrors the non-stream behavior:

Only attempts fallback for integration requests when ResponsesStream is disallowed, via ShouldAttemptIntegrationFallback, which also prevents multiple attempts.

Marks the direction with schemas.BifrostContextKeyIsResponsesToChatCompletionFallback before delegating, allowing downstream logic/observability to distinguish fallback streams.

On failure to fallback, it cleanly returns the original CheckOperationAllowed error.

This fits the cross-endpoint integration fallback design without introducing recursion or breaking existing behavior.

core/providers/anthropic/anthropic.go (2)

673-686: Non-stream Responses integration fallback to ChatCompletion is well-scoped and consistent.

This block correctly:

Checks ResponsesRequest via CheckOperationAllowed and only considers fallback on failure.

Uses ShouldAttemptIntegrationFallback(ctx) so fallback is limited to integration requests and only attempted once.

Delegates to ChatCompletion with request.ToChatRequest(), then maps back via ToBifrostResponsesResponse().

Resets ExtraFields.RequestType, Provider, and ModelRequested so callers still see this as a responses invocation.

The behavior is aligned with other providers and avoids recursion.

735-743: Streaming ResponsesStream fallback to ChatCompletionStream is correctly guarded and annotated.

For the streaming path:

Fallback is gated by ShouldAttemptIntegrationFallback, ensuring it’s integration-only and one-shot.

The context is annotated with BifrostContextKeyIsResponsesToChatCompletionFallback before invoking ChatCompletionStream, which is useful for downstream handling/telemetry.

If fallback isn’t allowed or fails the guard, the original CheckOperationAllowed error is returned, preserving existing semantics.

Overall, the streaming fallback implementation matches the intended integration behavior without introducing re-entry issues.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

core/providers/openai/openai.go (1)
1218-1223: Streaming fallback wiring matches the chat-stream conversion path

For ResponsesStream, you correctly:

Gate via ShouldAttemptIntegrationFallback

Tag the context with BifrostContextKeyIsResponsesToChatCompletionFallback

Delegate to ChatCompletionStream with request.ToChatRequest()

This plugs directly into the existing HandleOpenAIChatCompletionStreaming logic that branches on this flag to emit responses-shaped stream chunks. One minor nit: the preceding comment still says “chat completion stream” while the check is for ResponsesStreamRequest; consider updating it for clarity.
-	// Check if chat completion stream is allowed for this provider
+	// Check if responses stream is allowed for this provider

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5edcf23 and e72dca6.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (2 hunks)
core/providers/bedrock/bedrock.go (2 hunks)
core/providers/cohere/cohere.go (2 hunks)
core/providers/gemini/gemini.go (2 hunks)
core/providers/openai/openai.go (2 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

core/providers/utils/utils.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

transports/bifrost-http/integrations/router.go
core/providers/gemini/gemini.go
core/providers/anthropic/anthropic.go
core/providers/cohere/cohere.go
core/providers/bedrock/bedrock.go
core/providers/openai/openai.go
core/schemas/bifrost.go

🧠 Learnings (2)

📚 Learning: 2025-12-09T17:07:42.007Z

Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

transports/bifrost-http/integrations/router.go
core/providers/gemini/gemini.go
core/providers/anthropic/anthropic.go
core/providers/cohere/cohere.go
core/providers/bedrock/bedrock.go
core/providers/openai/openai.go
core/schemas/bifrost.go

📚 Learning: 2025-12-11T11:58:25.307Z

Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

core/providers/openai/openai.go

🧬 Code graph analysis (5)

transports/bifrost-http/integrations/router.go (1)

core/schemas/bifrost.go (1)

BifrostContextKeyIntegrationRequest (109-109)

core/providers/gemini/gemini.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1127-1141)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

ui/lib/types/config.ts (1)

RequestType (122-134)

core/providers/cohere/cohere.go (4)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1127-1141)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

ui/lib/types/config.ts (1)

RequestType (122-134)

core/schemas/provider.go (1)

Provider (282-309)

core/providers/bedrock/bedrock.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1127-1141)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/schemas/provider.go (1)

Provider (282-309)

core/providers/openai/openai.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1127-1141)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

ui/lib/types/config.ts (1)

RequestType (122-134)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)

GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check

🔇 Additional comments (8)

core/providers/anthropic/anthropic.go (2)

673-687: Integration-only Responses → ChatCompletion fallback looks sound

On CheckOperationAllowed failure you gate the fallback through ShouldAttemptIntegrationFallback, then bridge via request.ToChatRequest() and ToBifrostResponsesResponse() while normalizing ExtraFields to ResponsesRequest. This keeps behavior integration-specific and single-attempt without introducing recursion, since ChatCompletion has no reverse fallback here.

734-743: Streaming Responses → ChatCompletionStream fallback is correctly scoped

The streaming fallback mirrors the non-streaming path, using ShouldAttemptIntegrationFallback and tagging the context with BifrostContextKeyIsResponsesToChatCompletionFallback before delegating to ChatCompletionStream. This cleanly reuses existing chat streaming logic for integrations while avoiding repeated fallbacks.

transports/bifrost-http/integrations/router.go (1)

378-380: Context flag for integration requests correctly enables provider fallbacks

Marking BifrostContextKeyIntegrationRequest on bifrostCtx here ensures all integration HTTP routes participate in the new ShouldAttemptIntegrationFallback behavior while leaving non-integration callers unaffected. This is the right place in the flow, as it happens before the Bifrost client call but after request parsing.

core/providers/openai/openai.go (1)

1099-1113: Responses → ChatCompletion integration fallback is correct and recursion-safe

The new branch uses ShouldAttemptIntegrationFallback to gate integration-only fallback, then converts to chat and back via ToBifrostResponsesResponse(), resetting ExtraFields.RequestType/Provider/ModelRequested to match a true responses call. Since ChatCompletion no longer falls back to Responses, there’s no risk of mutual recursion, and the original err is still returned when fallback isn’t allowed.

core/providers/bedrock/bedrock.go (1)

1004-1015: Streaming Responses → ChatCompletionStream fallback is correctly gated

The streaming fallback for Bedrock:

Reuses ShouldAttemptIntegrationFallback so only integration calls and only a first attempt are routed

Sets BifrostContextKeyIsResponsesToChatCompletionFallback before delegating to ChatCompletionStream

Leaves the normal Bedrock ResponsesStream path unchanged when the operation is allowed

This keeps semantics aligned with other providers and avoids recursion while still giving integrations a working stream when responses streaming is disabled.

core/providers/cohere/cohere.go (2)

505-520: Cohere Responses → ChatCompletion integration fallback is consistent and safe

When ResponsesRequest is disallowed, you now conditionally fall back via ShouldAttemptIntegrationFallback, call ChatCompletion with request.ToChatRequest(), and convert back using ToBifrostResponsesResponse() while setting ExtraFields for the responses API. Since Cohere’s ChatCompletion doesn’t attempt its own fallback, this is a one-way, integration-only bridge with no recursion concerns.

573-583: Streaming Responses → ChatCompletionStream fallback hooks correctly into chat streaming

For ResponsesStream, the new branch:

Uses ShouldAttemptIntegrationFallback for integration-only, single-attempt behavior

Marks BifrostContextKeyIsResponsesToChatCompletionFallback on the context

Delegates to ChatCompletionStream with the converted chat request

This matches the stack-wide pattern and lets integrations get a working stream even when Cohere’s responses streaming is disabled, without affecting direct Bifrost callers.

core/schemas/bifrost.go (1)

105-126: LGTM! Context keys properly defined for integration fallback feature.

The new context keys are well-documented with type annotations and follow the established naming convention. They provide the necessary infrastructure for tracking integration requests and fallback state across the provider chain.

core/providers/bedrock/bedrock.go

core/providers/gemini/gemini.go

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

core/providers/bedrock/bedrock.go (1)

914-933: Update Responses comment to reference Bedrock, not Anthropic

The function comment still mentions “Anthropic’s API” even though this is BedrockProvider.Responses. Consider updating it to describe Bedrock correctly to avoid confusion for future readers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e72dca6 and dc29eae.

📒 Files selected for processing (8)

core/providers/anthropic/anthropic.go (2 hunks)
core/providers/bedrock/bedrock.go (2 hunks)
core/providers/cohere/cohere.go (2 hunks)
core/providers/gemini/gemini.go (2 hunks)
core/providers/openai/openai.go (2 hunks)
core/providers/utils/utils.go (1 hunks)
core/schemas/bifrost.go (1 hunks)
transports/bifrost-http/integrations/router.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (4)

core/providers/cohere/cohere.go
core/schemas/bifrost.go
core/providers/gemini/gemini.go
transports/bifrost-http/integrations/router.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

core/providers/utils/utils.go
core/providers/openai/openai.go
core/providers/bedrock/bedrock.go
core/providers/anthropic/anthropic.go

🧠 Learnings (2)

📚 Learning: 2025-12-09T17:07:42.007Z

Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

core/providers/utils/utils.go
core/providers/openai/openai.go
core/providers/bedrock/bedrock.go
core/providers/anthropic/anthropic.go

📚 Learning: 2025-12-11T11:58:25.307Z

Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

core/providers/openai/openai.go

🧬 Code graph analysis (4)

core/providers/utils/utils.go (1)

core/schemas/bifrost.go (2)

BifrostContextKeyIntegrationRequest (109-109)

BifrostContextKeyIntegrationFallbackAttempted (110-110)

core/providers/openai/openai.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1127-1141)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

core/schemas/provider.go (1)

Provider (282-309)

core/providers/bedrock/bedrock.go (3)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1127-1141)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

transports/bifrost-http/handlers/inference.go (1)

ResponsesRequest (257-261)

core/providers/anthropic/anthropic.go (4)

core/providers/utils/utils.go (1)

ShouldAttemptIntegrationFallback (1127-1141)

core/schemas/bifrost.go (3)

RequestType (83-83)

ResponsesRequest (91-91)

BifrostContextKeyIsResponsesToChatCompletionFallback (123-123)

ui/lib/types/config.ts (1)

RequestType (122-134)

transports/bifrost-http/handlers/inference.go (1)

ResponsesRequest (257-261)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check
GitHub Check: Graphite / mergeability_check

🔇 Additional comments (5)

core/providers/openai/openai.go (2)

1099-1113: Integration fallback for non-stream Responses is correctly gated and normalized

Using ShouldAttemptIntegrationFallback here ensures fallback only runs for integration flows and only once, and delegating to ChatCompletion via request.ToChatRequest() then converting with ToBifrostResponsesResponse() keeps the external semantics as a ResponsesRequest while normalizing ExtraFields. Looks good and recursion-safe since ChatCompletion doesn’t itself fallback.

1218-1223: Streaming ResponsesStream→ChatCompletionStream fallback is wired correctly

The streaming fallback is properly guarded by ShouldAttemptIntegrationFallback, and setting BifrostContextKeyIsResponsesToChatCompletionFallback before calling ChatCompletionStream matches the pattern used in HandleOpenAIChatCompletionStreaming to emit Responses-style chunks. No additional guard seems necessary here.

core/providers/bedrock/bedrock.go (1)

918-931: Non-stream Bedrock Responses→ChatCompletion fallback is consistent and safe

The non-stream fallback correctly uses ShouldAttemptIntegrationFallback to gate integration-only behavior and prevent multiple attempts, then calls ChatCompletion with request.ToChatRequest() and normalizes the resulting BifrostResponsesResponse.ExtraFields for a ResponsesRequest. With no reverse Chat→Responses fallback in this provider, there’s no recursion risk here.

core/providers/utils/utils.go (1)

1122-1141: ShouldAttemptIntegrationFallback correctly encapsulates integration-only, once-per-request fallback gating

The helper cleanly enforces that fallbacks only run for integration-marked contexts and only once per request by using BifrostContextKeyIntegrationRequest and BifrostContextKeyIntegrationFallbackAttempted. This centralizes the guard logic used across providers and avoids ad hoc recursion checks.

core/providers/anthropic/anthropic.go (1)

672-686: Non-stream Anthropic Responses→ChatCompletion fallback matches the shared integration pattern

This mirrors the intended integration behavior: only integration flows can fallback (via ShouldAttemptIntegrationFallback), the fallback runs at most once, and the result is converted back into a BifrostResponsesResponse with ExtraFields normalized for ResponsesRequest. With no reverse Chat→Responses fallback here, recursion isn’t a risk.

core/providers/anthropic/anthropic.go

core/providers/bedrock/bedrock.go

…ty in custom providers

TejasGhatte requested review from Pratham-Mishra04 and akshaydeo December 9, 2025 10:24

TejasGhatte marked this pull request as ready for review December 9, 2025 10:24

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from e81d870 to 8dca6ec Compare December 9, 2025 10:30

coderabbitai bot requested changes Dec 9, 2025

View reviewed changes

TejasGhatte changed the base branch from main to graphite-base/1030 December 9, 2025 10:45

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from 8dca6ec to f2deeaf Compare December 9, 2025 10:45

TejasGhatte force-pushed the graphite-base/1030 branch from dda0b09 to 366d759 Compare December 9, 2025 10:45

TejasGhatte changed the base branch from graphite-base/1030 to 12-09-fix_gemini_transcription_test_cases December 9, 2025 10:45

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from f2deeaf to e2f10ba Compare December 9, 2025 11:00

coderabbitai bot approved these changes Dec 9, 2025

View reviewed changes

coderabbitai bot reviewed Dec 9, 2025

View reviewed changes

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from e2f10ba to d1090eb Compare December 9, 2025 12:10

coderabbitai bot requested changes Dec 9, 2025

View reviewed changes

core/providers/bedrock/bedrock.go Outdated Show resolved Hide resolved

core/providers/bedrock/bedrock.go Outdated Show resolved Hide resolved

core/providers/bedrock/bedrock.go Show resolved Hide resolved

coderabbitai bot approved these changes Dec 9, 2025

View reviewed changes

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from d1090eb to dcf76b0 Compare December 9, 2025 15:30

TejasGhatte force-pushed the 12-09-fix_gemini_transcription_test_cases branch from 366d759 to 315d945 Compare December 9, 2025 15:30

coderabbitai bot requested changes Dec 9, 2025

View reviewed changes

core/providers/openai/openai.go Outdated Show resolved Hide resolved

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from dcf76b0 to 42528f8 Compare December 10, 2025 09:59

TejasGhatte force-pushed the 12-09-fix_gemini_transcription_test_cases branch from 315d945 to e07c0c1 Compare December 10, 2025 09:59

coderabbitai bot approved these changes Dec 10, 2025

View reviewed changes

coderabbitai bot reviewed Dec 10, 2025

View reviewed changes

TejasGhatte changed the base branch from 12-09-fix_gemini_transcription_test_cases to graphite-base/1030 December 10, 2025 11:29

TejasGhatte force-pushed the graphite-base/1030 branch from e07c0c1 to 9fda43a Compare December 11, 2025 20:02

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from 42528f8 to 5edcf23 Compare December 11, 2025 20:02

TejasGhatte mentioned this pull request Dec 11, 2025

fix: integration test cases fixes #1066

Merged

16 tasks

coderabbitai bot reviewed Dec 11, 2025

View reviewed changes

Pratham-Mishra04 force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from 5edcf23 to e72dca6 Compare December 11, 2025 20:33

Pratham-Mishra04 force-pushed the graphite-base/1030 branch from 9fda43a to a21bf88 Compare December 11, 2025 20:33

coderabbitai bot requested changes Dec 11, 2025

View reviewed changes

core/providers/bedrock/bedrock.go Show resolved Hide resolved

core/providers/gemini/gemini.go Show resolved Hide resolved

core/providers/gemini/gemini.go Show resolved Hide resolved

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from e72dca6 to dc29eae Compare December 11, 2025 21:21

TejasGhatte force-pushed the graphite-base/1030 branch from a21bf88 to cd07ffe Compare December 11, 2025 21:21

TejasGhatte changed the base branch from graphite-base/1030 to 12-08-feat_added_support_for_gemini_native_converterswq December 11, 2025 21:22

coderabbitai bot approved these changes Dec 11, 2025

View reviewed changes

coderabbitai bot requested changes Dec 11, 2025

View reviewed changes

core/providers/anthropic/anthropic.go Show resolved Hide resolved

core/providers/bedrock/bedrock.go Show resolved Hide resolved

fix: reroute requests incoming from integrations for full compatibili…

509b38e

…ty in custom providers

TejasGhatte force-pushed the 12-08-feat_added_support_for_gemini_native_converterswq branch from cd07ffe to cc1bd19 Compare December 11, 2025 21:47

TejasGhatte force-pushed the 12-09-fix_reroute_requests_incoming_from_integrations_for_full_compatibility_in_custom_providers branch from dc29eae to 509b38e Compare December 11, 2025 21:47

coderabbitai bot approved these changes Dec 11, 2025

View reviewed changes

TejasGhatte changed the base branch from 12-08-feat_added_support_for_gemini_native_converterswq to graphite-base/1030 December 12, 2025 06:37

fix: reroute requests incoming from integrations for full compatibility in custom providers #1030

Are you sure you want to change the base?

fix: reroute requests incoming from integrations for full compatibility in custom providers #1030

Uh oh!

Conversation

TejasGhatte commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Type of change

Affected areas

How to test

Breaking changes

Related issues

Security considerations

Checklist

Uh oh!

coderabbitai bot commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

TejasGhatte commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Dec 9, 2025

🧪 Test Suite Available

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

TejasGhatte commented Dec 9, 2025 •

edited

Loading

coderabbitai bot commented Dec 9, 2025 •

edited

Loading

TejasGhatte commented Dec 9, 2025 •

edited

Loading