Conversation

@Pratham-Mishra04
Collaborator

Summary

Improved the test retry framework with better logging and added support for cache control in content blocks across multiple providers.

Changes

  • Enhanced test retry framework with more detailed logging:
    • Added attempt count with max attempts (e.g., "attempt 2/10")
    • Added waiting time logs before retries
    • Improved error reporting during test retries
    • Fixed attempt number handling in OnRetry callbacks
  • Added cache control support to content blocks for:
    • Anthropic provider
    • Bedrock provider
    • OpenAI provider (with special handling to exclude from JSON marshaling)
  • Added CacheControl type to schemas with support for ephemeral content and TTL (see the sketch below)
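
For orientation, a minimal sketch of the provider-agnostic shape this change describes; the type, constant, and field names come from the change summary, while the JSON tags and exact layout are assumptions rather than the actual definitions in core/schemas/chatcompletions.go.

package schemas

// CacheControlType enumerates the supported cache control modes.
type CacheControlType string

// CacheControlTypeEphemeral marks content for short-lived provider-side caching.
const CacheControlTypeEphemeral CacheControlType = "ephemeral"

// CacheControl is attached to content blocks and tools; TTL is optional
// (the schema comment documents values like "1m" or "1h").
type CacheControl struct {
	Type CacheControlType `json:"type"`
	TTL  *string          `json:"ttl,omitempty"`
}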

Type of change

  • Bug fix
  • Feature
  • Refactor

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

# Core/Transports
go version
go test ./...
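
To narrow the run to the packages this PR touches, something like the following should also work; the package paths come from the review's file list, and the exact prefixes depend on where the Go module root sits (prepend core/ if running from the repository root).

# Affected packages only
go test ./schemas/... ./internal/testutil/... ./providers/anthropic/... ./providers/bedrock/... ./providers/openai/...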

Breaking changes

  • Yes
  • No

Security considerations

The cache control implementation lets requests mark content as ephemeral so providers treat it as short-lived, reducing the exposure of sensitive content in provider-side caches.

Checklist

  • I added/updated tests where appropriate
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@coderabbitai
Contributor

coderabbitai bot commented Dec 1, 2025

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Cache-control metadata now travels end-to-end across providers, messages, content blocks, tools and streaming paths to enable ephemeral prompt/tool caching.
  • Bug Fixes

    • Improved retry and streaming logs with clearer per-attempt indicators, wait/backoff messages, and continued retries on transient nil responses.
    • System/content cache signals are preserved consistently during conversions and streaming.
  • Chores

    • Test scenarios and CI config updated to enable prompt-caching.
  • Tests

    • Added cross-provider prompt-caching integration tests, helpers and large-context test data.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds provider-agnostic CacheControl/CachePoint propagation across Anthropic, Bedrock, and OpenAI conversions and streaming; centralizes Anthropic system-message conversion; adjusts Bedrock helpers to emit multiple system/content blocks; refines retry-framework logging and nil-channel retry behavior; and adds cross-provider prompt‑caching tests, constants, and scenario gating.

Changes

Cohort / File(s) / Summary

• Retry framework logging & behavior: core/internal/testutil/test_retry_framework.go
  Enhanced per-attempt logging (first vs retry), added "⏳ Waiting" logs before backoffs, adjusted OnRetry attempt numbering, and continue retries when response channel is nil across retry paths.
• Anthropic — types, utils, responses, chat: core/providers/anthropic/types.go, core/providers/anthropic/utils.go, core/providers/anthropic/responses.go, core/providers/anthropic/chat.go
  Added CacheControl to AnthropicContentBlock and AnthropicTool; switched some block Type strings to typed enums; moved system-message handling into central conversion (systemContent param); preserve and propagate CacheControl through conversions and streaming; improve Thinking/signature and tool-use handling.
• Bedrock — types, utils, chat, responses: core/providers/bedrock/types.go, core/providers/bedrock/utils.go, core/providers/bedrock/chat.go, core/providers/bedrock/responses.go
  Added BedrockCachePoint type and CachePoint fields on system/content/tool structs; convert CachePoint ↔ CacheControl across translations; convert helpers to return slices and emit CachePoint marker blocks when CacheControl present; surface cache read/write usage in mappings; updated system-message converter signature to accept slices.
• OpenAI — outbound marshaling filter: core/providers/openai/types.go, core/providers/openai/responses_marshal_test.go
  When marshaling outbound Requests, shallow-copy messages/tools if any contain CacheControl and strip CacheControl from content and tool-output blocks before JSON; minor test formatting fix.
• Schemas — provider-agnostic cache control: core/schemas/chatcompletions.go, core/schemas/responses.go
  Introduced CacheControlType, CacheControl (Type + optional TTL), constant CacheControlTypeEphemeral; added CacheControl fields to ChatContentBlock, ChatTool, ResponsesMessageContentBlock, and ResponsesTool.
• Bedrock ↔ Bifrost responses conversions: core/providers/bedrock/responses.go
  Changed signature for system-message conversion (accept slices); when CachePoint present, add Ephemeral CacheControl markers into downstream content blocks; preserve cache signaling through conversions and into usage metrics.
• Tests & test utilities (prompt-caching): tests/integrations/tests/test_anthropic.py, tests/integrations/tests/test_bedrock.py, tests/integrations/tests/utils/common.py, tests/integrations/config.yml
  Added integration prompt-caching tests (system/messages/tools), constants PROMPT_CACHING_TOOLS and PROMPT_CACHING_LARGE_CONTEXT, per-provider prompt_caching flags in config, and cache-assertion helpers; note duplicated insertions in tests.
• Internal test config & prompt-caching harness: core/internal/testutil/account.go, core/internal/testutil/prompt_caching.go, core/providers/anthropic/anthropic_test.go, core/providers/bedrock/bedrock_test.go
  Added PromptCaching flags and PromptCachingModel fields; test harness now uses ContentBlocks with Ephemeral CacheControl for system prompts and gates prompt-caching tests by scenario capability.
• Misc / utils & signature changes: core/schemas/utils.go, core/providers/bedrock/utils.go, core/providers/anthropic/responses.go
  Added IsNovaModel(model string) bool (duplicate insertion present); changed convertToolConfig signature to accept model and updated callers; adjusted content/block construction and system-content conversion signatures to support CachePoint/CacheControl markers.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Bifrost
    participant Converter
    participant Provider
    Note over Client,Bifrost: Request may include ContentBlocks with CacheControl
    Client->>Bifrost: Submit chat/responses request (ContentBlocks + CacheControl)
    Bifrost->>Converter: Convert to provider format (map CacheControl → CachePoint if needed)
    Converter->>Provider: Send provider request (CachePoint/CacheControl set)
    Provider-->>Converter: Stream/deltas (may include CachePoint/CacheControl)
    Converter->>Bifrost: Map provider cache metadata back (CachePoint → CacheControl)
    Bifrost-->>Client: Return responses with CacheControl annotations and usage metrics

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas needing extra attention:
    • Anthropic conversions/streaming (core/providers/anthropic/*) — system-message refactor, CacheControl/Thinking/tool-use propagation.
    • Bedrock conversions/utils (core/providers/bedrock/*) — CachePoint ↔ CacheControl mapping, multi-block emission, and updated signatures.
    • OpenAI marshaling (core/providers/openai/types.go) — ensure non-mutating copying and correct sanitization of nested tool outputs.
    • Tests and test utilities (tests/integrations/..., core/internal/testutil/...) — duplicated test/helper insertions and scenario gating changes.
    • Duplicate declarations: IsNovaModel appears twice in core/schemas/utils.go and should be deduplicated.

Poem

🐇 I hopped through blocks with CacheControl bright,
I nudged the last block with an ephemeral light,
Logged each retry, waited, then tried once more,
When channels slept I kept the run in sight,
A rabbit cheers: cached tokens take flight! 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title 'feat: add cache control support for claude models' clearly and concisely describes the main feature being added, which aligns with the substantial changes to cache control implementation across providers.
  • Description check: ✅ Passed. The PR description covers all major sections including summary, changes, type of change, affected areas, testing instructions, breaking changes, and security considerations, following the template structure.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 95.92% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta):
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch 11-25-feat_support_for_prompt_caching_parameter_for_claude_models

Comment @coderabbitai help to get the list of available commands and usage tips.

Collaborator Author

Pratham-Mishra04 commented Dec 1, 2025

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from a839bc0 to 058c2c3 Compare December 1, 2025 19:40
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-01-feat_added_reasoning_support_to_chat_completions branch 2 times, most recently from e1c5d21 to b5f6f04 Compare December 2, 2025 14:40
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 058c2c3 to bccd957 Compare December 2, 2025 14:40
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-01-feat_added_reasoning_support_to_chat_completions to graphite-base/979 December 4, 2025 13:13
@akshaydeo akshaydeo marked this pull request as ready for review December 11, 2025 19:23
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
core/providers/anthropic/responses.go (1)

2514-2519: CacheControl missing for simple ContentStr path.

When converting ResponsesMessageTypeMessage with ContentStr (not ContentBlocks), the resulting AnthropicContentBlock doesn't include CacheControl. This is inconsistent with the ContentBlocks path at lines 2520-2528 where blocks go through convertContentBlockToAnthropic which does propagate CacheControl.

Consider whether ContentStr messages should carry cache control metadata. If so, apply it here:

 if msg.Content.ContentStr != nil {
     contentBlocks = append(contentBlocks, AnthropicContentBlock{
-        Type: AnthropicContentBlockTypeText,
-        Text: msg.Content.ContentStr,
+        Type:         AnthropicContentBlockTypeText,
+        Text:         msg.Content.ContentStr,
+        CacheControl: msg.Content.CacheControl, // if available at content level
     })
 }
core/internal/testutil/test_retry_framework.go (2)

2152-2157: Consider consistent attempt-start logging across all stream validation functions.

The attempt-start logging added here improves debugging visibility, but it's only present in WithResponsesStreamValidationRetry. Similar functions like WithSpeechStreamValidationRetry (line 1987+) and WithChatStreamValidationRetry (line 2358+) lack this logging.

For consistency, consider adding similar logging to all stream validation retry functions:

+  // Log attempt start (especially for retries)
+  if attempt > 1 {
+    t.Logf("🔄 Starting [speech|chat] stream retry attempt %d/%d for %s", attempt, config.MaxAttempts, context.ScenarioName)
+  } else {
+    t.Logf("🔄 Starting [speech|chat] stream test attempt %d/%d for %s", attempt, config.MaxAttempts, context.ScenarioName)
+  }

2217-2217: Consider adding waiting-time logs consistently across all retry functions.

The "⏳ Waiting" logs improve visibility into retry timing, but they're only present in WithResponsesStreamValidationRetry. Other retry functions (e.g., WithChatTestRetry, WithStreamRetry, WithSpeechStreamValidationRetry) calculate and sleep on delays without logging the wait time.

For consistent debugging experience, consider adding similar logs before time.Sleep calls in all retry functions:

   delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
+  t.Logf("⏳ Waiting %v before retry...", delay)
   time.Sleep(delay)

Also applies to: 2251-2251
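
A compact, self-contained sketch of the logging shape these two nitpicks ask to generalize. The config struct, helper name, and backoff math here are simplified stand-ins, not the framework's actual types (which use calculateRetryDelay and richer context):

package testutil

import (
	"testing"
	"time"
)

// retryConfig is a simplified stand-in for the framework's retry configuration.
type retryConfig struct {
	MaxAttempts int
	BaseDelay   time.Duration
	MaxDelay    time.Duration
}

// withRetryLogging shows per-attempt start logs plus a wait log before each backoff.
func withRetryLogging(t *testing.T, scenario string, cfg retryConfig, run func() error) {
	for attempt := 1; attempt <= cfg.MaxAttempts; attempt++ {
		if attempt > 1 {
			t.Logf("🔄 Starting retry attempt %d/%d for %s", attempt, cfg.MaxAttempts, scenario)
		} else {
			t.Logf("🔄 Starting test attempt %d/%d for %s", attempt, cfg.MaxAttempts, scenario)
		}
		err := run()
		if err == nil {
			return
		}
		if attempt == cfg.MaxAttempts {
			t.Fatalf("❌ %s failed after %d attempts: %v", scenario, cfg.MaxAttempts, err)
		}
		// Exponential backoff capped at MaxDelay, logged before sleeping.
		delay := cfg.BaseDelay << (attempt - 1)
		if delay > cfg.MaxDelay {
			delay = cfg.MaxDelay
		}
		t.Logf("⏳ Waiting %v before retry...", delay)
		time.Sleep(delay)
	}
}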

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5f6f04 and bccd957.

📒 Files selected for processing (11)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/chat.go (2 hunks)
  • core/providers/anthropic/responses.go (11 hunks)
  • core/providers/anthropic/types.go (1 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/types.go (1 hunks)
  • core/providers/bedrock/utils.go (4 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (1 hunks)
  • core/schemas/responses.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/bedrock/types.go
  • core/schemas/chatcompletions.go
  • core/providers/openai/types.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/utils.go
  • core/providers/anthropic/utils.go
  • core/providers/bedrock/chat.go
  • core/providers/anthropic/chat.go
  • core/internal/testutil/test_retry_framework.go
  • core/schemas/responses.go
  • core/providers/anthropic/responses.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/bedrock/types.go
  • core/schemas/chatcompletions.go
  • core/providers/openai/types.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/utils.go
  • core/providers/anthropic/utils.go
  • core/providers/bedrock/chat.go
  • core/providers/anthropic/chat.go
  • core/internal/testutil/test_retry_framework.go
  • core/schemas/responses.go
  • core/providers/anthropic/responses.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
🧬 Code graph analysis (8)
core/providers/bedrock/types.go (1)
core/schemas/chatcompletions.go (1)
  • CacheControl (608-611)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • CacheControl (608-611)
core/schemas/responses.go (1)
  • ResponsesToolMessage (460-480)
core/providers/anthropic/types.go (1)
core/schemas/chatcompletions.go (1)
  • CacheControl (608-611)
core/providers/bedrock/utils.go (1)
core/schemas/chatcompletions.go (1)
  • CacheControl (608-611)
core/providers/anthropic/utils.go (3)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/anthropic/types.go (2)
  • AnthropicContentBlockTypeImage (136-136)
  • AnthropicImageSource (163-168)
core/schemas/chatcompletions.go (1)
  • CacheControl (608-611)
core/providers/bedrock/chat.go (1)
core/schemas/chatcompletions.go (2)
  • ChatContentBlockTypeText (582-582)
  • CacheControl (608-611)
core/providers/anthropic/chat.go (2)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (135-135)
core/schemas/chatcompletions.go (1)
  • CacheControl (608-611)
core/schemas/responses.go (1)
core/schemas/chatcompletions.go (1)
  • CacheControl (608-611)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (20)
core/providers/bedrock/chat.go (1)

68-72: LGTM! Cache control propagation is correct.

The addition of CacheControl field properly propagates cache metadata from Bedrock content blocks to Bifrost's ChatContentBlock representation. The use of the typed constant schemas.ChatContentBlockTypeText is consistent with the broader refactoring across providers.

core/schemas/chatcompletions.go (1)

597-611: LGTM! Cache control schema definitions are well-structured.

The introduction of CacheControl type with Type and optional TTL fields provides a clean, provider-agnostic way to represent cache control metadata. The optional omitempty tag ensures backward compatibility.

Note: The TTL comment indicates "1m" | "1h" format, but there's no validation. Consider adding validation logic or more detailed documentation if TTL values need to follow a strict format.
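
If validation is added later, a minimal sketch could look like the following. This is a hypothetical helper, not part of the PR, and it assumes TTL is restricted to the two values the comment documents:

package schemas

import "fmt"

// validateCacheControlTTL is a hypothetical guard for the documented "1m" | "1h" values.
func validateCacheControlTTL(ttl *string) error {
	if ttl == nil {
		return nil // TTL is optional
	}
	switch *ttl {
	case "1m", "1h":
		return nil
	default:
		return fmt.Errorf("unsupported cache control TTL %q: expected \"1m\" or \"1h\"", *ttl)
	}
}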

core/schemas/responses.go (1)

406-408: LGTM! Consistent cache control support in Responses API.

The addition of CacheControl field to ResponsesMessageContentBlock maintains consistency with the chat completions schema and properly documents that it's supported by specific providers (Anthropic, Bedrock).

core/providers/anthropic/chat.go (2)

142-146: LGTM! Proper use of typed constants and cache control propagation.

The system message text block construction correctly:

  • Uses the typed constant AnthropicContentBlockTypeText instead of string literal
  • Propagates CacheControl from the source block

This aligns with the provider-agnostic cache control support introduced in this PR.


177-181: LGTM! Consistent cache control propagation in tool results.

The tool result text block construction follows the same pattern as system messages, properly using typed constants and propagating cache control metadata.

core/providers/bedrock/types.go (1)

108-109: LGTM! Consistent cache control field addition.

The CacheControl field addition to BedrockContentBlock is consistent with the same change in Anthropic provider and aligns with the provider-agnostic cache control support.

core/providers/anthropic/types.go (1)

148-159: LGTM! Cache control field properly integrated.

The CacheControl field addition to AnthropicContentBlock is properly typed and follows the same pattern as other providers in this PR.

core/providers/anthropic/utils.go (1)

45-48: LGTM! Image block conversion properly updated.

The conversion function correctly:

  • Uses typed constant AnthropicContentBlockTypeImage instead of string literal
  • Propagates CacheControl from the source block to the Anthropic representation

This is consistent with similar changes for text blocks in other files.

core/providers/bedrock/utils.go (5)

274-277: LGTM: CacheControl propagation for text blocks in tool results.

The CacheControl field is correctly propagated from the input ChatContentBlock to the BedrockContentBlock for text content within tool results.


285-288: LGTM: CacheControl propagation for image blocks in tool results.

The CacheControl field is correctly propagated for image content within tool results, consistent with the text block handling.


334-337: LGTM: CacheControl propagation in general content block conversion.

Both text and image blocks in convertContentBlock correctly propagate CacheControl from the source block to the Bedrock-specific block.

Also applies to: 348-351


260-268: CacheControl not propagated for JSON tool result content.

When tool result content is parsed as JSON (lines 263-268), the resulting BedrockContentBlock with JSON field doesn't include CacheControl. This is inconsistent with the text/image paths. Consider whether cache control semantics should apply to JSON content blocks as well.


311-315: CacheControl missing for simple string content path.

When content.ContentStr is non-nil (line 311-315), the resulting BedrockContentBlock doesn't carry CacheControl. This is inconsistent with the multi-modal content block paths that now propagate it. If the source ChatMessageContent has a CacheControl field at the content level, consider propagating it here as well.

#!/bin/bash
# Check if ChatMessageContent has a CacheControl field that should be propagated
ast-grep --pattern 'type ChatMessageContent struct {
  $$$
}'
core/providers/anthropic/responses.go (7)

219-246: Improved system message handling with CacheControl support.

The refactored system message conversion now properly preserves individual content blocks with their CacheControl metadata instead of concatenating text. This is a good improvement that enables cache control for system prompts.


1461-1468: LGTM: CacheControl propagation for text content blocks.

Text blocks in convertAnthropicMessageToBifrostResponsesMessages correctly propagate CacheControl from the Anthropic content block to the Bifrost responses content block.


1531-1535: LGTM: CacheControl propagation in tool result content blocks.

Both regular tool results and MCP tool results correctly propagate CacheControl for text content blocks, maintaining consistency across tool handling paths.

Also applies to: 1587-1591


2324-2332: LGTM: CacheControl in Anthropic content blocks to Responses conversion.

The text content block conversion correctly includes CacheControl when converting from Anthropic format to Responses format.


2421-2427: LGTM: CacheControl in nested tool result content blocks.

Both regular tool results and MCP tool results correctly propagate CacheControl for text content blocks within nested content structures.

Also applies to: 2482-2488


2700-2727: LGTM: CacheControl propagation in convertContentBlockToAnthropic.

All content block types (text, image, reasoning) correctly propagate CacheControl when converting from Bifrost to Anthropic format.


2749-2757: LGTM: CacheControl in toBifrostResponsesImageBlock helper.

The image block conversion helper correctly includes CacheControl from the source Anthropic block.

@TejasGhatte TejasGhatte mentioned this pull request Dec 11, 2025
16 tasks
@akshaydeo akshaydeo added this to the 2025.12.13 milestone Dec 13, 2025
@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from bccd957 to 5a89f32 Compare December 15, 2025 08:16
@TejasGhatte TejasGhatte changed the base branch from graphite-base/979 to main December 15, 2025 08:16
@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 5a89f32 to 5da39b0 Compare December 15, 2025 19:56
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/anthropic/responses.go (1)

2150-2185: Use consistent content types and preserve CacheControl in grouped text-block handling.

In convertAnthropicContentBlocksToResponsesMessagesGrouped:

  • Line 2176: Input messages (isOutputMessage == false) incorrectly use ResponsesOutputMessageContentTypeText. Should use ResponsesInputMessageContentBlockTypeText to match the non-grouped helper and schema conventions (seen at line 2392 in convertAnthropicContentBlocksToResponsesMessages).
  • Lines 2165–2168: When accumulating output message text blocks, CacheControl is not copied from block.CacheControl, dropping any cache-control semantics. The non-grouped helper preserves it (line 2379).

These inconsistencies affect Bedrock compatibility (which uses the grouped path) and message fidelity. Align the grouped path with the non-grouped helper:

  • Use ResponsesInputMessageContentBlockTypeText for input messages.
  • Include CacheControl: block.CacheControl when accumulating text.
♻️ Duplicate comments (1)
core/providers/openai/types.go (1)

172-195: Avoid mutating input messages when stripping CacheControl before marshaling.

The loop currently clears CacheControl in-place on OpenAIResponsesRequestInputArray’s nested content blocks and tool output blocks. Because msg.Content and msg.ResponsesToolMessage.Output point into the original slice, this mutates the caller’s data, so any later use of the same messages loses cache-control metadata.

Safer approach: build a sanitized deep copy (messages, their ContentBlocks, and ResponsesFunctionToolCallOutputBlocks) and marshal that instead.

Example pattern:

func (r *OpenAIResponsesRequestInput) MarshalJSON() ([]byte, error) {
    if r.OpenAIResponsesRequestInputStr != nil {
        return sonic.Marshal(*r.OpenAIResponsesRequestInputStr)
    }
    if r.OpenAIResponsesRequestInputArray != nil {
-       // Set CacheControl to nil for all content blocks to exclude it from marshaling
-       for _, msg := range r.OpenAIResponsesRequestInputArray {
-           // Remove CacheControl from message content blocks
-           if msg.Content != nil && msg.Content.ContentBlocks != nil {
-               for i := range msg.Content.ContentBlocks {
-                   msg.Content.ContentBlocks[i].CacheControl = nil
-               }
-           }
-           // Remove CacheControl from tool message output blocks
-           if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.Output != nil {
-               if msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil {
-                   for i := range msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks {
-                       msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks[i].CacheControl = nil
-                   }
-               }
-           }
-       }
-       return sonic.Marshal(r.OpenAIResponsesRequestInputArray)
+       // Work on a sanitized deep copy to avoid mutating the caller's slice.
+       msgsCopy := make([]schemas.ResponsesMessage, len(r.OpenAIResponsesRequestInputArray))
+       copy(msgsCopy, r.OpenAIResponsesRequestInputArray)
+
+       for i := range msgsCopy {
+           msg := &msgsCopy[i]
+
+           // Strip CacheControl from message content blocks
+           if msg.Content != nil && msg.Content.ContentBlocks != nil {
+               blocksCopy := make([]schemas.ResponsesMessageContentBlock, len(msg.Content.ContentBlocks))
+               copy(blocksCopy, msg.Content.ContentBlocks)
+               for j := range blocksCopy {
+                   blocksCopy[j].CacheControl = nil
+               }
+               msg.Content.ContentBlocks = blocksCopy
+           }
+
+           // Strip CacheControl from tool output blocks
+           if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.Output != nil &&
+               msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil {
+               outBlocks := msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks
+               outCopy := make([]schemas.ResponsesMessageContentBlock, len(outBlocks))
+               copy(outCopy, outBlocks)
+               for j := range outCopy {
+                   outCopy[j].CacheControl = nil
+               }
+               msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks = outCopy
+           }
+       }
+       return sonic.Marshal(msgsCopy)
    }
    return sonic.Marshal(nil)
}

This keeps CacheControl out of the OpenAI wire format without surprising callers by mutating their request payloads.
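
To see why the copy matters, here is a self-contained illustration of the aliasing pitfall with simplified stand-in types (not the real schemas): the range loop copies the message struct, but the nested slice still shares its backing array with the caller.

package main

import "fmt"

type block struct{ CacheControl *string }

type message struct{ Blocks []block }

// stripInPlace mirrors the problematic pattern: clearing through the shared
// slice mutates the caller's blocks.
func stripInPlace(msgs []message) {
	for _, m := range msgs {
		for i := range m.Blocks {
			m.Blocks[i].CacheControl = nil
		}
	}
}

// stripCopied mirrors the suggested fix: copy the nested slice before clearing.
func stripCopied(msgs []message) []message {
	out := make([]message, len(msgs))
	copy(out, msgs)
	for i := range out {
		blocks := make([]block, len(out[i].Blocks))
		copy(blocks, out[i].Blocks)
		for j := range blocks {
			blocks[j].CacheControl = nil
		}
		out[i].Blocks = blocks
	}
	return out
}

func main() {
	eph := "ephemeral"
	original := []message{{Blocks: []block{{CacheControl: &eph}}}}

	_ = stripCopied(original)
	fmt.Println(original[0].Blocks[0].CacheControl != nil) // true: caller's data intact

	stripInPlace(original)
	fmt.Println(original[0].Blocks[0].CacheControl != nil) // false: caller's data lost
}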

🧹 Nitpick comments (5)
core/providers/bedrock/types.go (1)

155-201: Bedrock cache point typing and wiring look consistent

The added CachePoint fields on BedrockSystemMessage and BedrockContentBlock plus BedrockCachePointType/BedrockCachePoint give a clean, explicit shape for Bedrock cache metadata and align with the rest of the Bedrock type modeling. No functional issues from this change.

You might optionally add a brief comment on BedrockCachePointTypeDefault to document that "default" currently maps to ephemeral cache control in Bifrost, for future readers.
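
Something along these lines would address that note; the type and constant names come from the summary above, while the JSON tag and exact field set are assumptions rather than the actual types.go definitions:

package bedrock

// BedrockCachePointType enumerates Bedrock cache point types.
type BedrockCachePointType string

// BedrockCachePointTypeDefault is the only Bedrock cache point type today;
// Bifrost maps it to and from the ephemeral CacheControl type during conversion.
const BedrockCachePointTypeDefault BedrockCachePointType = "default"

// BedrockCachePoint marks the preceding system, content, or tool entry as a cache checkpoint.
type BedrockCachePoint struct {
	Type BedrockCachePointType `json:"type"`
}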

core/providers/bedrock/responses.go (2)

1245-1290: Tool cache control mapping is correct but relies on positional convention

The bidirectional mapping between Bedrock tool cachePoint entries and Bifrost CacheControl looks internally consistent:

  • Bedrock → Bifrost: a BedrockTool with only CachePoint sets CacheControl on the last bifrostReq.Params.Tools entry.
  • Bifrost → Bedrock: a Bifrost tool with CacheControl causes you to append an extra BedrockTool with just CachePoint.

Two notes:

  • This is intentionally order‑dependent: it assumes a cachePoint-only tool always follows the tool definition it applies to. That’s fine given how you construct these arrays, but it’s worth a short comment near both sites to document the convention.
  • You already guard len(bifrostReq.Params.Tools) > 0, so malformed input with a leading cachePoint is safely ignored.

Functionally this should round‑trip cleanly and supports the prompt‑caching tests you’ve added.

Also applies to: 1534-1587


2376-2484: System message CachePoint ↔ CacheControl conversion is symmetric and safe

  • convertBifrostMessageToBedrockSystemMessages now emits a BedrockSystemMessage{Text: ...} per text block and, when CacheControl is set, a following BedrockSystemMessage{CachePoint: ...}.
  • convertBedrockSystemMessageToBifrostMessages walks []BedrockSystemMessage, and when it encounters a CachePoint entry, it attaches CacheControl{Type: Ephemeral} to the last message’s last content block, which matches how you emitted it in the other direction.

This gives you a clean round‑trip for system‑level cache checkpoints, and the guards (len(bifrostMessages) > 0, non‑nil content/blocks) avoid panics on malformed input.

Consider adding a brief comment that CachePoint entries are expected to be emitted immediately after the system text they apply to, as this ordering is now a contract.
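
A condensed sketch of that emission convention, using stand-in types so it stays self-contained (the real structs carry more fields and the converter handles more block kinds):

package bedrock

type bedrockCachePoint struct {
	Type string `json:"type"`
}

type bedrockSystemMessage struct {
	Text       *string            `json:"text,omitempty"`
	CachePoint *bedrockCachePoint `json:"cachePoint,omitempty"`
}

type systemTextBlock struct {
	Text         string
	CacheControl *struct{ Type string } // non-nil means "cache everything up to this block"
}

// toBedrockSystemMessages emits one text entry per block and, when the block
// carries cache control, a cachePoint entry immediately after it.
func toBedrockSystemMessages(blocks []systemTextBlock) []bedrockSystemMessage {
	out := make([]bedrockSystemMessage, 0, len(blocks))
	for _, b := range blocks {
		text := b.Text
		out = append(out, bedrockSystemMessage{Text: &text})
		if b.CacheControl != nil {
			out = append(out, bedrockSystemMessage{CachePoint: &bedrockCachePoint{Type: "default"}})
		}
	}
	return out
}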

tests/integrations/tests/test_bedrock.py (1)

37-39: Prompt caching test is well-structured; consider small cleanups and lint fixes

The new test_21_prompt_caching does a good job of validating the three cache checkpoint locations (system, messages, tools) end‑to‑end via usage.cacheWriteInputTokens / cacheReadInputTokens. The overall flow and use of PROMPT_CACHING_LARGE_CONTEXT look solid.

A few minor points you may want to address:

  • Unused fixture argument: test_config isn’t used in test_21_prompt_caching. Given Ruff’s ARG002 hint, either:

    • Drop it from the signature here, or
    • Rename to _test_config or explicitly reference it (e.g., assert isinstance(test_config, Config)) if you want to keep the pattern consistent with other tests.
  • Pointless f-strings: Several print calls are marked by Ruff F541 as f-strings without interpolation:

    • Line 1630: print(f"\n--- Test 1: System Message Caching ---")
    • Line 1696: print(f"\n--- Test 2: Messages Caching ---")
    • Line 1754: print(f"\n--- Test 3: Tools Caching ---")

    You can remove the f prefix on those strings to silence the warning.

  • Minor prompt text nit: In the system message setup you currently have:

    • "You are an AI assistant with analyzing legal documents."

    That reads oddly (double space and missing “tasked”). If you care about prompt quality, consider aligning it with the wording used in PROMPT_CACHING_LARGE_CONTEXT (e.g., “You are an AI assistant tasked with analyzing legal documents.”).

If you’re comfortable with the stricter expectations that second requests never write to cache, the assertions around cacheWriteInputTokens == 0 are fine as a contract test for your abstraction.

Also applies to: 49-65, 1620-1824

tests/integrations/tests/test_anthropic.py (1)

1418-1512: Tidy up test_28_prompt_caching for linters and minor style

The overall structure of the new prompt-caching test looks solid and matches the cross-provider patterns used elsewhere in this suite. A couple of small cleanups will keep lint happy:

  • anthropic_client and test_config are unused but required as fixtures; Ruff flags these as ARG002. Easiest fix is to mark the function definition as intentionally ignoring that rule.
  • The final print is an f-string without interpolation, which triggers F541; you can safely drop the f prefix.

Suggested diff:

-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_28_prompt_caching(self, anthropic_client, test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_28_prompt_caching(self, anthropic_client, test_config, provider, model):  # noqa: ARG002
@@
-        print(f"\n✓ Prompt caching validated for provider {provider}")
+        print(f"\n✓ Prompt caching validated for provider {provider}")
         print(f"  Cache created: {cache_created} tokens")
         print(f"  Cache read: {cache_read} tokens")
-        print(f"  Efficiency: Cache hit successfully reduced processing on second request")
+        print("  Efficiency: Cache hit successfully reduced processing on second request")

(Keeping the f-strings on the earlier lines is useful since they interpolate provider, cache_created, and cache_read; only the last one needs to be plain.)

Also, if you ever see this test become flaky across providers, consider relaxing the abs(cache_created - cache_read) < 100 tolerance or additionally asserting that the second request’s total input tokens are lower than the first, but that’s optional and can be adjusted based on real-world behavior.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a89f32 and 5da39b0.

📒 Files selected for processing (15)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/chat.go (6 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (3 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • tests/integrations/tests/test_anthropic.py (3 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/bedrock/utils.go
  • core/providers/anthropic/types.go
  • core/schemas/responses.go
  • core/providers/bedrock/chat.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • tests/integrations/tests/test_bedrock.py
  • core/providers/anthropic/utils.go
  • core/internal/testutil/test_retry_framework.go
  • core/schemas/chatcompletions.go
  • core/providers/anthropic/responses.go
  • tests/integrations/tests/utils/common.py
  • core/providers/anthropic/chat.go
  • core/providers/openai/types.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/types.go
  • tests/integrations/tests/test_anthropic.py
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/anthropic/utils.go
  • core/internal/testutil/test_retry_framework.go
  • core/schemas/chatcompletions.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/chat.go
  • core/providers/openai/types.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/types.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
🧬 Code graph analysis (6)
tests/integrations/tests/test_bedrock.py (1)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
core/providers/anthropic/utils.go (2)
core/providers/anthropic/types.go (2)
  • AnthropicContentBlockTypeImage (205-205)
  • AnthropicImageSource (234-239)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/responses.go (4)
core/schemas/responses.go (5)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesInputMessageContentBlockTypeText (392-392)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
core/providers/anthropic/chat.go (2)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • ChatContentBlockTypeText (584-584)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/schemas/responses.go (1)
  • ResponsesToolMessage (469-489)
tests/integrations/tests/test_anthropic.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/utils/common.py (1)
  • assert_valid_chat_response (606-638)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py

1620-1620: Unused method argument: test_config

(ARG002)


1630-1630: f-string without any placeholders

Remove extraneous f prefix

(F541)


1696-1696: f-string without any placeholders

Remove extraneous f prefix

(F541)


1754-1754: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integrations/tests/test_anthropic.py

1419-1419: Unused method argument: anthropic_client

(ARG002)


1419-1419: Unused method argument: test_config

(ARG002)


1511-1511: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (12)
core/internal/testutil/test_retry_framework.go (2)

818-835: Streaming retry attempt semantics are now explicit and consistent

The updated comments and t.Logf("…attempt %d/10…", attempt+1, …) clearly express that attempt is the current failed attempt and the log shows the upcoming attempt, matching how other helpers pass attempt into OnRetry. No issues here.


2150-2325: Responses stream retry logging/control-flow improvements look correct

The added per-attempt “Starting … attempt” logs, the extra failure logging before retries, and the explicit “⏳ Waiting … before retry…” messages significantly improve observability. Passing the current failed attempt into config.OnRetry in all error / nil-channel / validation-failure paths now aligns with the rest of the retry framework, and switching the nil-channel branch to continue (with retries) instead of returning fixes the earlier short-circuit bug while still failing on the final attempt. The overall retry behavior remains correct and bounded.
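
For readers skimming the diff, the nil-channel control-flow change boils down to something like this simplified shape (not the framework's actual code):

package testutil

import "testing"

// runWithRetries treats a nil response channel as a transient failure: it is
// retried on earlier attempts and only fails the test on the final attempt.
func runWithRetries(t *testing.T, maxAttempts int, open func() <-chan string) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		ch := open()
		if ch == nil {
			if attempt == maxAttempts {
				t.Fatalf("❌ response channel was nil after %d attempts", maxAttempts)
			}
			t.Logf("⚠️ nil response channel on attempt %d/%d, retrying...", attempt, maxAttempts)
			continue // the earlier version returned here, short-circuiting the remaining retries
		}
		for chunk := range ch {
			_ = chunk // validate streamed chunks here
		}
		return
	}
}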

core/providers/anthropic/utils.go (1)

97-105: CacheControl propagation for image blocks looks correct.

Using AnthropicContentBlockTypeImage and threading block.CacheControl into AnthropicContentBlock.CacheControl keeps image handling consistent with the text-path changes and preserves cache semantics end‑to‑end. No issues spotted here.

core/schemas/chatcompletions.go (1)

244-250: Chat-level CacheControl schema additions look consistent and non-breaking.

Adding CacheControl *CacheControl to ChatTool and ChatContentBlock, plus the CacheControlType/CacheControl definitions, cleanly models ephemeral/TTL semantics at the chat schema level. Using pointer fields with omitempty preserves backward compatibility, and the shape matches the Responses-side CacheControl struct, so conversions can stay symmetric.

Also applies to: 592-613

core/providers/anthropic/chat.go (1)

42-72: CacheControl propagation across Anthropic chat/tool paths is wired correctly.

Tool definitions, system text blocks, tool-result text, and user/assistant text content all now copy CacheControl between schemas.Chat* and AnthropicContentBlock/AnthropicTool, and the reverse mappings in ToBifrostChatResponse/ToAnthropicChatResponse preserve it. The only path that drops CacheControl is where a single text block is collapsed to a plain string, which is expected since there’s no per-block metadata in that representation. Looks good.

Also applies to: 139-159, 167-201, 218-250, 318-333, 504-518

core/providers/anthropic/responses.go (2)

1340-1347: System message conversion now correctly handles CacheControl and empty-block cases.

ToBifrostResponsesRequest delegating system handling to ConvertAnthropicMessagesToBifrostMessages, plus the new convertAnthropicSystemToBifrostMessages implementation, fixes the prior empty‑ContentBlocks issue and preserves CacheControl on system text blocks. System content is emitted only when there’s actual text (either ContentStr or non‑empty ContentBlocks), which should be safer for downstream consumers.

Also applies to: 2065-2097


2260-2267: CacheControl is propagated comprehensively across Responses text/image/tool conversions.

The new assignments of CacheControl:

  • On text blocks in tool outputs (toolMsgContentBlocks for function/MCP/computer outputs),
  • On Anthropic→Bifrost text blocks for both input and output messages,
  • In convertContentBlockToAnthropic for text and image blocks (via ConvertToAnthropicImageBlock),
  • In toBifrostResponsesImageBlock, and
  • Between AnthropicTool.CacheControl and schemas.ResponsesTool.CacheControl in both convertAnthropicToolToBifrost and convertBifrostToolToAnthropic

give you a symmetric, end‑to‑end CacheControl story across Responses and Anthropic. No correctness issues stood out in these paths.

Also applies to: 2365-2383, 2488-2507, 2559-2567, 3325-3347, 3376-3383, 2982-3084, 3257-3265

core/providers/bedrock/responses.go (2)

1703-1712: Cached token usage propagation from Bifrost to Bedrock looks correct

Populating CacheReadInputTokens and CacheWriteInputTokens from Usage.InputTokensDetails.CachedTokens and Usage.OutputTokensDetails.CachedTokens in ToBedrockConverseResponse matches the schema and lets the Bedrock integration tests assert cache reads/writes via usage.cacheReadInputTokens/cacheWriteInputTokens. No issues here.
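
The assertion pattern those tests rely on reduces to roughly the following. This is a hypothetical Go-side sketch with a simplified usage struct; the Python integration tests read the equivalent cacheWriteInputTokens/cacheReadInputTokens fields off the wire response.

package testutil

import "testing"

// cacheUsage is a simplified stand-in for the cache-related usage fields.
type cacheUsage struct {
	CacheWriteInputTokens int
	CacheReadInputTokens  int
}

// assertPromptCaching checks that the first request created the cache and the
// second request was served from it.
func assertPromptCaching(t *testing.T, first, second cacheUsage) {
	if first.CacheWriteInputTokens == 0 {
		t.Fatalf("expected the first request to write to the prompt cache, got %+v", first)
	}
	if second.CacheReadInputTokens == 0 {
		t.Fatalf("expected the second request to read from the prompt cache, got %+v", second)
	}
	if second.CacheWriteInputTokens != 0 {
		t.Logf("note: second request also wrote %d tokens to the cache", second.CacheWriteInputTokens)
	}
}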


2524-2686: Content block CachePoint handling is consistent with Bifrost CacheControl

The new logic for content‑block caching appears coherent:

  • convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks:

    • Emits a normal BedrockContentBlock for each supported Bifrost block type.
    • When block.CacheControl != nil, appends a second BedrockContentBlock with only CachePoint set. Unknown block types are skipped entirely, including their cache control, which is acceptable.
  • convertSingleBedrockMessageToBifrostMessages:

    • On encountering a BedrockContentBlock with CachePoint only, it back‑patches CacheControl{Type: Ephemeral} onto the last output message’s last content block, matching the emission pattern above.
    • It safely no‑ops if there are no prior messages or content blocks.

This should make cache control on text (and reasoning) content blocks round‑trip correctly between Bedrock and the Responses schema, and underpins the system/messages prompt‑caching tests.

No functional issues detected.

Also applies to: 2737-2785

tests/integrations/tests/utils/common.py (1)

226-286: Shared large-context prompt constant is appropriate for caching tests

Defining PROMPT_CACHING_LARGE_CONTEXT here and repeating it to guarantee a sizable token footprint is a good approach for exercising prompt caching across providers. No issues from a test or maintenance perspective.

tests/integrations/tests/test_anthropic.py (2)

39-39: Doc list entry for Test 28 is consistent and clear

The new “Prompt caching – cache creation and cache hit validation” entry correctly documents test_28_prompt_caching and keeps the numbered scenario list in sync. No changes needed.


80-83: Shared large-context import is used appropriately

Importing PROMPT_CACHING_LARGE_CONTEXT here and reusing it in test_28_prompt_caching keeps the test focused on caching behavior and avoids duplicating a big literal in this file. This is a good use of the shared test constant.

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 5da39b0 to 7f9b310 Compare December 16, 2025 04:14
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
core/providers/openai/types.go (1)

178-194: Avoid mutating OpenAIResponsesRequestInputArray when stripping CacheControl.

The loop currently clears CacheControl in-place on r.OpenAIResponsesRequestInputArray (and its nested content/output blocks). Because msg.Content and ResponsesFunctionToolCallOutputBlocks share backing data with the caller, this silently mutates the original request structure and can break any downstream reuse (logging, retries, multiple sends).

Build a sanitized copy of the messages (and their relevant nested slices) before marshaling so the caller’s data remains unchanged.

 func (r *OpenAIResponsesRequestInput) MarshalJSON() ([]byte, error) {
 	if r.OpenAIResponsesRequestInputStr != nil {
 		return sonic.Marshal(*r.OpenAIResponsesRequestInputStr)
 	}
 	if r.OpenAIResponsesRequestInputArray != nil {
-		// Set CacheControl to nil for all content blocks to exclude it from marshaling
-		for _, msg := range r.OpenAIResponsesRequestInputArray {
-			// Remove CacheControl from message content blocks
-			if msg.Content != nil && msg.Content.ContentBlocks != nil {
-				for i := range msg.Content.ContentBlocks {
-					msg.Content.ContentBlocks[i].CacheControl = nil
-				}
-			}
-			// Remove CacheControl from tool message output blocks
-			if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.Output != nil {
-				if msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil {
-					for i := range msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks {
-						msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks[i].CacheControl = nil
-					}
-				}
-			}
-		}
-		return sonic.Marshal(r.OpenAIResponsesRequestInputArray)
+		// Work on a copy so we don't mutate the caller's data.
+		msgsCopy := make([]schemas.ResponsesMessage, len(r.OpenAIResponsesRequestInputArray))
+		copy(msgsCopy, r.OpenAIResponsesRequestInputArray)
+
+		for i := range msgsCopy {
+			msg := &msgsCopy[i]
+
+			// Strip CacheControl from message content blocks
+			if msg.Content != nil && msg.Content.ContentBlocks != nil {
+				blocksCopy := make([]schemas.ResponsesMessageContentBlock, len(msg.Content.ContentBlocks))
+				copy(blocksCopy, msg.Content.ContentBlocks)
+				for j := range blocksCopy {
+					blocksCopy[j].CacheControl = nil
+				}
+				msg.Content.ContentBlocks = blocksCopy
+			}
+
+			// Strip CacheControl from tool message output blocks
+			if msg.ResponsesToolMessage != nil &&
+				msg.ResponsesToolMessage.Output != nil &&
+				msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil {
+
+				outputBlocksCopy := make(
+					[]schemas.ResponsesMessageContentBlock,
+					len(msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks),
+				)
+				copy(outputBlocksCopy, msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks)
+				for j := range outputBlocksCopy {
+					outputBlocksCopy[j].CacheControl = nil
+				}
+				msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks = outputBlocksCopy
+			}
+		}
+		return sonic.Marshal(msgsCopy)
 	}
 	return sonic.Marshal(nil)
 }
🧹 Nitpick comments (5)
tests/integrations/tests/utils/common.py (1)

88-281: Prompt‑caching test fixtures look solid; duplication is acceptable here.

The PROMPT_CACHING_TOOLS definitions and PROMPT_CACHING_LARGE_CONTEXT blob provide realistic, rich fixtures for cache‑behavior tests and are structurally consistent with existing tool/message formats. If you start evolving tool schemas heavily, consider reusing WEATHER_TOOL/CALCULATOR_TOOL to avoid drift, but it's not required for this PR.

Also applies to: 421-481

tests/integrations/tests/test_anthropic.py (1)

1421-1485: Consider relaxing cache token comparison assertion.

Line 1481 asserts that cache read tokens should be within 100 tokens of cache write tokens:

assert abs(cache_write_tokens - cache_read_tokens) < 100

This assertion may be overly strict. Cache read and write token counts can legitimately differ due to:

  • Provider-specific tokenization variations
  • Cache granularity differences
  • Metadata overhead

Consider either:

  1. Removing this assertion entirely (the individual validations in the helper functions are sufficient)
  2. Increasing the tolerance threshold
  3. Making it a warning rather than a hard failure
-        assert abs(cache_write_tokens - cache_read_tokens) < 100, \
-            f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})"
+        # Log token difference for debugging but don't enforce strict equality
+        print(f"Cache token difference: {abs(cache_write_tokens - cache_read_tokens)} tokens")

Note: The unused parameter warnings from static analysis are false positives - these are standard pytest fixture patterns.

core/providers/anthropic/responses.go (2)

2065-2097: System content → Bifrost messages helper is consistent, with a minor type nit

The new convertAnthropicSystemToBifrostMessages correctly:

  • Emits a single system message for ContentStr.
  • Emits one system message with text ContentBlocks, propagating CacheControl from Anthropic blocks.

One minor consistency nit: these system messages are inputs, but system ContentBlocks use ResponsesOutputMessageContentTypeText. Elsewhere (e.g., convertAnthropicContentBlocksToResponsesMessages) non‑output messages use ResponsesInputMessageContentBlockTypeText. If no downstream code depends on the “output_text” type here, consider switching to the input variant for clarity and consistency.

-					Type:         schemas.ResponsesOutputMessageContentTypeText,
+					Type:         schemas.ResponsesInputMessageContentBlockTypeText,

2174-2180: CacheControl propagation across Anthropic↔Bifrost looks correct; a couple of small consistency gaps

The new CacheControl wiring is generally solid:

  • Input messages (convertAnthropicContentBlocksToResponsesMessagesGrouped non‑output branch, and convertAnthropicContentBlocksToResponsesMessages input branch) now emit ContentBlocks with CacheControl on text blocks.
  • Tool result conversions (AnthropicContentBlockTypeToolResult paths in both grouped/ungrouped converters) attach CacheControl on inner text blocks.
  • Tool definitions (convertAnthropicToolToBifrost / convertBifrostToolToAnthropic) mirror CacheControl both ways.
  • convertContentBlockToAnthropic mirrors CacheControl for text and image blocks, and toBifrostResponsesImageBlock preserves it back into Bifrost.

Two minor improvements for completeness:

  1. Grouped output text path drops CacheControl
    In convertAnthropicContentBlocksToResponsesMessagesGrouped, the isOutputMessage case for AnthropicContentBlockTypeText still ignores block.CacheControl:

    accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{
        Type: schemas.ResponsesOutputMessageContentTypeText,
        Text: block.Text,
    })

    Even if that path isn’t currently used for outputs, it’s cheap and safer to keep it consistent:

    -	accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{
    -		Type: schemas.ResponsesOutputMessageContentTypeText,
    -		Text: block.Text,
    -	})
    +	accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{
    +		Type:         schemas.ResponsesOutputMessageContentTypeText,
    +		Text:         block.Text,
    +		CacheControl: block.CacheControl,
    +	})
  2. Stale comment on input text path
    In convertAnthropicContentBlocksToResponsesMessages, the comment still says “For input messages, use ContentStr” but the implementation now (correctly) uses ContentBlocks with ResponsesInputMessageContentBlockTypeText. Consider updating the comment to avoid confusion.

Overall behavior looks correct; these are just small consistency cleanups.

Also applies to: 2262-2266, 2377-2380, 2390-2395, 2495-2498, 2560-2563, 3082-3084, 3262-3264, 3330-3333, 3343-3344, 3382-3383

core/providers/bedrock/responses.go (1)

2378-2385: System/content CachePoint handling is consistent both directions; corner-case worth noting

Across the helpers:

  • convertBifrostMessageToBedrockSystemMessages emits [Text, CachePoint] pairs when a system text block has CacheControl.
  • convertBedrockSystemMessageToBifrostMessages interprets BedrockSystemMessage.CachePoint by marking the last content block of the last emitted system message as CacheControl{Type: Ephemeral}.
  • convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks emits [block, CachePoint] for any content block with CacheControl.
  • convertSingleBedrockMessageToBifrostMessages interprets BedrockContentBlock.CachePoint by tagging the last content block of the last emitted Bifrost message with Ephemeral CacheControl.

This gives a symmetric representation for both system and regular content blocks and correctly localizes the cache marker to the preceding block.

One behavior to be aware of: if Bedrock ever sends a CachePoint as the first element (before any text/content), we’ll silently ignore it because there’s no prior Bifrost message/block to tag. If that’s not allowed by the upstream API, no change needed; if it is allowed, we might eventually want to accumulate such “orphan” CachePoints and apply them when the first content block is created.

Also applies to: 2399-2420, 2450-2484, 2657-2666, 2747-2781
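
For readers who want the symmetry spelled out, below is a minimal, self-contained sketch of the pairing described above, including the "orphan cachePoint" corner case. The types are simplified stand-ins, not the real Bedrock or Bifrost structs.

package main

import "fmt"

// Simplified stand-ins for the Bifrost and Bedrock block shapes discussed
// above; the real types live in core/providers/bedrock and core/schemas.
type CacheControl struct{ Type string }

type BifrostBlock struct {
	Text         string
	CacheControl *CacheControl
}

type BedrockBlock struct {
	Text       *string
	CachePoint *struct{ Type string }
}

// Forward: a block with CacheControl becomes a [text, cachePoint] pair.
func toBedrock(blocks []BifrostBlock) []BedrockBlock {
	var out []BedrockBlock
	for _, b := range blocks {
		text := b.Text
		out = append(out, BedrockBlock{Text: &text})
		if b.CacheControl != nil {
			out = append(out, BedrockBlock{CachePoint: &struct{ Type string }{Type: "default"}})
		}
	}
	return out
}

// Reverse: a cachePoint marks the previous block as ephemeral; a leading
// "orphan" cachePoint has nothing to attach to and is silently dropped,
// which is the corner case called out in the review.
func fromBedrock(blocks []BedrockBlock) []BifrostBlock {
	var out []BifrostBlock
	for _, b := range blocks {
		switch {
		case b.Text != nil:
			out = append(out, BifrostBlock{Text: *b.Text})
		case b.CachePoint != nil && len(out) > 0:
			out[len(out)-1].CacheControl = &CacheControl{Type: "ephemeral"}
		}
	}
	return out
}

func main() {
	in := []BifrostBlock{
		{Text: "system preamble", CacheControl: &CacheControl{Type: "ephemeral"}},
		{Text: "uncached tail"},
	}
	fmt.Printf("%+v\n", fromBedrock(toBedrock(in))) // cache marker survives on the first block only
}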

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5da39b0 and 7f9b310.

📒 Files selected for processing (15)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/chat.go (6 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (3 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/chat.go
  • core/providers/anthropic/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/anthropic/chat.go
  • core/schemas/responses.go
  • core/providers/anthropic/types.go
  • core/schemas/chatcompletions.go
  • core/providers/openai/types.go
  • core/providers/bedrock/types.go
  • tests/integrations/tests/test_bedrock.py
  • core/providers/anthropic/responses.go
  • tests/integrations/tests/test_anthropic.py
  • core/internal/testutil/test_retry_framework.go
  • tests/integrations/tests/utils/common.py
  • core/providers/bedrock/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/anthropic/chat.go
  • core/schemas/responses.go
  • core/providers/anthropic/types.go
  • core/schemas/chatcompletions.go
  • core/providers/openai/types.go
  • core/providers/bedrock/types.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/bedrock/responses.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
🧬 Code graph analysis (6)
core/providers/anthropic/chat.go (3)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • ChatContentBlockTypeText (584-584)
core/providers/gemini/types.go (1)
  • Type (782-782)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
core/schemas/responses.go (1)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/schemas/responses.go (1)
  • ResponsesToolMessage (469-489)
core/providers/anthropic/responses.go (3)
core/schemas/responses.go (6)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesOutputMessageContentTypeText (396-396)
  • ResponsesInputMessageContentBlockTypeText (392-392)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
tests/integrations/tests/test_anthropic.py (1)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py

1624-1624: Unused method argument: test_config

(ARG002)


1674-1674: Unused method argument: test_config

(ARG002)


1725-1725: Unused method argument: test_config

(ARG002)

tests/integrations/tests/test_anthropic.py

1422-1422: Unused method argument: anthropic_client

(ARG002)


1422-1422: Unused method argument: test_config

(ARG002)


1488-1488: Unused method argument: anthropic_client

(ARG002)


1488-1488: Unused method argument: test_config

(ARG002)


1562-1562: Unused method argument: anthropic_client

(ARG002)


1562-1562: Unused method argument: test_config

(ARG002)

🔇 Additional comments (23)
core/internal/testutil/test_retry_framework.go (3)

832-834: LGTM! Attempt parameter semantics now consistent.

The comment correctly documents that attempt represents the current failed attempt number, and the log displays attempt+1 to show the next attempt. This resolves the inconsistency flagged in the previous review where different retry functions had different semantics.


2153-2158: Excellent logging improvements for observability.

The added logs provide clear visibility into:

  • Attempt start/progression (lines 2153-2158)
  • Retry errors with context (lines 2205-2209)
  • Waiting time before retries (line 2218)

These enhancements will make debugging test failures much easier.

Also applies to: 2205-2209, 2218-2218


2211-2212: LGTM! Attempt number semantics now consistent across all retry functions.

The updated comments (lines 2211, 2246, 2317) confirm that config.OnRetry now receives the current failed attempt number, making WithResponsesStreamValidationRetry consistent with other retry helpers like WithStreamRetry and WithChatTestRetry.

The enhanced nil channel handling (lines 2244-2254) properly logs and retries, and the CRITICAL comment at line 2254 prevents potential regression where the function might return early instead of retrying.

Also applies to: 2244-2254, 2317-2318
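
As a rough illustration of the logging pattern these comments describe (not the framework's actual code), a toy retry loop might look like the following; the messages, names, and callback shape are invented for the sketch.

package main

import (
	"errors"
	"fmt"
	"time"
)

// withRetry illustrates the logging pattern: a first-attempt vs. retry log
// line, error context for failed retries, and an explicit wait log before
// backing off.
func withRetry(maxAttempts int, backoff time.Duration, op func() error, onRetry func(attempt int, err error)) error {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if attempt == 1 {
			fmt.Printf("starting attempt %d/%d\n", attempt, maxAttempts)
		} else {
			fmt.Printf("retrying: attempt %d/%d\n", attempt, maxAttempts)
		}
		if lastErr = op(); lastErr == nil {
			return nil
		}
		if attempt < maxAttempts {
			// OnRetry receives the attempt that just failed, matching the
			// "current failed attempt number" semantics noted above.
			onRetry(attempt, lastErr)
			fmt.Printf("⏳ Waiting %s before attempt %d/%d\n", backoff, attempt+1, maxAttempts)
			time.Sleep(backoff)
		}
	}
	return lastErr
}

func main() {
	calls := 0
	err := withRetry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errors.New("transient failure")
		}
		return nil
	}, func(attempt int, err error) {
		fmt.Printf("attempt %d failed: %v\n", attempt, err)
	})
	fmt.Println("final error:", err)
}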

core/schemas/responses.go (1)

416-418: CacheControl propagation on responses content blocks and tools looks consistent.

Adding CacheControl *CacheControl to ResponsesMessageContentBlock and ResponsesTool with omitempty is a clean, backward‑compatible way to carry provider cache metadata through the unified responses schema. No marshal/unmarshal conflicts in this file.

Also applies to: 1042-1044

core/schemas/chatcompletions.go (1)

246-249: Chat CacheControl schema extension is well‑scoped and backwards‑compatible.

Introducing CacheControlType/CacheControl and wiring CacheControl *CacheControl into ChatTool and ChatContentBlock cleanly models ephemeral/TTL behavior while keeping existing payloads valid (omitempty, pointer fields, no custom marshalling to update).

Also applies to: 592-613
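
For context, the sketch below shows how pointer-plus-omitempty fields keep existing payloads unchanged when the new field is unset. The struct layouts here are simplified stand-ins (including the cache_control JSON tag and the TTL field), not the actual definitions in core/schemas/chatcompletions.go.

package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the schema additions discussed above.
type CacheControlType string

const CacheControlTypeEphemeral CacheControlType = "ephemeral"

type CacheControl struct {
	Type CacheControlType `json:"type"`
	TTL  *string          `json:"ttl,omitempty"` // assumption: TTL is optional
}

type ChatContentBlock struct {
	Type         string        `json:"type"`
	Text         *string       `json:"text,omitempty"`
	CacheControl *CacheControl `json:"cache_control,omitempty"` // omitted when nil
}

func main() {
	text := "large, reusable context"
	withCache := ChatContentBlock{
		Type:         "text",
		Text:         &text,
		CacheControl: &CacheControl{Type: CacheControlTypeEphemeral},
	}
	withoutCache := ChatContentBlock{Type: "text", Text: &text}

	a, _ := json.Marshal(withCache)
	b, _ := json.Marshal(withoutCache)
	fmt.Println(string(a)) // includes "cache_control":{"type":"ephemeral"}
	fmt.Println(string(b)) // no cache_control key, so existing payloads are unchanged
}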

core/providers/anthropic/types.go (1)

218-231: Anthropic CacheControl wiring matches shared schema and is safe.

Adding CacheControl *schemas.CacheControl to AnthropicContentBlock and AnthropicTool (with omitempty) correctly exposes provider cache metadata without affecting existing flows or custom (un)marshalling behavior in this file.

Also applies to: 292-297

core/providers/anthropic/chat.go (6)

65-67: LGTM: Tool cache control propagation is correct.

The implementation properly propagates CacheControl from Bifrost tools to Anthropic tools with appropriate nil checking.


149-152: LGTM: System message cache control is properly implemented.

The system content blocks correctly include Type, Text, and CacheControl fields, maintaining consistency with the schema.


184-187: LGTM: Tool result cache control propagation is consistent.

The implementation correctly propagates CacheControl for text blocks within tool results, following the same pattern as other content blocks.


241-244: LGTM: Message content cache control is properly handled.

The propagation of CacheControl for user and assistant message content blocks is correctly implemented and consistent with the overall pattern.


327-330: LGTM: Bidirectional cache control mapping is correctly implemented.

The conversion from Anthropic to Bifrost properly preserves CacheControl metadata, ensuring cache information is available to downstream consumers.


513-516: LGTM: Reverse conversion preserves cache control.

The Bifrost-to-Anthropic conversion correctly includes CacheControl, completing the bidirectional mapping required for proper cache metadata flow.

tests/integrations/tests/test_bedrock.py (3)

1673-1722: LGTM: Messages caching test is correctly structured.

The test properly validates prompt caching at the message level, following a consistent pattern of cache write validation followed by cache read validation.


1724-1785: LGTM: Tools caching test is properly implemented.

The test correctly validates prompt caching with tool definitions, converting tools to Bedrock format and adding a cache checkpoint after all tool definitions.


1787-1813: LGTM: Cache validation helpers are well-designed.

The validate_cache_write and validate_cache_read functions provide clear, reusable validation logic with helpful debug output for troubleshooting cache behavior.

tests/integrations/tests/test_anthropic.py (3)

1487-1559: LGTM: Messages caching test is correctly implemented.

The test properly validates prompt caching at the message level using Anthropic's cache_control format with ephemeral type.


1561-1610: LGTM: Tools caching test uses correct Anthropic format.

The test properly implements tools caching by adding cache_control directly to the last tool definition, which is the correct approach for the Anthropic API (vs. Bedrock's separate cachePoint).


1665-1695: LGTM: Cache validation helpers correctly use Anthropic SDK format.

The helpers properly use snake_case attribute names (cache_creation_input_tokens, cache_read_input_tokens) consistent with the Anthropic SDK, and safely access attributes using getattr with defaults.

core/providers/anthropic/responses.go (1)

1340-1346: System + grouped-tool routing into ConvertAnthropicMessagesToBifrostMessages looks correct

Passing request.System into ConvertAnthropicMessagesToBifrostMessages and toggling keepToolsGrouped based on the provider centralizes system handling and Bedrock-specific grouping without changing external signatures; this wiring looks sound and matches the rest of the file’s conversion patterns.

core/providers/bedrock/responses.go (2)

1244-1290: Tool-level CachePoint ↔ CacheControl mapping is symmetric and scoped correctly

The new logic:

  • In ToBifrostResponsesRequest, when a Bedrock tool entry has CachePoint and no ToolSpec, you tag the previous Bifrost tool with CacheControl{Type: Ephemeral}.
  • In ToBedrockResponsesRequest, when a Bifrost tool has non‑nil CacheControl, you append a synthetic BedrockTool with CachePoint{Type: default} immediately after its ToolSpec.

This cleanly round‑trips tool cache metadata while preserving existing ToolSpec behavior. The “attach to last tool” semantics match how the Bedrock tools array is constructed here, so there’s no obvious ordering bug.

Also applies to: 1535-1583
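
A compact sketch of the reverse direction described here, using simplified stand-in types rather than the real BedrockTool/ResponsesTool definitions:

package main

import "fmt"

type CacheControl struct{ Type string }

type BedrockTool struct {
	ToolSpecName *string // nil for a bare cachePoint entry
	CachePoint   *struct{ Type string }
}

type BifrostTool struct {
	Name         string
	CacheControl *CacheControl
}

// fromBedrockTools mirrors the described behavior: a tool entry that carries
// only a cachePoint (no toolSpec) marks the previously emitted tool as
// ephemeral instead of producing a tool of its own.
func fromBedrockTools(tools []BedrockTool) []BifrostTool {
	var out []BifrostTool
	for _, t := range tools {
		switch {
		case t.ToolSpecName != nil:
			out = append(out, BifrostTool{Name: *t.ToolSpecName})
		case t.CachePoint != nil && len(out) > 0:
			out[len(out)-1].CacheControl = &CacheControl{Type: "ephemeral"}
		}
	}
	return out
}

func main() {
	weather := "get_weather"
	in := []BedrockTool{
		{ToolSpecName: &weather},
		{CachePoint: &struct{ Type string }{Type: "default"}},
	}
	fmt.Printf("%+v\n", fromBedrockTools(in)) // get_weather carries ephemeral cache control
}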


1701-1713: Usage cache token fields are now round-tripped correctly

Mapping:

  • bifrostResp.Usage.InputTokensDetails.CachedTokens → BedrockTokenUsage.CacheReadInputTokens
  • bifrostResp.Usage.OutputTokensDetails.CachedTokens → BedrockTokenUsage.CacheWriteInputTokens

matches the earlier Bedrock→Bifrost mapping in BedrockConverseResponse.ToBifrostResponsesResponse, so cached token counts now flow both directions without loss. No issues here.
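
Put concretely (with simplified, flattened stand-in field names; the real structs nest these under InputTokensDetails/OutputTokensDetails and the Bedrock usage type), the round trip amounts to:

package main

import "fmt"

type BedrockTokenUsage struct {
	CacheReadInputTokens  int
	CacheWriteInputTokens int
}

type BifrostUsage struct {
	InputCachedTokens  int // stands in for InputTokensDetails.CachedTokens
	OutputCachedTokens int // stands in for OutputTokensDetails.CachedTokens
}

// Cached-read tokens ride on the input-side detail, cache-write tokens on the
// output-side detail, in both directions.
func toBifrost(u BedrockTokenUsage) BifrostUsage {
	return BifrostUsage{InputCachedTokens: u.CacheReadInputTokens, OutputCachedTokens: u.CacheWriteInputTokens}
}

func toBedrock(u BifrostUsage) BedrockTokenUsage {
	return BedrockTokenUsage{CacheReadInputTokens: u.InputCachedTokens, CacheWriteInputTokens: u.OutputCachedTokens}
}

func main() {
	orig := BedrockTokenUsage{CacheReadInputTokens: 1200, CacheWriteInputTokens: 1180}
	fmt.Printf("%+v\n", toBedrock(toBifrost(orig))) // counts survive the round trip
}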

core/providers/bedrock/types.go (2)

155-160: Bedrock CachePoint types/fields align with response conversion logic

Adding:

  • CachePoint *BedrockCachePoint to BedrockSystemMessage, BedrockContentBlock, and BedrockTool.
  • BedrockCachePointType + BedrockCachePointTypeDefault and BedrockCachePoint struct.

is consistent with how responses.go now converts CachePoint↔CacheControl at the system, content, and tool levels. These are optional JSON fields, so they’re wire‑compatible, and their usage is localized to Bedrock conversions.

No issues from a type/struct perspective.

Also applies to: 163-190, 192-201, 284-287


640-646: ETag field definition is correct for these new types — no action required

BedrockFileRetrieveRequest, BedrockFileDeleteRequest, and BedrockFileContentRequest are new struct types in this commit with ETag string json:"etag". These types are properly handled in the codebase:

  • ETag is explicitly populated at all call sites (bedrock.go lines 903, 910, 917)
  • Tests verify ETag values are present
  • These request types are never marshaled back to external clients (they're internal request ↔ conversion layer)

The JSON struct tag without omitempty does not impose a requirement on clients to send the field during unmarshaling — it only affects whether the field appears in marshaled output. Since the code consistently populates ETag internally before any serialization, the current definition is appropriate.

Likely an incorrect or invalid review comment.

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 7f9b310 to 6fe2adf on December 16, 2025 at 05:30
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
core/providers/openai/types.go (1)

178-194: Data mutation issue persists from previous review.

The code still modifies the original input data structure by setting CacheControl = nil on content blocks and tool output blocks. This issue was previously flagged with a comprehensive fix suggestion. The newly added tool output block handling (lines 186-193) follows the same problematic mutation pattern.

Please refer to the earlier review comment for the detailed solution approach, which includes creating deep copies before modification to avoid mutating shared data structures.
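
One possible shape for the suggested fix is sketched below: a value-receiver MarshalJSON that copies the block slice and clears the field only on the copies, so the caller's message is untouched. The types, JSON tags, and method here are simplified stand-ins for illustration, not the actual code in core/providers/openai/types.go.

package main

import (
	"encoding/json"
	"fmt"
)

type CacheControl struct {
	Type string `json:"type"`
}

type ContentBlock struct {
	Type         string        `json:"type"`
	Text         string        `json:"text"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

type Message struct {
	Role   string         `json:"role"`
	Blocks []ContentBlock `json:"content"`
}

// MarshalJSON strips CacheControl from the serialized form without touching
// the caller's data: the blocks slice is copied, and only the copies are
// modified.
func (m Message) MarshalJSON() ([]byte, error) {
	type alias Message // avoids recursing into this MarshalJSON
	copied := make([]ContentBlock, len(m.Blocks))
	copy(copied, m.Blocks)
	for i := range copied {
		copied[i].CacheControl = nil // cleared on the copy only
	}
	out := alias(m)
	out.Blocks = copied
	return json.Marshal(out)
}

func main() {
	msg := Message{Role: "user", Blocks: []ContentBlock{{
		Type:         "text",
		Text:         "hello",
		CacheControl: &CacheControl{Type: "ephemeral"},
	}}}
	b, _ := json.Marshal(msg)
	fmt.Println(string(b))                         // no cache_control in the wire format
	fmt.Println(msg.Blocks[0].CacheControl != nil) // true: original left intact
}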

tests/integrations/tests/test_anthropic.py (2)

1487-1559: Message‑level caching test mirrors system test; same unused‑arg issue

test_29_prompt_caching_messages correctly:

  • places the cached span in the middle of the user content with cache_control,
  • asserts cache creation on the first call and read on the second,
  • reuses the same large context to probe actual cache hits.

Same as test 28, anthropic_client and test_config are unused and will keep tripping ARG002; please apply the same fix (drop, underscore, or # noqa: ARG002) consistently across these prompt‑caching tests.


1561-1610: Tool‑level caching test matches provider semantics; address unused fixtures

test_30_prompt_caching_tools cleanly:

  • converts PROMPT_CACHING_TOOLS to Anthropic’s tool format,
  • adds cache_control only on the last tool, and
  • verifies that a second call with the same tool set reads from cache.

Again, anthropic_client and test_config are unused here; consider the same ARG002 mitigation as for tests 28–29 so Ruff stays quiet.

🧹 Nitpick comments (6)
tests/integrations/tests/utils/common.py (1)

88-281: Prompt caching test data looks solid; consider minor deduplication

The new PROMPT_CACHING_TOOLS set and PROMPT_CACHING_LARGE_CONTEXT give you a good variety and sufficient token volume for cache behavior tests. There is some duplication with existing tool constants (e.g., weather/calculate) and the huge inline legal text blob, but it’s all test‑only and harmless. If this needs maintenance later, you could:

  • Reuse WEATHER_TOOL / CALCULATOR_TOOL when building PROMPT_CACHING_TOOLS, and
  • Optionally move the large context into a shorter base paragraph plus a small loop to repeat/compose it, or into a separate fixture module.

Not required for correctness, just a readability/DRY tweak.

Also applies to: 421-481

tests/integrations/tests/test_bedrock.py (1)

1623-1785: Prompt caching tests cover system/messages/tools checkpoints appropriately

The three new tests exercise Bedrock prompt caching semantics across system, message content, and tool definitions, and the validate_cache_write / validate_cache_read helpers assert against the correct Bedrock usage fields (cacheWriteInputTokens / cacheReadInputTokens). The parametrization over prompt_caching providers ties cleanly back to the config.

One small cleanup opportunity: in test_23_prompt_caching_tools, you manually rebuild the Bedrock tool specs from PROMPT_CACHING_TOOLS. You could reuse convert_to_bedrock_tools to avoid duplicating the "toolSpec" wiring, e.g.:

tools_config = convert_to_bedrock_tools(PROMPT_CACHING_TOOLS)
bedrock_tools = tools_config["tools"]
bedrock_tools.append({"cachePoint": {"type": "default"}})
tool_config = {"tools": bedrock_tools}

Purely optional; current implementation is correct as written.

core/providers/bedrock/utils.go (1)

598-667: Tool‑level CachePoint injection: semantics and potential dedup

Appending an extra BedrockTool that only carries CachePoint when tool.CacheControl != nil && !schemas.IsNovaModel(model) is a reasonable way to surface per‑tool caching to Bedrock without touching ToolSpec, and the Nova guard matches the intent to avoid unsupported cache hints.

Two follow‑ups to consider:

  • If multiple tools have CacheControl, you’ll send several identical { "cachePoint": { "type": "default" } } entries. That’s likely harmless but could be deduped for cleaner requests.
  • Cache metadata from tool.CacheControl (e.g. TTL) isn’t represented here; if Bedrock ever surfaces more granular cache config, this may need extending.

Functionally this looks fine; please just double‑check the Bedrock cachePoint contract for Nova/non‑Nova models and whether a single cachePoint entry would suffice for multiple cached tools.
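
To illustrate the gating being discussed, here is a self-contained sketch with simplified stand-in types; isNovaModel is an assumption about how schemas.IsNovaModel behaves (a substring check on the model ID), and the real convertToolConfig is considerably more involved.

package main

import (
	"fmt"
	"strings"
)

type CacheControl struct{ Type string }

type Tool struct {
	Name         string
	CacheControl *CacheControl
}

type BedrockTool struct {
	ToolSpecName *string
	CachePoint   *struct{ Type string }
}

func isNovaModel(model string) bool { return strings.Contains(model, "nova") } // assumption

// toBedrockTools appends a bare cachePoint entry right after any tool that
// carries CacheControl, except for Nova models, which reject tool caching.
func toBedrockTools(model string, tools []Tool) []BedrockTool {
	var out []BedrockTool
	for _, t := range tools {
		name := t.Name
		out = append(out, BedrockTool{ToolSpecName: &name})
		if t.CacheControl != nil && !isNovaModel(model) {
			out = append(out, BedrockTool{CachePoint: &struct{ Type string }{Type: "default"}})
		}
	}
	return out
}

func main() {
	tools := []Tool{{Name: "get_weather", CacheControl: &CacheControl{Type: "ephemeral"}}}
	fmt.Printf("claude: %d entries\n", len(toBedrockTools("anthropic.claude-3-5-sonnet", tools))) // 2
	fmt.Printf("nova:   %d entries\n", len(toBedrockTools("amazon.nova-pro", tools)))             // 1
}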

tests/integrations/tests/test_anthropic.py (1)

1421-1485: System prompt‑caching test is sound; drop unused fixtures to satisfy Ruff

The flow for test_28_prompt_caching_system correctly:

  • uses a large cached system block with cache_control: {"type": "ephemeral"},
  • verifies first call writes cache and second call reads from it via usage.cache_creation_input_tokens / usage.cache_read_input_tokens,
  • checks read/write token counts are close.

Static analysis is right that anthropic_client and test_config aren’t used here; you can either:

  • remove them from the signature, or
  • rename to _anthropic_client / _test_config, or
  • add a # noqa: ARG002 comment if you want to keep the fixture signature uniform.
core/providers/anthropic/responses.go (1)

2163-2184: Grouped converter: CacheControl preserved, but block type for input text is slightly inconsistent

For non‑output messages in convertAnthropicContentBlocksToResponsesMessagesGrouped, you now emit a single ResponsesMessage whose ContentBlocks[0] carries:

  • Type: ResponsesOutputMessageContentTypeText
  • Text: block.Text
  • CacheControl: block.CacheControl

Functionally this works (downstream converters treat input/output text types similarly), and it correctly propagates CacheControl. For consistency with the non‑grouped path (which uses ResponsesInputMessageContentBlockTypeText for inputs), consider switching to the input text type here as well.

core/providers/bedrock/types.go (1)

640-678: ETag requirement on file requests: clarify caller expectations

Adding mandatory etag fields to:

  • BedrockFileRetrieveRequest,
  • BedrockFileDeleteRequest,
  • BedrockFileContentRequest

is a good way to support conditional operations and stronger consistency against S3. Since the JSON tag omits omitempty, these will always be serialized; ensure all call sites either:

  • always populate ETag (preferred for concurrency), or
  • treat an empty string as “no constraint” if that’s acceptable.

Might be worth documenting this requirement alongside the higher‑level file API helpers in this stack.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f9b310 and 6fe2adf.

📒 Files selected for processing (17)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/chat.go (6 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (6 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/anthropic/utils.go
  • core/schemas/responses.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/types.go
  • tests/integrations/config.yml
  • tests/integrations/tests/utils/common.py
  • core/providers/bedrock/chat.go
  • tests/integrations/tests/test_bedrock.py
  • core/providers/bedrock/utils.go
  • core/schemas/chatcompletions.go
  • tests/integrations/tests/test_anthropic.py
  • core/schemas/utils.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/types.go
  • core/providers/anthropic/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/types.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/utils.go
  • core/schemas/chatcompletions.go
  • core/schemas/utils.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/types.go
  • core/providers/anthropic/responses.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
🧬 Code graph analysis (8)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/schemas/responses.go (1)
  • ResponsesToolMessage (469-489)
core/providers/bedrock/chat.go (1)
core/schemas/chatcompletions.go (4)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/test_anthropic.py (2)
  • validate_cache_write (1665-1679)
  • validate_cache_read (1681-1695)
core/providers/bedrock/utils.go (2)
core/providers/bedrock/types.go (5)
  • BedrockContentBlock (163-190)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockMessage (150-153)
  • BedrockTool (284-287)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/schemas/chatcompletions.go (2)
core/providers/gemini/types.go (1)
  • Type (782-782)
ui/lib/types/logs.ts (1)
  • Function (144-149)
tests/integrations/tests/test_anthropic.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/utils/common.py (1)
  • assert_valid_chat_response (801-833)
core/providers/anthropic/chat.go (3)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • ChatContentBlockTypeText (584-584)
core/providers/gemini/types.go (1)
  • Type (782-782)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
core/providers/anthropic/responses.go (5)
core/schemas/responses.go (6)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesOutputMessageContentTypeText (396-396)
  • ResponsesInputMessageContentBlockTypeText (392-392)
ui/lib/types/logs.ts (3)
  • ResponsesMessage (423-438)
  • ResponsesMessageContent (400-400)
  • ResponsesMessageContentBlock (362-397)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py

1624-1624: Unused method argument: test_config

(ARG002)


1674-1674: Unused method argument: test_config

(ARG002)


1725-1725: Unused method argument: test_config

(ARG002)

tests/integrations/tests/test_anthropic.py

1422-1422: Unused method argument: anthropic_client

(ARG002)


1422-1422: Unused method argument: test_config

(ARG002)


1488-1488: Unused method argument: anthropic_client

(ARG002)


1488-1488: Unused method argument: test_config

(ARG002)


1562-1562: Unused method argument: anthropic_client

(ARG002)


1562-1562: Unused method argument: test_config

(ARG002)

🔇 Additional comments (21)
core/schemas/utils.go (1)

1042-1045: LGTM!

The IsNovaModel function follows the established pattern of other model detection utilities in this file. The implementation is consistent with IsAnthropicModel, IsMistralModel, and IsGeminiModel.

tests/integrations/config.yml (1)

109-127: Prompt caching configuration and capability mapping look consistent

The updated Bedrock chat model and the new prompt_caching flags per provider line up with scenario_capabilities.prompt_caching: "chat", so cross‑provider parametrization for the new tests will target only providers that actually support prompt caching (Anthropic + Bedrock) while leaving others disabled.

Also applies to: 167-167, 205-205, 243-243, 281-281, 319-319, 362-362

core/schemas/chatcompletions.go (1)

246-250: CacheControl schema extensions are well-scoped and backward compatible

Adding CacheControl to ChatTool and ChatContentBlock, plus the CacheControlType/CacheControl types, follows existing schema patterns (pointer fields with omitempty) and should not disturb existing clients while enabling per‑tool and per‑block caching metadata.

Also applies to: 591-613

core/providers/anthropic/chat.go (1)

65-67: Anthropic CacheControl propagation is symmetric and consistent

Forward and reverse conversions now carry CacheControl on both tools and text content blocks (system, tool results, user/assistant messages), matching the new central schemas.CacheControl type. This keeps Anthropic ↔ Bifrost mappings lossless for caching metadata without altering existing behavior for messages that don’t set it.

Also applies to: 150-153, 184-188, 242-245, 328-331, 514-517

core/providers/bedrock/utils.go (3)

26-32: Pass model into convertToolConfig for Nova gating

Wiring bifrostReq.Model into convertToolConfig is appropriate and keeps model‑specific behavior (e.g. Nova cache restrictions) localized there. No issues seen with this change.


319-348: Per‑block CachePoint on tool results looks correct

Attaching CachePoint on tool‑result text and image blocks only when block.CacheControl != nil preserves existing behavior while enabling cache signaling, and the construction via a local toolResultContentBlock avoids partial initialization issues. This segment looks good.


392-421: Content block CachePoint propagation is consistent

convertContentBlock now mirrors the tool‑result logic by attaching a BedrockCachePoint{Type: BedrockCachePointTypeDefault} whenever block.CacheControl is present for text and image blocks. This is a straightforward, nil‑safe extension and aligns with the new CachePoint field on BedrockContentBlock.

tests/integrations/tests/test_anthropic.py (2)

52-86: New prompt‑caching test fixtures and data look well‑integrated

Importing PROMPT_CACHING_TOOLS and PROMPT_CACHING_LARGE_CONTEXT from the shared utils keeps these Anthropic tests aligned with the cross‑provider scenarios. The additions fit the existing fixture/config patterns in this file.


1665-1695: Cache usage validators align with Anthropic usage schema

validate_cache_write / validate_cache_read:

  • log input_tokens, cache_creation_input_tokens, and cache_read_input_tokens,
  • assert presence of the relevant cache field and that it’s > 0,
  • ensure write vs read operations don’t both happen in the same request.

Assuming the Anthropic SDK exposes these fields with the expected snake_case names (which matches the Go AnthropicUsage JSON tags), this is the right assertion surface for prompt‑caching behavior.

core/providers/anthropic/responses.go (10)

1340-1347: Unifying message+system conversion through ConvertAnthropicMessagesToBifrostMessages

Passing request.System into ConvertAnthropicMessagesToBifrostMessages centralizes system‑content handling with the regular message pipeline and lets the new system helper reuse CacheControl‑aware logic. This avoids duplicated system‑conversion code and looks like a solid refactor.


2065-2097: System content conversion now preserves blocks and avoids empty system messages

convertAnthropicSystemToBifrostMessages:

  • Handles both simple ContentStr and structured ContentBlocks.
  • Emits system messages only when there is non‑empty text (guarded by len(contentBlocks) > 0).
  • Preserves per‑block CacheControl when system content comes as blocks.

This fixes the earlier empty‑ContentBlocks system‑message edge case and ensures cache directives on system text survive into Bifrost.


2253-2275: Tool‑result text blocks correctly carry CacheControl into tool outputs

Within tool‑result handling for Anthropic blocks, the mapping:

  • chooses blockType based on isOutputMessage, and
  • copies contentBlock.CacheControl onto each ResponsesMessageContentBlock.

This ensures cache directives on tool outputs are preserved when converting Anthropic tool results into Bifrost FunctionCallOutput messages. The same pattern is mirrored later in the non‑grouped converter.


2377-2398: Text content CacheControl propagation for input/output messages is symmetric

In convertAnthropicContentBlocksToResponsesMessages, both branches now:

  • wrap text in a ContentBlocks slice (rather than ContentStr), and
  • attach CacheControl to the text block for both output (output_text) and input (input_text) cases.

That gives a single representation for cacheable text across the Responses pipeline and works cleanly with convertBifrostContentBlocksToAnthropic.


2495-2507: Tool‑result CacheControl preserved in non‑grouped converter as well

The non‑grouped variant mirrors the grouped one by copying contentBlock.CacheControl onto each text block inside tool‑result outputs. This keeps CacheControl round‑tripped for tool outputs regardless of which converter path is used.


2560-2568: MCP tool‑result text blocks now also carry CacheControl

For MCP tool results, adding CacheControl: contentBlock.CacheControl to each text block ensures MCP outputs participate in the same caching semantics as standard tools, which is important for consistent cross‑tool behavior.


3082-3084: Anthropic→Bifrost tool conversion: CacheControl now forwarded

The new block:

if tool.CacheControl != nil {
    bifrostTool.CacheControl = tool.CacheControl
}

means Anthropic cache_control on tools will be preserved when building Bifrost ResponsesTools, which is necessary for the new prompt‑caching tests that set tool‑level cache control.


3262-3264: Bifrost→Anthropic tool conversion: CacheControl round‑trip completed

Mirroring the previous change, propagating tool.CacheControl back to AnthropicTool.CacheControl closes the loop so cache metadata round‑trips across Anthropic↔Bifrost conversions.


3325-3347: Content block conversion to Anthropic now includes CacheControl (text & image)

convertContentBlockToAnthropic now:

  • copies block.CacheControl into text blocks, and
  • passes it through when wrapping image blocks via ConvertToAnthropicImageBlock.

Together with convertBifrostContentBlocksToAnthropic, this ensures CacheControl on Bifrost content blocks is preserved in Anthropic request/response payloads.


3376-3383: Image block back‑conversion preserves CacheControl

toBifrostResponsesImageBlock now sets CacheControl: block.CacheControl on the returned ResponsesMessageContentBlock, so image‑level cache hints from Anthropic are preserved when mapping back into Responses messages.

core/providers/bedrock/types.go (2)

155-201: Bedrock CachePoint types and wiring look coherent

Adding:

  • CachePoint *BedrockCachePoint to BedrockSystemMessage and BedrockContentBlock,
  • BedrockCachePointType + BedrockCachePointTypeDefault, and
  • BedrockCachePoint{Type BedrockCachePointType},

matches how convertContentBlock, convertToolMessage, and convertToolConfig now populate cache hints. The JSON tag cachePoint is consistent across all three structs and aligns with Bedrock’s Converse schema.


283-287: Tool‑level CachePoint field aligns with new tool config behavior

Extending BedrockTool to:

ToolSpec   *BedrockToolSpec   `json:"toolSpec,omitempty"`
CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"`

matches the new convertToolConfig behavior that appends bare cachePoint tools. Struct layout and tags look correct; no issues here.

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 6fe2adf to b4bc2f2 on December 16, 2025 at 07:27
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/anthropic/chat.go (1)

274-281: CacheControl is lost when collapsing single text block to ContentStr.

When there's exactly one text content block, this optimization collapses it into ContentStr:

if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
    anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
}

This discards any CacheControl attached to that block. Consider preserving the block-based representation when CacheControl is present:

- if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
+ if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText && content[0].CacheControl == nil {
      anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
  } else if len(content) > 0 {
🧹 Nitpick comments (3)
tests/integrations/tests/test_anthropic.py (3)

1481-1482: Consider making the token proximity threshold configurable.

The hardcoded 100-token threshold for validating cache token proximity might be fragile across different providers and models. Consider making this threshold configurable per provider or using a percentage-based approach.

Example:

-        assert abs(cache_write_tokens - cache_read_tokens) < 100, \
+        # Allow 5% variance or 100 tokens, whichever is larger
+        tolerance = max(100, int(cache_write_tokens * 0.05))
+        assert abs(cache_write_tokens - cache_read_tokens) < tolerance, \
-            f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})"
+            f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens}) within {tolerance} tokens"

1487-1560: Consider adding proximity validation for consistency.

Test 28 includes a proximity check to validate that cache read tokens are close to cache write tokens (line 1481-1482), but test 29 omits this validation. Consider adding it for consistency:

         assert_valid_chat_response(response2)
         cache_read_tokens = validate_cache_read(response2.usage, "Second request")
         
+        # Validate that cache read tokens are approximately equal to cache creation tokens
+        tolerance = max(100, int(cache_write_tokens * 0.05))
+        assert abs(cache_write_tokens - cache_read_tokens) < tolerance, \
+            f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})"
+        
         print(f"✓ Messages caching validated - Cache created: {cache_write_tokens} tokens, "
               f"Cache read: {cache_read_tokens} tokens")

1561-1611: Consider adding proximity validation for consistency.

Similar to test 29, test 30 omits the proximity check that test 28 includes. For consistency across all three caching tests, consider adding the validation here as well.

Additionally, the test doesn't validate that the responses are appropriate (e.g., that response1 attempts to use the weather tool). While not strictly required for a caching test, it might strengthen the test:

# After line 1591, optionally add:
# Verify first response uses appropriate tool (weather)
if hasattr(response1, "content") and len(response1.content) > 0:
    # Check if model attempted to use tools
    has_tool_use = any(
        hasattr(block, "type") and block.type == "tool_use" 
        for block in response1.content
    )
    # Tool use is expected but not required for caching test
    if has_tool_use:
        print("✓ Model used tools as expected")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe2adf and b4bc2f2.

📒 Files selected for processing (18)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/chat.go (6 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (6 hunks)
  • core/providers/openai/responses_marshal_test.go (0 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
  • core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • core/schemas/responses.go
  • tests/integrations/config.yml
  • core/schemas/chatcompletions.go
  • core/providers/anthropic/types.go
  • core/providers/bedrock/types.go
  • tests/integrations/tests/utils/common.py
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/anthropic/chat.go
  • core/providers/openai/types.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/utils.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/chat.go
  • core/internal/testutil/test_retry_framework.go
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_bedrock.py
  • core/schemas/utils.go
  • core/providers/bedrock/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/anthropic/chat.go
  • core/providers/openai/types.go
  • core/providers/anthropic/responses.go
  • core/providers/anthropic/utils.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/chat.go
  • core/internal/testutil/test_retry_framework.go
  • core/schemas/utils.go
  • core/providers/bedrock/responses.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
🧬 Code graph analysis (5)
core/providers/anthropic/chat.go (2)
core/schemas/chatcompletions.go (7)
  • CacheControl (610-613)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatAssistantMessageToolCall (702-708)
  • ChatReasoningDetails (739-746)
  • BifrostReasoningDetailsTypeText (735-735)
core/providers/anthropic/types.go (3)
  • AnthropicContentBlockTypeText (204-204)
  • AnthropicContentBlockTypeToolUse (206-206)
  • AnthropicContentBlockTypeThinking (212-212)
core/providers/openai/types.go (2)
core/schemas/responses.go (3)
  • ResponsesMessage (318-331)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesToolMessage (469-489)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/bedrock/chat.go (3)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (135-135)
core/schemas/chatcompletions.go (8)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatAssistantMessageToolCall (702-708)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatReasoningDetails (739-746)
  • BifrostReasoningDetailsTypeText (735-735)
core/schemas/utils.go (1)
  • Ptr (16-18)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/test_anthropic.py (2)
  • validate_cache_write (1665-1679)
  • validate_cache_read (1681-1695)
core/providers/bedrock/responses.go (4)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
core/providers/bedrock/types.go (5)
  • BedrockTool (284-287)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockSystemMessage (156-160)
  • BedrockContentBlock (163-190)
core/schemas/responses.go (3)
  • ResponsesMessage (318-331)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_anthropic.py

1422-1422: Unused method argument: anthropic_client

(ARG002)


1422-1422: Unused method argument: test_config

(ARG002)


1488-1488: Unused method argument: anthropic_client

(ARG002)


1488-1488: Unused method argument: test_config

(ARG002)


1562-1562: Unused method argument: anthropic_client

(ARG002)


1562-1562: Unused method argument: test_config

(ARG002)

tests/integrations/tests/test_bedrock.py

1624-1624: Unused method argument: test_config

(ARG002)


1674-1674: Unused method argument: test_config

(ARG002)


1725-1725: Unused method argument: test_config

(ARG002)

🔇 Additional comments (40)
tests/integrations/tests/test_anthropic.py (5)

66-66: LGTM: New imports for prompt caching tests.

The imported constants PROMPT_CACHING_TOOLS and PROMPT_CACHING_LARGE_CONTEXT are appropriately used in the new caching test cases.

Also applies to: 85-85


39-41: LGTM: Documentation updated for new test cases.

The docstring additions clearly describe the three new prompt caching test scenarios.


1421-1486: LGTM: System message caching test is well-structured.

The test correctly validates cache creation and reuse for system messages across providers. Using get_provider_anthropic_client(provider) is appropriate for cross-provider testing with the x-model-provider header.

Note: The static analysis warnings about unused anthropic_client and test_config are false positives—these are pytest fixtures that must be declared for the test framework.


1665-1679: LGTM: Cache write validation helper is well-implemented.

The function appropriately validates cache creation operations with clear assertions and safe attribute access using getattr. The attribute-based access pattern (e.g., usage.input_tokens) is correct for Anthropic SDK response objects.


1681-1695: LGTM: Cache read validation helper is well-implemented.

The function correctly validates cache read operations with appropriate assertions. It's consistent with validate_cache_write and provides clear diagnostic output.

core/internal/testutil/test_retry_framework.go (6)

832-834: LGTM - Consistent attempt semantics!

The updated comment and logging correctly clarify that attempt represents the current failed attempt number, and the log displays attempt+1 to show the next attempt that will run. This addresses the inconsistency flagged in the previous review and aligns with other retry functions throughout the codebase.


2153-2158: Helpful logging enhancement for retry tracking.

The attempt start logging distinguishes between the initial attempt and retries, making it easier to follow test execution flow in logs. This is particularly useful when debugging cache control tests that rely on the Responses API streaming.


2205-2212: Correct attempt parameter passing.

The code correctly passes the current failed attempt number to the OnRetry callback (line 2211), which is consistent with the fix from the previous review. The conditional error logging (lines 2206-2208) avoids duplicate messages by only logging for retry attempts (attempt > 1).


2218-2218: Good UX improvement with explicit wait indicators.

Logging the wait time before retries (with the ⏳ indicator) helps distinguish between active retry attempts and backoff delays, making test logs easier to interpret.


2244-2254: Robust nil channel handling with clear intent.

The enhanced logging for the nil channel case provides better visibility into retry decisions. The critical comment at line 2254 correctly emphasizes that the code must continue to retry rather than returning immediately, ensuring retry resilience even when the channel is nil on intermediate attempts.


2317-2317: Consistent attempt parameter semantics.

The comment correctly documents that the current failed attempt number is passed to the OnRetry callback, maintaining consistency with the fixes at lines 2211 and 2246.

core/schemas/utils.go (1)

1042-1045: LGTM!

The IsNovaModel function follows the established pattern used by IsAnthropicModel, IsMistralModel, and IsGeminiModel. The case-sensitive check for "nova" is appropriate given typical model naming conventions.

core/providers/anthropic/utils.go (1)

96-101: LGTM!

Good improvements:

  1. Using the typed constant AnthropicContentBlockTypeImage instead of a string literal improves type safety.
  2. Propagating CacheControl from the input block ensures cache metadata is preserved during image block conversion.
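
A tiny sketch of that shape, using simplified stand-in types rather than the real ConvertToAnthropicImageBlock signature:

package main

import "fmt"

type CacheControl struct{ Type string }

type BifrostImageBlock struct {
	URL          string
	CacheControl *CacheControl
}

const anthropicContentBlockTypeImage = "image" // stands in for the typed constant

type AnthropicContentBlock struct {
	Type         string
	SourceURL    string
	CacheControl *CacheControl
}

// convertImageBlock mirrors the two points above: the block type comes from a
// named constant rather than an inline string literal, and CacheControl is
// carried over from the source block.
func convertImageBlock(b BifrostImageBlock) AnthropicContentBlock {
	return AnthropicContentBlock{
		Type:         anthropicContentBlockTypeImage,
		SourceURL:    b.URL,
		CacheControl: b.CacheControl,
	}
}

func main() {
	out := convertImageBlock(BifrostImageBlock{URL: "https://example.com/cat.png", CacheControl: &CacheControl{Type: "ephemeral"}})
	fmt.Printf("%+v\n", out)
}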
core/providers/anthropic/chat.go (6)

65-68: LGTM!

Properly propagates CacheControl from Bifrost tools to Anthropic tools, enabling per-tool cache control during request conversion.


149-154: LGTM!

System message content blocks now correctly propagate CacheControl metadata, enabling cache control at the system prompt level.


184-191: LGTM!

Tool result content blocks correctly propagate CacheControl, and the image block conversion via ConvertToAnthropicImageBlock (which also handles CacheControl) is appropriate.


241-249: LGTM!

User and assistant message content blocks correctly propagate CacheControl metadata while maintaining image block handling via the dedicated converter.


320-366: LGTM!

The response conversion correctly:

  1. Propagates CacheControl from Anthropic text blocks to Bifrost content blocks
  2. Handles tool use blocks with proper JSON marshaling and fallback
  3. Accumulates thinking/reasoning details with correct type assignment

501-511: LGTM!

The bidirectional conversion correctly propagates CacheControl from Bifrost content blocks back to Anthropic format, ensuring cache metadata is preserved in both directions.

core/providers/bedrock/utils.go (4)

30-32: LGTM!

The function call now correctly passes the model to convertToolConfig, enabling model-specific cache control behavior.


319-351: LGTM!

Tool result content blocks (both text and image) correctly propagate cache signaling by adding CachePoint when the source block has CacheControl set.


390-421: LGTM!

The convertContentBlock function correctly adds CachePoint to both text and image blocks when CacheControl is present on the source, maintaining consistency with the tool result handling.


599-651: The CachePoint format and Nova model exclusion are correct.

The Bedrock Converse API tool structure is a tagged union in which only one of toolSpec, systemTool, or cachePoint can be set, confirming that adding CachePoint as a separate BedrockTool entry is the correct API format.

Nova models do not support tool caching, and attempting to use tool caching strategies with Nova throws an exception, so the !schemas.IsNovaModel(model) check correctly prevents CachePoint injection for Nova models.
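
A simplified sketch of that tagged-union shape (local stand-in types and a hypothetical appendToolWithCache helper; the real BedrockTool, BedrockCachePoint, and IsNovaModel live in the provider and schema packages and carry more fields):

package sketch

import "strings"

type CachePoint struct {
    Type string `json:"type"` // "default"
}

type ToolSpec struct {
    Name string `json:"name"`
}

// Tagged union: exactly one of ToolSpec or CachePoint is set per entry.
type Tool struct {
    ToolSpec   *ToolSpec   `json:"toolSpec,omitempty"`
    CachePoint *CachePoint `json:"cachePoint,omitempty"`
}

func isNovaModel(model string) bool { return strings.Contains(model, "nova") }

// appendToolWithCache appends the tool spec and, when caching is requested and
// the model supports it, a separate cachePoint-only entry right after it.
func appendToolWithCache(tools []Tool, spec ToolSpec, cached bool, model string) []Tool {
    tools = append(tools, Tool{ToolSpec: &spec})
    if cached && !isNovaModel(model) { // Nova models do not support tool caching
        tools = append(tools, Tool{CachePoint: &CachePoint{Type: "default"}})
    }
    return tools
}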

core/providers/bedrock/chat.go (2)

63-78: LGTM!

The structured output handling correctly:

  1. Checks for the structured output tool by name from context
  2. Marshals tool input to JSON string as the content
  3. Provides fallback to string conversion on marshal failure

80-140: LGTM with minor edge case consideration.

The content block handling correctly:

  1. Converts text blocks to Bifrost format
  2. Attaches CacheControl to the last content block when encountering a CachePoint block
  3. Handles tool use with proper JSON marshaling and variable capture to avoid loop variable issues (lines 112-113)
  4. Accumulates reasoning content with correct signature handling

One edge case to consider: if a CachePoint block appears before any content blocks, the len(contentBlocks) > 0 check (line 90) safely prevents a panic, but the cache signal would be silently dropped. This is likely acceptable given Bedrock's expected ordering.
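
A minimal sketch of the "attach to the previous block" behaviour (stand-in types and a hypothetical applyCachePoint helper; the real conversion also handles tool use and reasoning blocks):

package sketch

type CacheControl struct {
    Type string // e.g. "ephemeral"
}

type ContentBlock struct {
    Text         string
    CacheControl *CacheControl
}

// applyCachePoint marks the most recent block as cacheable; an orphan cache
// point with no preceding block is silently ignored, as noted above.
func applyCachePoint(blocks []ContentBlock) []ContentBlock {
    if len(blocks) > 0 {
        blocks[len(blocks)-1].CacheControl = &CacheControl{Type: "ephemeral"}
    }
    return blocks
}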

core/providers/anthropic/responses.go (4)

2065-2097: LGTM! System message conversion correctly preserves CacheControl.

The implementation properly handles both ContentStr and ContentBlocks cases while preserving CacheControl on each text block. The guard at line 2086 correctly prevents appending system messages with empty ContentBlocks arrays, addressing the concern from the past review.


2174-2180: Excellent cache control propagation across content blocks.

The implementation consistently preserves CacheControl across all content block types (text, tool results) in both input and output message conversions. This ensures cache metadata is maintained throughout the conversion pipeline.

Also applies to: 2263-2266, 2377-2379, 2390-2396, 2495-2497, 2559-2562


3082-3084: Tool conversion correctly preserves CacheControl in both directions.

The bidirectional tool conversions properly preserve CacheControl, enabling cache metadata to flow through tool definitions. The nil checks ensure safe assignment.

Also applies to: 3262-3264


3330-3332: Image and helper conversions complete the CacheControl propagation.

The remaining conversions (image blocks and helper functions) correctly preserve CacheControl, completing comprehensive cache metadata support across all content block types.

Also applies to: 3343-3343, 3382-3382

tests/integrations/tests/test_bedrock.py (3)

1623-1671: System message caching test correctly structured.

The test properly validates prompt caching for system messages by:

  1. Placing a cachePoint after system content blocks
  2. Making two requests with identical system content but different user messages
  3. Validating cache write on the first request and cache read on the second

The Bedrock cachePoint format with {"type": "default"} is correctly used.


1673-1722: Messages caching test correctly implements mid-message cache points.

The test validates prompt caching within messages by placing a cachePoint between the large context and the specific question. This allows the large context to be cached while enabling different questions, which is the intended use case for efficient prompt caching.


1724-1785: Tools caching test and helper functions correctly implemented.

The tools caching test properly validates caching of tool definitions by:

  • Using 12 tools from PROMPT_CACHING_TOOLS to create a substantial cache
  • Placing the cachePoint after all tool definitions
  • Making two requests with the same tools but different user messages

The helper functions validate_cache_write and validate_cache_read correctly use Bedrock's camelCase field names (cacheWriteInputTokens, cacheReadInputTokens) rather than Anthropic's snake_case format, ensuring proper validation of cache metrics.

Also applies to: 1787-1813

core/providers/bedrock/responses.go (6)

1572-1578: Cache point conversion logic mirrors the inverse operation.

The conversion from Bifrost CacheControl to Bedrock CachePoint correctly appends a separate tool entry, which mirrors the inverse operation in lines 1283-1290.


1707-1712: Cache token mapping is correct.

The mapping of cached token counts from Bifrost's InputTokensDetails/OutputTokensDetails to Bedrock's CacheReadInputTokens/CacheWriteInputTokens is accurate.


2382-2384: System message conversion refactored to handle slices.

The updated call correctly passes the entire systemMessages slice and only appends results if non-empty. Clean refactor.


2413-2419: Cache point appending is consistent with tool conversion pattern.

The logic correctly appends a separate BedrockSystemMessage with CachePoint when a content block has CacheControl, matching the pattern used in tool conversion.


2775-2781: Cache point appending after content blocks is correct.

The conversion from Bifrost CacheControl to Bedrock CachePoint correctly appends a separate content block marker after processing each block with cache control, maintaining consistency with Bedrock's cache representation.


1283-1290: Add code comment explaining Nova tool cache limitation.

The Nova model exclusion at line 1283 is correct—Amazon Nova models do not support tool caching. Consider adding an inline comment documenting this limitation for future maintainers.

core/providers/openai/types.go (2)

178-211: Data mutation issue successfully resolved.

The implementation correctly addresses the past review concern by creating deep copies of the message structures before stripping CacheControl. The approach ensures the original input data remains unchanged:

  • Creates a new messagesCopy slice
  • Deep copies Content and ContentBlocks when present
  • Deep copies ResponsesToolMessage and Output when present
  • Marshals the copied data instead of the original

The nil checks are properly placed, preventing any potential panics when accessing nested fields.
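
As a rough illustration of the copy-then-strip approach (heavily simplified stand-in types and a hypothetical marshalWithoutCacheControl helper; the real ResponsesMessage has nested Content, ContentBlocks, and tool output fields):

package sketch

import "encoding/json"

type CacheControl struct {
    Type string `json:"type"`
}

type ContentBlock struct {
    Text         string        `json:"text"`
    CacheControl *CacheControl `json:"cache_control,omitempty"`
}

type Message struct {
    Role   string         `json:"role"`
    Blocks []ContentBlock `json:"content"`
}

// marshalWithoutCacheControl deep-copies the messages, strips CacheControl on
// the copies, and marshals the copies, so the caller's data is never mutated.
func marshalWithoutCacheControl(messages []Message) ([]byte, error) {
    copied := make([]Message, len(messages))
    for i, msg := range messages {
        blocks := make([]ContentBlock, len(msg.Blocks))
        copy(blocks, msg.Blocks)
        for j := range blocks {
            blocks[j].CacheControl = nil // strip provider-only cache metadata
        }
        msg.Blocks = blocks
        copied[i] = msg
    }
    return json.Marshal(copied)
}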


196-209: Code correctly handles CacheControl stripping in ResponsesMessage.

The implementation properly strips CacheControl from all message content fields:

  • Content blocks (line 189): All ResponsesMessageContentBlock instances in Content.ContentBlocks
  • Tool output blocks (line 203): All ResponsesMessageContentBlock instances in ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks

ResponsesReasoning and other embedded tool structs (FileSearch, Computer, CodeInterpreter, MCP, Custom, ImageGeneration) do not contain CacheControl fields and require no modification.

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from b4bc2f2 to 217e2c0 Compare December 16, 2025 08:56
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (3)
core/internal/testutil/prompt_caching.go (1)

245-268: Prompt‑caching harness correctly uses CacheControl, with a couple of small robustness nits

  • Marking the last tool in GetPromptCachingTools as CacheControl: Ephemeral and wrapping the long system prefix in a ChatContentBlock with CacheControl: Ephemeral matches the new schema and Anthropic/Bedrock mappings, so the Go harness will actually exercise provider‑level caching.
  • Gating RunPromptCachingTest on both SimpleChat and PromptCaching avoids hitting providers that don’t expose caching metrics, which should reduce flaky skips.

Two small follow‑ups you may want to consider:

  • The “last tool is cached” convention is index‑based; if the tools slice is reordered later, the cached tool changes silently. Keying the cache‑controlled tool by name (e.g., always making check_compliance_status ephemeral) would be more robust; see the sketch after this comment.
  • In retryContext.TestMetadata, you still log model: testConfig.ChatModel while the request uses PromptCachingModel; switching the metadata to the actual model being called would make retries/debugging clearer.

Also applies to: 276-304, 369-379
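
A minimal sketch of the name-keyed alternative mentioned above (hypothetical markToolCached helper with stand-in types):

package sketch

type CacheControl struct{ Type string }

type Tool struct {
    Name         string
    CacheControl *CacheControl
}

// markToolCached keys the cache-controlled tool by name rather than by index,
// so reordering the tools slice does not silently move the cache point.
func markToolCached(tools []Tool, name string) []Tool {
    for i := range tools {
        if tools[i].Name == name {
            tools[i].CacheControl = &CacheControl{Type: "ephemeral"}
        }
    }
    return tools
}

With the harness's current data this would be something like markToolCached(tools, "check_compliance_status").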

tests/integrations/tests/test_bedrock.py (1)

37-41: Prompt caching integration tests for Bedrock are well structured

The new tests and helpers correctly exercise Bedrock’s cache metrics:

  • test_21_prompt_caching_system, test_22_prompt_caching_messages, and test_23_prompt_caching_tools each validate a distinct checkpoint type (system, messages, tools) with a clear two-call pattern (write then read).
  • validate_cache_write / validate_cache_read use the Bedrock field names (inputTokens, cacheWriteInputTokens, cacheReadInputTokens) and assert the right invariants (write>0/read==0 and vice versa).
  • Tool checkpoint test builds a 12-tool toolConfig matching how Bedrock expects cachePoint after tool specs.

If you find yourself adding similar helpers for other providers again, you could consider moving the cache read/write validators into a shared test util, but that’s optional.

Also applies to: 51-61, 1623-1813

core/providers/anthropic/responses.go (1)

1344-1388: Centralized system-message handling and Bedrock grouping look correct

The refactor to:

  • Pass request.System into ConvertAnthropicMessagesToBifrostMessages, and
  • Introduce convertAnthropicSystemToBifrostMessages,

nicely centralizes system-message handling for both request and response paths while preserving any CacheControl on system blocks.

Using the keepToolsGrouped flag (provider == schemas.Bedrock) to switch between grouped and ungrouped conversions is a sensible way to keep Bedrock’s mixed text+tool_use turns intact without affecting other providers.

One minor stylistic nit: in convertAnthropicSystemToBifrostMessages you tag system text content as ResponsesOutputMessageContentTypeText; using the input_text variant might better reflect that these are request-side messages, but functionally it should be equivalent given how other code treats text blocks.

Also applies to: 1628-1649, 2065-2097

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b4bc2f2 and 217e2c0.

📒 Files selected for processing (22)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/prompt_caching.go (3 hunks)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/anthropic_test.go (2 hunks)
  • core/providers/anthropic/chat.go (6 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/bedrock_test.go (2 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (6 hunks)
  • core/providers/openai/responses_marshal_test.go (0 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
  • core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • core/schemas/responses.go
  • core/providers/openai/types.go
  • core/providers/anthropic/utils.go
  • tests/integrations/tests/utils/common.py
  • core/providers/anthropic/types.go
  • core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/internal/testutil/account.go
  • core/schemas/chatcompletions.go
  • tests/integrations/config.yml
  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/anthropic_test.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/responses.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/chat.go
  • tests/integrations/tests/test_bedrock.py
  • core/providers/bedrock/types.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/anthropic/chat.go
  • tests/integrations/tests/test_anthropic.py
  • core/internal/testutil/prompt_caching.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/internal/testutil/account.go
  • core/schemas/chatcompletions.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/anthropic_test.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/responses.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/types.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/anthropic/chat.go
  • core/internal/testutil/prompt_caching.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/anthropic_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
🧬 Code graph analysis (8)
core/providers/bedrock/utils.go (3)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/bedrock/types.go (6)
  • BedrockContentBlock (163-190)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockMessage (150-153)
  • BedrockToolConfig (278-281)
  • BedrockTool (284-287)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/providers/bedrock/responses.go (4)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
core/providers/bedrock/types.go (4)
  • BedrockTool (284-287)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockContentBlock (163-190)
core/schemas/responses.go (4)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
core/providers/anthropic/responses.go (7)
core/schemas/bifrost.go (1)
  • Bedrock (38-38)
core/schemas/responses.go (6)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesOutputMessageContentTypeText (396-396)
  • ResponsesInputMessageContentBlockTypeText (392-392)
ui/lib/types/logs.ts (3)
  • ResponsesMessage (423-438)
  • ResponsesMessageContent (400-400)
  • ResponsesMessageContentBlock (362-397)
core/providers/gemini/types.go (3)
  • Role (17-17)
  • Content (977-985)
  • Type (782-782)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
core/providers/bedrock/chat.go (3)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (135-135)
core/schemas/chatcompletions.go (6)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatAssistantMessageToolCall (702-708)
  • ChatAssistantMessageToolCallFunction (711-714)
core/schemas/utils.go (1)
  • Ptr (16-18)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/test_anthropic.py (2)
  • validate_cache_write (1665-1679)
  • validate_cache_read (1681-1695)
core/providers/anthropic/chat.go (2)
core/schemas/chatcompletions.go (5)
  • CacheControl (610-613)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatAssistantMessageToolCall (702-708)
core/providers/anthropic/types.go (3)
  • AnthropicContentBlockTypeText (204-204)
  • AnthropicContentBlockTypeToolUse (206-206)
  • AnthropicContentBlockTypeThinking (212-212)
tests/integrations/tests/test_anthropic.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/utils/common.py (1)
  • assert_valid_chat_response (801-833)
core/internal/testutil/prompt_caching.go (2)
core/schemas/chatcompletions.go (4)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
core/schemas/utils.go (1)
  • Ptr (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py

1624-1624: Unused method argument: test_config

(ARG002)


1674-1674: Unused method argument: test_config

(ARG002)


1725-1725: Unused method argument: test_config

(ARG002)

tests/integrations/tests/test_anthropic.py

1422-1422: Unused method argument: anthropic_client

(ARG002)


1422-1422: Unused method argument: test_config

(ARG002)


1488-1488: Unused method argument: anthropic_client

(ARG002)


1488-1488: Unused method argument: test_config

(ARG002)


1562-1562: Unused method argument: anthropic_client

(ARG002)


1562-1562: Unused method argument: test_config

(ARG002)

🔇 Additional comments (24)
core/internal/testutil/test_retry_framework.go (3)

2153-2158: LGTM! Improved attempt logging.

The addition of per-attempt start logs enhances test observability by clearly distinguishing between initial attempts and retries.


2205-2220: LGTM! Consistent attempt semantics and enhanced logging.

The changes correctly implement consistent attempt semantics (passing the current failed attempt number to OnRetry) and add helpful waiting logs before retries, addressing the previous review feedback.


2241-2254: LGTM! Critical nil-channel retry fix.

The code correctly continues the retry loop when the response channel is nil instead of returning early. The enhanced logging and consistent attempt semantics further improve debuggability.

core/schemas/chatcompletions.go (2)

245-250: Add CacheControl to ChatTool looks correct and backward‑compatible

Pointer field with omitempty cleanly extends the schema and will be ignored by existing callers and providers that don’t use cache control.


591-613: ChatContentBlock + CacheControl type shape is sound

Carrying CacheControl on content blocks via a typed CacheControlType and TTL string is a minimal, provider‑agnostic extension and integrates cleanly with existing (de)serialization.
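
The shape being described is roughly the following (a sketch; field names and JSON tags are inferred from the discussion rather than copied from the schema):

package sketch

type CacheControlType string

const CacheControlTypeEphemeral CacheControlType = "ephemeral"

// CacheControl marks a content block (or tool) as cacheable; TTL is optional
// and provider-interpreted.
type CacheControl struct {
    Type CacheControlType `json:"type"`
    TTL  string           `json:"ttl,omitempty"`
}

// Attached as an optional pointer so existing payloads are unchanged:
type ChatContentBlock struct {
    Type         string        `json:"type"`
    Text         *string       `json:"text,omitempty"`
    CacheControl *CacheControl `json:"cache_control,omitempty"`
}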

tests/integrations/config.yml (2)

109-114: Bedrock chat model override aligns with Claude 4.5 Bedrock deployment

Pointing providers.bedrock.chat at us.anthropic.claude-sonnet-4-5-20250929-v1:0 matches the Bedrock deployment mapping used in Go test accounts and makes this model the default for Bedrock chat and prompt‑caching scenarios.


167-167: Prompt‑caching scenario wiring is consistent across providers and capabilities

Marking prompt_caching per‑provider in provider_scenarios and mapping it to "chat" in scenario_capabilities cleanly scopes the new cross‑provider tests so only Anthropic and Bedrock participate while others are explicitly disabled.

Also applies to: 205-205, 243-243, 281-281, 319-319, 342-363

core/providers/bedrock/bedrock_test.go (1)

45-56: Bedrock comprehensive test config correctly enables prompt caching and embeddings

Populating EmbeddingModel (cohere.embed-v4:0), ReasoningModel (claude-4.5-sonnet), and PromptCachingModel alongside PromptCaching: true in Scenarios matches the Bedrock key configuration and ensures the shared prompt‑caching harness runs only where the models and metrics are available. Based on learnings, this keeps Bedrock tests at the structural/config layer without adding brittle content checks.

Also applies to: 70-74

core/internal/testutil/account.go (2)

22-55: PromptCaching flag is a natural extension to TestScenarios

Adding a dedicated PromptCaching boolean keeps scenario toggles cohesive and gives the Go harness an explicit switch instead of overloading existing chat/tool flags.


590-619: Anthropic and Bedrock correctly opted into prompt‑caching scenarios

Enabling PromptCaching: true only for Anthropic and Bedrock in AllProviderConfigs matches the integration YAML and the underlying provider support, so the shared prompt‑caching test helper will automatically skip unsupported providers.

Also applies to: 624-661

core/providers/anthropic/anthropic_test.go (1)

25-35: Anthropic config now fully wired for vision, reasoning, and prompt caching

Setting VisionModel, ReasoningModel, and PromptCachingModel together with PromptCaching: true in Scenarios gives the comprehensive test runner enough information to exercise the new cache‑control flows without impacting unrelated scenarios.

Also applies to: 35-52

core/providers/anthropic/chat.go (1)

42-72: Cache‑control propagation for Anthropic text/tools is symmetrical and non‑breaking

Forwarding CacheControl from ChatTool and text ChatContentBlocks into AnthropicTool/AnthropicContentBlock, and then mapping it back in both ToBifrostChatResponse and ToAnthropicChatResponse, gives you a clean round‑trip for cache metadata without changing existing semantics for callers that don’t set it. Reasoning/thinking and tool‑use handling remain structurally identical, so prior behavior is preserved.

Also applies to: 139-162, 167-210, 218-250, 318-367, 495-511

tests/integrations/tests/test_anthropic.py (1)

52-60: New prompt‑caching constants are correctly imported for Anthropic tests

Importing PROMPT_CACHING_TOOLS and PROMPT_CACHING_LARGE_CONTEXT from the shared utils keeps the Anthropic prompt‑caching tests consistent with the Bedrock and cross‑provider helpers.

Also applies to: 82-86

core/providers/bedrock/utils.go (1)

30-32: CachePoint propagation across tools and content blocks looks sound

The changes cleanly propagate CacheControl into Bedrock-specific CachePoints:

  • convertChatParameters now passes model into convertToolConfig, enabling the Nova guard without introducing nil risks (params already checked).
  • convertToolMessage and convertContentBlock attach a BedrockCachePoint{Type: BedrockCachePointTypeDefault} only when the source ChatContentBlock.CacheControl is non-nil, for both text and image blocks.
  • convertToolConfig’s new model parameter and the !schemas.IsNovaModel(model) check correctly avoid emitting cache points for Nova models, while still allowing per-tool cachepoints elsewhere.

No functional issues spotted in this wiring.

Also applies to: 320-348, 390-420, 598-667

core/providers/bedrock/types.go (1)

155-160: Bedrock CachePoint type additions are coherent and backward compatible

The introduction of BedrockCachePointType, BedrockCachePoint, and the new CachePoint fields on BedrockSystemMessage, BedrockContentBlock, and BedrockTool is consistent with how they’re consumed in the Bedrock utils/chat paths:

  • All new fields are pointers with omitempty, so they don’t change existing JSON unless explicitly set.
  • The single BedrockCachePointTypeDefault constant matches the current usage; additional variants can be added later without API breakage.
  • Struct layout for BedrockTool keeps ToolSpec and CachePoint orthogonal, which fits the “sentinel tool” pattern used for tool-cache checkpoints.

Looks good as a foundation for prompt caching.

Also applies to: 162-201, 283-287

core/providers/anthropic/responses.go (2)

2151-2185: CacheControl round‑trip for text/image blocks and tool outputs is consistent

Across the various content-block converters:

  • Anthropic → Bifrost (convertAnthropicContentBlocksToResponsesMessages*) now copies block.CacheControl into the corresponding ResponsesMessageContentBlock for text (both input and output) and for tool-result text blocks.
  • Bifrost → Anthropic (convertContentBlockToAnthropic, convertBifrostContentBlocksToAnthropic, toBifrostResponsesImageBlock) propagates CacheControl on both text and image blocks back onto AnthropicContentBlock.CacheControl.

This ensures CacheControl survives:

  • Regular messages,
  • Tool result payloads, and
  • Image blocks

in both directions without introducing new branches or regressions. The wiring looks internally consistent with the shared CacheControl schema.

Also applies to: 2263-2276, 2359-2401, 2467-2508, 3326-3347, 3377-3383


2982-3087: Tool‑level CacheControl mapping between Anthropic and Responses tools is symmetric

The changes to convertAnthropicToolToBifrost and convertBifrostToolToAnthropic to copy tool.CacheControl:

  • Ensure that prompt-caching hints attached at the tool-definition level are preserved when moving between Anthropic’s native tool schema and the generic ResponsesTool abstraction.
  • Keep the mapping symmetric, so round-tripping a tool won’t silently strip or duplicate cache metadata.

No functional issues here; this is a straightforward and correct extension of the existing tool-conversion logic.

Also applies to: 3183-3267

core/providers/bedrock/responses.go (7)

1283-1289: LGTM! Cache point handling follows Bedrock API contract.

The logic correctly applies CacheControl to the last tool when a CachePoint marker is encountered, with appropriate Nova model exclusion.


1571-1578: LGTM! Correct bidirectional cache point propagation.

The code properly converts Bifrost CacheControl to a separate Bedrock CachePoint entry, maintaining consistency with Bedrock's API structure.


2382-2384: LGTM! Clean refactoring for batch system message processing.

The signature change from single-message to batch processing enables proper handling of CachePoint relationships across multiple system messages.


2413-2419: LGTM! Correct system message cache control propagation.

Properly converts Bifrost CacheControl to separate Bedrock CachePoint entries, consistent with the tool-level handling.


2450-2484: LGTM! System message conversion correctly handles cache points.

The function properly converts Bedrock CachePoint entries to Bifrost CacheControl on the last content block of the preceding message. As confirmed in previous reviews, Bedrock's API contract guarantees CachePoint only appears after content blocks, making the current implementation safe.


2656-2665: LGTM! Message content cache point handling is correct.

Consistent with system message handling (lines 2450-2484), this correctly converts CachePoint to CacheControl on the last content block. As previously confirmed, upstream validation ensures CachePoint never appears before content blocks.


2774-2781: LGTM! Completes bidirectional cache control conversion.

Correctly propagates CacheControl to separate Bedrock CachePoint entries in content blocks, maintaining consistency with tool and system message handling.

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 217e2c0 to 4ff38e2 Compare December 16, 2025 09:44
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
core/internal/testutil/test_retry_framework.go (1)

832-834: Consider showing max attempts in retry log.

The comment clarification about attempt semantics is good and addresses the past review concern. However, the log format shows only the next attempt number without the maximum:

t.Logf("🔄 Retrying streaming test (attempt %d): %s", attempt+1, reason)

Showing the max attempts (e.g., attempt %d/%d) would improve visibility into retry progress, especially useful during test debugging.

Apply this diff to show max attempts:

-			t.Logf("🔄 Retrying streaming test (attempt %d): %s", attempt+1, reason)
+			t.Logf("🔄 Retrying streaming test (attempt %d/%d): %s", attempt+1, config.MaxAttempts, reason)
core/providers/bedrock/chat.go (1)

63-78: CachePoint handling has an edge case when no content blocks exist.

When a CachePoint block is encountered but contentBlocks is empty (line 73), the code silently skips attaching the CacheControl. This could happen if:

  1. The first content block is a CachePoint (unlikely but possible)
  2. All previous blocks were filtered out

Consider logging or handling this case explicitly, though it may be intentional to ignore orphan CachePoints.

 			} else if contentBlock.CachePoint != nil {
 				// add it to last content block
 				if len(contentBlocks) > 0 {
 					contentBlocks[len(contentBlocks)-1].CacheControl = &schemas.CacheControl{
 						Type: schemas.CacheControlTypeEphemeral,
 					}
-				}
+				} // CachePoint without preceding content block is ignored
 			}
tests/integrations/tests/test_anthropic.py (1)

1669-1695: Consider making Anthropic validate_cache_* a bit more defensive

These helpers assume usage.input_tokens always exists; if Anthropic (or our wrapper) ever omits it, the print will raise before the assertions. Mirroring the Bedrock helpers by using getattr(usage, "input_tokens", 0) would make them safer without changing semantics.

-def validate_cache_write(usage: Any, operation: str) -> int:
-    """Validate cache write operation and return tokens written"""
-    print(f"{operation} usage - input_tokens: {usage.input_tokens}, "
-            f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
-            f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}")
+def validate_cache_write(usage: Any, operation: str) -> int:
+    """Validate cache write operation and return tokens written"""
+    print(
+        f"{operation} usage - input_tokens: {getattr(usage, 'input_tokens', 0)}, "
+        f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
+        f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}"
+    )
@@
-def validate_cache_read(usage: Any, operation: str) -> int:
-    """Validate cache read operation and return tokens read"""
-    print(f"{operation} usage - input_tokens: {usage.input_tokens}, "
-            f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
-            f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}")
+def validate_cache_read(usage: Any, operation: str) -> int:
+    """Validate cache read operation and return tokens read"""
+    print(
+        f"{operation} usage - input_tokens: {getattr(usage, 'input_tokens', 0)}, "
+        f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
+        f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}"
+    )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 217e2c0 and 4ff38e2.

📒 Files selected for processing (22)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/prompt_caching.go (3 hunks)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/anthropic_test.go (2 hunks)
  • core/providers/anthropic/chat.go (6 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/bedrock_test.go (2 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (6 hunks)
  • core/providers/openai/responses_marshal_test.go (0 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
  • core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • core/providers/anthropic/utils.go
  • core/internal/testutil/account.go
  • core/providers/anthropic/types.go
  • core/schemas/chatcompletions.go
  • tests/integrations/tests/utils/common.py
  • core/providers/anthropic/anthropic_test.go
  • core/schemas/responses.go
  • tests/integrations/config.yml
  • core/internal/testutil/prompt_caching.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/types.go
  • core/schemas/utils.go
  • core/providers/bedrock/chat.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/chat.go
  • tests/integrations/tests/test_bedrock.py
  • tests/integrations/tests/test_anthropic.py
  • core/providers/bedrock/types.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/utils.go
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/types.go
  • core/schemas/utils.go
  • core/providers/bedrock/chat.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/types.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/utils.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
🧬 Code graph analysis (7)
core/providers/openai/types.go (2)
core/schemas/responses.go (3)
  • ResponsesMessage (318-331)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesToolMessage (469-489)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/bedrock/chat.go (3)
core/schemas/chatcompletions.go (8)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatAssistantMessageToolCall (702-708)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatReasoningDetails (739-746)
  • BifrostReasoningDetailsTypeText (735-735)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (135-135)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/anthropic/responses.go (4)
core/schemas/responses.go (4)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
core/providers/anthropic/chat.go (2)
core/schemas/chatcompletions.go (6)
  • CacheControl (610-613)
  • ChatContentBlock (592-602)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatAssistantMessageToolCall (702-708)
  • ChatToolTypeFunction (240-240)
  • ChatReasoningDetails (739-746)
core/providers/anthropic/types.go (3)
  • AnthropicContentBlockTypeText (204-204)
  • AnthropicContentBlockTypeToolUse (206-206)
  • AnthropicContentBlockTypeThinking (212-212)
tests/integrations/tests/test_anthropic.py (3)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/utils/common.py (1)
  • assert_valid_chat_response (801-833)
tests/integrations/tests/test_bedrock.py (2)
  • validate_cache_write (1799-1809)
  • validate_cache_read (1811-1821)
core/providers/bedrock/responses.go (4)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
core/providers/bedrock/types.go (5)
  • BedrockTool (284-287)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockSystemMessage (156-160)
  • BedrockContentBlock (163-190)
core/schemas/responses.go (5)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageTypeMessage (294-294)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
core/providers/bedrock/utils.go (2)
core/providers/bedrock/types.go (6)
  • BedrockContentBlock (163-190)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockMessage (150-153)
  • BedrockToolConfig (278-281)
  • BedrockTool (284-287)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
🔇 Additional comments (38)
core/providers/openai/types.go (1)

178-211: LGTM! Data mutation issue properly resolved.

The deep-copy implementation correctly addresses the previous concern about in-place modification. The approach is sound:

  • Creates a new message slice and copies each message
  • Deep copies Content and ContentBlocks when present, strips CacheControl, and replaces the pointer
  • Deep copies ResponsesToolMessage, Output, and output blocks when present, strips CacheControl, and replaces the pointer
  • Preserves all other fields in the copied structs
  • Original data remains unmodified

The nil checks ensure safe dereferencing, and the logic correctly handles cases where content or tool blocks may be absent.

core/providers/bedrock/bedrock_test.go (2)

73-77: LGTM! Prompt caching configuration added correctly.

The test configuration properly includes the new PromptCachingModel field set to claude-4.5-sonnet, which aligns with the reasoning model. The field structure and assignments follow the existing pattern.


95-95: LGTM! Prompt caching scenario enabled.

The PromptCaching: true flag correctly enables prompt caching tests for the Bedrock provider, which is consistent with the PR objectives.

core/internal/testutil/test_retry_framework.go (3)

2153-2158: LGTM! Improved logging for attempt tracking.

The added per-attempt start logs enhance test observability by clearly distinguishing the first attempt from retries and showing progress (attempt X/Y format).


2205-2218: LGTM! Enhanced retry error logging and timing visibility.

The added error logging, clarified attempt parameter comments, and waiting logs improve test observability and address the past review concerns about attempt semantics consistency.


2241-2254: LGTM! Critical fix: Continue retrying on nil channel instead of early return.

The change from returning early to continuing the retry loop (line 2254) is important for retry resilience. When the response channel is nil, the code now properly retries instead of failing immediately. The added logging and comment "CRITICAL: Must continue to retry, not return" correctly emphasize this behavior.

core/schemas/utils.go (1)

1042-1045: The search confirms there is only one IsNovaModel function declaration in core/schemas/utils.go (lines 1043-1045). No duplicate declarations exist, so there is no compilation error to address.

Likely an incorrect or invalid review comment.

core/providers/anthropic/responses.go (12)

1344-1345: LGTM on centralized system content handling.

The signature change to accept systemContent *AnthropicContent cleanly centralizes system message handling and aligns with the PR's goal of propagating cache control through the conversion pipeline.


2066-2097: Verify handling of nil systemContent.ContentStr.

When systemContent.ContentStr is not nil but points to an empty string, the condition on line 2068 evaluates to false, falling through to the ContentBlocks branch. If ContentBlocks is also nil, no system message is emitted. This is likely intended behavior, but worth confirming.

Also, the function correctly guards against empty contentBlocks at line 2086 before appending, which addresses the concern from the past review comment about empty content blocks.
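
The branching described above reduces to roughly the following (stand-in types and a hypothetical convertSystem helper; the real converter builds ResponsesMessage values with role, type, and cache-control metadata):

package sketch

type CacheControl struct{ Type string }

type Block struct {
    Text         string
    CacheControl *CacheControl
}

type SystemContent struct {
    ContentStr    *string
    ContentBlocks []Block
}

// convertSystem returns the blocks to emit for a system message; nil means no
// system message should be appended (empty string and no content blocks).
func convertSystem(sc SystemContent) []Block {
    if sc.ContentStr != nil && *sc.ContentStr != "" {
        return []Block{{Text: *sc.ContentStr}}
    }
    blocks := make([]Block, 0, len(sc.ContentBlocks))
    for _, b := range sc.ContentBlocks {
        // CacheControl on each source block is carried over unchanged.
        blocks = append(blocks, Block{Text: b.Text, CacheControl: b.CacheControl})
    }
    if len(blocks) == 0 {
        return nil // guard against appending a system message with no content
    }
    return blocks
}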


2174-2181: CacheControl correctly propagated for grouped text blocks.

The CacheControl field is properly attached when converting Anthropic text blocks to Bifrost format in the grouped message conversion path.


2263-2266: CacheControl propagated in tool result content blocks.

The CacheControl is correctly preserved when converting text content blocks within tool results.


2377-2380: CacheControl propagated for output message text blocks.

Text blocks in output messages correctly include CacheControl.


2390-2397: CacheControl propagated for input message text blocks.

Input message text blocks now use ContentBlocks instead of ContentStr to preserve CacheControl. This is a good change that ensures cache metadata is not lost.


2495-2498: Consistent CacheControl propagation in tool result conversions.

Both tool result and MCP tool result content block conversions correctly preserve CacheControl.

Also applies to: 2560-2563


3082-3086: CacheControl propagated from Anthropic tools to Bifrost tools.

The conversion correctly transfers CacheControl from AnthropicTool to the Bifrost ResponsesTool structure.


3262-3266: CacheControl propagated from Bifrost tools back to Anthropic tools.

The reverse conversion also correctly preserves CacheControl, ensuring bidirectional fidelity.


3330-3333: CacheControl propagated in content block to Anthropic conversion.

Text blocks correctly include CacheControl when converting to Anthropic format.


3343-3344: CacheControl preserved for image block conversions.

Image blocks correctly pass CacheControl through to the Anthropic image block conversion.


3382-3383: CacheControl preserved in toBifrostResponsesImageBlock.

Image blocks correctly include CacheControl when converting to Bifrost format.

core/providers/anthropic/chat.go (6)

65-68: CacheControl correctly propagated for tools.

Tools now properly include CacheControl when converting from Bifrost to Anthropic format.


150-153: CacheControl propagated in system content blocks.

System message content blocks correctly include CacheControl.


185-188: CacheControl propagated in tool result content blocks.

Tool result text blocks correctly preserve CacheControl.


242-245: CacheControl propagated in message content blocks.

Regular message text blocks correctly include CacheControl.


320-366: Refactored content processing looks good.

The switch-based content block processing is cleaner and properly handles:

  • Text blocks with CacheControl propagation
  • Tool use blocks with proper argument marshaling
  • Thinking blocks with signature preservation

The refactor maintains all existing functionality while adding CacheControl support.


505-508: CacheControl propagated in response content blocks.

ToAnthropicChatResponse correctly includes CacheControl when converting content blocks back to Anthropic format.

core/providers/bedrock/types.go (4)

159-160: CachePoint field added to BedrockSystemMessage.

The optional CachePoint field enables cache signaling for system messages in Bedrock requests.


285-287: CachePoint field added to BedrockTool.

Tools can now include cache points for prompt caching optimization.


645-646: Minor formatting changes to ETag fields.

These appear to be whitespace/alignment changes only with no functional impact.

Also applies to: 663-664, 677-678


187-201: New BedrockCachePoint types are well-structured.

The new types follow Go conventions and Bedrock's API patterns:

  • BedrockCachePointType as a typed string constant
  • BedrockCachePointTypeDefault as the standard value
  • BedrockCachePoint struct with the required type field

The implementation is consistent across the codebase with proper JSON struct tags and appropriate usage in content block handling.
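
For reference, the described shape is approximately the following sketch (the real definitions live in core/providers/bedrock/types.go and carry additional fields):

package sketch

type BedrockCachePointType string

const BedrockCachePointTypeDefault BedrockCachePointType = "default"

type BedrockCachePoint struct {
    Type BedrockCachePointType `json:"type"`
}

// Optional pointer + omitempty keeps existing request JSON unchanged unless a
// cache point is explicitly set.
type BedrockSystemMessage struct {
    Text       *string            `json:"text,omitempty"`
    CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"`
}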

core/providers/bedrock/chat.go (3)

80-120: Tool call processing logic is correct.

The refactored code properly:

  1. Checks for structured output tool and sets contentStr without adding to toolCalls
  2. Handles regular tool calls with proper argument marshaling
  3. Uses continue to skip adding structured output to toolCalls

This addresses the past review comment about structured output dropping normal content.


122-135: Reasoning content extraction is correct.

Reasoning blocks are properly extracted with text and signature preservation. The nil check on ReasoningText prevents panics.


138-142: MessageContent construction preserves both contentStr and contentBlocks.

Unlike the previous implementation that collapsed single text blocks to ContentStr, this version keeps both fields which preserves CacheControl metadata. This addresses the past review comment about dropping CacheControl.

tests/integrations/tests/test_anthropic.py (1)

1421-1614: Prompt-caching tests for Anthropic look correct and aligned with Bedrock coverage

The three tests exercise system, message, and tools checkpoints, assert non‑trivial cache write/read usage, and use approximate equality rather than brittle counter invariants. The cross‑provider parametrization and use of PROMPT_CACHING_LARGE_CONTEXT/PROMPT_CACHING_TOOLS are consistent with the Bedrock tests.

core/providers/bedrock/utils.go (1)

30-32: Cache-control propagation into Bedrock content blocks and tools looks consistent

Passing the model into convertToolConfig and emitting BedrockCachePoint for text/image/tool‑result blocks and tools when CacheControl is set is in line with the new BedrockContentBlock/BedrockTool schema and with how the tests construct prompt‑caching ToolConfig (tools plus a trailing cachePoint sentinel). The Nova guard in convertToolConfig is also a sensible safety check.

Also applies to: 323-348, 392-420, 599-667

tests/integrations/tests/test_bedrock.py (1)

1799-1821: Bedrock cache-usage validators are simple and aligned with provider semantics

The validate_cache_write/validate_cache_read helpers correctly assert positive cacheWriteInputTokens / cacheReadInputTokens and log the raw usage dict. This matches how Bedrock reports prompt-caching metrics and keeps the tests focused on “did we write/read from cache” rather than exact counter behavior.

core/providers/bedrock/responses.go (3)

1282-1290: Tool-level CachePoint ↔ CacheControl round-trip is coherent

Mapping Bedrock {"cachePoint": {"type": "default"}} entries into CacheControlTypeEphemeral on the last Bifrost tool (and back again by appending a CachePoint-only BedrockTool when tool.CacheControl != nil and the model is not Nova) matches the way the tests construct ToolConfig and keeps the conversion symmetric.

Also applies to: 1572-1578


2378-2385: CachePoint handling for system and content blocks is consistent in both directions

  • Bedrock → Bifrost: convertBedrockSystemMessageToBifrostMessages and the block.CachePoint branch in convertSingleBedrockMessageToBifrostMessages correctly attach an ephemeral CacheControl to the last content block of the last relevant message.
  • Bifrost → Bedrock: convertBifrostMessageToBedrockSystemMessages and convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks emit separate CachePoint-only system messages/content blocks whenever a CacheControl is present.

This matches the Bedrock schema (BedrockSystemMessage.CachePoint, BedrockContentBlock.CachePoint) and the new prompt-caching tests.

Also applies to: 2413-2419, 2450-2484, 2657-2665, 2737-2781


1623-1640: The cache token mapping is semantically correct and intentional—no changes needed.

The concern about both Bedrock fields being named ...InputTokens is understandable, but the split is justified. CacheWriteInputTokens represents cache creation cost (tokens written to cache), which correctly maps to OutputTokensDetails.CachedTokens. This follows the schema design where output token details explicitly document cache creation tokens as "input tokens used to create the cache entry." The mapping treats cache creation as part of completion cost, consistent with OpenAI's model and the broader codebase pattern (see Anthropic's separate CacheCreationInputTokens field).
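
Put concretely, the mapping being defended looks roughly like this (stand-in structs and a hypothetical mapCacheUsage helper; the real usage types carry many more fields):

package sketch

type BedrockCacheUsage struct {
    CacheReadInputTokens  int
    CacheWriteInputTokens int
}

type TokenDetails struct {
    CachedTokens int
}

type BifrostUsage struct {
    InputTokensDetails  TokenDetails
    OutputTokensDetails TokenDetails
}

// Cache reads are prior input served from cache, so they land under input-token
// details; cache writes are the cost of creating the cache entry and are
// accounted under output-token details, per the schema design discussed above.
func mapCacheUsage(b BedrockCacheUsage) BifrostUsage {
    return BifrostUsage{
        InputTokensDetails:  TokenDetails{CachedTokens: b.CacheReadInputTokens},
        OutputTokensDetails: TokenDetails{CachedTokens: b.CacheWriteInputTokens},
    }
}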

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 4ff38e2 to e4c62a0 Compare December 16, 2025 14:44
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/internal/testutil/prompt_caching.go (1)

277-443: Fix loop variable capture in parallel subtests in RunPromptCachingTest.

The subtests call t.Parallel() and close over i and query from the for i, query := range testQueries loop. Without rebinding, all parallel subtests can see the final values of i/query, breaking the “second query” expectations (e.g., cached‑tokens checks).

Rebind per iteration before t.Run:

-    for i, query := range testQueries {
-        t.Run(query.name, func(t *testing.T) {
+    for i, query := range testQueries {
+        i, query := i, query // capture loop variables for this subtest
+        t.Run(query.name, func(t *testing.T) {
             userMessage := schemas.ChatMessage{
                 Role: schemas.ChatMessageRoleUser,
@@
-                if i == 1 {
+                if i == 1 {
                     expectations.ProviderSpecific = map[string]interface{}{
                         "min_cached_tokens_percentage": 0.90, // 90% minimum
                         "query_index":                  i,
                     }
                 }
@@
-                if i == 0 {
+                if i == 0 {
                     ...
-                } else if i == 1 {
+                } else if i == 1 {
                     ...
                 }
             })
         }

This keeps each subtest’s logs and expectations aligned with the intended query index even when they run in parallel.
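
A self-contained illustration of the rebinding pattern (hypothetical test; on Go 1.22+ per-iteration loop variables make the rebind a no-op, but it remains harmless):

package testutil_example

import (
    "fmt"
    "testing"
)

// TestQueriesInParallel demonstrates the rebinding pattern. name is a
// fresh per-iteration local, while i/query are the loop variables; on
// Go < 1.22, removing the rebind lets parallel subtests observe the final
// loop values and the comparison below would fail.
func TestQueriesInParallel(t *testing.T) {
    queries := []string{"first query", "second query"}
    for i, query := range queries {
        i, query := i, query // capture loop variables for this subtest
        name := fmt.Sprintf("%d-%s", i, query)
        t.Run(name, func(t *testing.T) {
            t.Parallel()
            if got := fmt.Sprintf("%d-%s", i, query); got != name {
                t.Fatalf("closure captured %q, want %q", got, name)
            }
        })
    }
}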

core/providers/anthropic/chat.go (1)

42-72: Avoid dropping CacheControl when collapsing single text content to ContentStr in ToAnthropicChatRequest.

You correctly propagate CacheControl from ChatContentBlock into AnthropicContentBlock for system, tool‑result, and normal messages, but the single‑block optimization:

if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
    anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
} else if len(content) > 0 {
    anthropicMsg.Content = AnthropicContent{ContentBlocks: content}
}

drops content[0].CacheControl entirely. That means a Bifrost request with a single text block marked for caching won’t send cache_control to Anthropic.

Align this with the Bedrock fix by only collapsing when there is no cache control:

-        // Set content
-        if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
-            // Single text content can be string
-            anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
-        } else if len(content) > 0 {
-            // Multiple content blocks
-            anthropicMsg.Content = AnthropicContent{ContentBlocks: content}
-        }
+        // Set content
+        if len(content) == 1 &&
+            content[0].Type == AnthropicContentBlockTypeText &&
+            content[0].CacheControl == nil {
+            // Single plain text content can be string
+            anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
+        } else if len(content) > 0 {
+            // Preserve blocks when there are multiple items or cache_control is set
+            anthropicMsg.Content = AnthropicContent{ContentBlocks: content}
+        }

This keeps the fast path while preserving cache semantics for cached content.

Also applies to: 141-201, 218-285

♻️ Duplicate comments (4)
tests/integrations/tests/test_bedrock.py (1)

1623-1797: Fix Bedrock client fixture usage in prompt‑caching tests (still calling the fixture function).

test_21_prompt_caching_system, test_22_prompt_caching_messages, and test_23_prompt_caching_tools call bedrock_client.converse(...) but don’t accept bedrock_client as a fixture argument, so bedrock_client here resolves to the fixture function, not an injected client. This will fail at runtime ('function' object has no attribute 'converse').

Update the signatures to use the injected fixture and (optionally) make test_config available instead of _test_config:

-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_21_prompt_caching_system(self, _test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_21_prompt_caching_system(self, bedrock_client, test_config, provider, model):

@@
-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_22_prompt_caching_messages(self, _test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_22_prompt_caching_messages(self, bedrock_client, test_config, provider, model):

@@
-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_23_prompt_caching_tools(self, _test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_23_prompt_caching_tools(self, bedrock_client, test_config, provider, model):

After this change, the .converse() calls will operate on the actual client instance, not the fixture function.

core/providers/bedrock/responses.go (3)

1245-1292: Bidirectional CacheControl↔CachePoint wiring for tools/content looks coherent

The new logic that:

  • Reads tool.CachePoint entries from Bedrock’s ToolConfig.Tools and applies an ephemeral CacheControl to the last bifrostReq.Params.Tools entry, and
  • Emits separate BedrockTool{CachePoint: ...} entries whenever a Bifrost tool has CacheControl (for non‑Nova models),
  • Emits / consumes CachePoint blocks around content via convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks and block.CachePoint handling in convertSingleBedrockMessageToBifrostMessages,

is internally consistent and symmetric with the chat-path changes.

This relies on the Bedrock contract that cache-point tools and cache-point content blocks only appear after the corresponding function/content they annotate; given earlier discussion that the provider enforces this, the current behavior is reasonable.

#!/bin/bash
# Sanity-check that CacheControl is only ever translated to BedrockCachePointTypeDefault
# and that Nova models are gated consistently.
rg -n "CachePointTypeDefault" core/providers/bedrock
rg -n "IsNovaModel" core/providers

Also applies to: 1535-1587, 2737-2781


1702-1713: CacheRead/CacheWrite token mapping still overloads OutputTokensDetails for input metrics

ToBifrostResponsesResponse maps:

  • response.Usage.CacheReadInputTokens → Usage.InputTokensDetails.CachedTokens
  • response.Usage.CacheWriteInputTokens → Usage.OutputTokensDetails.CachedTokens

and ToBedrockConverseResponse inverts this by reading InputTokensDetails into CacheReadInputTokens and OutputTokensDetails into CacheWriteInputTokens.

Semantically, both Bedrock metrics are input token counts, so storing cache-write values under OutputTokensDetails is confusing and makes the schema harder to reason about (same concern as the earlier review on this block). If feasible, consider either:

  • Introducing a dedicated “cache_write_input_tokens” style field in ResponsesResponseInputTokens, or
  • At minimum documenting that OutputTokensDetails.CachedTokens carries cache‑write input tokens for Bedrock/Anthropic rather than cached output tokens.

Right now this is more a schema semantics issue than a functional bug, but it’s easy for downstream consumers to misinterpret.

#!/bin/bash
# Inspect all usages of InputTokensDetails/OutputTokensDetails.CachedTokens
rg -n "InputTokensDetails" core -C3
rg -n "OutputTokensDetails" core -C3

Also applies to: 1623-1641


2381-2385: Centralized Bedrock system-message conversion with CachePoint looks good

Routing system messages through convertBedrockSystemMessageToBifrostMessages and:

  • Emitting system ResponsesMessage entries per sysMsg.Text, and
  • Translating sysMsg.CachePoint into an ephemeral CacheControl on the last content block of the last system message,

keeps system cache hints in sync with the non‑system content path and avoids scattered special‑cases.

Given previous clarification that Bedrock only emits CachePoint after at least one content block, the silent no-op when bifrostMessages is empty is acceptable.

Also applies to: 2450-2484

🧹 Nitpick comments (1)
core/providers/bedrock/utils.go (1)

320-351: CacheControl→CachePoint mapping for text/image blocks is consistent but drops TTL

Using per-block temporaries in convertToolMessage and convertContentBlock to attach a BedrockCachePoint{Type: BedrockCachePointTypeDefault} when block.CacheControl != nil correctly propagates “has cache hint” from Bifrost to Bedrock for both text and image content.

Note that any CacheControl.TTL is currently ignored for Bedrock; if TTL semantics become important for Bedrock later, you’ll need either a provider-specific TTL mapping or a short comment noting that TTL is intentionally only supported on OpenAI today.

Also applies to: 389-421

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ff38e2 and e4c62a0.

📒 Files selected for processing (22)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/prompt_caching.go (3 hunks)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/anthropic_test.go (2 hunks)
  • core/providers/anthropic/chat.go (6 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/bedrock_test.go (2 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (6 hunks)
  • core/providers/openai/responses_marshal_test.go (0 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
  • core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/anthropic/types.go
  • core/providers/anthropic/anthropic_test.go
  • core/internal/testutil/test_retry_framework.go
  • core/schemas/responses.go
  • core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • tests/integrations/config.yml
  • core/providers/anthropic/utils.go
  • core/internal/testutil/account.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/bedrock_test.go
  • core/internal/testutil/prompt_caching.go
  • core/providers/bedrock/utils.go
  • core/providers/anthropic/chat.go
  • core/schemas/chatcompletions.go
  • core/providers/bedrock/types.go
  • core/providers/anthropic/responses.go
  • tests/integrations/tests/utils/common.py
  • core/providers/bedrock/chat.go
  • core/providers/openai/types.go
  • tests/integrations/tests/test_anthropic.py
  • tests/integrations/tests/test_bedrock.py
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/anthropic/utils.go
  • core/internal/testutil/account.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/bedrock_test.go
  • core/internal/testutil/prompt_caching.go
  • core/providers/bedrock/utils.go
  • core/providers/anthropic/chat.go
  • core/schemas/chatcompletions.go
  • core/providers/bedrock/types.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/chat.go
  • core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
🧬 Code graph analysis (9)
core/providers/anthropic/utils.go (2)
core/providers/anthropic/types.go (2)
  • AnthropicContentBlockTypeImage (205-205)
  • AnthropicImageSource (234-239)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/bedrock/responses.go (4)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
core/providers/bedrock/types.go (5)
  • BedrockTool (284-287)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockSystemMessage (156-160)
  • BedrockContentBlock (163-190)
core/schemas/responses.go (3)
  • ResponsesMessage (318-331)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
core/internal/testutil/prompt_caching.go (2)
core/schemas/chatcompletions.go (4)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/anthropic/chat.go (3)
core/schemas/chatcompletions.go (4)
  • CacheControl (610-613)
  • ChatContentBlock (592-602)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatAssistantMessageToolCall (702-708)
core/providers/anthropic/types.go (3)
  • AnthropicContentBlockTypeText (204-204)
  • AnthropicContentBlockTypeToolUse (206-206)
  • AnthropicContentBlockTypeThinking (212-212)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/anthropic/responses.go (4)
core/schemas/responses.go (4)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
core/providers/bedrock/chat.go (3)
core/schemas/chatcompletions.go (6)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatAssistantMessageToolCall (702-708)
  • ChatAssistantMessageToolCallFunction (711-714)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (135-135)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/openai/types.go (2)
core/schemas/responses.go (3)
  • ResponsesMessage (318-331)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesToolMessage (469-489)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
tests/integrations/tests/test_anthropic.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/utils/common.py (1)
  • assert_valid_chat_response (801-833)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/test_anthropic.py (2)
  • validate_cache_write (1669-1681)
  • validate_cache_read (1683-1695)
🔇 Additional comments (28)
core/internal/testutil/account.go (2)

21-56: Extending TestScenarios with PromptCaching looks consistent with the feature matrix.

The new PromptCaching flag slots cleanly alongside other scenario booleans and is wired into provider configs below; no structural or API‑surface issues here.


593-660: Anthropic/Bedrock scenarios enabling PromptCaching are aligned with the stack’s intent.

Setting PromptCaching: true only for Anthropic and Bedrock matches the updated integrations config and new prompt‑caching tests, while leaving other providers unchanged. No additional wiring seems required in this file.

core/providers/anthropic/utils.go (1)

94-103: Image block conversion now correctly carries CacheControl and uses the typed content‑block kind.

Using AnthropicContentBlockTypeImage and forwarding block.CacheControl into the Anthropic image block is consistent with the rest of the cache‑control plumbing; initialization remains safe even when ImageURLStruct is nil.

core/providers/bedrock/bedrock_test.go (1)

65-107: Bedrock comprehensive test config correctly wires prompt‑caching model and scenario.

Using claude-4.5-sonnet for both ReasoningModel and PromptCachingModel, and enabling PromptCaching: true in scenarios, matches the broader stack (Claude prompt caching on Bedrock) and reuses existing S3/role batch/file params without introducing new test‑only knobs.

core/schemas/chatcompletions.go (2)

244-250: Adding CacheControl to ChatTool is a safe, backwards‑compatible extension.

The optional cache_control field lets providers receive tool‑level cache hints while keeping existing JSON payloads unchanged when absent, and it reuses the shared CacheControl schema so chat and responses stay aligned.


591-613: Shared CacheControl on ChatContentBlock cleanly models provider‑specific caching hints.

Exposing CacheControl *CacheControl on chat content blocks, backed by the new CacheControlType/CacheControl definitions, gives Anthropic/Bedrock a place to attach ephemeral/TTL metadata without impacting OpenAI’s schema; the reuse of this type across chat and responses keeps cache semantics consistent.
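
A minimal sketch of what a caller can now express (Text is shown as a *string via schemas.Ptr; treat exact field shapes as assumptions):

// A user text block flagged for ephemeral caching. Anthropic/Bedrock
// adapters translate the block-level CacheControl into cache_control /
// cachePoint entries; the OpenAI adapter strips it when marshaling.
block := schemas.ChatContentBlock{
    Type:         schemas.ChatContentBlockTypeText,
    Text:         schemas.Ptr("...large, stable context goes here..."),
    CacheControl: &schemas.CacheControl{Type: schemas.CacheControlTypeEphemeral},
    // CacheControl also carries a TTL field for providers that support it.
}
_ = block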

core/providers/bedrock/types.go (2)

155-201: BedrockCachePoint additions correctly surface Bedrock’s cache metadata.

Adding CachePoint to BedrockSystemMessage and BedrockContentBlock, alongside the BedrockCachePointType/BedrockCachePoint definitions, gives the Bedrock adapter a structured place to emit/consume cache points without altering existing payloads when unset.


641-678: The review comment regarding mandatory ETag fields on Bedrock file request types cannot be verified without access to the codebase to inspect all call sites and coordinate with the broader PR stack.

tests/integrations/tests/utils/common.py (2)

88-281: PROMPT_CACHING_TOOLS test data is well‑structured and consistent with existing tool schemas.

The added tool definitions mirror the existing tool parameter shape (type/required/properties) and give prompt‑caching tests a rich set of tool payloads without affecting runtime code.


421-481: PROMPT_CACHING_LARGE_CONTEXT is appropriate as a high‑token test payload.

Using a long, repeated legal‑framework prompt ensures prompt‑caching tests hit realistic context sizes without introducing any behavioral complexity into the utilities module.

tests/integrations/config.yml (3)

109-128: Updating bedrock.chat to the Claude 4.5 Bedrock ARN is consistent with the new caching flows.

Pointing the core Bedrock chat model at us.anthropic.claude-sonnet-4-5-20250929-v1:0 matches the rest of the stack’s focus on Claude 4.5 for reasoning/prompt‑caching while leaving other capabilities on the stable 3.5 sonnet ARN.


148-320: Prompt‑caching scenario flags and capability mapping are coherent across providers.

Enabling prompt_caching only for Anthropic and Bedrock, while explicitly disabling it for OpenAI, Gemini, and Cohere, matches the current implementation surface; wiring prompt_caching: "chat" in scenario_capabilities lets the harness reuse existing chat model selection without extra special‑cases.


340-363: Adding prompt_caching to scenario_capabilities correctly reuses the chat capability.

Treating prompt‑caching scenarios as a chat capability keeps the configuration model simple and avoids duplicating per‑provider model keys just for caching tests.

core/providers/openai/types.go (1)

172-212: Deep‑copying responses messages before stripping CacheControl fixes the prior mutation hazard.

MarshalJSON now builds a messagesCopy slice and deep‑copies ContentBlocks and tool‑output blocks before nulling CacheControl, so OpenAI wiring stays cache‑control‑free while the original ResponsesMessage graph remains untouched for other consumers.
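
Reduced to a self-contained sketch, the pattern looks like this (hypothetical message/block types; the real implementation operates on ResponsesMessage and its content and tool-output blocks):

package main

import (
    "encoding/json"
    "fmt"
)

type block struct {
    Text         string      `json:"text"`
    CacheControl interface{} `json:"cache_control,omitempty"`
}

type message struct {
    Role   string  `json:"role"`
    Blocks []block `json:"content"`
}

// MarshalJSON deep-copies the blocks before nulling CacheControl, so the
// wire payload stays cache-control-free while the caller's value is untouched.
func (m message) MarshalJSON() ([]byte, error) {
    type plain message // local alias without MarshalJSON, avoids recursion
    cp := m
    cp.Blocks = make([]block, len(m.Blocks))
    copy(cp.Blocks, m.Blocks)
    for i := range cp.Blocks {
        cp.Blocks[i].CacheControl = nil
    }
    return json.Marshal(plain(cp))
}

func main() {
    msg := message{Role: "user", Blocks: []block{{
        Text:         "hi",
        CacheControl: map[string]string{"type": "ephemeral"},
    }}}
    out, _ := json.Marshal(msg)
    fmt.Println(string(out))                // no cache_control on the wire
    fmt.Println(msg.Blocks[0].CacheControl) // original value still present
}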

tests/integrations/tests/test_bedrock.py (3)

37-40: Prompt-caching section in docstring looks consistent with new tests.

Entries 21–23 line up with test_21_prompt_caching_system, test_22_prompt_caching_messages, and test_23_prompt_caching_tools.


51-68: Imports for prompt-caching test data are correct and localized.

Bringing in PROMPT_CACHING_TOOLS and PROMPT_CACHING_LARGE_CONTEXT from .utils.common matches how they’re used below.


1799-1821: Cache usage helpers for Bedrock prompt‑caching look appropriate.

validate_cache_write / validate_cache_read correctly operate on the Bedrock-style usage dict (inputTokens, cacheWriteInputTokens, cacheReadInputTokens) and only assert that the relevant counter is present and > 0, which should be robust across warm-cache runs.

core/providers/bedrock/chat.go (1)

62-191: Bedrock response → Bifrost conversion correctly preserves cache, tools, and reasoning metadata.

The new per‑block loop handles text, CachePoint → CacheControl (ephemeral), tool uses (including structured‑output via the context key), and reasoning content without dropping data, and the usage mapping exposes CacheReadInputTokens / CacheWriteInputTokens via PromptTokensDetails and CompletionTokensDetails consistently with other providers.

core/internal/testutil/prompt_caching.go (1)

245-270: Ephemeral CacheControl on check_compliance_status tool is consistent with prompt‑caching design.

Marking this tool as CacheControl{Type: Ephemeral} matches how tools are used in the prompt‑caching scenarios and how provider adapters propagate tool‑level cache control.

core/providers/anthropic/chat.go (1)

294-367: Anthropic ↔ Bifrost response conversions now propagate cache, tools, and reasoning correctly.

ToBifrostChatResponse builds ChatContentBlocks with CacheControl, serializes tool‑use inputs into ChatAssistantMessageToolCall.Function.Arguments (defaulting to "{}"), and maps thinking blocks into ChatReasoningDetails and aggregated reasoningText. ToAnthropicChatResponse mirrors this, re‑emitting text content blocks with CacheControl and reconstructing tool_use blocks from Bifrost tool calls, while exposing cache read/creation tokens via Usage.CacheReadInputTokens / CacheCreationInputTokens. This looks consistent with the prompt‑caching tests and other providers.

Also applies to: 421-541
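
The defensive serialization of tool-use inputs can be pictured like this (a standalone sketch; the real code reads the Anthropic tool_use block's Input field):

package main

import (
    "encoding/json"
    "fmt"
)

// argumentsJSON renders a tool-use input as a JSON object string, falling
// back to "{}" when the input is absent or cannot be marshaled, so
// downstream tool-call parsing always receives valid JSON.
func argumentsJSON(input map[string]interface{}) string {
    if input == nil {
        return "{}"
    }
    b, err := json.Marshal(input)
    if err != nil {
        return "{}"
    }
    return string(b)
}

func main() {
    fmt.Println(argumentsJSON(nil))                                  // {}
    fmt.Println(argumentsJSON(map[string]interface{}{"city": "SF"})) // {"city":"SF"}
}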

tests/integrations/tests/test_anthropic.py (3)

39-41: Docstring updates for prompt‑caching tests match the new cases.

Entries 28–30 correspond directly to test_28_prompt_caching_system, test_29_prompt_caching_messages, and test_30_prompt_caching_tools.


52-94: New prompt‑caching imports integrate cleanly with existing utilities.

Importing PROMPT_CACHING_TOOLS and PROMPT_CACHING_LARGE_CONTEXT from .utils.common aligns with their usage in tests 28–30.


1421-1614: Anthropic prompt‑caching tests and cache helpers are well-structured and consistent with provider behavior.

  • Tests 28–30 correctly:

    • Parameterize over prompt_caching providers/models.
    • Use get_provider_anthropic_client(provider) plus format_provider_model.
    • Attach {"type": "ephemeral"} cache_control at system, message, and tool levels in the expected Anthropic wire format.
    • Assert non‑trivial cache creation on the first call and non‑trivial cache reads on the second; for system/messages, they also check read ≈ write (abs(diff) < 100), which matches how cache metrics are exposed by ToAnthropicChatResponse.
  • validate_cache_write / validate_cache_read:

    • Log input_tokens, cache_creation_input_tokens, and cache_read_input_tokens.
    • Require the relevant counter to exist and be > 0 without over‑constraining warm‑cache behavior, addressing earlier strictness concerns.

Overall these tests should give good signal on Anthropic‑style prompt caching across providers.

Also applies to: 1669-1695

core/providers/anthropic/responses.go (5)

1340-1347: Shared Anthropic↔Bifrost conversion entrypoints with system content are aligned

Updating ToBifrostResponsesRequest and ToBifrostResponsesResponse to call:

ConvertAnthropicMessagesToBifrostMessages(request.Messages, request.System, false, provider == schemas.Bedrock)
...
ConvertAnthropicMessagesToBifrostMessages([]AnthropicMessage{tempMsg}, nil, true, false)

with the new (anthropicMessages, systemContent, isOutputMessage, keepToolsGrouped) signature centralizes system handling and tool grouping for both request and response paths. This matches how ConvertBifrostMessagesToAnthropicMessages already returns regular+system content separately and keeps the two directions symmetric.

No correctness issues stand out here.

Also applies to: 1627-1649


2260-2267: Tool-result and text content now round‑trip CacheControl between Anthropic and Bifrost

Across:

  • AnthropicContentBlockTypeToolResult handling (for both normal and MCP/tool outputs), and
  • convertAnthropicContentBlocksToResponsesMessages (text case for both input and output messages),

you now attach CacheControl from AnthropicContentBlock.CacheControl to the corresponding ResponsesMessageContentBlock.CacheControl.

Combined with convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks and convertContentBlockToAnthropic wiring the reverse direction, this gives a clean end‑to‑end mapping for cache hints on tool outputs and message text.

Behavior looks correct and consistent with the cache‑point design in Bedrock.

Also applies to: 2493-2499, 2558-2564, 2370-2399


3325-3347: Anthropic↔Bifrost tool/content CacheControl propagation is symmetric

The new plumbing:

  • In convertContentBlockToAnthropic, copying block.CacheControl for text and image blocks (the latter via ChatContentBlock.CacheControl into ConvertToAnthropicImageBlock), and
  • In toBifrostResponsesImageBlock, copying block.CacheControl back into the image content block, and
  • In convertAnthropicToolToBifrost / convertBifrostToolToAnthropic, shuttling tool.CacheControl between AnthropicTool and ResponsesTool,

gives a symmetric mapping for both content‑level and tool‑level cache hints.

This aligns Anthropic’s new CacheControl fields with the unified schemas.CacheControl type and matches the Bedrock/OpenAI handling patterns in the stack.

Also applies to: 3376-3383, 2982-3085, 3183-3267


2065-2097: System-content conversion now preserves block-level CacheControl

convertAnthropicSystemToBifrostMessages:

  • Emits a system message for ContentStr when non-empty, and
  • For ContentBlocks, builds ResponsesMessageContentBlock entries with Text and CacheControl copied from each AnthropicContentBlock, only appending a system message when at least one text block exists.

This avoids empty system messages and ensures Anthropic system-level cache hints survive into Bifrost. The block Type is set to an output-text variant even for input/system messages—verify that downstream code handles both input and output text types equivalently.


2150-2184: Wire CacheControl for symmetry in output-message accumulation

In convertAnthropicContentBlocksToResponsesMessagesGrouped, the output-message branch accumulates text content without propagating block.CacheControl, while the input-message branch correctly includes it. Adding CacheControl: block.CacheControl to the accumulated text content blocks ensures symmetric handling and prevents issues if this function is reused for output messages in the future.

 accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{
     Type: schemas.ResponsesOutputMessageContentTypeText,
     Text: block.Text,
+    CacheControl: block.CacheControl,
 })

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from e4c62a0 to 5cfbabb Compare December 17, 2025 06:46
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
core/providers/bedrock/responses.go (1)

1706-1712: Semantically incorrect token mapping: CacheWriteInputTokens mapped from OutputTokensDetails.

CacheWriteInputTokens represents tokens written to the input cache and should be sourced from InputTokensDetails, not OutputTokensDetails. Both CacheReadInputTokens and CacheWriteInputTokens are input-side metrics.

The reverse mapping at lines 1636-1640 has the same semantic issue—it places CacheWriteInputTokens into OutputTokensDetails.

Consider updating both mappings to use InputTokensDetails consistently:

 if bifrostResp.Usage.InputTokensDetails != nil {
     bedrockResp.Usage.CacheReadInputTokens = bifrostResp.Usage.InputTokensDetails.CachedTokens
 }
-if bifrostResp.Usage.OutputTokensDetails != nil {
-    bedrockResp.Usage.CacheWriteInputTokens = bifrostResp.Usage.OutputTokensDetails.CachedTokens
-}
+// Note: CacheWriteInputTokens is also an input metric; consider storing it in InputTokensDetails
+// or adding a dedicated field if the schema supports it

And the reverse mapping (lines 1636-1640):

 if response.Usage.CacheReadInputTokens > 0 {
     bifrostResp.Usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
         CachedTokens: response.Usage.CacheReadInputTokens,
     }
 }
-if response.Usage.CacheWriteInputTokens > 0 {
-    bifrostResp.Usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{
-        CachedTokens: response.Usage.CacheWriteInputTokens,
-    }
-}
+// CacheWriteInputTokens should also map to InputTokensDetails or a separate field
🧹 Nitpick comments (3)
tests/integrations/tests/test_anthropic.py (1)

1561-1608: Prompt caching tools test validates tool caching.

The test correctly adds cache_control to the last tool and validates cache creation/read behavior. Note that this test doesn't assert abs(cache_write_tokens - cache_read_tokens) < 100 like the other tests.

Consider adding the same relative validation as tests 28 and 29 for consistency:

         cache_read_tokens = validate_cache_read(response2.usage, "Second request")
         
+        # Validate that cache read tokens are approximately equal to cache creation tokens
+        assert abs(cache_write_tokens - cache_read_tokens) < 100, \
+            f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})"
+        
         print(f"✓ Tools caching validated - Cache created: {cache_write_tokens} tokens, "
               f"Cache read: {cache_read_tokens} tokens")
core/providers/anthropic/chat.go (1)

274-281: Consider edge case: Single text content with CacheControl loses the annotation.

When there's exactly one text content block (lines 275-277), the code uses ContentStr, which doesn't carry CacheControl. If the original block had CacheControl set, it would be silently dropped.

Consider checking if CacheControl is set before simplifying to ContentStr:

 // Set content
-if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
+if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText && content[0].CacheControl == nil {
     // Single text content can be string
     anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
 } else if len(content) > 0 {
tests/integrations/tests/test_bedrock.py (1)

1799-1821: Consider consolidating helper functions with Anthropic tests.

The validate_cache_write and validate_cache_read functions here use dictionary key access (usage.get('cacheWriteInputTokens')), while the similar functions in test_anthropic.py use attribute access (getattr(usage, 'cache_creation_input_tokens')). This suggests the response structures differ between providers.

If both providers return the same Bifrost-normalized structure, consider extracting these helpers to a shared utility module to reduce duplication and ensure consistent validation logic.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4c62a0 and 5cfbabb.

📒 Files selected for processing (22)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/prompt_caching.go (3 hunks)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/anthropic_test.go (2 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/bedrock_test.go (2 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (8 hunks)
  • core/providers/openai/responses_marshal_test.go (0 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
  • core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • core/providers/anthropic/types.go
  • core/providers/anthropic/anthropic_test.go
  • tests/integrations/config.yml
  • tests/integrations/tests/utils/common.py
  • core/internal/testutil/test_retry_framework.go
  • core/providers/anthropic/utils.go
  • core/schemas/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/types.go
  • core/schemas/utils.go
  • core/internal/testutil/prompt_caching.go
  • core/internal/testutil/account.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/chat.go
  • core/providers/anthropic/chat.go
  • core/schemas/chatcompletions.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/types.go
  • tests/integrations/tests/test_anthropic.py
  • core/providers/bedrock/responses.go
  • tests/integrations/tests/test_bedrock.py
  • core/providers/anthropic/responses.go
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/types.go
  • core/schemas/utils.go
  • core/internal/testutil/prompt_caching.go
  • core/internal/testutil/account.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/chat.go
  • core/providers/anthropic/chat.go
  • core/schemas/chatcompletions.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/types.go
  • core/providers/bedrock/responses.go
  • core/providers/anthropic/responses.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
🧬 Code graph analysis (5)
core/providers/openai/types.go (1)
core/schemas/responses.go (3)
  • ResponsesMessage (318-331)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesToolMessage (469-489)
core/internal/testutil/prompt_caching.go (2)
core/schemas/chatcompletions.go (4)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/bedrock/chat.go (3)
core/schemas/chatcompletions.go (6)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • ChatAssistantMessageToolCall (702-708)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatReasoningDetails (739-746)
  • BifrostReasoningDetailsTypeText (735-735)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (135-135)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/bedrock/utils.go (3)
core/schemas/chatcompletions.go (4)
  • ChatMessage (471-480)
  • ChatContentBlockTypeText (584-584)
  • CacheControl (610-613)
  • ChatContentBlock (592-602)
core/providers/bedrock/types.go (5)
  • BedrockSystemMessage (156-160)
  • BedrockCachePoint (199-201)
  • BedrockContentBlock (163-190)
  • BedrockToolConfig (278-281)
  • BedrockTool (284-287)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/providers/anthropic/responses.go (4)
core/schemas/responses.go (6)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesOutputMessageContentTypeText (396-396)
  • ResponsesInputMessageContentBlockTypeText (392-392)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/types.go (1)
  • AnthropicContentBlockTypeText (204-204)
🔇 Additional comments (44)
core/providers/openai/types.go (1)

178-211: Deep copy implementation correctly addresses the mutation concern.

The implementation properly creates copies at each level before stripping CacheControl:

  • messagesCopy for the outer array
  • contentCopy and blockCopy for content blocks
  • toolMsgCopy, outputCopy, and block copies for tool message outputs

This prevents mutation of the original input data during marshaling.

One consideration: as noted in the past review comment by Pratham-Mishra04, this allocation pattern could be optimized using a custom JSON encoder with field skip logic for high-throughput scenarios. However, for correctness, this approach is sound.

core/schemas/utils.go (1)

1042-1045: LGTM!

The IsNovaModel function follows the established pattern of other model detection utilities (IsAnthropicModel, IsMistralModel, IsGeminiModel) in this file. The implementation is consistent and straightforward.

core/schemas/chatcompletions.go (3)

246-249: LGTM - CacheControl field added to ChatTool.

The addition of CacheControl to ChatTool enables cache control metadata on tool definitions, which aligns with the PR's goal of supporting prompt caching across providers.


599-602: LGTM - CacheControl field added to ChatContentBlock.

The field addition with the explanatory comment about provider-specific usage (Anthropic, Bedrock) provides good context for maintainers.


604-613: No duplicate type definitions found. CacheControlType and CacheControl are defined only once in chatcompletions.go (lines 604 and 610, respectively).

Likely an incorrect or invalid review comment.

core/internal/testutil/account.go (4)

43-43: LGTM - PromptCaching field added to TestScenarios.

The new PromptCaching boolean flag enables gating prompt caching tests per provider, which is a clean approach for provider-specific feature testing.


609-609: LGTM - PromptCaching enabled for Anthropic.

Correctly enables prompt caching tests for the Anthropic provider.


643-643: LGTM - PromptCaching enabled for Bedrock.

Correctly enables prompt caching tests for the Bedrock provider.


553-576: Verify: OpenAI has PromptCachingModel but PromptCaching is not enabled in Scenarios.

The OpenAI configuration defines PromptCachingModel: "gpt-4.1" (line 553) but doesn't set PromptCaching: true in the Scenarios struct. This differs from Anthropic and Bedrock which both have PromptCaching: true set.

Is this intentional because OpenAI uses a different caching mechanism, or should PromptCaching: true be added to the OpenAI scenarios as well?

core/providers/bedrock/bedrock_test.go (2)

73-77: LGTM - Test configuration expanded with new model and parameter fields.

The additions properly configure the Bedrock test suite with:

  • EmbeddingModel for embedding tests
  • ReasoningModel and PromptCachingModel for respective feature tests
  • BatchExtraParams and FileExtraParams for S3-based operations

The configuration follows established patterns in the test framework.


95-95: LGTM - PromptCaching enabled for Bedrock tests.

This enables the prompt caching integration tests for Bedrock, which aligns with the new CacheControl support added in this PR.

core/internal/testutil/prompt_caching.go (3)

265-268: LGTM: Cache control added to last tool for testing prompt caching.

Adding CacheControl with Ephemeral type to the last tool aligns with the Anthropic prompt caching pattern where cache breakpoints are placed on specific content items.


281-284: Improved gating via scenario-based configuration.

Switching from provider-specific OpenAI check to testConfig.Scenarios.PromptCaching makes the test more flexible and properly controlled via configuration. This allows any provider supporting prompt caching to run these tests.


295-303: System message now uses ContentBlocks with ephemeral cache control.

The change from ContentStr to ContentBlocks with a text block containing CacheControl is consistent with the broader cache control implementation across providers. This structure allows the cache point to be signaled correctly for Anthropic and Bedrock.
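
A rough sketch of the resulting fixture shape (wrapper and role names are taken from snippets in this review and may not match the file exactly; largeStableContext is a placeholder):

// Gate on the scenario flag, then send a system message whose single text
// block carries an ephemeral cache hint.
if !testConfig.Scenarios.PromptCaching {
    t.Skip("prompt caching not enabled for this provider")
}
systemMessage := schemas.ChatMessage{
    Role: schemas.ChatMessageRoleSystem,
    Content: &schemas.ChatMessageContent{
        ContentBlocks: []schemas.ChatContentBlock{{
            Type:         schemas.ChatContentBlockTypeText,
            Text:         schemas.Ptr(largeStableContext),
            CacheControl: &schemas.CacheControl{Type: schemas.CacheControlTypeEphemeral},
        }},
    },
}
_ = systemMessage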

core/providers/bedrock/chat.go (4)

63-71: Clean per-block text content handling.

The iteration over response.Output.Message.Content with per-block processing for text content is well-structured. Each text block is properly converted to a ChatContentBlock with the correct type.


73-87: Structured output tool handling correctly integrated into the block loop.

The structured output check now properly combines the context key check with the tool name match in a single conditional. This addresses the previous review concern where the structured-output path could drop normal content when the tool wasn't used.


89-113: Regular tool call processing is thorough.

The tool call extraction properly handles:

  • JSON marshaling with fallback to string representation
  • Empty object default when input is nil
  • Correct index assignment using current toolCalls length
  • Proper ID and function name extraction

115-127: Reasoning content handling with signature support.

The reasoning content extraction properly:

  • Guards against nil ReasoningText
  • Extracts both text and signature
  • Accumulates reasoning text with newline separators
  • Uses correct BifrostReasoningDetailsTypeText type
core/providers/bedrock/types.go (3)

159-159: CachePoint field added to BedrockSystemMessage.

This enables cache point signaling for system messages in Bedrock requests.


187-201: New BedrockCachePoint type and ContentBlock field.

The BedrockCachePoint struct with BedrockCachePointTypeDefault = "default" correctly matches the AWS Bedrock API specification. Per the AWS docs, only the "default" type is accepted for cache point blocks.


285-286: CachePoint field added to BedrockTool.

This enables cache point signaling on tool definitions, which is useful for caching tool configurations across multiple requests.

core/providers/bedrock/utils.go (5)

30-32: convertToolConfig now accepts model parameter.

This enables model-specific logic (e.g., excluding Nova models from cache point support).


237-264: convertSystemMessages correctly emits CachePoint as separate BedrockSystemMessage.

Bedrock's API requires cache points to be in separate message blocks rather than inline. The function correctly:

  • Appends a text system message for each text block
  • Appends an additional BedrockSystemMessage with only CachePoint when CacheControl is present

This aligns with AWS Bedrock's cache point API structure.


330-337: Tool message content correctly adds CachePoint as separate blocks.

Both text and image blocks within tool results properly emit CachePoint as a separate BedrockContentBlock when the source block has CacheControl set.

Also applies to: 348-355


398-446: convertContentBlock returns slice and handles CachePoint correctly.

The function signature change to return []BedrockContentBlock allows it to emit both the content block and an optional CachePoint block. This pattern is applied consistently for text and image content types.


661-668: CachePoint added to tools with Nova model exclusion.

The logic correctly:

  • Checks if CacheControl is set on the tool
  • Excludes Nova models using schemas.IsNovaModel(model)
  • Appends a separate BedrockTool entry containing only CachePoint

This matches Bedrock's requirement for cache points to be separate entries in the tools array.
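
In code, the gate reads roughly like this (a sketch; the tool's other fields are elided):

// Append a cachePoint-only tool entry after the tool it annotates,
// skipping Nova models, which do not accept cache points in tools.
if tool.CacheControl != nil && !schemas.IsNovaModel(model) {
    bedrockTools = append(bedrockTools, BedrockTool{
        CachePoint: &BedrockCachePoint{Type: BedrockCachePointTypeDefault},
    })
}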

tests/integrations/tests/test_anthropic.py (3)

1421-1484: Prompt caching system message test is well-structured.

The test correctly:

  • Uses cross-provider parametrization for flexibility
  • Creates a system message with cache_control: {"type": "ephemeral"} on the large context
  • Makes two requests and validates cache creation then cache read
  • Uses relative comparison (abs(cache_write_tokens - cache_read_tokens) < 100) rather than strict equality

This addresses the previous review concern about overly strict cache counter assertions.


1485-1559: Prompt caching messages test properly validates cache behavior.

The test correctly places cache_control on the large context block within user message content and validates that the second request reads from cache.


1663-1689: Cache validation helpers are appropriately relaxed.

The validate_cache_write and validate_cache_read functions:

  • Assert the required attributes exist
  • Assert token counts are > 0
  • Log the actual values for debugging
  • Don't enforce strict counter==0 invariants that could cause flakiness

This addresses the previous review feedback about overly strict assertions.

core/providers/anthropic/chat.go (3)

65-67: LGTM: CacheControl propagation for tools.

Correctly propagates CacheControl from Bifrost tool definitions to Anthropic tool representations.


150-153: LGTM: CacheControl attached to text content blocks.

Consistently applies CacheControl to text blocks across system messages, tool results, and regular message content blocks.

Also applies to: 185-188, 242-245


320-365: LGTM: Refactored response content block handling.

The switch-based content block processing is cleaner and correctly handles text, tool use, and thinking blocks. The tool use JSON marshaling with fallback to empty object "{}" is a good defensive approach.

tests/integrations/tests/test_bedrock.py (3)

1623-1676: LGTM: Well-structured system message caching test.

Test correctly validates the prompt caching workflow:

  1. First request creates cache with system message checkpoint
  2. Second request reads from cache
  3. Validates cache write/read tokens are approximately equal

1677-1731: LGTM: Messages caching test properly validates cache behavior.

Test correctly places cachePoint after the large context content block within the user message, then verifies cache read on the subsequent request.


1732-1797: LGTM: Tools caching test with 12 tools.

Test correctly appends cachePoint after all tool definitions and validates cache behavior across two requests with different prompts but same tool config.

core/providers/bedrock/responses.go (6)

1283-1289: LGTM: CachePoint handling for tools with Nova exclusion.

Correctly applies ephemeral cache control to the last tool in the array when a CachePoint is encountered, with proper exclusion for Nova models that don't support cache points in tools.
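
The inverse direction can be pictured as follows (a sketch; only the cache-relevant branch is shown and the slice names are illustrative):

// A cachePoint-only tool entry marks the most recently converted Bifrost
// tool as ephemeral instead of producing a new tool.
for _, bedrockTool := range toolConfig.Tools {
    if bedrockTool.CachePoint != nil {
        if n := len(bifrostTools); n > 0 {
            bifrostTools[n-1].CacheControl = &schemas.CacheControl{
                Type: schemas.CacheControlTypeEphemeral,
            }
        }
        continue
    }
    // ...normal tool conversion elided...
}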


1571-1578: LGTM: Bifrost CacheControl to Bedrock CachePoint conversion.

Correctly converts Bifrost's CacheControl on tools to Bedrock's CachePoint format, with Nova model exclusion.


2450-2484: LGTM: System message to Bifrost conversion with CachePoint handling.

The updated convertBedrockSystemMessageToBifrostMessages correctly:

  1. Iterates over multiple system messages
  2. Applies ephemeral CacheControl to the last content block when CachePoint is encountered
  3. Creates proper Bifrost message structure for text content

2413-2419: LGTM: CacheControl to CachePoint conversion for system messages.

Correctly emits a BedrockCachePoint after a text block when the Bifrost content block has CacheControl set.
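
Both directions reduce to the same small pattern; a sketch, with variable names assumed:

// Bedrock → Bifrost: a cachePoint entry marks the preceding system text as cached.
if sysMsg.CachePoint != nil && len(contentBlocks) > 0 {
	contentBlocks[len(contentBlocks)-1].CacheControl = &schemas.CacheControl{
		Type: schemas.CacheControlTypeEphemeral,
	}
}

// Bifrost → Bedrock: a cached text block is followed by a standalone cachePoint entry.
if block.CacheControl != nil {
	systemMsgs = append(systemMsgs, BedrockSystemMessage{
		CachePoint: &BedrockCachePoint{Type: BedrockCachePointTypeDefault},
	})
}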


2656-2665: LGTM: CachePoint to CacheControl conversion in message content blocks.

Correctly applies ephemeral CacheControl to the last content block of the last message when a CachePoint is encountered during Bedrock-to-Bifrost conversion.


2774-2781: LGTM: Bifrost CacheControl to Bedrock CachePoint for content blocks.

Correctly appends a BedrockCachePoint block after content blocks that have CacheControl set.

core/providers/anthropic/responses.go (3)

2065-2097: LGTM! Past review concern addressed.

The implementation correctly handles system content conversion to Bifrost messages:

  1. The check at line 2086 (if len(contentBlocks) > 0) prevents empty ContentBlocks arrays from being appended, which directly addresses the concern raised in the past review comment.

  2. The filtering to text-only blocks (line 2078) is correct per Anthropic's API specification, where system messages can only contain text content.

  3. CacheControl is properly preserved from source blocks (line 2082), maintaining cache directives throughout the conversion pipeline.


2082-2082: LGTM! Consistent CacheControl propagation.

The CacheControl field is correctly propagated across all content block and tool conversion paths:

  • System content: Line 2082 preserves cache directives for system text blocks
  • Regular messages: Lines 2178, 2379, 2394 handle input and output text blocks
  • Tool results: Lines 2265, 2497, 2562 maintain cache control in function/MCP tool outputs
  • Conversion helpers: Lines 3332, 3343, 3382 ensure consistency in lower-level conversions
  • Tool definitions: Lines 3083, 3263 propagate cache control for tool schemas

The pattern is uniform and correctly uses pointer assignment for pass-through conversion semantics.

Also applies to: 2178-2178, 2265-2265, 2379-2379, 2394-2394, 2497-2497, 2562-2562, 3083-3083, 3263-3263, 3332-3332, 3343-3343, 3382-3382


1344-1344: LGTM! System content parameter correctly passed.

The addition of request.System to the ConvertAnthropicMessagesToBifrostMessages call correctly centralizes system content handling. The receiver function (line 1632) appropriately checks for nil before invoking the conversion helper, ensuring safe handling of optional system content.

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 5cfbabb to 8005eb3 Compare December 17, 2025 07:52
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/anthropic/chat.go (1)

479-492: Don’t drop reasoning details with nil Text but non‑empty Signature in ToAnthropicChatResponse

The current guard:

if reasoningDetail.Type == schemas.BifrostReasoningDetailsTypeText && reasoningDetail.Text != nil &&
    ((reasoningDetail.Text != nil && *reasoningDetail.Text != "") ||
        (reasoningDetail.Signature != nil && *reasoningDetail.Signature != "")) {
    content = append(content, AnthropicContentBlock{
        Type:      AnthropicContentBlockTypeThinking,
        Thinking:  reasoningDetail.Text,
        Signature: reasoningDetail.Signature,
    })
}

means any ReasoningDetails entry with Text == nil but a non‑empty Signature is silently dropped. That breaks the “preserve all fields, including when Text is nil/empty” requirement for reasoning round‑trips and loses Anthropic thinking‑signature information when mapping Bifrost → Anthropic. Based on learnings, this should be avoided.

You can fix this by only requiring that at least one of Text or Signature is present, and by not special‑casing empty text:

-        if choice.ChatNonStreamResponseChoice != nil && choice.Message != nil && choice.Message.ChatAssistantMessage != nil && choice.Message.ChatAssistantMessage.ReasoningDetails != nil {
-            for _, reasoningDetail := range choice.Message.ChatAssistantMessage.ReasoningDetails {
-                if reasoningDetail.Type == schemas.BifrostReasoningDetailsTypeText && reasoningDetail.Text != nil &&
-                    ((reasoningDetail.Text != nil && *reasoningDetail.Text != "") ||
-                        (reasoningDetail.Signature != nil && *reasoningDetail.Signature != "")) {
-                    content = append(content, AnthropicContentBlock{
-                        Type:      AnthropicContentBlockTypeThinking,
-                        Thinking:  reasoningDetail.Text,
-                        Signature: reasoningDetail.Signature,
-                    })
-                }
-            }
-        }
+        if choice.ChatNonStreamResponseChoice != nil &&
+            choice.Message != nil &&
+            choice.Message.ChatAssistantMessage != nil &&
+            choice.Message.ChatAssistantMessage.ReasoningDetails != nil {
+            for _, reasoningDetail := range choice.Message.ChatAssistantMessage.ReasoningDetails {
+                if reasoningDetail.Type != schemas.BifrostReasoningDetailsTypeText {
+                    continue
+                }
+                if reasoningDetail.Text == nil && reasoningDetail.Signature == nil {
+                    continue
+                }
+                content = append(content, AnthropicContentBlock{
+                    Type:      AnthropicContentBlockTypeThinking,
+                    Thinking:  reasoningDetail.Text,
+                    Signature: reasoningDetail.Signature,
+                })
+            }
+        }

This keeps all reasoning entries (including signature‑only and empty‑text ones) flowing through, preserving bidirectional mapping.

♻️ Duplicate comments (2)
core/providers/bedrock/responses.go (2)

2381-2385: System‑message CachePoint handling is symmetric; document ordering assumption

Delegating system conversion through convertBedrockSystemMessageToBifrostMessages and emitting a synthetic BedrockSystemMessage{CachePoint: ...} from convertBifrostMessageToBedrockSystemMessages gives a clean round‑trip: {Text},{CachePoint} in Bedrock becomes a single system message whose last content block has CacheControl, and vice‑versa.

convertBedrockSystemMessageToBifrostMessages silently ignores a CachePoint when bifrostMessages is still empty or the last message has no content blocks. That matches prior design discussions that Bedrock (or upstream construction) never emits a CachePoint before at least one system text entry, but it’s an implicit contract.

Consider adding a short comment near the if sysMsg.CachePoint != nil branch noting that Bedrock/system construction guarantees CachePoint only appears after system text, so a missing target means the hint is intentionally ignored.

Also applies to: 2413-2419, 2450-2484


2565-2665: Per‑content‑block CacheControl/CachePoint mapping is reversible and consistent

The new block.CachePoint != nil branch in convertSingleBedrockMessageToBifrostMessages and the if block.CacheControl != nil logic in convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks together ensure:

  • A Bedrock content block followed by a CachePoint becomes a Bifrost content block whose last entry has CacheControl: {Type: ephemeral}.
  • A Bifrost content block with CacheControl is re‑encoded as the original content block plus a trailing CachePoint block.

This gives a lossless Bedrock↔Bifrost translation for per‑block cache hints. As with system messages, the code assumes Bedrock doesn’t emit a CachePoint before any content block; if that API guarantee ever changes, we’d need a fallback, but under the current contract this is fine.

Also applies to: 2737-2781

🧹 Nitpick comments (7)
core/internal/testutil/test_retry_framework.go (1)

832-834: Consider including max attempts for consistency with other retry logs.

While the current logging is functional, it differs from the pattern used throughout the rest of the file where retry logs display both current and max attempts (e.g., lines 608, 648, 2155, 2157). Including config.MaxAttempts would improve consistency and provide better context during test debugging.

Apply this diff to align with the file's logging pattern:

 		OnRetry: func(attempt int, reason string, t *testing.T) {
 			// reason already contains ❌ prefix from retry logic
 			// attempt represents the current failed attempt number
 			// Log with attempt+1 to show the next attempt that will run
-			t.Logf("🔄 Retrying streaming test (attempt %d): %s", attempt+1, reason)
+			t.Logf("🔄 Retrying streaming test (attempt %d/%d): %s", attempt+1, config.MaxAttempts, reason)
 		},
core/internal/testutil/account.go (1)

593-621: Ensure Anthropic/Bedrock PromptCachingModel is configured where these scenarios are used.

You’ve enabled Scenarios.PromptCaching for Anthropic and Bedrock here, but the corresponding ComprehensiveTestConfig entries in AllProviderConfigs don’t set PromptCachingModel. The provider‑specific tests construct their own configs with PromptCachingModel, but if any shared runner uses AllProviderConfigs + Scenarios.PromptCaching, it will end up issuing prompt‑caching runs with an empty model string.

Consider either:

  • Setting PromptCachingModel on these two AllProviderConfigs entries (to the same models used in the dedicated tests), or
  • Making the prompt‑caching runner explicitly no‑op when PromptCachingModel == "", regardless of the scenario flag.

Also applies to: 627-660
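
The second option could be as small as the guard below (a sketch; the config variable name is hypothetical, the fields are the ones discussed in this comment):

// Skip cleanly instead of running prompt-caching scenarios with an empty model.
if !testConfig.Scenarios.PromptCaching || testConfig.PromptCachingModel == "" {
	t.Skip("prompt caching model not configured for this provider")
}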

core/internal/testutil/prompt_caching.go (1)

276-284: Scenario gate + ContentBlocks-based system message are appropriate; consider updating the doc comment.

The additional Scenarios.PromptCaching guard and the move to a ContentBlocks system message with an ephemeral CacheControl block both match the new caching model and avoid running this test where unsupported. The top comment still says “OpenAI's prompt caching”, though this function is now clearly provider‑agnostic; renaming that to “prompt caching” generally would avoid confusion.

Also applies to: 291-304

core/providers/bedrock/utils.go (1)

30-33: CacheControl → Bedrock CachePoint mapping is coherent across system/messages/tools

The new wiring achieves a consistent story:

  • convertSystemMessages explodes a system ChatMessage into multiple BedrockSystemMessage entries, and when a text block has CacheControl it appends a separate {cachePoint: {type: "default"}} system message, matching Bedrock’s cachePoint convention.
  • convertContent + convertContentBlock now return slices so you can emit a normal text/image block followed by a CachePoint block when block.CacheControl != nil; multi-block append in convertMessage correctly flattens these.
  • convertToolMessage mirrors that behavior for tool results, inserting cachePoint content blocks after cached text/image tool outputs.
  • convertToolConfig(model, params) keeping ToolSpec and CachePoint as separate entries in Tools and gating cache points with !schemas.IsNovaModel(model) matches the Bedrock JSON shape and avoids sending unsupported cache hints to Nova.

The only trade‑off is that TTL from CacheControl is intentionally dropped because Bedrock’s cachePoint doesn’t accept it, which is reasonable for now.

Overall this is a clean, provider‑correct implementation of prompt‑caching hooks.

Also applies to: 199-235, 237-264, 295-359, 374-395, 397-447, 616-685
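
As a concrete illustration of the slice-returning shape, a text-block conversion could look like this sketch (helper name hypothetical; the real convertContentBlock also handles images and errors on unsupported types):

// A text block with a cache hint becomes the text block plus a trailing
// cachePoint block, matching Bedrock's "separate block" convention.
func convertTextBlock(block schemas.ChatContentBlock) []BedrockContentBlock {
	out := []BedrockContentBlock{{Text: block.Text}} // Text field shapes assumed compatible
	if block.CacheControl != nil {
		out = append(out, BedrockContentBlock{
			CachePoint: &BedrockCachePoint{Type: BedrockCachePointTypeDefault},
		})
	}
	return out
}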

tests/integrations/tests/test_anthropic.py (1)

1421-1608: Prompt‑caching tests and Anthropic usage helpers look consistent with cross‑provider design

The three new tests:

  • Correctly exercise system/message/tool‑level cache control using cache_control: {"type": "ephemeral"} and separate user questions to force multiple calls.
  • Validate behavior via usage.cache_creation_input_tokens and usage.cache_read_input_tokens in a way that matches Anthropic’s schema and mirrors the Bedrock tests (first call writes >0 tokens, second call reads >0 tokens, and read/write counts are roughly similar).

The validate_cache_write/validate_cache_read helpers are straightforward and avoid the earlier over‑strict “no reads on first call / no writes on second call” checks, which should reduce flakiness while still catching regressions in caching.

Only optional nit: these helpers are nearly identical to the Bedrock versions—if you want to keep behavior in lock‑step, you could factor them into a shared test utility, but it’s not required.

Also applies to: 1663-1689

tests/integrations/tests/test_bedrock.py (1)

1799-1821: Cache write/read helpers match Bedrock usage schema; consider sharing with Anthropic tests

validate_cache_write/validate_cache_read operate on the usage dict’s cacheWriteInputTokens and cacheReadInputTokens fields and only assert that the respective values are > 0, which is exactly what the Bedrock prompt‑caching tests need.

They also mirror the Anthropic helpers almost 1:1; if you want to avoid divergence in future, you could move these into a shared test utility module and parametrize on field names, but it’s optional.

core/providers/anthropic/responses.go (1)

2159-2184: CacheControl propagation across Anthropic ↔ Bifrost is thorough; consider one small consistency tweak

The new wiring generally looks solid:

  • Text blocks in both grouped and ungrouped converters (convertAnthropicContentBlocksToResponsesMessagesGrouped / convertAnthropicContentBlocksToResponsesMessages) now copy block.CacheControl into ResponsesMessageContentBlock.
  • Tool‑result content (AnthropicContentBlockTypeToolResult and AnthropicContentBlockTypeMCPToolResult) propagates CacheControl on nested text blocks into the corresponding ResponsesMessageContentBlocks.
  • convertContentBlockToAnthropic mirrors this by copying block.CacheControl back into AnthropicContentBlock for text and image blocks, and toBifrostResponsesImageBlock does the same for image blocks.
  • Tool structs now round‑trip tool‑level cache control: convertAnthropicToolToBifrost and convertBifrostToolToAnthropic both map CacheControl between AnthropicTool and ResponsesTool.

One minor gap: in convertAnthropicContentBlocksToResponsesMessagesGrouped, when isOutputMessage is true, the accumulatedTextContent slice is built without copying block.CacheControl, so any cache hints on those output text blocks would be dropped in the grouped path. It’s currently only used for input‑side conversions in this file, but making it consistent is trivial:

-                if isOutputMessage {
-                    // For output messages, accumulate text blocks (don't emit immediately)
-                    accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{
-                        Type: schemas.ResponsesOutputMessageContentTypeText,
-                        Text: block.Text,
-                    })
+                if isOutputMessage {
+                    // For output messages, accumulate text blocks (don't emit immediately)
+                    accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{
+                        Type:         schemas.ResponsesOutputMessageContentTypeText,
+                        Text:         block.Text,
+                        CacheControl: block.CacheControl,
+                    })

That would align the grouped/output path with the ungrouped converter and avoid any future surprises if keepToolsGrouped ever gets used for outputs.

Also applies to: 2239-2276, 2365-2400, 2463-2466, 2495-2498, 2560-2563, 3330-3333, 3343-3344, 3377-3383, 3082-3084, 3262-3264

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5cfbabb and 8005eb3.

📒 Files selected for processing (22)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/prompt_caching.go (3 hunks)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/anthropic_test.go (2 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/bedrock_test.go (2 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (8 hunks)
  • core/providers/openai/responses_marshal_test.go (0 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
  • core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/anthropic/utils.go
  • core/schemas/utils.go
  • core/schemas/responses.go
  • core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/internal/testutil/account.go
  • core/providers/bedrock/bedrock_test.go
  • core/internal/testutil/prompt_caching.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/bedrock/chat.go
  • core/schemas/chatcompletions.go
  • tests/integrations/config.yml
  • core/providers/anthropic/anthropic_test.go
  • core/providers/bedrock/types.go
  • core/providers/anthropic/responses.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/responses.go
  • tests/integrations/tests/utils/common.py
  • tests/integrations/tests/test_bedrock.py
  • tests/integrations/tests/test_anthropic.py
  • core/providers/anthropic/chat.go
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/internal/testutil/account.go
  • core/providers/bedrock/bedrock_test.go
  • core/internal/testutil/prompt_caching.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/bedrock/chat.go
  • core/schemas/chatcompletions.go
  • core/providers/anthropic/anthropic_test.go
  • core/providers/bedrock/types.go
  • core/providers/anthropic/responses.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/responses.go
  • core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/anthropic_test.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
🧬 Code graph analysis (7)
core/internal/testutil/prompt_caching.go (2)
core/schemas/chatcompletions.go (4)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/bedrock/chat.go (3)
core/schemas/chatcompletions.go (4)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • ChatAssistantMessageToolCall (702-708)
  • ChatAssistantMessageToolCallFunction (711-714)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (135-135)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/openai/types.go (2)
core/schemas/responses.go (3)
  • ResponsesMessage (318-331)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesToolMessage (469-489)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/bedrock/utils.go (3)
core/schemas/chatcompletions.go (3)
  • ChatContentBlockTypeText (584-584)
  • CacheControl (610-613)
  • ChatContentBlock (592-602)
core/providers/bedrock/types.go (6)
  • BedrockSystemMessage (156-160)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockContentBlock (163-190)
  • BedrockToolConfig (278-281)
  • BedrockTool (284-287)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/providers/bedrock/responses.go (4)
core/schemas/utils.go (1)
  • IsNovaModel (1043-1045)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • CacheControlTypeEphemeral (607-607)
core/providers/bedrock/types.go (5)
  • BedrockTool (284-287)
  • BedrockCachePoint (199-201)
  • BedrockCachePointTypeDefault (195-195)
  • BedrockSystemMessage (156-160)
  • BedrockContentBlock (163-190)
core/schemas/responses.go (6)
  • ResponsesMessage (318-331)
  • ResponsesInputMessageRoleSystem (338-338)
  • ResponsesMessageTypeMessage (294-294)
  • ResponsesMessageContent (343-348)
  • ResponsesMessageContentBlock (403-418)
  • ResponsesInputMessageContentBlockTypeText (392-392)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/test_anthropic.py (2)
  • validate_cache_write (1663-1675)
  • validate_cache_read (1677-1689)
tests/integrations/tests/test_anthropic.py (3)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/utils/common.py (1)
  • assert_valid_chat_response (801-833)
tests/integrations/tests/test_bedrock.py (2)
  • validate_cache_write (1799-1809)
  • validate_cache_read (1811-1821)
🔇 Additional comments (19)
core/internal/testutil/test_retry_framework.go (3)

2153-2158: Good addition of explicit attempt start logging.

The new logging clearly distinguishes between the first test attempt and subsequent retries, providing valuable context for debugging streaming validation tests. The inclusion of config.MaxAttempts is consistent with the file's logging patterns.


2205-2254: Excellent improvements to retry flow observability and documentation.

The enhancements provide clear visibility into the retry decision process:

  • Waiting time logs (lines 2218, 2252) help track retry delays
  • Comments clarifying attempt semantics (lines 2211, 2246) maintain consistency across retry functions
  • The critical comment on line 2254 documents important control flow behavior (continue vs return)
  • Comprehensive logging for the nil channel case improves debugging

These changes align well with the PR's goal of improving the retry framework with better logging.


2317-2318: Good documentation of attempt parameter semantics.

The clarifying comment helps maintain consistency across retry functions, ensuring all OnRetry callbacks receive the current failed attempt number. This addresses the attempt semantics consistency that was previously flagged in reviews.

core/internal/testutil/account.go (1)

21-56: PromptCaching scenario flag wiring looks consistent.

Adding PromptCaching to TestScenarios cleanly aligns with the new prompt‑caching gate in RunPromptCachingTest and keeps scenarios explicit and self‑documenting.

core/providers/anthropic/anthropic_test.go (1)

25-35: Anthropic test config extensions for vision/reasoning/prompt caching look correct.

Wiring VisionModel, ReasoningModel, and PromptCachingModel to the newer Claude SKUs and enabling Scenarios.PromptCaching is consistent with the rest of the stack and keeps the comprehensive test harness aligned with the new capabilities.

Also applies to: 35-52

core/providers/bedrock/bedrock_test.go (1)

65-78: Bedrock comprehensive test wiring for embeddings/reasoning/prompt caching is coherent.

The added EmbeddingModel, ReasoningModel, PromptCachingModel, and the reuse of batchExtraParams/fileExtraParams plug neatly into the existing test harness, and enabling PromptCaching in Scenarios matches the new prompt‑caching coverage.

Also applies to: 78-107

core/internal/testutil/prompt_caching.go (1)

245-268: Ephemeral CacheControl on check_compliance_status tool is aligned with the new schema.

Attaching CacheControl{Type: CacheControlTypeEphemeral} to this tool gives you a clean way to exercise tool‑level cache control in prompt‑caching tests without impacting other tools.

core/schemas/chatcompletions.go (1)

244-250: CacheControl schema extensions are consistent and minimally invasive.

Adding CacheControl to ChatTool and ChatContentBlock plus the shared CacheControlType/CacheControl definitions cleanly models provider‑specific cache hints while preserving existing JSON behavior (all new fields are optional and opt‑in).

Also applies to: 591-613
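
For readers skimming the diff, the shared type is roughly shaped like this (JSON tags and the TTL representation are assumptions; only the ephemeral Type and optional TTL are confirmed by this PR):

type CacheControlType string

const CacheControlTypeEphemeral CacheControlType = "ephemeral"

// CacheControl is the provider-agnostic cache hint attached to content blocks and tools.
type CacheControl struct {
	Type CacheControlType `json:"type"`
	TTL  *string          `json:"ttl,omitempty"` // TTL element type assumed
}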

tests/integrations/config.yml (1)

109-115: Integration config now correctly reflects prompt_caching support across providers.

The new prompt_caching entries under provider_scenarios and scenario_capabilities, plus the updated Bedrock chat model, align the Python integration layer with the Go‑side prompt‑caching scenarios. This keeps Anthropic/Bedrock enabled and others explicitly off, which matches the current implementation.

Also applies to: 167-167, 188-205, 226-244, 264-281, 302-320, 342-363

tests/integrations/tests/utils/common.py (1)

88-281: Shared prompt-caching test data is well-structured and reusable.

PROMPT_CACHING_TOOLS mirrors the existing tool schema and gives good variety for cache‑heavy tool usage, and PROMPT_CACHING_LARGE_CONTEXT provides an appropriately long, coherent context block for exercising caching behavior across providers.

Also applies to: 421-481

core/providers/openai/types.go (1)

172-254: Copy-on-write CacheControl stripping in MarshalJSON avoids mutation while minimizing overhead.

The updated OpenAIResponsesRequestInput.MarshalJSON correctly:

  • Detects whether any message carries CacheControl,
  • Marshals the original slice unchanged when none do, and
  • For messages that do, constructs per-message/content/tool-output copies with CacheControl nulled, leaving the original data structures intact.

The hasCacheControl helper keeps the logic readable, and this implementation resolves the prior in-place mutation risk without imposing extra allocations on the common no‑CacheControl path.

Also applies to: 258-277
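
The copy-on-write idea, reduced to a sketch over the Responses schema types (the function name and exact receiver layout are illustrative; hasCacheControl is the helper named above, its signature assumed):

func marshalWithoutCacheControl(msgs []schemas.ResponsesMessage) ([]byte, error) {
	strip := false
	for i := range msgs {
		if hasCacheControl(msgs[i]) {
			strip = true
			break
		}
	}
	if !strip {
		// Common path: nothing to strip, marshal the original slice untouched.
		return json.Marshal(msgs)
	}
	out := make([]schemas.ResponsesMessage, len(msgs))
	copy(out, msgs)
	for i := range out {
		if out[i].Content == nil || out[i].Content.ContentBlocks == nil {
			continue
		}
		// Copy the content and its blocks before nulling CacheControl,
		// leaving the caller's data untouched.
		blocks := append([]schemas.ResponsesMessageContentBlock(nil), out[i].Content.ContentBlocks...)
		for j := range blocks {
			blocks[j].CacheControl = nil
		}
		content := *out[i].Content
		content.ContentBlocks = blocks
		out[i].Content = &content
	}
	return json.Marshal(out)
}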

core/providers/bedrock/chat.go (1)

62-150: Bedrock response → Bifrost mapping for text/tools/reasoning and cache usage looks solid

The refactor to iterate all content blocks and separately handle text, structured-output tool use, regular tools, and reasoning content is consistent and fixes the earlier control‑flow issue where structured‑output could swallow normal content. The usage mapping of CacheReadInputTokens → PromptTokensDetails.CachedTokens and CacheWriteInputTokens → CompletionTokensDetails.CachedTokens also matches the new Bedrock prompt‑caching tests that assert against those provider fields.

I don’t see functional issues here; no changes needed.

Also applies to: 167-185

core/providers/bedrock/types.go (2)

155-160: CachePoint types and wiring across system/content/tools are consistent

Adding CachePoint *BedrockCachePoint to BedrockSystemMessage, BedrockContentBlock, and BedrockTool plus the BedrockCachePointType/BedrockCachePointTypeDefault and BedrockCachePoint struct gives you exactly the JSON shape Bedrock expects ({"cachePoint": {"type": "default"}}). This matches how the utils now emit cache points for system/messages/tools and keeps the type surface clean.

No issues from a correctness or API‑shape perspective.

Also applies to: 162-190, 192-201, 283-287
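
The JSON shape quoted above falls out of type definitions along these lines (tags assumed):

type BedrockCachePointType string

const BedrockCachePointTypeDefault BedrockCachePointType = "default"

type BedrockCachePoint struct {
	Type BedrockCachePointType `json:"type"`
}

// On BedrockSystemMessage, BedrockContentBlock, and BedrockTool the field is optional:
//   CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"`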


641-646: Making ETag explicit on Bedrock file request types matches test usage

Requiring ETag string on BedrockFileRetrieveRequest, BedrockFileDeleteRequest, and BedrockFileContentRequest aligns with the S3 access patterns in tests/test_bedrock.py (all three operations pass IfMatch with the ETag). This tightens concurrency semantics without changing response types.

Looks correct and consistent with the file integration tests.

Also applies to: 659-664, 673-678

core/providers/anthropic/chat.go (1)

42-72: CacheControl propagation into Anthropic tools and text blocks is correct

You’re now:

  • Copying tool.CacheControl from ChatTool into AnthropicTool.CacheControl.
  • Propagating block.CacheControl from system/message text blocks into AnthropicContentBlock.CacheControl when building systemContent and regular message content.
  • Preserving CacheControl on tool‑result content (text/image) as it’s converted into Anthropic blocks.

This lines up with the schemas.CacheControl definition and Anthropic’s cache_control request semantics while leaving response conversion unchanged (per the comment that responses never carry cache_control). No functional issues here.

Also applies to: 143-160, 176-198, 231-251

tests/integrations/tests/test_bedrock.py (1)

37-40: Bedrock prompt‑caching tests correctly exercise system/messages/tools cachePoint paths

The three new tests:

  • Use Bedrock‑native shapes (system entries and content blocks with { "cachePoint": {"type": "default"} }) that match the BedrockSystemMessage, BedrockContentBlock, and BedrockTool structures you introduced.
  • Assert against response["usage"]["cacheWriteInputTokens"] and response["usage"]["cacheReadInputTokens"], which lines up with BedrockTokenUsage and the mapping in ToBifrostChatResponse.
  • Properly use the injected bedrock_client fixture (fixing the earlier _bedrock_client misuse) and the standard format_provider_model(provider, model) routing.

Semantically these tests are aligned with how the Bedrock utils now emit CachePoint blocks; they should give good coverage of system/message/tool cache behavior without touching unrelated flows.

Also applies to: 51-60, 1623-1797

core/providers/bedrock/responses.go (2)

1283-1289: Tool‑level CacheControl ↔ CachePoint round‑trip looks correct

The mapping from BedrockTool.CachePoint to ResponsesTool.CacheControl in ToBifrostResponsesRequest, and the reverse mapping that appends a BedrockTool with CachePoint when ResponsesTool.CacheControl is set in ToBedrockResponsesRequest, are symmetric and gated with !schemas.IsNovaModel(...) as expected given Nova’s lack of tool‑cache support. No functional issues here.

Also applies to: 1572-1578


1707-1712: Cached‑token usage metrics now survive Bedrock ↔ Bifrost round‑trip

Populating CacheReadInputTokens from Usage.InputTokensDetails.CachedTokens and CacheWriteInputTokens from Usage.OutputTokensDetails.CachedTokens in ToBedrockConverseResponse mirrors the earlier Bedrock→Bifrost mapping, so cache read/write token counts won’t be dropped when converting responses back to Bedrock. Semantics remain consistent across providers.

core/providers/anthropic/responses.go (1)

1344-1345: System‑content routing through ConvertAnthropicMessagesToBifrostMessages is sound

ToBifrostResponsesRequest now passes request.System into ConvertAnthropicMessagesToBifrostMessages, and convertAnthropicSystemToBifrostMessages builds system ResponsesMessage entries from either ContentStr or ContentBlocks, preserving CacheControl on text blocks. ConvertAnthropicMessagesToBifrostMessages then prepends these system messages before regular messages.

The reverse path (convertBifrostMessageToAnthropicSystemContent + convertBifrostContentBlocksToAnthropic) understands both input/output text block types and also carries CacheControl, so Anthropic ↔ Bifrost system content (including cache hints) round‑trips cleanly.

Also applies to: 1627-1648, 2065-2097

@TejasGhatte TejasGhatte force-pushed the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch from 8005eb3 to 572dc2e Compare December 17, 2025 11:11
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/schemas/utils.go (1)

672-728: Deep-copy helpers drop the new CacheControl field

deepCopyChatContentBlock and deepCopyResponsesMessageContentBlock don't copy CacheControl, so any message cloned via DeepCopyChatMessage / DeepCopyResponsesMessage will silently lose cache-control metadata. That can break prompt caching semantics on copied messages.

Consider deep-copying CacheControl similarly to other pointer fields, e.g.:

 func deepCopyChatContentBlock(original ChatContentBlock) ChatContentBlock {
 	copy := ChatContentBlock{
 		Type: original.Type,
 	}
+	if original.CacheControl != nil {
+		cc := *original.CacheControl
+		if original.CacheControl.TTL != nil {
+			ttl := *original.CacheControl.TTL
+			cc.TTL = &ttl
+		}
+		copy.CacheControl = &cc
+	}
 	// existing field copies...
 }

and analogously inside deepCopyResponsesMessageContentBlock.

Also applies to: 883-1040

core/providers/bedrock/utils.go (1)

199-235: Guard against nil Content in convertSystemMessages

convertSystemMessages directly dereferences msg.Content (msg.Content.ContentStr / .ContentBlocks) without checking for nil. A malformed or provider-originated system message with content: null (or missing) would panic in this path.

Recommend a simple guard:

func convertSystemMessages(msg schemas.ChatMessage) ([]BedrockSystemMessage, error) {
	systemMsgs := []BedrockSystemMessage{}

	if msg.Content == nil {
		return systemMsgs, nil
	}

	// existing logic...
}

The cache-point emission per text block also looks correct and matches the “cache point must be a separate block” requirement.

Also applies to: 237-264

🧹 Nitpick comments (6)
core/schemas/utils.go (1)

1042-1045: IsNovaModel helper is fine but uses a very loose match

The helper is straightforward and works for IDs like amazon.nova-*. If you ever introduce models whose IDs incidentally contain "nova" (e.g., supernova-x from another provider), you may want a stricter predicate (e.g., prefix or provider-aware check), but that's not required now.

Would you like a follow-up helper that matches known Bedrock Nova model patterns more strictly (e.g., amazon.nova-* only)?
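
A stricter variant could key on the Nova ID convention; this is only a sketch under the assumption that Bedrock Nova model IDs always contain the amazon.nova- segment (possibly behind a region or inference-profile prefix):

import "strings"

// isBedrockNovaModel is a hypothetical stricter replacement for the loose substring check.
func isBedrockNovaModel(model string) bool {
	return strings.Contains(model, "amazon.nova-")
}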

core/providers/bedrock/utils.go (2)

323-359: Tool-result cache points are injected correctly after text/image blocks

For tool messages, adding a separate BedrockContentBlock with CachePointDefault immediately after each text/image block when block.CacheControl != nil is consistent with Bedrock’s requirement that cache points be distinct blocks. The JSON shape stays backwards-compatible when CacheControl is nil.

If you find similar cache-point injection logic elsewhere (e.g., system/content blocks), you might consider a small helper to avoid duplication, but it's not necessary.


375-395: convertContentBlock refactor looks good; errors are preferable for unsupported types

The refactor to have convertContentBlock return []BedrockContentBlock allows you to append cache-point blocks cleanly. Behavior per type is reasonable:

  • Text: returns a text block plus optional cache-point block.
  • Image: validates ImageURLStruct, converts via convertImageToBedrockSource, then appends optional cache-point block.
  • InputAudio/unknown types: now explicitly error out, which is better than silently ignoring or sending malformed content for a Converse API that doesn’t support them.

Only minor nit: for text blocks with block.Text == nil, you’ll still emit a block with Text: nil. If that’s not expected to occur upstream, this is fine; otherwise you might want to validate and return an error similarly to the image branch.

Also applies to: 398-446

core/internal/testutil/account.go (1)

593-625: Enabling PromptCaching for Anthropic and Bedrock scenarios

Marking PromptCaching: true for Anthropic and Bedrock configurations correctly scopes prompt-caching tests to providers that support the feature, and lines up with the existing PromptCachingModel field. Other providers remain unaffected.

In future, you might want to add a short comment near each PromptCachingModel indicating which model family/feature set it targets, to ease updates when providers change cache support.

Also applies to: 627-664

core/providers/anthropic/utils.go (1)

94-149: Forward image-block conversion now preserves type and cache control

Updating ConvertToAnthropicImageBlock to:

  • Use AnthropicContentBlockTypeImage instead of a raw string for Type, and
  • Copy block.CacheControl into the Anthropic block

is a clear improvement and keeps the image path consistent with your other cache-control propagation.

One follow-up to consider: if ToBifrostContentImageBlock participates in Anthropic→Bifrost conversions where cache control matters, you may want to mirror block.CacheControl back onto the resulting schemas.ChatContentBlock so cache metadata round-trips for images as well.

core/providers/anthropic/responses.go (1)

2151-2181: CacheControl propagation is consistent; consider using input text enums for input/system blocks

The new wiring cleanly round‑trips CacheControl for:

  • text blocks in both grouped and non‑grouped converters,
  • tool‑result content blocks,
  • image blocks, and
  • tools (convertAnthropicToolToBifrost ↔ convertBifrostToolToAnthropic),

which is exactly what the prompt‑caching work needs.

One consistency improvement: in convertAnthropicSystemToBifrostMessages and in the input branch of convertAnthropicContentBlocksToResponsesMessagesGrouped, text blocks are emitted with ResponsesOutputMessageContentTypeText even though they represent input/system content, while the non‑grouped path uses ResponsesInputMessageContentBlockTypeText. Aligning these to the input enum would make the type semantics clearer and avoid surprises if downstream logic ever differentiates input vs output block types.

You could adjust along these lines:

@@ func convertAnthropicSystemToBifrostMessages(systemContent *AnthropicContent) []schemas.ResponsesMessage {
-            if block.Text != nil { // System messages will only have text content
-                contentBlocks = append(contentBlocks, schemas.ResponsesMessageContentBlock{
-                    Type:         schemas.ResponsesOutputMessageContentTypeText,
-                    Text:         block.Text,
-                    CacheControl: block.CacheControl,
-                })
+            if block.Text != nil { // System messages will only have text content
+                contentBlocks = append(contentBlocks, schemas.ResponsesMessageContentBlock{
+                    Type:         schemas.ResponsesInputMessageContentBlockTypeText,
+                    Text:         block.Text,
+                    CacheControl: block.CacheControl,
+                })
             }
@@ func convertAnthropicContentBlocksToResponsesMessagesGrouped(...) []schemas.ResponsesMessage {
-                    // For input messages, emit text immediately as separate message
+                    // For input messages, emit text immediately as separate message
                     bifrostMsg := schemas.ResponsesMessage{
                         Type: schemas.Ptr(schemas.ResponsesMessageTypeMessage),
                         Role: role,
                         Content: &schemas.ResponsesMessageContent{
                             ContentBlocks: []schemas.ResponsesMessageContentBlock{
                                 {
-                                    Type:         schemas.ResponsesOutputMessageContentTypeText,
+                                    Type:         schemas.ResponsesInputMessageContentBlockTypeText,
                                     Text:         block.Text,
                                     CacheControl: block.CacheControl,
                                 },
                             },
                         },
                     }

Also applies to: 2263-2266, 2369-2397, 2495-2498, 2560-2563, 2998-3084, 3183-3264, 3325-3333, 3339-3344, 3377-3383

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8005eb3 and 572dc2e.

📒 Files selected for processing (22)
  • core/internal/testutil/account.go (3 hunks)
  • core/internal/testutil/prompt_caching.go (3 hunks)
  • core/internal/testutil/test_retry_framework.go (5 hunks)
  • core/providers/anthropic/anthropic_test.go (2 hunks)
  • core/providers/anthropic/chat.go (5 hunks)
  • core/providers/anthropic/responses.go (13 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/anthropic/utils.go (1 hunks)
  • core/providers/bedrock/bedrock_test.go (2 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/responses.go (8 hunks)
  • core/providers/bedrock/types.go (6 hunks)
  • core/providers/bedrock/utils.go (8 hunks)
  • core/providers/openai/responses_marshal_test.go (0 hunks)
  • core/providers/openai/types.go (4 hunks)
  • core/schemas/chatcompletions.go (2 hunks)
  • core/schemas/responses.go (2 hunks)
  • core/schemas/utils.go (1 hunks)
  • tests/integrations/config.yml (7 hunks)
  • tests/integrations/tests/test_anthropic.py (5 hunks)
  • tests/integrations/tests/test_bedrock.py (3 hunks)
  • tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
  • core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • tests/integrations/config.yml
  • tests/integrations/tests/utils/common.py
  • core/internal/testutil/prompt_caching.go
  • core/providers/bedrock/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/bedrock/bedrock_test.go
  • core/schemas/responses.go
  • core/providers/anthropic/utils.go
  • core/providers/anthropic/anthropic_test.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/bedrock/chat.go
  • core/schemas/utils.go
  • core/internal/testutil/account.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/types.go
  • core/schemas/chatcompletions.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/responses.go
  • core/providers/openai/types.go
  • tests/integrations/tests/test_bedrock.py
  • tests/integrations/tests/test_anthropic.py
🧠 Learnings (7)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
  • core/schemas/responses.go
  • core/providers/anthropic/utils.go
  • core/providers/anthropic/anthropic_test.go
  • core/internal/testutil/test_retry_framework.go
  • core/providers/bedrock/chat.go
  • core/schemas/utils.go
  • core/internal/testutil/account.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/types.go
  • core/schemas/chatcompletions.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/utils.go
  • core/providers/bedrock/responses.go
  • core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/bedrock/bedrock_test.go
  • core/providers/anthropic/anthropic_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.

Applied to files:

  • core/providers/anthropic/chat.go
📚 Learning: 2025-12-17T08:44:08.788Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1114
File: core/schemas/chatcompletions.go:224-228
Timestamp: 2025-12-17T08:44:08.788Z
Learning: In core/schemas/chatcompletions.go, ensure the schema structures mirror OpenAI's API specifications exactly. Use the valid values for fields like ChatAudioParameters.Format and ChatAudioParameters.Voice as defined by OpenAI's documentation, and avoid adding additional inline documentation or constants to maintain direct compatibility with OpenAI's API.

Applied to files:

  • core/schemas/chatcompletions.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
🧬 Code graph analysis (7)
core/schemas/responses.go (1)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/anthropic/utils.go (2)
core/providers/anthropic/types.go (2)
  • AnthropicContentBlockTypeImage (205-205)
  • AnthropicImageSource (234-239)
core/schemas/chatcompletions.go (1)
  • CacheControl (610-613)
core/providers/bedrock/chat.go (3)
core/schemas/chatcompletions.go (6)
  • ChatContentBlock (592-602)
  • ChatContentBlockTypeText (584-584)
  • ChatAssistantMessageToolCall (702-708)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatReasoningDetails (739-746)
  • BifrostReasoningDetailsTypeText (735-735)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (135-135)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/providers/anthropic/chat.go (4)
core/schemas/chatcompletions.go (4)
  • CacheControl (610-613)
  • ChatContentBlock (592-602)
  • ChatAssistantMessageToolCallFunction (711-714)
  • ChatAssistantMessageToolCall (702-708)
core/providers/anthropic/types.go (3)
  • AnthropicContentBlockTypeText (204-204)
  • AnthropicContentBlockTypeToolUse (206-206)
  • AnthropicContentBlockTypeThinking (212-212)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (56-58)
core/providers/anthropic/types.go (1)
core/schemas/chatcompletions.go (2)
  • CacheControl (610-613)
  • ToolFunctionParameters (261-268)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/test_anthropic.py (2)
  • validate_cache_write (1663-1675)
  • validate_cache_read (1677-1689)
tests/integrations/tests/test_anthropic.py (3)
tests/integrations/tests/utils/parametrize.py (2)
  • get_cross_provider_params_for_scenario (12-47)
  • format_provider_model (50-65)
tests/integrations/tests/utils/common.py (1)
  • assert_valid_chat_response (801-833)
tests/integrations/tests/test_bedrock.py (2)
  • validate_cache_write (1799-1809)
  • validate_cache_read (1811-1821)
🔇 Additional comments (20)
core/schemas/chatcompletions.go (1)

244-250: Cache-control extensions to tools and content blocks look consistent

Adding CacheControl to ChatTool and ChatContentBlock, plus the CacheControlType/CacheControl types, cleanly models provider-specific cache control while keeping fields optional and clearly flagged as non-OpenAI. No issues from a schema or JSON perspective.

Also applies to: 591-613

core/providers/bedrock/utils.go (1)

16-33: Model-aware convertToolConfig with Nova gating is reasonable

  • Updating convertChatParameters to call convertToolConfig(bifrostReq.Model, bifrostReq.Params) and having convertToolConfig append an extra BedrockTool with CachePointDefault when tool.CacheControl != nil && !schemas.IsNovaModel(model) looks consistent with model-specific caching behavior.
  • Existing tool-spec construction (name, description, schema) is preserved; the extra cache-point entry is additive and doesn’t interfere with tool choice logic.

Given Nova’s different cache semantics, skipping cache-point tool injection for Nova via IsNovaModel is a sensible first pass.

If you’d like, I can help draft a small test that asserts we emit cache-point tools for non-Nova models but not for Nova IDs (using a couple of representative model strings).

Also applies to: 617-685

core/internal/testutil/account.go (1)

22-56: Extending TestScenarios with PromptCaching is consistent

Adding PromptCaching bool to TestScenarios follows the existing pattern for feature flags (e.g., Reasoning, Embedding) and defaults safely to false for providers that don’t set it.

core/providers/anthropic/anthropic_test.go (1)

25-35: Anthropic test config updates for vision, reasoning, and prompt caching look good

Providing explicit VisionModel, ReasoningModel, and PromptCachingModel and enabling PromptCaching in TestScenarios keeps the Anthropic tests aligned with the new capabilities. The rest of the test shape (setup, shutdown, scenarios) remains consistent with other provider tests.

Also applies to: 35-52

core/providers/anthropic/types.go (1)

216-231: Anthropic content/tool structs now expose cache control cleanly

Adding CacheControl *schemas.CacheControl to both AnthropicContentBlock and AnthropicTool aligns Anthropic types with the shared CacheControl schema. Fields are optional and won’t affect existing payloads unless explicitly set.

Also applies to: 290-300

core/schemas/responses.go (1)

401-418: Responses schemas extended with cache control in a backward-compatible way

The new CacheControl *CacheControl fields on ResponsesMessageContentBlock and ResponsesTool provide a clean hook for provider-specific cache hints while remaining optional and clearly documented as non-OpenAI extensions. They integrate naturally with the existing Responses API model.

Also applies to: 1036-1055

core/providers/bedrock/responses.go (6)

1571-1578: LGTM: CachePoint propagation for tools.

The logic correctly appends a separate BedrockTool with CachePoint after tools that have CacheControl, and appropriately excludes Nova models that don't support tool-level caching.


2450-2484: LGTM: CachePoint propagation relies on API ordering guarantee.

The conversion correctly attaches CacheControl to the last content block when a CachePoint is encountered (lines 2454-2463). As confirmed in previous reviews, the Bedrock API guarantees that CachePoint entries only appear after text messages, preventing silent drops.
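
As a rough illustration of this direction of the conversion (simplified stand-in types, not the real Bedrock structs), a trailing CachePoint is folded back onto the block it follows:

```go
package bedrock

// Simplified stand-ins; the real conversion also handles images, documents, etc.
type bedrockSystemEntry struct {
	Text       *string
	CachePoint *struct{ Type string }
}

type bifrostSystemBlock struct {
	Text         string
	CacheControl *struct{ Type string }
}

// convertSystem attaches a CachePoint's cache hint to the most recently
// emitted block. Bedrock guarantees cache points never precede content, so
// the len check is only defensive.
func convertSystem(in []bedrockSystemEntry) []bifrostSystemBlock {
	var out []bifrostSystemBlock
	for _, entry := range in {
		switch {
		case entry.Text != nil:
			out = append(out, bifrostSystemBlock{Text: *entry.Text})
		case entry.CachePoint != nil && len(out) > 0:
			out[len(out)-1].CacheControl = &struct{ Type string }{Type: "ephemeral"}
		}
	}
	return out
}
```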


2413-2419: LGTM: Proper CacheControl→CachePoint conversion for system messages.

The code correctly emits a separate BedrockSystemMessage with CachePoint after text messages that carry CacheControl, consistent with Bedrock's API format.


2656-2665: LGTM: CachePoint handling validated upstream.

As confirmed in prior reviews, the provider API validates and returns an error when a CachePoint appears before content blocks, so the defensive check at line 2658 is sufficient. The upstream validation prevents the silent drop scenario.


2774-2781: LGTM: CacheControl→CachePoint propagation for content blocks.

The conversion correctly emits a separate BedrockContentBlock with CachePoint after blocks that have CacheControl, maintaining consistency with the tool and system message conversion patterns.


1283-1289: The code at lines 1283-1289 already includes a defensive check on line 1285 (if len(bifrostReq.Params.Tools) > 0) that prevents accessing an empty array. This guards against the scenario where a CachePoint appears before any ToolSpec in the iteration. The same defensive pattern is consistently applied across similar locations in the file (lines 2454-2461 for system messages and lines 2656-2665 for content blocks). No additional changes are needed.

Likely an incorrect or invalid review comment.

core/providers/bedrock/bedrock_test.go (1)

65-108: Bedrock comprehensive test config updates look consistent with prompt‑caching rollout

Embedding/Reasoning/PromptCaching model choices and the batch/file extra‑params wiring integrate cleanly with the existing Bedrock comprehensive test setup and new PromptCaching scenario flag; no issues from this diff alone.

core/internal/testutil/test_retry_framework.go (1)

818-841: Retry framework streaming logs and nil‑channel behavior are now consistent and safer

The StreamingRetryConfig.OnRetry semantics are now aligned with the other helpers (the callback receives the current failed attempt and the next attempt is logged as attempt+1), and the Responses stream‑validation retry path adds clear per‑attempt logs, explicit wait logging, and, importantly, keeps retrying when responseChannel is nil instead of returning early. Control flow and backoff usage look correct.

Also applies to: 2150-2325
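
A condensed sketch of the control flow described above; the helper name, signature, and log wording are illustrative, not the framework's actual API:

```go
package testutil

import (
	"fmt"
	"testing"
	"time"
)

// runStreamingWithRetry condenses the behavior described above: per-attempt
// "N/max" logs, an explicit wait log before each backoff, and a nil response
// channel treated as retryable rather than returned early.
func runStreamingWithRetry(t *testing.T, maxAttempts int, backoff time.Duration,
	call func() (<-chan string, error)) (<-chan string, error) {

	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if attempt == 1 {
			t.Logf("▶️ attempt %d/%d", attempt, maxAttempts)
		} else {
			t.Logf("🔄 retry attempt %d/%d", attempt, maxAttempts)
		}

		ch, err := call()
		switch {
		case err != nil:
			lastErr = err
			t.Logf("❌ attempt %d/%d failed: %v", attempt, maxAttempts, err)
		case ch == nil:
			lastErr = fmt.Errorf("nil response channel")
			t.Logf("❌ attempt %d/%d returned a nil response channel", attempt, maxAttempts)
		default:
			return ch, nil
		}

		if attempt < maxAttempts {
			t.Logf("⏳ Waiting %s before retry", backoff)
			time.Sleep(backoff)
		}
	}
	return nil, lastErr
}
```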

core/providers/bedrock/chat.go (1)

62-186: Bedrock → Bifrost chat conversion now correctly handles structured tools, tool calls, reasoning, and cache usage

The new block‑wise processing cleanly separates:

  • structured‑output tool_use (marshalled into ContentStr and excluded from ToolCalls),
  • regular tool_use (proper ChatAssistantMessageToolCall with JSON arguments),
  • reasoning blocks (aggregated into ReasoningDetails/Reasoning),
  • and usage/cache fields (read/write tokens → prompt/completion token details),

while avoiding mutation of the source response. The only notable behavioral change is that plain text is now always surfaced via ContentBlocks (except for structured output) rather than collapsed into ContentStr for single-text blocks; double-check any consumers that still read only ContentStr for Bedrock responses.
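
Schematically, the block-wise pass looks roughly like the sketch below; the types are stand-ins, and the real code also maps usage and cache token details onto the response:

```go
package bedrock

import "encoding/json"

// Simplified stand-ins for the Bedrock output block and the assembled message.
type toolUse struct {
	Name  string
	Input map[string]any
}

type outBlock struct {
	Text      *string
	ToolUse   *toolUse
	Reasoning *string
}

type assembled struct {
	ContentStr    *string  // only set for the structured-output tool
	ContentBlocks []string // plain text blocks
	ToolCalls     []string // name + JSON arguments, flattened for brevity
	Reasoning     []string
}

func assemble(blocks []outBlock, structuredToolName string) (assembled, error) {
	var msg assembled
	for _, b := range blocks {
		switch {
		case b.ToolUse != nil && b.ToolUse.Name == structuredToolName:
			// Structured output: marshal the tool input into ContentStr and
			// keep it out of ToolCalls.
			raw, err := json.Marshal(b.ToolUse.Input)
			if err != nil {
				return msg, err
			}
			s := string(raw)
			msg.ContentStr = &s
		case b.ToolUse != nil:
			raw, err := json.Marshal(b.ToolUse.Input)
			if err != nil {
				return msg, err
			}
			msg.ToolCalls = append(msg.ToolCalls, b.ToolUse.Name+":"+string(raw))
		case b.Reasoning != nil:
			msg.Reasoning = append(msg.Reasoning, *b.Reasoning)
		case b.Text != nil:
			msg.ContentBlocks = append(msg.ContentBlocks, *b.Text)
		}
	}
	return msg, nil
}
```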

core/providers/openai/types.go (1)

82-190: CacheControl‑stripping for OpenAI chat/responses is now non‑mutating and allocation‑aware

The new MarshalJSON implementations correctly:

  • detect presence of CacheControl first,
  • only then allocate per‑message/tool copies, cloning content/tool‑output blocks and nil‑ing CacheControl on the copies,
  • and shadow messages, tools, and input in aux structs so that wire JSON excludes CacheControl while the original structs stay intact.

This cleanly resolves the prior mutation concern while keeping the no‑CacheControl hot path cheap.

Also applies to: 253-371, 394-452
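
Reduced to a single field, the Go pattern at work looks like this; the real implementations also clone nested content and tool-output blocks before clearing the hint:

```go
package openai

import "encoding/json"

type cacheControl struct {
	Type string `json:"type"`
}

// message is a stand-in with only the fields needed to show the pattern.
type message struct {
	Role         string        `json:"role"`
	Content      string        `json:"content"`
	CacheControl *cacheControl `json:"cache_control,omitempty"`
}

// MarshalJSON strips cache_control from the wire JSON without mutating the
// receiver: a copy is only taken when the field is actually set.
func (m message) MarshalJSON() ([]byte, error) {
	type alias message // alias drops the custom marshaler, avoiding recursion
	if m.CacheControl == nil {
		return json.Marshal(alias(m)) // hot path: no copy needed
	}
	cp := m
	cp.CacheControl = nil
	return json.Marshal(alias(cp))
}
```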

core/providers/anthropic/chat.go (1)

42-289: Anthropic request/response mappings now propagate cache_control on input and normalize content types on output

On the request path, CacheControl is threaded from:

  • tools → AnthropicTool.CacheControl, and
  • all relevant text content blocks (system, tool results, user/assistant) → AnthropicContentBlock.CacheControl,

without disturbing the existing reasoning/thinking mapping. On the response path, the new type‑switch over content blocks correctly distinguishes text, tool_use, and thinking, constructing ChatContentBlocks, tool calls (with JSON‑marshalled inputs), and reasoning details/reasoning text accordingly. This matches the intended prompt‑caching semantics and prior guidance that responses themselves don’t carry cache_control.

Also applies to: 320-365
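
On the wire, the request-side threading amounts to copying the hint onto the Anthropic block so the JSON carries Anthropic's cache_control marker; a stand-in example with simplified field names:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type cacheControl struct {
	Type string `json:"type"`
}

// anthropicTextBlock is a stand-in for the text content block; the hint is
// copied across unchanged from the Bifrost block.
type anthropicTextBlock struct {
	Type         string        `json:"type"`
	Text         string        `json:"text"`
	CacheControl *cacheControl `json:"cache_control,omitempty"`
}

func main() {
	blk := anthropicTextBlock{
		Type:         "text",
		Text:         "You are a helpful assistant.",
		CacheControl: &cacheControl{Type: "ephemeral"},
	}
	out, _ := json.Marshal(blk)
	fmt.Println(string(out))
	// {"type":"text","text":"You are a helpful assistant.","cache_control":{"type":"ephemeral"}}
}
```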

tests/integrations/tests/test_bedrock.py (1)

37-41: Bedrock prompt‑caching tests and helpers look solid

The new Prompt Caching section, tests 21–23, and the validate_cache_write/read helpers are wired correctly: fixtures and parametrization are used properly, cache counters are asserted in a tolerant way (no brittle “must be zero” invariants), and the approximate equality check for system/messages caching should be robust enough for an integration suite.

Also applies to: 1623-1821

tests/integrations/tests/test_anthropic.py (1)

39-41: Anthropic prompt‑caching tests and usage helpers are well‑structured

The new prompt‑caching tests (28–30) and validate_cache_write/read helpers correctly target Anthropic’s usage fields, avoid brittle assumptions about warm caches, and use a reasonable token‑difference tolerance where needed. Signatures and imports are consistent and previous unused‑fixture concerns are resolved.

Also applies to: 52-86, 1421-1608, 1663-1689

core/providers/anthropic/responses.go (1)

1344-1345: Centralized system‑content conversion looks correct

Passing request.System into ConvertAnthropicMessagesToBifrostMessages and handling it via convertAnthropicSystemToBifrostMessages gives a single, symmetric path for Anthropic system content (and its CacheControl) in both directions. The helper also sensibly skips emitting a system message when there’s no text and avoids empty ContentBlocks arrays.

Also applies to: 1627-1649, 2065-2097
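
A rough sketch of the helper's contract, with stand-in types (the real function also carries CacheControl through):

```go
package anthropic

type systemBlock struct {
	Text string
}

type bifrostMessage struct {
	Role          string
	ContentBlocks []systemBlock
}

// convertSystem returns nil when there is no usable text, so callers never
// emit a system message with an empty ContentBlocks slice.
func convertSystem(blocks []systemBlock) *bifrostMessage {
	var kept []systemBlock
	for _, b := range blocks {
		if b.Text != "" {
			kept = append(kept, b)
		}
	}
	if len(kept) == 0 {
		return nil
	}
	return &bifrostMessage{Role: "system", ContentBlocks: kept}
}
```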

Collaborator Author

Pratham-Mishra04 commented Dec 17, 2025

Merge activity

  • Dec 17, 12:08 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Dec 17, 12:08 PM UTC: @Pratham-Mishra04 merged this pull request with Graphite.

@Pratham-Mishra04 Pratham-Mishra04 merged commit d0747ad into main Dec 17, 2025
9 checks passed
@Pratham-Mishra04 Pratham-Mishra04 deleted the 11-25-feat_support_for_prompt_caching_parameter_for_claude_models branch December 17, 2025 12:08