feat: add cache control support for claude models #979
Conversation
Summary by CodeRabbit

📝 Walkthrough

Adds provider-agnostic CacheControl/CachePoint propagation across Anthropic, Bedrock, and OpenAI conversions and streaming; centralizes Anthropic system-message conversion; adjusts Bedrock helpers to emit multiple system/content blocks; refines retry-framework logging and nil-channel retry behavior; and adds cross-provider prompt-caching tests, constants, and scenario gating.
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant Client
participant Bifrost
participant Converter
participant Provider
Note over Client,Bifrost: Request may include ContentBlocks with CacheControl
Client->>Bifrost: Submit chat/responses request (ContentBlocks + CacheControl)
Bifrost->>Converter: Convert to provider format (map CacheControl → CachePoint if needed)
Converter->>Provider: Send provider request (CachePoint/CacheControl set)
Provider-->>Converter: Stream/deltas (may include CachePoint/CacheControl)
Converter->>Bifrost: Map provider cache metadata back (CachePoint → CacheControl)
Bifrost-->>Client: Return responses with CacheControl annotations and usage metrics
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
This stack of pull requests is managed by Graphite. Learn more about stacking.
Force-pushed a839bc0 to 058c2c3
Force-pushed e1c5d21 to b5f6f04
Force-pushed 058c2c3 to bccd957
Actionable comments posted: 3
🧹 Nitpick comments (3)
core/providers/anthropic/responses.go (1)
2514-2519: CacheControl missing for the simple ContentStr path.

When converting `ResponsesMessageTypeMessage` with `ContentStr` (not `ContentBlocks`), the resulting `AnthropicContentBlock` doesn't include `CacheControl`. This is inconsistent with the `ContentBlocks` path at lines 2520-2528, where blocks go through `convertContentBlockToAnthropic`, which does propagate `CacheControl`.

Consider whether `ContentStr` messages should carry cache control metadata. If so, apply it here:

```diff
 if msg.Content.ContentStr != nil {
 	contentBlocks = append(contentBlocks, AnthropicContentBlock{
-		Type: AnthropicContentBlockTypeText,
-		Text: msg.Content.ContentStr,
+		Type:         AnthropicContentBlockTypeText,
+		Text:         msg.Content.ContentStr,
+		CacheControl: msg.Content.CacheControl, // if available at content level
 	})
 }
```

core/internal/testutil/test_retry_framework.go (2)
2152-2157: Consider consistent attempt-start logging across all stream validation functions.

The attempt-start logging added here improves debugging visibility, but it's only present in `WithResponsesStreamValidationRetry`. Similar functions like `WithSpeechStreamValidationRetry` (line 1987+) and `WithChatStreamValidationRetry` (line 2358+) lack this logging.

For consistency, consider adding similar logging to all stream validation retry functions:

```diff
+	// Log attempt start (especially for retries)
+	if attempt > 1 {
+		t.Logf("🔄 Starting [speech|chat] stream retry attempt %d/%d for %s", attempt, config.MaxAttempts, context.ScenarioName)
+	} else {
+		t.Logf("🔄 Starting [speech|chat] stream test attempt %d/%d for %s", attempt, config.MaxAttempts, context.ScenarioName)
+	}
```
2217-2217: Consider adding waiting-time logs consistently across all retry functions.

The "⏳ Waiting" logs improve visibility into retry timing, but they're only present in `WithResponsesStreamValidationRetry`. Other retry functions (e.g., `WithChatTestRetry`, `WithStreamRetry`, `WithSpeechStreamValidationRetry`) calculate and sleep on delays without logging the wait time.

For a consistent debugging experience, consider adding similar logs before `time.Sleep` calls in all retry functions:

```diff
 delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
+t.Logf("⏳ Waiting %v before retry...", delay)
 time.Sleep(delay)
```

Also applies to: 2251-2251
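If these logs end up duplicated across several retry helpers, one option is a small shared helper in the test framework. This is purely an illustrative sketch, not existing code; it only assumes the standard `testing` and `time` packages:

```go
import (
	"testing"
	"time"
)

// logAndWait is a hypothetical shared helper: it centralizes the wait log plus
// the sleep so every retry function reports delays in the same format.
func logAndWait(t *testing.T, delay time.Duration) {
	t.Helper()
	t.Logf("⏳ Waiting %v before retry...", delay)
	time.Sleep(delay)
}
```

Call sites could then collapse to `logAndWait(t, calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay))`, keeping the log format consistent across retry functions.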
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- core/internal/testutil/test_retry_framework.go (5 hunks)
- core/providers/anthropic/chat.go (2 hunks)
- core/providers/anthropic/responses.go (11 hunks)
- core/providers/anthropic/types.go (1 hunks)
- core/providers/anthropic/utils.go (1 hunks)
- core/providers/bedrock/chat.go (1 hunks)
- core/providers/bedrock/types.go (1 hunks)
- core/providers/bedrock/utils.go (4 hunks)
- core/providers/openai/types.go (1 hunks)
- core/schemas/chatcompletions.go (1 hunks)
- core/schemas/responses.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/bedrock/types.go
- core/schemas/chatcompletions.go
- core/providers/openai/types.go
- core/providers/anthropic/types.go
- core/providers/bedrock/utils.go
- core/providers/anthropic/utils.go
- core/providers/bedrock/chat.go
- core/providers/anthropic/chat.go
- core/internal/testutil/test_retry_framework.go
- core/schemas/responses.go
- core/providers/anthropic/responses.go
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/bedrock/types.go
- core/schemas/chatcompletions.go
- core/providers/openai/types.go
- core/providers/anthropic/types.go
- core/providers/bedrock/utils.go
- core/providers/anthropic/utils.go
- core/providers/bedrock/chat.go
- core/providers/anthropic/chat.go
- core/internal/testutil/test_retry_framework.go
- core/schemas/responses.go
- core/providers/anthropic/responses.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
🧬 Code graph analysis (8)
core/providers/bedrock/types.go (1)
- core/schemas/chatcompletions.go (1)
  - CacheControl (608-611)

core/providers/openai/types.go (2)
- core/schemas/chatcompletions.go (1)
  - CacheControl (608-611)
- core/schemas/responses.go (1)
  - ResponsesToolMessage (460-480)

core/providers/anthropic/types.go (1)
- core/schemas/chatcompletions.go (1)
  - CacheControl (608-611)

core/providers/bedrock/utils.go (1)
- core/schemas/chatcompletions.go (1)
  - CacheControl (608-611)

core/providers/anthropic/utils.go (3)
- core/providers/gemini/types.go (1)
  - Type (778-778)
- core/providers/anthropic/types.go (2)
  - AnthropicContentBlockTypeImage (136-136)
  - AnthropicImageSource (163-168)
- core/schemas/chatcompletions.go (1)
  - CacheControl (608-611)

core/providers/bedrock/chat.go (1)
- core/schemas/chatcompletions.go (2)
  - ChatContentBlockTypeText (582-582)
  - CacheControl (608-611)

core/providers/anthropic/chat.go (2)
- core/providers/anthropic/types.go (1)
  - AnthropicContentBlockTypeText (135-135)
- core/schemas/chatcompletions.go (1)
  - CacheControl (608-611)

core/schemas/responses.go (1)
- core/schemas/chatcompletions.go (1)
  - CacheControl (608-611)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (20)
core/providers/bedrock/chat.go (1)
68-72: LGTM! Cache control propagation is correct.

The addition of the CacheControl field properly propagates cache metadata from Bedrock content blocks to Bifrost's ChatContentBlock representation. The use of the typed constant `schemas.ChatContentBlockTypeText` is consistent with the broader refactoring across providers.

core/schemas/chatcompletions.go (1)
597-611: LGTM! Cache control schema definitions are well-structured.

The introduction of the `CacheControl` type with `Type` and optional `TTL` fields provides a clean, provider-agnostic way to represent cache control metadata. The optional `omitempty` tag ensures backward compatibility.

Note: The TTL comment indicates a `"1m" | "1h"` format, but there's no validation. Consider adding validation logic or more detailed documentation if TTL values need to follow a strict format.

core/schemas/responses.go (1)
406-408: LGTM! Consistent cache control support in the Responses API.

The addition of the `CacheControl` field to `ResponsesMessageContentBlock` maintains consistency with the chat completions schema and properly documents that it's supported by specific providers (Anthropic, Bedrock).

core/providers/anthropic/chat.go (2)
142-146: LGTM! Proper use of typed constants and cache control propagation.

The system message text block construction correctly:
- Uses the typed constant `AnthropicContentBlockTypeText` instead of a string literal
- Propagates `CacheControl` from the source block

This aligns with the provider-agnostic cache control support introduced in this PR.
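For reference, the pattern being praised here looks roughly like the following sketch. Variable names such as `srcBlock` and `systemBlocks` are placeholders for illustration, not the actual identifiers in chat.go:

```go
// Illustrative only: build the Anthropic system text block with the typed
// constant and carry the block-level cache control through unchanged.
systemBlocks = append(systemBlocks, AnthropicContentBlock{
	Type:         AnthropicContentBlockTypeText,
	Text:         srcBlock.Text,
	CacheControl: srcBlock.CacheControl,
})
```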
177-181: LGTM! Consistent cache control propagation in tool results.

The tool result text block construction follows the same pattern as system messages, properly using typed constants and propagating cache control metadata.
core/providers/bedrock/types.go (1)
108-109: LGTM! Consistent cache control field addition.

The `CacheControl` field addition to `BedrockContentBlock` is consistent with the same change in the Anthropic provider and aligns with the provider-agnostic cache control support.

core/providers/anthropic/types.go (1)
148-159: LGTM! Cache control field properly integrated.

The `CacheControl` field addition to `AnthropicContentBlock` is properly typed and follows the same pattern as other providers in this PR.

core/providers/anthropic/utils.go (1)
45-48: LGTM! Image block conversion properly updated.

The conversion function correctly:
- Uses the typed constant `AnthropicContentBlockTypeImage` instead of a string literal
- Propagates `CacheControl` from the source block to the Anthropic representation

This is consistent with similar changes for text blocks in other files.
core/providers/bedrock/utils.go (5)
274-277: LGTM: CacheControl propagation for text blocks in tool results.

The `CacheControl` field is correctly propagated from the input `ChatContentBlock` to the `BedrockContentBlock` for text content within tool results.
285-288: LGTM: CacheControl propagation for image blocks in tool results.

The `CacheControl` field is correctly propagated for image content within tool results, consistent with the text block handling.
334-337: LGTM: CacheControl propagation in general content block conversion.

Both text and image blocks in `convertContentBlock` correctly propagate `CacheControl` from the source block to the Bedrock-specific block.

Also applies to: 348-351
260-268: CacheControl not propagated for JSON tool result content.

When tool result content is parsed as JSON (lines 263-268), the resulting `BedrockContentBlock` with the `JSON` field doesn't include `CacheControl`. This is inconsistent with the text/image paths. Consider whether cache control semantics should apply to JSON content blocks as well.
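A minimal sketch of the suggested change, assuming the JSON branch builds its block the same way as the text/image branches (variable names like `toolResultBlocks` and `parsedJSON` are approximations, not the actual code):

```go
// Hypothetical: also carry CacheControl on the JSON-typed tool-result block,
// mirroring what the text and image paths already do.
toolResultBlocks = append(toolResultBlocks, BedrockContentBlock{
	JSON:         parsedJSON,
	CacheControl: block.CacheControl,
})
```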
311-315: CacheControl missing for the simple string content path.

When `content.ContentStr` is non-nil (lines 311-315), the resulting `BedrockContentBlock` doesn't carry `CacheControl`. This is inconsistent with the multi-modal content block paths that now propagate it. If the source `ChatMessageContent` has a `CacheControl` field at the content level, consider propagating it here as well.

```bash
#!/bin/bash
# Check if ChatMessageContent has a CacheControl field that should be propagated
ast-grep --pattern 'type ChatMessageContent struct { $$$ }'
```

core/providers/anthropic/responses.go (7)
219-246: Improved system message handling with CacheControl support.

The refactored system message conversion now properly preserves individual content blocks with their `CacheControl` metadata instead of concatenating text. This is a good improvement that enables cache control for system prompts.
1461-1468: LGTM: CacheControl propagation for text content blocks.

Text blocks in `convertAnthropicMessageToBifrostResponsesMessages` correctly propagate `CacheControl` from the Anthropic content block to the Bifrost responses content block.
1531-1535: LGTM: CacheControl propagation in tool result content blocks.

Both regular tool results and MCP tool results correctly propagate `CacheControl` for text content blocks, maintaining consistency across tool handling paths.

Also applies to: 1587-1591
2324-2332: LGTM: CacheControl in Anthropic content blocks to Responses conversion.

The text content block conversion correctly includes `CacheControl` when converting from the Anthropic format to the Responses format.
2421-2427: LGTM: CacheControl in nested tool result content blocks.

Both regular tool results and MCP tool results correctly propagate `CacheControl` for text content blocks within nested content structures.

Also applies to: 2482-2488
2700-2727: LGTM: CacheControl propagation in convertContentBlockToAnthropic.

All content block types (text, image, reasoning) correctly propagate `CacheControl` when converting from Bifrost to Anthropic format.
2749-2757: LGTM: CacheControl in the toBifrostResponsesImageBlock helper.

The image block conversion helper correctly includes `CacheControl` from the source Anthropic block.
Force-pushed bccd957 to 5a89f32
Force-pushed 5a89f32 to 5da39b0
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/anthropic/responses.go (1)
2150-2185: Use consistent content types and preserve CacheControl in grouped text-block handling.

In `convertAnthropicContentBlocksToResponsesMessagesGrouped`:
- Line 2176: Input messages (`isOutputMessage == false`) incorrectly use `ResponsesOutputMessageContentTypeText`. They should use `ResponsesInputMessageContentBlockTypeText` to match the non-grouped helper and schema conventions (seen at line 2392 in `convertAnthropicContentBlocksToResponsesMessages`).
- Lines 2165-2168: When accumulating output message text blocks, `CacheControl` is not copied from `block.CacheControl`, dropping any cache-control semantics. The non-grouped helper preserves it (line 2379).

These inconsistencies affect Bedrock compatibility (which uses the grouped path) and message fidelity. Align the grouped path with the non-grouped helper:
- Use `ResponsesInputMessageContentBlockTypeText` for input messages.
- Include `CacheControl: block.CacheControl` when accumulating text.
♻️ Duplicate comments (1)
core/providers/openai/types.go (1)
172-195: Avoid mutating input messages when stripping CacheControl before marshaling.

The loop currently clears `CacheControl` in-place on `OpenAIResponsesRequestInputArray`'s nested content blocks and tool output blocks. Because `msg.Content` and `msg.ResponsesToolMessage.Output` point into the original slice, this mutates the caller's data, so any later use of the same messages loses cache-control metadata.

Safer approach: build a sanitized deep copy (messages, their `ContentBlocks`, and `ResponsesFunctionToolCallOutputBlocks`) and marshal that instead.

Example pattern:

```diff
 func (r *OpenAIResponsesRequestInput) MarshalJSON() ([]byte, error) {
 	if r.OpenAIResponsesRequestInputStr != nil {
 		return sonic.Marshal(*r.OpenAIResponsesRequestInputStr)
 	}
 	if r.OpenAIResponsesRequestInputArray != nil {
-		// Set CacheControl to nil for all content blocks to exclude it from marshaling
-		for _, msg := range r.OpenAIResponsesRequestInputArray {
-			// Remove CacheControl from message content blocks
-			if msg.Content != nil && msg.Content.ContentBlocks != nil {
-				for i := range msg.Content.ContentBlocks {
-					msg.Content.ContentBlocks[i].CacheControl = nil
-				}
-			}
-			// Remove CacheControl from tool message output blocks
-			if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.Output != nil {
-				if msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil {
-					for i := range msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks {
-						msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks[i].CacheControl = nil
-					}
-				}
-			}
-		}
-		return sonic.Marshal(r.OpenAIResponsesRequestInputArray)
+		// Work on a sanitized deep copy to avoid mutating the caller's slice.
+		msgsCopy := make([]schemas.ResponsesMessage, len(r.OpenAIResponsesRequestInputArray))
+		copy(msgsCopy, r.OpenAIResponsesRequestInputArray)
+
+		for i := range msgsCopy {
+			msg := &msgsCopy[i]
+
+			// Strip CacheControl from message content blocks
+			if msg.Content != nil && msg.Content.ContentBlocks != nil {
+				blocksCopy := make([]schemas.ResponsesMessageContentBlock, len(msg.Content.ContentBlocks))
+				copy(blocksCopy, msg.Content.ContentBlocks)
+				for j := range blocksCopy {
+					blocksCopy[j].CacheControl = nil
+				}
+				msg.Content.ContentBlocks = blocksCopy
+			}
+
+			// Strip CacheControl from tool output blocks
+			if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.Output != nil &&
+				msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil {
+				outBlocks := msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks
+				outCopy := make([]schemas.ResponsesMessageContentBlock, len(outBlocks))
+				copy(outCopy, outBlocks)
+				for j := range outCopy {
+					outCopy[j].CacheControl = nil
+				}
+				msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks = outCopy
+			}
+		}
+		return sonic.Marshal(msgsCopy)
 	}
 	return sonic.Marshal(nil)
 }
```

This keeps CacheControl out of the OpenAI wire format without surprising callers by mutating their request payloads.
🧹 Nitpick comments (5)
core/providers/bedrock/types.go (1)
155-201: Bedrock cache point typing and wiring look consistent.

The added `CachePoint` fields on `BedrockSystemMessage` and `BedrockContentBlock`, plus `BedrockCachePointType`/`BedrockCachePoint`, give a clean, explicit shape for Bedrock cache metadata and align with the rest of the Bedrock type modeling. No functional issues from this change.

You might optionally add a brief comment on `BedrockCachePointTypeDefault` to document that `"default"` currently maps to ephemeral cache control in Bifrost, for future readers.

core/providers/bedrock/responses.go (2)
1245-1290: Tool cache control mapping is correct but relies on a positional convention.

The bidirectional mapping between Bedrock tool `cachePoint` entries and Bifrost `CacheControl` looks internally consistent:
- Bedrock → Bifrost: a `BedrockTool` with only `CachePoint` sets `CacheControl` on the last `bifrostReq.Params.Tools` entry.
- Bifrost → Bedrock: a Bifrost tool with `CacheControl` causes you to append an extra `BedrockTool` with just `CachePoint`.

Two notes:
- This is intentionally order-dependent: it assumes a `cachePoint`-only tool always follows the tool definition it applies to. That's fine given how you construct these arrays, but it's worth a short comment near both sites to document the convention.
- You already guard `len(bifrostReq.Params.Tools) > 0`, so malformed input with a leading `cachePoint` is safely ignored.

Functionally this should round-trip cleanly and supports the prompt-caching tests you've added.
Also applies to: 1534-1587
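An illustrative sketch of the Bedrock → Bifrost direction of that convention. Helper and constant names (`convertBedrockTool`, `schemas.CacheControlTypeEphemeral`) are placeholders for what the code presumably does, not verified identifiers:

```go
// A cachePoint-only entry applies to the tool definition emitted just before it.
for _, bt := range bedrockTools {
	if bt.CachePoint != nil && bt.ToolSpec == nil {
		if n := len(bifrostReq.Params.Tools); n > 0 {
			bifrostReq.Params.Tools[n-1].CacheControl = &schemas.CacheControl{
				Type: schemas.CacheControlTypeEphemeral, // assumed constant name
			}
		}
		continue // nothing else to convert for a bare cache point
	}
	bifrostReq.Params.Tools = append(bifrostReq.Params.Tools, convertBedrockTool(bt)) // hypothetical converter
}
```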
2376-2484: System message CachePoint ↔ CacheControl conversion is symmetric and safe.

`convertBifrostMessageToBedrockSystemMessages` now emits a `BedrockSystemMessage{Text: ...}` per text block and, when `CacheControl` is set, a following `BedrockSystemMessage{CachePoint: ...}`. `convertBedrockSystemMessageToBifrostMessages` walks `[]BedrockSystemMessage`, and when it encounters a `CachePoint` entry, it attaches `CacheControl{Type: Ephemeral}` to the last message's last content block, which matches how you emitted it in the other direction.

This gives you a clean round-trip for system-level cache checkpoints, and the guards (`len(bifrostMessages) > 0`, non-nil content/blocks) avoid panics on malformed input.

Consider adding a brief comment that `CachePoint` entries are expected to be emitted immediately after the system text they apply to, as this ordering is now a contract.

tests/integrations/tests/test_bedrock.py (1)
37-39: Prompt caching test is well-structured; consider small cleanups and lint fixes.

The new `test_21_prompt_caching` does a good job of validating the three cache checkpoint locations (system, messages, tools) end-to-end via `usage.cacheWriteInputTokens`/`cacheReadInputTokens`. The overall flow and use of `PROMPT_CACHING_LARGE_CONTEXT` look solid.

A few minor points you may want to address:

Unused fixture argument: `test_config` isn't used in `test_21_prompt_caching`. Given Ruff's ARG002 hint, either:
- Drop it from the signature here, or
- Rename it to `_test_config` or explicitly reference it (e.g., `assert isinstance(test_config, Config)`) if you want to keep the pattern consistent with other tests.

Pointless f-strings: several `print` calls use f-strings without placeholders:
- Line 1630: `print(f"\n--- Test 1: System Message Caching ---")`
- Line 1696: `print(f"\n--- Test 2: Messages Caching ---")`
- Line 1754: `print(f"\n--- Test 3: Tools Caching ---")`

You can remove the `f` prefix on those strings to silence the warning.

Minor prompt text nit: In the system message setup you currently have "You are an AI assistant with  analyzing legal documents." That reads oddly (double space and missing "tasked"). If you care about prompt quality, consider aligning it with the wording used in `PROMPT_CACHING_LARGE_CONTEXT` (e.g., "You are an AI assistant tasked with analyzing legal documents.").

If you're comfortable with the stricter expectation that second requests never write to cache, the assertions around `cacheWriteInputTokens == 0` are fine as a contract test for your abstraction.

Also applies to: 49-65, 1620-1824
tests/integrations/tests/test_anthropic.py (1)
1418-1512: Tidy up `test_28_prompt_caching` for linters and minor style.

The overall structure of the new prompt-caching test looks solid and matches the cross-provider patterns used elsewhere in this suite. A couple of small cleanups will keep lint happy:
- `anthropic_client` and `test_config` are unused but required as fixtures; Ruff flags these as ARG002. The easiest fix is to mark the function definition as intentionally ignoring that rule.
- The final efficiency `print` is an f-string without placeholders (F541); you can safely drop the `f` prefix.

Suggested diff:

```diff
-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_28_prompt_caching(self, anthropic_client, test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_28_prompt_caching(self, anthropic_client, test_config, provider, model):  # noqa: ARG002
@@
-    print(f"\n✓ Prompt caching validated for provider {provider}")
+    print(f"\n✓ Prompt caching validated for provider {provider}")
     print(f"  Cache created: {cache_created} tokens")
     print(f"  Cache read: {cache_read} tokens")
-    print(f"  Efficiency: Cache hit successfully reduced processing on second request")
+    print("  Efficiency: Cache hit successfully reduced processing on second request")
```

(Keeping the f-strings on the earlier lines is useful since they interpolate `provider`, `cache_created`, and `cache_read`; only the last one needs to be plain.)

Also, if you ever see this test become flaky across providers, consider relaxing the `abs(cache_created - cache_read) < 100` tolerance or additionally asserting that the second request's total input tokens are lower than the first's, but that's optional and can be adjusted based on real-world behavior.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- core/internal/testutil/test_retry_framework.go (5 hunks)
- core/providers/anthropic/chat.go (6 hunks)
- core/providers/anthropic/responses.go (13 hunks)
- core/providers/anthropic/types.go (2 hunks)
- core/providers/anthropic/utils.go (1 hunks)
- core/providers/bedrock/chat.go (1 hunks)
- core/providers/bedrock/responses.go (8 hunks)
- core/providers/bedrock/types.go (6 hunks)
- core/providers/bedrock/utils.go (3 hunks)
- core/providers/openai/types.go (1 hunks)
- core/schemas/chatcompletions.go (2 hunks)
- core/schemas/responses.go (2 hunks)
- tests/integrations/tests/test_anthropic.py (3 hunks)
- tests/integrations/tests/test_bedrock.py (3 hunks)
- tests/integrations/tests/utils/common.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/bedrock/utils.go
- core/providers/anthropic/types.go
- core/schemas/responses.go
- core/providers/bedrock/chat.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- tests/integrations/tests/test_bedrock.py
- core/providers/anthropic/utils.go
- core/internal/testutil/test_retry_framework.go
- core/schemas/chatcompletions.go
- core/providers/anthropic/responses.go
- tests/integrations/tests/utils/common.py
- core/providers/anthropic/chat.go
- core/providers/openai/types.go
- core/providers/bedrock/responses.go
- core/providers/bedrock/types.go
- tests/integrations/tests/test_anthropic.py
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/anthropic/utils.go
- core/internal/testutil/test_retry_framework.go
- core/schemas/chatcompletions.go
- core/providers/anthropic/responses.go
- core/providers/anthropic/chat.go
- core/providers/openai/types.go
- core/providers/bedrock/responses.go
- core/providers/bedrock/types.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
🧬 Code graph analysis (6)
tests/integrations/tests/test_bedrock.py (1)
- tests/integrations/tests/utils/parametrize.py (2)
  - get_cross_provider_params_for_scenario (12-47)
  - format_provider_model (50-65)

core/providers/anthropic/utils.go (2)
- core/providers/anthropic/types.go (2)
  - AnthropicContentBlockTypeImage (205-205)
  - AnthropicImageSource (234-239)
- core/schemas/chatcompletions.go (1)
  - CacheControl (610-613)

core/providers/anthropic/responses.go (4)
- core/schemas/responses.go (5)
  - ResponsesMessage (318-331)
  - ResponsesInputMessageRoleSystem (338-338)
  - ResponsesMessageContent (343-348)
  - ResponsesMessageContentBlock (403-418)
  - ResponsesInputMessageContentBlockTypeText (392-392)
- core/schemas/utils.go (1)
  - Ptr (16-18)
- core/schemas/chatcompletions.go (1)
  - CacheControl (610-613)
- core/providers/anthropic/types.go (1)
  - AnthropicContentBlockTypeText (204-204)

core/providers/anthropic/chat.go (2)
- core/schemas/chatcompletions.go (2)
  - CacheControl (610-613)
  - ChatContentBlockTypeText (584-584)
- core/providers/anthropic/types.go (1)
  - AnthropicContentBlockTypeText (204-204)

core/providers/openai/types.go (2)
- core/schemas/chatcompletions.go (1)
  - CacheControl (610-613)
- core/schemas/responses.go (1)
  - ResponsesToolMessage (469-489)

tests/integrations/tests/test_anthropic.py (2)
- tests/integrations/tests/utils/parametrize.py (2)
  - get_cross_provider_params_for_scenario (12-47)
  - format_provider_model (50-65)
- tests/integrations/tests/utils/common.py (1)
  - assert_valid_chat_response (606-638)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py
1620-1620: Unused method argument: test_config
(ARG002)
1630-1630: f-string without any placeholders
Remove extraneous f prefix
(F541)
1696-1696: f-string without any placeholders
Remove extraneous f prefix
(F541)
1754-1754: f-string without any placeholders
Remove extraneous f prefix
(F541)
tests/integrations/tests/test_anthropic.py
1419-1419: Unused method argument: anthropic_client
(ARG002)
1419-1419: Unused method argument: test_config
(ARG002)
1511-1511: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (12)
core/internal/testutil/test_retry_framework.go (2)
818-835: Streaming retry attempt semantics are now explicit and consistent.

The updated comments and `t.Logf("…attempt %d/10…", attempt+1, …)` clearly express that `attempt` is the current failed attempt and the log shows the upcoming attempt, matching how other helpers pass `attempt` into `OnRetry`. No issues here.
2150-2325: Responses stream retry logging/control-flow improvements look correct.

The added per-attempt "Starting … attempt" logs, the extra failure logging before retries, and the explicit "⏳ Waiting … before retry…" messages significantly improve observability. Passing the current failed `attempt` into `config.OnRetry` in all error / nil-channel / validation-failure paths now aligns with the rest of the retry framework, and switching the nil-channel branch to `continue` (with retries) instead of returning fixes the earlier short-circuit bug while still failing on the final attempt. The overall retry behavior remains correct and bounded.

core/providers/anthropic/utils.go (1)
97-105: CacheControl propagation for image blocks looks correct.

Using `AnthropicContentBlockTypeImage` and threading `block.CacheControl` into `AnthropicContentBlock.CacheControl` keeps image handling consistent with the text-path changes and preserves cache semantics end-to-end. No issues spotted here.

core/schemas/chatcompletions.go (1)
244-250: Chat-level CacheControl schema additions look consistent and non-breaking.

Adding `CacheControl *CacheControl` to `ChatTool` and `ChatContentBlock`, plus the `CacheControlType`/`CacheControl` definitions, cleanly models ephemeral/TTL semantics at the chat schema level. Using pointer fields with `omitempty` preserves backward compatibility, and the shape matches the Responses-side CacheControl struct, so conversions can stay symmetric.

Also applies to: 592-613
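Based on the fields described in this review, the schema shape is roughly the following sketch; field tags are approximations, and the `Validate` helper is purely hypothetical, illustrating the TTL check suggested earlier for the `"1m" | "1h"` format:

```go
package schemas

import "fmt"

// CacheControlType models the provider-agnostic cache control type described
// in the review; "ephemeral" is assumed to match Anthropic's cache_control type.
type CacheControlType string

const CacheControlTypeEphemeral CacheControlType = "ephemeral"

// CacheControl is attached to ChatTool/ChatContentBlock (and the Responses
// equivalents) via optional pointer fields with omitempty.
type CacheControl struct {
	Type CacheControlType `json:"type"`
	TTL  *string          `json:"ttl,omitempty"` // "1m" | "1h"
}

// Validate is a hypothetical helper, not part of the PR: it enforces the TTL
// format documented in the schema comment.
func (c *CacheControl) Validate() error {
	if c == nil || c.TTL == nil {
		return nil
	}
	switch *c.TTL {
	case "1m", "1h":
		return nil
	default:
		return fmt.Errorf("cache_control: unsupported ttl %q (want \"1m\" or \"1h\")", *c.TTL)
	}
}
```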
core/providers/anthropic/chat.go (1)
42-72: CacheControl propagation across Anthropic chat/tool paths is wired correctly.

Tool definitions, system text blocks, tool-result text, and user/assistant text content all now copy `CacheControl` between `schemas.Chat*` and `AnthropicContentBlock`/`AnthropicTool`, and the reverse mappings in `ToBifrostChatResponse`/`ToAnthropicChatResponse` preserve it. The only path that drops CacheControl is where a single text block is collapsed to a plain string, which is expected since there's no per-block metadata in that representation. Looks good.

Also applies to: 139-159, 167-201, 218-250, 318-333, 504-518
core/providers/anthropic/responses.go (2)
1340-1347: System message conversion now correctly handles CacheControl and empty-block cases.

`ToBifrostResponsesRequest` delegating system handling to `ConvertAnthropicMessagesToBifrostMessages`, plus the new `convertAnthropicSystemToBifrostMessages` implementation, fixes the prior empty-`ContentBlocks` issue and preserves `CacheControl` on system text blocks. System content is emitted only when there's actual text (either `ContentStr` or non-empty `ContentBlocks`), which should be safer for downstream consumers.

Also applies to: 2065-2097
2260-2267: CacheControl is propagated comprehensively across Responses text/image/tool conversions.

The new assignments of `CacheControl`:
- On text blocks in tool outputs (`toolMsgContentBlocks` for function/MCP/computer outputs),
- On Anthropic→Bifrost text blocks for both input and output messages,
- In `convertContentBlockToAnthropic` for text and image blocks (via `ConvertToAnthropicImageBlock`),
- In `toBifrostResponsesImageBlock`, and
- Between `AnthropicTool.CacheControl` and `schemas.ResponsesTool.CacheControl` in both `convertAnthropicToolToBifrost` and `convertBifrostToolToAnthropic`

give you a symmetric, end-to-end CacheControl story across Responses and Anthropic. No correctness issues stood out in these paths.
core/providers/bedrock/responses.go (2)
1703-1712: Cached token usage propagation from Bifrost to Bedrock looks correctPopulating
CacheReadInputTokensandCacheWriteInputTokensfromUsage.InputTokensDetails.CachedTokensandUsage.OutputTokensDetails.CachedTokensinToBedrockConverseResponsematches the schema and lets the Bedrock integration tests assert cache reads/writes viausage.cacheReadInputTokens/cacheWriteInputTokens. No issues here.
2524-2686: Content block CachePoint handling is consistent with Bifrost CacheControlThe new logic for content‑block caching appears coherent:
convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks:
- Emits a normal
BedrockContentBlockfor each supported Bifrost block type.- When
block.CacheControl != nil, appends a secondBedrockContentBlockwith onlyCachePointset. Unknown block types are skipped entirely, including their cache control, which is acceptable.
convertSingleBedrockMessageToBifrostMessages:
- On encountering a
BedrockContentBlockwithCachePointonly, it back‑patchesCacheControl{Type: Ephemeral}onto the last output message’s last content block, matching the emission pattern above.- It safely no‑ops if there are no prior messages or content blocks.
This should make cache control on text (and reasoning) content blocks round‑trip correctly between Bedrock and the Responses schema, and underpins the system/messages prompt‑caching tests.
No functional issues detected.
Also applies to: 2737-2785
tests/integrations/tests/utils/common.py (1)
226-286: Shared large-context prompt constant is appropriate for caching testsDefining
PROMPT_CACHING_LARGE_CONTEXThere and repeating it to guarantee a sizable token footprint is a good approach for exercising prompt caching across providers. No issues from a test or maintenance perspective.tests/integrations/tests/test_anthropic.py (2)
39-39: Doc list entry for Test 28 is consistent and clearThe new “Prompt caching – cache creation and cache hit validation” entry correctly documents
test_28_prompt_cachingand keeps the numbered scenario list in sync. No changes needed.
80-83: Shared large-context import is used appropriatelyImporting
PROMPT_CACHING_LARGE_CONTEXThere and reusing it intest_28_prompt_cachingkeeps the test focused on caching behavior and avoids duplicating a big literal in this file. This is a good use of the shared test constant.
Force-pushed 5da39b0 to 7f9b310
Actionable comments posted: 1
♻️ Duplicate comments (1)
core/providers/openai/types.go (1)
178-194: Avoid mutating `OpenAIResponsesRequestInputArray` when stripping `CacheControl`.

The loop currently clears `CacheControl` in-place on `r.OpenAIResponsesRequestInputArray` (and its nested content/output blocks). Because `msg.Content` and `ResponsesFunctionToolCallOutputBlocks` share backing data with the caller, this silently mutates the original request structure and can break any downstream reuse (logging, retries, multiple sends).

Build a sanitized copy of the messages (and their relevant nested slices) before marshaling so the caller's data remains unchanged.
func (r *OpenAIResponsesRequestInput) MarshalJSON() ([]byte, error) { if r.OpenAIResponsesRequestInputStr != nil { return sonic.Marshal(*r.OpenAIResponsesRequestInputStr) } if r.OpenAIResponsesRequestInputArray != nil { - // Set CacheControl to nil for all content blocks to exclude it from marshaling - for _, msg := range r.OpenAIResponsesRequestInputArray { - // Remove CacheControl from message content blocks - if msg.Content != nil && msg.Content.ContentBlocks != nil { - for i := range msg.Content.ContentBlocks { - msg.Content.ContentBlocks[i].CacheControl = nil - } - } - // Remove CacheControl from tool message output blocks - if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.Output != nil { - if msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil { - for i := range msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks { - msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks[i].CacheControl = nil - } - } - } - } - return sonic.Marshal(r.OpenAIResponsesRequestInputArray) + // Work on a copy so we don't mutate the caller's data. + msgsCopy := make([]schemas.ResponsesMessage, len(r.OpenAIResponsesRequestInputArray)) + copy(msgsCopy, r.OpenAIResponsesRequestInputArray) + + for i := range msgsCopy { + msg := &msgsCopy[i] + + // Strip CacheControl from message content blocks + if msg.Content != nil && msg.Content.ContentBlocks != nil { + blocksCopy := make([]schemas.ResponsesMessageContentBlock, len(msg.Content.ContentBlocks)) + copy(blocksCopy, msg.Content.ContentBlocks) + for j := range blocksCopy { + blocksCopy[j].CacheControl = nil + } + msg.Content.ContentBlocks = blocksCopy + } + + // Strip CacheControl from tool message output blocks + if msg.ResponsesToolMessage != nil && + msg.ResponsesToolMessage.Output != nil && + msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks != nil { + + outputBlocksCopy := make( + []schemas.ResponsesMessageContentBlock, + len(msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks), + ) + copy(outputBlocksCopy, msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks) + for j := range outputBlocksCopy { + outputBlocksCopy[j].CacheControl = nil + } + msg.ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks = outputBlocksCopy + } + } + return sonic.Marshal(msgsCopy) } return sonic.Marshal(nil) }
🧹 Nitpick comments (5)
tests/integrations/tests/utils/common.py (1)
88-281: Prompt-caching test fixtures look solid; duplication is acceptable here.

The `PROMPT_CACHING_TOOLS` definitions and `PROMPT_CACHING_LARGE_CONTEXT` blob provide realistic, rich fixtures for cache-behavior tests and are structurally consistent with existing tool/message formats. If you start evolving tool schemas heavily, consider reusing `WEATHER_TOOL`/`CALCULATOR_TOOL` to avoid drift, but it's not required for this PR.

Also applies to: 421-481
tests/integrations/tests/test_anthropic.py (1)
1421-1485: Consider relaxing the cache token comparison assertion.

Line 1481 asserts that cache read tokens should be within 100 tokens of cache write tokens:
`assert abs(cache_write_tokens - cache_read_tokens) < 100`

This assertion may be overly strict. Cache read and write token counts can legitimately differ due to:
- Provider-specific tokenization variations
- Cache granularity differences
- Metadata overhead
Consider either:
- Removing this assertion entirely (the individual validations in the helper functions are sufficient)
- Increasing the tolerance threshold
- Making it a warning rather than a hard failure
```diff
-    assert abs(cache_write_tokens - cache_read_tokens) < 100, \
-        f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})"
+    # Log token difference for debugging but don't enforce strict equality
+    print(f"Cache token difference: {abs(cache_write_tokens - cache_read_tokens)} tokens")
```

Note: The unused parameter warnings from static analysis are false positives - these are standard pytest fixture patterns.
core/providers/anthropic/responses.go (2)
2065-2097: System content → Bifrost messages helper is consistent, with a minor type nitThe new
convertAnthropicSystemToBifrostMessagescorrectly:
- Emits a single system message for
ContentStr.- Emits one system message with text
ContentBlocks, propagatingCacheControlfrom Anthropic blocks.One minor consistency nit: these system messages are inputs, but system
ContentBlocksuseResponsesOutputMessageContentTypeText. Elsewhere (e.g.,convertAnthropicContentBlocksToResponsesMessages) non‑output messages useResponsesInputMessageContentBlockTypeText. If no downstream code depends on the “output_text” type here, consider switching to the input variant for clarity and consistency.- Type: schemas.ResponsesOutputMessageContentTypeText, + Type: schemas.ResponsesInputMessageContentBlockTypeText,
2174-2180: CacheControl propagation across Anthropic↔Bifrost looks correct; a couple of small consistency gapsThe new CacheControl wiring is generally solid:
- Input messages (
convertAnthropicContentBlocksToResponsesMessagesGroupednon‑output branch, andconvertAnthropicContentBlocksToResponsesMessagesinput branch) now emitContentBlockswithCacheControlon text blocks.- Tool result conversions (
AnthropicContentBlockTypeToolResultpaths in both grouped/ungrouped converters) attachCacheControlon inner text blocks.- Tool definitions (
convertAnthropicToolToBifrost/convertBifrostToolToAnthropic) mirrorCacheControlboth ways.convertContentBlockToAnthropicmirrorsCacheControlfor text and image blocks, andtoBifrostResponsesImageBlockpreserves it back into Bifrost.Two minor improvements for completeness:
Grouped output text path drops
CacheControl
InconvertAnthropicContentBlocksToResponsesMessagesGrouped, theisOutputMessagecase forAnthropicContentBlockTypeTextstill ignoresblock.CacheControl:accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{ Type: schemas.ResponsesOutputMessageContentTypeText, Text: block.Text, })Even if that path isn’t currently used for outputs, it’s cheap and safer to keep it consistent:
- accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{ - Type: schemas.ResponsesOutputMessageContentTypeText, - Text: block.Text, - }) + accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{ + Type: schemas.ResponsesOutputMessageContentTypeText, + Text: block.Text, + CacheControl: block.CacheControl, + })Stale comment on input text path
InconvertAnthropicContentBlocksToResponsesMessages, the comment still says “For input messages, use ContentStr” but the implementation now (correctly) usesContentBlockswithResponsesInputMessageContentBlockTypeText. Consider updating the comment to avoid confusion.Overall behavior looks correct; these are just small consistency cleanups.
Also applies to: 2262-2266, 2377-2380, 2390-2395, 2495-2498, 2560-2563, 3082-3084, 3262-3264, 3330-3333, 3343-3344, 3382-3383
core/providers/bedrock/responses.go (1)
2378-2385: System/content CachePoint handling is consistent both directions; corner-case worth notingAcross the helpers:
convertBifrostMessageToBedrockSystemMessagesemits[Text, CachePoint]pairs when a system text block hasCacheControl.convertBedrockSystemMessageToBifrostMessagesinterpretsBedrockSystemMessage.CachePointby marking the last content block of the last emitted system message asCacheControl{Type: Ephemeral}.convertBifrostResponsesMessageContentBlocksToBedrockContentBlocksemits[block, CachePoint]for any content block withCacheControl.convertSingleBedrockMessageToBifrostMessagesinterpretsBedrockContentBlock.CachePointby tagging the last content block of the last emitted Bifrost message with EphemeralCacheControl.This gives a symmetric representation for both system and regular content blocks and correctly localizes the cache marker to the preceding block.
One behavior to be aware of: if Bedrock ever sends a
CachePointas the first element (before any text/content), we’ll silently ignore it because there’s no prior Bifrost message/block to tag. If that’s not allowed by the upstream API, no change needed; if it is allowed, we might eventually want to accumulate such “orphan” CachePoints and apply them when the first content block is created.Also applies to: 2399-2420, 2450-2484, 2657-2666, 2747-2781
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- core/internal/testutil/test_retry_framework.go (5 hunks)
- core/providers/anthropic/chat.go (6 hunks)
- core/providers/anthropic/responses.go (13 hunks)
- core/providers/anthropic/types.go (2 hunks)
- core/providers/anthropic/utils.go (1 hunks)
- core/providers/bedrock/chat.go (1 hunks)
- core/providers/bedrock/responses.go (8 hunks)
- core/providers/bedrock/types.go (6 hunks)
- core/providers/bedrock/utils.go (3 hunks)
- core/providers/openai/types.go (1 hunks)
- core/schemas/chatcompletions.go (2 hunks)
- core/schemas/responses.go (2 hunks)
- tests/integrations/tests/test_anthropic.py (5 hunks)
- tests/integrations/tests/test_bedrock.py (3 hunks)
- tests/integrations/tests/utils/common.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- core/providers/bedrock/utils.go
- core/providers/bedrock/chat.go
- core/providers/anthropic/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/anthropic/chat.go
- core/schemas/responses.go
- core/providers/anthropic/types.go
- core/schemas/chatcompletions.go
- core/providers/openai/types.go
- core/providers/bedrock/types.go
- tests/integrations/tests/test_bedrock.py
- core/providers/anthropic/responses.go
- tests/integrations/tests/test_anthropic.py
- core/internal/testutil/test_retry_framework.go
- tests/integrations/tests/utils/common.py
- core/providers/bedrock/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/anthropic/chat.go
- core/schemas/responses.go
- core/providers/anthropic/types.go
- core/schemas/chatcompletions.go
- core/providers/openai/types.go
- core/providers/bedrock/types.go
- core/providers/anthropic/responses.go
- core/internal/testutil/test_retry_framework.go
- core/providers/bedrock/responses.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
🧬 Code graph analysis (6)
core/providers/anthropic/chat.go (3)
core/schemas/chatcompletions.go (2)
CacheControl(610-613)ChatContentBlockTypeText(584-584)core/providers/gemini/types.go (1)
Type(782-782)core/providers/anthropic/types.go (1)
AnthropicContentBlockTypeText(204-204)
core/schemas/responses.go (1)
core/schemas/chatcompletions.go (1)
CacheControl(610-613)
core/providers/anthropic/types.go (1)
core/schemas/chatcompletions.go (1)
CacheControl(610-613)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
CacheControl(610-613)core/schemas/responses.go (1)
ResponsesToolMessage(469-489)
core/providers/anthropic/responses.go (3)
core/schemas/responses.go (6)
ResponsesMessage(318-331)ResponsesInputMessageRoleSystem(338-338)ResponsesMessageContent(343-348)ResponsesMessageContentBlock(403-418)ResponsesOutputMessageContentTypeText(396-396)ResponsesInputMessageContentBlockTypeText(392-392)core/schemas/chatcompletions.go (1)
CacheControl(610-613)core/providers/anthropic/types.go (1)
AnthropicContentBlockTypeText(204-204)
tests/integrations/tests/test_anthropic.py (1)
tests/integrations/tests/utils/parametrize.py (2)
get_cross_provider_params_for_scenario(12-47)format_provider_model(50-65)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py
1624-1624: Unused method argument: test_config
(ARG002)
1674-1674: Unused method argument: test_config
(ARG002)
1725-1725: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_anthropic.py
1422-1422: Unused method argument: anthropic_client
(ARG002)
1422-1422: Unused method argument: test_config
(ARG002)
1488-1488: Unused method argument: anthropic_client
(ARG002)
1488-1488: Unused method argument: test_config
(ARG002)
1562-1562: Unused method argument: anthropic_client
(ARG002)
1562-1562: Unused method argument: test_config
(ARG002)
🔇 Additional comments (23)
core/internal/testutil/test_retry_framework.go (3)
832-834: LGTM! Attempt parameter semantics now consistent.The comment correctly documents that
attemptrepresents the current failed attempt number, and the log displaysattempt+1to show the next attempt. This resolves the inconsistency flagged in the previous review where different retry functions had different semantics.
2153-2158: Excellent logging improvements for observability.The added logs provide clear visibility into:
- Attempt start/progression (lines 2153-2158)
- Retry errors with context (lines 2205-2209)
- Waiting time before retries (line 2218)
These enhancements will make debugging test failures much easier.
Also applies to: 2205-2209, 2218-2218
2211-2212: LGTM! Attempt number semantics now consistent across all retry functions.

The updated comments (lines 2211, 2246, 2317) confirm that `config.OnRetry` now receives the current failed attempt number, making `WithResponsesStreamValidationRetry` consistent with other retry helpers like `WithStreamRetry` and `WithChatTestRetry`.

The enhanced nil channel handling (lines 2244-2254) properly logs and retries, and the CRITICAL comment at line 2254 prevents a potential regression where the function might return early instead of retrying.
Also applies to: 2244-2254, 2317-2318
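A simplified sketch of the nil-channel branch being described, shape only; the actual variable and field names in the framework may differ:

```go
// If the stream call returned a nil channel, log it, notify OnRetry, and
// retry instead of returning early (the regression the CRITICAL comment guards against).
if responseChannel == nil {
	t.Logf("❌ Attempt %d/%d failed: response channel is nil", attempt, config.MaxAttempts)
	if attempt == config.MaxAttempts {
		t.Fatalf("stream returned nil channel after %d attempts", attempt)
	}
	if config.OnRetry != nil {
		config.OnRetry(attempt, fmt.Errorf("nil response channel")) // current failed attempt number
	}
	continue // keep retrying rather than returning
}
```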
core/schemas/responses.go (1)
416-418: CacheControl propagation on responses content blocks and tools looks consistent.Adding
CacheControl *CacheControltoResponsesMessageContentBlockandResponsesToolwithomitemptyis a clean, backward‑compatible way to carry provider cache metadata through the unified responses schema. No marshal/unmarshal conflicts in this file.Also applies to: 1042-1044
core/schemas/chatcompletions.go (1)
246-249: Chat CacheControl schema extension is well‑scoped and backwards‑compatible.Introducing
CacheControlType/CacheControland wiringCacheControl *CacheControlintoChatToolandChatContentBlockcleanly models ephemeral/TTL behavior while keeping existing payloads valid (omitempty, pointer fields, no custom marshalling to update).Also applies to: 592-613
core/providers/anthropic/types.go (1)
218-231: Anthropic CacheControl wiring matches shared schema and is safe.Adding
CacheControl *schemas.CacheControltoAnthropicContentBlockandAnthropicTool(withomitempty) correctly exposes provider cache metadata without affecting existing flows or custom (un)marshalling behavior in this file.Also applies to: 292-297
core/providers/anthropic/chat.go (6)
65-67: LGTM: Tool cache control propagation is correct. The implementation properly propagates `CacheControl` from Bifrost tools to Anthropic tools with appropriate nil checking.
149-152: LGTM: System message cache control is properly implemented. The system content blocks correctly include `Type`, `Text`, and `CacheControl` fields, maintaining consistency with the schema.
184-187: LGTM: Tool result cache control propagation is consistent. The implementation correctly propagates `CacheControl` for text blocks within tool results, following the same pattern as other content blocks.
241-244: LGTM: Message content cache control is properly handled. The propagation of `CacheControl` for user and assistant message content blocks is correctly implemented and consistent with the overall pattern.
327-330: LGTM: Bidirectional cache control mapping is correctly implemented. The conversion from Anthropic to Bifrost properly preserves `CacheControl` metadata, ensuring cache information is available to downstream consumers.
513-516: LGTM: Reverse conversion preserves cache control. The Bifrost-to-Anthropic conversion correctly includes `CacheControl`, completing the bidirectional mapping required for proper cache metadata flow.
tests/integrations/tests/test_bedrock.py (3)
1673-1722: LGTM: Messages caching test is correctly structured. The test properly validates prompt caching at the message level, following a consistent pattern of cache write validation followed by cache read validation.
1724-1785: LGTM: Tools caching test is properly implemented. The test correctly validates prompt caching with tool definitions, converting tools to Bedrock format and adding a cache checkpoint after all tool definitions.
1787-1813: LGTM: Cache validation helpers are well-designed. The `validate_cache_write` and `validate_cache_read` functions provide clear, reusable validation logic with helpful debug output for troubleshooting cache behavior.
tests/integrations/tests/test_anthropic.py (3)
1487-1559: LGTM: Messages caching test is correctly implemented. The test properly validates prompt caching at the message level using Anthropic's `cache_control` format with ephemeral type.
1561-1610: LGTM: Tools caching test uses correct Anthropic format. The test properly implements tools caching by adding `cache_control` directly to the last tool definition, which is the correct approach for the Anthropic API (vs. Bedrock's separate cachePoint).
1665-1695: LGTM: Cache validation helpers correctly use Anthropic SDK format. The helpers properly use snake_case attribute names (`cache_creation_input_tokens`, `cache_read_input_tokens`) consistent with the Anthropic SDK, and safely access attributes using `getattr` with defaults.
core/providers/anthropic/responses.go (1)
1340-1346: System + grouped-tool routing into `ConvertAnthropicMessagesToBifrostMessages` looks correct. Passing `request.System` into `ConvertAnthropicMessagesToBifrostMessages` and toggling `keepToolsGrouped` based on the provider centralizes system handling and Bedrock-specific grouping without changing external signatures; this wiring looks sound and matches the rest of the file's conversion patterns.
core/providers/bedrock/responses.go (2)
1244-1290: Tool-level CachePoint ↔ CacheControl mapping is symmetric and scoped correctly. The new logic:
- In `ToBifrostResponsesRequest`, when a Bedrock tool entry has `CachePoint` and no `ToolSpec`, you tag the previous Bifrost tool with `CacheControl{Type: Ephemeral}`.
- In `ToBedrockResponsesRequest`, when a Bifrost tool has a non-nil `CacheControl`, you append a synthetic `BedrockTool` with `CachePoint{Type: default}` immediately after its `ToolSpec`.
This cleanly round-trips tool cache metadata while preserving existing ToolSpec behavior. The "attach to last tool" semantics match how the Bedrock tools array is constructed here, so there's no obvious ordering bug.
Also applies to: 1535-1583
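A self-contained sketch of that round trip, using simplified stand-in types rather than the real `BedrockTool`/`BedrockCachePoint`/Bifrost structs (the helper names `toBedrockTools`/`toBifrostTools` are hypothetical):

```go
// Sketch of the tool-level CachePoint <-> CacheControl round trip described
// above; types are local stand-ins shaped roughly like the real ones.
package sketch

type cacheControl struct{ Type string } // "ephemeral"

type bifrostTool struct {
	Name         string
	CacheControl *cacheControl
}

type bedrockCachePoint struct {
	Type string `json:"type"` // "default"
}

type bedrockTool struct {
	ToolSpec   *string            `json:"toolSpec,omitempty"` // simplified stand-in for a tool spec
	CachePoint *bedrockCachePoint `json:"cachePoint,omitempty"`
}

// Bifrost -> Bedrock: every cached tool is followed by a bare cachePoint entry.
func toBedrockTools(tools []bifrostTool) []bedrockTool {
	var out []bedrockTool
	for _, t := range tools {
		name := t.Name
		out = append(out, bedrockTool{ToolSpec: &name})
		if t.CacheControl != nil {
			out = append(out, bedrockTool{CachePoint: &bedrockCachePoint{Type: "default"}})
		}
	}
	return out
}

// Bedrock -> Bifrost: a bare cachePoint entry tags the previous tool as cached.
func toBifrostTools(tools []bedrockTool) []bifrostTool {
	var out []bifrostTool
	for _, t := range tools {
		switch {
		case t.ToolSpec != nil:
			out = append(out, bifrostTool{Name: *t.ToolSpec})
		case t.CachePoint != nil && len(out) > 0:
			out[len(out)-1].CacheControl = &cacheControl{Type: "ephemeral"}
		}
	}
	return out
}
```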
1701-1713: Usage cache token fields are now round-tripped correctly. Mapping:
- `bifrostResp.Usage.InputTokensDetails.CachedTokens` → `BedrockTokenUsage.CacheReadInputTokens`
- `bifrostResp.Usage.OutputTokensDetails.CachedTokens` → `BedrockTokenUsage.CacheWriteInputTokens`
matches the earlier Bedrock→Bifrost mapping in `BedrockConverseResponse.ToBifrostResponsesResponse`, so cached token counts now flow both directions without loss. No issues here.
155-160: Bedrock CachePoint types/fields align with response conversion logic. Adding:
- `CachePoint *BedrockCachePoint` to `BedrockSystemMessage`, `BedrockContentBlock`, and `BedrockTool`,
- the `BedrockCachePointType` + `BedrockCachePointTypeDefault` constant and the `BedrockCachePoint` struct,
is consistent with how responses.go now converts CachePoint ↔ CacheControl at the system, content, and tool levels. These are optional JSON fields, so they're wire-compatible, and their usage is localized to Bedrock conversions.
No issues from a type/struct perspective.
Also applies to: 163-190, 192-201, 284-287
640-646: ETag field definition is correct for these new types — no action required
`BedrockFileRetrieveRequest`, `BedrockFileDeleteRequest`, and `BedrockFileContentRequest` are new struct types in this commit with an `ETag string` field tagged `json:"etag"`. These types are properly handled in the codebase:
- ETag is explicitly populated at all call sites (bedrock.go lines 903, 910, 917)
- Tests verify ETag values are present
- These request types are never marshaled back to external clients (they're internal request ↔ conversion layer)
The JSON struct tag without `omitempty` does not impose a requirement on clients to send the field during unmarshaling; it only affects whether the field appears in marshaled output. Since the code consistently populates ETag internally before any serialization, the current definition is appropriate.
Likely an incorrect or invalid review comment.
7f9b310 to
6fe2adf
Compare
Actionable comments posted: 1
♻️ Duplicate comments (3)
core/providers/openai/types.go (1)
178-194: Data mutation issue persists from previous review.The code still modifies the original input data structure by setting
CacheControl = nilon content blocks and tool output blocks. This issue was previously flagged with a comprehensive fix suggestion. The newly added tool output block handling (lines 186-193) follows the same problematic mutation pattern.Please refer to the earlier review comment for the detailed solution approach, which includes creating deep copies before modification to avoid mutating shared data structures.
tests/integrations/tests/test_anthropic.py (2)
1487-1559: Message‑level caching test mirrors system test; same unused‑arg issue
test_29_prompt_caching_messagescorrectly:
- places the cached span in the middle of the user content with
cache_control,- asserts cache creation on the first call and read on the second,
- reuses the same large context to probe actual cache hits.
Same as test 28,
anthropic_clientandtest_configare unused and will keep tripping ARG002; please apply the same fix (drop, underscore, or# noqa: ARG002) consistently across these prompt‑caching tests.
1561-1610: Tool‑level caching test matches provider semantics; address unused fixtures
test_30_prompt_caching_toolscleanly:
- converts
PROMPT_CACHING_TOOLSto Anthropic’s tool format,- adds
cache_controlonly on the last tool, and- verifies that a second call with the same tool set reads from cache.
Again,
anthropic_clientandtest_configare unused here; consider the same ARG002 mitigation as for tests 28–29 so Ruff stays quiet.
🧹 Nitpick comments (6)
tests/integrations/tests/utils/common.py (1)
88-281: Prompt caching test data looks solid; consider minor deduplicationThe new
PROMPT_CACHING_TOOLSset andPROMPT_CACHING_LARGE_CONTEXTgive you a good variety and sufficient token volume for cache behavior tests. There is some duplication with existing tool constants (e.g., weather/calculate) and the huge inline legal text blob, but it’s all test‑only and harmless. If this needs maintenance later, you could:
- Reuse
WEATHER_TOOL/CALCULATOR_TOOLwhen buildingPROMPT_CACHING_TOOLS, and- Optionally move the large context into a shorter base paragraph plus a small loop to repeat/compose it, or into a separate fixture module.
Not required for correctness, just a readability/DRY tweak.
Also applies to: 421-481
tests/integrations/tests/test_bedrock.py (1)
1623-1785: Prompt caching tests cover system/messages/tools checkpoints appropriatelyThe three new tests exercise Bedrock prompt caching semantics across system, message content, and tool definitions, and the
validate_cache_write/validate_cache_readhelpers assert against the correct Bedrockusagefields (cacheWriteInputTokens/cacheReadInputTokens). The parametrization overprompt_cachingproviders ties cleanly back to the config.One small cleanup opportunity: in
test_23_prompt_caching_tools, you manually rebuild the Bedrock tool specs fromPROMPT_CACHING_TOOLS. You could reuseconvert_to_bedrock_toolsto avoid duplicating the"toolSpec"wiring, e.g.:tools_config = convert_to_bedrock_tools(PROMPT_CACHING_TOOLS) bedrock_tools = tools_config["tools"] bedrock_tools.append({"cachePoint": {"type": "default"}}) tool_config = {"tools": bedrock_tools}Purely optional; current implementation is correct as written.
core/providers/bedrock/utils.go (1)
598-667: Tool‑level CachePoint injection: semantics and potential dedupAppending an extra
BedrockToolthat only carriesCachePointwhentool.CacheControl != nil && !schemas.IsNovaModel(model)is a reasonable way to surface per‑tool caching to Bedrock without touchingToolSpec, and the Nova guard matches the intent to avoid unsupported cache hints.Two follow‑ups to consider:
- If multiple tools have
CacheControl, you’ll send several identical{ "cachePoint": { "type": "default" } }entries. That’s likely harmless but could be deduped for cleaner requests.- Cache metadata from
tool.CacheControl(e.g. TTL) isn’t represented here; if Bedrock ever surfaces more granular cache config, this may need extending.Functionally this looks fine; please just double‑check the Bedrock cachePoint contract for Nova/non‑Nova models and whether a single cachePoint entry would suffice for multiple cached tools.
tests/integrations/tests/test_anthropic.py (1)
1421-1485: System prompt‑caching test is sound; drop unused fixtures to satisfy RuffThe flow for
test_28_prompt_caching_systemcorrectly:
- uses a large cached system block with
cache_control: {"type": "ephemeral"},- verifies first call writes cache and second call reads from it via
usage.cache_creation_input_tokens/usage.cache_read_input_tokens,- checks read/write token counts are close.
Static analysis is right that
anthropic_clientandtest_configaren’t used here; you can either:
- remove them from the signature, or
- rename to
_anthropic_client/_test_config, or- add a
# noqa: ARG002comment if you want to keep the fixture signature uniform.core/providers/anthropic/responses.go (1)
2163-2184: Grouped converter: CacheControl preserved, but block type for input text is slightly inconsistentFor non‑output messages in
convertAnthropicContentBlocksToResponsesMessagesGrouped, you now emit a singleResponsesMessagewhoseContentBlocks[0]carries:
Type: ResponsesOutputMessageContentTypeTextText: block.TextCacheControl: block.CacheControlFunctionally this works (downstream converters treat input/output text types similarly), and it correctly propagates CacheControl. For consistency with the non‑grouped path (which uses
ResponsesInputMessageContentBlockTypeTextfor inputs), consider switching to the input text type here as well.core/providers/bedrock/types.go (1)
640-678: ETag requirement on file requests: clarify caller expectationsAdding mandatory
etagfields to:
BedrockFileRetrieveRequest,BedrockFileDeleteRequest,BedrockFileContentRequestis a good way to support conditional operations and stronger consistency against S3. Since the JSON tag omits
omitempty, these will always be serialized; ensure all call sites either:
- always populate
ETag(preferred for concurrency), or- treat an empty string as “no constraint” if that’s acceptable.
Might be worth documenting this requirement alongside the higher‑level file API helpers in this stack.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
core/internal/testutil/test_retry_framework.go(5 hunks)core/providers/anthropic/chat.go(6 hunks)core/providers/anthropic/responses.go(13 hunks)core/providers/anthropic/types.go(2 hunks)core/providers/anthropic/utils.go(1 hunks)core/providers/bedrock/chat.go(1 hunks)core/providers/bedrock/responses.go(8 hunks)core/providers/bedrock/types.go(6 hunks)core/providers/bedrock/utils.go(6 hunks)core/providers/openai/types.go(1 hunks)core/schemas/chatcompletions.go(2 hunks)core/schemas/responses.go(2 hunks)core/schemas/utils.go(1 hunks)tests/integrations/config.yml(7 hunks)tests/integrations/tests/test_anthropic.py(5 hunks)tests/integrations/tests/test_bedrock.py(3 hunks)tests/integrations/tests/utils/common.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- core/providers/anthropic/utils.go
- core/schemas/responses.go
- core/internal/testutil/test_retry_framework.go
- core/providers/anthropic/types.go
- core/providers/bedrock/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openai/types.gotests/integrations/config.ymltests/integrations/tests/utils/common.pycore/providers/bedrock/chat.gotests/integrations/tests/test_bedrock.pycore/providers/bedrock/utils.gocore/schemas/chatcompletions.gotests/integrations/tests/test_anthropic.pycore/schemas/utils.gocore/providers/anthropic/chat.gocore/providers/bedrock/types.gocore/providers/anthropic/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/openai/types.gocore/providers/bedrock/chat.gocore/providers/bedrock/utils.gocore/schemas/chatcompletions.gocore/schemas/utils.gocore/providers/anthropic/chat.gocore/providers/bedrock/types.gocore/providers/anthropic/responses.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
🧬 Code graph analysis (8)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
CacheControl(610-613)core/schemas/responses.go (1)
ResponsesToolMessage(469-489)
core/providers/bedrock/chat.go (1)
core/schemas/chatcompletions.go (4)
ChatContentBlock(592-602)ChatContentBlockTypeText(584-584)CacheControl(610-613)CacheControlTypeEphemeral(607-607)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
get_cross_provider_params_for_scenario(12-47)format_provider_model(50-65)tests/integrations/tests/test_anthropic.py (2)
validate_cache_write(1665-1679)validate_cache_read(1681-1695)
core/providers/bedrock/utils.go (2)
core/providers/bedrock/types.go (5)
BedrockContentBlock(163-190)BedrockCachePoint(199-201)BedrockCachePointTypeDefault(195-195)BedrockMessage(150-153)BedrockTool(284-287)core/schemas/utils.go (1)
IsNovaModel(1043-1045)
core/schemas/chatcompletions.go (2)
core/providers/gemini/types.go (1)
Type(782-782)ui/lib/types/logs.ts (1)
Function(144-149)
tests/integrations/tests/test_anthropic.py (2)
tests/integrations/tests/utils/parametrize.py (2)
get_cross_provider_params_for_scenario(12-47)format_provider_model(50-65)tests/integrations/tests/utils/common.py (1)
assert_valid_chat_response(801-833)
core/providers/anthropic/chat.go (3)
core/schemas/chatcompletions.go (2)
CacheControl(610-613)ChatContentBlockTypeText(584-584)core/providers/gemini/types.go (1)
Type(782-782)core/providers/anthropic/types.go (1)
AnthropicContentBlockTypeText(204-204)
core/providers/anthropic/responses.go (5)
core/schemas/responses.go (6)
ResponsesMessage(318-331)ResponsesInputMessageRoleSystem(338-338)ResponsesMessageContent(343-348)ResponsesMessageContentBlock(403-418)ResponsesOutputMessageContentTypeText(396-396)ResponsesInputMessageContentBlockTypeText(392-392)ui/lib/types/logs.ts (3)
ResponsesMessage(423-438)ResponsesMessageContent(400-400)ResponsesMessageContentBlock(362-397)core/schemas/utils.go (1)
Ptr(16-18)core/schemas/chatcompletions.go (1)
CacheControl(610-613)core/providers/anthropic/types.go (1)
AnthropicContentBlockTypeText(204-204)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py
1624-1624: Unused method argument: test_config
(ARG002)
1674-1674: Unused method argument: test_config
(ARG002)
1725-1725: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_anthropic.py
1422-1422: Unused method argument: anthropic_client
(ARG002)
1422-1422: Unused method argument: test_config
(ARG002)
1488-1488: Unused method argument: anthropic_client
(ARG002)
1488-1488: Unused method argument: test_config
(ARG002)
1562-1562: Unused method argument: anthropic_client
(ARG002)
1562-1562: Unused method argument: test_config
(ARG002)
🔇 Additional comments (21)
core/schemas/utils.go (1)
1042-1045: LGTM!The
IsNovaModelfunction follows the established pattern of other model detection utilities in this file. The implementation is consistent withIsAnthropicModel,IsMistralModel, andIsGeminiModel.tests/integrations/config.yml (1)
109-127: Prompt caching configuration and capability mapping look consistentThe updated Bedrock chat model and the new
prompt_cachingflags per provider line up withscenario_capabilities.prompt_caching: "chat", so cross‑provider parametrization for the new tests will target only providers that actually support prompt caching (Anthropic + Bedrock) while leaving others disabled.Also applies to: 167-167, 205-205, 243-243, 281-281, 319-319, 362-362
core/schemas/chatcompletions.go (1)
246-250: CacheControl schema extensions are well-scoped and backward compatibleAdding
CacheControltoChatToolandChatContentBlock, plus theCacheControlType/CacheControltypes, follows existing schema patterns (pointer fields withomitempty) and should not disturb existing clients while enabling per‑tool and per‑block caching metadata.Also applies to: 591-613
core/providers/anthropic/chat.go (1)
65-67: Anthropic CacheControl propagation is symmetric and consistentForward and reverse conversions now carry
CacheControlon both tools and text content blocks (system, tool results, user/assistant messages), matching the new centralschemas.CacheControltype. This keeps Anthropic ↔ Bifrost mappings lossless for caching metadata without altering existing behavior for messages that don’t set it.Also applies to: 150-153, 184-188, 242-245, 328-331, 514-517
core/providers/bedrock/utils.go (3)
26-32: Pass model intoconvertToolConfigfor Nova gatingWiring
bifrostReq.ModelintoconvertToolConfigis appropriate and keeps model‑specific behavior (e.g. Nova cache restrictions) localized there. No issues seen with this change.
319-348: Per‑block CachePoint on tool results looks correctAttaching
CachePointon tool‑result text and image blocks only whenblock.CacheControl != nilpreserves existing behavior while enabling cache signaling, and the construction via a localtoolResultContentBlockavoids partial initialization issues. This segment looks good.
392-421: Content block CachePoint propagation is consistent
convertContentBlocknow mirrors the tool‑result logic by attaching aBedrockCachePoint{Type: BedrockCachePointTypeDefault}wheneverblock.CacheControlis present for text and image blocks. This is a straightforward, nil‑safe extension and aligns with the newCachePointfield onBedrockContentBlock.tests/integrations/tests/test_anthropic.py (2)
52-86: New prompt‑caching test fixtures and data look well‑integratedImporting
PROMPT_CACHING_TOOLSandPROMPT_CACHING_LARGE_CONTEXTfrom the shared utils keeps these Anthropic tests aligned with the cross‑provider scenarios. The additions fit the existing fixture/config patterns in this file.
1665-1695: Cache usage validators align with Anthropic usage schema
validate_cache_write/validate_cache_read:
- log
input_tokens,cache_creation_input_tokens, andcache_read_input_tokens,- assert presence of the relevant cache field and that it’s
> 0,- ensure write vs read operations don’t both happen in the same request.
Assuming the Anthropic SDK exposes these fields with the expected snake_case names (which matches the Go
AnthropicUsageJSON tags), this is the right assertion surface for prompt‑caching behavior.core/providers/anthropic/responses.go (10)
1340-1347: Unifying message+system conversion throughConvertAnthropicMessagesToBifrostMessagesPassing
request.SystemintoConvertAnthropicMessagesToBifrostMessagescentralizes system‑content handling with the regular message pipeline and lets the new system helper reuse CacheControl‑aware logic. This avoids duplicated system‑conversion code and looks like a solid refactor.
2065-2097: System content conversion now preserves blocks and avoids empty system messages
convertAnthropicSystemToBifrostMessages:
- Handles both simple
ContentStrand structuredContentBlocks.- Emits system messages only when there is non‑empty text (guarded by
len(contentBlocks) > 0).- Preserves per‑block
CacheControlwhen system content comes as blocks.This fixes the earlier empty‑ContentBlocks system‑message edge case and ensures cache directives on system text survive into Bifrost.
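A hedged sketch of the conversion behavior summarized above, using simplified stand-in types instead of the real Anthropic/Bifrost structs (`convertSystem` is a hypothetical name, not the actual `convertAnthropicSystemToBifrostMessages`):

```go
// Illustrative sketch: accept either a plain string or structured blocks,
// keep per-block CacheControl, and emit no system message when there is no text.
package sketch

type cacheControl struct{ Type string }

type systemBlock struct {
	Text         string
	CacheControl *cacheControl
}

type systemInput struct {
	ContentStr    *string
	ContentBlocks []systemBlock
}

type responsesBlock struct {
	Type         string
	Text         string
	CacheControl *cacheControl
}

type responsesMessage struct {
	Role   string
	Blocks []responsesBlock
}

func convertSystem(sys *systemInput) []responsesMessage {
	if sys == nil {
		return nil
	}
	var blocks []responsesBlock
	if sys.ContentStr != nil && *sys.ContentStr != "" {
		blocks = append(blocks, responsesBlock{Type: "input_text", Text: *sys.ContentStr})
	}
	for _, b := range sys.ContentBlocks {
		if b.Text == "" {
			continue
		}
		blocks = append(blocks, responsesBlock{
			Type:         "input_text",
			Text:         b.Text,
			CacheControl: b.CacheControl, // cache directive survives the conversion
		})
	}
	if len(blocks) == 0 {
		return nil // avoid emitting an empty system message
	}
	return []responsesMessage{{Role: "system", Blocks: blocks}}
}
```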
2253-2275: Tool‑result text blocks correctly carry CacheControl into tool outputsWithin tool‑result handling for Anthropic blocks, the mapping:
- chooses
blockTypebased onisOutputMessage, and- copies
contentBlock.CacheControlonto eachResponsesMessageContentBlock.This ensures cache directives on tool outputs are preserved when converting Anthropic tool results into Bifrost
FunctionCallOutputmessages. The same pattern is mirrored later in the non‑grouped converter.
2377-2398: Text content CacheControl propagation for input/output messages is symmetricIn
convertAnthropicContentBlocksToResponsesMessages, both branches now:
- wrap text in a
ContentBlocksslice (rather thanContentStr), and- attach
CacheControlto the text block for both output (output_text) and input (input_text) cases.That gives a single representation for cacheable text across the Responses pipeline and works cleanly with
convertBifrostContentBlocksToAnthropic.
2495-2507: Tool‑result CacheControl preserved in non‑grouped converter as wellThe non‑grouped variant mirrors the grouped one by copying
contentBlock.CacheControlonto each text block inside tool‑result outputs. This keeps CacheControl round‑tripped for tool outputs regardless of which converter path is used.
2560-2568: MCP tool‑result text blocks now also carry CacheControlFor MCP tool results, adding
CacheControl: contentBlock.CacheControlto each text block ensures MCP outputs participate in the same caching semantics as standard tools, which is important for consistent cross‑tool behavior.
3082-3084: Anthropic→Bifrost tool conversion: CacheControl now forwardedThe new block:
if tool.CacheControl != nil { bifrostTool.CacheControl = tool.CacheControl }means Anthropic
cache_controlon tools will be preserved when building BifrostResponsesTools, which is necessary for the new prompt‑caching tests that set tool‑level cache control.
3262-3264: Bifrost→Anthropic tool conversion: CacheControl round‑trip completedMirroring the previous change, propagating
tool.CacheControlback toAnthropicTool.CacheControlcloses the loop so cache metadata round‑trips across Anthropic↔Bifrost conversions.
3325-3347: Content block conversion to Anthropic now includes CacheControl (text & image)
convertContentBlockToAnthropicnow:
- copies
block.CacheControlinto text blocks, and- passes it through when wrapping image blocks via
ConvertToAnthropicImageBlock.Together with
convertBifrostContentBlocksToAnthropic, this ensures CacheControl on Bifrost content blocks is preserved in Anthropic request/response payloads.
3376-3383: Image block back‑conversion preserves CacheControl
toBifrostResponsesImageBlocknow setsCacheControl: block.CacheControlon the returnedResponsesMessageContentBlock, so image‑level cache hints from Anthropic are preserved when mapping back into Responses messages.core/providers/bedrock/types.go (2)
155-201: Bedrock CachePoint types and wiring look coherentAdding:
CachePoint *BedrockCachePointtoBedrockSystemMessageandBedrockContentBlock,BedrockCachePointType+BedrockCachePointTypeDefault, andBedrockCachePoint{Type BedrockCachePointType},matches how
convertContentBlock,convertToolMessage, andconvertToolConfignow populate cache hints. The JSON tagcachePointis consistent across all three structs and aligns with Bedrock’s Converse schema.
283-287: Tool‑level CachePoint field aligns with new tool config behaviorExtending
BedrockToolto:ToolSpec *BedrockToolSpec `json:"toolSpec,omitempty"` CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"`matches the new
convertToolConfigbehavior that appends bare cachePoint tools. Struct layout and tags look correct; no issues here.
6fe2adf to
b4bc2f2
Compare
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/anthropic/chat.go (1)
274-281: CacheControl is lost when collapsing single text block toContentStr.When there's exactly one text content block, this optimization collapses it into
ContentStr:if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText { anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text} }This discards any
CacheControlattached to that block. Consider preserving the block-based representation whenCacheControlis present:- if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText { + if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText && content[0].CacheControl == nil { anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text} } else if len(content) > 0 {
🧹 Nitpick comments (3)
tests/integrations/tests/test_anthropic.py (3)
1481-1482: Consider making the token proximity threshold configurable.The hardcoded 100-token threshold for validating cache token proximity might be fragile across different providers and models. Consider making this threshold configurable per provider or using a percentage-based approach.
Example:
- assert abs(cache_write_tokens - cache_read_tokens) < 100, \ + # Allow 5% variance or 100 tokens, whichever is larger + tolerance = max(100, int(cache_write_tokens * 0.05)) + assert abs(cache_write_tokens - cache_read_tokens) < tolerance, \ - f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})" + f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens}) within {tolerance} tokens"
1487-1560: Consider adding proximity validation for consistency.Test 28 includes a proximity check to validate that cache read tokens are close to cache write tokens (line 1481-1482), but test 29 omits this validation. Consider adding it for consistency:
assert_valid_chat_response(response2) cache_read_tokens = validate_cache_read(response2.usage, "Second request") + # Validate that cache read tokens are approximately equal to cache creation tokens + tolerance = max(100, int(cache_write_tokens * 0.05)) + assert abs(cache_write_tokens - cache_read_tokens) < tolerance, \ + f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})" + print(f"✓ Messages caching validated - Cache created: {cache_write_tokens} tokens, " f"Cache read: {cache_read_tokens} tokens")
1561-1611: Consider adding proximity validation for consistency.Similar to test 29, test 30 omits the proximity check that test 28 includes. For consistency across all three caching tests, consider adding the validation here as well.
Additionally, the test doesn't validate that the responses are appropriate (e.g., that response1 attempts to use the weather tool). While not strictly required for a caching test, it might strengthen the test:
# After line 1591, optionally add: # Verify first response uses appropriate tool (weather) if hasattr(response1, "content") and len(response1.content) > 0: # Check if model attempted to use tools has_tool_use = any( hasattr(block, "type") and block.type == "tool_use" for block in response1.content ) # Tool use is expected but not required for caching test if has_tool_use: print("✓ Model used tools as expected")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
core/internal/testutil/test_retry_framework.go(5 hunks)core/providers/anthropic/chat.go(6 hunks)core/providers/anthropic/responses.go(13 hunks)core/providers/anthropic/types.go(2 hunks)core/providers/anthropic/utils.go(1 hunks)core/providers/bedrock/chat.go(1 hunks)core/providers/bedrock/responses.go(8 hunks)core/providers/bedrock/types.go(6 hunks)core/providers/bedrock/utils.go(6 hunks)core/providers/openai/responses_marshal_test.go(0 hunks)core/providers/openai/types.go(1 hunks)core/schemas/chatcompletions.go(2 hunks)core/schemas/responses.go(2 hunks)core/schemas/utils.go(1 hunks)tests/integrations/config.yml(7 hunks)tests/integrations/tests/test_anthropic.py(5 hunks)tests/integrations/tests/test_bedrock.py(3 hunks)tests/integrations/tests/utils/common.py(2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- core/schemas/responses.go
- tests/integrations/config.yml
- core/schemas/chatcompletions.go
- core/providers/anthropic/types.go
- core/providers/bedrock/types.go
- tests/integrations/tests/utils/common.py
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/anthropic/chat.gocore/providers/openai/types.gocore/providers/anthropic/responses.gocore/providers/anthropic/utils.gocore/providers/bedrock/utils.gocore/providers/bedrock/chat.gocore/internal/testutil/test_retry_framework.gotests/integrations/tests/test_anthropic.pytests/integrations/tests/test_bedrock.pycore/schemas/utils.gocore/providers/bedrock/responses.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/anthropic/chat.gocore/providers/openai/types.gocore/providers/anthropic/responses.gocore/providers/anthropic/utils.gocore/providers/bedrock/utils.gocore/providers/bedrock/chat.gocore/internal/testutil/test_retry_framework.gocore/schemas/utils.gocore/providers/bedrock/responses.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
🧬 Code graph analysis (5)
core/providers/anthropic/chat.go (2)
core/schemas/chatcompletions.go (7)
CacheControl(610-613)ChatContentBlock(592-602)ChatContentBlockTypeText(584-584)ChatAssistantMessageToolCallFunction(711-714)ChatAssistantMessageToolCall(702-708)ChatReasoningDetails(739-746)BifrostReasoningDetailsTypeText(735-735)core/providers/anthropic/types.go (3)
AnthropicContentBlockTypeText(204-204)AnthropicContentBlockTypeToolUse(206-206)AnthropicContentBlockTypeThinking(212-212)
core/providers/openai/types.go (2)
core/schemas/responses.go (3)
ResponsesMessage(318-331)ResponsesMessageContentBlock(403-418)ResponsesToolMessage(469-489)core/schemas/chatcompletions.go (1)
CacheControl(610-613)
core/providers/bedrock/chat.go (3)
core/schemas/bifrost.go (1)
BifrostContextKeyStructuredOutputToolName(135-135)core/schemas/chatcompletions.go (8)
ChatContentBlock(592-602)ChatContentBlockTypeText(584-584)CacheControl(610-613)CacheControlTypeEphemeral(607-607)ChatAssistantMessageToolCall(702-708)ChatAssistantMessageToolCallFunction(711-714)ChatReasoningDetails(739-746)BifrostReasoningDetailsTypeText(735-735)core/schemas/utils.go (1)
Ptr(16-18)
tests/integrations/tests/test_bedrock.py (2)
tests/integrations/tests/utils/parametrize.py (2)
get_cross_provider_params_for_scenario(12-47)format_provider_model(50-65)tests/integrations/tests/test_anthropic.py (2)
validate_cache_write(1665-1679)validate_cache_read(1681-1695)
core/providers/bedrock/responses.go (4)
core/schemas/utils.go (1)
IsNovaModel(1043-1045)core/schemas/chatcompletions.go (2)
CacheControl(610-613)CacheControlTypeEphemeral(607-607)core/providers/bedrock/types.go (5)
BedrockTool(284-287)BedrockCachePoint(199-201)BedrockCachePointTypeDefault(195-195)BedrockSystemMessage(156-160)BedrockContentBlock(163-190)core/schemas/responses.go (3)
ResponsesMessage(318-331)ResponsesMessageContent(343-348)ResponsesMessageContentBlock(403-418)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_anthropic.py
1422-1422: Unused method argument: anthropic_client
(ARG002)
1422-1422: Unused method argument: test_config
(ARG002)
1488-1488: Unused method argument: anthropic_client
(ARG002)
1488-1488: Unused method argument: test_config
(ARG002)
1562-1562: Unused method argument: anthropic_client
(ARG002)
1562-1562: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_bedrock.py
1624-1624: Unused method argument: test_config
(ARG002)
1674-1674: Unused method argument: test_config
(ARG002)
1725-1725: Unused method argument: test_config
(ARG002)
🔇 Additional comments (40)
tests/integrations/tests/test_anthropic.py (5)
66-66: LGTM: New imports for prompt caching tests.The imported constants
PROMPT_CACHING_TOOLSandPROMPT_CACHING_LARGE_CONTEXTare appropriately used in the new caching test cases.Also applies to: 85-85
39-41: LGTM: Documentation updated for new test cases.The docstring additions clearly describe the three new prompt caching test scenarios.
1421-1486: LGTM: System message caching test is well-structured.The test correctly validates cache creation and reuse for system messages across providers. Using
get_provider_anthropic_client(provider)is appropriate for cross-provider testing with the x-model-provider header.Note: The static analysis warnings about unused
anthropic_clientandtest_configare false positives—these are pytest fixtures that must be declared for the test framework.
1665-1679: LGTM: Cache write validation helper is well-implemented.The function appropriately validates cache creation operations with clear assertions and safe attribute access using
getattr. The attribute-based access pattern (e.g.,usage.input_tokens) is correct for Anthropic SDK response objects.
1681-1695: LGTM: Cache read validation helper is well-implemented.The function correctly validates cache read operations with appropriate assertions. It's consistent with
validate_cache_writeand provides clear diagnostic output.core/internal/testutil/test_retry_framework.go (6)
832-834: LGTM - Consistent attempt semantics!The updated comment and logging correctly clarify that
attemptrepresents the current failed attempt number, and the log displaysattempt+1to show the next attempt that will run. This addresses the inconsistency flagged in the previous review and aligns with other retry functions throughout the codebase.
2153-2158: Helpful logging enhancement for retry tracking.The attempt start logging distinguishes between the initial attempt and retries, making it easier to follow test execution flow in logs. This is particularly useful when debugging cache control tests that rely on the Responses API streaming.
2205-2212: Correct attempt parameter passing.The code correctly passes the current failed attempt number to the OnRetry callback (line 2211), which is consistent with the fix from the previous review. The conditional error logging (lines 2206-2208) avoids duplicate messages by only logging for retry attempts (attempt > 1).
2218-2218: Good UX improvement with explicit wait indicators.Logging the wait time before retries (with the ⏳ indicator) helps distinguish between active retry attempts and backoff delays, making test logs easier to interpret.
2244-2254: Robust nil channel handling with clear intent.The enhanced logging for the nil channel case provides better visibility into retry decisions. The critical comment at line 2254 correctly emphasizes that the code must
continueto retry rather than returning immediately, ensuring retry resilience even when the channel is nil on intermediate attempts.
2317-2317: Consistent attempt parameter semantics.The comment correctly documents that the current failed attempt number is passed to the OnRetry callback, maintaining consistency with the fixes at lines 2211 and 2246.
core/schemas/utils.go (1)
1042-1045: LGTM!The
IsNovaModelfunction follows the established pattern used byIsAnthropicModel,IsMistralModel, andIsGeminiModel. The case-sensitive check for "nova" is appropriate given typical model naming conventions.core/providers/anthropic/utils.go (1)
96-101: LGTM!Good improvements:
- Using the typed constant
AnthropicContentBlockTypeImageinstead of a string literal improves type safety.- Propagating
CacheControlfrom the input block ensures cache metadata is preserved during image block conversion.core/providers/anthropic/chat.go (6)
65-68: LGTM!Properly propagates
CacheControlfrom Bifrost tools to Anthropic tools, enabling per-tool cache control during request conversion.
149-154: LGTM!System message content blocks now correctly propagate
CacheControlmetadata, enabling cache control at the system prompt level.
184-191: LGTM!Tool result content blocks correctly propagate
CacheControl, and the image block conversion viaConvertToAnthropicImageBlock(which also handlesCacheControl) is appropriate.
241-249: LGTM!User and assistant message content blocks correctly propagate
CacheControlmetadata while maintaining image block handling via the dedicated converter.
320-366: LGTM!The response conversion correctly:
- Propagates
CacheControlfrom Anthropic text blocks to Bifrost content blocks- Handles tool use blocks with proper JSON marshaling and fallback
- Accumulates thinking/reasoning details with correct type assignment
501-511: LGTM!The bidirectional conversion correctly propagates
CacheControlfrom Bifrost content blocks back to Anthropic format, ensuring cache metadata is preserved in both directions.core/providers/bedrock/utils.go (4)
30-32: LGTM!The function call now correctly passes the model to
convertToolConfig, enabling model-specific cache control behavior.
319-351: LGTM!Tool result content blocks (both text and image) correctly propagate cache signaling by adding
CachePointwhen the source block hasCacheControlset.
390-421: LGTM!The
convertContentBlockfunction correctly addsCachePointto both text and image blocks whenCacheControlis present on the source, maintaining consistency with the tool result handling.
599-651: The CachePoint format and Nova model exclusion are correct.The Bedrock Converse API tool structure is a Tagged Union where only one of toolSpec, systemTool, or cachePoint can be set, confirming that adding CachePoint as a separate BedrockTool entry is the correct API format.
Nova models do not support tool caching, and attempting to use tool caching strategies with Nova throws an exception, so the
!schemas.IsNovaModel(model)check correctly prevents CachePoint injection for Nova models.core/providers/bedrock/chat.go (2)
63-78: LGTM!The structured output handling correctly:
- Checks for the structured output tool by name from context
- Marshals tool input to JSON string as the content
- Provides fallback to string conversion on marshal failure
80-140: LGTM with minor edge case consideration.The content block handling correctly:
- Converts text blocks to Bifrost format
- Attaches
CacheControlto the last content block when encountering aCachePointblock- Handles tool use with proper JSON marshaling and variable capture to avoid loop variable issues (lines 112-113)
- Accumulates reasoning content with correct signature handling
One edge case to consider: if a
CachePointblock appears before any content blocks,len(contentBlocks) > 0check (line 90) safely prevents a panic, but the cache signal would be silently dropped. This is likely acceptable given Bedrock's expected ordering.core/providers/anthropic/responses.go (4)
2065-2097: LGTM! System message conversion correctly preserves CacheControl.The implementation properly handles both
ContentStrandContentBlockscases while preservingCacheControlon each text block. The guard at line 2086 correctly prevents appending system messages with emptyContentBlocksarrays, addressing the concern from the past review.
2174-2180: Excellent cache control propagation across content blocks.The implementation consistently preserves
CacheControlacross all content block types (text, tool results) in both input and output message conversions. This ensures cache metadata is maintained throughout the conversion pipeline.Also applies to: 2263-2266, 2377-2379, 2390-2396, 2495-2497, 2559-2562
3082-3084: Tool conversion correctly preserves CacheControl in both directions.The bidirectional tool conversions properly preserve
CacheControl, enabling cache metadata to flow through tool definitions. The nil checks ensure safe assignment.Also applies to: 3262-3264
3330-3332: Image and helper conversions complete the CacheControl propagation.The remaining conversions (image blocks and helper functions) correctly preserve
CacheControl, completing comprehensive cache metadata support across all content block types.Also applies to: 3343-3343, 3382-3382
tests/integrations/tests/test_bedrock.py (3)
1623-1671: System message caching test correctly structured.The test properly validates prompt caching for system messages by:
- Placing a
cachePointafter system content blocks- Making two requests with identical system content but different user messages
- Validating cache write on the first request and cache read on the second
The Bedrock
cachePointformat with{"type": "default"}is correctly used.
1673-1722: Messages caching test correctly implements mid-message cache points.The test validates prompt caching within messages by placing a
cachePointbetween the large context and the specific question. This allows the large context to be cached while enabling different questions, which is the intended use case for efficient prompt caching.
1724-1785: Tools caching test and helper functions correctly implemented.The tools caching test properly validates caching of tool definitions by:
- Using 12 tools from
PROMPT_CACHING_TOOLSto create a substantial cache- Placing the
cachePointafter all tool definitions- Making two requests with the same tools but different user messages
The helper functions
validate_cache_writeandvalidate_cache_readcorrectly use Bedrock's camelCase field names (cacheWriteInputTokens,cacheReadInputTokens) rather than Anthropic's snake_case format, ensuring proper validation of cache metrics.Also applies to: 1787-1813
core/providers/bedrock/responses.go (6)
1572-1578: Cache point conversion logic mirrors the inverse operation.The conversion from Bifrost CacheControl to Bedrock CachePoint correctly appends a separate tool entry, which mirrors the inverse operation in lines 1283-1290.
1707-1712: Cache token mapping is correct.The mapping of cached token counts from Bifrost's InputTokensDetails/OutputTokensDetails to Bedrock's CacheReadInputTokens/CacheWriteInputTokens is accurate.
2382-2384: System message conversion refactored to handle slices.The updated call correctly passes the entire systemMessages slice and only appends results if non-empty. Clean refactor.
2413-2419: Cache point appending is consistent with tool conversion pattern.The logic correctly appends a separate BedrockSystemMessage with CachePoint when a content block has CacheControl, matching the pattern used in tool conversion.
2775-2781: Cache point appending after content blocks is correct.The conversion from Bifrost CacheControl to Bedrock CachePoint correctly appends a separate content block marker after processing each block with cache control, maintaining consistency with Bedrock's cache representation.
1283-1290: Add code comment explaining Nova tool cache limitation.The Nova model exclusion at line 1283 is correct—Amazon Nova models do not support tool caching. Consider adding an inline comment documenting this limitation for future maintainers.
core/providers/openai/types.go (2)
178-211: Data mutation issue successfully resolved.The implementation correctly addresses the past review concern by creating deep copies of the message structures before stripping CacheControl. The approach ensures the original input data remains unchanged:
- Creates a new messagesCopy slice
- Deep copies Content and ContentBlocks when present
- Deep copies ResponsesToolMessage and Output when present
- Marshals the copied data instead of the original
The nil checks are properly placed, preventing any potential panics when accessing nested fields.
196-209: Code correctly handles CacheControl stripping in ResponsesMessage.The implementation properly strips CacheControl from all message content fields:
- Content blocks (line 189): All ResponsesMessageContentBlock instances in Content.ContentBlocks
- Tool output blocks (line 203): All ResponsesMessageContentBlock instances in ResponsesToolMessage.Output.ResponsesFunctionToolCallOutputBlocks
ResponsesReasoning and other embedded tool structs (FileSearch, Computer, CodeInterpreter, MCP, Custom, ImageGeneration) do not contain CacheControl fields and require no modification.
b4bc2f2 to
217e2c0
Compare
Actionable comments posted: 4
🧹 Nitpick comments (3)
core/internal/testutil/prompt_caching.go (1)
245-268: Prompt‑caching harness correctly usesCacheControl, with a couple of small robustness nits
- Marking the last tool in
GetPromptCachingToolsasCacheControl: Ephemeraland wrapping the long system prefix in aChatContentBlockwithCacheControl: Ephemeralmatches the new schema and Anthropic/Bedrock mappings, so the Go harness will actually exercise provider‑level caching.- Gating
RunPromptCachingTeston bothSimpleChatandPromptCachingavoids hitting providers that don’t expose caching metrics, which should reduce flaky skips.Two small follow‑ups you may want to consider:
- The “last tool is cached” convention is index‑based; if the tools slice is reordered later the cached tool changes silently. Keying the cache‑controlled tool by name (e.g., always making
check_compliance_statusephemeral) would be more robust.- In
retryContext.TestMetadata, you still logmodel: testConfig.ChatModelwhile the request usesPromptCachingModel; switching the metadata to the actual model being called would make retries/debugging clearer.Also applies to: 276-304, 369-379
tests/integrations/tests/test_bedrock.py (1)
37-41: Prompt caching integration tests for Bedrock are well structuredThe new tests and helpers correctly exercise Bedrock’s cache metrics:
test_21_prompt_caching_system,test_22_prompt_caching_messages, andtest_23_prompt_caching_toolseach validate a distinct checkpoint type (system, messages, tools) with a clear two-call pattern (write then read).validate_cache_write/validate_cache_readuse the Bedrock field names (inputTokens,cacheWriteInputTokens,cacheReadInputTokens) and assert the right invariants (write>0/read==0 and vice versa).- Tool checkpoint test builds a 12-tool
toolConfigmatching how Bedrock expectscachePointafter tool specs.If you find yourself adding similar helpers for other providers again, you could consider moving the cache read/write validators into a shared test util, but that’s optional.
Also applies to: 51-61, 1623-1813
core/providers/anthropic/responses.go (1)
1344-1388: Centralized system-message handling and Bedrock grouping look correctThe refactor to:
- Pass
request.SystemintoConvertAnthropicMessagesToBifrostMessages, and- Introduce
convertAnthropicSystemToBifrostMessages,nicely centralizes system-message handling for both request and response paths while preserving any
CacheControlon system blocks.Using the
keepToolsGroupedflag (provider == schemas.Bedrock) to switch between grouped and ungrouped conversions is a sensible way to keep Bedrock’s mixed text+tool_use turns intact without affecting other providers.One minor stylistic nit: in
convertAnthropicSystemToBifrostMessagesyou tag system text content asResponsesOutputMessageContentTypeText; using theinput_textvariant might better reflect that these are request-side messages, but functionally it should be equivalent given how other code treats text blocks.Also applies to: 1628-1649, 2065-2097
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
core/internal/testutil/account.go(3 hunks)core/internal/testutil/prompt_caching.go(3 hunks)core/internal/testutil/test_retry_framework.go(5 hunks)core/providers/anthropic/anthropic_test.go(2 hunks)core/providers/anthropic/chat.go(6 hunks)core/providers/anthropic/responses.go(13 hunks)core/providers/anthropic/types.go(2 hunks)core/providers/anthropic/utils.go(1 hunks)core/providers/bedrock/bedrock_test.go(2 hunks)core/providers/bedrock/chat.go(1 hunks)core/providers/bedrock/responses.go(8 hunks)core/providers/bedrock/types.go(6 hunks)core/providers/bedrock/utils.go(6 hunks)core/providers/openai/responses_marshal_test.go(0 hunks)core/providers/openai/types.go(1 hunks)core/schemas/chatcompletions.go(2 hunks)core/schemas/responses.go(2 hunks)core/schemas/utils.go(1 hunks)tests/integrations/config.yml(7 hunks)tests/integrations/tests/test_anthropic.py(5 hunks)tests/integrations/tests/test_bedrock.py(3 hunks)tests/integrations/tests/utils/common.py(2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- core/schemas/responses.go
- core/providers/openai/types.go
- core/providers/anthropic/utils.go
- tests/integrations/tests/utils/common.py
- core/providers/anthropic/types.go
- core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/internal/testutil/account.gocore/schemas/chatcompletions.gotests/integrations/config.ymlcore/providers/bedrock/bedrock_test.gocore/providers/anthropic/anthropic_test.gocore/providers/bedrock/utils.gocore/providers/bedrock/responses.gocore/providers/anthropic/responses.gocore/providers/bedrock/chat.gotests/integrations/tests/test_bedrock.pycore/providers/bedrock/types.gocore/internal/testutil/test_retry_framework.gocore/providers/anthropic/chat.gotests/integrations/tests/test_anthropic.pycore/internal/testutil/prompt_caching.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/internal/testutil/account.go, core/schemas/chatcompletions.go, core/providers/bedrock/bedrock_test.go, core/providers/anthropic/anthropic_test.go, core/providers/bedrock/utils.go, core/providers/bedrock/responses.go, core/providers/anthropic/responses.go, core/providers/bedrock/chat.go, core/providers/bedrock/types.go, core/internal/testutil/test_retry_framework.go, core/providers/anthropic/chat.go, core/internal/testutil/prompt_caching.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/bedrock/bedrock_test.go, core/providers/anthropic/anthropic_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
🧬 Code graph analysis (8)
- core/providers/bedrock/utils.go (3)
  - core/schemas/models.go (1): Model (109-129)
  - core/providers/bedrock/types.go (6): BedrockContentBlock (163-190), BedrockCachePoint (199-201), BedrockCachePointTypeDefault (195-195), BedrockMessage (150-153), BedrockToolConfig (278-281), BedrockTool (284-287)
  - core/schemas/utils.go (1): IsNovaModel (1043-1045)
- core/providers/bedrock/responses.go (4)
  - core/schemas/utils.go (1): IsNovaModel (1043-1045)
  - core/schemas/chatcompletions.go (2): CacheControl (610-613), CacheControlTypeEphemeral (607-607)
  - core/providers/bedrock/types.go (4): BedrockTool (284-287), BedrockCachePoint (199-201), BedrockCachePointTypeDefault (195-195), BedrockContentBlock (163-190)
  - core/schemas/responses.go (4): ResponsesMessage (318-331), ResponsesInputMessageRoleSystem (338-338), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418)
- core/providers/anthropic/responses.go (7)
  - core/schemas/bifrost.go (1): Bedrock (38-38)
  - core/schemas/responses.go (6): ResponsesMessage (318-331), ResponsesInputMessageRoleSystem (338-338), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418), ResponsesOutputMessageContentTypeText (396-396), ResponsesInputMessageContentBlockTypeText (392-392)
  - ui/lib/types/logs.ts (3): ResponsesMessage (423-438), ResponsesMessageContent (400-400), ResponsesMessageContentBlock (362-397)
  - core/providers/gemini/types.go (3): Role (17-17), Content (977-985), Type (782-782)
  - core/schemas/utils.go (1): Ptr (16-18)
  - core/schemas/chatcompletions.go (1): CacheControl (610-613)
  - core/providers/anthropic/types.go (1): AnthropicContentBlockTypeText (204-204)
- core/providers/bedrock/chat.go (3)
  - core/schemas/bifrost.go (1): BifrostContextKeyStructuredOutputToolName (135-135)
  - core/schemas/chatcompletions.go (6): ChatContentBlock (592-602), ChatContentBlockTypeText (584-584), CacheControl (610-613), CacheControlTypeEphemeral (607-607), ChatAssistantMessageToolCall (702-708), ChatAssistantMessageToolCallFunction (711-714)
  - core/schemas/utils.go (1): Ptr (16-18)
- tests/integrations/tests/test_bedrock.py (2)
  - tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
  - tests/integrations/tests/test_anthropic.py (2): validate_cache_write (1665-1679), validate_cache_read (1681-1695)
- core/providers/anthropic/chat.go (2)
  - core/schemas/chatcompletions.go (5): CacheControl (610-613), ChatContentBlock (592-602), ChatContentBlockTypeText (584-584), ChatAssistantMessageToolCallFunction (711-714), ChatAssistantMessageToolCall (702-708)
  - core/providers/anthropic/types.go (3): AnthropicContentBlockTypeText (204-204), AnthropicContentBlockTypeToolUse (206-206), AnthropicContentBlockTypeThinking (212-212)
- tests/integrations/tests/test_anthropic.py (2)
  - tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
  - tests/integrations/tests/utils/common.py (1): assert_valid_chat_response (801-833)
- core/internal/testutil/prompt_caching.go (2)
  - core/schemas/chatcompletions.go (4): CacheControl (610-613), CacheControlTypeEphemeral (607-607), ChatContentBlock (592-602), ChatContentBlockTypeText (584-584)
  - core/schemas/utils.go (1): Ptr (16-18)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_bedrock.py
1624-1624: Unused method argument: test_config
(ARG002)
1674-1674: Unused method argument: test_config
(ARG002)
1725-1725: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_anthropic.py
1422-1422: Unused method argument: anthropic_client
(ARG002)
1422-1422: Unused method argument: test_config
(ARG002)
1488-1488: Unused method argument: anthropic_client
(ARG002)
1488-1488: Unused method argument: test_config
(ARG002)
1562-1562: Unused method argument: anthropic_client
(ARG002)
1562-1562: Unused method argument: test_config
(ARG002)
🔇 Additional comments (24)
core/internal/testutil/test_retry_framework.go (3)
2153-2158: LGTM! Improved attempt logging.
The addition of per-attempt start logs enhances test observability by clearly distinguishing between initial attempts and retries.
2205-2220: LGTM! Consistent attempt semantics and enhanced logging.
The changes correctly implement consistent attempt semantics (passing the current failed attempt number to OnRetry) and add helpful waiting logs before retries, addressing the previous review feedback.
2241-2254: LGTM! Critical nil-channel retry fix.
The code correctly continues the retry loop when the response channel is nil instead of returning early. The enhanced logging and consistent attempt semantics further improve debuggability.
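As a rough, hedged illustration of the retry shape these comments describe (function and parameter names here are placeholders, not the framework's real API):

```go
package testutil

import (
	"errors"
	"testing"
)

// runWithRetries sketches the behaviour discussed above: log each attempt up
// front, and keep looping when the stream channel comes back nil instead of
// returning early. openStream is a stand-in for the real stream setup.
func runWithRetries(t *testing.T, maxAttempts int, openStream func() (<-chan string, error)) error {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		t.Logf("▶️ Streaming attempt %d/%d", attempt, maxAttempts)

		ch, err := openStream()
		if err != nil {
			lastErr = err
			t.Logf("❌ Attempt %d failed: %v", attempt, err)
			continue
		}
		if ch == nil {
			// CRITICAL: must continue to retry, not return.
			lastErr = errors.New("nil response channel")
			t.Logf("⚠️ Attempt %d returned a nil channel, retrying", attempt)
			continue
		}
		for range ch {
			// drain and validate chunks here
		}
		return nil
	}
	return lastErr
}
```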
core/schemas/chatcompletions.go (2)
245-250: Add CacheControl to ChatTool looks correct and backward‑compatible
Pointer field with omitempty cleanly extends the schema and will be ignored by existing callers and providers that don’t use cache control.
591-613: ChatContentBlock + CacheControl type shape is sound
Carrying CacheControl on content blocks via a typed CacheControlType and TTL string is a minimal, provider‑agnostic extension and integrates cleanly with existing (de)serialization.
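For orientation, a minimal sketch of the shape described here; the JSON tags, the ChatContentBlockType alias, and the trimmed-down field lists are assumptions for illustration, not the exact definitions in core/schemas/chatcompletions.go:

```go
package schemas

// CacheControlType and CacheControl mirror the provider-agnostic cache hint
// described above; only the fields mentioned in the review are sketched.
type CacheControlType string

const CacheControlTypeEphemeral CacheControlType = "ephemeral"

type CacheControl struct {
	Type CacheControlType `json:"type"`
	TTL  string           `json:"ttl,omitempty"` // honored only by providers that support TTLs
}

// ChatContentBlockType is assumed here; the real schema defines its own constants.
type ChatContentBlockType string

type ChatContentBlock struct {
	Type         ChatContentBlockType `json:"type"`
	Text         *string              `json:"text,omitempty"`
	CacheControl *CacheControl        `json:"cache_control,omitempty"`
}

type ChatTool struct {
	Type         string        `json:"type"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}
```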
tests/integrations/config.yml (2)
109-114: Bedrock chat model override aligns with Claude 4.5 Bedrock deployment
Pointing providers.bedrock.chat at us.anthropic.claude-sonnet-4-5-20250929-v1:0 matches the Bedrock deployment mapping used in Go test accounts and makes this model the default for Bedrock chat and prompt‑caching scenarios.
167-167: Prompt‑caching scenario wiring is consistent across providers and capabilities
Marking prompt_caching per‑provider in provider_scenarios and mapping it to "chat" in scenario_capabilities cleanly scopes the new cross‑provider tests so only Anthropic and Bedrock participate while others are explicitly disabled.
Also applies to: 205-205, 243-243, 281-281, 319-319, 342-363
core/providers/bedrock/bedrock_test.go (1)
45-56: Bedrock comprehensive test config correctly enables prompt caching and embeddings
Populating EmbeddingModel (cohere.embed-v4:0), ReasoningModel (claude-4.5-sonnet), and PromptCachingModel alongside PromptCaching: true in Scenarios matches the Bedrock key configuration and ensures the shared prompt‑caching harness runs only where the models and metrics are available. Based on learnings, this keeps Bedrock tests at the structural/config layer without adding brittle content checks.
Also applies to: 70-74
core/internal/testutil/account.go (2)
22-55: PromptCaching flag is a natural extension to TestScenarios
Adding a dedicated PromptCaching boolean keeps scenario toggles cohesive and gives the Go harness an explicit switch instead of overloading existing chat/tool flags.
590-619: Anthropic and Bedrock correctly opted into prompt‑caching scenarios
Enabling PromptCaching: true only for Anthropic and Bedrock in AllProviderConfigs matches the integration YAML and the underlying provider support, so the shared prompt‑caching test helper will automatically skip unsupported providers.
Also applies to: 624-661
core/providers/anthropic/anthropic_test.go (1)
25-35: Anthropic config now fully wired for vision, reasoning, and prompt caching
Setting VisionModel, ReasoningModel, and PromptCachingModel together with PromptCaching: true in Scenarios gives the comprehensive test runner enough information to exercise the new cache‑control flows without impacting unrelated scenarios.
Also applies to: 35-52
core/providers/anthropic/chat.go (1)
42-72: Cache‑control propagation for Anthropic text/tools is symmetrical and non‑breaking
Forwarding CacheControl from ChatTool and text ChatContentBlocks into AnthropicTool/AnthropicContentBlock, and then mapping it back in both ToBifrostChatResponse and ToAnthropicChatResponse, gives you a clean round‑trip for cache metadata without changing existing semantics for callers that don’t set it. Reasoning/thinking and tool‑use handling remain structurally identical, so prior behavior is preserved.
Also applies to: 139-162, 167-210, 218-250, 318-367, 495-511
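A compressed sketch of that round‑trip, using simplified stand‑in structs rather than the real AnthropicContentBlock/ChatContentBlock definitions:

```go
package main

import "fmt"

type CacheControl struct{ Type string }

type chatBlock struct {
	Text         string
	CacheControl *CacheControl
}

type anthropicBlock struct {
	Type         string
	Text         string
	CacheControl *CacheControl
}

// toAnthropic forwards cache_control from Bifrost-style blocks onto Anthropic blocks.
func toAnthropic(blocks []chatBlock) []anthropicBlock {
	out := make([]anthropicBlock, 0, len(blocks))
	for _, b := range blocks {
		out = append(out, anthropicBlock{Type: "text", Text: b.Text, CacheControl: b.CacheControl})
	}
	return out
}

// toBifrost maps it back, so a round trip keeps the metadata intact.
func toBifrost(blocks []anthropicBlock) []chatBlock {
	out := make([]chatBlock, 0, len(blocks))
	for _, b := range blocks {
		out = append(out, chatBlock{Text: b.Text, CacheControl: b.CacheControl})
	}
	return out
}

func main() {
	in := []chatBlock{{Text: "large cached prompt", CacheControl: &CacheControl{Type: "ephemeral"}}}
	fmt.Println(toBifrost(toAnthropic(in))[0].CacheControl.Type) // ephemeral
}
```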
tests/integrations/tests/test_anthropic.py (1)
52-60: New prompt‑caching constants are correctly imported for Anthropic tests
Importing PROMPT_CACHING_TOOLS and PROMPT_CACHING_LARGE_CONTEXT from the shared utils keeps the Anthropic prompt‑caching tests consistent with the Bedrock and cross‑provider helpers.
Also applies to: 82-86
core/providers/bedrock/utils.go (1)
30-32: CachePoint propagation across tools and content blocks looks sound
The changes cleanly propagate CacheControl into Bedrock-specific CachePoints:
- convertChatParameters now passes model into convertToolConfig, enabling the Nova guard without introducing nil risks (params already checked).
- convertToolMessage and convertContentBlock attach a BedrockCachePoint{Type: BedrockCachePointTypeDefault} only when the source ChatContentBlock.CacheControl is non-nil, for both text and image blocks.
- convertToolConfig’s new model parameter and the !schemas.IsNovaModel(model) check correctly avoid emitting cache points for Nova models, while still allowing per-tool cachepoints elsewhere.
No functional issues spotted in this wiring; a minimal sketch of the shape follows the note below.
Also applies to: 320-348, 390-420, 598-667
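A hedged sketch of that wiring with simplified stand‑ins (isNovaModel here is a toy substring check, not the real schemas.IsNovaModel):

```go
package main

import (
	"fmt"
	"strings"
)

type CacheControl struct{ Type string }

type bifrostBlock struct {
	Text         string
	CacheControl *CacheControl
}

type bedrockCachePoint struct{ Type string }

type bedrockBlock struct {
	Text       *string
	CachePoint *bedrockCachePoint
}

func isNovaModel(model string) bool { return strings.Contains(model, "nova") }

// convertBlock attaches a default cache point only when the source block carries cache_control.
func convertBlock(b bifrostBlock) bedrockBlock {
	out := bedrockBlock{Text: &b.Text}
	if b.CacheControl != nil {
		out.CachePoint = &bedrockCachePoint{Type: "default"}
	}
	return out
}

// toolCachePointAllowed mirrors the Nova guard on the tool-config path.
func toolCachePointAllowed(model string, cc *CacheControl) bool {
	return cc != nil && !isNovaModel(model)
}

func main() {
	blk := convertBlock(bifrostBlock{Text: "big context", CacheControl: &CacheControl{Type: "ephemeral"}})
	fmt.Println(blk.CachePoint != nil)                                          // true
	fmt.Println(toolCachePointAllowed("amazon.nova-pro-v1:0", &CacheControl{})) // false
}
```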
core/providers/bedrock/types.go (1)
155-160: Bedrock CachePoint type additions are coherent and backward compatible
The introduction of BedrockCachePointType, BedrockCachePoint, and the new CachePoint fields on BedrockSystemMessage, BedrockContentBlock, and BedrockTool is consistent with how they’re consumed in the Bedrock utils/chat paths:
- All new fields are pointers with omitempty, so they don’t change existing JSON unless explicitly set.
- The single BedrockCachePointTypeDefault constant matches the current usage; additional variants can be added later without API breakage.
- Struct layout for BedrockTool keeps ToolSpec and CachePoint orthogonal, which fits the “sentinel tool” pattern used for tool-cache checkpoints.
Looks good as a foundation for prompt caching.
Also applies to: 162-201, 283-287
core/providers/anthropic/responses.go (2)
2151-2185: CacheControl round‑trip for text/image blocks and tool outputs is consistent
Across the various content-block converters:
- Anthropic → Bifrost (convertAnthropicContentBlocksToResponsesMessages*) now copies block.CacheControl into the corresponding ResponsesMessageContentBlock for text (both input and output) and for tool-result text blocks.
- Bifrost → Anthropic (convertContentBlockToAnthropic, convertBifrostContentBlocksToAnthropic, toBifrostResponsesImageBlock) propagates CacheControl on both text and image blocks back onto AnthropicContentBlock.CacheControl.
This ensures CacheControl survives:
- Regular messages,
- Tool result payloads, and
- Image blocks
in both directions without introducing new branches or regressions. The wiring looks internally consistent with the shared CacheControl schema.
Also applies to: 2263-2276, 2359-2401, 2467-2508, 3326-3347, 3377-3383
2982-3087: Tool‑level CacheControl mapping between Anthropic and Responses tools is symmetric
The changes to convertAnthropicToolToBifrost and convertBifrostToolToAnthropic to copy tool.CacheControl:
- Ensure that prompt-caching hints attached at the tool-definition level are preserved when moving between Anthropic’s native tool schema and the generic ResponsesTool abstraction.
- Keep the mapping symmetric, so round-tripping a tool won’t silently strip or duplicate cache metadata.
No functional issues here; this is a straightforward and correct extension of the existing tool-conversion logic.
Also applies to: 3183-3267
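A tiny round‑trip check of the kind that would pin this symmetry down, written against stand‑in structs rather than the real converters:

```go
package anthropic

import "testing"

type cacheControl struct{ Type string }

type anthropicTool struct {
	Name         string
	CacheControl *cacheControl
}

type responsesTool struct {
	Name         string
	CacheControl *cacheControl
}

func anthropicToBifrost(t anthropicTool) responsesTool {
	return responsesTool{Name: t.Name, CacheControl: t.CacheControl}
}

func bifrostToAnthropic(t responsesTool) anthropicTool {
	return anthropicTool{Name: t.Name, CacheControl: t.CacheControl}
}

// TestToolCacheControlRoundTrip asserts that a cached tool definition survives
// Anthropic → Bifrost → Anthropic without losing or duplicating the hint.
func TestToolCacheControlRoundTrip(t *testing.T) {
	in := anthropicTool{Name: "get_weather", CacheControl: &cacheControl{Type: "ephemeral"}}
	out := bifrostToAnthropic(anthropicToBifrost(in))
	if out.CacheControl == nil || out.CacheControl.Type != "ephemeral" {
		t.Fatalf("cache_control was dropped or altered: %+v", out.CacheControl)
	}
}
```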
core/providers/bedrock/responses.go (7)
1283-1289: LGTM! Cache point handling follows Bedrock API contract.
The logic correctly applies CacheControl to the last tool when a CachePoint marker is encountered, with appropriate Nova model exclusion.
1571-1578: LGTM! Correct bidirectional cache point propagation.
The code properly converts Bifrost CacheControl to a separate Bedrock CachePoint entry, maintaining consistency with Bedrock's API structure.
2382-2384: LGTM! Clean refactoring for batch system message processing.
The signature change from single-message to batch processing enables proper handling of CachePoint relationships across multiple system messages.
2413-2419: LGTM! Correct system message cache control propagation.
Properly converts Bifrost CacheControl to separate Bedrock CachePoint entries, consistent with the tool-level handling.
2450-2484: LGTM! System message conversion correctly handles cache points.
The function properly converts Bedrock CachePoint entries to Bifrost CacheControl on the last content block of the preceding message. As confirmed in previous reviews, Bedrock's API contract guarantees CachePoint only appears after content blocks, making the current implementation safe.
2656-2665: LGTM! Message content cache point handling is correct.
Consistent with system message handling (lines 2450-2484), this correctly converts CachePoint to CacheControl on the last content block. As previously confirmed, upstream validation ensures CachePoint never appears before content blocks.
2774-2781: LGTM! Completes bidirectional cache control conversion.
Correctly propagates CacheControl to separate Bedrock CachePoint entries in content blocks, maintaining consistency with tool and system message handling.
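For intuition, a small sketch of the "attach to the previous entry" rule these comments describe, with toy types in place of the real Bedrock/Bifrost structs:

```go
package main

import "fmt"

type CacheControl struct{ Type string }

type bedrockTool struct {
	Name       string
	CachePoint bool // true marks a standalone {"cachePoint": {"type": "default"}} entry
}

type bifrostTool struct {
	Name         string
	CacheControl *CacheControl
}

// fromBedrockTools mirrors the rule described above: a cachePoint entry
// annotates whatever tool came immediately before it.
func fromBedrockTools(tools []bedrockTool) []bifrostTool {
	out := make([]bifrostTool, 0, len(tools))
	for _, t := range tools {
		if t.CachePoint {
			if len(out) > 0 {
				out[len(out)-1].CacheControl = &CacheControl{Type: "ephemeral"}
			}
			continue
		}
		out = append(out, bifrostTool{Name: t.Name})
	}
	return out
}

func main() {
	tools := fromBedrockTools([]bedrockTool{{Name: "get_weather"}, {CachePoint: true}})
	fmt.Println(tools[0].CacheControl.Type) // ephemeral
}
```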
Actionable comments posted: 1
🧹 Nitpick comments (3)
core/internal/testutil/test_retry_framework.go (1)
832-834: Consider showing max attempts in retry log.
The comment clarification about attempt semantics is good and addresses the past review concern. However, the log format shows only the next attempt number without the maximum:
```go
t.Logf("🔄 Retrying streaming test (attempt %d): %s", attempt+1, reason)
```
Showing the max attempts (e.g., attempt %d/%d) would improve visibility into retry progress, especially useful during test debugging.
Apply this diff to show max attempts:
```diff
- t.Logf("🔄 Retrying streaming test (attempt %d): %s", attempt+1, reason)
+ t.Logf("🔄 Retrying streaming test (attempt %d/%d): %s", attempt+1, config.MaxAttempts, reason)
```
core/providers/bedrock/chat.go (1)
63-78: CachePoint handling has an edge case when no content blocks exist.
When a CachePoint block is encountered but contentBlocks is empty (line 73), the code silently skips attaching the CacheControl. This could happen if:
- The first content block is a CachePoint (unlikely but possible)
- All previous blocks were filtered out
Consider logging or handling this case explicitly, though it may be intentional to ignore orphan CachePoints.
```diff
 } else if contentBlock.CachePoint != nil {
 	// add it to last content block
 	if len(contentBlocks) > 0 {
 		contentBlocks[len(contentBlocks)-1].CacheControl = &schemas.CacheControl{
 			Type: schemas.CacheControlTypeEphemeral,
 		}
-	}
+	} // CachePoint without preceding content block is ignored
 }
```
tests/integrations/tests/test_anthropic.py (1)
1669-1695: Consider making Anthropic validate_cache_* a bit more defensive
These helpers assume usage.input_tokens always exists; if Anthropic (or our wrapper) ever omits it, the getattr(usage, "input_tokens", 0) form would make them safer without changing semantics.
```diff
-def validate_cache_write(usage: Any, operation: str) -> int:
-    """Validate cache write operation and return tokens written"""
-    print(f"{operation} usage - input_tokens: {usage.input_tokens}, "
-          f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
-          f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}")
+def validate_cache_write(usage: Any, operation: str) -> int:
+    """Validate cache write operation and return tokens written"""
+    print(
+        f"{operation} usage - input_tokens: {getattr(usage, 'input_tokens', 0)}, "
+        f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
+        f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}"
+    )
 @@
-def validate_cache_read(usage: Any, operation: str) -> int:
-    """Validate cache read operation and return tokens read"""
-    print(f"{operation} usage - input_tokens: {usage.input_tokens}, "
-          f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
-          f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}")
+def validate_cache_read(usage: Any, operation: str) -> int:
+    """Validate cache read operation and return tokens read"""
+    print(
+        f"{operation} usage - input_tokens: {getattr(usage, 'input_tokens', 0)}, "
+        f"cache_creation_input_tokens: {getattr(usage, 'cache_creation_input_tokens', 0)}, "
+        f"cache_read_input_tokens: {getattr(usage, 'cache_read_input_tokens', 0)}"
+    )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- core/internal/testutil/account.go (3 hunks)
- core/internal/testutil/prompt_caching.go (3 hunks)
- core/internal/testutil/test_retry_framework.go (5 hunks)
- core/providers/anthropic/anthropic_test.go (2 hunks)
- core/providers/anthropic/chat.go (6 hunks)
- core/providers/anthropic/responses.go (13 hunks)
- core/providers/anthropic/types.go (2 hunks)
- core/providers/anthropic/utils.go (1 hunks)
- core/providers/bedrock/bedrock_test.go (2 hunks)
- core/providers/bedrock/chat.go (1 hunks)
- core/providers/bedrock/responses.go (8 hunks)
- core/providers/bedrock/types.go (6 hunks)
- core/providers/bedrock/utils.go (6 hunks)
- core/providers/openai/responses_marshal_test.go (0 hunks)
- core/providers/openai/types.go (1 hunks)
- core/schemas/chatcompletions.go (2 hunks)
- core/schemas/responses.go (2 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (7 hunks)
- tests/integrations/tests/test_anthropic.py (5 hunks)
- tests/integrations/tests/test_bedrock.py (3 hunks)
- tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
- core/providers/anthropic/utils.go
- core/internal/testutil/account.go
- core/providers/anthropic/types.go
- core/schemas/chatcompletions.go
- tests/integrations/tests/utils/common.py
- core/providers/anthropic/anthropic_test.go
- core/schemas/responses.go
- tests/integrations/config.yml
- core/internal/testutil/prompt_caching.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openai/types.go, core/schemas/utils.go, core/providers/bedrock/chat.go, core/internal/testutil/test_retry_framework.go, core/providers/anthropic/responses.go, core/providers/bedrock/bedrock_test.go, core/providers/anthropic/chat.go, tests/integrations/tests/test_bedrock.py, tests/integrations/tests/test_anthropic.py, core/providers/bedrock/types.go, core/providers/bedrock/responses.go, core/providers/bedrock/utils.go
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/openai/types.go, core/schemas/utils.go, core/providers/bedrock/chat.go, core/internal/testutil/test_retry_framework.go, core/providers/anthropic/responses.go, core/providers/bedrock/bedrock_test.go, core/providers/anthropic/chat.go, core/providers/bedrock/types.go, core/providers/bedrock/responses.go, core/providers/bedrock/utils.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
🧬 Code graph analysis (7)
- core/providers/openai/types.go (2)
  - core/schemas/responses.go (3): ResponsesMessage (318-331), ResponsesMessageContentBlock (403-418), ResponsesToolMessage (469-489)
  - core/schemas/chatcompletions.go (1): CacheControl (610-613)
- core/providers/bedrock/chat.go (3)
  - core/schemas/chatcompletions.go (8): ChatContentBlock (592-602), ChatContentBlockTypeText (584-584), CacheControl (610-613), CacheControlTypeEphemeral (607-607), ChatAssistantMessageToolCall (702-708), ChatAssistantMessageToolCallFunction (711-714), ChatReasoningDetails (739-746), BifrostReasoningDetailsTypeText (735-735)
  - core/schemas/bifrost.go (1): BifrostContextKeyStructuredOutputToolName (135-135)
  - core/schemas/utils.go (1): Ptr (16-18)
- core/providers/anthropic/responses.go (4)
  - core/schemas/responses.go (4): ResponsesMessage (318-331), ResponsesInputMessageRoleSystem (338-338), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418)
  - core/schemas/utils.go (1): Ptr (16-18)
  - core/schemas/chatcompletions.go (1): CacheControl (610-613)
  - core/providers/anthropic/types.go (1): AnthropicContentBlockTypeText (204-204)
- core/providers/anthropic/chat.go (2)
  - core/schemas/chatcompletions.go (6): CacheControl (610-613), ChatContentBlock (592-602), ChatAssistantMessageToolCallFunction (711-714), ChatAssistantMessageToolCall (702-708), ChatToolTypeFunction (240-240), ChatReasoningDetails (739-746)
  - core/providers/anthropic/types.go (3): AnthropicContentBlockTypeText (204-204), AnthropicContentBlockTypeToolUse (206-206), AnthropicContentBlockTypeThinking (212-212)
- tests/integrations/tests/test_anthropic.py (3)
  - tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
  - tests/integrations/tests/utils/common.py (1): assert_valid_chat_response (801-833)
  - tests/integrations/tests/test_bedrock.py (2): validate_cache_write (1799-1809), validate_cache_read (1811-1821)
- core/providers/bedrock/responses.go (4)
  - core/schemas/utils.go (1): IsNovaModel (1043-1045)
  - core/schemas/chatcompletions.go (2): CacheControl (610-613), CacheControlTypeEphemeral (607-607)
  - core/providers/bedrock/types.go (5): BedrockTool (284-287), BedrockCachePoint (199-201), BedrockCachePointTypeDefault (195-195), BedrockSystemMessage (156-160), BedrockContentBlock (163-190)
  - core/schemas/responses.go (5): ResponsesMessage (318-331), ResponsesInputMessageRoleSystem (338-338), ResponsesMessageTypeMessage (294-294), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418)
- core/providers/bedrock/utils.go (2)
  - core/providers/bedrock/types.go (6): BedrockContentBlock (163-190), BedrockCachePoint (199-201), BedrockCachePointTypeDefault (195-195), BedrockMessage (150-153), BedrockToolConfig (278-281), BedrockTool (284-287)
  - core/schemas/utils.go (1): IsNovaModel (1043-1045)
🔇 Additional comments (38)
core/providers/openai/types.go (1)
178-211: LGTM! Data mutation issue properly resolved.
The deep-copy implementation correctly addresses the previous concern about in-place modification. The approach is sound:
- Creates a new message slice and copies each message
- Deep copies Content and ContentBlocks when present, strips CacheControl, and replaces the pointer
- Deep copies ResponsesToolMessage, Output, and output blocks when present, strips CacheControl, and replaces the pointer
- Preserves all other fields in the copied structs
- Original data remains unmodified
The nil checks ensure safe dereferencing, and the logic correctly handles cases where content or tool blocks may be absent.
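A stripped-down sketch of that copy-then-clear pattern, using toy structs in place of the real OpenAI/Bifrost message types:

```go
package main

import "fmt"

type CacheControl struct{ Type string }

type contentBlock struct {
	Text         string
	CacheControl *CacheControl
}

type message struct {
	Role   string
	Blocks []contentBlock
}

// stripCacheControl returns a deep copy with cache_control removed,
// leaving the original messages untouched (the property verified above).
func stripCacheControl(msgs []message) []message {
	out := make([]message, len(msgs))
	for i, m := range msgs {
		copied := m
		copied.Blocks = make([]contentBlock, len(m.Blocks))
		for j, b := range m.Blocks {
			b.CacheControl = nil // clear only on the copy
			copied.Blocks[j] = b
		}
		out[i] = copied
	}
	return out
}

func main() {
	orig := []message{{Role: "user", Blocks: []contentBlock{{Text: "hi", CacheControl: &CacheControl{Type: "ephemeral"}}}}}
	clean := stripCacheControl(orig)
	fmt.Println(clean[0].Blocks[0].CacheControl == nil, orig[0].Blocks[0].CacheControl != nil) // true true
}
```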
core/providers/bedrock/bedrock_test.go (2)
73-77: LGTM! Prompt caching configuration added correctly.
The test configuration properly includes the new PromptCachingModel field set to claude-4.5-sonnet, which aligns with the reasoning model. The field structure and assignments follow the existing pattern.
95-95: LGTM! Prompt caching scenario enabled.
The PromptCaching: true flag correctly enables prompt caching tests for the Bedrock provider, which is consistent with the PR objectives.
core/internal/testutil/test_retry_framework.go (3)
2153-2158: LGTM! Improved logging for attempt tracking.
The added per-attempt start logs enhance test observability by clearly distinguishing the first attempt from retries and showing progress (attempt X/Y format).
2205-2218: LGTM! Enhanced retry error logging and timing visibility.
The added error logging, clarified attempt parameter comments, and waiting logs improve test observability and address the past review concerns about attempt semantics consistency.
2241-2254: LGTM! Critical fix: Continue retrying on nil channel instead of early return.
The change from returning early to continuing the retry loop (line 2254) is important for retry resilience. When the response channel is nil, the code now properly retries instead of failing immediately. The added logging and comment "CRITICAL: Must continue to retry, not return" correctly emphasize this behavior.
core/schemas/utils.go (1)
1042-1045: The search confirms there is only one IsNovaModel function declaration in core/schemas/utils.go (lines 1043-1045). No duplicate declarations exist, so there is no compilation error to address.
Likely an incorrect or invalid review comment.
core/providers/anthropic/responses.go (12)
1344-1345: LGTM on centralized system content handling.
The signature change to accept systemContent *AnthropicContent cleanly centralizes system message handling and aligns with the PR's goal of propagating cache control through the conversion pipeline.
2066-2097: Verify handling of nil systemContent.ContentStr.
When systemContent.ContentStr is not nil but points to an empty string, the condition on line 2068 evaluates to false, falling through to the ContentBlocks branch. If ContentBlocks is also nil, no system message is emitted. This is likely intended behavior, but worth confirming.
Also, the function correctly guards against empty contentBlocks at line 2086 before appending, which addresses the concern from the past review comment about empty content blocks.
2174-2181: CacheControl correctly propagated for grouped text blocks.
The CacheControl field is properly attached when converting Anthropic text blocks to Bifrost format in the grouped message conversion path.
2263-2266: CacheControl propagated in tool result content blocks.
The CacheControl is correctly preserved when converting text content blocks within tool results.
2377-2380: CacheControl propagated for output message text blocks.
Text blocks in output messages correctly include CacheControl.
2390-2397: CacheControl propagated for input message text blocks.
Input message text blocks now use ContentBlocks instead of ContentStr to preserve CacheControl. This is a good change that ensures cache metadata is not lost.
2495-2498: Consistent CacheControl propagation in tool result conversions.
Both tool result and MCP tool result content block conversions correctly preserve CacheControl.
Also applies to: 2560-2563
3082-3086: CacheControl propagated from Anthropic tools to Bifrost tools.
The conversion correctly transfers CacheControl from AnthropicTool to the Bifrost ResponsesTool structure.
3262-3266: CacheControl propagated from Bifrost tools back to Anthropic tools.
The reverse conversion also correctly preserves CacheControl, ensuring bidirectional fidelity.
3330-3333: CacheControl propagated in content block to Anthropic conversion.
Text blocks correctly include CacheControl when converting to Anthropic format.
3343-3344: CacheControl preserved for image block conversions.
Image blocks correctly pass CacheControl through to the Anthropic image block conversion.
3382-3383: CacheControl preserved in toBifrostResponsesImageBlock.
Image blocks correctly include CacheControl when converting to Bifrost format.
core/providers/anthropic/chat.go (6)
65-68: CacheControl correctly propagated for tools.
Tools now properly include CacheControl when converting from Bifrost to Anthropic format.
150-153: CacheControl propagated in system content blocks.
System message content blocks correctly include CacheControl.
185-188: CacheControl propagated in tool result content blocks.
Tool result text blocks correctly preserve CacheControl.
242-245: CacheControl propagated in message content blocks.
Regular message text blocks correctly include CacheControl.
320-366: Refactored content processing looks good.
The switch-based content block processing is cleaner and properly handles:
- Text blocks with CacheControl propagation
- Tool use blocks with proper argument marshaling
- Thinking blocks with signature preservation
The refactor maintains all existing functionality while adding CacheControl support.
505-508: CacheControl propagated in response content blocks.
ToAnthropicChatResponse correctly includes CacheControl when converting content blocks back to Anthropic format.
core/providers/bedrock/types.go (4)
159-160: CachePoint field added to BedrockSystemMessage.
The optional CachePoint field enables cache signaling for system messages in Bedrock requests.
285-287: CachePoint field added to BedrockTool.
Tools can now include cache points for prompt caching optimization.
645-646: Minor formatting changes to ETag fields.
These appear to be whitespace/alignment changes only with no functional impact.
Also applies to: 663-664, 677-678
187-201: New BedrockCachePoint types are well-structured.
The new types follow Go conventions and Bedrock's API patterns:
- BedrockCachePointType as a typed string constant
- BedrockCachePointTypeDefault as the standard value
- BedrockCachePoint struct with the required type field
The implementation is consistent across the codebase with proper JSON struct tags and appropriate usage in content block handling.
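A minimal sketch of those shapes, with the JSON tags and trimmed field lists given as assumptions rather than the exact definitions in core/providers/bedrock/types.go:

```go
package bedrock

// BedrockCachePointType and BedrockCachePoint mirror the cache-point shape
// described above; only the fields called out in the review are sketched.
type BedrockCachePointType string

const BedrockCachePointTypeDefault BedrockCachePointType = "default"

type BedrockCachePoint struct {
	Type BedrockCachePointType `json:"type"`
}

// Optional pointers with omitempty keep existing payloads unchanged when unset.
type BedrockSystemMessage struct {
	Text       *string            `json:"text,omitempty"`
	CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"`
}

type BedrockToolSpec struct {
	Name string `json:"name,omitempty"`
}

type BedrockTool struct {
	ToolSpec   *BedrockToolSpec   `json:"toolSpec,omitempty"`
	CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"` // sentinel entry used for tool cache checkpoints
}
```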
core/providers/bedrock/chat.go (3)
80-120: Tool call processing logic is correct.
The refactored code properly:
- Checks for structured output tool and sets contentStr without adding to toolCalls
- Handles regular tool calls with proper argument marshaling
- Uses continue to skip adding structured output to toolCalls
This addresses the past review comment about structured output dropping normal content.
122-135: Reasoning content extraction is correct.
Reasoning blocks are properly extracted with text and signature preservation. The nil check on ReasoningText prevents panics.
138-142: MessageContent construction preserves both contentStr and contentBlocks.
Unlike the previous implementation that collapsed single text blocks to ContentStr, this version keeps both fields, which preserves CacheControl metadata. This addresses the past review comment about dropping CacheControl.
tests/integrations/tests/test_anthropic.py (1)
1421-1614: Prompt-caching tests for Anthropic look correct and aligned with Bedrock coverage
The three tests exercise system, message, and tools checkpoints, assert non‑trivial cache write/read usage, and use approximate equality rather than brittle counter invariants. The cross‑provider parametrization and use of PROMPT_CACHING_LARGE_CONTEXT/PROMPT_CACHING_TOOLS are consistent with the Bedrock tests.
core/providers/bedrock/utils.go (1)
30-32: Cache-control propagation into Bedrock content blocks and tools looks consistent
Passing the model into convertToolConfig and emitting BedrockCachePoint for text/image/tool‑result blocks and tools when CacheControl is set is in line with the new BedrockContentBlock/BedrockTool schema and with how the tests construct prompt‑caching ToolConfig (tools plus a trailing cachePoint sentinel). The Nova guard in convertToolConfig is also a sensible safety check.
Also applies to: 323-348, 392-420, 599-667
tests/integrations/tests/test_bedrock.py (1)
1799-1821: Bedrock cache-usage validators are simple and aligned with provider semantics
The validate_cache_write/validate_cache_read helpers correctly assert positive cacheWriteInputTokens/cacheReadInputTokens and log the raw usage dict. This matches how Bedrock reports prompt-caching metrics and keeps the tests focused on “did we write/read from cache” rather than exact counter behavior.
core/providers/bedrock/responses.go (3)
1282-1290: Tool-level CachePoint ↔ CacheControl round-trip is coherent
Mapping Bedrock {"cachePoint": {"type": "default"}} entries into CacheControlTypeEphemeral on the last Bifrost tool (and back again by appending a CachePoint-only BedrockTool when tool.CacheControl != nil and the model is not Nova) matches the way the tests construct ToolConfig and keeps the conversion symmetric.
Also applies to: 1572-1578
2378-2385: CachePoint handling for system and content blocks is consistent in both directions
- Bedrock → Bifrost: convertBedrockSystemMessageToBifrostMessages and the block.CachePoint branch in convertSingleBedrockMessageToBifrostMessages correctly attach an ephemeral CacheControl to the last content block of the last relevant message.
- Bifrost → Bedrock: convertBifrostMessageToBedrockSystemMessages and convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks emit separate CachePoint-only system messages/content blocks whenever a CacheControl is present.
This matches the Bedrock schema (BedrockSystemMessage.CachePoint, BedrockContentBlock.CachePoint) and the new prompt-caching tests.
Also applies to: 2413-2419, 2450-2484, 2657-2665, 2737-2781
1623-1640: The cache token mapping is semantically correct and intentional; no changes needed.
The concern about both Bedrock fields being named ...InputTokens is understandable, but the split is justified. CacheWriteInputTokens represents cache creation cost (tokens written to cache), which correctly maps to OutputTokensDetails.CachedTokens. This follows the schema design where output token details explicitly document cache creation tokens as "input tokens used to create the cache entry." The mapping treats cache creation as part of completion cost, consistent with OpenAI's model and the broader codebase pattern (see Anthropic's separate CacheCreationInputTokens field).
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/internal/testutil/prompt_caching.go (1)
277-443: Fix loop variable capture in parallel subtests in RunPromptCachingTest.
The subtests call t.Parallel() and close over i and query from the for i, query := range testQueries loop. Without rebinding, all parallel subtests can see the final values of i/query, breaking the “second query” expectations (e.g., cached‑tokens checks).
Rebind per iteration before t.Run:
```diff
-	for i, query := range testQueries {
-		t.Run(query.name, func(t *testing.T) {
+	for i, query := range testQueries {
+		i, query := i, query // capture loop variables for this subtest
+		t.Run(query.name, func(t *testing.T) {
 			userMessage := schemas.ChatMessage{
 				Role: schemas.ChatMessageRoleUser,
@@
-			if i == 1 {
+			if i == 1 {
 				expectations.ProviderSpecific = map[string]interface{}{
 					"min_cached_tokens_percentage": 0.90, // 90% minimum
 					"query_index":                  i,
 				}
 			}
@@
-			if i == 0 {
+			if i == 0 {
 				...
-			} else if i == 1 {
+			} else if i == 1 {
 				...
 			}
 		})
 	}
```
This keeps each subtest’s logs and expectations aligned with the intended query index even when they run in parallel.
core/providers/anthropic/chat.go (1)
42-72: Avoid dropping CacheControl when collapsing single text content to ContentStr in ToAnthropicChatRequest.
You correctly propagate CacheControl from ChatContentBlock into AnthropicContentBlock for system, tool‑result, and normal messages, but the single‑block optimization:
```go
if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
	anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
} else if len(content) > 0 {
	anthropicMsg.Content = AnthropicContent{ContentBlocks: content}
}
```
drops content[0].CacheControl entirely. That means a Bifrost request with a single text block marked for caching won’t send cache_control to Anthropic.
Align this with the Bedrock fix by only collapsing when there is no cache control:
```diff
-	// Set content
-	if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
-		// Single text content can be string
-		anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
-	} else if len(content) > 0 {
-		// Multiple content blocks
-		anthropicMsg.Content = AnthropicContent{ContentBlocks: content}
-	}
+	// Set content
+	if len(content) == 1 &&
+		content[0].Type == AnthropicContentBlockTypeText &&
+		content[0].CacheControl == nil {
+		// Single plain text content can be string
+		anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
+	} else if len(content) > 0 {
+		// Preserve blocks when there are multiple items or cache_control is set
+		anthropicMsg.Content = AnthropicContent{ContentBlocks: content}
+	}
```
This keeps the fast path while preserving cache semantics for cached content.
Also applies to: 141-201, 218-285
♻️ Duplicate comments (4)
tests/integrations/tests/test_bedrock.py (1)
1623-1797: Fix Bedrock client fixture usage in prompt‑caching tests (still calling the fixture function).
test_21_prompt_caching_system, test_22_prompt_caching_messages, and test_23_prompt_caching_tools call bedrock_client.converse(...) but don’t accept bedrock_client as a fixture argument, so bedrock_client here resolves to the fixture function, not an injected client. This will fail at runtime ('function' object has no attribute 'converse').
Update the signatures to use the injected fixture and (optionally) make test_config available instead of _test_config:
```diff
-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_21_prompt_caching_system(self, _test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_21_prompt_caching_system(self, bedrock_client, test_config, provider, model):
 @@
-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_22_prompt_caching_messages(self, _test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_22_prompt_caching_messages(self, bedrock_client, test_config, provider, model):
 @@
-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
-    def test_23_prompt_caching_tools(self, _test_config, provider, model):
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("prompt_caching"))
+    def test_23_prompt_caching_tools(self, bedrock_client, test_config, provider, model):
```
After this change, the .converse() calls will operate on the actual client instance, not the fixture function.
core/providers/bedrock/responses.go (3)
1245-1292: Bidirectional CacheControl↔CachePoint wiring for tools/content looks coherent
The new logic that:
- Reads tool.CachePoint entries from Bedrock’s ToolConfig.Tools and applies an ephemeral CacheControl to the last bifrostReq.Params.Tools entry, and
- Emits separate BedrockTool{CachePoint: ...} entries whenever a Bifrost tool has CacheControl (for non‑Nova models),
- Emits / consumes CachePoint blocks around content via convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks and block.CachePoint handling in convertSingleBedrockMessageToBifrostMessages,
is internally consistent and symmetric with the chat-path changes.
This relies on the Bedrock contract that cache-point tools and cache-point content blocks only appear after the corresponding function/content they annotate; given earlier discussion that the provider enforces this, the current behavior is reasonable.
```bash
#!/bin/bash
# Sanity-check that CacheControl is only ever translated to BedrockCachePointTypeDefault
# and that Nova models are gated consistently.
rg -n "CachePointTypeDefault" core/providers/bedrock
rg -n "IsNovaModel" core/providers
```
Also applies to: 1535-1587, 2737-2781
1702-1713: CacheRead/CacheWrite token mapping still overloads OutputTokensDetails for input metrics
ToBifrostResponsesResponse maps:
- response.Usage.CacheReadInputTokens → Usage.InputTokensDetails.CachedTokens
- response.Usage.CacheWriteInputTokens → Usage.OutputTokensDetails.CachedTokens
and ToBedrockConverseResponse inverts this by reading InputTokensDetails into CacheReadInputTokens and OutputTokensDetails into CacheWriteInputTokens.
Semantically, both Bedrock metrics are input token counts, so storing cache-write values under OutputTokensDetails is confusing and makes the schema harder to reason about (same concern as the earlier review on this block). If feasible, consider either:
- Introducing a dedicated “cache_write_input_tokens” style field in ResponsesResponseInputTokens, or
- At minimum documenting that OutputTokensDetails.CachedTokens carries cache‑write input tokens for Bedrock/Anthropic rather than cached output tokens.
Right now this is more a schema semantics issue than a functional bug, but it’s easy for downstream consumers to misinterpret.
```bash
#!/bin/bash
# Inspect all usages of InputTokensDetails/OutputTokensDetails.CachedTokens
rg -n "InputTokensDetails" core -C3
rg -n "OutputTokensDetails" core -C3
```
Also applies to: 1623-1641
2381-2385: Centralized Bedrock system-message conversion with CachePoint looks good
Routing system messages through convertBedrockSystemMessageToBifrostMessages and:
- Emitting system ResponsesMessage entries per sysMsg.Text, and
- Translating sysMsg.CachePoint into an ephemeral CacheControl on the last content block of the last system message,
keeps system cache hints in sync with the non‑system content path and avoids scattered special‑cases.
Given previous clarification that Bedrock only emits CachePoint after at least one content block, the silent no-op when bifrostMessages is empty is acceptable.
Also applies to: 2450-2484
🧹 Nitpick comments (1)
core/providers/bedrock/utils.go (1)
320-351: CacheControl→CachePoint mapping for text/image blocks is consistent but drops TTL
Using per-block temporaries in convertToolMessage and convertContentBlock to attach a BedrockCachePoint{Type: BedrockCachePointTypeDefault} when block.CacheControl != nil correctly propagates “has cache hint” from Bifrost to Bedrock for both text and image content.
Note that any CacheControl.TTL is currently ignored for Bedrock; if TTL semantics become important for Bedrock later, you’ll need either a provider-specific TTL mapping or a short comment noting that TTL is intentionally only supported on OpenAI today.
Also applies to: 389-421
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- core/internal/testutil/account.go (3 hunks)
- core/internal/testutil/prompt_caching.go (3 hunks)
- core/internal/testutil/test_retry_framework.go (5 hunks)
- core/providers/anthropic/anthropic_test.go (2 hunks)
- core/providers/anthropic/chat.go (6 hunks)
- core/providers/anthropic/responses.go (13 hunks)
- core/providers/anthropic/types.go (2 hunks)
- core/providers/anthropic/utils.go (1 hunks)
- core/providers/bedrock/bedrock_test.go (2 hunks)
- core/providers/bedrock/chat.go (1 hunks)
- core/providers/bedrock/responses.go (8 hunks)
- core/providers/bedrock/types.go (6 hunks)
- core/providers/bedrock/utils.go (6 hunks)
- core/providers/openai/responses_marshal_test.go (0 hunks)
- core/providers/openai/types.go (1 hunks)
- core/schemas/chatcompletions.go (2 hunks)
- core/schemas/responses.go (2 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (7 hunks)
- tests/integrations/tests/test_anthropic.py (5 hunks)
- tests/integrations/tests/test_bedrock.py (3 hunks)
- tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
- core/providers/anthropic/types.go
- core/providers/anthropic/anthropic_test.go
- core/internal/testutil/test_retry_framework.go
- core/schemas/responses.go
- core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
tests/integrations/config.yml, core/providers/anthropic/utils.go, core/internal/testutil/account.go, core/providers/bedrock/responses.go, core/providers/bedrock/bedrock_test.go, core/internal/testutil/prompt_caching.go, core/providers/bedrock/utils.go, core/providers/anthropic/chat.go, core/schemas/chatcompletions.go, core/providers/bedrock/types.go, core/providers/anthropic/responses.go, tests/integrations/tests/utils/common.py, core/providers/bedrock/chat.go, core/providers/openai/types.go, tests/integrations/tests/test_anthropic.py, tests/integrations/tests/test_bedrock.py
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/anthropic/utils.go, core/internal/testutil/account.go, core/providers/bedrock/responses.go, core/providers/bedrock/bedrock_test.go, core/internal/testutil/prompt_caching.go, core/providers/bedrock/utils.go, core/providers/anthropic/chat.go, core/schemas/chatcompletions.go, core/providers/bedrock/types.go, core/providers/anthropic/responses.go, core/providers/bedrock/chat.go, core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
🧬 Code graph analysis (9)
- core/providers/anthropic/utils.go (2)
  - core/providers/anthropic/types.go (2): AnthropicContentBlockTypeImage (205-205), AnthropicImageSource (234-239)
  - core/schemas/chatcompletions.go (1): CacheControl (610-613)
- core/providers/bedrock/responses.go (4)
  - core/schemas/utils.go (1): IsNovaModel (1043-1045)
  - core/schemas/chatcompletions.go (2): CacheControl (610-613), CacheControlTypeEphemeral (607-607)
  - core/providers/bedrock/types.go (5): BedrockTool (284-287), BedrockCachePoint (199-201), BedrockCachePointTypeDefault (195-195), BedrockSystemMessage (156-160), BedrockContentBlock (163-190)
  - core/schemas/responses.go (3): ResponsesMessage (318-331), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418)
- core/internal/testutil/prompt_caching.go (2)
  - core/schemas/chatcompletions.go (4): CacheControl (610-613), CacheControlTypeEphemeral (607-607), ChatContentBlock (592-602), ChatContentBlockTypeText (584-584)
  - core/schemas/utils.go (1): Ptr (16-18)
- core/providers/anthropic/chat.go (3)
  - core/schemas/chatcompletions.go (4): CacheControl (610-613), ChatContentBlock (592-602), ChatAssistantMessageToolCallFunction (711-714), ChatAssistantMessageToolCall (702-708)
  - core/providers/anthropic/types.go (3): AnthropicContentBlockTypeText (204-204), AnthropicContentBlockTypeToolUse (206-206), AnthropicContentBlockTypeThinking (212-212)
  - core/schemas/utils.go (1): Ptr (16-18)
- core/providers/anthropic/responses.go (4)
  - core/schemas/responses.go (4): ResponsesMessage (318-331), ResponsesInputMessageRoleSystem (338-338), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418)
  - core/schemas/utils.go (1): Ptr (16-18)
  - core/schemas/chatcompletions.go (1): CacheControl (610-613)
  - core/providers/anthropic/types.go (1): AnthropicContentBlockTypeText (204-204)
- core/providers/bedrock/chat.go (3)
  - core/schemas/chatcompletions.go (6): ChatContentBlock (592-602), ChatContentBlockTypeText (584-584), CacheControl (610-613), CacheControlTypeEphemeral (607-607), ChatAssistantMessageToolCall (702-708), ChatAssistantMessageToolCallFunction (711-714)
  - core/schemas/bifrost.go (1): BifrostContextKeyStructuredOutputToolName (135-135)
  - core/schemas/utils.go (1): Ptr (16-18)
- core/providers/openai/types.go (2)
  - core/schemas/responses.go (3): ResponsesMessage (318-331), ResponsesMessageContentBlock (403-418), ResponsesToolMessage (469-489)
  - core/schemas/chatcompletions.go (1): CacheControl (610-613)
- tests/integrations/tests/test_anthropic.py (2)
  - tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
  - tests/integrations/tests/utils/common.py (1): assert_valid_chat_response (801-833)
- tests/integrations/tests/test_bedrock.py (2)
  - tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
  - tests/integrations/tests/test_anthropic.py (2): validate_cache_write (1669-1681), validate_cache_read (1683-1695)
🔇 Additional comments (28)
core/internal/testutil/account.go (2)
21-56: Extending `TestScenarios` with `PromptCaching` looks consistent with the feature matrix. The new `PromptCaching` flag slots cleanly alongside other scenario booleans and is wired into provider configs below; no structural or API-surface issues here.
593-660: Anthropic/Bedrock scenarios enabling `PromptCaching` are aligned with the stack's intent. Setting `PromptCaching: true` only for Anthropic and Bedrock matches the updated integrations config and new prompt-caching tests, while leaving other providers unchanged. No additional wiring seems required in this file.
core/providers/anthropic/utils.go (1)
94-103: Image block conversion now correctly carries `CacheControl` and uses the typed content-block kind. Using `AnthropicContentBlockTypeImage` and forwarding `block.CacheControl` into the Anthropic image block is consistent with the rest of the cache-control plumbing; initialization remains safe even when `ImageURLStruct` is nil.
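To make the flow concrete, here is a minimal, self-contained sketch of the idea; the type and function names (`ChatImageBlock`, `AnthropicImageBlock`, `convertImageBlock`) are simplified stand-ins invented for illustration, not the actual Bifrost definitions:

```go
package sketch

// CacheControl is a simplified stand-in for the shared cache hint type.
type CacheControl struct {
	Type string // e.g. "ephemeral"
}

// ChatImageBlock stands in for the Bifrost-side image content block.
type ChatImageBlock struct {
	URL          string
	CacheControl *CacheControl
}

// AnthropicImageBlock stands in for the provider-side block.
type AnthropicImageBlock struct {
	Type         string
	URL          string
	CacheControl *CacheControl
}

// convertImageBlock forwards the caller's cache hint unchanged: nil stays nil,
// and a set hint travels with the image block into the provider request.
func convertImageBlock(b ChatImageBlock) AnthropicImageBlock {
	return AnthropicImageBlock{
		Type:         "image",
		URL:          b.URL,
		CacheControl: b.CacheControl,
	}
}
```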
core/providers/bedrock/bedrock_test.go (1)
65-107: Bedrock comprehensive test config correctly wires the prompt-caching model and scenario. Using `claude-4.5-sonnet` for both `ReasoningModel` and `PromptCachingModel`, and enabling `PromptCaching: true` in scenarios, matches the broader stack (Claude prompt caching on Bedrock) and reuses existing S3/role batch/file params without introducing new test-only knobs.
core/schemas/chatcompletions.go (2)
244-250: Adding `CacheControl` to `ChatTool` is a safe, backwards-compatible extension. The optional `cache_control` field lets providers receive tool-level cache hints while keeping existing JSON payloads unchanged when absent, and it reuses the shared `CacheControl` schema so chat and responses stay aligned.
591-613: Shared `CacheControl` on `ChatContentBlock` cleanly models provider-specific caching hints. Exposing `CacheControl *CacheControl` on chat content blocks, backed by the new `CacheControlType`/`CacheControl` definitions, gives Anthropic/Bedrock a place to attach ephemeral/TTL metadata without impacting OpenAI's schema; the reuse of this type across chat and responses keeps cache semantics consistent.
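For orientation, a rough sketch of the shape these schema additions take. Field names follow the identifiers quoted in this review, the TTL field is an assumption based on the "ephemeral/TTL metadata" wording, and the real definitions in core/schemas/chatcompletions.go remain authoritative:

```go
package sketch

// CacheControlType mirrors the enum referenced in the review; "ephemeral" is
// the only value exercised by the new tests.
type CacheControlType string

const CacheControlTypeEphemeral CacheControlType = "ephemeral"

// CacheControl is the provider-agnostic cache hint. The TTL field is an
// assumption; the authoritative definition lives in the schemas package.
type CacheControl struct {
	Type CacheControlType `json:"type"`
	TTL  *string          `json:"ttl,omitempty"`
}

// Content blocks and tools carry the hint as an optional pointer, so payloads
// that never set it marshal exactly as before.
type ChatContentBlock struct {
	Type         string        `json:"type"`
	Text         *string       `json:"text,omitempty"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

type ChatTool struct {
	Name         string        `json:"name"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}
```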
core/providers/bedrock/types.go (2)
155-201: `BedrockCachePoint` additions correctly surface Bedrock's cache metadata. Adding `CachePoint` to `BedrockSystemMessage` and `BedrockContentBlock`, alongside the `BedrockCachePointType`/`BedrockCachePoint` definitions, gives the Bedrock adapter a structured place to emit/consume cache points without altering existing payloads when unset.
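A compact sketch of what these type additions look like, assuming the JSON field names follow Bedrock's `cachePoint` convention; the real structs in core/providers/bedrock/types.go are the source of truth and carry more fields:

```go
package sketch

// BedrockCachePointType has a single documented value per the AWS docs cited
// later in this review.
type BedrockCachePointType string

const BedrockCachePointTypeDefault BedrockCachePointType = "default"

// BedrockCachePoint marks a cache breakpoint inside a system/content/tool list.
type BedrockCachePoint struct {
	Type BedrockCachePointType `json:"type"`
}

// The cache point rides alongside existing payloads as an optional field, so a
// request that never sets it is emitted unchanged.
type BedrockSystemMessage struct {
	Text       *string            `json:"text,omitempty"`
	CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"`
}

type BedrockContentBlock struct {
	Text       *string            `json:"text,omitempty"`
	CachePoint *BedrockCachePoint `json:"cachePoint,omitempty"`
}
```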
641-678: The review comment regarding mandatory `ETag` fields on Bedrock file request types cannot be verified without access to the codebase to inspect all call sites and coordinate with the broader PR stack.
tests/integrations/tests/utils/common.py (2)
88-281: PROMPT_CACHING_TOOLS test data is well-structured and consistent with existing tool schemas. The added tool definitions mirror the existing tool parameter shape (type/required/properties) and give prompt-caching tests a rich set of tool payloads without affecting runtime code.
421-481: PROMPT_CACHING_LARGE_CONTEXT is appropriate as a high-token test payload. Using a long, repeated legal-framework prompt ensures prompt-caching tests hit realistic context sizes without introducing any behavioral complexity into the utilities module.
tests/integrations/config.yml (3)
109-128: Updating `bedrock.chat` to the Claude 4.5 Bedrock ARN is consistent with the new caching flows. Pointing the core Bedrock chat model at `us.anthropic.claude-sonnet-4-5-20250929-v1:0` matches the rest of the stack's focus on Claude 4.5 for reasoning/prompt-caching while leaving other capabilities on the stable 3.5 sonnet ARN.
148-320: Prompt-caching scenario flags and capability mapping are coherent across providers. Enabling `prompt_caching` only for Anthropic and Bedrock, while explicitly disabling it for OpenAI, Gemini, and Cohere, matches the current implementation surface; wiring `prompt_caching: "chat"` in `scenario_capabilities` lets the harness reuse existing chat model selection without extra special-cases.
340-363: Adding `prompt_caching` to `scenario_capabilities` correctly reuses the chat capability. Treating prompt-caching scenarios as a `chat` capability keeps the configuration model simple and avoids duplicating per-provider model keys just for caching tests.
core/providers/openai/types.go (1)
172-212: Deep-copying responses messages before stripping `CacheControl` fixes the prior mutation hazard. `MarshalJSON` now builds a `messagesCopy` slice and deep-copies `ContentBlocks` and tool-output blocks before nulling `CacheControl`, so OpenAI wiring stays cache-control-free while the original `ResponsesMessage` graph remains untouched for other consumers.
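A minimal sketch of the copy-then-strip marshaling pattern described above, using simplified stand-in types rather than the real `ResponsesMessage` graph:

```go
package sketch

import "encoding/json"

type CacheControl struct {
	Type string `json:"type"`
}

type ContentBlock struct {
	Text         string        `json:"text"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

type Message struct {
	Role          string         `json:"role"`
	ContentBlocks []ContentBlock `json:"content"`
}

type request struct {
	Messages []Message `json:"messages"`
}

// MarshalJSON strips cache_control for a provider that rejects it, copying the
// slices first so the caller's messages are never mutated.
func (r request) MarshalJSON() ([]byte, error) {
	messagesCopy := make([]Message, len(r.Messages))
	for i, msg := range r.Messages {
		msgCopy := msg
		blocks := make([]ContentBlock, len(msg.ContentBlocks))
		for j, b := range msg.ContentBlocks {
			b.CacheControl = nil // b is a copy of the element, original is untouched
			blocks[j] = b
		}
		msgCopy.ContentBlocks = blocks
		messagesCopy[i] = msgCopy
	}
	type alias request // alias has no MarshalJSON, avoiding recursion
	return json.Marshal(alias{Messages: messagesCopy})
}
```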
tests/integrations/tests/test_bedrock.py (3)
37-40: The prompt-caching section in the docstring looks consistent with the new tests. Entries 21-23 line up with `test_21_prompt_caching_system`, `test_22_prompt_caching_messages`, and `test_23_prompt_caching_tools`.
51-68: Imports for prompt-caching test data are correct and localized. Bringing in `PROMPT_CACHING_TOOLS` and `PROMPT_CACHING_LARGE_CONTEXT` from `.utils.common` matches how they're used below.
1799-1821: Cache usage helpers for Bedrock prompt caching look appropriate. `validate_cache_write`/`validate_cache_read` correctly operate on the Bedrock-style usage dict (`inputTokens`, `cacheWriteInputTokens`, `cacheReadInputTokens`) and only assert that the relevant counter is present and > 0, which should be robust across warm-cache runs.
core/providers/bedrock/chat.go (1)
62-191: Bedrock response → Bifrost conversion correctly preserves cache, tools, and reasoning metadata. The new per-block loop handles text, `CachePoint` → `CacheControl` (ephemeral), tool uses (including structured output via the context key), and reasoning content without dropping data, and the usage mapping exposes `CacheReadInputTokens`/`CacheWriteInputTokens` via `PromptTokensDetails` and `CompletionTokensDetails` consistently with other providers.
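A sketch of the Bedrock → Bifrost direction under the assumption described above (a standalone `cachePoint` block annotates the previously emitted block); names are simplified stand-ins, not the actual converter:

```go
package sketch

type CacheControl struct{ Type string }

type bedrockCachePoint struct{ Type string }

type BedrockBlock struct {
	Text       *string
	CachePoint *bedrockCachePoint
}

type BifrostBlock struct {
	Type         string
	Text         *string
	CacheControl *CacheControl
}

// convertBlocks walks Bedrock content blocks; a standalone cachePoint block is
// folded back onto the previously emitted block as an ephemeral cache hint.
func convertBlocks(in []BedrockBlock) []BifrostBlock {
	var out []BifrostBlock
	for _, b := range in {
		switch {
		case b.CachePoint != nil && len(out) > 0:
			out[len(out)-1].CacheControl = &CacheControl{Type: "ephemeral"}
		case b.Text != nil:
			out = append(out, BifrostBlock{Type: "text", Text: b.Text})
		}
	}
	return out
}
```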
core/internal/testutil/prompt_caching.go (1)
245-270: Ephemeral CacheControl on the `check_compliance_status` tool is consistent with the prompt-caching design. Marking this tool as `CacheControl{Type: Ephemeral}` matches how tools are used in the prompt-caching scenarios and how provider adapters propagate tool-level cache control.
core/providers/anthropic/chat.go (1)
294-367: Anthropic ↔ Bifrost response conversions now propagate cache, tools, and reasoning correctly. `ToBifrostChatResponse` builds `ChatContentBlock`s with `CacheControl`, serializes tool-use inputs into `ChatAssistantMessageToolCall.Function.Arguments` (defaulting to "{}"), and maps thinking blocks into `ChatReasoningDetails` and an aggregated `reasoningText`. `ToAnthropicChatResponse` mirrors this, re-emitting text content blocks with `CacheControl` and reconstructing `tool_use` blocks from Bifrost tool calls, while exposing cache read/creation tokens via `Usage.CacheReadInputTokens`/`CacheCreationInputTokens`. This looks consistent with the prompt-caching tests and other providers.
Also applies to: 421-541
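The "{}" fallback is easy to illustrate; this is a hedged sketch with invented names, not the actual `ToBifrostChatResponse` code:

```go
package sketch

import "encoding/json"

type ToolCallFunction struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

// toolUseToFunction serializes a tool_use input map into the string Arguments
// field; a nil or unmarshalable input degrades to "{}" instead of failing the
// whole response conversion.
func toolUseToFunction(name string, input map[string]any) ToolCallFunction {
	args := "{}"
	if input != nil {
		if raw, err := json.Marshal(input); err == nil {
			args = string(raw)
		}
	}
	return ToolCallFunction{Name: name, Arguments: args}
}
```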
tests/integrations/tests/test_anthropic.py (3)
39-41: Docstring updates for the prompt-caching tests match the new cases. Entries 28-30 correspond directly to `test_28_prompt_caching_system`, `test_29_prompt_caching_messages`, and `test_30_prompt_caching_tools`.
52-94: New prompt-caching imports integrate cleanly with existing utilities. Importing `PROMPT_CACHING_TOOLS` and `PROMPT_CACHING_LARGE_CONTEXT` from `.utils.common` aligns with their usage in tests 28-30.
1421-1614: Anthropic prompt‑caching tests and cache helpers are well-structured and consistent with provider behavior.
Tests 28-30 correctly:
- Parameterize over `prompt_caching` providers/models.
- Use `get_provider_anthropic_client(provider)` plus `format_provider_model`.
- Attach `{"type": "ephemeral"}` cache_control at system, message, and tool levels in the expected Anthropic wire format.
- Assert non-trivial cache creation on the first call and non-trivial cache reads on the second; for system/messages, they also check read ≈ write (`abs(diff) < 100`), which matches how cache metrics are exposed by `ToAnthropicChatResponse`.
`validate_cache_write`/`validate_cache_read`:
- Log `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens`.
- Require the relevant counter to exist and be > 0 without over-constraining warm-cache behavior, addressing earlier strictness concerns.
Overall these tests should give good signal on Anthropic-style prompt caching across providers.
Also applies to: 1669-1695
core/providers/anthropic/responses.go (5)
1340-1347: Shared Anthropic ↔ Bifrost conversion entrypoints with system content are aligned. Updating `ToBifrostResponsesRequest` and `ToBifrostResponsesResponse` to call:
    ConvertAnthropicMessagesToBifrostMessages(request.Messages, request.System, false, provider == schemas.Bedrock)
    ...
    ConvertAnthropicMessagesToBifrostMessages([]AnthropicMessage{tempMsg}, nil, true, false)
with the new `(anthropicMessages, systemContent, isOutputMessage, keepToolsGrouped)` signature centralizes system handling and tool grouping for both request and response paths. This matches how `ConvertBifrostMessagesToAnthropicMessages` already returns regular + system content separately and keeps the two directions symmetric. No correctness issues stand out here.
Also applies to: 1627-1649
2260-2267: Tool-result and text content now round-trip CacheControl between Anthropic and Bifrost. Across the `AnthropicContentBlockTypeToolResult` handling (for both normal and MCP tool outputs) and `convertAnthropicContentBlocksToResponsesMessages` (text case for both input and output messages), you now attach `CacheControl` from `AnthropicContentBlock.CacheControl` to the corresponding `ResponsesMessageContentBlock.CacheControl`. Combined with `convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks` and `convertContentBlockToAnthropic` wiring the reverse direction, this gives a clean end-to-end mapping for cache hints on tool outputs and message text. Behavior looks correct and consistent with the cache-point design in Bedrock.
Also applies to: 2493-2499, 2558-2564, 2370-2399
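A minimal sketch of the pass-through for the text case, with simplified stand-in types; the point is that the cache hint is carried as the same pointer in both directions, so nil stays nil and a set hint survives the round trip:

```go
package sketch

type CacheControl struct{ Type string }

type AnthropicContentBlock struct {
	Type         string
	Text         *string
	CacheControl *CacheControl
}

type ResponsesContentBlock struct {
	Type         string
	Text         *string
	CacheControl *CacheControl
}

// toResponsesText is the text-block case of the round trip: the cache hint is
// forwarded unchanged alongside the text.
func toResponsesText(b AnthropicContentBlock) ResponsesContentBlock {
	return ResponsesContentBlock{
		Type:         "output_text",
		Text:         b.Text,
		CacheControl: b.CacheControl,
	}
}
```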
3325-3347: Anthropic↔Bifrost tool/content CacheControl propagation is symmetricThe new plumbing:
- In
convertContentBlockToAnthropic, copyingblock.CacheControlfor text and image blocks (the latter viaChatContentBlock.CacheControlintoConvertToAnthropicImageBlock), and- In
toBifrostResponsesImageBlock, copyingblock.CacheControlback into the image content block, and- In
convertAnthropicToolToBifrost/convertBifrostToolToAnthropic, shuttlingtool.CacheControlbetweenAnthropicToolandResponsesTool,gives a symmetric mapping for both content‑level and tool‑level cache hints.
This aligns Anthropic’s new
CacheControlfields with the unifiedschemas.CacheControltype and matches the Bedrock/OpenAI handling patterns in the stack.Also applies to: 3376-3383, 2982-3085, 3183-3267
2065-2097: System-content conversion now preserves block-level CacheControl
convertAnthropicSystemToBifrostMessages:
- Emits a system message for
ContentStrwhen non-empty, and- For
ContentBlocks, buildsResponsesMessageContentBlockentries withTextandCacheControlcopied from eachAnthropicContentBlock, only appending a system message when at least one text block exists.This avoids empty system messages and ensures Anthropic system-level cache hints survive into Bifrost. The block
Typeis set to an output-text variant even for input/system messages—verify that downstream code handles both input and output text types equivalently.
2150-2184: Wire CacheControl for symmetry in output-message accumulationIn
convertAnthropicContentBlocksToResponsesMessagesGrouped, the output-message branch accumulates text content without propagatingblock.CacheControl, while the input-message branch correctly includes it. AddingCacheControl: block.CacheControlto the accumulated text content blocks ensures symmetric handling and prevents issues if this function is reused for output messages in the future.accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{ Type: schemas.ResponsesOutputMessageContentTypeText, Text: block.Text, + CacheControl: block.CacheControl, })
e4c62a0 to
5cfbabb
Compare
Actionable comments posted: 0
♻️ Duplicate comments (1)
core/providers/bedrock/responses.go (1)
1706-1712: Semantically incorrect token mapping: CacheWriteInputTokens mapped from OutputTokensDetails.
`CacheWriteInputTokens` represents tokens written to the input cache and should be sourced from `InputTokensDetails`, not `OutputTokensDetails`. Both `CacheReadInputTokens` and `CacheWriteInputTokens` are input-side metrics. The reverse mapping at lines 1636-1640 has the same semantic issue; it places `CacheWriteInputTokens` into `OutputTokensDetails`.
Consider updating both mappings to use `InputTokensDetails` consistently:
     if bifrostResp.Usage.InputTokensDetails != nil {
         bedrockResp.Usage.CacheReadInputTokens = bifrostResp.Usage.InputTokensDetails.CachedTokens
     }
    -if bifrostResp.Usage.OutputTokensDetails != nil {
    -    bedrockResp.Usage.CacheWriteInputTokens = bifrostResp.Usage.OutputTokensDetails.CachedTokens
    -}
    +// Note: CacheWriteInputTokens is also an input metric; consider storing it in InputTokensDetails
    +// or adding a dedicated field if the schema supports it
And the reverse mapping (lines 1636-1640):
     if response.Usage.CacheReadInputTokens > 0 {
         bifrostResp.Usage.InputTokensDetails = &schemas.ResponsesResponseInputTokens{
             CachedTokens: response.Usage.CacheReadInputTokens,
         }
     }
    -if response.Usage.CacheWriteInputTokens > 0 {
    -    bifrostResp.Usage.OutputTokensDetails = &schemas.ResponsesResponseOutputTokens{
    -        CachedTokens: response.Usage.CacheWriteInputTokens,
    -    }
    -}
    +// CacheWriteInputTokens should also map to InputTokensDetails or a separate field
🧹 Nitpick comments (3)
tests/integrations/tests/test_anthropic.py (1)
1561-1608: Prompt caching tools test validates tool caching. The test correctly adds `cache_control` to the last tool and validates cache creation/read behavior. Note that this test doesn't assert `abs(cache_write_tokens - cache_read_tokens) < 100` like the other tests. Consider adding the same relative validation as tests 28 and 29 for consistency:
     cache_read_tokens = validate_cache_read(response2.usage, "Second request")
    +# Validate that cache read tokens are approximately equal to cache creation tokens
    +assert abs(cache_write_tokens - cache_read_tokens) < 100, \
    +    f"Cache read tokens ({cache_read_tokens}) should be close to cache creation tokens ({cache_write_tokens})"
    +
     print(f"✓ Tools caching validated - Cache created: {cache_write_tokens} tokens, "
           f"Cache read: {cache_read_tokens} tokens")
core/providers/anthropic/chat.go (1)
274-281: Consider edge case: single text content with CacheControl loses the annotation. When there's exactly one text content block (lines 275-277), the code uses `ContentStr`, which doesn't carry `CacheControl`. If the original block had `CacheControl` set, it would be silently dropped. Consider checking whether `CacheControl` is set before simplifying to `ContentStr`:
     // Set content
    -if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText {
    +if len(content) == 1 && content[0].Type == AnthropicContentBlockTypeText && content[0].CacheControl == nil {
         // Single text content can be string
         anthropicMsg.Content = AnthropicContent{ContentStr: content[0].Text}
     } else if len(content) > 0 {
tests/integrations/tests/test_bedrock.py (1)
1799-1821: Consider consolidating these helper functions with the Anthropic tests. The `validate_cache_write` and `validate_cache_read` functions here use dictionary key access (`usage.get('cacheWriteInputTokens')`), while the similar functions in `test_anthropic.py` use attribute access (`getattr(usage, 'cache_creation_input_tokens')`). This suggests the response structures differ between providers. If both providers return the same Bifrost-normalized structure, consider extracting these helpers into a shared utility module to reduce duplication and ensure consistent validation logic.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- core/internal/testutil/account.go (3 hunks)
- core/internal/testutil/prompt_caching.go (3 hunks)
- core/internal/testutil/test_retry_framework.go (5 hunks)
- core/providers/anthropic/anthropic_test.go (2 hunks)
- core/providers/anthropic/chat.go (5 hunks)
- core/providers/anthropic/responses.go (13 hunks)
- core/providers/anthropic/types.go (2 hunks)
- core/providers/anthropic/utils.go (1 hunks)
- core/providers/bedrock/bedrock_test.go (2 hunks)
- core/providers/bedrock/chat.go (1 hunks)
- core/providers/bedrock/responses.go (8 hunks)
- core/providers/bedrock/types.go (6 hunks)
- core/providers/bedrock/utils.go (8 hunks)
- core/providers/openai/responses_marshal_test.go (0 hunks)
- core/providers/openai/types.go (1 hunks)
- core/schemas/chatcompletions.go (2 hunks)
- core/schemas/responses.go (2 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (7 hunks)
- tests/integrations/tests/test_anthropic.py (5 hunks)
- tests/integrations/tests/test_bedrock.py (3 hunks)
- tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
- core/providers/anthropic/types.go
- core/providers/anthropic/anthropic_test.go
- tests/integrations/config.yml
- tests/integrations/tests/utils/common.py
- core/internal/testutil/test_retry_framework.go
- core/providers/anthropic/utils.go
- core/schemas/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/openai/types.go
- core/schemas/utils.go
- core/internal/testutil/prompt_caching.go
- core/internal/testutil/account.go
- core/providers/bedrock/bedrock_test.go
- core/providers/bedrock/chat.go
- core/providers/anthropic/chat.go
- core/schemas/chatcompletions.go
- core/providers/bedrock/utils.go
- core/providers/bedrock/types.go
- tests/integrations/tests/test_anthropic.py
- core/providers/bedrock/responses.go
- tests/integrations/tests/test_bedrock.py
- core/providers/anthropic/responses.go
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/openai/types.go
- core/schemas/utils.go
- core/internal/testutil/prompt_caching.go
- core/internal/testutil/account.go
- core/providers/bedrock/bedrock_test.go
- core/providers/bedrock/chat.go
- core/providers/anthropic/chat.go
- core/schemas/chatcompletions.go
- core/providers/bedrock/utils.go
- core/providers/bedrock/types.go
- core/providers/bedrock/responses.go
- core/providers/anthropic/responses.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
- core/providers/bedrock/bedrock_test.go
- core/providers/anthropic/anthropic_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
🧬 Code graph analysis (5)
core/providers/openai/types.go (1)
- core/schemas/responses.go (3): ResponsesMessage (318-331), ResponsesMessageContentBlock (403-418), ResponsesToolMessage (469-489)
core/internal/testutil/prompt_caching.go (2)
- core/schemas/chatcompletions.go (4): CacheControl (610-613), CacheControlTypeEphemeral (607-607), ChatContentBlock (592-602), ChatContentBlockTypeText (584-584)
- core/schemas/utils.go (1): Ptr (16-18)
core/providers/bedrock/chat.go (3)
- core/schemas/chatcompletions.go (6): ChatContentBlock (592-602), ChatContentBlockTypeText (584-584), ChatAssistantMessageToolCall (702-708), ChatAssistantMessageToolCallFunction (711-714), ChatReasoningDetails (739-746), BifrostReasoningDetailsTypeText (735-735)
- core/schemas/bifrost.go (1): BifrostContextKeyStructuredOutputToolName (135-135)
- core/schemas/utils.go (1): Ptr (16-18)
core/providers/bedrock/utils.go (3)
- core/schemas/chatcompletions.go (4): ChatMessage (471-480), ChatContentBlockTypeText (584-584), CacheControl (610-613), ChatContentBlock (592-602)
- core/providers/bedrock/types.go (5): BedrockSystemMessage (156-160), BedrockCachePoint (199-201), BedrockContentBlock (163-190), BedrockToolConfig (278-281), BedrockTool (284-287)
- core/schemas/utils.go (1): IsNovaModel (1043-1045)
core/providers/anthropic/responses.go (4)
- core/schemas/responses.go (6): ResponsesMessage (318-331), ResponsesInputMessageRoleSystem (338-338), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418), ResponsesOutputMessageContentTypeText (396-396), ResponsesInputMessageContentBlockTypeText (392-392)
- core/schemas/utils.go (1): Ptr (16-18)
- core/schemas/chatcompletions.go (1): CacheControl (610-613)
- core/providers/anthropic/types.go (1): AnthropicContentBlockTypeText (204-204)
🔇 Additional comments (44)
core/providers/openai/types.go (1)
178-211: Deep copy implementation correctly addresses the mutation concern. The implementation properly creates copies at each level before stripping `CacheControl`:
- `messagesCopy` for the outer array
- `contentCopy` and `blockCopy` for content blocks
- `toolMsgCopy`, `outputCopy`, and block copies for tool message outputs
This prevents mutation of the original input data during marshaling.
One consideration: as noted in the past review comment by Pratham-Mishra04, this allocation pattern could be optimized using a custom JSON encoder with field skip logic for high-throughput scenarios. However, for correctness, this approach is sound.
core/schemas/utils.go (1)
1042-1045: LGTM! The `IsNovaModel` function follows the established pattern of the other model detection utilities (`IsAnthropicModel`, `IsMistralModel`, `IsGeminiModel`) in this file. The implementation is consistent and straightforward.
core/schemas/chatcompletions.go (3)
246-249: LGTM - CacheControl field added to ChatTool. The addition of `CacheControl` to `ChatTool` enables cache control metadata on tool definitions, which aligns with the PR's goal of supporting prompt caching across providers.
599-602: LGTM - CacheControl field added to ChatContentBlock. The field addition with the explanatory comment about provider-specific usage (Anthropic, Bedrock) provides good context for maintainers.
604-613: No duplicate type definitions found. `CacheControl` and `CacheControlType` are defined only once in `chatcompletions.go` (lines 604 and 610 respectively). Likely an incorrect or invalid review comment.
core/internal/testutil/account.go (4)
43-43: LGTM - PromptCaching field added to TestScenarios.The new
PromptCachingboolean flag enables gating prompt caching tests per provider, which is a clean approach for provider-specific feature testing.
609-609: LGTM - PromptCaching enabled for Anthropic.Correctly enables prompt caching tests for the Anthropic provider.
643-643: LGTM - PromptCaching enabled for Bedrock.Correctly enables prompt caching tests for the Bedrock provider.
553-576: Verify: OpenAI has PromptCachingModel but PromptCaching is not enabled in Scenarios. The OpenAI configuration defines `PromptCachingModel: "gpt-4.1"` (line 553) but doesn't set `PromptCaching: true` in the `Scenarios` struct. This differs from Anthropic and Bedrock, which both have `PromptCaching: true` set.
Is this intentional because OpenAI uses a different caching mechanism, or should `PromptCaching: true` be added to the OpenAI scenarios as well?
core/providers/bedrock/bedrock_test.go (2)
73-77: LGTM - Test configuration expanded with new model and parameter fields. The additions properly configure the Bedrock test suite with:
- `EmbeddingModel` for embedding tests
- `ReasoningModel` and `PromptCachingModel` for respective feature tests
- `BatchExtraParams` and `FileExtraParams` for S3-based operations
The configuration follows established patterns in the test framework.
95-95: LGTM - PromptCaching enabled for Bedrock tests.This enables the prompt caching integration tests for Bedrock, which aligns with the new CacheControl support added in this PR.
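Roughly how the gating fits together, sketched with only the fields this review mentions; the real `ComprehensiveTestConfig`/`TestScenarios` structs live in core/internal/testutil and carry many more fields, and the guard shown here is an assumption about how a shared runner could protect itself against an empty model:

```go
package sketch

// Only the fields discussed in this review are shown.
type TestScenarios struct {
	PromptCaching bool
}

type ComprehensiveTestConfig struct {
	ChatModel          string
	PromptCachingModel string
	Scenarios          TestScenarios
}

// shouldRunPromptCaching: the scenario flag opts a provider in, and a
// configured model is what the test would actually exercise.
func shouldRunPromptCaching(cfg ComprehensiveTestConfig) bool {
	return cfg.Scenarios.PromptCaching && cfg.PromptCachingModel != ""
}
```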
core/internal/testutil/prompt_caching.go (3)
265-268: LGTM: Cache control added to last tool for testing prompt caching.Adding
CacheControlwithEphemeraltype to the last tool aligns with the Anthropic prompt caching pattern where cache breakpoints are placed on specific content items.
281-284: Improved gating via scenario-based configuration.Switching from provider-specific OpenAI check to
testConfig.Scenarios.PromptCachingmakes the test more flexible and properly controlled via configuration. This allows any provider supporting prompt caching to run these tests.
295-303: System message now uses ContentBlocks with ephemeral cache control. The change from `ContentStr` to `ContentBlocks` with a text block carrying `CacheControl` is consistent with the broader cache control implementation across providers. This structure allows the cache point to be signaled correctly for Anthropic and Bedrock.
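A sketch of the message shape the test now builds, using simplified stand-ins for the Bifrost chat types:

```go
package sketch

type CacheControl struct{ Type string }

type ChatContentBlock struct {
	Type         string
	Text         *string
	CacheControl *CacheControl
}

type ChatMessage struct {
	Role          string
	ContentBlocks []ChatContentBlock
}

// systemMessageWithCachePoint: the system prompt is a content block carrying an
// ephemeral cache hint, which Anthropic keeps as cache_control and Bedrock
// re-emits as a trailing cachePoint block.
func systemMessageWithCachePoint(prompt string) ChatMessage {
	return ChatMessage{
		Role: "system",
		ContentBlocks: []ChatContentBlock{{
			Type:         "text",
			Text:         &prompt,
			CacheControl: &CacheControl{Type: "ephemeral"},
		}},
	}
}
```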
core/providers/bedrock/chat.go (4)
63-71: Clean per-block text content handling. The iteration over `response.Output.Message.Content` with per-block processing for text content is well structured. Each text block is properly converted to a `ChatContentBlock` with the correct type.
73-87: Structured output tool handling correctly integrated into the block loop. The structured output check now properly combines the context key check with the tool name match in a single conditional (see the sketch below); this addresses the previous review concern where the structured-output path could drop normal content when the tool wasn't used.
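A hedged sketch of that check; the context key name is a stand-in for `schemas.BifrostContextKeyStructuredOutputToolName`, not the real constant:

```go
package sketch

import "context"

type contextKey string

// Stand-in for the Bifrost structured-output context key.
const keyStructuredOutputToolName contextKey = "structured-output-tool-name"

// isStructuredOutputTool: the tool call is treated as structured output only
// when the context carries the designated tool name and it matches the tool
// that was actually invoked.
func isStructuredOutputTool(ctx context.Context, toolName string) bool {
	want, ok := ctx.Value(keyStructuredOutputToolName).(string)
	return ok && want == toolName
}
```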
89-113: Regular tool call processing is thorough. The tool call extraction properly handles:
- JSON marshaling with fallback to string representation
- Empty object default when input is nil
- Correct index assignment using current toolCalls length
- Proper ID and function name extraction
115-127: Reasoning content handling with signature support.The reasoning content extraction properly:
- Guards against nil
ReasoningText- Extracts both text and signature
- Accumulates reasoning text with newline separators
- Uses correct
BifrostReasoningDetailsTypeTexttypecore/providers/bedrock/types.go (3)
159-159: CachePoint field added to BedrockSystemMessage.This enables cache point signaling for system messages in Bedrock requests.
187-201: New BedrockCachePoint type and ContentBlock field.The
BedrockCachePointstruct withBedrockCachePointTypeDefault = "default"correctly matches the AWS Bedrock API specification. Per the AWS docs, only the "default" type is accepted for cache point blocks.
285-286: CachePoint field added to BedrockTool.This enables cache point signaling on tool definitions, which is useful for caching tool configurations across multiple requests.
core/providers/bedrock/utils.go (5)
30-32: convertToolConfig now accepts model parameter.This enables model-specific logic (e.g., excluding Nova models from cache point support).
237-264: convertSystemMessages correctly emits CachePoint as separate BedrockSystemMessage.Bedrock's API requires cache points to be in separate message blocks rather than inline. The function correctly:
- Appends a text system message for each text block
- Appends an additional
BedrockSystemMessagewith onlyCachePointwhenCacheControlis presentThis aligns with AWS Bedrock's cache point API structure.
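A minimal sketch of the expansion described in the comment above, under the assumption that cache hints only ever follow text; the types are simplified stand-ins for the Bedrock request structs:

```go
package sketch

type CacheControl struct{ Type string }

type ChatContentBlock struct {
	Text         *string
	CacheControl *CacheControl
}

type BedrockCachePoint struct{ Type string }

type BedrockSystemMessage struct {
	Text       *string
	CachePoint *BedrockCachePoint
}

// convertSystemBlocks: each text block becomes its own system entry, and a
// cached block is followed by a separate entry that carries only the
// cachePoint marker, matching Bedrock's wire format.
func convertSystemBlocks(blocks []ChatContentBlock) []BedrockSystemMessage {
	var out []BedrockSystemMessage
	for _, b := range blocks {
		if b.Text == nil {
			continue
		}
		out = append(out, BedrockSystemMessage{Text: b.Text})
		if b.CacheControl != nil {
			out = append(out, BedrockSystemMessage{CachePoint: &BedrockCachePoint{Type: "default"}})
		}
	}
	return out
}
```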
330-337: Tool message content correctly adds CachePoint as separate blocks.Both text and image blocks within tool results properly emit CachePoint as a separate
BedrockContentBlockwhen the source block hasCacheControlset.Also applies to: 348-355
398-446: convertContentBlock returns slice and handles CachePoint correctly.The function signature change to return
[]BedrockContentBlockallows it to emit both the content block and an optional CachePoint block. This pattern is applied consistently for text and image content types.
661-668: CachePoint added to tools with Nova model exclusion.The logic correctly:
- Checks if
CacheControlis set on the tool- Excludes Nova models using
schemas.IsNovaModel(model)- Appends a separate
BedrockToolentry containing onlyCachePointThis matches Bedrock's requirement for cache points to be separate entries in the tools array.
tests/integrations/tests/test_anthropic.py (3)
1421-1484: Prompt caching system message test is well-structured.The test correctly:
- Uses cross-provider parametrization for flexibility
- Creates a system message with
cache_control: {"type": "ephemeral"}on the large context- Makes two requests and validates cache creation then cache read
- Uses relative comparison (
abs(cache_write_tokens - cache_read_tokens) < 100) rather than strict equalityThis addresses the previous review concern about overly strict cache counter assertions.
1485-1559: Prompt caching messages test properly validates cache behavior.The test correctly places
cache_controlon the large context block within user message content and validates that the second request reads from cache.
1663-1689: Cache validation helpers are appropriately relaxed.The
validate_cache_writeandvalidate_cache_readfunctions:
- Assert the required attributes exist
- Assert token counts are > 0
- Log the actual values for debugging
- Don't enforce strict counter==0 invariants that could cause flakiness
This addresses the previous review feedback about overly strict assertions.
core/providers/anthropic/chat.go (3)
65-67: LGTM: CacheControl propagation for tools.Correctly propagates
CacheControlfrom Bifrost tool definitions to Anthropic tool representations.
150-153: LGTM: CacheControl attached to text content blocks.Consistently applies
CacheControlto text blocks across system messages, tool results, and regular message content blocks.Also applies to: 185-188, 242-245
320-365: LGTM: Refactored response content block handling.The switch-based content block processing is cleaner and correctly handles text, tool use, and thinking blocks. The tool use JSON marshaling with fallback to empty object
"{}"is a good defensive approach.tests/integrations/tests/test_bedrock.py (3)
1623-1676: LGTM: Well-structured system message caching test.Test correctly validates the prompt caching workflow:
- First request creates cache with system message checkpoint
- Second request reads from cache
- Validates cache write/read tokens are approximately equal
1677-1731: LGTM: Messages caching test properly validates cache behavior.Test correctly places
cachePointafter the large context content block within the user message, then verifies cache read on the subsequent request.
1732-1797: LGTM: Tools caching test with 12 tools.Test correctly appends
cachePointafter all tool definitions and validates cache behavior across two requests with different prompts but same tool config.core/providers/bedrock/responses.go (6)
1283-1289: LGTM: CachePoint handling for tools with Nova exclusion.Correctly applies ephemeral cache control to the last tool in the array when a
CachePointis encountered, with proper exclusion for Nova models that don't support cache points in tools.
1571-1578: LGTM: Bifrost CacheControl to Bedrock CachePoint conversion.Correctly converts Bifrost's
CacheControlon tools to Bedrock'sCachePointformat, with Nova model exclusion.
2450-2484: LGTM: System message to Bifrost conversion with CachePoint handling.The updated
convertBedrockSystemMessageToBifrostMessagescorrectly:
- Iterates over multiple system messages
- Applies ephemeral
CacheControlto the last content block whenCachePointis encountered- Creates proper Bifrost message structure for text content
2413-2419: LGTM: CacheControl to CachePoint conversion for system messages.Correctly emits a
BedrockCachePointafter a text block when the Bifrost content block hasCacheControlset.
2656-2665: LGTM: CachePoint to CacheControl conversion in message content blocks.Correctly applies ephemeral
CacheControlto the last content block of the last message when aCachePointis encountered during Bedrock-to-Bifrost conversion.
2774-2781: LGTM: Bifrost CacheControl to Bedrock CachePoint for content blocks.Correctly appends a
BedrockCachePointblock after content blocks that haveCacheControlset.core/providers/anthropic/responses.go (3)
2065-2097: LGTM! Past review concern addressed.The implementation correctly handles system content conversion to Bifrost messages:
The check at line 2086 (
if len(contentBlocks) > 0) prevents emptyContentBlocksarrays from being appended, which directly addresses the concern raised in the past review comment.The filtering to text-only blocks (line 2078) is correct per Anthropic's API specification, where system messages can only contain text content.
CacheControlis properly preserved from source blocks (line 2082), maintaining cache directives throughout the conversion pipeline.
2082-2082: LGTM! Consistent CacheControl propagation.The
CacheControlfield is correctly propagated across all content block and tool conversion paths:
- System content: Line 2082 preserves cache directives for system text blocks
- Regular messages: Lines 2178, 2379, 2394 handle input and output text blocks
- Tool results: Lines 2265, 2497, 2562 maintain cache control in function/MCP tool outputs
- Conversion helpers: Lines 3332, 3343, 3382 ensure consistency in lower-level conversions
- Tool definitions: Lines 3083, 3263 propagate cache control for tool schemas
The pattern is uniform and correctly uses pointer assignment for pass-through conversion semantics.
Also applies to: 2178-2178, 2265-2265, 2379-2379, 2394-2394, 2497-2497, 2562-2562, 3083-3083, 3263-3263, 3332-3332, 3343-3343, 3382-3382
1344-1344: LGTM! System content parameter correctly passed.The addition of
request.Systemto theConvertAnthropicMessagesToBifrostMessagescall correctly centralizes system content handling. The receiver function (line 1632) appropriately checks for nil before invoking the conversion helper, ensuring safe handling of optional system content.
5cfbabb to
8005eb3
Compare
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/anthropic/chat.go (1)
479-492: Don’t drop reasoning details with nil Text but non‑empty Signature inToAnthropicChatResponseThe current guard:
if reasoningDetail.Type == schemas.BifrostReasoningDetailsTypeText && reasoningDetail.Text != nil && ((reasoningDetail.Text != nil && *reasoningDetail.Text != "") || (reasoningDetail.Signature != nil && *reasoningDetail.Signature != "")) { content = append(content, AnthropicContentBlock{ Type: AnthropicContentBlockTypeThinking, Thinking: reasoningDetail.Text, Signature: reasoningDetail.Signature, }) }means any
ReasoningDetailsentry withText == nilbut a non‑emptySignatureis silently dropped. That breaks the “preserve all fields, including when Text is nil/empty” requirement for reasoning round‑trips and loses Anthropic thinking‑signature information when mapping Bifrost → Anthropic. Based on learnings, this should be avoided.You can fix this by only requiring that at least one of
TextorSignatureis present, and by not special‑casing empty text:- if choice.ChatNonStreamResponseChoice != nil && choice.Message != nil && choice.Message.ChatAssistantMessage != nil && choice.Message.ChatAssistantMessage.ReasoningDetails != nil { - for _, reasoningDetail := range choice.Message.ChatAssistantMessage.ReasoningDetails { - if reasoningDetail.Type == schemas.BifrostReasoningDetailsTypeText && reasoningDetail.Text != nil && - ((reasoningDetail.Text != nil && *reasoningDetail.Text != "") || - (reasoningDetail.Signature != nil && *reasoningDetail.Signature != "")) { - content = append(content, AnthropicContentBlock{ - Type: AnthropicContentBlockTypeThinking, - Thinking: reasoningDetail.Text, - Signature: reasoningDetail.Signature, - }) - } - } - } + if choice.ChatNonStreamResponseChoice != nil && + choice.Message != nil && + choice.Message.ChatAssistantMessage != nil && + choice.Message.ChatAssistantMessage.ReasoningDetails != nil { + for _, reasoningDetail := range choice.Message.ChatAssistantMessage.ReasoningDetails { + if reasoningDetail.Type != schemas.BifrostReasoningDetailsTypeText { + continue + } + if reasoningDetail.Text == nil && reasoningDetail.Signature == nil { + continue + } + content = append(content, AnthropicContentBlock{ + Type: AnthropicContentBlockTypeThinking, + Thinking: reasoningDetail.Text, + Signature: reasoningDetail.Signature, + }) + } + }This keeps all reasoning entries (including signature‑only and empty‑text ones) flowing through, preserving bidirectional mapping.
♻️ Duplicate comments (2)
core/providers/bedrock/responses.go (2)
2381-2385: System‑message CachePoint handling is symmetric; document ordering assumptionDelegating system conversion through
convertBedrockSystemMessageToBifrostMessagesand emitting a syntheticBedrockSystemMessage{CachePoint: ...}fromconvertBifrostMessageToBedrockSystemMessagesgives a clean round‑trip:{Text},{CachePoint}in Bedrock becomes a single system message whose last content block hasCacheControl, and vice‑versa.
convertBedrockSystemMessageToBifrostMessagessilently ignores aCachePointwhenbifrostMessagesis still empty or the last message has no content blocks. That matches prior design discussions that Bedrock (or upstream construction) never emits aCachePointbefore at least one system text entry, but it’s an implicit contract.Consider adding a short comment near the
if sysMsg.CachePoint != nilbranch noting that Bedrock/system construction guaranteesCachePointonly appears after system text, so a missing target means the hint is intentionally ignored.Also applies to: 2413-2419, 2450-2484
2565-2665: Per‑content‑block CacheControl/CachePoint mapping is reversible and consistentThe new
block.CachePoint != nilbranch inconvertSingleBedrockMessageToBifrostMessagesand theif block.CacheControl != nillogic inconvertBifrostResponsesMessageContentBlocksToBedrockContentBlockstogether ensure:
- A Bedrock content block followed by a
CachePointbecomes a Bifrost content block whose last entry hasCacheControl: {Type: ephemeral}.- A Bifrost content block with
CacheControlis re‑encoded as the original content block plus a trailingCachePointblock.This gives a lossless Bedrock↔Bifrost translation for per‑block cache hints. As with system messages, the code assumes Bedrock doesn’t emit a
CachePointbefore any content block; if that API guarantee ever changes, we’d need a fallback, but under the current contract this is fine.Also applies to: 2737-2781
🧹 Nitpick comments (7)
core/internal/testutil/test_retry_framework.go (1)
832-834: Consider including max attempts for consistency with other retry logs.While the current logging is functional, it differs from the pattern used throughout the rest of the file where retry logs display both current and max attempts (e.g., lines 608, 648, 2155, 2157). Including
config.MaxAttemptswould improve consistency and provide better context during test debugging.Apply this diff to align with the file's logging pattern:
OnRetry: func(attempt int, reason string, t *testing.T) { // reason already contains ❌ prefix from retry logic // attempt represents the current failed attempt number // Log with attempt+1 to show the next attempt that will run - t.Logf("🔄 Retrying streaming test (attempt %d): %s", attempt+1, reason) + t.Logf("🔄 Retrying streaming test (attempt %d/%d): %s", attempt+1, config.MaxAttempts, reason) },core/internal/testutil/account.go (1)
593-621: Ensure Anthropic/Bedrock PromptCachingModel is configured where these scenarios are used.You’ve enabled
Scenarios.PromptCachingfor Anthropic and Bedrock here, but the correspondingComprehensiveTestConfigentries inAllProviderConfigsdon’t setPromptCachingModel. The provider‑specific tests construct their own configs withPromptCachingModel, but if any shared runner usesAllProviderConfigs+Scenarios.PromptCaching, it will end up issuing prompt‑caching runs with an empty model string.Consider either:
- Setting
PromptCachingModelon these twoAllProviderConfigsentries (to the same models used in the dedicated tests), or- Making the prompt‑caching runner explicitly no‑op when
PromptCachingModel == "", regardless of the scenario flag.Also applies to: 627-660
core/internal/testutil/prompt_caching.go (1)
276-284: Scenario gate + ContentBlocks-based system message are appropriate; consider updating the doc comment.The additional
Scenarios.PromptCachingguard and the move to aContentBlockssystem message with an ephemeralCacheControlblock both match the new caching model and avoid running this test where unsupported. The top comment still says “OpenAI's prompt caching”, though this function is now clearly provider‑agnostic; renaming that to “prompt caching” generally would avoid confusion.Also applies to: 291-304
core/providers/bedrock/utils.go (1)
30-33: CacheControl → Bedrock CachePoint mapping is coherent across system/messages/toolsThe new wiring achieves a consistent story:
convertSystemMessagesexplodes a systemChatMessageinto multipleBedrockSystemMessageentries, and when a text block hasCacheControlit appends a separate{cachePoint: {type: "default"}}system message, matching Bedrock’s cachePoint convention.convertContent+convertContentBlocknow return slices so you can emit a normal text/image block followed by a CachePoint block whenblock.CacheControl != nil; multi-block append inconvertMessagecorrectly flattens these.convertToolMessagemirrors that behavior for tool results, inserting cachePoint content blocks after cached text/image tool outputs.convertToolConfig(model, params)keepingToolSpecand CachePoint as separate entries inToolsand gating cache points with!schemas.IsNovaModel(model)matches the Bedrock JSON shape and avoids sending unsupported cache hints to Nova.The only trade‑off is that TTL from
CacheControlis intentionally dropped because Bedrock’scachePointdoesn’t accept it, which is reasonable for now.Overall this is a clean, provider‑correct implementation of prompt‑caching hooks.
Also applies to: 199-235, 237-264, 295-359, 374-395, 397-447, 616-685
tests/integrations/tests/test_anthropic.py (1)
1421-1608: Prompt‑caching tests and Anthropic usage helpers look consistent with cross‑provider designThe three new tests:
- Correctly exercise system/message/tool‑level cache control using
cache_control: {"type": "ephemeral"}and separate user questions to force multiple calls.- Validate behavior via
usage.cache_creation_input_tokensandusage.cache_read_input_tokensin a way that matches Anthropic’s schema and mirrors the Bedrock tests (first call writes >0 tokens, second call reads >0 tokens, and read/write counts are roughly similar).The
validate_cache_write/validate_cache_readhelpers are straightforward and avoid the earlier over‑strict “no reads on first call / no writes on second call” checks, which should reduce flakiness while still catching regressions in caching.Only optional nit: these helpers are nearly identical to the Bedrock versions—if you want to keep behavior in lock‑step, you could factor them into a shared test utility, but it’s not required.
Also applies to: 1663-1689
tests/integrations/tests/test_bedrock.py (1)
1799-1821: Cache write/read helpers match Bedrock usage schema; consider sharing with Anthropic tests
validate_cache_write/validate_cache_readoperate on theusagedict’scacheWriteInputTokensandcacheReadInputTokensfields and only assert that the respective values are > 0, which is exactly what the Bedrock prompt‑caching tests need.They also mirror the Anthropic helpers almost 1:1; if you want to avoid divergence in future, you could move these into a shared test utility module and parametrize on field names, but it’s optional.
core/providers/anthropic/responses.go (1)
2159-2184: CacheControl propagation across Anthropic ↔ Bifrost is thorough; consider one small consistency tweakThe new wiring generally looks solid:
- Text blocks in both grouped and ungrouped converters (
convertAnthropicContentBlocksToResponsesMessagesGrouped/convertAnthropicContentBlocksToResponsesMessages) now copyblock.CacheControlintoResponsesMessageContentBlock.- Tool‑result content (
AnthropicContentBlockTypeToolResultandAnthropicContentBlockTypeMCPToolResult) propagatesCacheControlon nested text blocks into the correspondingResponsesMessageContentBlocks.convertContentBlockToAnthropicmirrors this by copyingblock.CacheControlback intoAnthropicContentBlockfor text and image blocks, andtoBifrostResponsesImageBlockdoes the same for image blocks.- Tool structs now round‑trip tool‑level cache control:
convertAnthropicToolToBifrostandconvertBifrostToolToAnthropicboth mapCacheControlbetweenAnthropicToolandResponsesTool.One minor gap: in
convertAnthropicContentBlocksToResponsesMessagesGrouped, whenisOutputMessageis true, theaccumulatedTextContentslice is built without copyingblock.CacheControl, so any cache hints on those output text blocks would be dropped in the grouped path. It’s currently only used for input‑side conversions in this file, but making it consistent is trivial:- if isOutputMessage { - // For output messages, accumulate text blocks (don't emit immediately) - accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{ - Type: schemas.ResponsesOutputMessageContentTypeText, - Text: block.Text, - }) + if isOutputMessage { + // For output messages, accumulate text blocks (don't emit immediately) + accumulatedTextContent = append(accumulatedTextContent, schemas.ResponsesMessageContentBlock{ + Type: schemas.ResponsesOutputMessageContentTypeText, + Text: block.Text, + CacheControl: block.CacheControl, + })That would align the grouped/output path with the ungrouped converter and avoid any future surprises if
keepToolsGroupedever gets used for outputs.Also applies to: 2239-2276, 2365-2400, 2463-2466, 2495-2498, 2560-2563, 3330-3333, 3343-3344, 3377-3383, 3082-3084, 3262-3264
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
core/internal/testutil/account.go(3 hunks)core/internal/testutil/prompt_caching.go(3 hunks)core/internal/testutil/test_retry_framework.go(5 hunks)core/providers/anthropic/anthropic_test.go(2 hunks)core/providers/anthropic/chat.go(5 hunks)core/providers/anthropic/responses.go(13 hunks)core/providers/anthropic/types.go(2 hunks)core/providers/anthropic/utils.go(1 hunks)core/providers/bedrock/bedrock_test.go(2 hunks)core/providers/bedrock/chat.go(1 hunks)core/providers/bedrock/responses.go(8 hunks)core/providers/bedrock/types.go(6 hunks)core/providers/bedrock/utils.go(8 hunks)core/providers/openai/responses_marshal_test.go(0 hunks)core/providers/openai/types.go(1 hunks)core/schemas/chatcompletions.go(2 hunks)core/schemas/responses.go(2 hunks)core/schemas/utils.go(1 hunks)tests/integrations/config.yml(7 hunks)tests/integrations/tests/test_anthropic.py(5 hunks)tests/integrations/tests/test_bedrock.py(3 hunks)tests/integrations/tests/utils/common.py(2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/anthropic/utils.go
- core/schemas/utils.go
- core/schemas/responses.go
- core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/internal/testutil/account.gocore/providers/bedrock/bedrock_test.gocore/internal/testutil/prompt_caching.gocore/internal/testutil/test_retry_framework.gocore/providers/bedrock/chat.gocore/schemas/chatcompletions.gotests/integrations/config.ymlcore/providers/anthropic/anthropic_test.gocore/providers/bedrock/types.gocore/providers/anthropic/responses.gocore/providers/openai/types.gocore/providers/bedrock/utils.gocore/providers/bedrock/responses.gotests/integrations/tests/utils/common.pytests/integrations/tests/test_bedrock.pytests/integrations/tests/test_anthropic.pycore/providers/anthropic/chat.go
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/internal/testutil/account.gocore/providers/bedrock/bedrock_test.gocore/internal/testutil/prompt_caching.gocore/internal/testutil/test_retry_framework.gocore/providers/bedrock/chat.gocore/schemas/chatcompletions.gocore/providers/anthropic/anthropic_test.gocore/providers/bedrock/types.gocore/providers/anthropic/responses.gocore/providers/openai/types.gocore/providers/bedrock/utils.gocore/providers/bedrock/responses.gocore/providers/anthropic/chat.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/bedrock/bedrock_test.gocore/providers/anthropic/anthropic_test.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
🧬 Code graph analysis (7)
core/internal/testutil/prompt_caching.go (2)
- core/schemas/chatcompletions.go (4): CacheControl (610-613), CacheControlTypeEphemeral (607-607), ChatContentBlock (592-602), ChatContentBlockTypeText (584-584)
- core/schemas/utils.go (1): Ptr (16-18)
core/providers/bedrock/chat.go (3)
- core/schemas/chatcompletions.go (4): ChatContentBlock (592-602), ChatContentBlockTypeText (584-584), ChatAssistantMessageToolCall (702-708), ChatAssistantMessageToolCallFunction (711-714)
- core/schemas/bifrost.go (1): BifrostContextKeyStructuredOutputToolName (135-135)
- core/schemas/utils.go (1): Ptr (16-18)
core/providers/openai/types.go (2)
- core/schemas/responses.go (3): ResponsesMessage (318-331), ResponsesMessageContentBlock (403-418), ResponsesToolMessage (469-489)
- core/schemas/chatcompletions.go (1): CacheControl (610-613)
core/providers/bedrock/utils.go (3)
- core/schemas/chatcompletions.go (3): ChatContentBlockTypeText (584-584), CacheControl (610-613), ChatContentBlock (592-602)
- core/providers/bedrock/types.go (6): BedrockSystemMessage (156-160), BedrockCachePoint (199-201), BedrockCachePointTypeDefault (195-195), BedrockContentBlock (163-190), BedrockToolConfig (278-281), BedrockTool (284-287)
- core/schemas/utils.go (1): IsNovaModel (1043-1045)
core/providers/bedrock/responses.go (4)
- core/schemas/utils.go (1): IsNovaModel (1043-1045)
- core/schemas/chatcompletions.go (2): CacheControl (610-613), CacheControlTypeEphemeral (607-607)
- core/providers/bedrock/types.go (5): BedrockTool (284-287), BedrockCachePoint (199-201), BedrockCachePointTypeDefault (195-195), BedrockSystemMessage (156-160), BedrockContentBlock (163-190)
- core/schemas/responses.go (6): ResponsesMessage (318-331), ResponsesInputMessageRoleSystem (338-338), ResponsesMessageTypeMessage (294-294), ResponsesMessageContent (343-348), ResponsesMessageContentBlock (403-418), ResponsesInputMessageContentBlockTypeText (392-392)
tests/integrations/tests/test_bedrock.py (2)
- tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
- tests/integrations/tests/test_anthropic.py (2): validate_cache_write (1663-1675), validate_cache_read (1677-1689)
tests/integrations/tests/test_anthropic.py (3)
- tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
- tests/integrations/tests/utils/common.py (1): assert_valid_chat_response (801-833)
- tests/integrations/tests/test_bedrock.py (2): validate_cache_write (1799-1809), validate_cache_read (1811-1821)
🔇 Additional comments (19)
core/internal/testutil/test_retry_framework.go (3)
2153-2158: Good addition of explicit attempt start logging. The new logging clearly distinguishes between the first test attempt and subsequent retries, providing valuable context for debugging streaming validation tests. The inclusion of config.MaxAttempts is consistent with the file's logging patterns.
2205-2254: Excellent improvements to retry flow observability and documentation. The enhancements provide clear visibility into the retry decision process:
- Waiting time logs (lines 2218, 2252) help track retry delays
- Comments clarifying attempt semantics (lines 2211, 2246) maintain consistency across retry functions
- The critical comment on line 2254 documents important control flow behavior (continue vs return)
- Comprehensive logging for the nil channel case improves debugging
These changes align well with the PR's goal of improving the retry framework with better logging.
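For readers outside the repo, a minimal sketch of the pattern being praised here — logging the attempt number (1-based, out of MaxAttempts) before each try, and the wait time before the next attempt. The config shape and names below are illustrative stand-ins, not the actual testutil API:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// RetryConfig is a simplified stand-in for the framework's streaming retry settings.
type RetryConfig struct {
	MaxAttempts int
	BaseDelay   time.Duration
	OnRetry     func(failedAttempt int, err error) // receives the attempt that just failed
}

// withRetry runs fn up to MaxAttempts times, logging attempt starts and waits.
func withRetry(cfg RetryConfig, fn func() error) error {
	var lastErr error
	for attempt := 1; attempt <= cfg.MaxAttempts; attempt++ {
		fmt.Printf("starting attempt %d/%d\n", attempt, cfg.MaxAttempts)
		if lastErr = fn(); lastErr == nil {
			return nil
		}
		if cfg.OnRetry != nil {
			cfg.OnRetry(attempt, lastErr) // current failed attempt, not the next one
		}
		if attempt < cfg.MaxAttempts {
			wait := cfg.BaseDelay * time.Duration(attempt) // linear backoff, for the sketch only
			fmt.Printf("attempt %d failed (%v); waiting %s before attempt %d\n",
				attempt, lastErr, wait, attempt+1)
			time.Sleep(wait)
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", cfg.MaxAttempts, lastErr)
}

func main() {
	calls := 0
	err := withRetry(RetryConfig{MaxAttempts: 3, BaseDelay: 10 * time.Millisecond}, func() error {
		calls++
		if calls < 3 {
			return errors.New("transient failure")
		}
		return nil
	})
	fmt.Println("result:", err)
}
```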
2317-2318: Good documentation of attempt parameter semantics. The clarifying comment helps maintain consistency across retry functions, ensuring all OnRetry callbacks receive the current failed attempt number. This addresses the attempt semantics consistency that was previously flagged in reviews.
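The related nil-channel behaviour discussed later in this review amounts to treating a nil stream channel as one more retryable failure rather than a terminal condition. A rough sketch of that control flow, with invented names (openStream is hypothetical, not the framework's function):

```go
package main

import (
	"fmt"
	"time"
)

// openStream stands in for starting a streaming request; it may legitimately
// return a nil channel on transient setup failures.
func openStream(attempt int) <-chan string {
	if attempt < 2 {
		return nil
	}
	ch := make(chan string, 1)
	ch <- "chunk"
	close(ch)
	return ch
}

func main() {
	const maxAttempts = 3
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		ch := openStream(attempt)
		if ch == nil {
			// Treat the nil channel as a failed attempt and keep retrying,
			// instead of returning early from the validation loop.
			fmt.Printf("attempt %d/%d: nil response channel, retrying\n", attempt, maxAttempts)
			time.Sleep(10 * time.Millisecond)
			continue
		}
		for chunk := range ch {
			fmt.Println("received:", chunk)
		}
		return
	}
	fmt.Println("stream never opened")
}
```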
core/internal/testutil/account.go (1)
21-56: PromptCaching scenario flag wiring looks consistent. Adding PromptCaching to TestScenarios cleanly aligns with the new prompt‑caching gate in RunPromptCachingTest and keeps scenarios explicit and self‑documenting.
core/providers/anthropic/anthropic_test.go (1)
25-35: Anthropic test config extensions for vision/reasoning/prompt caching look correct. Wiring VisionModel, ReasoningModel, and PromptCachingModel to the newer Claude SKUs and enabling Scenarios.PromptCaching is consistent with the rest of the stack and keeps the comprehensive test harness aligned with the new capabilities.
Also applies to: 35-52
core/providers/bedrock/bedrock_test.go (1)
65-78: Bedrock comprehensive test wiring for embeddings/reasoning/prompt caching is coherent. The added EmbeddingModel, ReasoningModel, PromptCachingModel, and the reuse of batchExtraParams/fileExtraParams plug neatly into the existing test harness, and enabling PromptCaching in Scenarios matches the new prompt‑caching coverage.
Also applies to: 78-107
core/internal/testutil/prompt_caching.go (1)
245-268: Ephemeral CacheControl on the check_compliance_status tool is aligned with the new schema. Attaching CacheControl{Type: CacheControlTypeEphemeral} to this tool gives you a clean way to exercise tool‑level cache control in prompt‑caching tests without impacting other tools.
core/schemas/chatcompletions.go (1)
244-250: CacheControl schema extensions are consistent and minimally invasive. Adding CacheControl to ChatTool and ChatContentBlock plus the shared CacheControlType/CacheControl definitions cleanly models provider‑specific cache hints while preserving existing JSON behavior (all new fields are optional and opt‑in).
Also applies to: 591-613
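As a rough illustration of the shape described here — the structs below are simplified stand-ins modelled on the review's description of core/schemas/chatcompletions.go, not the real package:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified mirrors of the new optional schema fields.
type CacheControl struct {
	Type string  `json:"type"`          // e.g. "ephemeral"
	TTL  *string `json:"ttl,omitempty"` // optional provider-specific TTL
}

type ChatContentBlock struct {
	Type         string        `json:"type"`
	Text         *string       `json:"text,omitempty"`
	CacheControl *CacheControl `json:"cache_control,omitempty"` // non-OpenAI extension, opt-in
}

type ChatTool struct {
	Name         string        `json:"name"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

func main() {
	text := "You are a compliance assistant. <large shared context here>"
	block := ChatContentBlock{
		Type:         "text",
		Text:         &text,
		CacheControl: &CacheControl{Type: "ephemeral"},
	}
	tool := ChatTool{Name: "check_compliance_status", CacheControl: &CacheControl{Type: "ephemeral"}}

	out, _ := json.MarshalIndent(map[string]any{"block": block, "tool": tool}, "", "  ")
	fmt.Println(string(out))
}
```

Because every new field is a pointer with omitempty, requests that never set cache control serialize exactly as before.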
tests/integrations/config.yml (1)
109-115: Integration config now correctly reflects prompt_caching support across providers. The new prompt_caching entries under provider_scenarios and scenario_capabilities, plus the updated Bedrock chat model, align the Python integration layer with the Go‑side prompt‑caching scenarios. This keeps Anthropic/Bedrock enabled and others explicitly off, which matches the current implementation.
Also applies to: 167-167, 188-205, 226-244, 264-281, 302-320, 342-363
tests/integrations/tests/utils/common.py (1)
88-281: Shared prompt-caching test data is well-structured and reusable. PROMPT_CACHING_TOOLS mirrors the existing tool schema and gives good variety for cache‑heavy tool usage, and PROMPT_CACHING_LARGE_CONTEXT provides an appropriately long, coherent context block for exercising caching behavior across providers.
Also applies to: 421-481
core/providers/openai/types.go (1)
172-254: Copy-on-write CacheControl stripping in MarshalJSON avoids mutation while minimizing overhead. The updated OpenAIResponsesRequestInput.MarshalJSON correctly:
- detects whether any message carries CacheControl,
- marshals the original slice unchanged when none do, and
- for messages that do, constructs per-message/content/tool-output copies with CacheControl nulled, leaving the original data structures intact.
The hasCacheControl helper keeps the logic readable, and this implementation resolves the prior in-place mutation risk without imposing extra allocations on the common no‑CacheControl path.
Also applies to: 258-277
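The general copy-on-write idea, shown on a toy type (the real OpenAI request types and the hasCacheControl helper live in the provider package; the names and shapes below are illustrative only):

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Block struct {
	Text         string  `json:"text"`
	CacheControl *string `json:"cache_control,omitempty"` // not accepted by this provider
}

type Request struct {
	Blocks []Block `json:"blocks"`
}

// MarshalJSON strips cache_control from the wire format without mutating the receiver.
func (r Request) MarshalJSON() ([]byte, error) {
	type alias Request // same fields, no MarshalJSON method, so no recursion

	hasCC := false
	for _, b := range r.Blocks {
		if b.CacheControl != nil {
			hasCC = true
			break
		}
	}
	if !hasCC {
		return json.Marshal(alias(r)) // common path: no copies
	}

	// Copy-on-write: clone the slice and null the field on the copies only.
	clean := make([]Block, len(r.Blocks))
	copy(clean, r.Blocks)
	for i := range clean {
		clean[i].CacheControl = nil
	}
	return json.Marshal(alias(Request{Blocks: clean}))
}

func main() {
	cc := "ephemeral"
	req := Request{Blocks: []Block{{Text: "hello", CacheControl: &cc}}}
	out, _ := json.Marshal(req)
	fmt.Println(string(out))                 // {"blocks":[{"text":"hello"}]}
	fmt.Println(*req.Blocks[0].CacheControl) // original untouched: "ephemeral"
}
```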
core/providers/bedrock/chat.go (1)
62-150: Bedrock response → Bifrost mapping for text/tools/reasoning and cache usage looks solid. The refactor to iterate all content blocks and separately handle text, structured-output tool use, regular tools, and reasoning content is consistent and fixes the earlier control‑flow issue where structured‑output could swallow normal content. Usage mapping of CacheReadInputTokens → PromptTokensDetails.CachedTokens and CacheWriteInputTokens → CompletionTokensDetails.CachedTokens also matches the new Bedrock prompt‑caching tests that assert against those provider fields. I don't see functional issues here; no changes needed.
Also applies to: 167-185
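In plain terms, the mapping under review folds Bedrock's cache counters into the token-detail fields. A compressed sketch with stand-in struct names (not the repo's schemas):

```go
package main

import "fmt"

// Trimmed-down stand-ins for the provider and Bifrost usage shapes.
type bedrockUsage struct {
	InputTokens           int
	OutputTokens          int
	CacheReadInputTokens  int
	CacheWriteInputTokens int
}

type tokenDetails struct{ CachedTokens int }

type bifrostUsage struct {
	PromptTokens            int
	CompletionTokens        int
	PromptTokensDetails     tokenDetails
	CompletionTokensDetails tokenDetails
}

func toBifrostUsage(u bedrockUsage) bifrostUsage {
	return bifrostUsage{
		PromptTokens:     u.InputTokens,
		CompletionTokens: u.OutputTokens,
		// Cache reads count against the prompt side; cache writes are surfaced
		// on the completion-side details, per the mapping described above.
		PromptTokensDetails:     tokenDetails{CachedTokens: u.CacheReadInputTokens},
		CompletionTokensDetails: tokenDetails{CachedTokens: u.CacheWriteInputTokens},
	}
}

func main() {
	fmt.Printf("%+v\n", toBifrostUsage(bedrockUsage{
		InputTokens: 1200, OutputTokens: 80, CacheReadInputTokens: 1024, CacheWriteInputTokens: 0,
	}))
}
```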
core/providers/bedrock/types.go (2)
155-160: CachePoint types and wiring across system/content/tools are consistent. Adding CachePoint *BedrockCachePoint to BedrockSystemMessage, BedrockContentBlock, and BedrockTool, plus the BedrockCachePointType/BedrockCachePointTypeDefault and BedrockCachePoint struct, gives you exactly the JSON shape Bedrock expects ({"cachePoint": {"type": "default"}}). This matches how the utils now emit cache points for system/messages/tools and keeps the type surface clean. No issues from a correctness or API‑shape perspective.
Also applies to: 162-190, 192-201, 283-287
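For reference, the wire shape discussed here — a separate cachePoint entry following the block it caches — can be reproduced with minimal stand-in structs (the real Bedrock types carry many more fields):

```go
package main

import (
	"encoding/json"
	"fmt"
)

type cachePoint struct {
	Type string `json:"type"` // "default"
}

type systemMessage struct {
	Text       *string     `json:"text,omitempty"`
	CachePoint *cachePoint `json:"cachePoint,omitempty"`
}

func main() {
	prompt := "You are a helpful assistant with a very large, reusable system prompt."
	system := []systemMessage{
		{Text: &prompt},
		{CachePoint: &cachePoint{Type: "default"}}, // cache point is its own entry, after the text
	}
	out, _ := json.MarshalIndent(system, "", "  ")
	fmt.Println(string(out))
}
```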
641-646: Making ETag explicit on Bedrock file request types matches test usage. Requiring ETag string on BedrockFileRetrieveRequest, BedrockFileDeleteRequest, and BedrockFileContentRequest aligns with the S3 access patterns in tests/test_bedrock.py (all three operations pass IfMatch with the ETag). This tightens concurrency semantics without changing response types. Looks correct and consistent with the file integration tests.
Also applies to: 659-664, 673-678
core/providers/anthropic/chat.go (1)
42-72: CacheControl propagation into Anthropic tools and text blocks is correct. You're now:
- copying tool.CacheControl from ChatTool into AnthropicTool.CacheControl,
- propagating block.CacheControl from system/message text blocks into AnthropicContentBlock.CacheControl when building systemContent and regular message content, and
- preserving CacheControl on tool‑result content (text/image) as it's converted into Anthropic blocks.
This lines up with the schemas.CacheControl definition and Anthropic's cache_control request semantics while leaving response conversion unchanged (per the comment that responses never carry cache_control). No functional issues here.
Also applies to: 143-160, 176-198, 231-251
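Unlike Bedrock's separate cachePoint entry, Anthropic attaches cache_control inline on the block being cached. A minimal request-shape sketch with hand-rolled structs (not the provider package's types):

```go
package main

import (
	"encoding/json"
	"fmt"
)

type anthropicCacheControl struct {
	Type string `json:"type"` // "ephemeral"
}

type anthropicTextBlock struct {
	Type         string                 `json:"type"` // "text"
	Text         string                 `json:"text"`
	CacheControl *anthropicCacheControl `json:"cache_control,omitempty"`
}

func main() {
	system := []anthropicTextBlock{{
		Type:         "text",
		Text:         "Large shared instructions that should be cached across requests.",
		CacheControl: &anthropicCacheControl{Type: "ephemeral"},
	}}
	out, _ := json.MarshalIndent(map[string]any{"system": system}, "", "  ")
	fmt.Println(string(out))
}
```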
tests/integrations/tests/test_bedrock.py (1)
37-40: Bedrock prompt‑caching tests correctly exercise system/messages/tools cachePoint paths. The three new tests:
- use Bedrock‑native shapes (system entries and content blocks with { "cachePoint": {"type": "default"} }) that match the BedrockSystemMessage, BedrockContentBlock, and BedrockTool structures you introduced,
- assert against response["usage"]["cacheWriteInputTokens"] and response["usage"]["cacheReadInputTokens"], which lines up with BedrockTokenUsage and the mapping in ToBifrostChatResponse, and
- properly use the injected bedrock_client fixture (fixing the earlier _bedrock_client misuse) and the standard format_provider_model(provider, model) routing.
Also applies to: 51-60, 1623-1797
core/providers/bedrock/responses.go (2)
1283-1289: Tool‑level CacheControl ↔ CachePoint round‑trip looks correctThe mapping from
BedrockTool.CachePointtoResponsesTool.CacheControlinToBifrostResponsesRequest, and the reverse mapping that appends aBedrockToolwithCachePointwhenResponsesTool.CacheControlis set inToBedrockResponsesRequest, are symmetric and gated with!schemas.IsNovaModel(...)as expected given Nova’s lack of tool‑cache support. No functional issues here.Also applies to: 1572-1578
1707-1712: Cached‑token usage metrics now survive Bedrock ↔ Bifrost round‑tripPopulating
CacheReadInputTokensfromUsage.InputTokensDetails.CachedTokensandCacheWriteInputTokensfromUsage.OutputTokensDetails.CachedTokensinToBedrockConverseResponsemirrors the earlier Bedrock→Bifrost mapping, so cache read/write token counts won’t be dropped when converting responses back to Bedrock. Semantics remain consistent across providers.core/providers/anthropic/responses.go (1)
1344-1345: System‑content routing through ConvertAnthropicMessagesToBifrostMessages is sound
ToBifrostResponsesRequestnow passesrequest.SystemintoConvertAnthropicMessagesToBifrostMessages, andconvertAnthropicSystemToBifrostMessagesbuilds systemResponsesMessageentries from eitherContentStrorContentBlocks, preservingCacheControlon text blocks.ConvertAnthropicMessagesToBifrostMessagesthen prepends these system messages before regular messages.The reverse path (
convertBifrostMessageToAnthropicSystemContent+convertBifrostContentBlocksToAnthropic) understands both input/output text block types and also carriesCacheControl, so Anthropic ↔ Bifrost system content (including cache hints) round‑trips cleanly.Also applies to: 1627-1648, 2065-2097
8005eb3 to
572dc2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/schemas/utils.go (1)
672-728: Deep-copy helpers drop the new CacheControl field
deepCopyChatContentBlock and deepCopyResponsesMessageContentBlock don't copy CacheControl, so any message cloned via DeepCopyChatMessage/DeepCopyResponsesMessage will silently lose cache-control metadata. That can break prompt caching semantics on copied messages. Consider deep-copying CacheControl similarly to other pointer fields, e.g.:

```diff
 func deepCopyChatContentBlock(original ChatContentBlock) ChatContentBlock {
 	copy := ChatContentBlock{
 		Type: original.Type,
 	}
+	if original.CacheControl != nil {
+		cc := *original.CacheControl
+		if original.CacheControl.TTL != nil {
+			ttl := *original.CacheControl.TTL
+			cc.TTL = &ttl
+		}
+		copy.CacheControl = &cc
+	}
 	// existing field copies...
 }
```

and analogously inside deepCopyResponsesMessageContentBlock.
Also applies to: 883-1040
core/providers/bedrock/utils.go (1)
199-235: Guard against nil Content in convertSystemMessages
convertSystemMessages directly dereferences msg.Content (msg.Content.ContentStr / .ContentBlocks) without checking for nil. A malformed or provider-originated system message with content: null (or missing) would panic in this path. Recommend a simple guard:

```go
func convertSystemMessages(msg schemas.ChatMessage) ([]BedrockSystemMessage, error) {
	systemMsgs := []BedrockSystemMessage{}
	if msg.Content == nil {
		return systemMsgs, nil
	}
	// existing logic...
}
```

The cache-point emission per text block also looks correct and matches the "cache point must be a separate block" requirement.
Also applies to: 237-264
🧹 Nitpick comments (6)
core/schemas/utils.go (1)
1042-1045: IsNovaModel helper is fine but uses a very loose match. The helper is straightforward and works for IDs like amazon.nova-*. If you ever introduce models whose IDs incidentally contain "nova" (e.g., supernova-x from another provider), you may want a stricter predicate (e.g., prefix or provider-aware check), but that's not required now. Would you like a follow-up helper that matches known Bedrock Nova model patterns more strictly (e.g., amazon.nova-* only)?
core/providers/bedrock/utils.go (2)
323-359: Tool-result cache points are injected correctly after text/image blocks. For tool messages, adding a separate BedrockContentBlock with CachePointDefault immediately after each text/image block when block.CacheControl != nil is consistent with Bedrock's requirement that cache points be distinct blocks. The JSON shape stays backwards-compatible when CacheControl is nil. If you find similar cache-point injection logic elsewhere (e.g., system/content blocks), you might consider a small helper to avoid duplication, but it's not necessary.
375-395: convertContentBlock refactor looks good; errors are preferable for unsupported types. The refactor to have convertContentBlock return []BedrockContentBlock allows you to append cache-point blocks cleanly. Behavior per type is reasonable:
- Text: returns a text block plus optional cache-point block.
- Image: validates ImageURLStruct, converts via convertImageToBedrockSource, then appends optional cache-point block.
- InputAudio/unknown types: now explicitly error out, which is better than silently ignoring or sending malformed content for a Converse API that doesn't support them.
Only minor nit: for text blocks with block.Text == nil, you'll still emit a block with Text: nil. If that's not expected to occur upstream, this is fine; otherwise you might want to validate and return an error similarly to the image branch.
Also applies to: 398-446
core/internal/testutil/account.go (1)
593-625: Enabling PromptCaching for Anthropic and Bedrock scenarios. Marking PromptCaching: true for Anthropic and Bedrock configurations correctly scopes prompt-caching tests to providers that support the feature, and lines up with the existing PromptCachingModel field. Other providers remain unaffected. In future, you might want to add a short comment near each PromptCachingModel indicating which model family/feature set it targets, to ease updates when providers change cache support.
Also applies to: 627-664
core/providers/anthropic/utils.go (1)
94-149: Forward image-block conversion now preserves type and cache controlUpdating
ConvertToAnthropicImageBlockto:
- Use
AnthropicContentBlockTypeImageinstead of a raw string forType, and- Copy
block.CacheControlinto the Anthropic blockis a clear improvement and keeps the image path consistent with your other cache-control propagation.
One follow-up to consider: if
ToBifrostContentImageBlockparticipates in Anthropic→Bifrost conversions where cache control matters, you may want to mirrorblock.CacheControlback onto the resultingschemas.ChatContentBlockso cache metadata round-trips for images as well.core/providers/anthropic/responses.go (1)
2151-2181: CacheControl propagation is consistent; consider using input text enums for input/system blocksThe new wiring cleanly round‑trips
CacheControlfor:
- text blocks in both grouped and non‑grouped converters,
- tool‑result content blocks,
- image blocks, and
- tools (
convertAnthropicToolToBifrost⇄convertBifrostToolToAnthropic),which is exactly what the prompt‑caching work needs.
One consistency improvement: in
convertAnthropicSystemToBifrostMessagesand in the input branch ofconvertAnthropicContentBlocksToResponsesMessagesGrouped, text blocks are emitted withResponsesOutputMessageContentTypeTexteven though they represent input/system content, while the non‑grouped path usesResponsesInputMessageContentBlockTypeText. Aligning these to the input enum would make the type semantics clearer and avoid surprises if downstream logic ever differentiates input vs output block types.You could adjust along these lines:
@@ func convertAnthropicSystemToBifrostMessages(systemContent *AnthropicContent) []schemas.ResponsesMessage { - if block.Text != nil { // System messages will only have text content - contentBlocks = append(contentBlocks, schemas.ResponsesMessageContentBlock{ - Type: schemas.ResponsesOutputMessageContentTypeText, - Text: block.Text, - CacheControl: block.CacheControl, - }) + if block.Text != nil { // System messages will only have text content + contentBlocks = append(contentBlocks, schemas.ResponsesMessageContentBlock{ + Type: schemas.ResponsesInputMessageContentBlockTypeText, + Text: block.Text, + CacheControl: block.CacheControl, + }) } @@ func convertAnthropicContentBlocksToResponsesMessagesGrouped(...) []schemas.ResponsesMessage { - // For input messages, emit text immediately as separate message + // For input messages, emit text immediately as separate message bifrostMsg := schemas.ResponsesMessage{ Type: schemas.Ptr(schemas.ResponsesMessageTypeMessage), Role: role, Content: &schemas.ResponsesMessageContent{ ContentBlocks: []schemas.ResponsesMessageContentBlock{ { - Type: schemas.ResponsesOutputMessageContentTypeText, + Type: schemas.ResponsesInputMessageContentBlockTypeText, Text: block.Text, CacheControl: block.CacheControl, }, }, }, }Also applies to: 2263-2266, 2369-2397, 2495-2498, 2560-2563, 2998-3084, 3183-3264, 3325-3333, 3339-3344, 3377-3383
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- core/internal/testutil/account.go (3 hunks)
- core/internal/testutil/prompt_caching.go (3 hunks)
- core/internal/testutil/test_retry_framework.go (5 hunks)
- core/providers/anthropic/anthropic_test.go (2 hunks)
- core/providers/anthropic/chat.go (5 hunks)
- core/providers/anthropic/responses.go (13 hunks)
- core/providers/anthropic/types.go (2 hunks)
- core/providers/anthropic/utils.go (1 hunks)
- core/providers/bedrock/bedrock_test.go (2 hunks)
- core/providers/bedrock/chat.go (1 hunks)
- core/providers/bedrock/responses.go (8 hunks)
- core/providers/bedrock/types.go (6 hunks)
- core/providers/bedrock/utils.go (8 hunks)
- core/providers/openai/responses_marshal_test.go (0 hunks)
- core/providers/openai/types.go (4 hunks)
- core/schemas/chatcompletions.go (2 hunks)
- core/schemas/responses.go (2 hunks)
- core/schemas/utils.go (1 hunks)
- tests/integrations/config.yml (7 hunks)
- tests/integrations/tests/test_anthropic.py (5 hunks)
- tests/integrations/tests/test_bedrock.py (3 hunks)
- tests/integrations/tests/utils/common.py (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/openai/responses_marshal_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/integrations/config.yml
- tests/integrations/tests/utils/common.py
- core/internal/testutil/prompt_caching.go
- core/providers/bedrock/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/bedrock/bedrock_test.go
- core/schemas/responses.go
- core/providers/anthropic/utils.go
- core/providers/anthropic/anthropic_test.go
- core/internal/testutil/test_retry_framework.go
- core/providers/bedrock/chat.go
- core/schemas/utils.go
- core/internal/testutil/account.go
- core/providers/anthropic/chat.go
- core/providers/anthropic/types.go
- core/schemas/chatcompletions.go
- core/providers/anthropic/responses.go
- core/providers/bedrock/utils.go
- core/providers/bedrock/responses.go
- core/providers/openai/types.go
- tests/integrations/tests/test_bedrock.py
- tests/integrations/tests/test_anthropic.py
🧠 Learnings (7)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/bedrock/bedrock_test.go
- core/schemas/responses.go
- core/providers/anthropic/utils.go
- core/providers/anthropic/anthropic_test.go
- core/internal/testutil/test_retry_framework.go
- core/providers/bedrock/chat.go
- core/schemas/utils.go
- core/internal/testutil/account.go
- core/providers/anthropic/chat.go
- core/providers/anthropic/types.go
- core/schemas/chatcompletions.go
- core/providers/anthropic/responses.go
- core/providers/bedrock/utils.go
- core/providers/bedrock/responses.go
- core/providers/openai/types.go
📚 Learning: 2025-12-11T07:38:31.413Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/bedrock/bedrock_test.go:1374-1390
Timestamp: 2025-12-11T07:38:31.413Z
Learning: In core/providers/bedrock tests, follow a layered testing approach: - Unit tests (e.g., TestBifrostToBedrockResponseConversion) should perform structural comparisons and type/field checks to avoid brittleness from dynamic fields. - Separate scenario-based and integration tests should validate the full end-to-end conversion logic, including content block internals. Ensure unit tests avoid brittle string/field matching and that integration tests cover end-to-end behavior with realistic data.
Applied to files:
core/providers/bedrock/bedrock_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
- core/providers/bedrock/bedrock_test.go
- core/providers/anthropic/anthropic_test.go
📚 Learning: 2025-12-12T10:28:54.988Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 978
File: core/providers/anthropic/chat.go:207-216
Timestamp: 2025-12-12T10:28:54.988Z
Learning: In core/providers/anthropic/chat.go, within ToAnthropicChatRequest, ensure the mapping from ReasoningDetails to Anthropic thinking blocks preserves all fields from ReasoningDetails, including when Text is nil or empty. Do not drop or coerce missing Text; propagate nil/empty values to maintain complete bidirectional mapping of Anthropic events in Bifrost. Add tests to verify that ReasoningDetails with Text nil or "" round-trip without loss, and update code to handle nullable Text without dropping data. If there are multiple blocks, preserve order and all fields during mapping.
Applied to files:
core/providers/anthropic/chat.go
📚 Learning: 2025-12-17T08:44:08.788Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1114
File: core/schemas/chatcompletions.go:224-228
Timestamp: 2025-12-17T08:44:08.788Z
Learning: In core/schemas/chatcompletions.go, ensure the schema structures mirror OpenAI's API specifications exactly. Use the valid values for fields like ChatAudioParameters.Format and ChatAudioParameters.Voice as defined by OpenAI's documentation, and avoid adding additional inline documentation or constants to maintain direct compatibility with OpenAI's API.
Applied to files:
core/schemas/chatcompletions.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/types.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/types.go
🧬 Code graph analysis (7)
core/schemas/responses.go (1)
- core/schemas/chatcompletions.go (1): CacheControl (610-613)
core/providers/anthropic/utils.go (2)
- core/providers/anthropic/types.go (2): AnthropicContentBlockTypeImage (205-205), AnthropicImageSource (234-239)
- core/schemas/chatcompletions.go (1): CacheControl (610-613)
core/providers/bedrock/chat.go (3)
- core/schemas/chatcompletions.go (6): ChatContentBlock (592-602), ChatContentBlockTypeText (584-584), ChatAssistantMessageToolCall (702-708), ChatAssistantMessageToolCallFunction (711-714), ChatReasoningDetails (739-746), BifrostReasoningDetailsTypeText (735-735)
- core/schemas/bifrost.go (1): BifrostContextKeyStructuredOutputToolName (135-135)
- core/schemas/utils.go (1): Ptr (16-18)
core/providers/anthropic/chat.go (4)
- core/schemas/chatcompletions.go (4): CacheControl (610-613), ChatContentBlock (592-602), ChatAssistantMessageToolCallFunction (711-714), ChatAssistantMessageToolCall (702-708)
- core/providers/anthropic/types.go (3): AnthropicContentBlockTypeText (204-204), AnthropicContentBlockTypeToolUse (206-206), AnthropicContentBlockTypeThinking (212-212)
- core/schemas/utils.go (1): Ptr (16-18)
- core/utils.go (1): Ptr (56-58)
core/providers/anthropic/types.go (1)
- core/schemas/chatcompletions.go (2): CacheControl (610-613), ToolFunctionParameters (261-268)
tests/integrations/tests/test_bedrock.py (2)
- tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
- tests/integrations/tests/test_anthropic.py (2): validate_cache_write (1663-1675), validate_cache_read (1677-1689)
tests/integrations/tests/test_anthropic.py (3)
- tests/integrations/tests/utils/parametrize.py (2): get_cross_provider_params_for_scenario (12-47), format_provider_model (50-65)
- tests/integrations/tests/utils/common.py (1): assert_valid_chat_response (801-833)
- tests/integrations/tests/test_bedrock.py (2): validate_cache_write (1799-1809), validate_cache_read (1811-1821)
🔇 Additional comments (20)
core/schemas/chatcompletions.go (1)
244-250: Cache-control extensions to tools and content blocks look consistentAdding
CacheControltoChatToolandChatContentBlock, plus theCacheControlType/CacheControltypes, cleanly models provider-specific cache control while keeping fields optional and clearly flagged as non-OpenAI. No issues from a schema or JSON perspective.Also applies to: 591-613
core/providers/bedrock/utils.go (1)
16-33: Model-awareconvertToolConfigwith Nova gating is reasonable
- Updating
convertChatParametersto callconvertToolConfig(bifrostReq.Model, bifrostReq.Params)and havingconvertToolConfigappend an extraBedrockToolwithCachePointDefaultwhentool.CacheControl != nil && !schemas.IsNovaModel(model)looks consistent with model-specific caching behavior.- Existing tool-spec construction (name, description, schema) is preserved; the extra cache-point entry is additive and doesn’t interfere with tool choice logic.
Given Nova’s different cache semantics, skipping cache-point tool injection for Nova via
IsNovaModelis a sensible first pass.If you’d like, I can help draft a small test that asserts we emit cache-point tools for non-Nova models but not for Nova IDs (using a couple of representative model strings).
Also applies to: 617-685
core/internal/testutil/account.go (1)
22-56: ExtendingTestScenarioswithPromptCachingis consistentAdding
PromptCaching booltoTestScenariosfollows the existing pattern for feature flags (e.g.,Reasoning,Embedding) and defaults safely to false for providers that don’t set it.core/providers/anthropic/anthropic_test.go (1)
25-35: Anthropic test config updates for vision, reasoning, and prompt caching look goodProviding explicit
VisionModel,ReasoningModel, andPromptCachingModeland enablingPromptCachinginTestScenarioskeeps the Anthropic tests aligned with the new capabilities. The rest of the test shape (setup, shutdown, scenarios) remains consistent with other provider tests.Also applies to: 35-52
core/providers/anthropic/types.go (1)
216-231: Anthropic content/tool structs now expose cache control cleanlyAdding
CacheControl *schemas.CacheControlto bothAnthropicContentBlockandAnthropicToolaligns Anthropic types with the sharedCacheControlschema. Fields are optional and won’t affect existing payloads unless explicitly set.Also applies to: 290-300
core/schemas/responses.go (1)
401-418: Responses schemas extended with cache control in a backward-compatible wayThe new
CacheControl *CacheControlfields onResponsesMessageContentBlockandResponsesToolprovide a clean hook for provider-specific cache hints while remaining optional and clearly documented as non-OpenAI extensions. They integrate naturally with the existing Responses API model.Also applies to: 1036-1055
core/providers/bedrock/responses.go (6)
1571-1578: LGTM: CachePoint propagation for tools.The logic correctly appends a separate
BedrockToolwithCachePointafter tools that haveCacheControl, and appropriately excludes Nova models that don't support tool-level caching.
2450-2484: LGTM: CachePoint propagation relies on API ordering guarantee.The conversion correctly attaches
CacheControlto the last content block when aCachePointis encountered (lines 2454-2463). As confirmed in previous reviews, the Bedrock API guarantees thatCachePointentries only appear after text messages, preventing silent drops.
2413-2419: LGTM: Proper CacheControl→CachePoint conversion for system messages.The code correctly emits a separate
BedrockSystemMessagewithCachePointafter text messages that carryCacheControl, consistent with Bedrock's API format.
2656-2665: LGTM: CachePoint handling validated upstream.As confirmed in prior reviews, the provider API validates and returns an error when a
CachePointappears before content blocks, so the defensive check at line 2658 is sufficient. The upstream validation prevents the silent drop scenario.
2774-2781: LGTM: CacheControl→CachePoint propagation for content blocks.The conversion correctly emits a separate
BedrockContentBlockwithCachePointafter blocks that haveCacheControl, maintaining consistency with the tool and system message conversion patterns.
1283-1289: The code at lines 1283-1289 already includes a defensive check on line 1285 (if len(bifrostReq.Params.Tools) > 0) that prevents accessing an empty array. This guards against the scenario where aCachePointappears before anyToolSpecin the iteration. The same defensive pattern is consistently applied across similar locations in the file (lines 2454-2461 for system messages and lines 2656-2665 for content blocks). No additional changes are needed.Likely an incorrect or invalid review comment.
core/providers/bedrock/bedrock_test.go (1)
65-108: Bedrock comprehensive test config updates look consistent with prompt‑caching rolloutEmbedding/Reasoning/PromptCaching model choices and the batch/file extra‑params wiring integrate cleanly with the existing Bedrock comprehensive test setup and new PromptCaching scenario flag; no issues from this diff alone.
core/internal/testutil/test_retry_framework.go (1)
818-841: Retry framework streaming logs and nil‑channel behavior are now consistent and saferThe StreamingRetryConfig.OnRetry semantics (current failed attempt in, logging next attempt with
attempt+1) are now aligned with other helpers, and the Responses stream‑validation retry path adds clear per‑attempt logs, explicit wait logging, and, importantly, continues retrying on a nilresponseChannelinstead of returning early. Control flow and backoff usage look correct.Also applies to: 2150-2325
core/providers/bedrock/chat.go (1)
62-186: Bedrock → Bifrost chat conversion now correctly handles structured tools, tool calls, reasoning, and cache usageThe new block‑wise processing cleanly separates:
- structured‑output tool_use (marshalled into
ContentStrand excluded fromToolCalls),- regular tool_use (proper
ChatAssistantMessageToolCallwith JSON arguments),- reasoning blocks (aggregated into
ReasoningDetails/Reasoning),- and usage/cache fields (read/write tokens → prompt/completion token details),
while avoiding mutation of the source response. Only notable behavioral change is that plain text is now always surfaced via
ContentBlocks(except for structured output) rather than collapsing single‑text blocks intoContentStr; double‑check any consumers that might still look exclusively atContentStrfor Bedrock responses.core/providers/openai/types.go (1)
82-190: CacheControl‑stripping for OpenAI chat/responses is now non‑mutating and allocation‑awareThe new MarshalJSON implementations correctly:
- detect presence of CacheControl first,
- only then allocate per‑message/tool copies, cloning content/tool‑output blocks and nil‑ing CacheControl on the copies,
- and shadow
messages,tools, andinputin aux structs so that wire JSON excludes CacheControl while the original structs stay intact.This cleanly resolves the prior mutation concern while keeping the no‑CacheControl hot path cheap.
Also applies to: 253-371, 394-452
core/providers/anthropic/chat.go (1)
42-289: Anthropic request/response mappings now propagate cache_control on input and normalize content types on outputOn the request path, CacheControl is threaded from:
- tools →
AnthropicTool.CacheControl, and- all relevant text content blocks (system, tool results, user/assistant) →
AnthropicContentBlock.CacheControl,without disturbing the existing reasoning/thinking mapping. On the response path, the new type‑switch over content blocks correctly distinguishes text, tool_use, and thinking, constructing ChatContentBlocks, tool calls (with JSON‑marshalled inputs), and reasoning details/reasoning text accordingly. This matches the intended prompt‑caching semantics and prior guidance that responses themselves don’t carry cache_control.
Also applies to: 320-365
tests/integrations/tests/test_bedrock.py (1)
37-41: Bedrock prompt‑caching tests and helpers look solidThe new Prompt Caching section, tests 21–23, and the
validate_cache_write/readhelpers are wired correctly: fixtures and parametrization are used properly, cache counters are asserted in a tolerant way (no brittle “must be zero” invariants), and the approximate equality check for system/messages caching should be robust enough for an integration suite.Also applies to: 1623-1821
tests/integrations/tests/test_anthropic.py (1)
39-41: Anthropic prompt‑caching tests and usage helpers are well‑structuredThe new prompt‑caching tests (28–30) and
validate_cache_write/readhelpers correctly target Anthropic’s usage fields, avoid brittle assumptions about warm caches, and use a reasonable token‑difference tolerance where needed. Signatures and imports are consistent and previous unused‑fixture concerns are resolved.Also applies to: 52-86, 1421-1608, 1663-1689
core/providers/anthropic/responses.go (1)
1344-1345: Centralized system‑content conversion looks correct. Passing request.System into ConvertAnthropicMessagesToBifrostMessages and handling it via convertAnthropicSystemToBifrostMessages gives a single, symmetric path for Anthropic system content (and its CacheControl) in both directions. The helper also sensibly skips emitting a system message when there's no text and avoids empty ContentBlocks arrays.
Also applies to: 1627-1649, 2065-2097
Merge activity

Summary
Improved the test retry framework with better logging and added support for cache control in content blocks across multiple providers.
Changes
Type of change
Affected areas
How to test
Breaking changes
Security considerations
Cache control implementation allows providers to specify content that should be treated as ephemeral, improving security for sensitive content.
Checklist