-
Notifications
You must be signed in to change notification settings - Fork 186
feat: core tests for web search, schemas #1339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds comprehensive web-search tool tests and domain utilities, moves web-search delta generation in Anthropic responses from content-start to completion, expands OpenAI response JSON round-trip tests, adjusts schema action unmarshalling, and updates a few test configurations. Changes
Sequence Diagram(s)mermaid mermaid Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This stack of pull requests is managed by Graphite. Learn more about stacking. |
9f21e84 to
281ec08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
core/schemas/responses.go (1)
530-583: Fix unreachable code inResponsesToolMessageActionStruct.UnmarshalJSON(compile blocker).
The switch returns in every branch (includingdefault), so thereturn fmt.Errorf("unknown action type...")after the switch is unreachable and will not compile. Also consider resetting the union fields before setting the chosen variant.Proposed fix
func (action *ResponsesToolMessageActionStruct) UnmarshalJSON(data []byte) error { + // Reset union fields to avoid stale state if the struct is reused. + *action = ResponsesToolMessageActionStruct{} + // First, peek at the type field to determine which variant to unmarshal var typeStruct struct { Type string `json:"type"` } if err := Unmarshal(data, &typeStruct); err != nil { return fmt.Errorf("failed to peek at type field: %w", err) } + if typeStruct.Type == "" { + return fmt.Errorf("missing required action.type") + } // Based on the type, unmarshal into the appropriate variant switch typeStruct.Type { // ... default: // use computer tool, as it can have many possible actions var computerToolCallAction ResponsesComputerToolCallAction if err := Unmarshal(data, &computerToolCallAction); err != nil { return fmt.Errorf("failed to unmarshal computer tool call action: %w", err) } action.ResponsesComputerToolCallAction = &computerToolCallAction return nil } - return fmt.Errorf("unknown action type: %s", typeStruct.Type) }core/providers/anthropic/responses.go (1)
1811-1818: Don’t clobber existingparams.Includewhen enabling web_search sources.
params.Include = []string{"web_search_call.action.sources"}will overwrite any user-providedincludefromExtraParams. Prefer append+dedupe, or only set it when empty.Proposed fix
- } else if tool.Type != nil && (*tool.Type == AnthropicToolTypeWebSearch20250305) { - params.Include = []string{"web_search_call.action.sources"} + } else if tool.Type != nil && (*tool.Type == AnthropicToolTypeWebSearch20250305) { + const inc = "web_search_call.action.sources" + found := false + for _, v := range params.Include { + if v == inc { + found = true + break + } + } + if !found { + params.Include = append(params.Include, inc) + } }core/providers/openai/openai_test.go (1)
25-33: ChatModel switch togpt-4o: confirm this cost increase is intentional for CI.
gpt-4ois available and current (Jan 2025), but it costs ~16–17× more thangpt-4o-mini(input: $2.50/$0.15 per 1M tokens; output: $10.00/$0.60 per 1M tokens). Both models are fully available across OpenAI's Chat Completions endpoint with standard rate limits (tier-based RPM/TPM that scale with usage). Confirm this pricing tradeoff is intentional for test suite coverage.
🤖 Fix all issues with AI agents
In `@core/internal/testutil/web_search_tool.go`:
- Around line 351-359: The test creates webSearchTool with
ResponsesToolWebSearch.Filters.AllowedDomains set to
["wikipedia.org","en.wikipedia.org"] but later validates against patterns like
["wikipedia.org/*","*.edu"]; update the test to make the expected patterns match
the configured list by either (A) asserting the same AllowedDomains value is
used for validation, or (B) deriving wildcard patterns from AllowedDomains
consistently (e.g., map "wikipedia.org" → "wikipedia.org/*" or add "*.domain" as
needed) so the validation of ResponsesToolWebSearchFilters uses the same
transformation logic that produced the request—adjust the code around the
webSearchTool variable and the validation/assertion block to use the same
source/derivation for patterns.
In `@core/providers/openai/responses_test.go`:
- Around line 722-742: The test and implementation for
ResponsesToolMessageActionStruct currently default unknown types to
ResponsesComputerToolCallAction which risks silent misclassification; change the
custom unmarshalling logic on ResponsesToolMessageActionStruct to return an
error for unknown "type" values (mirroring the behavior of
ResponsesCodeInterpreterOutput and ResponsesToolFileSearchCompoundFilter),
update the test to expect an error when unmarshalling
`{"type":"unknown_action"}`, and adjust any callers/tests relying on the old
fallback; alternatively, if you prefer the fallback contract, add explicit
documentation in the struct comment and test asserting the documented behavior
instead of silent acceptance.
🧹 Nitpick comments (6)
core/internal/testutil/web_search_tool.go (3)
421-425: Domain-filter validation is too weak/incorrect for the scenario it claims to test.
- Substring matching can accept
evilwikipedia.orgas “wikipedia.org”.ValidateWebSearchSourcesonly logs on mismatch, so the “domain filtering” test may never fail even if filtering regresses.toLowercan panic on non-ASCII strings due to[]runesized bylen(s)(bytes).Proposed fix (host-based matching + enforce at least one match)
import ( "context" + "net/url" "os" + "strings" "testing" "time" @@ func ValidateWebSearchSources(t *testing.T, sources []schemas.ResponsesWebSearchToolCallActionSearchSource, allowedDomains []string) { require.NotEmpty(t, sources, "Sources should not be empty") + var matched int for i, source := range sources { // Validate basic structure require.NotEmpty(t, source.URL, "Source %d should have a URL", i+1) t.Logf(" Source %d: %s", i+1, source.URL) // If domain filters specified, validate sources match patterns if len(allowedDomains) > 0 { matchesFilter := false for _, domain := range allowedDomains { - if matchesDomainPattern(source.URL, domain) { + if matchesDomainPattern(source.URL, domain) { matchesFilter = true break } } - if !matchesFilter { - t.Logf(" ⚠️ Source %d (%s) doesn't match allowed domain filters", i+1, source.URL) + if matchesFilter { + matched++ } } } + if len(allowedDomains) > 0 { + require.Greater(t, matched, 0, "Expected at least 1 source to match allowed domain filters") + } t.Logf("✅ Validated %d search sources", len(sources)) } // matchesDomainPattern checks if a URL matches a domain pattern func matchesDomainPattern(rawURL, pattern string) bool { - // Simple pattern matching implementation - // "*.edu" matches URLs containing ".edu" - // "wikipedia.org/*" matches URLs containing "wikipedia.org" + u, err := url.Parse(rawURL) + if err != nil { + return false + } + host := strings.ToLower(u.Hostname()) + p := strings.ToLower(pattern) - if len(pattern) > 0 && pattern[0] == '*' { - // Pattern like "*.edu" - suffix := pattern[1:] - return containsSubstring(url, suffix) + // "*.edu" + if strings.HasPrefix(p, "*.") { + return strings.HasSuffix(host, p[1:]) // keep leading ".edu" } - if len(pattern) > 0 && pattern[len(pattern)-1] == '*' { - // Pattern like "wikipedia.org/*" - prefix := pattern[:len(pattern)-2] - return containsSubstring(url, prefix) + // "wikipedia.org/*" + if strings.HasSuffix(p, "/*") { + p = strings.TrimSuffix(p, "/*") } - // Exact match - return containsSubstring(url, pattern) -} - -// containsSubstring checks if s contains substr (case-insensitive) -func containsSubstring(s, substr string) bool { - s = toLower(s) - substr = toLower(substr) - return len(s) >= len(substr) && indexOfSubstring(s, substr) >= 0 -} - -// toLower converts string to lowercase -func toLower(s string) string { - result := make([]rune, len(s)) - for i, r := range s { - if r >= 'A' && r <= 'Z' { - result[i] = r + 32 - } else { - result[i] = r - } - } - return string(result) -} - -// indexOfSubstring finds index of substr in s, or -1 if not found -func indexOfSubstring(s, substr string) int { - if len(substr) == 0 { - return 0 - } - if len(substr) > len(s) { - return -1 - } - - for i := 0; i <= len(s)-len(substr); i++ { - if s[i:i+len(substr)] == substr { - return i - } - } - return -1 + // Exact host match (or subdomain match if pattern is a bare registrable/domain) + return host == p || strings.HasSuffix(host, "."+p) }Also applies to: 753-842
451-456: Preferbifrost.Ptr(...)for pointer fields in testutil.
Usebifrost.Ptr(size)/bifrost.Ptr(maxUses)instead of&size/&maxUsesfor consistency with repo conventions (and to avoid future pointer-to-loop-var footguns). Based on learnings, preferbifrost.Ptr().Also applies to: 687-694
148-169: Consider adopting the directory’s retry-config convention for new test scenarios.
This file introduces new scenario entry points but uses bespoke retry config helpers (WebSearchRetryConfig,StreamingRetryConfig). Ifcore/internal/testutilstandardizes onGetTestRetryConfigForScenario()+ typed config composition, aligning here will keep behavior consistent across the comprehensive suite. Based on learnings, ...Also applies to: 212-214
core/internal/testutil/tests.go (1)
24-91: Consider gating “live web” scenarios to reduce flakiness/time across the whole stack.
These new web-search scenarios are now part of the comprehensive runner; if CI runs this suite broadly, it may become more network-dependent/flaky across providers.core/providers/openai/responses_test.go (2)
157-193: Add a nil-guard forresult.Inputto avoid panic in the test itself.
result.Input.OpenAIResponsesRequestInputArraywill panic ifresult.Inputis nil (making failures harder to diagnose).Proposed diff
result := ToOpenAIResponsesRequest(bifrostReq) if result == nil { t.Fatal("ToOpenAIResponsesRequest returned nil") } + if result.Input == nil { + t.Fatal("ToOpenAIResponsesRequest returned nil Input") + } messageCount := len(result.Input.OpenAIResponsesRequestInputArray)
858-900:vector_store_ids:nullexpectation is brittle if tags ever move toomitempty.
If the goal is “file_search tool unmarshals without error and populates the right embedded struct”, asserting on null emission may be stricter than needed.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
core/internal/testutil/tests.gocore/internal/testutil/web_search_tool.gocore/providers/anthropic/responses.gocore/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/schemas/responses.gotests/integrations/python/tests/test_openai.py
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openai/openai_test.gocore/schemas/responses.gocore/providers/openai/responses_test.gotests/integrations/python/tests/test_openai.pycore/internal/testutil/tests.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
🧠 Learnings (19)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/openai/openai_test.gocore/schemas/responses.gocore/providers/openai/responses_test.gocore/internal/testutil/tests.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.
Applied to files:
core/providers/openai/openai_test.gocore/schemas/responses.gocore/providers/openai/responses_test.gocore/internal/testutil/tests.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T04:40:11.480Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1312
File: framework/modelcatalog/pricing.go:276-426
Timestamp: 2026-01-14T04:40:11.480Z
Learning: In the Bifrost codebase, ImageUsage and other usage types guarantee that TotalTokens is populated (computed as InputTokens + OutputTokens if providers don’t supply TotalTokens). Reviewers can rely on this invariant and should not assume TotalTokens may be missing when input/output tokens exist. When implementing tiering logic or token-based decisions, you can safely use TotalTokens without extra null/zero guards, provided you’re in a context where InputTokens and OutputTokens are present. If a branch might discard tokens, ensure the invariant is preserved or add explicit checks only where the inputs are confirmed to be valid.
Applied to files:
core/providers/openai/openai_test.gocore/schemas/responses.gocore/providers/openai/responses_test.gocore/internal/testutil/tests.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T13:30:28.760Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: plugins/semanticcache/test_utils.go:545-559
Timestamp: 2026-01-14T13:30:28.760Z
Learning: In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities, to improve consistency and readability. Replace occurrences of &value where a *T is expected with bifrost.Ptr(value) (or an equivalent call) and ensure the function is in scope and used correctly for the target pointer type.
Applied to files:
core/providers/openai/openai_test.gocore/schemas/responses.gocore/providers/openai/responses_test.gocore/internal/testutil/tests.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:40:42.227Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:40:42.227Z
Learning: When handling unsupported operations across providers, avoid hardcoding provider constants (e.g., schemas.Bedrock). Use the provider.GetProviderKey() (or equivalent API) to obtain the actual provider key from configuration, ensuring errors and messages adapt to custom provider names. Apply this pattern to all core/provider implementations (not just bedrock) to improve configurability and maintainability.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-10T11:27:47.535Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1256
File: core/providers/openai/openai.go:2276-2385
Timestamp: 2026-01-10T11:27:47.535Z
Learning: Validate image generation requests for nil and missing prompts before dispatch. Follow the same pattern used here: core/bifrost.go validates nil/empty prompts, providerUtils.CheckContextAndGetRequestBody returns a structured error when the request converter yields nil, and apply this across all providers (including OpenAI) to avoid sending null bodies.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:49:14.252Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:49:14.252Z
Learning: In provider implementations, when raising unsupported-operation errors, pass provider.GetProviderKey() to NewUnsupportedOperationError so error messages and ExtraFields.Provider reflect the provider's alias. Apply this pattern consistently across all provider files (not just this one) to ensure accurate error context and branding.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-14T10:53:44.658Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1326
File: core/providers/gemini/gemini.go:1679-1754
Timestamp: 2026-01-14T10:53:44.658Z
Learning: Validate image generation inputs in core/bifrost.go before invoking any provider handler. Ensure in all provider implementations (e.g., core/providers/gemini/gemini.go) that the request and request.Input are non-nil before use, to prevent nil dereferences and provide clear error handling. Apply this invariant broadly to all providers and add tests for nil input scenarios.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-13T17:10:07.064Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1312
File: tests/integrations/python/tests/test_openai.py:1166-1258
Timestamp: 2026-01-13T17:10:07.064Z
Learning: In tests under tests/integrations/python, prefer using the OpenAI image generation model 'gpt-image-1' via the config key providers.openai.image_generation for image-generation scenarios. This avoids DALLE-3 parameter limitations (e.g., n>1, quality/size combos). Ensure tests reference this provider in mocks/fixtures and document why this choice is used for test determinism.
Applied to files:
tests/integrations/python/tests/test_openai.py
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.
Applied to files:
core/internal/testutil/tests.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-11T14:08:10.341Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1298
File: core/providers/anthropic/anthropic.go:682-699
Timestamp: 2026-01-11T14:08:10.341Z
Learning: In Anthroplic streaming implementations (and analogous providers), ensure that the final 'summary' chunk, which carries usage information and metadata, is emitted after all delta chunks and uses a chunk index of last_delta_index + 1. This differentiates the summary chunk from content delta chunks. Apply this convention consistently in the anthropic provider code and in similar streaming providers, and consider adding a targeted test to assert the ordering and chunk index logic.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-13T13:36:35.221Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/responses.go:937-937
Timestamp: 2026-01-13T13:36:35.221Z
Learning: In core/providers/anthropic/responses.go, when handling Anthropic API streaming responses, ensure that content_block_start events include a signature field set to an empty string (e.g., contentBlock.Signature = ""). The actual signature is delivered later via signature_delta events. This behavior is per Anthropic's specification and should not be treated as an error. This guideline should apply to all Anthropic response handling files under core/providers/anthropic/ and similar go files that process streaming blocks.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: For Anthropic citation types (page_location, char_location, content_block_location), ensure there is an optional string field file_id to reference uploaded files. Update the Go structs modeling these citations to include FileID *string (or string with omitempty) and document its optionality in comments, so code consuming these types can handle absence of file_id gracefully.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T11:16:33.414Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1241
File: core/providers/anthropic/utils.go:801-874
Timestamp: 2026-01-15T11:16:33.414Z
Learning: In the Anthropic provider (core/providers/anthropic), allow empty arrays for allowed_domains and blocked_domains in the tool arguments JSON. Update or confirm sanitization logic so it does not strip empty arrays when only one of the domain filter fields is present (i.e., if the list is empty but the field is specified, keep it). This applies to code in core/providers/anthropic/utils.go (and related files in the same package if they share the same sanitization behavior). Ensure tests cover cases with empty arrays and with both fields omitted.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: In core/providers/anthropic/types.go, ensure the web_search_result_location citation type includes a string field named 'url' alongside the existing fields 'encrypted_index', 'title', and 'cited_text'. If the field is missing, add it with type string and appropriate struct tags (e.g., json and/or db tags) and update any related serialization or usage accordingly.
Applied to files:
core/providers/anthropic/responses.go
🧬 Code graph analysis (5)
core/schemas/responses.go (3)
core/schemas/json_native.go (1)
Unmarshal(18-20)core/schemas/json_wasm.go (1)
Unmarshal(22-24)core/providers/gemini/types.go (1)
Type(825-825)
core/providers/openai/responses_test.go (1)
core/schemas/responses.go (26)
ResponsesToolMessageActionStruct(507-512)ResponsesComputerToolCallAction(656-667)ResponsesWebSearchToolCallAction(698-705)ResponsesLocalShellToolCallAction(913-920)ResponsesMCPApprovalRequestAction(941-947)ResponsesTool(1091-1109)ResponsesToolTypeFunction(1078-1078)ResponsesToolFunction(1402-1405)ResponsesToolTypeFileSearch(1079-1079)ResponsesToolFileSearch(1408-1413)ResponsesToolComputerUsePreview(1545-1551)ResponsesToolTypeWebSearch(1081-1081)ResponsesToolWebSearch(1554-1561)ResponsesToolTypeMCP(1082-1082)ResponsesToolMCP(1579-1588)ResponsesToolTypeCodeInterpreter(1083-1083)ResponsesToolCodeInterpreter(1666-1668)ResponsesToolTypeImageGeneration(1084-1084)ResponsesToolImageGeneration(1671-1682)ResponsesToolTypeLocalShell(1085-1085)ResponsesToolLocalShell(1691-1693)ResponsesToolTypeCustom(1086-1086)ResponsesToolCustom(1696-1698)ResponsesToolTypeWebSearchPreview(1087-1087)ResponsesToolWebSearchPreview(1710-1713)ResponsesToolWebSearchUserLocation(1570-1576)
core/internal/testutil/tests.go (1)
core/internal/testutil/web_search_tool.go (5)
RunWebSearchToolStreamTest(172-333)RunWebSearchToolWithDomainsTest(336-428)RunWebSearchToolContextSizesTest(431-528)RunWebSearchToolMultiTurnTest(531-663)RunWebSearchToolMaxUsesTest(666-751)
core/providers/anthropic/responses.go (2)
core/providers/anthropic/types.go (1)
AnthropicToolTypeWebSearch20250305(368-368)core/schemas/responses.go (2)
ResponsesToolMessage(485-505)ResponsesWebSearchToolCallAction(698-705)
core/internal/testutil/web_search_tool.go (5)
core/schemas/responses.go (9)
ResponsesMessage(319-332)ResponsesTool(1091-1109)ResponsesToolTypeWebSearch(1081-1081)ResponsesToolWebSearch(1554-1561)ResponsesToolWebSearchUserLocation(1570-1576)BifrostResponsesStreamResponse(1789-1828)ResponsesMessageTypeWebSearchCall(299-299)ResponsesToolWebSearchFilters(1564-1567)ResponsesMessageTypeMessage(295-295)core/internal/testutil/utils.go (1)
CreateBasicResponsesMessage(276-284)core/schemas/utils.go (1)
Ptr(14-16)core/internal/testutil/test_retry_framework.go (4)
StreamingRetryConfig(871-893)TestRetryContext(174-179)WithResponsesStreamValidationRetry(2361-2567)ResponsesStreamValidationResult(2351-2357)core/schemas/bifrost.go (1)
BifrostStream(456-463)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (5)
core/schemas/responses.go (1)
557-581: Double-check the “unknown action => computer action” fallback is safe.
Routing any unrecognizedaction.typeintoResponsesComputerToolCallActioncan silently accept malformed or new tool actions and misclassify them, which may hide provider/schema drift. If this is intentional for forward-compat, consider logging/telemetry at the call site whentypeStruct.Typeisn’t in the known set.core/providers/anthropic/responses.go (1)
1554-1562: VerifyresultIndex = index+1cannot collide with other Anthropic content block indices.
If a subsequent “real” content block is already assignedindex+1, the syntheticweb_search_tool_resultblock will conflict. If this is guaranteed not to happen due to how you map OpenAI output indices to Anthropic indices in this pathway, it’d help to codify that with a comment and a targeted test.tests/integrations/python/tests/test_openai.py (1)
579-586: Streaming call simplified; ensure coverage for “reasoning params + streaming” exists elsewhere if still required.
Change seems reasonable ifextra_bodycaused incompatibilities, but it also means this test no longer exercises that path.core/providers/openai/responses_test.go (1)
1386-1435:mapsEqual/valuesEqualapproach is solid for JSON round-trip assertions.
Nice way to avoid field-order flakes in these marshal tests.core/internal/testutil/tests.go (1)
24-42: No issues found. All new test functions (RunWebSearchToolStreamTest,RunWebSearchToolWithDomainsTest,RunWebSearchToolContextSizesTest,RunWebSearchToolMultiTurnTest,RunWebSearchToolMaxUsesTest) have signatures that matchTestScenarioFuncexactly. Function signatures in Go are determined by parameter types, not names, and all functions correctly accept(*testing.T, *bifrost.Bifrost, context.Context, ComprehensiveTestConfig)in order.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/responses.go (1)
530-583: Fix unreachable code inResponsesToolMessageActionStruct.UnmarshalJSON(won’t compile as-is).
return fmt.Errorf("unknown action type...")at Line 582 is unreachable because all switch branches return, which should trigger a Go compiler error.Proposed fix
func (action *ResponsesToolMessageActionStruct) UnmarshalJSON(data []byte) error { @@ default: // use computer tool, as it can have many possible actions var computerToolCallAction ResponsesComputerToolCallAction if err := Unmarshal(data, &computerToolCallAction); err != nil { return fmt.Errorf("failed to unmarshal computer tool call action: %w", err) } action.ResponsesComputerToolCallAction = &computerToolCallAction return nil } - return fmt.Errorf("unknown action type: %s", typeStruct.Type) }Follow-up (optional): consider explicitly erroring when
typeStruct.Type == ""and/or resetting other union pointers before setting the chosen variant to avoid silently accepting malformed payloads or accumulating state on reuse.
🤖 Fix all issues with AI agents
In `@core/internal/testutil/web_search_tool.go`:
- Around line 351-423: The test sets
webSearchTool.ResponsesToolWebSearch.Filters.AllowedDomains to ["wikipedia.org",
"en.wikipedia.org"] but ValidateWebSearchSources is called with patterns
["wikipedia.org/*", "*.edu"], which is inconsistent; update the validation call
to match the configured AllowedDomains (e.g., use ["wikipedia.org/*",
"en.wikipedia.org/*"] or equivalent patterns for "wikipedia.org" and
"en.wikipedia.org") so ValidateWebSearchSources(t, sources, ...) reflects the
AllowedDomains on the webSearchTool.
🧹 Nitpick comments (3)
core/providers/openai/responses_test.go (2)
775-1380: ResponsesTool marshal/unmarshal tests look solid; consider factoring out repeated “marshal+compare / unmarshal+assert” boilerplate (optional).The scenarios cover a wide surface (cache_control, locations, different tool types) and should be useful regression protection. If this file keeps growing, a tiny helper like
assertJSONRoundTrip(t, v, expectedJSON)would reduce duplication and make it easier to add new cases.
1386-1435:mapsEqual/valuesEqualcan likely be simplified usingreflect.DeepEqualfor these decoded JSON maps (optional).Since both sides are produced by
json.Unmarshalintomap[string]interface{},reflect.DeepEqual(expected, actual)should already handle nested maps/slices reliably and would remove custom recursion.core/internal/testutil/web_search_tool.go (1)
807-842: Prefer standard library functions over custom implementations.These helper functions (
containsSubstring,toLower,indexOfSubstring) duplicate functionality available in thestringspackage. The customtoLoweralso has a subtle bug with multi-byte UTF-8 characters—usinglen(s)returns byte count whilerangeiteration yields byte indices, which can cause misaligned writes for non-ASCII input.While URLs are typically ASCII (so this works in practice), using the standard library is cleaner and more maintainable.
♻️ Suggested refactor using stdlib
+import "strings" + // matchesDomainPattern checks if a URL matches a domain pattern func matchesDomainPattern(url, pattern string) bool { // Simple pattern matching implementation // "*.edu" matches URLs containing ".edu" // "wikipedia.org/*" matches URLs containing "wikipedia.org" if len(pattern) > 0 && pattern[0] == '*' { // Pattern like "*.edu" suffix := pattern[1:] - return containsSubstring(url, suffix) + return strings.Contains(strings.ToLower(url), strings.ToLower(suffix)) } if len(pattern) > 0 && pattern[len(pattern)-1] == '*' { // Pattern like "wikipedia.org/*" prefix := pattern[:len(pattern)-2] - return containsSubstring(url, prefix) + return strings.Contains(strings.ToLower(url), strings.ToLower(prefix)) } // Exact match - return containsSubstring(url, pattern) + return strings.Contains(strings.ToLower(url), strings.ToLower(pattern)) } - -// containsSubstring checks if s contains substr (case-insensitive) -func containsSubstring(s, substr string) bool { - s = toLower(s) - substr = toLower(substr) - return len(s) >= len(substr) && indexOfSubstring(s, substr) >= 0 -} - -// toLower converts string to lowercase -func toLower(s string) string { - result := make([]rune, len(s)) - for i, r := range s { - if r >= 'A' && r <= 'Z' { - result[i] = r + 32 - } else { - result[i] = r - } - } - return string(result) -} - -// indexOfSubstring finds index of substr in s, or -1 if not found -func indexOfSubstring(s, substr string) int { - if len(substr) == 0 { - return 0 - } - if len(substr) > len(s) { - return -1 - } - - for i := 0; i <= len(s)-len(substr); i++ { - if s[i:i+len(substr)] == substr { - return i - } - } - return -1 -}
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
core/internal/testutil/tests.gocore/internal/testutil/web_search_tool.gocore/providers/anthropic/responses.gocore/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/schemas/responses.gotests/integrations/python/tests/test_openai.py
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/openai/openai_test.gocore/internal/testutil/tests.gocore/providers/openai/responses_test.gocore/schemas/responses.gocore/providers/anthropic/responses.gotests/integrations/python/tests/test_openai.pycore/internal/testutil/web_search_tool.go
🧠 Learnings (19)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/openai/openai_test.gocore/internal/testutil/tests.gocore/providers/openai/responses_test.gocore/schemas/responses.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.
Applied to files:
core/providers/openai/openai_test.gocore/internal/testutil/tests.gocore/providers/openai/responses_test.gocore/schemas/responses.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T04:40:11.480Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1312
File: framework/modelcatalog/pricing.go:276-426
Timestamp: 2026-01-14T04:40:11.480Z
Learning: In the Bifrost codebase, ImageUsage and other usage types guarantee that TotalTokens is populated (computed as InputTokens + OutputTokens if providers don’t supply TotalTokens). Reviewers can rely on this invariant and should not assume TotalTokens may be missing when input/output tokens exist. When implementing tiering logic or token-based decisions, you can safely use TotalTokens without extra null/zero guards, provided you’re in a context where InputTokens and OutputTokens are present. If a branch might discard tokens, ensure the invariant is preserved or add explicit checks only where the inputs are confirmed to be valid.
Applied to files:
core/providers/openai/openai_test.gocore/internal/testutil/tests.gocore/providers/openai/responses_test.gocore/schemas/responses.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T13:30:28.760Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: plugins/semanticcache/test_utils.go:545-559
Timestamp: 2026-01-14T13:30:28.760Z
Learning: In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities, to improve consistency and readability. Replace occurrences of &value where a *T is expected with bifrost.Ptr(value) (or an equivalent call) and ensure the function is in scope and used correctly for the target pointer type.
Applied to files:
core/providers/openai/openai_test.gocore/internal/testutil/tests.gocore/providers/openai/responses_test.gocore/schemas/responses.gocore/providers/anthropic/responses.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:40:42.227Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:40:42.227Z
Learning: When handling unsupported operations across providers, avoid hardcoding provider constants (e.g., schemas.Bedrock). Use the provider.GetProviderKey() (or equivalent API) to obtain the actual provider key from configuration, ensuring errors and messages adapt to custom provider names. Apply this pattern to all core/provider implementations (not just bedrock) to improve configurability and maintainability.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-10T11:27:47.535Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1256
File: core/providers/openai/openai.go:2276-2385
Timestamp: 2026-01-10T11:27:47.535Z
Learning: Validate image generation requests for nil and missing prompts before dispatch. Follow the same pattern used here: core/bifrost.go validates nil/empty prompts, providerUtils.CheckContextAndGetRequestBody returns a structured error when the request converter yields nil, and apply this across all providers (including OpenAI) to avoid sending null bodies.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:49:14.252Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:49:14.252Z
Learning: In provider implementations, when raising unsupported-operation errors, pass provider.GetProviderKey() to NewUnsupportedOperationError so error messages and ExtraFields.Provider reflect the provider's alias. Apply this pattern consistently across all provider files (not just this one) to ensure accurate error context and branding.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2026-01-14T10:53:44.658Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1326
File: core/providers/gemini/gemini.go:1679-1754
Timestamp: 2026-01-14T10:53:44.658Z
Learning: Validate image generation inputs in core/bifrost.go before invoking any provider handler. Ensure in all provider implementations (e.g., core/providers/gemini/gemini.go) that the request and request.Input are non-nil before use, to prevent nil dereferences and provide clear error handling. Apply this invariant broadly to all providers and add tests for nil input scenarios.
Applied to files:
core/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/providers/anthropic/responses.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.
Applied to files:
core/internal/testutil/tests.gocore/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-11T14:08:10.341Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1298
File: core/providers/anthropic/anthropic.go:682-699
Timestamp: 2026-01-11T14:08:10.341Z
Learning: In Anthroplic streaming implementations (and analogous providers), ensure that the final 'summary' chunk, which carries usage information and metadata, is emitted after all delta chunks and uses a chunk index of last_delta_index + 1. This differentiates the summary chunk from content delta chunks. Apply this convention consistently in the anthropic provider code and in similar streaming providers, and consider adding a targeted test to assert the ordering and chunk index logic.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-13T13:36:35.221Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/responses.go:937-937
Timestamp: 2026-01-13T13:36:35.221Z
Learning: In core/providers/anthropic/responses.go, when handling Anthropic API streaming responses, ensure that content_block_start events include a signature field set to an empty string (e.g., contentBlock.Signature = ""). The actual signature is delivered later via signature_delta events. This behavior is per Anthropic's specification and should not be treated as an error. This guideline should apply to all Anthropic response handling files under core/providers/anthropic/ and similar go files that process streaming blocks.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: For Anthropic citation types (page_location, char_location, content_block_location), ensure there is an optional string field file_id to reference uploaded files. Update the Go structs modeling these citations to include FileID *string (or string with omitempty) and document its optionality in comments, so code consuming these types can handle absence of file_id gracefully.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T11:16:33.414Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1241
File: core/providers/anthropic/utils.go:801-874
Timestamp: 2026-01-15T11:16:33.414Z
Learning: In the Anthropic provider (core/providers/anthropic), allow empty arrays for allowed_domains and blocked_domains in the tool arguments JSON. Update or confirm sanitization logic so it does not strip empty arrays when only one of the domain filter fields is present (i.e., if the list is empty but the field is specified, keep it). This applies to code in core/providers/anthropic/utils.go (and related files in the same package if they share the same sanitization behavior). Ensure tests cover cases with empty arrays and with both fields omitted.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: In core/providers/anthropic/types.go, ensure the web_search_result_location citation type includes a string field named 'url' alongside the existing fields 'encrypted_index', 'title', and 'cited_text'. If the field is missing, add it with type string and appropriate struct tags (e.g., json and/or db tags) and update any related serialization or usage accordingly.
Applied to files:
core/providers/anthropic/responses.go
📚 Learning: 2026-01-13T17:10:07.064Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1312
File: tests/integrations/python/tests/test_openai.py:1166-1258
Timestamp: 2026-01-13T17:10:07.064Z
Learning: In tests under tests/integrations/python, prefer using the OpenAI image generation model 'gpt-image-1' via the config key providers.openai.image_generation for image-generation scenarios. This avoids DALLE-3 parameter limitations (e.g., n>1, quality/size combos). Ensure tests reference this provider in mocks/fixtures and document why this choice is used for test determinism.
Applied to files:
tests/integrations/python/tests/test_openai.py
🧬 Code graph analysis (5)
core/internal/testutil/tests.go (1)
core/internal/testutil/web_search_tool.go (5)
RunWebSearchToolStreamTest(172-333)RunWebSearchToolWithDomainsTest(336-428)RunWebSearchToolContextSizesTest(431-528)RunWebSearchToolMultiTurnTest(531-663)RunWebSearchToolMaxUsesTest(666-751)
core/providers/openai/responses_test.go (2)
core/schemas/responses.go (16)
ResponsesToolMessageActionStruct(507-512)ResponsesComputerToolCallAction(656-667)ResponsesWebSearchToolCallAction(698-705)ResponsesLocalShellToolCallAction(913-920)ResponsesMCPApprovalRequestAction(941-947)ResponsesTool(1091-1109)ResponsesToolFunction(1402-1405)ResponsesToolFileSearch(1408-1413)ResponsesToolComputerUsePreview(1545-1551)ResponsesToolWebSearch(1554-1561)ResponsesToolMCP(1579-1588)ResponsesToolCodeInterpreter(1666-1668)ResponsesToolImageGeneration(1671-1682)ResponsesToolLocalShell(1691-1693)ResponsesToolCustom(1696-1698)ResponsesToolWebSearchPreview(1710-1713)core/schemas/utils.go (1)
Ptr(14-16)
core/schemas/responses.go (3)
core/schemas/json_native.go (1)
Unmarshal(18-20)core/schemas/json_wasm.go (1)
Unmarshal(22-24)core/providers/gemini/types.go (1)
Type(825-825)
core/providers/anthropic/responses.go (2)
core/providers/anthropic/types.go (1)
AnthropicToolTypeWebSearch20250305(368-368)core/schemas/responses.go (2)
ResponsesToolMessage(485-505)ResponsesWebSearchToolCallAction(698-705)
core/internal/testutil/web_search_tool.go (5)
core/schemas/responses.go (8)
ResponsesMessage(319-332)ResponsesTool(1091-1109)ResponsesToolWebSearch(1554-1561)ResponsesToolWebSearchUserLocation(1570-1576)ResponsesMessageTypeWebSearchCall(299-299)ResponsesToolWebSearchFilters(1564-1567)BifrostResponsesResponse(43-83)ResponsesMessageTypeMessage(295-295)core/schemas/utils.go (1)
Ptr(14-16)core/schemas/models.go (1)
Model(107-127)core/internal/testutil/test_retry_framework.go (5)
StreamingRetryConfig(871-893)TestRetryContext(174-179)WithResponsesStreamValidationRetry(2361-2567)ResponsesStreamValidationResult(2351-2357)WithResponsesTestRetry(443-593)core/schemas/context.go (1)
NewBifrostContext(48-71)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
- GitHub Check: Graphite / mergeability_check
🔇 Additional comments (11)
core/providers/openai/openai_test.go (1)
25-33: ChatModel switch togpt-4o: verify availability/cost is acceptable for this test suite.This likely aligns better with the expanded tool + web-search coverage, but please confirm
gpt-4ois enabled for all environments where these tests are expected to run (and that higher cost/latency vsgpt-4o-miniis acceptable). As per provider-test conventions, keepingclient.Shutdown()explicit at the end is consistent. Based on learnings, ...core/internal/testutil/tests.go (1)
24-42: WebSearchTool coverage is now much more complete; ensure each added runner is robustly self-gated.Since
RunAllComprehensiveTestsinvokes every scenario function, the added WebSearch subtests must reliably no-op/skip when unsupported (especiallyRunWebSearchToolMaxUsesTestbeing Anthropic-only) and use consistent retry/backoff so the suite doesn’t get flaky/slow. If you want clearer visibility, consider extendingprintTestSummaryto list the WebSearch sub-scenarios separately (optional).tests/integrations/python/tests/test_openai.py (1)
579-586: Removing provider-specificextra_bodyfrom the baseline streaming test reduces coupling/flakiness.This keeps
test_13_streamingfocused on streaming behavior (and avoids leaking provider/model-specific “reasoning hints” into a cross-provider path). Please just confirm no provider depended on that payload to exercise streaming deterministically.core/providers/openai/responses_test.go (2)
3-9:encoding/jsonimport for schema round-trip tests is appropriate here.
349-773: Good coverage forResponsesToolMessageActionStructunion dispatch; confirm the “unknown action => computer tool” fallback is intended.These tests are valuable and will protect the updated dispatch logic. One thing they implicitly encode (via the edge-case test) is that unknown
typevalues should unmarshal intoResponsesComputerToolCallAction; that’s fine if the goal is forward-compat, but it can also silently drop fields for non-computer unknown payloads—worth double-checking and documenting in-code.core/providers/anthropic/responses.go (2)
1506-1535: LGTM! Synthetic delta generation for web search at completion.The deferred delta generation correctly extracts the query from the web search action and generates synthetic
input_json_deltaevents. The nil checks are comprehensive, and the fallback behavior whenqueryJSONis empty (due to missing query or marshal error) gracefully skips delta generation.
1811-1820: LGTM! Conditional tool parameter handling.The loop correctly sets
Truncationfor computer tools andIncludefor web search tools independently. When both tool types are present, both parameters will be set across different iterations. This aligns with the intended behavior of making web search source inclusion conditional on tool presence.core/internal/testutil/web_search_tool.go (4)
170-333: Well-structured streaming test for web search.The streaming validation properly handles:
- 60-second timeout for the streaming context
- Channel closure detection with proper
gotopattern for breaking out of nested select- Multiple event type checks (
OutputItemAdded,WebSearchCallInProgress,WebSearchCallSearching,WebSearchCallCompleted,OutputTextDelta)- Clear error collection and validation result construction
430-528: Good coverage of context size variations.The test iterates through
low,medium, andhighcontext sizes with proper subtest isolation. The loop variable capture on line 445 is technically unnecessary in Go 1.22+ but doesn't cause any harm.
530-663: Multi-turn conversation test correctly appends history.The test properly builds conversation history by appending the first response's output to subsequent messages, validating that web search works in multi-turn contexts.
665-751: Anthropic-specific max uses test is well-scoped.Good practice to skip non-Anthropic providers early (line 673-676) since
MaxUsesis Anthropic-specific. The test correctly validates that the web search call count doesn't exceed the configured limit.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
281ec08 to
902893e
Compare
902893e to
2c8f77a
Compare
2c8f77a to
baa0c82
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/responses.go (1)
530-583: Fix: unreachablereturnafter theswitchwill fail compilation; also reset the union on unmarshal.
Becausedefaultalready returns, the finalreturn fmt.Errorf(...)is unreachable. Additionally,UnmarshalJSONshould clear other variants to avoid “multiple fields set” when reusing a struct.Proposed diff
func (action *ResponsesToolMessageActionStruct) UnmarshalJSON(data []byte) error { + // Reset union fields in case this struct is reused. + *action = ResponsesToolMessageActionStruct{} + // First, peek at the type field to determine which variant to unmarshal var typeStruct struct { Type string `json:"type"` } if err := Unmarshal(data, &typeStruct); err != nil { return fmt.Errorf("failed to peek at type field: %w", err) } // Based on the type, unmarshal into the appropriate variant switch typeStruct.Type { ... default: // use computer tool, as it can have many possible actions var computerToolCallAction ResponsesComputerToolCallAction if err := Unmarshal(data, &computerToolCallAction); err != nil { return fmt.Errorf("failed to unmarshal computer tool call action: %w", err) } action.ResponsesComputerToolCallAction = &computerToolCallAction return nil } - return fmt.Errorf("unknown action type: %s", typeStruct.Type) }
🤖 Fix all issues with AI agents
In `@core/internal/testutil/web_search_tool.go`:
- Around line 785-805: The suffix-stripping logic in matchesDomainPattern is
off-by-one when handling patterns that end with '*' — it always removes two
chars (pattern[:len(pattern)-2]) assuming "/*"; change it to remove only the
trailing '*' when the pattern ends with '*' but not "/*", and remove two chars
only when the pattern specifically ends with "/*". Update matchesDomainPattern
to distinguish pattern.HasSuffix("/*") versus pattern[len-1]=='*' and compute
prefix accordingly, then continue using containsSubstring(url, prefix).
🧹 Nitpick comments (5)
tests/integrations/python/tests/test_openai.py (2)
575-586: Add the trailing comma afterstream=True(and keep the call style consistent).
This is syntactically fine, but it’s an easy footgun for future diffs when adding kwargs.Proposed diff
- stream=True + stream=True, )
3153-3193: Test name/docstring says “wildcard domains”, but the config is now exact domains; also assert domain filtering actually happened.
With the updatedallowed_domains, consider renaming the test to avoid confusion, and add an assertion that every returnedsource.urlhost matches the allowlist (when sources are present), so regressions don’t pass with only prints.Proposed diff
def test_56_web_search_wildcard_domains(self, provider, model, vk_enabled): - """Test Case 56: Web search with wildcard domain patterns""" + """Test Case 56: Web search with allowed domain filtering""" ... # Collect search sources search_sources = [] for output_item in response.output: if hasattr(output_item, "type") and output_item.type == "web_search_call": if hasattr(output_item, "action") and hasattr(output_item.action, "sources"): if output_item.action.sources: search_sources.extend(output_item.action.sources) if len(search_sources) > 0: + from urllib.parse import urlparse + allowed = {"wikipedia.org", "en.wikipedia.org"} print(f"✓ Found {len(search_sources)} search sources") for i, source in enumerate(search_sources[:3]): if hasattr(source, "url"): print(f" Source {i+1}: {source.url}") + + for source in search_sources: + if hasattr(source, "url") and source.url: + host = (urlparse(source.url).hostname or "").lower() + assert host in allowed or host.endswith(".wikipedia.org"), f"Unexpected source host: {host}"core/providers/openai/responses_test.go (1)
348-773: Nice coverage for ResponsesToolMessageActionStruct variants; tighten the “unknown action type” assertion.
Right now the edge-case test only checks the computer-action pointer is non-nil; also asserting the decodedTypeequals"unknown_action"would catch accidental overwrites/defaulting.Proposed diff
t.Run("unknown action type - unmarshal to computer tool (default)", func(t *testing.T) { jsonData := `{"type":"unknown_action"}` var action schemas.ResponsesToolMessageActionStruct if err := json.Unmarshal([]byte(jsonData), &action); err != nil { t.Fatalf("failed to unmarshal: %v", err) } // Default behavior is to unmarshal to computer tool if action.ResponsesComputerToolCallAction == nil { t.Error("expected ResponsesComputerToolCallAction to be populated for unknown type") } + if action.ResponsesComputerToolCallAction.Type != "unknown_action" { + t.Errorf("expected type to be preserved as unknown_action, got %q", action.ResponsesComputerToolCallAction.Type) + } })core/internal/testutil/web_search_tool.go (2)
606-609: Consider creating a fresh slice to avoid potential aliasing.The
appendon line 608 could modifyfirstMessagesif it has spare capacity. While safe in the current code, explicitly creating a new slice is clearer:♻️ Suggested refactor
// Second turn - add first response to conversation history t.Log("🔄 Starting second turn...") - secondMessages := append(firstMessages, firstResponse.Output...) - secondMessages = append(secondMessages, CreateBasicResponsesMessage("What are the main types of renewable energy?")) + secondMessages := make([]schemas.ResponsesMessage, 0, len(firstMessages)+len(firstResponse.Output)+1) + secondMessages = append(secondMessages, firstMessages...) + secondMessages = append(secondMessages, firstResponse.Output...) + secondMessages = append(secondMessages, CreateBasicResponsesMessage("What are the main types of renewable energy?"))
807-825: Usestringspackage instead of custom implementations.The custom
toLowerimplementation has a bug with multibyte UTF-8 characters:len(s)returns byte count whilerange siterates by byte position, causing gaps in the result slice for non-ASCII characters.More importantly, Go's standard library provides well-tested, optimized versions of these functions.
♻️ Suggested refactor using standard library
+import "strings" + // containsSubstring checks if s contains substr (case-insensitive) func containsSubstring(s, substr string) bool { - s = toLower(s) - substr = toLower(substr) - return len(s) >= len(substr) && indexOfSubstring(s, substr) >= 0 -} - -// toLower converts string to lowercase -func toLower(s string) string { - result := make([]rune, len(s)) - for i, r := range s { - if r >= 'A' && r <= 'Z' { - result[i] = r + 32 - } else { - result[i] = r - } - } - return string(result) -} - -// indexOfSubstring finds index of substr in s, or -1 if not found -func indexOfSubstring(s, substr string) int { - if len(substr) == 0 { - return 0 - } - if len(substr) > len(s) { - return -1 - } - - for i := 0; i <= len(s)-len(substr); i++ { - if s[i:i+len(substr)] == substr { - return i - } - } - return -1 + return strings.Contains(strings.ToLower(s), strings.ToLower(substr)) }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
core/internal/testutil/tests.gocore/internal/testutil/web_search_tool.gocore/providers/anthropic/responses.gocore/providers/openai/openai_test.gocore/providers/openai/responses_test.gocore/schemas/responses.gotests/integrations/python/tests/test_openai.py
🚧 Files skipped from review as they are similar to previous changes (3)
- core/internal/testutil/tests.go
- core/providers/openai/openai_test.go
- core/providers/anthropic/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/schemas/responses.gotests/integrations/python/tests/test_openai.pycore/internal/testutil/web_search_tool.gocore/providers/openai/responses_test.go
🧬 Code graph analysis (2)
core/internal/testutil/web_search_tool.go (3)
core/schemas/responses.go (6)
ResponsesMessage(319-332)ResponsesTool(1091-1109)ResponsesToolWebSearch(1554-1561)ResponsesToolWebSearchUserLocation(1570-1576)ResponsesToolMessage(485-505)ResponsesToolWebSearchFilters(1564-1567)core/schemas/utils.go (1)
Ptr(14-16)core/internal/testutil/test_retry_framework.go (1)
StreamingRetryConfig(871-893)
core/providers/openai/responses_test.go (2)
core/schemas/responses.go (25)
ResponsesToolMessageActionStruct(507-512)ResponsesComputerToolCallAction(656-667)ResponsesWebSearchToolCallAction(698-705)ResponsesLocalShellToolCallAction(913-920)ResponsesMCPApprovalRequestAction(941-947)ResponsesTool(1091-1109)ResponsesToolTypeFunction(1078-1078)ResponsesToolFunction(1402-1405)ResponsesToolFileSearch(1408-1413)ResponsesToolComputerUsePreview(1545-1551)ResponsesToolTypeWebSearch(1081-1081)ResponsesToolWebSearch(1554-1561)ResponsesToolTypeMCP(1082-1082)ResponsesToolMCP(1579-1588)ResponsesToolTypeCodeInterpreter(1083-1083)ResponsesToolCodeInterpreter(1666-1668)ResponsesToolTypeImageGeneration(1084-1084)ResponsesToolImageGeneration(1671-1682)ResponsesToolTypeLocalShell(1085-1085)ResponsesToolLocalShell(1691-1693)ResponsesToolTypeCustom(1086-1086)ResponsesToolCustom(1696-1698)ResponsesToolTypeWebSearchPreview(1087-1087)ResponsesToolWebSearchPreview(1710-1713)ResponsesToolWebSearchUserLocation(1570-1576)core/schemas/utils.go (1)
Ptr(14-16)
🔇 Additional comments (8)
core/providers/openai/responses_test.go (3)
4-9:encoding/jsonimport addition is appropriate for the new round-trip tests.
775-1381: The ResponsesTool marshal/unmarshal round-trips are solid and match the schema’s “common fields + embedded tool struct” approach.
1382-1435: Helper comparison functions are fine for these tests and keep assertions readable.core/internal/testutil/web_search_tool.go (5)
170-333: LGTM!The streaming test implementation is comprehensive, covering web search call detection across multiple streaming event types (
OutputItemAdded,WebSearchCallInProgress,WebSearchCallSearching,WebSearchCallCompleted). The 60-second timeout and chunk count validation are appropriate safeguards.
351-427: LGTM!The domain filter configuration on line 356 now matches the validation call on line 423. The test correctly validates that web search sources respect the configured
AllowedDomains.
430-528: LGTM!The context size variations test is well-structured with appropriate subtests for each size level.
665-751: LGTM!The Anthropic-specific
MaxUsestest correctly validates the constraint and gracefully skips for other providers. The validation at lines 746-747 ensures the web search count stays within bounds.
753-783: Confirm: Domain filter mismatch is intentionally non-fatal?When a source URL doesn't match the allowed domain filters (lines 776-778), the function only logs a warning rather than failing the test. If this is intentional (e.g., providers may not strictly enforce domain filters), consider adding a comment to clarify. If strict enforcement is expected, this should assert failure.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Merge activity
|

Summary
Enhances web search tool functionality with comprehensive test coverage for various features including streaming, domain filtering, context sizes, multi-turn conversations, and max uses parameter.
Changes
Type of change
Affected areas
How to test
Run the comprehensive tests for web search tool functionality:
Breaking changes
Related issues
Improves test coverage for web search tool functionality.
Security considerations
No security implications. Tests use standard web search functionality with public information.
Checklist