Skip to content

Conversation

@TejasGhatte
Copy link
Collaborator

@TejasGhatte TejasGhatte commented Jan 15, 2026

Summary

Enhances web search tool functionality with comprehensive test coverage for various features including streaming, domain filtering, context sizes, multi-turn conversations, and max uses parameter.

Changes

  • Added five new comprehensive tests for web search tool capabilities:
    • Streaming web search responses
    • Domain filtering for search results
    • Different search context sizes (low/medium/high)
    • Multi-turn conversations with web search
    • Max uses parameter (Anthropic-specific)
  • Fixed Anthropic provider to properly handle web search streaming events
  • Added helper functions to validate web search sources and domain patterns
  • Updated OpenAI test configuration to use gpt-4o instead of gpt-4o-mini
  • Added JSON marshal/unmarshal tests for ResponsesToolMessageActionStruct and ResponsesTool

Type of change

  • Feature
  • Refactor

Affected areas

  • Core (Go)
  • Providers/Integrations

How to test

Run the comprehensive tests for web search tool functionality:

# Run all web search tool tests
go test ./core/internal/testutil -run TestRunWebSearchTool

# Run specific web search tool tests
go test ./core/internal/testutil -run TestRunWebSearchToolStreamTest
go test ./core/internal/testutil -run TestRunWebSearchToolWithDomainsTest
go test ./core/internal/testutil -run TestRunWebSearchToolContextSizesTest
go test ./core/internal/testutil -run TestRunWebSearchToolMultiTurnTest
go test ./core/internal/testutil -run TestRunWebSearchToolMaxUsesTest

Breaking changes

  • No

Related issues

Improves test coverage for web search tool functionality.

Security considerations

No security implications. Tests use standard web search functionality with public information.

Checklist

  • I added/updated tests where appropriate
  • I verified builds succeed (Go)
  • I verified the CI pipeline passes locally

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 15, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Tests

    • Added comprehensive Web Search Tool test coverage including streaming, domain filtering, context sizes, multi-turn conversations, and usage limits.
    • Expanded response schema validation tests for various tool actions and edge cases.
  • Bug Fixes

    • Improved web search response processing and delta generation logic for streaming scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds comprehensive web-search tool tests and domain utilities, moves web-search delta generation in Anthropic responses from content-start to completion, expands OpenAI response JSON round-trip tests, adjusts schema action unmarshalling, and updates a few test configurations.

Changes

Cohort / File(s) Summary
Web Search Tool Testing
core/internal/testutil/tests.go, core/internal/testutil/web_search_tool.go
Added five new public comprehensive web-search test functions (Stream, WithDomains, ContextSizes, MultiTurn, MaxUses) and ValidateWebSearchSources; added domain-matching and substring helpers; integrated scenarios into test runner.
Anthropic response flow
core/providers/anthropic/responses.go
Moved web_search_call query extraction and synthetic input_json_delta emission from ContentBlockStart to completion-time handling; conditional inclusion/truncation of web_search_call.action.sources in request construction.
OpenAI response tests
core/providers/openai/responses_test.go
Added large suite of JSON marshal/unmarshal tests covering ResponsesTool variants (computer tool, web search, shell, MCP, edge cases) and round-trip validations.
Schema action unmarshalling
core/schemas/responses.go
Reworked ResponsesToolMessageActionStruct.UnmarshalJSON to dispatch web-search and computer-tool actions in distinct branches with a computer-tool fallback for unknown types.
OpenAI provider tests
core/providers/openai/openai_test.go
Updated TestOpenAI to use ChatModel "gpt-4o" (replaced "gpt-4o-mini").
Python integration tests
tests/integrations/python/tests/test_openai.py
Removed extra_body reasoning hint from a streaming test call and updated allowed_domains in a web-search wildcard domains test.
Go module / manifest
manifest_file, go.mod
Small dependency/manifest adjustments (few lines changed).

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant Bifrost
participant AnthropicProvider
participant WebSearchTool
Note over Client,Bifrost: New flow — query extraction delayed until completion
Client->>Bifrost: send message (streaming)
Bifrost->>AnthropicProvider: forward streaming tokens / deltas
AnthropicProvider->>AnthropicProvider: accumulate ContentBlockStart deltas
AnthropicProvider->>AnthropicProvider: on completion, extract web_search_call query & sources
AnthropicProvider->>Bifrost: emit synthetic input_json_delta(s) (web_search_call)
Bifrost->>WebSearchTool: invoke web search with emitted query/sources
WebSearchTool-->>Bifrost: return search results
Bifrost-->>Client: deliver final content_block_stop and results

mermaid
sequenceDiagram
participant Client
participant Bifrost
participant AnthropicProvider
Note over Client,Bifrost: Prior flow — immediate per-delta synthetic emission
Client->>Bifrost: send streaming messages
Bifrost->>AnthropicProvider: forward tokens/deltas
AnthropicProvider->>Bifrost: on ContentBlockStart, emit per-delta synthetic input_json_delta(s)
Bifrost->>WebSearchTool: possibly invoke web search early
WebSearchTool-->>Bifrost: return (possibly repeated) results
Bifrost-->>Client: stream partial deltas/results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through code with glee,
added tests for search and sea,
delayed the deltas till the end,
matched domains like a friend,
streaming hops and JSON merry! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the main feature addition: comprehensive core tests for web search functionality and schema updates.
Description check ✅ Passed The description follows the template with all required sections completed: summary, changes, type of change, affected areas, how to test, breaking changes, and checklist items marked as completed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 01-15-feat_core_tests_for_web_search_schemas

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Collaborator Author

TejasGhatte commented Jan 15, 2026

@TejasGhatte TejasGhatte marked this pull request as ready for review January 15, 2026 12:36
@TejasGhatte TejasGhatte force-pushed the 01-15-feat_core_tests_for_web_search_schemas branch 2 times, most recently from 9f21e84 to 281ec08 Compare January 15, 2026 12:47
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/schemas/responses.go (1)

530-583: Fix unreachable code in ResponsesToolMessageActionStruct.UnmarshalJSON (compile blocker).
The switch returns in every branch (including default), so the return fmt.Errorf("unknown action type...") after the switch is unreachable and will not compile. Also consider resetting the union fields before setting the chosen variant.

Proposed fix
 func (action *ResponsesToolMessageActionStruct) UnmarshalJSON(data []byte) error {
+	// Reset union fields to avoid stale state if the struct is reused.
+	*action = ResponsesToolMessageActionStruct{}
+
 	// First, peek at the type field to determine which variant to unmarshal
 	var typeStruct struct {
 		Type string `json:"type"`
 	}
 	if err := Unmarshal(data, &typeStruct); err != nil {
 		return fmt.Errorf("failed to peek at type field: %w", err)
 	}
+	if typeStruct.Type == "" {
+		return fmt.Errorf("missing required action.type")
+	}

 	// Based on the type, unmarshal into the appropriate variant
 	switch typeStruct.Type {
 	// ...
 	default:
 		// use computer tool, as it can have many possible actions
 		var computerToolCallAction ResponsesComputerToolCallAction
 		if err := Unmarshal(data, &computerToolCallAction); err != nil {
 			return fmt.Errorf("failed to unmarshal computer tool call action: %w", err)
 		}
 		action.ResponsesComputerToolCallAction = &computerToolCallAction
 		return nil
 	}
-	return fmt.Errorf("unknown action type: %s", typeStruct.Type)
 }
core/providers/anthropic/responses.go (1)

1811-1818: Don’t clobber existing params.Include when enabling web_search sources.
params.Include = []string{"web_search_call.action.sources"} will overwrite any user-provided include from ExtraParams. Prefer append+dedupe, or only set it when empty.

Proposed fix
-			} else if tool.Type != nil && (*tool.Type == AnthropicToolTypeWebSearch20250305) {
-				params.Include = []string{"web_search_call.action.sources"}
+			} else if tool.Type != nil && (*tool.Type == AnthropicToolTypeWebSearch20250305) {
+				const inc = "web_search_call.action.sources"
+				found := false
+				for _, v := range params.Include {
+					if v == inc {
+						found = true
+						break
+					}
+				}
+				if !found {
+					params.Include = append(params.Include, inc)
+				}
 			}
core/providers/openai/openai_test.go (1)

25-33: ChatModel switch to gpt-4o: confirm this cost increase is intentional for CI.
gpt-4o is available and current (Jan 2025), but it costs ~16–17× more than gpt-4o-mini (input: $2.50/$0.15 per 1M tokens; output: $10.00/$0.60 per 1M tokens). Both models are fully available across OpenAI's Chat Completions endpoint with standard rate limits (tier-based RPM/TPM that scale with usage). Confirm this pricing tradeoff is intentional for test suite coverage.

🤖 Fix all issues with AI agents
In `@core/internal/testutil/web_search_tool.go`:
- Around line 351-359: The test creates webSearchTool with
ResponsesToolWebSearch.Filters.AllowedDomains set to
["wikipedia.org","en.wikipedia.org"] but later validates against patterns like
["wikipedia.org/*","*.edu"]; update the test to make the expected patterns match
the configured list by either (A) asserting the same AllowedDomains value is
used for validation, or (B) deriving wildcard patterns from AllowedDomains
consistently (e.g., map "wikipedia.org" → "wikipedia.org/*" or add "*.domain" as
needed) so the validation of ResponsesToolWebSearchFilters uses the same
transformation logic that produced the request—adjust the code around the
webSearchTool variable and the validation/assertion block to use the same
source/derivation for patterns.

In `@core/providers/openai/responses_test.go`:
- Around line 722-742: The test and implementation for
ResponsesToolMessageActionStruct currently default unknown types to
ResponsesComputerToolCallAction which risks silent misclassification; change the
custom unmarshalling logic on ResponsesToolMessageActionStruct to return an
error for unknown "type" values (mirroring the behavior of
ResponsesCodeInterpreterOutput and ResponsesToolFileSearchCompoundFilter),
update the test to expect an error when unmarshalling
`{"type":"unknown_action"}`, and adjust any callers/tests relying on the old
fallback; alternatively, if you prefer the fallback contract, add explicit
documentation in the struct comment and test asserting the documented behavior
instead of silent acceptance.
🧹 Nitpick comments (6)
core/internal/testutil/web_search_tool.go (3)

421-425: Domain-filter validation is too weak/incorrect for the scenario it claims to test.

  • Substring matching can accept evilwikipedia.org as “wikipedia.org”.
  • ValidateWebSearchSources only logs on mismatch, so the “domain filtering” test may never fail even if filtering regresses.
  • toLower can panic on non-ASCII strings due to []rune sized by len(s) (bytes).
Proposed fix (host-based matching + enforce at least one match)
 import (
 	"context"
+	"net/url"
 	"os"
+	"strings"
 	"testing"
 	"time"
@@
 func ValidateWebSearchSources(t *testing.T, sources []schemas.ResponsesWebSearchToolCallActionSearchSource, allowedDomains []string) {
 	require.NotEmpty(t, sources, "Sources should not be empty")
 
+	var matched int
 	for i, source := range sources {
 		// Validate basic structure
 		require.NotEmpty(t, source.URL, "Source %d should have a URL", i+1)
 
 		t.Logf("  Source %d: %s", i+1, source.URL)
 
 		// If domain filters specified, validate sources match patterns
 		if len(allowedDomains) > 0 {
 			matchesFilter := false
 			for _, domain := range allowedDomains {
-				if matchesDomainPattern(source.URL, domain) {
+				if matchesDomainPattern(source.URL, domain) {
 					matchesFilter = true
 					break
 				}
 			}
 
-			if !matchesFilter {
-				t.Logf("  ⚠️ Source %d (%s) doesn't match allowed domain filters", i+1, source.URL)
+			if matchesFilter {
+				matched++
 			}
 		}
 	}
 
+	if len(allowedDomains) > 0 {
+		require.Greater(t, matched, 0, "Expected at least 1 source to match allowed domain filters")
+	}
 	t.Logf("✅ Validated %d search sources", len(sources))
 }
 
 // matchesDomainPattern checks if a URL matches a domain pattern
 func matchesDomainPattern(rawURL, pattern string) bool {
-	// Simple pattern matching implementation
-	// "*.edu" matches URLs containing ".edu"
-	// "wikipedia.org/*" matches URLs containing "wikipedia.org"
+	u, err := url.Parse(rawURL)
+	if err != nil {
+		return false
+	}
+	host := strings.ToLower(u.Hostname())
+	p := strings.ToLower(pattern)
 
-	if len(pattern) > 0 && pattern[0] == '*' {
-		// Pattern like "*.edu"
-		suffix := pattern[1:]
-		return containsSubstring(url, suffix)
+	// "*.edu"
+	if strings.HasPrefix(p, "*.") {
+		return strings.HasSuffix(host, p[1:]) // keep leading ".edu"
 	}
 
-	if len(pattern) > 0 && pattern[len(pattern)-1] == '*' {
-		// Pattern like "wikipedia.org/*"
-		prefix := pattern[:len(pattern)-2]
-		return containsSubstring(url, prefix)
+	// "wikipedia.org/*"
+	if strings.HasSuffix(p, "/*") {
+		p = strings.TrimSuffix(p, "/*")
 	}
 
-	// Exact match
-	return containsSubstring(url, pattern)
-}
-
-// containsSubstring checks if s contains substr (case-insensitive)
-func containsSubstring(s, substr string) bool {
-	s = toLower(s)
-	substr = toLower(substr)
-	return len(s) >= len(substr) && indexOfSubstring(s, substr) >= 0
-}
-
-// toLower converts string to lowercase
-func toLower(s string) string {
-	result := make([]rune, len(s))
-	for i, r := range s {
-		if r >= 'A' && r <= 'Z' {
-			result[i] = r + 32
-		} else {
-			result[i] = r
-		}
-	}
-	return string(result)
-}
-
-// indexOfSubstring finds index of substr in s, or -1 if not found
-func indexOfSubstring(s, substr string) int {
-	if len(substr) == 0 {
-		return 0
-	}
-	if len(substr) > len(s) {
-		return -1
-	}
-
-	for i := 0; i <= len(s)-len(substr); i++ {
-		if s[i:i+len(substr)] == substr {
-			return i
-		}
-	}
-	return -1
+	// Exact host match (or subdomain match if pattern is a bare registrable/domain)
+	return host == p || strings.HasSuffix(host, "."+p)
 }

Also applies to: 753-842


451-456: Prefer bifrost.Ptr(...) for pointer fields in testutil.
Use bifrost.Ptr(size) / bifrost.Ptr(maxUses) instead of &size / &maxUses for consistency with repo conventions (and to avoid future pointer-to-loop-var footguns). Based on learnings, prefer bifrost.Ptr().

Also applies to: 687-694


148-169: Consider adopting the directory’s retry-config convention for new test scenarios.
This file introduces new scenario entry points but uses bespoke retry config helpers (WebSearchRetryConfig, StreamingRetryConfig). If core/internal/testutil standardizes on GetTestRetryConfigForScenario() + typed config composition, aligning here will keep behavior consistent across the comprehensive suite. Based on learnings, ...

Also applies to: 212-214

core/internal/testutil/tests.go (1)

24-91: Consider gating “live web” scenarios to reduce flakiness/time across the whole stack.
These new web-search scenarios are now part of the comprehensive runner; if CI runs this suite broadly, it may become more network-dependent/flaky across providers.

core/providers/openai/responses_test.go (2)

157-193: Add a nil-guard for result.Input to avoid panic in the test itself.
result.Input.OpenAIResponsesRequestInputArray will panic if result.Input is nil (making failures harder to diagnose).

Proposed diff
 			result := ToOpenAIResponsesRequest(bifrostReq)

 			if result == nil {
 				t.Fatal("ToOpenAIResponsesRequest returned nil")
 			}
+			if result.Input == nil {
+				t.Fatal("ToOpenAIResponsesRequest returned nil Input")
+			}

 			messageCount := len(result.Input.OpenAIResponsesRequestInputArray)

858-900: vector_store_ids:null expectation is brittle if tags ever move to omitempty.
If the goal is “file_search tool unmarshals without error and populates the right embedded struct”, asserting on null emission may be stricter than needed.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3697c69 and 281ec08.

📒 Files selected for processing (7)
  • core/internal/testutil/tests.go
  • core/internal/testutil/web_search_tool.go
  • core/providers/anthropic/responses.go
  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • tests/integrations/python/tests/test_openai.py
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/openai_test.go
  • core/schemas/responses.go
  • core/providers/openai/responses_test.go
  • tests/integrations/python/tests/test_openai.py
  • core/internal/testutil/tests.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
🧠 Learnings (19)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/schemas/responses.go
  • core/providers/openai/responses_test.go
  • core/internal/testutil/tests.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/schemas/responses.go
  • core/providers/openai/responses_test.go
  • core/internal/testutil/tests.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T04:40:11.480Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1312
File: framework/modelcatalog/pricing.go:276-426
Timestamp: 2026-01-14T04:40:11.480Z
Learning: In the Bifrost codebase, ImageUsage and other usage types guarantee that TotalTokens is populated (computed as InputTokens + OutputTokens if providers don’t supply TotalTokens). Reviewers can rely on this invariant and should not assume TotalTokens may be missing when input/output tokens exist. When implementing tiering logic or token-based decisions, you can safely use TotalTokens without extra null/zero guards, provided you’re in a context where InputTokens and OutputTokens are present. If a branch might discard tokens, ensure the invariant is preserved or add explicit checks only where the inputs are confirmed to be valid.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/schemas/responses.go
  • core/providers/openai/responses_test.go
  • core/internal/testutil/tests.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T13:30:28.760Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: plugins/semanticcache/test_utils.go:545-559
Timestamp: 2026-01-14T13:30:28.760Z
Learning: In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities, to improve consistency and readability. Replace occurrences of &value where a *T is expected with bifrost.Ptr(value) (or an equivalent call) and ensure the function is in scope and used correctly for the target pointer type.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/schemas/responses.go
  • core/providers/openai/responses_test.go
  • core/internal/testutil/tests.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:40:42.227Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:40:42.227Z
Learning: When handling unsupported operations across providers, avoid hardcoding provider constants (e.g., schemas.Bedrock). Use the provider.GetProviderKey() (or equivalent API) to obtain the actual provider key from configuration, ensuring errors and messages adapt to custom provider names. Apply this pattern to all core/provider implementations (not just bedrock) to improve configurability and maintainability.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-10T11:27:47.535Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1256
File: core/providers/openai/openai.go:2276-2385
Timestamp: 2026-01-10T11:27:47.535Z
Learning: Validate image generation requests for nil and missing prompts before dispatch. Follow the same pattern used here: core/bifrost.go validates nil/empty prompts, providerUtils.CheckContextAndGetRequestBody returns a structured error when the request converter yields nil, and apply this across all providers (including OpenAI) to avoid sending null bodies.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:49:14.252Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:49:14.252Z
Learning: In provider implementations, when raising unsupported-operation errors, pass provider.GetProviderKey() to NewUnsupportedOperationError so error messages and ExtraFields.Provider reflect the provider's alias. Apply this pattern consistently across all provider files (not just this one) to ensure accurate error context and branding.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T10:53:44.658Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1326
File: core/providers/gemini/gemini.go:1679-1754
Timestamp: 2026-01-14T10:53:44.658Z
Learning: Validate image generation inputs in core/bifrost.go before invoking any provider handler. Ensure in all provider implementations (e.g., core/providers/gemini/gemini.go) that the request and request.Input are non-nil before use, to prevent nil dereferences and provide clear error handling. Apply this invariant broadly to all providers and add tests for nil input scenarios.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-13T17:10:07.064Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1312
File: tests/integrations/python/tests/test_openai.py:1166-1258
Timestamp: 2026-01-13T17:10:07.064Z
Learning: In tests under tests/integrations/python, prefer using the OpenAI image generation model 'gpt-image-1' via the config key providers.openai.image_generation for image-generation scenarios. This avoids DALLE-3 parameter limitations (e.g., n>1, quality/size combos). Ensure tests reference this provider in mocks/fixtures and document why this choice is used for test determinism.

Applied to files:

  • tests/integrations/python/tests/test_openai.py
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.

Applied to files:

  • core/internal/testutil/tests.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-11T14:08:10.341Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1298
File: core/providers/anthropic/anthropic.go:682-699
Timestamp: 2026-01-11T14:08:10.341Z
Learning: In Anthroplic streaming implementations (and analogous providers), ensure that the final 'summary' chunk, which carries usage information and metadata, is emitted after all delta chunks and uses a chunk index of last_delta_index + 1. This differentiates the summary chunk from content delta chunks. Apply this convention consistently in the anthropic provider code and in similar streaming providers, and consider adding a targeted test to assert the ordering and chunk index logic.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-13T13:36:35.221Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/responses.go:937-937
Timestamp: 2026-01-13T13:36:35.221Z
Learning: In core/providers/anthropic/responses.go, when handling Anthropic API streaming responses, ensure that content_block_start events include a signature field set to an empty string (e.g., contentBlock.Signature = ""). The actual signature is delivered later via signature_delta events. This behavior is per Anthropic's specification and should not be treated as an error. This guideline should apply to all Anthropic response handling files under core/providers/anthropic/ and similar go files that process streaming blocks.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: For Anthropic citation types (page_location, char_location, content_block_location), ensure there is an optional string field file_id to reference uploaded files. Update the Go structs modeling these citations to include FileID *string (or string with omitempty) and document its optionality in comments, so code consuming these types can handle absence of file_id gracefully.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T11:16:33.414Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1241
File: core/providers/anthropic/utils.go:801-874
Timestamp: 2026-01-15T11:16:33.414Z
Learning: In the Anthropic provider (core/providers/anthropic), allow empty arrays for allowed_domains and blocked_domains in the tool arguments JSON. Update or confirm sanitization logic so it does not strip empty arrays when only one of the domain filter fields is present (i.e., if the list is empty but the field is specified, keep it). This applies to code in core/providers/anthropic/utils.go (and related files in the same package if they share the same sanitization behavior). Ensure tests cover cases with empty arrays and with both fields omitted.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: In core/providers/anthropic/types.go, ensure the web_search_result_location citation type includes a string field named 'url' alongside the existing fields 'encrypted_index', 'title', and 'cited_text'. If the field is missing, add it with type string and appropriate struct tags (e.g., json and/or db tags) and update any related serialization or usage accordingly.

Applied to files:

  • core/providers/anthropic/responses.go
🧬 Code graph analysis (5)
core/schemas/responses.go (3)
core/schemas/json_native.go (1)
  • Unmarshal (18-20)
core/schemas/json_wasm.go (1)
  • Unmarshal (22-24)
core/providers/gemini/types.go (1)
  • Type (825-825)
core/providers/openai/responses_test.go (1)
core/schemas/responses.go (26)
  • ResponsesToolMessageActionStruct (507-512)
  • ResponsesComputerToolCallAction (656-667)
  • ResponsesWebSearchToolCallAction (698-705)
  • ResponsesLocalShellToolCallAction (913-920)
  • ResponsesMCPApprovalRequestAction (941-947)
  • ResponsesTool (1091-1109)
  • ResponsesToolTypeFunction (1078-1078)
  • ResponsesToolFunction (1402-1405)
  • ResponsesToolTypeFileSearch (1079-1079)
  • ResponsesToolFileSearch (1408-1413)
  • ResponsesToolComputerUsePreview (1545-1551)
  • ResponsesToolTypeWebSearch (1081-1081)
  • ResponsesToolWebSearch (1554-1561)
  • ResponsesToolTypeMCP (1082-1082)
  • ResponsesToolMCP (1579-1588)
  • ResponsesToolTypeCodeInterpreter (1083-1083)
  • ResponsesToolCodeInterpreter (1666-1668)
  • ResponsesToolTypeImageGeneration (1084-1084)
  • ResponsesToolImageGeneration (1671-1682)
  • ResponsesToolTypeLocalShell (1085-1085)
  • ResponsesToolLocalShell (1691-1693)
  • ResponsesToolTypeCustom (1086-1086)
  • ResponsesToolCustom (1696-1698)
  • ResponsesToolTypeWebSearchPreview (1087-1087)
  • ResponsesToolWebSearchPreview (1710-1713)
  • ResponsesToolWebSearchUserLocation (1570-1576)
core/internal/testutil/tests.go (1)
core/internal/testutil/web_search_tool.go (5)
  • RunWebSearchToolStreamTest (172-333)
  • RunWebSearchToolWithDomainsTest (336-428)
  • RunWebSearchToolContextSizesTest (431-528)
  • RunWebSearchToolMultiTurnTest (531-663)
  • RunWebSearchToolMaxUsesTest (666-751)
core/providers/anthropic/responses.go (2)
core/providers/anthropic/types.go (1)
  • AnthropicToolTypeWebSearch20250305 (368-368)
core/schemas/responses.go (2)
  • ResponsesToolMessage (485-505)
  • ResponsesWebSearchToolCallAction (698-705)
core/internal/testutil/web_search_tool.go (5)
core/schemas/responses.go (9)
  • ResponsesMessage (319-332)
  • ResponsesTool (1091-1109)
  • ResponsesToolTypeWebSearch (1081-1081)
  • ResponsesToolWebSearch (1554-1561)
  • ResponsesToolWebSearchUserLocation (1570-1576)
  • BifrostResponsesStreamResponse (1789-1828)
  • ResponsesMessageTypeWebSearchCall (299-299)
  • ResponsesToolWebSearchFilters (1564-1567)
  • ResponsesMessageTypeMessage (295-295)
core/internal/testutil/utils.go (1)
  • CreateBasicResponsesMessage (276-284)
core/schemas/utils.go (1)
  • Ptr (14-16)
core/internal/testutil/test_retry_framework.go (4)
  • StreamingRetryConfig (871-893)
  • TestRetryContext (174-179)
  • WithResponsesStreamValidationRetry (2361-2567)
  • ResponsesStreamValidationResult (2351-2357)
core/schemas/bifrost.go (1)
  • BifrostStream (456-463)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (5)
core/schemas/responses.go (1)

557-581: Double-check the “unknown action => computer action” fallback is safe.
Routing any unrecognized action.type into ResponsesComputerToolCallAction can silently accept malformed or new tool actions and misclassify them, which may hide provider/schema drift. If this is intentional for forward-compat, consider logging/telemetry at the call site when typeStruct.Type isn’t in the known set.

core/providers/anthropic/responses.go (1)

1554-1562: Verify resultIndex = index+1 cannot collide with other Anthropic content block indices.
If a subsequent “real” content block is already assigned index+1, the synthetic web_search_tool_result block will conflict. If this is guaranteed not to happen due to how you map OpenAI output indices to Anthropic indices in this pathway, it’d help to codify that with a comment and a targeted test.

tests/integrations/python/tests/test_openai.py (1)

579-586: Streaming call simplified; ensure coverage for “reasoning params + streaming” exists elsewhere if still required.
Change seems reasonable if extra_body caused incompatibilities, but it also means this test no longer exercises that path.

core/providers/openai/responses_test.go (1)

1386-1435: mapsEqual / valuesEqual approach is solid for JSON round-trip assertions.
Nice way to avoid field-order flakes in these marshal tests.

core/internal/testutil/tests.go (1)

24-42: No issues found. All new test functions (RunWebSearchToolStreamTest, RunWebSearchToolWithDomainsTest, RunWebSearchToolContextSizesTest, RunWebSearchToolMultiTurnTest, RunWebSearchToolMaxUsesTest) have signatures that match TestScenarioFunc exactly. Function signatures in Go are determined by parameter types, not names, and all functions correctly accept (*testing.T, *bifrost.Bifrost, context.Context, ComprehensiveTestConfig) in order.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/schemas/responses.go (1)

530-583: Fix unreachable code in ResponsesToolMessageActionStruct.UnmarshalJSON (won’t compile as-is).

return fmt.Errorf("unknown action type...") at Line 582 is unreachable because all switch branches return, which should trigger a Go compiler error.

Proposed fix
 func (action *ResponsesToolMessageActionStruct) UnmarshalJSON(data []byte) error {
@@
 	default:
 		// use computer tool, as it can have many possible actions
 		var computerToolCallAction ResponsesComputerToolCallAction
 		if err := Unmarshal(data, &computerToolCallAction); err != nil {
 			return fmt.Errorf("failed to unmarshal computer tool call action: %w", err)
 		}
 		action.ResponsesComputerToolCallAction = &computerToolCallAction
 		return nil
 	}
-	return fmt.Errorf("unknown action type: %s", typeStruct.Type)
 }

Follow-up (optional): consider explicitly erroring when typeStruct.Type == "" and/or resetting other union pointers before setting the chosen variant to avoid silently accepting malformed payloads or accumulating state on reuse.

🤖 Fix all issues with AI agents
In `@core/internal/testutil/web_search_tool.go`:
- Around line 351-423: The test sets
webSearchTool.ResponsesToolWebSearch.Filters.AllowedDomains to ["wikipedia.org",
"en.wikipedia.org"] but ValidateWebSearchSources is called with patterns
["wikipedia.org/*", "*.edu"], which is inconsistent; update the validation call
to match the configured AllowedDomains (e.g., use ["wikipedia.org/*",
"en.wikipedia.org/*"] or equivalent patterns for "wikipedia.org" and
"en.wikipedia.org") so ValidateWebSearchSources(t, sources, ...) reflects the
AllowedDomains on the webSearchTool.
🧹 Nitpick comments (3)
core/providers/openai/responses_test.go (2)

775-1380: ResponsesTool marshal/unmarshal tests look solid; consider factoring out repeated “marshal+compare / unmarshal+assert” boilerplate (optional).

The scenarios cover a wide surface (cache_control, locations, different tool types) and should be useful regression protection. If this file keeps growing, a tiny helper like assertJSONRoundTrip(t, v, expectedJSON) would reduce duplication and make it easier to add new cases.


1386-1435: mapsEqual/valuesEqual can likely be simplified using reflect.DeepEqual for these decoded JSON maps (optional).

Since both sides are produced by json.Unmarshal into map[string]interface{}, reflect.DeepEqual(expected, actual) should already handle nested maps/slices reliably and would remove custom recursion.

core/internal/testutil/web_search_tool.go (1)

807-842: Prefer standard library functions over custom implementations.

These helper functions (containsSubstring, toLower, indexOfSubstring) duplicate functionality available in the strings package. The custom toLower also has a subtle bug with multi-byte UTF-8 characters—using len(s) returns byte count while range iteration yields byte indices, which can cause misaligned writes for non-ASCII input.

While URLs are typically ASCII (so this works in practice), using the standard library is cleaner and more maintainable.

♻️ Suggested refactor using stdlib
+import "strings"
+
 // matchesDomainPattern checks if a URL matches a domain pattern
 func matchesDomainPattern(url, pattern string) bool {
 	// Simple pattern matching implementation
 	// "*.edu" matches URLs containing ".edu"
 	// "wikipedia.org/*" matches URLs containing "wikipedia.org"

 	if len(pattern) > 0 && pattern[0] == '*' {
 		// Pattern like "*.edu"
 		suffix := pattern[1:]
-		return containsSubstring(url, suffix)
+		return strings.Contains(strings.ToLower(url), strings.ToLower(suffix))
 	}

 	if len(pattern) > 0 && pattern[len(pattern)-1] == '*' {
 		// Pattern like "wikipedia.org/*"
 		prefix := pattern[:len(pattern)-2]
-		return containsSubstring(url, prefix)
+		return strings.Contains(strings.ToLower(url), strings.ToLower(prefix))
 	}

 	// Exact match
-	return containsSubstring(url, pattern)
+	return strings.Contains(strings.ToLower(url), strings.ToLower(pattern))
 }
-
-// containsSubstring checks if s contains substr (case-insensitive)
-func containsSubstring(s, substr string) bool {
-	s = toLower(s)
-	substr = toLower(substr)
-	return len(s) >= len(substr) && indexOfSubstring(s, substr) >= 0
-}
-
-// toLower converts string to lowercase
-func toLower(s string) string {
-	result := make([]rune, len(s))
-	for i, r := range s {
-		if r >= 'A' && r <= 'Z' {
-			result[i] = r + 32
-		} else {
-			result[i] = r
-		}
-	}
-	return string(result)
-}
-
-// indexOfSubstring finds index of substr in s, or -1 if not found
-func indexOfSubstring(s, substr string) int {
-	if len(substr) == 0 {
-		return 0
-	}
-	if len(substr) > len(s) {
-		return -1
-	}
-
-	for i := 0; i <= len(s)-len(substr); i++ {
-		if s[i:i+len(substr)] == substr {
-			return i
-		}
-	}
-	return -1
-}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3697c69 and 281ec08.

📒 Files selected for processing (7)
  • core/internal/testutil/tests.go
  • core/internal/testutil/web_search_tool.go
  • core/providers/anthropic/responses.go
  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • tests/integrations/python/tests/test_openai.py
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/openai_test.go
  • core/internal/testutil/tests.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • core/providers/anthropic/responses.go
  • tests/integrations/python/tests/test_openai.py
  • core/internal/testutil/web_search_tool.go
🧠 Learnings (19)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/internal/testutil/tests.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/internal/testutil/tests.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T04:40:11.480Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1312
File: framework/modelcatalog/pricing.go:276-426
Timestamp: 2026-01-14T04:40:11.480Z
Learning: In the Bifrost codebase, ImageUsage and other usage types guarantee that TotalTokens is populated (computed as InputTokens + OutputTokens if providers don’t supply TotalTokens). Reviewers can rely on this invariant and should not assume TotalTokens may be missing when input/output tokens exist. When implementing tiering logic or token-based decisions, you can safely use TotalTokens without extra null/zero guards, provided you’re in a context where InputTokens and OutputTokens are present. If a branch might discard tokens, ensure the invariant is preserved or add explicit checks only where the inputs are confirmed to be valid.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/internal/testutil/tests.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-14T13:30:28.760Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: plugins/semanticcache/test_utils.go:545-559
Timestamp: 2026-01-14T13:30:28.760Z
Learning: In the maximhq/bifrost repository, prefer using bifrost.Ptr() to create pointers instead of the address operator (&) even when & would be valid syntactically. Apply this consistently across all code paths, including test utilities, to improve consistency and readability. Replace occurrences of &value where a *T is expected with bifrost.Ptr(value) (or an equivalent call) and ensure the function is in scope and used correctly for the target pointer type.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/internal/testutil/tests.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • core/providers/anthropic/responses.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:40:42.227Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:40:42.227Z
Learning: When handling unsupported operations across providers, avoid hardcoding provider constants (e.g., schemas.Bedrock). Use the provider.GetProviderKey() (or equivalent API) to obtain the actual provider key from configuration, ensuring errors and messages adapt to custom provider names. Apply this pattern to all core/provider implementations (not just bedrock) to improve configurability and maintainability.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-10T11:27:47.535Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1256
File: core/providers/openai/openai.go:2276-2385
Timestamp: 2026-01-10T11:27:47.535Z
Learning: Validate image generation requests for nil and missing prompts before dispatch. Follow the same pattern used here: core/bifrost.go validates nil/empty prompts, providerUtils.CheckContextAndGetRequestBody returns a structured error when the request converter yields nil, and apply this across all providers (including OpenAI) to avoid sending null bodies.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T07:49:14.252Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1326
File: core/providers/bedrock/bedrock.go:1347-1355
Timestamp: 2026-01-15T07:49:14.252Z
Learning: In provider implementations, when raising unsupported-operation errors, pass provider.GetProviderKey() to NewUnsupportedOperationError so error messages and ExtraFields.Provider reflect the provider's alias. Apply this pattern consistently across all provider files (not just this one) to ensure accurate error context and branding.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T10:53:44.658Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1326
File: core/providers/gemini/gemini.go:1679-1754
Timestamp: 2026-01-14T10:53:44.658Z
Learning: Validate image generation inputs in core/bifrost.go before invoking any provider handler. Ensure in all provider implementations (e.g., core/providers/gemini/gemini.go) that the request and request.Input are non-nil before use, to prevent nil dereferences and provide clear error handling. Apply this invariant broadly to all providers and add tests for nil input scenarios.

Applied to files:

  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/providers/anthropic/responses.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.

Applied to files:

  • core/internal/testutil/tests.go
  • core/internal/testutil/web_search_tool.go
📚 Learning: 2026-01-11T14:08:10.341Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1298
File: core/providers/anthropic/anthropic.go:682-699
Timestamp: 2026-01-11T14:08:10.341Z
Learning: In Anthroplic streaming implementations (and analogous providers), ensure that the final 'summary' chunk, which carries usage information and metadata, is emitted after all delta chunks and uses a chunk index of last_delta_index + 1. This differentiates the summary chunk from content delta chunks. Apply this convention consistently in the anthropic provider code and in similar streaming providers, and consider adding a targeted test to assert the ordering and chunk index logic.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-13T13:36:35.221Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/responses.go:937-937
Timestamp: 2026-01-13T13:36:35.221Z
Learning: In core/providers/anthropic/responses.go, when handling Anthropic API streaming responses, ensure that content_block_start events include a signature field set to an empty string (e.g., contentBlock.Signature = ""). The actual signature is delivered later via signature_delta events. This behavior is per Anthropic's specification and should not be treated as an error. This guideline should apply to all Anthropic response handling files under core/providers/anthropic/ and similar go files that process streaming blocks.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: For Anthropic citation types (page_location, char_location, content_block_location), ensure there is an optional string field file_id to reference uploaded files. Update the Go structs modeling these citations to include FileID *string (or string with omitempty) and document its optionality in comments, so code consuming these types can handle absence of file_id gracefully.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-15T11:16:33.414Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1241
File: core/providers/anthropic/utils.go:801-874
Timestamp: 2026-01-15T11:16:33.414Z
Learning: In the Anthropic provider (core/providers/anthropic), allow empty arrays for allowed_domains and blocked_domains in the tool arguments JSON. Update or confirm sanitization logic so it does not strip empty arrays when only one of the domain filter fields is present (i.e., if the list is empty but the field is specified, keep it). This applies to code in core/providers/anthropic/utils.go (and related files in the same package if they share the same sanitization behavior). Ensure tests cover cases with empty arrays and with both fields omitted.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-14T06:57:42.750Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1319
File: core/providers/anthropic/types.go:248-336
Timestamp: 2026-01-14T06:57:42.750Z
Learning: In core/providers/anthropic/types.go, ensure the web_search_result_location citation type includes a string field named 'url' alongside the existing fields 'encrypted_index', 'title', and 'cited_text'. If the field is missing, add it with type string and appropriate struct tags (e.g., json and/or db tags) and update any related serialization or usage accordingly.

Applied to files:

  • core/providers/anthropic/responses.go
📚 Learning: 2026-01-13T17:10:07.064Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1312
File: tests/integrations/python/tests/test_openai.py:1166-1258
Timestamp: 2026-01-13T17:10:07.064Z
Learning: In tests under tests/integrations/python, prefer using the OpenAI image generation model 'gpt-image-1' via the config key providers.openai.image_generation for image-generation scenarios. This avoids DALLE-3 parameter limitations (e.g., n>1, quality/size combos). Ensure tests reference this provider in mocks/fixtures and document why this choice is used for test determinism.

Applied to files:

  • tests/integrations/python/tests/test_openai.py
🧬 Code graph analysis (5)
core/internal/testutil/tests.go (1)
core/internal/testutil/web_search_tool.go (5)
  • RunWebSearchToolStreamTest (172-333)
  • RunWebSearchToolWithDomainsTest (336-428)
  • RunWebSearchToolContextSizesTest (431-528)
  • RunWebSearchToolMultiTurnTest (531-663)
  • RunWebSearchToolMaxUsesTest (666-751)
core/providers/openai/responses_test.go (2)
core/schemas/responses.go (16)
  • ResponsesToolMessageActionStruct (507-512)
  • ResponsesComputerToolCallAction (656-667)
  • ResponsesWebSearchToolCallAction (698-705)
  • ResponsesLocalShellToolCallAction (913-920)
  • ResponsesMCPApprovalRequestAction (941-947)
  • ResponsesTool (1091-1109)
  • ResponsesToolFunction (1402-1405)
  • ResponsesToolFileSearch (1408-1413)
  • ResponsesToolComputerUsePreview (1545-1551)
  • ResponsesToolWebSearch (1554-1561)
  • ResponsesToolMCP (1579-1588)
  • ResponsesToolCodeInterpreter (1666-1668)
  • ResponsesToolImageGeneration (1671-1682)
  • ResponsesToolLocalShell (1691-1693)
  • ResponsesToolCustom (1696-1698)
  • ResponsesToolWebSearchPreview (1710-1713)
core/schemas/utils.go (1)
  • Ptr (14-16)
core/schemas/responses.go (3)
core/schemas/json_native.go (1)
  • Unmarshal (18-20)
core/schemas/json_wasm.go (1)
  • Unmarshal (22-24)
core/providers/gemini/types.go (1)
  • Type (825-825)
core/providers/anthropic/responses.go (2)
core/providers/anthropic/types.go (1)
  • AnthropicToolTypeWebSearch20250305 (368-368)
core/schemas/responses.go (2)
  • ResponsesToolMessage (485-505)
  • ResponsesWebSearchToolCallAction (698-705)
core/internal/testutil/web_search_tool.go (5)
core/schemas/responses.go (8)
  • ResponsesMessage (319-332)
  • ResponsesTool (1091-1109)
  • ResponsesToolWebSearch (1554-1561)
  • ResponsesToolWebSearchUserLocation (1570-1576)
  • ResponsesMessageTypeWebSearchCall (299-299)
  • ResponsesToolWebSearchFilters (1564-1567)
  • BifrostResponsesResponse (43-83)
  • ResponsesMessageTypeMessage (295-295)
core/schemas/utils.go (1)
  • Ptr (14-16)
core/schemas/models.go (1)
  • Model (107-127)
core/internal/testutil/test_retry_framework.go (5)
  • StreamingRetryConfig (871-893)
  • TestRetryContext (174-179)
  • WithResponsesStreamValidationRetry (2361-2567)
  • ResponsesStreamValidationResult (2351-2357)
  • WithResponsesTestRetry (443-593)
core/schemas/context.go (1)
  • NewBifrostContext (48-71)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (11)
core/providers/openai/openai_test.go (1)

25-33: ChatModel switch to gpt-4o: verify availability/cost is acceptable for this test suite.

This likely aligns better with the expanded tool + web-search coverage, but please confirm gpt-4o is enabled for all environments where these tests are expected to run (and that higher cost/latency vs gpt-4o-mini is acceptable). As per provider-test conventions, keeping client.Shutdown() explicit at the end is consistent. Based on learnings, ...

core/internal/testutil/tests.go (1)

24-42: WebSearchTool coverage is now much more complete; ensure each added runner is robustly self-gated.

Since RunAllComprehensiveTests invokes every scenario function, the added WebSearch subtests must reliably no-op/skip when unsupported (especially RunWebSearchToolMaxUsesTest being Anthropic-only) and use consistent retry/backoff so the suite doesn’t get flaky/slow. If you want clearer visibility, consider extending printTestSummary to list the WebSearch sub-scenarios separately (optional).

tests/integrations/python/tests/test_openai.py (1)

579-586: Removing provider-specific extra_body from the baseline streaming test reduces coupling/flakiness.

This keeps test_13_streaming focused on streaming behavior (and avoids leaking provider/model-specific “reasoning hints” into a cross-provider path). Please just confirm no provider depended on that payload to exercise streaming deterministically.

core/providers/openai/responses_test.go (2)

3-9: encoding/json import for schema round-trip tests is appropriate here.


349-773: Good coverage for ResponsesToolMessageActionStruct union dispatch; confirm the “unknown action => computer tool” fallback is intended.

These tests are valuable and will protect the updated dispatch logic. One thing they implicitly encode (via the edge-case test) is that unknown type values should unmarshal into ResponsesComputerToolCallAction; that’s fine if the goal is forward-compat, but it can also silently drop fields for non-computer unknown payloads—worth double-checking and documenting in-code.

core/providers/anthropic/responses.go (2)

1506-1535: LGTM! Synthetic delta generation for web search at completion.

The deferred delta generation correctly extracts the query from the web search action and generates synthetic input_json_delta events. The nil checks are comprehensive, and the fallback behavior when queryJSON is empty (due to missing query or marshal error) gracefully skips delta generation.


1811-1820: LGTM! Conditional tool parameter handling.

The loop correctly sets Truncation for computer tools and Include for web search tools independently. When both tool types are present, both parameters will be set across different iterations. This aligns with the intended behavior of making web search source inclusion conditional on tool presence.

core/internal/testutil/web_search_tool.go (4)

170-333: Well-structured streaming test for web search.

The streaming validation properly handles:

  • 60-second timeout for the streaming context
  • Channel closure detection with proper goto pattern for breaking out of nested select
  • Multiple event type checks (OutputItemAdded, WebSearchCallInProgress, WebSearchCallSearching, WebSearchCallCompleted, OutputTextDelta)
  • Clear error collection and validation result construction

430-528: Good coverage of context size variations.

The test iterates through low, medium, and high context sizes with proper subtest isolation. The loop variable capture on line 445 is technically unnecessary in Go 1.22+ but doesn't cause any harm.


530-663: Multi-turn conversation test correctly appends history.

The test properly builds conversation history by appending the first response's output to subsequent messages, validating that web search works in multi-turn contexts.


665-751: Anthropic-specific max uses test is well-scoped.

Good practice to skip non-Anthropic providers early (line 673-676) since MaxUses is Anthropic-specific. The test correctly validates that the web search call count doesn't exceed the configured limit.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@TejasGhatte TejasGhatte force-pushed the 01-15-feat_core_tests_for_web_search_schemas branch from 281ec08 to 902893e Compare January 15, 2026 14:01
@akshaydeo akshaydeo changed the base branch from 01-05-fix_anthropic_type_citations to graphite-base/1339 January 15, 2026 14:15
@akshaydeo akshaydeo force-pushed the 01-15-feat_core_tests_for_web_search_schemas branch from 902893e to 2c8f77a Compare January 15, 2026 14:16
@graphite-app graphite-app bot changed the base branch from graphite-base/1339 to main January 15, 2026 14:16
@akshaydeo akshaydeo force-pushed the 01-15-feat_core_tests_for_web_search_schemas branch from 2c8f77a to baa0c82 Compare January 15, 2026 14:17
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/schemas/responses.go (1)

530-583: Fix: unreachable return after the switch will fail compilation; also reset the union on unmarshal.
Because default already returns, the final return fmt.Errorf(...) is unreachable. Additionally, UnmarshalJSON should clear other variants to avoid “multiple fields set” when reusing a struct.

Proposed diff
 func (action *ResponsesToolMessageActionStruct) UnmarshalJSON(data []byte) error {
+	// Reset union fields in case this struct is reused.
+	*action = ResponsesToolMessageActionStruct{}
+
 	// First, peek at the type field to determine which variant to unmarshal
 	var typeStruct struct {
 		Type string `json:"type"`
 	}
 	if err := Unmarshal(data, &typeStruct); err != nil {
 		return fmt.Errorf("failed to peek at type field: %w", err)
 	}

 	// Based on the type, unmarshal into the appropriate variant
 	switch typeStruct.Type {
 	...
 	default:
 		// use computer tool, as it can have many possible actions
 		var computerToolCallAction ResponsesComputerToolCallAction
 		if err := Unmarshal(data, &computerToolCallAction); err != nil {
 			return fmt.Errorf("failed to unmarshal computer tool call action: %w", err)
 		}
 		action.ResponsesComputerToolCallAction = &computerToolCallAction
 		return nil
 	}
-	return fmt.Errorf("unknown action type: %s", typeStruct.Type)
 }
🤖 Fix all issues with AI agents
In `@core/internal/testutil/web_search_tool.go`:
- Around line 785-805: The suffix-stripping logic in matchesDomainPattern is
off-by-one when handling patterns that end with '*' — it always removes two
chars (pattern[:len(pattern)-2]) assuming "/*"; change it to remove only the
trailing '*' when the pattern ends with '*' but not "/*", and remove two chars
only when the pattern specifically ends with "/*". Update matchesDomainPattern
to distinguish pattern.HasSuffix("/*") versus pattern[len-1]=='*' and compute
prefix accordingly, then continue using containsSubstring(url, prefix).
🧹 Nitpick comments (5)
tests/integrations/python/tests/test_openai.py (2)

575-586: Add the trailing comma after stream=True (and keep the call style consistent).
This is syntactically fine, but it’s an easy footgun for future diffs when adding kwargs.

Proposed diff
-        stream=True
+        stream=True,
         )

3153-3193: Test name/docstring says “wildcard domains”, but the config is now exact domains; also assert domain filtering actually happened.
With the updated allowed_domains, consider renaming the test to avoid confusion, and add an assertion that every returned source.url host matches the allowlist (when sources are present), so regressions don’t pass with only prints.

Proposed diff
 def test_56_web_search_wildcard_domains(self, provider, model, vk_enabled):
-    """Test Case 56: Web search with wildcard domain patterns"""
+    """Test Case 56: Web search with allowed domain filtering"""
     ...
     # Collect search sources
     search_sources = []
     for output_item in response.output:
         if hasattr(output_item, "type") and output_item.type == "web_search_call":
             if hasattr(output_item, "action") and hasattr(output_item.action, "sources"):
                 if output_item.action.sources:
                     search_sources.extend(output_item.action.sources)

     if len(search_sources) > 0:
+        from urllib.parse import urlparse
+        allowed = {"wikipedia.org", "en.wikipedia.org"}
         print(f"✓ Found {len(search_sources)} search sources")
         for i, source in enumerate(search_sources[:3]):
             if hasattr(source, "url"):
                 print(f"  Source {i+1}: {source.url}")
+
+        for source in search_sources:
+            if hasattr(source, "url") and source.url:
+                host = (urlparse(source.url).hostname or "").lower()
+                assert host in allowed or host.endswith(".wikipedia.org"), f"Unexpected source host: {host}"
core/providers/openai/responses_test.go (1)

348-773: Nice coverage for ResponsesToolMessageActionStruct variants; tighten the “unknown action type” assertion.
Right now the edge-case test only checks the computer-action pointer is non-nil; also asserting the decoded Type equals "unknown_action" would catch accidental overwrites/defaulting.

Proposed diff
 	t.Run("unknown action type - unmarshal to computer tool (default)", func(t *testing.T) {
 		jsonData := `{"type":"unknown_action"}`
 		var action schemas.ResponsesToolMessageActionStruct
 		if err := json.Unmarshal([]byte(jsonData), &action); err != nil {
 			t.Fatalf("failed to unmarshal: %v", err)
 		}

 		// Default behavior is to unmarshal to computer tool
 		if action.ResponsesComputerToolCallAction == nil {
 			t.Error("expected ResponsesComputerToolCallAction to be populated for unknown type")
 		}
+		if action.ResponsesComputerToolCallAction.Type != "unknown_action" {
+			t.Errorf("expected type to be preserved as unknown_action, got %q", action.ResponsesComputerToolCallAction.Type)
+		}
 	})
core/internal/testutil/web_search_tool.go (2)

606-609: Consider creating a fresh slice to avoid potential aliasing.

The append on line 608 could modify firstMessages if it has spare capacity. While safe in the current code, explicitly creating a new slice is clearer:

♻️ Suggested refactor
 		// Second turn - add first response to conversation history
 		t.Log("🔄 Starting second turn...")
-		secondMessages := append(firstMessages, firstResponse.Output...)
-		secondMessages = append(secondMessages, CreateBasicResponsesMessage("What are the main types of renewable energy?"))
+		secondMessages := make([]schemas.ResponsesMessage, 0, len(firstMessages)+len(firstResponse.Output)+1)
+		secondMessages = append(secondMessages, firstMessages...)
+		secondMessages = append(secondMessages, firstResponse.Output...)
+		secondMessages = append(secondMessages, CreateBasicResponsesMessage("What are the main types of renewable energy?"))

807-825: Use strings package instead of custom implementations.

The custom toLower implementation has a bug with multibyte UTF-8 characters: len(s) returns byte count while range s iterates by byte position, causing gaps in the result slice for non-ASCII characters.

More importantly, Go's standard library provides well-tested, optimized versions of these functions.

♻️ Suggested refactor using standard library
+import "strings"
+
 // containsSubstring checks if s contains substr (case-insensitive)
 func containsSubstring(s, substr string) bool {
-	s = toLower(s)
-	substr = toLower(substr)
-	return len(s) >= len(substr) && indexOfSubstring(s, substr) >= 0
-}
-
-// toLower converts string to lowercase
-func toLower(s string) string {
-	result := make([]rune, len(s))
-	for i, r := range s {
-		if r >= 'A' && r <= 'Z' {
-			result[i] = r + 32
-		} else {
-			result[i] = r
-		}
-	}
-	return string(result)
-}
-
-// indexOfSubstring finds index of substr in s, or -1 if not found
-func indexOfSubstring(s, substr string) int {
-	if len(substr) == 0 {
-		return 0
-	}
-	if len(substr) > len(s) {
-		return -1
-	}
-
-	for i := 0; i <= len(s)-len(substr); i++ {
-		if s[i:i+len(substr)] == substr {
-			return i
-		}
-	}
-	return -1
+	return strings.Contains(strings.ToLower(s), strings.ToLower(substr))
 }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 281ec08 and baa0c82.

📒 Files selected for processing (7)
  • core/internal/testutil/tests.go
  • core/internal/testutil/web_search_tool.go
  • core/providers/anthropic/responses.go
  • core/providers/openai/openai_test.go
  • core/providers/openai/responses_test.go
  • core/schemas/responses.go
  • tests/integrations/python/tests/test_openai.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/internal/testutil/tests.go
  • core/providers/openai/openai_test.go
  • core/providers/anthropic/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/schemas/responses.go
  • tests/integrations/python/tests/test_openai.py
  • core/internal/testutil/web_search_tool.go
  • core/providers/openai/responses_test.go
🧬 Code graph analysis (2)
core/internal/testutil/web_search_tool.go (3)
core/schemas/responses.go (6)
  • ResponsesMessage (319-332)
  • ResponsesTool (1091-1109)
  • ResponsesToolWebSearch (1554-1561)
  • ResponsesToolWebSearchUserLocation (1570-1576)
  • ResponsesToolMessage (485-505)
  • ResponsesToolWebSearchFilters (1564-1567)
core/schemas/utils.go (1)
  • Ptr (14-16)
core/internal/testutil/test_retry_framework.go (1)
  • StreamingRetryConfig (871-893)
core/providers/openai/responses_test.go (2)
core/schemas/responses.go (25)
  • ResponsesToolMessageActionStruct (507-512)
  • ResponsesComputerToolCallAction (656-667)
  • ResponsesWebSearchToolCallAction (698-705)
  • ResponsesLocalShellToolCallAction (913-920)
  • ResponsesMCPApprovalRequestAction (941-947)
  • ResponsesTool (1091-1109)
  • ResponsesToolTypeFunction (1078-1078)
  • ResponsesToolFunction (1402-1405)
  • ResponsesToolFileSearch (1408-1413)
  • ResponsesToolComputerUsePreview (1545-1551)
  • ResponsesToolTypeWebSearch (1081-1081)
  • ResponsesToolWebSearch (1554-1561)
  • ResponsesToolTypeMCP (1082-1082)
  • ResponsesToolMCP (1579-1588)
  • ResponsesToolTypeCodeInterpreter (1083-1083)
  • ResponsesToolCodeInterpreter (1666-1668)
  • ResponsesToolTypeImageGeneration (1084-1084)
  • ResponsesToolImageGeneration (1671-1682)
  • ResponsesToolTypeLocalShell (1085-1085)
  • ResponsesToolLocalShell (1691-1693)
  • ResponsesToolTypeCustom (1086-1086)
  • ResponsesToolCustom (1696-1698)
  • ResponsesToolTypeWebSearchPreview (1087-1087)
  • ResponsesToolWebSearchPreview (1710-1713)
  • ResponsesToolWebSearchUserLocation (1570-1576)
core/schemas/utils.go (1)
  • Ptr (14-16)
🔇 Additional comments (8)
core/providers/openai/responses_test.go (3)

4-9: encoding/json import addition is appropriate for the new round-trip tests.


775-1381: The ResponsesTool marshal/unmarshal round-trips are solid and match the schema’s “common fields + embedded tool struct” approach.


1382-1435: Helper comparison functions are fine for these tests and keep assertions readable.

core/internal/testutil/web_search_tool.go (5)

170-333: LGTM!

The streaming test implementation is comprehensive, covering web search call detection across multiple streaming event types (OutputItemAdded, WebSearchCallInProgress, WebSearchCallSearching, WebSearchCallCompleted). The 60-second timeout and chunk count validation are appropriate safeguards.


351-427: LGTM!

The domain filter configuration on line 356 now matches the validation call on line 423. The test correctly validates that web search sources respect the configured AllowedDomains.


430-528: LGTM!

The context size variations test is well-structured with appropriate subtests for each size level.


665-751: LGTM!

The Anthropic-specific MaxUses test correctly validates the constraint and gracefully skips for other providers. The validation at lines 746-747 ensures the web search count stays within bounds.


753-783: Confirm: Domain filter mismatch is intentionally non-fatal?

When a source URL doesn't match the allowed domain filters (lines 776-778), the function only logs a warning rather than failing the test. If this is intentional (e.g., providers may not strictly enforce domain filters), consider adding a comment to clarify. If strict enforcement is expected, this should assert failure.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Contributor

akshaydeo commented Jan 15, 2026

Merge activity

  • Jan 15, 4:05 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jan 15, 4:06 PM UTC: @akshaydeo merged this pull request with Graphite.

@akshaydeo akshaydeo merged commit 5adf6f6 into main Jan 15, 2026
9 checks passed
@akshaydeo akshaydeo deleted the 01-15-feat_core_tests_for_web_search_schemas branch January 15, 2026 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants