feat: add batch embedding support for Gemini and improve cross-provider compatibility --skip-pipeline #1138
Conversation
Caution: Review failed. The pull request is closed.

📝 Walkthrough / Summary by CodeRabbit

Switch Gemini embeddings to a per-text batched request model, add Gemini↔Bifrost embedding converters and new Gemini batch types, implement a Gemini-specific HTTP embedding flow, extend Cohere usage parsing, add and renumber LangChain/OpenAI integration tests, and insert a duplicated Cross-Provider Embeddings docs section.

Changes
Sequence Diagram(s)

sequenceDiagram
autonumber
actor Client
participant BifrostServer
participant GeminiProvider
participant GeminiAPI
Note over Client,BifrostServer: Client sends BifrostEmbeddingRequest
Client->>BifrostServer: POST /embeddings (BifrostEmbeddingRequest)
BifrostServer->>GeminiProvider: ToGeminiEmbeddingRequest -> GeminiBatchEmbeddingRequest (one item per text)
GeminiProvider->>GeminiAPI: POST /v1/batchEmbedContents (batch payload + headers)
GeminiAPI-->>GeminiProvider: 200 OK + GeminiEmbeddingResponse (per-item embeddings, metadata)
GeminiProvider->>BifrostServer: ToBifrostEmbeddingResponse (convert embeddings, usage, metadata, latency, raw)
BifrostServer-->>Client: 200 OK (BifrostEmbeddingResponse)
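For orientation, here is a minimal Go sketch of the kind of batch payload this flow produces (one request item per input text). The struct and JSON field names mirror the public Gemini batchEmbedContents wire format and are illustrative only, not the PR's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal stand-ins for the batchEmbedContents wire format; field names are assumptions.
type part struct {
	Text string `json:"text"`
}

type content struct {
	Parts []part `json:"parts"`
}

type embedRequest struct {
	Model   string  `json:"model"`
	Content content `json:"content"`
}

type batchEmbedRequest struct {
	Requests []embedRequest `json:"requests"`
}

func main() {
	texts := []string{"hello world", "bifrost batches embeddings"}

	// One request item per input text, all pointing at the same model.
	batch := batchEmbedRequest{}
	for _, t := range texts {
		batch.Requests = append(batch.Requests, embedRequest{
			Model:   "models/text-embedding-004",
			Content: content{Parts: []part{{Text: t}}},
		})
	}

	payload, _ := json.MarshalIndent(batch, "", "  ")
	fmt.Println(string(payload))
}
```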
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
📜 Recent review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
🧪 Test Suite Available: This PR can be tested by a repository admin.
Actionable comments posted: 2
🧹 Nitpick comments (1)
core/providers/gemini/embedding.go (1)
138-158: Document or validate the assumption that all requests share identical parameters.

The code takes parameters from only the first request (`request.Requests[0]`). The comment at line 117 states that multiple requests have "same parameters but different text fields," but this assumption is not validated. Consider adding a code comment to explicitly document this assumption, or add validation to ensure all requests have consistent parameters:

    // Extract parameters from the first request.
    // Assumption: All requests in the batch share identical parameters (dimensions, taskType, title).
    embeddingRequest := request.Requests[0]

Alternatively, if strict validation is desired:

    // TODO: Validate that all requests have consistent parameters
    embeddingRequest := request.Requests[0]
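Such a validation pass could look roughly like the sketch below. The struct is a simplified stand-in for the PR's GeminiEmbeddingRequest, and validateConsistentParams / ptrEq are hypothetical helpers, not code from this PR.

```go
package main

import "fmt"

// Simplified stand-in for GeminiEmbeddingRequest; only the shared parameters matter here.
type GeminiEmbeddingRequest struct {
	Dimensions *int
	TaskType   *string
	Title      *string
}

// ptrEq compares two optional values, treating two nils as equal.
func ptrEq[T comparable](a, b *T) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// validateConsistentParams rejects batches whose items disagree on shared parameters.
func validateConsistentParams(reqs []GeminiEmbeddingRequest) error {
	if len(reqs) < 2 {
		return nil
	}
	first := reqs[0]
	for i, r := range reqs[1:] {
		if !ptrEq(r.Dimensions, first.Dimensions) ||
			!ptrEq(r.TaskType, first.TaskType) ||
			!ptrEq(r.Title, first.Title) {
			return fmt.Errorf("embedding request %d differs from the first request's parameters", i+1)
		}
	}
	return nil
}

func main() {
	dims := 768
	reqs := []GeminiEmbeddingRequest{{Dimensions: &dims}, {Dimensions: &dims}}
	fmt.Println(validateConsistentParams(reqs)) // <nil>
}
```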
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- core/providers/gemini/embedding.go (1 hunks)
- docs/integrations/langchain-sdk.mdx (1 hunks)
- tests/integrations/tests/test_langchain.py (23 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/embedding.go
- docs/integrations/langchain-sdk.mdx
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/gemini/embedding.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/gemini/embedding.go
🧬 Code graph analysis (1)
core/providers/gemini/embedding.go (2)
core/providers/gemini/types.go (1)
  - Content (977-985)
core/schemas/embedding.go (1)
  - EmbeddingInput (31-36)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_langchain.py
401-401: Do not catch blind exception: Exception
(BLE001)
408-408: Unused method argument: test_config
(ARG002)
446-446: Unused method argument: test_config
(ARG002)
482-482: Unused method argument: test_config
(ARG002)
511-511: Unused method argument: test_config
(ARG002)
541-541: Unused method argument: test_config
(ARG002)
578-578: Unused method argument: test_config
(ARG002)
610-610: Unused method argument: test_config
(ARG002)
644-644: Unused method argument: test_config
(ARG002)
669-669: Unused method argument: test_config
(ARG002)
692-692: Unused method argument: test_config
(ARG002)
751-751: Unused method argument: test_config
(ARG002)
814-814: Unused method argument: test_config
(ARG002)
849-849: Unused method argument: test_config
(ARG002)
884-884: Unused method argument: test_config
(ARG002)
979-979: Unused method argument: test_config
(ARG002)
1014-1014: Unused method argument: test_config
(ARG002)
1043-1043: Unused method argument: test_config
(ARG002)
1161-1161: Unused method argument: test_config
(ARG002)
1198-1198: Unused method argument: test_config
(ARG002)
1238-1238: Unused method argument: test_config
(ARG002)
1285-1285: Unused method argument: test_config
(ARG002)
1328-1328: Unused method argument: test_config
(ARG002)
🔇 Additional comments (2)
tests/integrations/tests/test_langchain.py (1)
58-58: LGTM! The import of `GoogleGenerativeAIEmbeddings` is correctly added to support the new cross-provider embeddings test.

docs/integrations/langchain-sdk.mdx (1)
548-567: No action needed. The Cross-Provider Embeddings section appears only once in the file (line 548), so no duplication exists. The documentation block does not require consolidation.
0a3ff50 to 240b3df
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/integrations/tests/test_langchain.py (1)
374-402: Handle missing GEMINI_API_KEY environment variable gracefully.

Direct access to `os.environ["GEMINI_API_KEY"]` at line 380 will raise `KeyError` if the environment variable is not set, preventing the test from skipping gracefully.

🔎 Proposed fix:

     @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("embeddings"))
     def test_04_gemini_embeddings_basic(self, provider, model):
         """Test Case 4: Basic Gemini embeddings functionality"""
    +    # Check if API key is available
    +    if "GEMINI_API_KEY" not in os.environ:
    +        pytest.skip("GEMINI_API_KEY environment variable not set")
    +
         try:
             embeddings = GoogleGenerativeAIEmbeddings(
                 model=format_provider_model(provider, model),
                 api_key=os.environ["GEMINI_API_KEY"],

Based on past review comments.
🧹 Nitpick comments (1)
core/providers/gemini/embedding.go (1)
117-136: Move Input assignment outside the loop to avoid redundant operations.

The `bifrostReq.Input` assignment block (lines 127-134) executes inside the request loop, causing repeated allocations and assignments. While the final result is correct, this is inefficient when processing multiple requests.

🔎 Proposed fix:

    // sdk request contains multiple embedding requests with same parameters but different text fields
    if len(request.Requests) > 0 {
        var texts []string
        for _, req := range request.Requests {
            if req.Content != nil && len(req.Content.Parts) > 0 {
                for _, part := range req.Content.Parts {
                    if part != nil && part.Text != "" {
                        texts = append(texts, part.Text)
                    }
                }
    -           if len(texts) > 0 {
    -               bifrostReq.Input = &schemas.EmbeddingInput{}
    -               if len(texts) == 1 {
    -                   bifrostReq.Input.Text = &texts[0]
    -               } else {
    -                   bifrostReq.Input.Texts = texts
    -               }
    -           }
            }
        }
    +   if len(texts) > 0 {
    +       bifrostReq.Input = &schemas.EmbeddingInput{}
    +       if len(texts) == 1 {
    +           bifrostReq.Input.Text = &texts[0]
    +       } else {
    +           bifrostReq.Input.Texts = texts
    +       }
    +   }

Based on past review comments.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- core/providers/gemini/embedding.go (1 hunks)
- docs/integrations/langchain-sdk.mdx (1 hunks)
- tests/integrations/tests/test_langchain.py (23 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/embedding.go
- docs/integrations/langchain-sdk.mdx
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/gemini/embedding.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/gemini/embedding.go
🧬 Code graph analysis (2)
core/providers/gemini/embedding.go (2)
core/providers/gemini/types.go (1)
  - Content (977-985)
core/schemas/embedding.go (1)
  - EmbeddingInput (31-36)

tests/integrations/tests/test_langchain.py (3)

core/providers/gemini/types.go (1)
  - Modality (710-710)
tests/integrations/tests/utils/parametrize.py (2)
  - get_cross_provider_params_for_scenario (12-47)
  - format_provider_model (50-65)
tests/integrations/tests/utils/config_loader.py (2)
  - get_integration_url (112-131)
  - get_integration_url (439-440)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_langchain.py
400-400: Do not catch blind exception: Exception
(BLE001)
407-407: Unused method argument: test_config
(ARG002)
445-445: Unused method argument: test_config
(ARG002)
481-481: Unused method argument: test_config
(ARG002)
510-510: Unused method argument: test_config
(ARG002)
540-540: Unused method argument: test_config
(ARG002)
577-577: Unused method argument: test_config
(ARG002)
609-609: Unused method argument: test_config
(ARG002)
643-643: Unused method argument: test_config
(ARG002)
668-668: Unused method argument: test_config
(ARG002)
691-691: Unused method argument: test_config
(ARG002)
750-750: Unused method argument: test_config
(ARG002)
813-813: Unused method argument: test_config
(ARG002)
848-848: Unused method argument: test_config
(ARG002)
883-883: Unused method argument: test_config
(ARG002)
978-978: Unused method argument: test_config
(ARG002)
1013-1013: Unused method argument: test_config
(ARG002)
1042-1042: Unused method argument: test_config
(ARG002)
1160-1160: Unused method argument: test_config
(ARG002)
1197-1197: Unused method argument: test_config
(ARG002)
1237-1237: Unused method argument: test_config
(ARG002)
1284-1284: Unused method argument: test_config
(ARG002)
1327-1327: Unused method argument: test_config
(ARG002)
🔇 Additional comments (1)
docs/integrations/langchain-sdk.mdx (1)
548-568: No action needed. The Cross-Provider Embeddings section appears only once in the file at lines 548-568. There is no documentation duplication.

Likely an incorrect or invalid review comment.
240b3df to f37a8f8
Actionable comments posted: 1
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/integrations/tests/test_langchain.py (1)
407-1385: Remove unused Modality import and clarify test_config fixture usage

Two minor issues:

Unused import: `Modality` is imported from `langchain_google_genai` at line 58 but never used. Remove it.

Pytest fixture pattern: Multiple test methods accept `test_config` as a parameter but don't reference it in their method bodies (e.g., `test_05_function_calling_tools`, `test_06_llm_chain_basic`, etc.). This is expected behavior for pytest fixtures: they're used implicitly through pytest's dependency injection. However, if linters flag this with ARG002, you can silence it by:

- Renaming to `_test_config`, or
- Using `@pytest.mark.usefixtures("test_config")` and removing the parameter, or
- Adding `# noqa: ARG002` to affected methods.

The rest of the changes (renumbered tests, thinking/structured-output coverage) are well-structured and consistent with existing patterns.
tests/integrations/tests/test_openai.py (1)
858-999: Embedding test parametrization is solid; address unused fixture and tighten provider assertions

The cross-provider parametrization for embedding tests (21–24, 26) works well with the configuration-driven scenario filtering. However, there are two actionable improvements:

Unused `test_config` fixture arguments: Ruff is correct to flag `test_config` as unused in tests 21–24 and 26. These tests never reference the fixture. Either drop it from the signatures or rename to `_test_config` to satisfy linters while keeping the fixture hook active.

Usage assertions special-cased for Gemini: In `test_26_embedding_long_text`, the pattern `if provider != "gemini"` assumes every non-Gemini provider (openai, bedrock, cohere) returns usage data. This is currently safe since only Gemini is configured to lack usage, but the guard could become brittle as more providers are added. A more robust approach would be to maintain an explicit allowlist of providers known to return usage (e.g., `{"openai", "bedrock", "cohere"}`) or derive this from a provider capability flag in config, similar to how scenarios work.

The fixed `dimensions=1536` assumption across parametrized tests is fine: the config confirms all configured embedding providers support 1536-dim vectors.
♻️ Duplicate comments (2)
tests/integrations/tests/test_langchain.py (1)
58-59: Make Gemini embeddings test resilient to missing `GEMINI_API_KEY` and align with other LangChain tests

`test_04_gemini_embeddings_basic` is a useful cross-provider embeddings smoke test, but the way the API key is handled is brittle:

- `api_key=os.environ["GEMINI_API_KEY"]` will raise a `KeyError` if the env var isn't set.
- Other Gemini/LangChain tests in this file rely on dummy keys or explicit skipping rather than requiring real provider keys, since Bifrost handles auth.

To keep this consistent with the rest of the suite and avoid hard failures in CI:

Either treat the key as optional and skip when it's missing:

    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        pytest.skip("GEMINI_API_KEY environment variable not set")
    embeddings = GoogleGenerativeAIEmbeddings(
        model=format_provider_model(provider, model),
        api_key=api_key,
        base_url=get_integration_url("langchain") or None,
    )

Or, if Bifrost doesn't actually need a real Gemini key here, just pass the same dummy key pattern used elsewhere in this file and drop the env dependency entirely.

Also, the new import of `Modality` from `langchain_google_genai` appears unused and could be removed to keep imports tidy.

Also applies to: 374-402
core/providers/gemini/embedding.go (1)
175-194: Critical: Input assignment block still inside the loop despite past review.

Lines 185-192 are executed on every iteration where `req.Content != nil`, causing redundant assignments to `bifrostReq.Input`. The `texts` slice correctly accumulates across all requests, but the Input assignment should occur once after the loop completes, not inside the per-request condition.

This was flagged in a previous review and marked as addressed, but the fix is not present in the current code.

🔎 Move Input assignment outside the loop:

    // sdk request contains multiple embedding requests with same parameters but different text fields
    if len(request.Requests) > 0 {
        var texts []string
        for _, req := range request.Requests {
            if req.Content != nil && len(req.Content.Parts) > 0 {
                for _, part := range req.Content.Parts {
                    if part != nil && part.Text != "" {
                        texts = append(texts, part.Text)
                    }
                }
    -           if len(texts) > 0 {
    -               bifrostReq.Input = &schemas.EmbeddingInput{}
    -               if len(texts) == 1 {
    -                   bifrostReq.Input.Text = &texts[0]
    -               } else {
    -                   bifrostReq.Input.Texts = texts
    -               }
    -           }
            }
        }
    +   // Assign Input once after collecting all texts
    +   if len(texts) > 0 {
    +       bifrostReq.Input = &schemas.EmbeddingInput{}
    +       if len(texts) == 1 {
    +           bifrostReq.Input.Text = &texts[0]
    +       } else {
    +           bifrostReq.Input.Texts = texts
    +       }
    +   }

        embeddingRequest := request.Requests[0]
🧹 Nitpick comments (1)
core/providers/gemini/types.go (1)
967-977: Batch embedding request container is consistent with existing types

`GeminiBatchEmbeddingRequest` is a straightforward wrapper around `[]GeminiEmbeddingRequest` and aligns with how the rest of the file models Gemini request/response JSON. Given `GeminiGenerationRequest` also has a `Requests []GeminiEmbeddingRequest` field, consider a short doc comment on each to clarify intended usage (generation vs. embedding batch) to avoid future confusion, but no functional issues here. (A sketch of what such doc comments could look like is shown below.)
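For illustration, the suggested doc comments could read something like the sketch below. The field layout and JSON tags are assumptions based on the Gemini batch API shape, not the exact definitions in types.go.

```go
package gemini

// Content is a simplified stand-in for the existing Gemini content type.
type Content struct {
	Parts []struct {
		Text string `json:"text"`
	} `json:"parts"`
}

// GeminiBatchEmbeddingRequest is the body sent to Gemini's batchEmbedContents
// endpoint: one GeminiEmbeddingRequest per input text, each carrying the same
// shared parameters (model, dimensions, task type, title).
type GeminiBatchEmbeddingRequest struct {
	Requests []GeminiEmbeddingRequest `json:"requests"`
}

// GeminiEmbeddingRequest is a single embedContents item. It is reused both as an
// element of GeminiBatchEmbeddingRequest and inside GeminiGenerationRequest, so a
// doc comment should call out which call path a given Requests field feeds.
type GeminiEmbeddingRequest struct {
	Model      string   `json:"model"`
	Content    *Content `json:"content,omitempty"`
	Dimensions *int     `json:"outputDimensionality,omitempty"`
	TaskType   *string  `json:"taskType,omitempty"`
	Title      *string  `json:"title,omitempty"`
}
```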
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- core/providers/cohere/embedding.go (1 hunks)
- core/providers/gemini/embedding.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/types.go (1 hunks)
- docs/integrations/langchain-sdk.mdx (1 hunks)
- tests/integrations/tests/test_langchain.py (23 hunks)
- tests/integrations/tests/test_openai.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- docs/integrations/langchain-sdk.mdx
- core/providers/cohere/embedding.go
- core/providers/gemini/types.go
- tests/integrations/tests/test_openai.py
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/cohere/embedding.go
- core/providers/gemini/types.go
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
- core/providers/cohere/embedding.go
- core/providers/gemini/types.go
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
🧬 Code graph analysis (4)
core/providers/cohere/embedding.go (1)
  core/schemas/chatcompletions.go (1)
    - BifrostLLMUsage (845-852)
tests/integrations/tests/test_openai.py (2)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
  tests/integrations/tests/utils/common.py (1)
    - assert_valid_embedding_response (1447-1500)
core/providers/gemini/embedding.go (4)
  core/schemas/embedding.go (5)
    - BifrostEmbeddingRequest (9-16)
    - BifrostEmbeddingResponse (22-28)
    - EmbeddingData (118-122)
    - EmbeddingStruct (124-128)
    - EmbeddingInput (31-36)
  core/providers/gemini/types.go (5)
    - GeminiBatchEmbeddingRequest (967-969)
    - GeminiEmbeddingRequest (972-978)
    - Content (981-989)
    - Part (995-1019)
    - GeminiEmbeddingResponse (1190-1193)
  core/schemas/utils.go (1)
    - SafeExtractStringPointer (468-483)
  core/schemas/chatcompletions.go (1)
    - BifrostLLMUsage (845-852)
tests/integrations/tests/test_langchain.py (8)
  core/providers/gemini/types.go (1)
    - Modality (710-710)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
  tests/integrations/tests/test_openai.py (1)
    - test_config (251-253)
  tests/integrations/tests/test_anthropic.py (1)
    - test_config (131-133)
  tests/integrations/tests/test_bedrock.py (1)
    - test_config (243-245)
  tests/integrations/tests/test_pydanticai.py (1)
    - test_config (103-105)
  tests/integrations/tests/test_google.py (1)
    - test_config (149-151)
  tests/integrations/tests/test_litellm.py (1)
    - test_config (98-100)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_openai.py
859-859: Unused method argument: test_config
(ARG002)
876-876: Unused method argument: test_config
(ARG002)
893-893: Unused method argument: test_config
(ARG002)
923-923: Unused method argument: test_config
(ARG002)
988-988: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_langchain.py
400-400: Do not catch blind exception: Exception
(BLE001)
407-407: Unused method argument: test_config
(ARG002)
445-445: Unused method argument: test_config
(ARG002)
481-481: Unused method argument: test_config
(ARG002)
510-510: Unused method argument: test_config
(ARG002)
540-540: Unused method argument: test_config
(ARG002)
577-577: Unused method argument: test_config
(ARG002)
609-609: Unused method argument: test_config
(ARG002)
643-643: Unused method argument: test_config
(ARG002)
668-668: Unused method argument: test_config
(ARG002)
691-691: Unused method argument: test_config
(ARG002)
750-750: Unused method argument: test_config
(ARG002)
813-813: Unused method argument: test_config
(ARG002)
848-848: Unused method argument: test_config
(ARG002)
883-883: Unused method argument: test_config
(ARG002)
978-978: Unused method argument: test_config
(ARG002)
1013-1013: Unused method argument: test_config
(ARG002)
1042-1042: Unused method argument: test_config
(ARG002)
1160-1160: Unused method argument: test_config
(ARG002)
1197-1197: Unused method argument: test_config
(ARG002)
1237-1237: Unused method argument: test_config
(ARG002)
1284-1284: Unused method argument: test_config
(ARG002)
1327-1327: Unused method argument: test_config
(ARG002)
🔇 Additional comments (4)
core/providers/cohere/embedding.go (1)
159-179: Cohere billed_units → Bifrost usage mapping looks correct

The new `Meta.BilledUnits` branch mirrors the existing `Meta.Tokens` handling and correctly populates `PromptTokens`, `CompletionTokens`, and `TotalTokens`. This should backfill usage for newer Cohere responses without changing behavior for legacy `Tokens` metadata. (A condensed sketch of this fallback is shown below.)
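The sketch below condenses that fallback logic, using simplified stand-ins for Cohere's Meta/BilledUnits and for schemas.BifrostLLMUsage; the PR's actual field names and types may differ.

```go
package cohere

// Simplified stand-ins; the PR's actual Cohere response types are richer.
type BilledUnits struct {
	InputTokens  *int `json:"input_tokens,omitempty"`
	OutputTokens *int `json:"output_tokens,omitempty"`
}

type Meta struct {
	Tokens      *BilledUnits `json:"tokens,omitempty"`
	BilledUnits *BilledUnits `json:"billed_units,omitempty"`
}

// Stand-in for schemas.BifrostLLMUsage.
type BifrostLLMUsage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
}

// usageFromMeta prefers Meta.Tokens and falls back to Meta.BilledUnits,
// mirroring the behavior described in the review comment.
func usageFromMeta(meta *Meta) *BifrostLLMUsage {
	if meta == nil {
		return nil
	}
	src := meta.Tokens
	if src == nil {
		src = meta.BilledUnits
	}
	if src == nil {
		return nil
	}
	usage := &BifrostLLMUsage{}
	if src.InputTokens != nil {
		usage.PromptTokens = *src.InputTokens
	}
	if src.OutputTokens != nil {
		usage.CompletionTokens = *src.OutputTokens
	}
	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
	return usage
}
```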
docs/integrations/langchain-sdk.mdx (1)

548-567: Cross-provider embeddings docs align with the new LangChain + Gemini flow

The new "Cross-Provider Embeddings" section clearly explains why `OpenAIEmbeddings` is OpenAI-specific and correctly recommends `GoogleGenerativeAIEmbeddings` with `model="provider/model"` and `base_url=".../langchain"` for Cohere/Bedrock/Gemini, matching how the tests and core code are wired. This is a good, accurate clarification of the intended multi-provider embeddings path.

core/providers/gemini/gemini.go (1)
876-966: LGTM! Solid refactor to native Gemini batch embedding API.

The Embedding method correctly transitions from the OpenAI compatibility layer to Gemini's native batchEmbedContents endpoint. The implementation properly handles:

- Batch request construction via `ToGeminiEmbeddingRequest`
- Direct HTTP request lifecycle with fasthttp
- Comprehensive error handling at each step
- Response conversion back to Bifrost format with nil-safety
- Metadata population (latency, provider, raw request/response)

(A stripped-down sketch of this request flow is shown below.)
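For readers unfamiliar with the flow, here is a stripped-down fasthttp sketch. It assumes the public Gemini :batchEmbedContents endpoint and an x-goog-api-key header; the PR's real implementation goes through Bifrost's shared request utilities (MakeRequestWithContext, HandleProviderResponse) and its own error types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"

	"github.com/valyala/fasthttp"
)

func main() {
	// Hypothetical batch payload; see the Gemini batchEmbedContents docs for the full shape.
	body, _ := json.Marshal(map[string]any{
		"requests": []map[string]any{
			{"model": "models/text-embedding-004", "content": map[string]any{"parts": []map[string]string{{"text": "hello"}}}},
			{"model": "models/text-embedding-004", "content": map[string]any{"parts": []map[string]string{{"text": "world"}}}},
		},
	})

	req := fasthttp.AcquireRequest()
	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseRequest(req)
	defer fasthttp.ReleaseResponse(resp)

	req.SetRequestURI("https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:batchEmbedContents")
	req.Header.SetMethod(fasthttp.MethodPost)
	req.Header.SetContentType("application/json")
	req.Header.Set("x-goog-api-key", os.Getenv("GEMINI_API_KEY"))
	req.SetBody(body)

	start := time.Now()
	client := &fasthttp.Client{}
	if err := client.Do(req, resp); err != nil {
		fmt.Println("request failed:", err)
		return
	}
	// Latency and the raw response would be attached to the Bifrost response metadata.
	fmt.Println("status:", resp.StatusCode(), "latency:", time.Since(start), "bytes:", len(resp.Body()))
}
```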
core/providers/gemini/embedding.go (1)
116-158: LGTM! Well-structured bidirectional conversion.

The new `ToBifrostEmbeddingResponse` function correctly maps Gemini's batch embedding response to Bifrost format with:

- Proper nil-safety checks
- Correct embedding data structure mapping
- Intelligent usage metadata fallback (Statistics → Metadata)
- Appropriate total tokens calculation for embeddings

(A rough sketch of the usage fallback is shown below.)
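The sketch below illustrates the Statistics-then-Metadata fallback using simplified, hypothetical shapes for the Gemini response types; the actual GeminiEmbeddingResponse fields in the PR may be structured differently.

```go
package gemini

// Simplified stand-ins for the response-side types discussed above; names and fields are assumptions.
type geminiStatistics struct {
	TokenCount int
}

type geminiMetadata struct {
	BillableCharacterCount int
}

type geminiEmbedding struct {
	Values     []float32
	Statistics *geminiStatistics
}

type geminiEmbeddingResponse struct {
	Embeddings []geminiEmbedding
	Metadata   *geminiMetadata
}

type bifrostUsage struct {
	PromptTokens int
	TotalTokens  int
}

// usageFromGemini prefers per-embedding Statistics and falls back to response-level Metadata.
func usageFromGemini(resp *geminiEmbeddingResponse) *bifrostUsage {
	if resp == nil {
		return nil
	}
	total := 0
	for _, e := range resp.Embeddings {
		if e.Statistics != nil {
			total += e.Statistics.TokenCount
		}
	}
	if total == 0 && resp.Metadata != nil {
		// Fallback: only a coarse count is available when Statistics are absent.
		total = resp.Metadata.BillableCharacterCount
	}
	if total == 0 {
		return nil
	}
	// Embeddings have no completion, so prompt tokens equal total tokens.
	return &bifrostUsage{PromptTokens: total, TotalTokens: total}
}
```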
f37a8f8 to 9deef80
Actionable comments posted: 0
♻️ Duplicate comments (1)
core/providers/gemini/embedding.go (1)
175-194: Logic issue: Input assignment should be moved outside the inner loop.

The block at lines 185-192 that sets `bifrostReq.Input` is still inside the `for _, req := range request.Requests` loop. This means it executes on each iteration, potentially overwriting the input multiple times. While the current logic happens to work because texts accumulate across iterations, the assignment should be moved after the loop completes for clarity and correctness.

🔎 Suggested fix:

    if len(request.Requests) > 0 {
        var texts []string
        for _, req := range request.Requests {
            if req.Content != nil && len(req.Content.Parts) > 0 {
                for _, part := range req.Content.Parts {
                    if part != nil && part.Text != "" {
                        texts = append(texts, part.Text)
                    }
                }
    -           if len(texts) > 0 {
    -               bifrostReq.Input = &schemas.EmbeddingInput{}
    -               if len(texts) == 1 {
    -                   bifrostReq.Input.Text = &texts[0]
    -               } else {
    -                   bifrostReq.Input.Texts = texts
    -               }
    -           }
            }
        }
    +   if len(texts) > 0 {
    +       bifrostReq.Input = &schemas.EmbeddingInput{}
    +       if len(texts) == 1 {
    +           bifrostReq.Input.Text = &texts[0]
    +       } else {
    +           bifrostReq.Input.Texts = texts
    +       }
    +   }

        embeddingRequest := request.Requests[0]
🧹 Nitpick comments (1)
tests/integrations/tests/test_openai.py (1)
987-1000: Document why Gemini doesn't return usage data.

The conditional skip for Gemini usage data validation is correct, but the inline comment could be more specific about whether this is a known limitation of the Gemini API or a temporary implementation detail in Bifrost's Gemini provider integration.

Consider expanding the comment to clarify:

    # Gemini embeddings API does not return usage/token metadata in the response
    # This is a known limitation of the Gemini embeddings endpoint
    if provider != "gemini":
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- core/providers/cohere/embedding.go
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
- core/providers/gemini/types.go
- docs/integrations/langchain-sdk.mdx
- tests/integrations/tests/test_langchain.py
- tests/integrations/tests/test_openai.py
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/integrations/langchain-sdk.mdx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/embedding.go
- core/providers/gemini/types.go
- core/providers/cohere/embedding.go
- tests/integrations/tests/test_openai.py
- core/providers/gemini/gemini.go
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/gemini/embedding.go
- core/providers/gemini/types.go
- core/providers/cohere/embedding.go
- core/providers/gemini/gemini.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
- core/providers/gemini/embedding.go
- core/providers/gemini/types.go
- core/providers/cohere/embedding.go
- core/providers/gemini/gemini.go
🧬 Code graph analysis (5)
core/providers/gemini/embedding.go (3)
  core/schemas/embedding.go (5)
    - BifrostEmbeddingRequest (9-16)
    - BifrostEmbeddingResponse (22-28)
    - EmbeddingData (118-122)
    - EmbeddingStruct (124-128)
    - EmbeddingInput (31-36)
  core/providers/gemini/types.go (5)
    - GeminiBatchEmbeddingRequest (967-969)
    - GeminiEmbeddingRequest (972-978)
    - Content (981-989)
    - Part (995-1019)
    - GeminiEmbeddingResponse (1241-1244)
  core/schemas/utils.go (1)
    - SafeExtractStringPointer (468-483)
core/providers/cohere/embedding.go (1)
  core/schemas/chatcompletions.go (1)
    - BifrostLLMUsage (845-852)
tests/integrations/tests/test_openai.py (1)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
core/providers/gemini/gemini.go (3)
  core/providers/utils/utils.go (8)
    - CheckContextAndGetRequestBody (301-319)
    - SetExtraHeaders (223-253)
    - GetPathFromContext (256-261)
    - MakeRequestWithContext (43-97)
    - NewBifrostOperationError (776-787)
    - HandleProviderResponse (430-535)
    - ShouldSendBackRawRequest (819-824)
    - ShouldSendBackRawResponse (827-832)
  core/providers/gemini/embedding.go (1)
    - ToGeminiEmbeddingRequest (9-68)
  core/providers/gemini/types.go (1)
    - GeminiEmbeddingResponse (1241-1244)
tests/integrations/tests/test_langchain.py (2)
  core/providers/gemini/types.go (1)
    - Modality (710-710)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_openai.py
859-859: Unused method argument: test_config
(ARG002)
876-876: Unused method argument: test_config
(ARG002)
893-893: Unused method argument: test_config
(ARG002)
923-923: Unused method argument: test_config
(ARG002)
988-988: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_langchain.py
400-400: Do not catch blind exception: Exception
(BLE001)
407-407: Unused method argument: test_config
(ARG002)
445-445: Unused method argument: test_config
(ARG002)
481-481: Unused method argument: test_config
(ARG002)
510-510: Unused method argument: test_config
(ARG002)
540-540: Unused method argument: test_config
(ARG002)
577-577: Unused method argument: test_config
(ARG002)
609-609: Unused method argument: test_config
(ARG002)
643-643: Unused method argument: test_config
(ARG002)
668-668: Unused method argument: test_config
(ARG002)
691-691: Unused method argument: test_config
(ARG002)
750-750: Unused method argument: test_config
(ARG002)
813-813: Unused method argument: test_config
(ARG002)
848-848: Unused method argument: test_config
(ARG002)
883-883: Unused method argument: test_config
(ARG002)
978-978: Unused method argument: test_config
(ARG002)
1013-1013: Unused method argument: test_config
(ARG002)
1042-1042: Unused method argument: test_config
(ARG002)
1160-1160: Unused method argument: test_config
(ARG002)
1197-1197: Unused method argument: test_config
(ARG002)
1237-1237: Unused method argument: test_config
(ARG002)
1284-1284: Unused method argument: test_config
(ARG002)
1327-1327: Unused method argument: test_config
(ARG002)
🔇 Additional comments (8)
core/providers/cohere/embedding.go (1)
170-179: Good fallback for usage extraction from BilledUnits.

The added fallback correctly handles the case where `Meta.Tokens` is nil but `Meta.BilledUnits` is available. The logic mirrors the existing tokens-based path and maintains consistency in the response structure.

core/providers/gemini/gemini.go (1)
876-965: Well-structured batch embedding implementation.

The new embedding implementation correctly:

- Uses the Gemini `batchEmbedContents` endpoint for batch processing
- Follows established patterns from other methods (ChatCompletion, Speech)
- Properly handles error cases with appropriate error messages
- Includes latency tracking and raw request/response capture options
- Propagates provider metadata into the final response
core/providers/gemini/types.go (1)
967-978: Type definitions align with Gemini batch embedding API.

The new `GeminiBatchEmbeddingRequest` and extended `GeminiEmbeddingRequest` fields correctly model the Gemini API structure for batch embeddings. The JSON tags use appropriate camelCase to match API expectations, and pointer types are used for optional fields.

core/providers/gemini/embedding.go (2)
7-68: Well-structured batch request construction.

The function correctly:

- Collects texts from both single `Text` and multiple `Texts` inputs
- Creates individual `GeminiEmbeddingRequest` entries per text with proper model prefixing
- Propagates parameters (Dimensions, TaskType, Title) to each request

(A condensed sketch of this construction is shown below.)
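The following sketch condenses that construction. The types are simplified stand-ins for schemas.EmbeddingInput and the Gemini request types, and the JSON tags are assumptions; only the overall shape (collect texts, emit one request per text with the model prefixed and shared parameters copied) reflects the review's description.

```go
package gemini

// Simplified stand-ins for the types involved in the conversion.
type EmbeddingInput struct {
	Text  *string
	Texts []string
}

type Part struct {
	Text string `json:"text"`
}

type Content struct {
	Parts []*Part `json:"parts"`
}

type GeminiEmbeddingRequest struct {
	Model      string   `json:"model"`
	Content    *Content `json:"content"`
	Dimensions *int     `json:"outputDimensionality,omitempty"`
	TaskType   *string  `json:"taskType,omitempty"`
	Title      *string  `json:"title,omitempty"`
}

type GeminiBatchEmbeddingRequest struct {
	Requests []GeminiEmbeddingRequest `json:"requests"`
}

// toGeminiBatch builds one per-text request, prefixing the model and copying shared params.
func toGeminiBatch(model string, input *EmbeddingInput, dims *int, taskType, title *string) GeminiBatchEmbeddingRequest {
	var texts []string
	if input != nil {
		if input.Text != nil {
			texts = append(texts, *input.Text)
		}
		texts = append(texts, input.Texts...)
	}

	batch := GeminiBatchEmbeddingRequest{}
	for _, t := range texts {
		batch.Requests = append(batch.Requests, GeminiEmbeddingRequest{
			Model:      "models/" + model,
			Content:    &Content{Parts: []*Part{{Text: t}}},
			Dimensions: dims,
			TaskType:   taskType,
			Title:      title,
		})
	}
	return batch
}
```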
116-158: Good Gemini-to-Bifrost response conversion.

The new `ToBifrostEmbeddingResponse` function correctly converts Gemini embeddings to Bifrost format with proper handling of:

- Per-embedding data with index and object type
- Usage metadata extraction from Statistics (preferred) or Metadata (fallback)
tests/integrations/tests/test_langchain.py (2)
374-402: LGTM! Gemini embeddings test follows existing patterns.

The test correctly implements cross-provider embeddings testing for Gemini:

- Uses dummy API key consistent with Bifrost's auth handling (line 147-148 setup)
- Validates both single (`embed_query`) and batch (`embed_documents`) operations
- Follows the same structure as `test_03_openai_embeddings_basic`
- Appropriate exception handling with provider-specific skip message
The past review concerns about debug prints and API key access have been properly addressed.
407-1327: Test renumbering is correct and consistent.

All tests have been correctly renumbered from `test_04` through `test_26` to accommodate the new `test_04_gemini_embeddings_basic`. The sequential ordering is maintained, docstrings are updated with new case numbers, and test logic remains unchanged. This is the expected mechanical change for inserting a new test.

858-874: The hardcoded `dimensions=1536` is appropriate for all providers in the embeddings scenario and is supported by OpenAI text-embedding-3-small (native default), Gemini (via output_dimensionality truncation), and Cohere embeddings. No action required.
9deef80 to cac6fa5
cac6fa5 to bf3d78d
7b68adf to 2cf99d2
Merge activity
bf3d78d to ba5b938

Improve Gemini embedding request handling and add cross-provider embedding documentation
This PR fixes a bug in the Gemini embedding request handling to properly process multiple text inputs and adds documentation for using cross-provider embeddings with LangChain.
Changes
- `GoogleGenerativeAIEmbeddings` from the `langchain_google_genai` package for cross-provider embedding support

Type of change
Affected areas
How to test
Test the Gemini embedding functionality with multiple text inputs:
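One way to exercise this by hand is to post a multi-text embedding request through Bifrost. The sketch below assumes a locally running gateway exposing an OpenAI-compatible /v1/embeddings route on port 8080 and a provider-prefixed model name; the URL, port, and model are placeholders to adjust for your deployment.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Hypothetical Bifrost endpoint and payload; adjust to your local setup.
	payload, _ := json.Marshal(map[string]any{
		"model": "gemini/gemini-embedding-001",
		"input": []string{"first text to embed", "second text to embed", "third text to embed"},
	})

	resp, err := http.Post("http://localhost:8080/v1/embeddings", "application/json", bytes.NewReader(payload))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	// Expect one embedding per input text in the response data array.
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```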
Try the cross-provider embedding functionality with LangChain as documented in docs/integrations/langchain-sdk.mdx.
Breaking changes
Security considerations
No security implications.
Checklist