feat: add batch embedding support for Gemini and improve cross-provider compatibility --skip-pipeline #1138
Conversation
Caution: Review failed. The pull request is closed.

📝 Walkthrough / Summary by CodeRabbit

Switch Gemini embeddings to a per-text batched request model, add Gemini↔Bifrost embedding converters and new Gemini batch types, implement a Gemini-specific HTTP embedding flow, extend Cohere usage parsing, add and renumber LangChain/OpenAI integration tests, and insert a duplicated Cross-Provider Embeddings docs section.

Changes
Sequence Diagram(s)

sequenceDiagram
autonumber
actor Client
participant BifrostServer
participant GeminiProvider
participant GeminiAPI
Note over Client,BifrostServer: Client sends BifrostEmbeddingRequest
Client->>BifrostServer: POST /embeddings (BifrostEmbeddingRequest)
BifrostServer->>GeminiProvider: ToGeminiEmbeddingRequest -> GeminiBatchEmbeddingRequest (one item per text)
GeminiProvider->>GeminiAPI: POST /v1/batchEmbedContents (batch payload + headers)
GeminiAPI-->>GeminiProvider: 200 OK + GeminiEmbeddingResponse (per-item embeddings, metadata)
GeminiProvider->>BifrostServer: ToBifrostEmbeddingResponse (convert embeddings, usage, metadata, latency, raw)
BifrostServer-->>Client: 200 OK (BifrostEmbeddingResponse)
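For orientation, here is a minimal Go sketch of the kind of batch payload this flow produces (one request item per input text). The struct and JSON field names mirror the public Gemini batchEmbedContents wire format and are illustrative only, not the PR's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal stand-ins for the batchEmbedContents wire format; field names are assumptions.
type part struct {
	Text string `json:"text"`
}

type content struct {
	Parts []part `json:"parts"`
}

type embedRequest struct {
	Model   string  `json:"model"`
	Content content `json:"content"`
}

type batchEmbedRequest struct {
	Requests []embedRequest `json:"requests"`
}

func main() {
	texts := []string{"hello world", "bifrost batches embeddings"}

	// One request item per input text, all pointing at the same model.
	batch := batchEmbedRequest{}
	for _, t := range texts {
		batch.Requests = append(batch.Requests, embedRequest{
			Model:   "models/text-embedding-004",
			Content: content{Parts: []part{{Text: t}}},
		})
	}

	payload, _ := json.MarshalIndent(batch, "", "  ")
	fmt.Println(string(payload))
}
```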
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
📜 Recent review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
🧪 Test Suite Available: This PR can be tested by a repository admin.
Actionable comments posted: 2
🧹 Nitpick comments (1)
core/providers/gemini/embedding.go (1)
138-158: Document or validate the assumption that all requests share identical parameters.

The code takes parameters from only the first request (`request.Requests[0]`). The comment at line 117 states that multiple requests have "same parameters but different text fields," but this assumption is not validated. Consider adding a code comment to explicitly document this assumption, or add validation to ensure all requests have consistent parameters:

    // Extract parameters from the first request.
    // Assumption: All requests in the batch share identical parameters (dimensions, taskType, title).
    embeddingRequest := request.Requests[0]

Alternatively, if strict validation is desired:

    // TODO: Validate that all requests have consistent parameters
    embeddingRequest := request.Requests[0]
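Such a validation pass could look roughly like the sketch below. The struct is a simplified stand-in for the PR's GeminiEmbeddingRequest, and validateConsistentParams / ptrEq are hypothetical helpers, not code from this PR.

```go
package main

import "fmt"

// Simplified stand-in for GeminiEmbeddingRequest; only the shared parameters matter here.
type GeminiEmbeddingRequest struct {
	Dimensions *int
	TaskType   *string
	Title      *string
}

// ptrEq compares two optional values, treating two nils as equal.
func ptrEq[T comparable](a, b *T) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// validateConsistentParams rejects batches whose items disagree on shared parameters.
func validateConsistentParams(reqs []GeminiEmbeddingRequest) error {
	if len(reqs) < 2 {
		return nil
	}
	first := reqs[0]
	for i, r := range reqs[1:] {
		if !ptrEq(r.Dimensions, first.Dimensions) ||
			!ptrEq(r.TaskType, first.TaskType) ||
			!ptrEq(r.Title, first.Title) {
			return fmt.Errorf("embedding request %d differs from the first request's parameters", i+1)
		}
	}
	return nil
}

func main() {
	dims := 768
	reqs := []GeminiEmbeddingRequest{{Dimensions: &dims}, {Dimensions: &dims}}
	fmt.Println(validateConsistentParams(reqs)) // <nil>
}
```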
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- core/providers/gemini/embedding.go (1 hunks)
- docs/integrations/langchain-sdk.mdx (1 hunks)
- tests/integrations/tests/test_langchain.py (23 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/embedding.go
- docs/integrations/langchain-sdk.mdx
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/gemini/embedding.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/gemini/embedding.go
🧬 Code graph analysis (1)
core/providers/gemini/embedding.go (2)
core/providers/gemini/types.go (1)
  - Content (977-985)
core/schemas/embedding.go (1)
  - EmbeddingInput (31-36)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_langchain.py
401-401: Do not catch blind exception: Exception
(BLE001)
408-408: Unused method argument: test_config
(ARG002)
446-446: Unused method argument: test_config
(ARG002)
482-482: Unused method argument: test_config
(ARG002)
511-511: Unused method argument: test_config
(ARG002)
541-541: Unused method argument: test_config
(ARG002)
578-578: Unused method argument: test_config
(ARG002)
610-610: Unused method argument: test_config
(ARG002)
644-644: Unused method argument: test_config
(ARG002)
669-669: Unused method argument: test_config
(ARG002)
692-692: Unused method argument: test_config
(ARG002)
751-751: Unused method argument: test_config
(ARG002)
814-814: Unused method argument: test_config
(ARG002)
849-849: Unused method argument: test_config
(ARG002)
884-884: Unused method argument: test_config
(ARG002)
979-979: Unused method argument: test_config
(ARG002)
1014-1014: Unused method argument: test_config
(ARG002)
1043-1043: Unused method argument: test_config
(ARG002)
1161-1161: Unused method argument: test_config
(ARG002)
1198-1198: Unused method argument: test_config
(ARG002)
1238-1238: Unused method argument: test_config
(ARG002)
1285-1285: Unused method argument: test_config
(ARG002)
1328-1328: Unused method argument: test_config
(ARG002)
🔇 Additional comments (2)
tests/integrations/tests/test_langchain.py (1)
58-58: LGTM! The import of `GoogleGenerativeAIEmbeddings` is correctly added to support the new cross-provider embeddings test.

docs/integrations/langchain-sdk.mdx (1)
548-567: No action needed. The Cross-Provider Embeddings section appears only once in the file (line 548), so no duplication exists. The documentation block does not require consolidation.
0a3ff50 to 240b3df
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/integrations/tests/test_langchain.py (1)
374-402: Handle missing GEMINI_API_KEY environment variable gracefully.

Direct access to `os.environ["GEMINI_API_KEY"]` at line 380 will raise `KeyError` if the environment variable is not set, preventing the test from skipping gracefully.

🔎 Proposed fix:

     @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("embeddings"))
     def test_04_gemini_embeddings_basic(self, provider, model):
         """Test Case 4: Basic Gemini embeddings functionality"""
    +    # Check if API key is available
    +    if "GEMINI_API_KEY" not in os.environ:
    +        pytest.skip("GEMINI_API_KEY environment variable not set")
    +
         try:
             embeddings = GoogleGenerativeAIEmbeddings(
                 model=format_provider_model(provider, model),
                 api_key=os.environ["GEMINI_API_KEY"],

Based on past review comments.
🧹 Nitpick comments (1)
core/providers/gemini/embedding.go (1)
117-136: Move Input assignment outside the loop to avoid redundant operations.

The `bifrostReq.Input` assignment block (lines 127-134) executes inside the request loop, causing repeated allocations and assignments. While the final result is correct, this is inefficient when processing multiple requests.

🔎 Proposed fix:

    // sdk request contains multiple embedding requests with same parameters but different text fields
    if len(request.Requests) > 0 {
        var texts []string
        for _, req := range request.Requests {
            if req.Content != nil && len(req.Content.Parts) > 0 {
                for _, part := range req.Content.Parts {
                    if part != nil && part.Text != "" {
                        texts = append(texts, part.Text)
                    }
                }
    -           if len(texts) > 0 {
    -               bifrostReq.Input = &schemas.EmbeddingInput{}
    -               if len(texts) == 1 {
    -                   bifrostReq.Input.Text = &texts[0]
    -               } else {
    -                   bifrostReq.Input.Texts = texts
    -               }
    -           }
            }
        }
    +   if len(texts) > 0 {
    +       bifrostReq.Input = &schemas.EmbeddingInput{}
    +       if len(texts) == 1 {
    +           bifrostReq.Input.Text = &texts[0]
    +       } else {
    +           bifrostReq.Input.Texts = texts
    +       }
    +   }

Based on past review comments.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- core/providers/gemini/embedding.go (1 hunks)
- docs/integrations/langchain-sdk.mdx (1 hunks)
- tests/integrations/tests/test_langchain.py (23 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/embedding.go
- docs/integrations/langchain-sdk.mdx
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/gemini/embedding.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
core/providers/gemini/embedding.go
🧬 Code graph analysis (2)
core/providers/gemini/embedding.go (2)
core/providers/gemini/types.go (1)
  - Content (977-985)
core/schemas/embedding.go (1)
  - EmbeddingInput (31-36)

tests/integrations/tests/test_langchain.py (3)

core/providers/gemini/types.go (1)
  - Modality (710-710)
tests/integrations/tests/utils/parametrize.py (2)
  - get_cross_provider_params_for_scenario (12-47)
  - format_provider_model (50-65)
tests/integrations/tests/utils/config_loader.py (2)
  - get_integration_url (112-131)
  - get_integration_url (439-440)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_langchain.py
400-400: Do not catch blind exception: Exception
(BLE001)
407-407: Unused method argument: test_config
(ARG002)
445-445: Unused method argument: test_config
(ARG002)
481-481: Unused method argument: test_config
(ARG002)
510-510: Unused method argument: test_config
(ARG002)
540-540: Unused method argument: test_config
(ARG002)
577-577: Unused method argument: test_config
(ARG002)
609-609: Unused method argument: test_config
(ARG002)
643-643: Unused method argument: test_config
(ARG002)
668-668: Unused method argument: test_config
(ARG002)
691-691: Unused method argument: test_config
(ARG002)
750-750: Unused method argument: test_config
(ARG002)
813-813: Unused method argument: test_config
(ARG002)
848-848: Unused method argument: test_config
(ARG002)
883-883: Unused method argument: test_config
(ARG002)
978-978: Unused method argument: test_config
(ARG002)
1013-1013: Unused method argument: test_config
(ARG002)
1042-1042: Unused method argument: test_config
(ARG002)
1160-1160: Unused method argument: test_config
(ARG002)
1197-1197: Unused method argument: test_config
(ARG002)
1237-1237: Unused method argument: test_config
(ARG002)
1284-1284: Unused method argument: test_config
(ARG002)
1327-1327: Unused method argument: test_config
(ARG002)
🔇 Additional comments (1)
docs/integrations/langchain-sdk.mdx (1)
548-568: No action needed. The Cross-Provider Embeddings section appears only once in the file at lines 548-568. There is no documentation duplication.

Likely an incorrect or invalid review comment.
240b3df to f37a8f8
Actionable comments posted: 1
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/integrations/tests/test_langchain.py (1)
407-1385: Remove unused Modality import and clarify test_config fixture usage

Two minor issues:

Unused import: `Modality` is imported from `langchain_google_genai` at line 58 but never used. Remove it.

Pytest fixture pattern: Multiple test methods accept `test_config` as a parameter but don't reference it in their method bodies (e.g., `test_05_function_calling_tools`, `test_06_llm_chain_basic`, etc.). This is expected behavior for pytest fixtures: they're used implicitly through pytest's dependency injection. However, if linters flag this with ARG002, you can silence it by:

- Renaming to `_test_config`, or
- Using `@pytest.mark.usefixtures("test_config")` and removing the parameter, or
- Adding `# noqa: ARG002` to affected methods.

The rest of the changes (renumbered tests, thinking/structured-output coverage) are well-structured and consistent with existing patterns.
tests/integrations/tests/test_openai.py (1)
858-999: Embedding test parametrization is solid; address unused fixture and tighten provider assertions

The cross-provider parametrization for embedding tests (21–24, 26) works well with the configuration-driven scenario filtering. However, there are two actionable improvements:

Unused `test_config` fixture arguments: Ruff is correct to flag `test_config` as unused in tests 21–24 and 26. These tests never reference the fixture. Either drop it from the signatures or rename to `_test_config` to satisfy linters while keeping the fixture hook active.

Usage assertions special-cased for Gemini: In `test_26_embedding_long_text`, the pattern `if provider != "gemini"` assumes every non-Gemini provider (openai, bedrock, cohere) returns usage data. This is currently safe since only Gemini is configured to lack usage, but the guard could become brittle as more providers are added. A more robust approach would be to maintain an explicit allowlist of providers known to return usage (e.g., `{"openai", "bedrock", "cohere"}`) or derive this from a provider capability flag in config, similar to how scenarios work.

The fixed `dimensions=1536` assumption across parametrized tests is fine: the config confirms all configured embedding providers support 1536-dim vectors.
♻️ Duplicate comments (2)
tests/integrations/tests/test_langchain.py (1)
58-59: Make Gemini embeddings test resilient to missing `GEMINI_API_KEY` and align with other LangChain tests

`test_04_gemini_embeddings_basic` is a useful cross-provider embeddings smoke test, but the way the API key is handled is brittle:

- `api_key=os.environ["GEMINI_API_KEY"]` will raise a `KeyError` if the env var isn't set.
- Other Gemini/LangChain tests in this file rely on dummy keys or explicit skipping rather than requiring real provider keys, since Bifrost handles auth.

To keep this consistent with the rest of the suite and avoid hard failures in CI:

Either treat the key as optional and skip when it's missing:

    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        pytest.skip("GEMINI_API_KEY environment variable not set")
    embeddings = GoogleGenerativeAIEmbeddings(
        model=format_provider_model(provider, model),
        api_key=api_key,
        base_url=get_integration_url("langchain") or None,
    )

Or, if Bifrost doesn't actually need a real Gemini key here, just pass the same dummy key pattern used elsewhere in this file and drop the env dependency entirely.

Also, the new import of `Modality` from `langchain_google_genai` appears unused and could be removed to keep imports tidy.

Also applies to: 374-402
core/providers/gemini/embedding.go (1)
175-194: Critical: Input assignment block still inside the loop despite past review.

Lines 185-192 are executed on every iteration where `req.Content != nil`, causing redundant assignments to `bifrostReq.Input`. The `texts` slice correctly accumulates across all requests, but the Input assignment should occur once after the loop completes, not inside the per-request condition.

This was flagged in a previous review and marked as addressed, but the fix is not present in the current code.

🔎 Move Input assignment outside the loop:

    // sdk request contains multiple embedding requests with same parameters but different text fields
    if len(request.Requests) > 0 {
        var texts []string
        for _, req := range request.Requests {
            if req.Content != nil && len(req.Content.Parts) > 0 {
                for _, part := range req.Content.Parts {
                    if part != nil && part.Text != "" {
                        texts = append(texts, part.Text)
                    }
                }
    -           if len(texts) > 0 {
    -               bifrostReq.Input = &schemas.EmbeddingInput{}
    -               if len(texts) == 1 {
    -                   bifrostReq.Input.Text = &texts[0]
    -               } else {
    -                   bifrostReq.Input.Texts = texts
    -               }
    -           }
            }
        }
    +   // Assign Input once after collecting all texts
    +   if len(texts) > 0 {
    +       bifrostReq.Input = &schemas.EmbeddingInput{}
    +       if len(texts) == 1 {
    +           bifrostReq.Input.Text = &texts[0]
    +       } else {
    +           bifrostReq.Input.Texts = texts
    +       }
    +   }

        embeddingRequest := request.Requests[0]
🧹 Nitpick comments (1)
core/providers/gemini/types.go (1)
967-977: Batch embedding request container is consistent with existing types

`GeminiBatchEmbeddingRequest` is a straightforward wrapper around `[]GeminiEmbeddingRequest` and aligns with how the rest of the file models Gemini request/response JSON. Given `GeminiGenerationRequest` also has a `Requests []GeminiEmbeddingRequest` field, consider a short doc comment on each to clarify intended usage (generation vs. embedding batch) to avoid future confusion, but no functional issues here. (A sketch of what such doc comments could look like is shown below.)
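For illustration, the suggested doc comments could read something like the sketch below. The field layout and JSON tags are assumptions based on the Gemini batch API shape, not the exact definitions in types.go.

```go
package gemini

// Content is a simplified stand-in for the existing Gemini content type.
type Content struct {
	Parts []struct {
		Text string `json:"text"`
	} `json:"parts"`
}

// GeminiBatchEmbeddingRequest is the body sent to Gemini's batchEmbedContents
// endpoint: one GeminiEmbeddingRequest per input text, each carrying the same
// shared parameters (model, dimensions, task type, title).
type GeminiBatchEmbeddingRequest struct {
	Requests []GeminiEmbeddingRequest `json:"requests"`
}

// GeminiEmbeddingRequest is a single embedContents item. It is reused both as an
// element of GeminiBatchEmbeddingRequest and inside GeminiGenerationRequest, so a
// doc comment should call out which call path a given Requests field feeds.
type GeminiEmbeddingRequest struct {
	Model      string   `json:"model"`
	Content    *Content `json:"content,omitempty"`
	Dimensions *int     `json:"outputDimensionality,omitempty"`
	TaskType   *string  `json:"taskType,omitempty"`
	Title      *string  `json:"title,omitempty"`
}
```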
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- core/providers/cohere/embedding.go (1 hunks)
- core/providers/gemini/embedding.go (3 hunks)
- core/providers/gemini/gemini.go (1 hunks)
- core/providers/gemini/types.go (1 hunks)
- docs/integrations/langchain-sdk.mdx (1 hunks)
- tests/integrations/tests/test_langchain.py (23 hunks)
- tests/integrations/tests/test_openai.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- docs/integrations/langchain-sdk.mdx
- core/providers/cohere/embedding.go
- core/providers/gemini/types.go
- tests/integrations/tests/test_openai.py
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/cohere/embedding.go
- core/providers/gemini/types.go
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
- core/providers/cohere/embedding.go
- core/providers/gemini/types.go
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
🧬 Code graph analysis (4)
core/providers/cohere/embedding.go (1)
  core/schemas/chatcompletions.go (1)
    - BifrostLLMUsage (845-852)
tests/integrations/tests/test_openai.py (2)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
  tests/integrations/tests/utils/common.py (1)
    - assert_valid_embedding_response (1447-1500)
core/providers/gemini/embedding.go (4)
  core/schemas/embedding.go (5)
    - BifrostEmbeddingRequest (9-16)
    - BifrostEmbeddingResponse (22-28)
    - EmbeddingData (118-122)
    - EmbeddingStruct (124-128)
    - EmbeddingInput (31-36)
  core/providers/gemini/types.go (5)
    - GeminiBatchEmbeddingRequest (967-969)
    - GeminiEmbeddingRequest (972-978)
    - Content (981-989)
    - Part (995-1019)
    - GeminiEmbeddingResponse (1190-1193)
  core/schemas/utils.go (1)
    - SafeExtractStringPointer (468-483)
  core/schemas/chatcompletions.go (1)
    - BifrostLLMUsage (845-852)
tests/integrations/tests/test_langchain.py (8)
  core/providers/gemini/types.go (1)
    - Modality (710-710)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
  tests/integrations/tests/test_openai.py (1)
    - test_config (251-253)
  tests/integrations/tests/test_anthropic.py (1)
    - test_config (131-133)
  tests/integrations/tests/test_bedrock.py (1)
    - test_config (243-245)
  tests/integrations/tests/test_pydanticai.py (1)
    - test_config (103-105)
  tests/integrations/tests/test_google.py (1)
    - test_config (149-151)
  tests/integrations/tests/test_litellm.py (1)
    - test_config (98-100)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_openai.py
859-859: Unused method argument: test_config
(ARG002)
876-876: Unused method argument: test_config
(ARG002)
893-893: Unused method argument: test_config
(ARG002)
923-923: Unused method argument: test_config
(ARG002)
988-988: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_langchain.py
400-400: Do not catch blind exception: Exception
(BLE001)
407-407: Unused method argument: test_config
(ARG002)
445-445: Unused method argument: test_config
(ARG002)
481-481: Unused method argument: test_config
(ARG002)
510-510: Unused method argument: test_config
(ARG002)
540-540: Unused method argument: test_config
(ARG002)
577-577: Unused method argument: test_config
(ARG002)
609-609: Unused method argument: test_config
(ARG002)
643-643: Unused method argument: test_config
(ARG002)
668-668: Unused method argument: test_config
(ARG002)
691-691: Unused method argument: test_config
(ARG002)
750-750: Unused method argument: test_config
(ARG002)
813-813: Unused method argument: test_config
(ARG002)
848-848: Unused method argument: test_config
(ARG002)
883-883: Unused method argument: test_config
(ARG002)
978-978: Unused method argument: test_config
(ARG002)
1013-1013: Unused method argument: test_config
(ARG002)
1042-1042: Unused method argument: test_config
(ARG002)
1160-1160: Unused method argument: test_config
(ARG002)
1197-1197: Unused method argument: test_config
(ARG002)
1237-1237: Unused method argument: test_config
(ARG002)
1284-1284: Unused method argument: test_config
(ARG002)
1327-1327: Unused method argument: test_config
(ARG002)
🔇 Additional comments (4)
core/providers/cohere/embedding.go (1)
159-179: Cohere billed_units → Bifrost usage mapping looks correct

The new `Meta.BilledUnits` branch mirrors the existing `Meta.Tokens` handling and correctly populates `PromptTokens`, `CompletionTokens`, and `TotalTokens`. This should backfill usage for newer Cohere responses without changing behavior for legacy `Tokens` metadata. (A condensed sketch of this fallback is shown below.)
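The sketch below condenses that fallback logic, using simplified stand-ins for Cohere's Meta/BilledUnits and for schemas.BifrostLLMUsage; the PR's actual field names and types may differ.

```go
package cohere

// Simplified stand-ins; the PR's actual Cohere response types are richer.
type BilledUnits struct {
	InputTokens  *int `json:"input_tokens,omitempty"`
	OutputTokens *int `json:"output_tokens,omitempty"`
}

type Meta struct {
	Tokens      *BilledUnits `json:"tokens,omitempty"`
	BilledUnits *BilledUnits `json:"billed_units,omitempty"`
}

// Stand-in for schemas.BifrostLLMUsage.
type BifrostLLMUsage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
}

// usageFromMeta prefers Meta.Tokens and falls back to Meta.BilledUnits,
// mirroring the behavior described in the review comment.
func usageFromMeta(meta *Meta) *BifrostLLMUsage {
	if meta == nil {
		return nil
	}
	src := meta.Tokens
	if src == nil {
		src = meta.BilledUnits
	}
	if src == nil {
		return nil
	}
	usage := &BifrostLLMUsage{}
	if src.InputTokens != nil {
		usage.PromptTokens = *src.InputTokens
	}
	if src.OutputTokens != nil {
		usage.CompletionTokens = *src.OutputTokens
	}
	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
	return usage
}
```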
docs/integrations/langchain-sdk.mdx (1)

548-567: Cross-provider embeddings docs align with the new LangChain + Gemini flow

The new "Cross-Provider Embeddings" section clearly explains why `OpenAIEmbeddings` is OpenAI-specific and correctly recommends `GoogleGenerativeAIEmbeddings` with `model="provider/model"` and `base_url=".../langchain"` for Cohere/Bedrock/Gemini, matching how the tests and core code are wired. This is a good, accurate clarification of the intended multi-provider embeddings path.

core/providers/gemini/gemini.go (1)
876-966: LGTM! Solid refactor to native Gemini batch embedding API.

The Embedding method correctly transitions from the OpenAI compatibility layer to Gemini's native batchEmbedContents endpoint. The implementation properly handles:

- Batch request construction via `ToGeminiEmbeddingRequest`
- Direct HTTP request lifecycle with fasthttp
- Comprehensive error handling at each step
- Response conversion back to Bifrost format with nil-safety
- Metadata population (latency, provider, raw request/response)

(A stripped-down sketch of this request flow is shown below.)
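For readers unfamiliar with the flow, here is a stripped-down fasthttp sketch. It assumes the public Gemini :batchEmbedContents endpoint and an x-goog-api-key header; the PR's real implementation goes through Bifrost's shared request utilities (MakeRequestWithContext, HandleProviderResponse) and its own error types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"

	"github.com/valyala/fasthttp"
)

func main() {
	// Hypothetical batch payload; see the Gemini batchEmbedContents docs for the full shape.
	body, _ := json.Marshal(map[string]any{
		"requests": []map[string]any{
			{"model": "models/text-embedding-004", "content": map[string]any{"parts": []map[string]string{{"text": "hello"}}}},
			{"model": "models/text-embedding-004", "content": map[string]any{"parts": []map[string]string{{"text": "world"}}}},
		},
	})

	req := fasthttp.AcquireRequest()
	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseRequest(req)
	defer fasthttp.ReleaseResponse(resp)

	req.SetRequestURI("https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:batchEmbedContents")
	req.Header.SetMethod(fasthttp.MethodPost)
	req.Header.SetContentType("application/json")
	req.Header.Set("x-goog-api-key", os.Getenv("GEMINI_API_KEY"))
	req.SetBody(body)

	start := time.Now()
	client := &fasthttp.Client{}
	if err := client.Do(req, resp); err != nil {
		fmt.Println("request failed:", err)
		return
	}
	// Latency and the raw response would be attached to the Bifrost response metadata.
	fmt.Println("status:", resp.StatusCode(), "latency:", time.Since(start), "bytes:", len(resp.Body()))
}
```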
core/providers/gemini/embedding.go (1)
116-158: LGTM! Well-structured bidirectional conversion.

The new `ToBifrostEmbeddingResponse` function correctly maps Gemini's batch embedding response to Bifrost format with:

- Proper nil-safety checks
- Correct embedding data structure mapping
- Intelligent usage metadata fallback (Statistics → Metadata)
- Appropriate total tokens calculation for embeddings

(A rough sketch of the usage fallback is shown below.)
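The sketch below illustrates the Statistics-then-Metadata fallback using simplified, hypothetical shapes for the Gemini response types; the actual GeminiEmbeddingResponse fields in the PR may be structured differently.

```go
package gemini

// Simplified stand-ins for the response-side types discussed above; names and fields are assumptions.
type geminiStatistics struct {
	TokenCount int
}

type geminiMetadata struct {
	BillableCharacterCount int
}

type geminiEmbedding struct {
	Values     []float32
	Statistics *geminiStatistics
}

type geminiEmbeddingResponse struct {
	Embeddings []geminiEmbedding
	Metadata   *geminiMetadata
}

type bifrostUsage struct {
	PromptTokens int
	TotalTokens  int
}

// usageFromGemini prefers per-embedding Statistics and falls back to response-level Metadata.
func usageFromGemini(resp *geminiEmbeddingResponse) *bifrostUsage {
	if resp == nil {
		return nil
	}
	total := 0
	for _, e := range resp.Embeddings {
		if e.Statistics != nil {
			total += e.Statistics.TokenCount
		}
	}
	if total == 0 && resp.Metadata != nil {
		// Fallback: only a coarse count is available when Statistics are absent.
		total = resp.Metadata.BillableCharacterCount
	}
	if total == 0 {
		return nil
	}
	// Embeddings have no completion, so prompt tokens equal total tokens.
	return &bifrostUsage{PromptTokens: total, TotalTokens: total}
}
```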
f37a8f8 to 9deef80
Actionable comments posted: 0
♻️ Duplicate comments (1)
core/providers/gemini/embedding.go (1)
175-194: Logic issue: Input assignment should be moved outside the inner loop.

The block at lines 185-192 that sets `bifrostReq.Input` is still inside the `for _, req := range request.Requests` loop. This means it executes on each iteration, potentially overwriting the input multiple times. While the current logic happens to work because texts accumulate across iterations, the assignment should be moved after the loop completes for clarity and correctness.

🔎 Suggested fix:

    if len(request.Requests) > 0 {
        var texts []string
        for _, req := range request.Requests {
            if req.Content != nil && len(req.Content.Parts) > 0 {
                for _, part := range req.Content.Parts {
                    if part != nil && part.Text != "" {
                        texts = append(texts, part.Text)
                    }
                }
    -           if len(texts) > 0 {
    -               bifrostReq.Input = &schemas.EmbeddingInput{}
    -               if len(texts) == 1 {
    -                   bifrostReq.Input.Text = &texts[0]
    -               } else {
    -                   bifrostReq.Input.Texts = texts
    -               }
    -           }
            }
        }
    +   if len(texts) > 0 {
    +       bifrostReq.Input = &schemas.EmbeddingInput{}
    +       if len(texts) == 1 {
    +           bifrostReq.Input.Text = &texts[0]
    +       } else {
    +           bifrostReq.Input.Texts = texts
    +       }
    +   }

        embeddingRequest := request.Requests[0]
🧹 Nitpick comments (1)
tests/integrations/tests/test_openai.py (1)
987-1000: Document why Gemini doesn't return usage data.

The conditional skip for Gemini usage data validation is correct, but the inline comment could be more specific about whether this is a known limitation of the Gemini API or a temporary implementation detail in Bifrost's Gemini provider integration.

Consider expanding the comment to clarify:

    # Gemini embeddings API does not return usage/token metadata in the response
    # This is a known limitation of the Gemini embeddings endpoint
    if provider != "gemini":
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- core/providers/cohere/embedding.go
- core/providers/gemini/embedding.go
- core/providers/gemini/gemini.go
- core/providers/gemini/types.go
- docs/integrations/langchain-sdk.mdx
- tests/integrations/tests/test_langchain.py
- tests/integrations/tests/test_openai.py
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/integrations/langchain-sdk.mdx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/providers/gemini/embedding.go
- core/providers/gemini/types.go
- core/providers/cohere/embedding.go
- tests/integrations/tests/test_openai.py
- core/providers/gemini/gemini.go
- tests/integrations/tests/test_langchain.py
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- core/providers/gemini/embedding.go
- core/providers/gemini/types.go
- core/providers/cohere/embedding.go
- core/providers/gemini/gemini.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.
Applied to files:
- core/providers/gemini/embedding.go
- core/providers/gemini/types.go
- core/providers/cohere/embedding.go
- core/providers/gemini/gemini.go
🧬 Code graph analysis (5)
core/providers/gemini/embedding.go (3)
  core/schemas/embedding.go (5)
    - BifrostEmbeddingRequest (9-16)
    - BifrostEmbeddingResponse (22-28)
    - EmbeddingData (118-122)
    - EmbeddingStruct (124-128)
    - EmbeddingInput (31-36)
  core/providers/gemini/types.go (5)
    - GeminiBatchEmbeddingRequest (967-969)
    - GeminiEmbeddingRequest (972-978)
    - Content (981-989)
    - Part (995-1019)
    - GeminiEmbeddingResponse (1241-1244)
  core/schemas/utils.go (1)
    - SafeExtractStringPointer (468-483)
core/providers/cohere/embedding.go (1)
  core/schemas/chatcompletions.go (1)
    - BifrostLLMUsage (845-852)
tests/integrations/tests/test_openai.py (1)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
core/providers/gemini/gemini.go (3)
  core/providers/utils/utils.go (8)
    - CheckContextAndGetRequestBody (301-319)
    - SetExtraHeaders (223-253)
    - GetPathFromContext (256-261)
    - MakeRequestWithContext (43-97)
    - NewBifrostOperationError (776-787)
    - HandleProviderResponse (430-535)
    - ShouldSendBackRawRequest (819-824)
    - ShouldSendBackRawResponse (827-832)
  core/providers/gemini/embedding.go (1)
    - ToGeminiEmbeddingRequest (9-68)
  core/providers/gemini/types.go (1)
    - GeminiEmbeddingResponse (1241-1244)
tests/integrations/tests/test_langchain.py (2)
  core/providers/gemini/types.go (1)
    - Modality (710-710)
  tests/integrations/tests/utils/parametrize.py (2)
    - get_cross_provider_params_for_scenario (12-47)
    - format_provider_model (50-65)
🪛 Ruff (0.14.8)
tests/integrations/tests/test_openai.py
859-859: Unused method argument: test_config
(ARG002)
876-876: Unused method argument: test_config
(ARG002)
893-893: Unused method argument: test_config
(ARG002)
923-923: Unused method argument: test_config
(ARG002)
988-988: Unused method argument: test_config
(ARG002)
tests/integrations/tests/test_langchain.py
400-400: Do not catch blind exception: Exception
(BLE001)
407-407: Unused method argument: test_config
(ARG002)
445-445: Unused method argument: test_config
(ARG002)
481-481: Unused method argument: test_config
(ARG002)
510-510: Unused method argument: test_config
(ARG002)
540-540: Unused method argument: test_config
(ARG002)
577-577: Unused method argument: test_config
(ARG002)
609-609: Unused method argument: test_config
(ARG002)
643-643: Unused method argument: test_config
(ARG002)
668-668: Unused method argument: test_config
(ARG002)
691-691: Unused method argument: test_config
(ARG002)
750-750: Unused method argument: test_config
(ARG002)
813-813: Unused method argument: test_config
(ARG002)
848-848: Unused method argument: test_config
(ARG002)
883-883: Unused method argument: test_config
(ARG002)
978-978: Unused method argument: test_config
(ARG002)
1013-1013: Unused method argument: test_config
(ARG002)
1042-1042: Unused method argument: test_config
(ARG002)
1160-1160: Unused method argument: test_config
(ARG002)
1197-1197: Unused method argument: test_config
(ARG002)
1237-1237: Unused method argument: test_config
(ARG002)
1284-1284: Unused method argument: test_config
(ARG002)
1327-1327: Unused method argument: test_config
(ARG002)
🔇 Additional comments (8)
core/providers/cohere/embedding.go (1)
170-179: Good fallback for usage extraction from BilledUnits.

The added fallback correctly handles the case where `Meta.Tokens` is nil but `Meta.BilledUnits` is available. The logic mirrors the existing tokens-based path and maintains consistency in the response structure.

core/providers/gemini/gemini.go (1)
876-965: Well-structured batch embedding implementation.

The new embedding implementation correctly:

- Uses the Gemini `batchEmbedContents` endpoint for batch processing
- Follows established patterns from other methods (ChatCompletion, Speech)
- Properly handles error cases with appropriate error messages
- Includes latency tracking and raw request/response capture options
- Propagates provider metadata into the final response
core/providers/gemini/types.go (1)
967-978: Type definitions align with Gemini batch embedding API.

The new `GeminiBatchEmbeddingRequest` and extended `GeminiEmbeddingRequest` fields correctly model the Gemini API structure for batch embeddings. The JSON tags use appropriate camelCase to match API expectations, and pointer types are used for optional fields.

core/providers/gemini/embedding.go (2)
7-68: Well-structured batch request construction.

The function correctly:

- Collects texts from both single `Text` and multiple `Texts` inputs
- Creates individual `GeminiEmbeddingRequest` entries per text with proper model prefixing
- Propagates parameters (Dimensions, TaskType, Title) to each request

(A condensed sketch of this construction is shown below.)
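The following sketch condenses that construction. The types are simplified stand-ins for schemas.EmbeddingInput and the Gemini request types, and the JSON tags are assumptions; only the overall shape (collect texts, emit one request per text with the model prefixed and shared parameters copied) reflects the review's description.

```go
package gemini

// Simplified stand-ins for the types involved in the conversion.
type EmbeddingInput struct {
	Text  *string
	Texts []string
}

type Part struct {
	Text string `json:"text"`
}

type Content struct {
	Parts []*Part `json:"parts"`
}

type GeminiEmbeddingRequest struct {
	Model      string   `json:"model"`
	Content    *Content `json:"content"`
	Dimensions *int     `json:"outputDimensionality,omitempty"`
	TaskType   *string  `json:"taskType,omitempty"`
	Title      *string  `json:"title,omitempty"`
}

type GeminiBatchEmbeddingRequest struct {
	Requests []GeminiEmbeddingRequest `json:"requests"`
}

// toGeminiBatch builds one per-text request, prefixing the model and copying shared params.
func toGeminiBatch(model string, input *EmbeddingInput, dims *int, taskType, title *string) GeminiBatchEmbeddingRequest {
	var texts []string
	if input != nil {
		if input.Text != nil {
			texts = append(texts, *input.Text)
		}
		texts = append(texts, input.Texts...)
	}

	batch := GeminiBatchEmbeddingRequest{}
	for _, t := range texts {
		batch.Requests = append(batch.Requests, GeminiEmbeddingRequest{
			Model:      "models/" + model,
			Content:    &Content{Parts: []*Part{{Text: t}}},
			Dimensions: dims,
			TaskType:   taskType,
			Title:      title,
		})
	}
	return batch
}
```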
116-158: Good Gemini-to-Bifrost response conversion.

The new `ToBifrostEmbeddingResponse` function correctly converts Gemini embeddings to Bifrost format with proper handling of:

- Per-embedding data with index and object type
- Usage metadata extraction from Statistics (preferred) or Metadata (fallback)
tests/integrations/tests/test_langchain.py (2)
374-402: LGTM! Gemini embeddings test follows existing patterns.

The test correctly implements cross-provider embeddings testing for Gemini:

- Uses dummy API key consistent with Bifrost's auth handling (line 147-148 setup)
- Validates both single (`embed_query`) and batch (`embed_documents`) operations
- Follows the same structure as `test_03_openai_embeddings_basic`
- Appropriate exception handling with provider-specific skip message
The past review concerns about debug prints and API key access have been properly addressed.
407-1327: Test renumbering is correct and consistent.

All tests have been correctly renumbered from `test_04` through `test_26` to accommodate the new `test_04_gemini_embeddings_basic`. The sequential ordering is maintained, docstrings are updated with new case numbers, and test logic remains unchanged. This is the expected mechanical change for inserting a new test.

858-874: The hardcoded `dimensions=1536` is appropriate for all providers in the embeddings scenario and is supported by OpenAI text-embedding-3-small (native default), Gemini (via output_dimensionality truncation), and Cohere embeddings. No action required.
9deef80 to cac6fa5
cac6fa5 to bf3d78d
7b68adf to 2cf99d2
Merge activity
bf3d78d to ba5b938

Improve Gemini embedding request handling and add cross-provider embedding documentation
This PR fixes a bug in the Gemini embedding request handling to properly process multiple text inputs and adds documentation for using cross-provider embeddings with LangChain.
Changes
- `GoogleGenerativeAIEmbeddings` from the `langchain_google_genai` package for cross-provider embedding support

Type of change
Affected areas
How to test
Test the Gemini embedding functionality with multiple text inputs:
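One way to exercise this by hand is to post a multi-text embedding request through Bifrost. The sketch below assumes a locally running gateway exposing an OpenAI-compatible /v1/embeddings route on port 8080 and a provider-prefixed model name; the URL, port, and model are placeholders to adjust for your deployment.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Hypothetical Bifrost endpoint and payload; adjust to your local setup.
	payload, _ := json.Marshal(map[string]any{
		"model": "gemini/gemini-embedding-001",
		"input": []string{"first text to embed", "second text to embed", "third text to embed"},
	})

	resp, err := http.Post("http://localhost:8080/v1/embeddings", "application/json", bytes.NewReader(payload))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	// Expect one embedding per input text in the response data array.
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```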
Try the cross-provider embedding functionality with LangChain as documented in docs/integrations/langchain-sdk.mdx.
Breaking changes
Security considerations
No security implications.
Checklist