feat: add Cohere embedding integration #1305
base: develop
Pull Request Overview
This PR adds Cohere as a new embedding provider to the NeMo Guardrails framework, enabling users to use Cohere's embedding models for document encoding tasks.
- Implements CohereEmbeddingModel class with both synchronous and asynchronous encoding methods
- Adds comprehensive test coverage for the new Cohere integration
- Updates documentation to include Cohere in the supported embedding providers table
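As a rough illustration of what a provider with both synchronous and asynchronous encoding can look like, here is a minimal sketch. It is not the `CohereEmbeddingModel` from this PR: the class name `SketchCohereEmbeddingModel`, the injected `client` argument, and `FakeClient` are hypothetical, and the sketch assumes the client exposes `embed(texts=..., model=..., input_type=...)` and returns an object with an `embeddings` attribute.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, List


class SketchCohereEmbeddingModel:
    """Illustrative stand-in, not the CohereEmbeddingModel from the PR."""

    def __init__(self, embedding_model: str, client: Any, input_type: str = "search_document"):
        self.model = embedding_model
        # Assumption: the client exposes embed(texts=..., model=..., input_type=...)
        # and returns an object with an `embeddings` attribute.
        self.client = client
        self.input_type = input_type

    def encode(self, documents: List[str]) -> List[List[float]]:
        """Blocking call that returns one vector per input document."""
        response = self.client.embed(
            texts=documents, model=self.model, input_type=self.input_type
        )
        return response.embeddings

    async def encode_async(self, documents: List[str]) -> List[List[float]]:
        """Run the blocking encode() in the default thread pool executor."""
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.encode, documents)


class FakeClient:
    """Hypothetical offline client so the sketch runs without credentials."""

    def embed(self, texts, model, input_type):
        # One two-dimensional vector per text: [character count, word count].
        return SimpleNamespace(
            embeddings=[[float(len(t)), float(len(t.split()))] for t in texts]
        )


model = SketchCohereEmbeddingModel("embed-english-v3.0", client=FakeClient())
sync_vectors = model.encode(["hello world", "hi"])
async_vectors = asyncio.run(model.encode_async(["hello world", "hi"]))
```

Injecting the client keeps the sketch runnable offline; a real provider would more likely construct the client itself from an API key.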
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
nemoguardrails/embeddings/providers/cohere.py | Implements the CohereEmbeddingModel class with sync/async encoding capabilities |
nemoguardrails/embeddings/providers/__init__.py | Registers the new Cohere embedding provider in the system |
tests/test_embeddings_cohere.py | Adds comprehensive test coverage for Cohere embedding functionality |
tests/test_configs/with_cohere_embeddings/config.yml | Test configuration file for Cohere embeddings setup |
tests/test_configs/with_cohere_embeddings/config.co | Test flow configuration for validating Cohere integration |
docs/user-guides/configuration-guide.md | Updates documentation to include Cohere in supported providers table |
Comments suppressed due to low confidence (1)
tests/test_embeddings_cohere.py:70
- Function name test_live_query is duplicated. This function should have a distinct name like test_live_query_sync to differentiate it from the async version on line 52.
def test_live_query(app):
@bwook00 Thank you for your PR 🎉 Would you please enable the pre-commit hooks and apply them? Please see the contributing guidelines. Once they are enabled you can do:
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #1305     +/-   ##
===========================================
- Coverage    70.45%   70.40%   -0.05%
===========================================
  Files          161      162      +1
  Lines        16214    16241     +27
===========================================
+ Hits         11423    11434     +11
- Misses        4791     4807     +16
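The percentages in the coverage diff follow directly from the hit and line counts. The short check below recomputes them; the helper name `coverage_pct` is made up for this illustration, and the figures are copied from the report above.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of tracked lines that were executed."""
    return 100.0 * hits / lines


# Figures taken from the Codecov diff (develop vs. #1305).
before = coverage_pct(11423, 16214)
after = coverage_pct(11434, 16241)
delta = after - before

# Every tracked line is either a hit or a miss.
assert 11423 + 4791 == 16214
assert 11434 + 4807 == 16241

print(f"{before:.2f}% -> {after:.2f}% ({delta:+.2f})")
```

The project-wide drop is small because the 27 added lines are a tiny fraction of the roughly 16,000 tracked lines, even though 16 of them are not exercised.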
Hi @Pouyanpi, just wanted to gently ask if there are any plans to merge this PR.
Hi @bwook00, I apologize for the significant delay and truly appreciate your patience on this PR.
What blocked the merge: The blocker wasn't an issue with your code. It's that we don't have proper test infrastructure in place for external API providers. I've been meaning to implement a mock-based testing framework for this, but haven't found the time to do it properly. Currently, your tests follow the same pattern as our OpenAI tests (requiring LIVE_TEST mode with valid API credentials). While this works for manual testing, it means:
Why the coverage is 42.85%: Looking at your implementation, the 16 missing lines are likely:
I am going to push those tests directly to this branch (or open a PR that targets your branch), and then you can take over from there. What do you think?
Hi @Pouyanpi,
Sounds good 🙂. Please let me know once you've pushed the tests to this branch and where I should take over from. Alternatively, I'm also happy to add those tests myself. Just let me know whichever way is more convenient for you! So, just to confirm my understanding of the next steps:
Could you please confirm if this is correct? :)
Thank you @bwook00 for your flexibility and active contribution.
Yes, this is correct. You just need to remove the part that skips the Cohere tests, which was introduced in #1446.
You can start working on that PR now. It’s blocked by #1446, not #1305. We’ll be merging #1446 by the end of the day.
Add comprehensive mock-based unit tests for Cohere and OpenAI embedding providers that run without requiring API credentials.
Tests cover:
- Provider initialization with known/unknown models
- Sync and async encoding methods
- Custom parameters (input_type, api_key)
- ImportError handling
- All predefined model configurations
These tests complement existing live integration tests and enable consistent CI/CD testing without external API dependencies.
* skip cohere tests till #1305 is rebased onto develop after merging this PR
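To make the mock-based approach concrete, here is a small sketch of how a test can exercise a Cohere call path and the ImportError path without credentials or the `cohere` package installed. It is not the test code from #1446: `embed_with_cohere` is a hypothetical helper written for this example, and the `cohere.Client(api_key=...)` and `embed(texts=..., model=..., input_type=...)` calls reflect my understanding of the Cohere SDK rather than anything shown in this thread.

```python
import sys
from types import SimpleNamespace
from unittest.mock import MagicMock, patch


def embed_with_cohere(texts, model="embed-english-v3.0", api_key=None):
    """Hypothetical helper: import cohere lazily, then request embeddings."""
    try:
        import cohere
    except ImportError:
        raise ImportError("Could not import cohere; install it with `pip install cohere`.")
    client = cohere.Client(api_key=api_key)
    response = client.embed(texts=texts, model=model, input_type="search_document")
    return response.embeddings


def test_embed_uses_mocked_client():
    # A MagicMock placed in sys.modules is what `import cohere` resolves to.
    fake_cohere = MagicMock()
    fake_cohere.Client.return_value.embed.return_value = SimpleNamespace(
        embeddings=[[0.1, 0.2]]
    )
    with patch.dict(sys.modules, {"cohere": fake_cohere}):
        result = embed_with_cohere(["hello"], api_key="dummy")
    assert result == [[0.1, 0.2]]
    fake_cohere.Client.assert_called_once_with(api_key="dummy")


def test_missing_package_raises_import_error():
    # Mapping a module name to None makes `import cohere` fail.
    with patch.dict(sys.modules, {"cohere": None}):
        try:
            embed_with_cohere(["hello"])
        except ImportError as exc:
            assert "pip install cohere" in str(exc)
        else:
            raise AssertionError("expected ImportError")


test_embed_uses_mocked_client()
test_missing_package_raises_import_error()
```

Patching `sys.modules` rather than a module attribute means the sketch behaves the same whether or not the real package is installed in the CI environment.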
@bwook00 #1446 is merged to develop. Please rebase your branch onto develop and then remove the following lines from the Cohere tests:
    try:
        import nemoguardrails.embeddings.providers.cohere

        COHERE_AVAILABLE = True
    except (ImportError, ModuleNotFoundError):
        COHERE_AVAILABLE = False

    @pytest.mark.skipif(
        not COHERE_AVAILABLE, reason="Cohere provider not available in this branch"
    )
Description
Add Cohere Embedding provider
@Pouyanpi,
I added this along with #1304.
(It probably conflicts with #1304.)
Checklist