Contributor

@NaluTripician NaluTripician commented Oct 13, 2025

Description

This pull request introduces a new semantic reranking feature to the Azure Cosmos DB .NET SDK, enabling users to rerank documents using an inference service that leverages Azure Active Directory (AAD) authentication. The main changes include the addition of the InferenceService class, new API surface for semantic reranking, and appropriate integration into the SDK's authorization and client context infrastructure. Notably, this functionality is only available when using AAD authentication.

Semantic Reranking Feature Integration:

  • Added the InferenceService class, which handles communication with the Cosmos DB Inference Service for semantic reranking, including HTTP client configuration, payload construction, and response handling. This service enforces AAD authentication and manages its own authorization and disposal.
  • Added a new SemanticRerankAsync API to the Container class (public in PREVIEW builds, internal otherwise), allowing users to rerank a list of documents against a context/query string. It is implemented in ContainerInlineCore and routed through the client context. [1] [2]

Authorization and Token Handling Updates:

  • Extended the AuthorizationTokenProvider abstraction and its implementations to support a new method, AddInferenceAuthorizationHeaderAsync, which is only valid for AAD-based token providers. Non-AAD providers throw a NotImplementedException for this method. [1] [2] [3] [4] [5] [6]
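
The split described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: the `*Sketch` types, the header dictionary parameter, and the `Bearer` header format are all assumptions made for the example, and the real `AddInferenceAuthorizationHeaderAsync` signature may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the SDK's AuthorizationTokenProvider abstraction.
public abstract class AuthorizationTokenProviderSketch
{
    // Assumed signature: adds an authorization header for inference calls.
    public abstract Task AddInferenceAuthorizationHeaderAsync(
        IDictionary<string, string> headers);
}

// AAD-style provider: obtains a token and attaches it to the request headers.
public sealed class AadTokenProviderSketch : AuthorizationTokenProviderSketch
{
    private readonly Func<Task<string>> getTokenAsync;

    public AadTokenProviderSketch(Func<Task<string>> getTokenAsync)
    {
        this.getTokenAsync = getTokenAsync
            ?? throw new ArgumentNullException(nameof(getTokenAsync));
    }

    public override async Task AddInferenceAuthorizationHeaderAsync(
        IDictionary<string, string> headers)
    {
        string token = await this.getTokenAsync();

        // Assumption: a bearer-token header; the actual format is not shown in this PR text.
        headers["Authorization"] = $"Bearer {token}";
    }
}

// Key-based provider: inference authorization is not supported, so it throws.
public sealed class MasterKeyTokenProviderSketch : AuthorizationTokenProviderSketch
{
    public override Task AddInferenceAuthorizationHeaderAsync(
        IDictionary<string, string> headers)
    {
        throw new NotImplementedException(
            "Inference authorization requires AAD authentication.");
    }
}
```

Keeping the method on the shared abstraction lets the client context call it without type checks, at the cost of a runtime exception for key-based clients.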

Client Context and Resource Management:

  • Updated ClientContextCore and CosmosClientContext to manage the lifecycle of the InferenceService, including creation, caching, and disposal. Added methods for invoking semantic reranking and for retrieving or creating the inference service instance. [1] [2] [3] [4] [5] [6]
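
One way to get the create-once, cache, and dispose behavior described above is a `Lazy<T>` field owned by the client context. The sketch below assumes simplified, hypothetical `InferenceServiceSketch` and `ClientContextSketch` types; it is not the PR's actual implementation.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for InferenceService; only tracks disposal.
public sealed class InferenceServiceSketch : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        this.IsDisposed = true;
    }
}

// Hypothetical stand-in for the client context that owns the service.
public sealed class ClientContextSketch : IDisposable
{
    // ExecutionAndPublication ensures the factory runs at most once,
    // even when several threads request the service concurrently.
    private readonly Lazy<InferenceServiceSketch> inferenceService =
        new Lazy<InferenceServiceSketch>(
            () => new InferenceServiceSketch(),
            LazyThreadSafetyMode.ExecutionAndPublication);

    private bool isDisposed;

    public InferenceServiceSketch GetOrCreateInferenceService()
    {
        if (this.isDisposed)
        {
            throw new ObjectDisposedException(nameof(ClientContextSketch));
        }

        return this.inferenceService.Value;
    }

    public void Dispose()
    {
        if (this.isDisposed)
        {
            return;
        }

        this.isDisposed = true;

        // Only dispose the service if it was ever created.
        if (this.inferenceService.IsValueCreated)
        {
            this.inferenceService.Value.Dispose();
        }
    }
}
```

Because a single instance is shared, any background work the service starts exists once per client rather than once per call. The sketch does not guard against `Dispose` racing with `GetOrCreateInferenceService`.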

Dependency Updates:

  • Added a dependency on the Azure.Identity package in the test project to support AAD authentication scenarios.

Type of change:

  • [] New feature (non-breaking change which adds functionality)

@NaluTripician NaluTripician marked this pull request as draft October 13, 2025 17:52
@NaluTripician NaluTripician marked this pull request as ready for review October 22, 2025 22:37
milismsft previously approved these changes Oct 22, 2025
@milismsft milismsft left a comment

Please try to address the potential multiple background tasks related to the Inference object (and properly dispose of those tasks as well) :-)

Member

@aayush3011 aayush3011 left a comment

@NaluTripician LGTM. I added the comments that we discussed offline.

this.CreateClientHelper(this.httpClient);

//Set endpoints
this.inferenceEndpoint = new Uri($"https://{accountProperties.Id}.{basePath}");
Member

As discussed offline, please add an environment variable where the inference endpoint can be set up.

AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT
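
A minimal sketch of how such an override could work, assuming the environment variable takes precedence and the account-derived URI from the snippet above is the fallback. The `InferenceEndpointResolverSketch` class and its `Resolve` method are hypothetical names for illustration, not part of the SDK.

```csharp
using System;

// Hypothetical helper; the PR may wire this up differently.
public static class InferenceEndpointResolverSketch
{
    public const string EndpointVariable =
        "AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT";

    public static Uri Resolve(string accountId, string basePath)
    {
        string overrideValue = Environment.GetEnvironmentVariable(EndpointVariable);

        if (!string.IsNullOrWhiteSpace(overrideValue))
        {
            // Fail early on a malformed override instead of on the first request.
            if (!Uri.TryCreate(overrideValue, UriKind.Absolute, out Uri overrideUri))
            {
                throw new ArgumentException(
                    $"{EndpointVariable} must be an absolute URI.");
            }

            return overrideUri;
        }

        // Fallback: the account-derived endpoint, as in the reviewed code.
        return new Uri($"https://{accountId}.{basePath}");
    }
}
```

Validating the override when the endpoint is resolved means a typo in the variable surfaces at client setup rather than as an opaque HTTP failure later.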

4 participants