Skip to content

Conversation

emasab
Copy link
Contributor

@emasab emasab commented Sep 19, 2025

What

Implement metadata based authentication with Azure IMDS returning an OAuth/OIDC token.
Implementation is for both the synchronous and asynchronous client.
Before implementing the feature did a refactor to make it more maintainable to add new authentication methods.

Checklist

  • Contains customer facing changes? Including API/behavior changes
  • Unit tests are present, similar to those present to OAuth/OIDC with client_credentials

References

JIRA:

Test & Review

Run the normal test suite.
An example in present for how to use the SR authentication methods and the Kafka one as well.

Open questions / Follow-ups

to be common for the sync and async clients
@emasab emasab requested review from MSeal and a team as code owners September 19, 2025 16:35
@Copilot Copilot AI review requested due to automatic review settings September 19, 2025 16:35
@emasab emasab requested a review from a team as a code owner September 19, 2025 16:35
@confluent-cla-assistant
Copy link

🎉 All Contributor License Agreements have been signed. Ready to merge.
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements OAuth/OIDC metadata-based authentication with Azure IMDS (Instance Metadata Service) for Schema Registry clients. The implementation refactors the authentication system to make it more maintainable and adds Azure IMDS support for both synchronous and asynchronous clients.

Key changes:

  • Adds new Azure IMDS authentication method supporting metadata-based token retrieval
  • Refactors authentication code into reusable builder patterns to reduce duplication
  • Implements comprehensive validation for Azure IMDS configuration parameters

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/confluent_kafka/schema_registry/common/_oauthbearer.py New module containing abstract base classes and builders for OAuth bearer authentication
src/confluent_kafka/schema_registry/_sync/schema_registry_client.py Refactored sync client with new authentication architecture and Azure IMDS support
src/confluent_kafka/schema_registry/_async/schema_registry_client.py Refactored async client with new authentication architecture and Azure IMDS support
tests/schema_registry/_sync/test_config.py Added comprehensive tests for Azure IMDS configuration validation
tests/schema_registry/_async/test_config.py Added comprehensive tests for Azure IMDS configuration validation
tests/schema_registry/_sync/test_bearer_field_provider.py Updated tests for refactored OAuth client structure
tests/schema_registry/_async/test_bearer_field_provider.py Updated tests for refactored OAuth client structure
examples/oauth_schema_registry.py Added example demonstrating Azure IMDS authentication
examples/oauth_oidc_ccloud_azure_imds_producer.py New example showing Azure IMDS OAuth for Kafka producer
Comments suppressed due to low confidence (1)

examples/oauth_oidc_ccloud_azure_imds_producer.py:1

  • Invalid Azure IMDS query string with empty parameter values. The query string contains empty values for required parameters (resource=, api-version=, client_id=) which will cause authentication failures. These should either be removed or contain valid placeholder values.
#!/usr/bin/env python

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 156 to 157
expiry_window = self.token_object['expires_in'] * self.token_expiry_threshold
return self.token_object['expires_on'] < time.time() + expiry_window
Copy link

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent expiry field usage in Azure IMDS token expiration check. The method uses expires_in to calculate expiry_window but then compares against expires_on. According to Azure IMDS documentation, tokens have expires_on (absolute timestamp) but may not have expires_in (relative seconds). Should use only expires_on for comparison or handle both fields appropriately.

Suggested change
expiry_window = self.token_object['expires_in'] * self.token_expiry_threshold
return self.token_object['expires_on'] < time.time() + expiry_window
# Azure IMDS returns 'expires_on' as an absolute timestamp (string or int).
expires_on = int(self.token_object['expires_on'])
buffer_seconds = 300 # 5 minutes buffer before actual expiry
return expires_on < time.time() + buffer_seconds

Copilot uses AI. Check for mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expires_in is available too from documentation, but will check when testing on an instance.

expiry_window = int(self.token_object['expires_in']) * self.token_expiry_threshold
return int(self.token_object['expires_on']) < time.time() + expiry_window
Copy link

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent expiry field usage in Azure IMDS token expiration check. The method uses expires_in to calculate expiry_window but then compares against expires_on. According to Azure IMDS documentation, tokens have expires_on (absolute timestamp) but may not have expires_in (relative seconds). Should use only expires_on for comparison or handle both fields appropriately.

Suggested change
expiry_window = int(self.token_object['expires_in']) * self.token_expiry_threshold
return int(self.token_object['expires_on']) < time.time() + expiry_window
# Use only 'expires_on' (absolute timestamp) for expiry check, as per Azure IMDS docs.
expires_on = int(self.token_object['expires_on'])
now = time.time()
# If 'expires_in' is present, use it to calculate a threshold window; else use a default buffer (e.g., 5 min)
if 'expires_in' in self.token_object:
total_lifetime = int(self.token_object['expires_in'])
expiry_buffer = total_lifetime * (1 - self.token_expiry_threshold)
else:
expiry_buffer = 300 # 5 minutes default buffer
return expires_on < now + expiry_buffer

Copilot uses AI. Check for mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Comment on lines +161 to +162
self.token_object = self.client.get(self.token_endpoint, headers=[
('Metadata', 'true')
]).json()
Copy link

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing error handling for Azure IMDS token requests. The method should handle HTTP errors, network timeouts, and invalid JSON responses that could occur when accessing the metadata service, and provide meaningful error messages for debugging authentication failures.

Suggested change
self.token_object = self.client.get(self.token_endpoint, headers=[
('Metadata', 'true')
]).json()
try:
response = self.client.get(self.token_endpoint, headers=[('Metadata', 'true')], timeout=10.0)
response.raise_for_status()
try:
self.token_object = response.json()
except Exception as e:
raise OAuthTokenError(f"Failed to decode JSON from Azure IMDS token response: {e}") from e
except httpx.HTTPStatusError as e:
raise OAuthTokenError(f"HTTP error when requesting Azure IMDS token: {e.response.status_code} {e.response.text}") from e
except httpx.RequestError as e:
raise OAuthTokenError(f"Network error when requesting Azure IMDS token: {e}") from e
except Exception as e:
raise OAuthTokenError(f"Unexpected error when requesting Azure IMDS token: {e}") from e

Copilot uses AI. Check for mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exceptions and retries are handled in superclass' generate_access_token

Comment on lines +160 to +162
self.token_object = await self.client.get(self.token_endpoint, headers=[
('Metadata', 'true')
]).json()
Copy link

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing error handling for Azure IMDS token requests. The method should handle HTTP errors, network timeouts, and invalid JSON responses that could occur when accessing the metadata service, and provide meaningful error messages for debugging authentication failures.

Suggested change
self.token_object = await self.client.get(self.token_endpoint, headers=[
('Metadata', 'true')
]).json()
try:
response = await self.client.get(
self.token_endpoint,
headers={'Metadata': 'true'},
timeout=5.0
)
if response.status_code != 200:
raise OAuthTokenError(
f"Failed to fetch Azure IMDS token: HTTP {response.status_code} - {response.text}"
)
try:
self.token_object = response.json()
except json.JSONDecodeError as e:
raise OAuthTokenError(
f"Failed to decode Azure IMDS token response as JSON: {e} - Response: {response.text}"
)
except (httpx.RequestError, httpx.TimeoutException) as e:
raise OAuthTokenError(
f"Network error while fetching Azure IMDS token: {e}"
)

Copilot uses AI. Check for mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exceptions and retries are handled in superclass' generate_access_token

@sonarqube-confluent

This comment has been minimized.

@airlock-confluentinc airlock-confluentinc bot force-pushed the dev_sr_oautbearer_azure_imds branch 2 times, most recently from ccc80a6 to f6f4851 Compare September 19, 2025 19:18
@sonarqube-confluent

This comment has been minimized.

1 similar comment
@sonarqube-confluent

This comment has been minimized.

@airlock-confluentinc airlock-confluentinc bot force-pushed the dev_sr_oautbearer_azure_imds branch from f6f4851 to 6df4cce Compare September 23, 2025 15:02
@airlock-confluentinc airlock-confluentinc bot force-pushed the dev_sr_oautbearer_azure_imds branch from 807345b to ab09f28 Compare September 25, 2025 08:04
@sonarqube-confluent

This comment has been minimized.

1 similar comment
@sonarqube-confluent

This comment has been minimized.

@sonarqube-confluent
Copy link

Passed

Analysis Details

54 Issues

  • Bug 0 Bugs
  • Vulnerability 16 Vulnerabilities
  • Code Smell 38 Code Smells

Coverage and Duplications

  • Coverage 90.50% Coverage (66.60% Estimated after merge)
  • Duplications No duplication information (5.20% Estimated after merge)

Project ID: confluent-kafka-python

View in SonarQube

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant