Skip to content

Conversation

@adamnova
Copy link
Contributor

@adamnova adamnova commented Sep 12, 2025

Pull Request Template

#4678

Overview

Adds opt-in support for System.Text.Json streaming API in client-side encryption operations, providing improved performance and reduced memory usage for large encrypted documents.

Usage

Set via RequestOptions (Read/Write Operations)

var requestOptions = new ItemRequestOptions
{
    Properties = new Dictionary<string, object>
    {
        { "encryption-json-processor", JsonProcessor.Stream }
    }
};

// Read with streaming processor
var response = await encryptionContainer.ReadItemAsync<MyDoc>(
    id, partitionKey, requestOptions);

Listening to telemetry

OpenTelemetry SDK

OpenTelemetry SDK automatically captures activities from both sources:

using OpenTelemetry;
using OpenTelemetry.Trace;

var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddSource("Microsoft.Azure.Cosmos.Encryption.Custom")
    .AddConsoleExporter()
    .Build();

// Use your encrypted container - all traces are automatically collected

Using ActivityListener

For custom processing without OpenTelemetry SDK, you can use ActivityListener:

using System.Diagnostics;

var listener = new ActivityListener
{
    ShouldListenTo = (activitySource) => 
        activitySource.Name == "Microsoft.Azure.Cosmos.Encryption.Custom"
    
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => 
        ActivitySamplingResult.AllDataAndRecorded,
    
    ActivityStarted = activity =>
    {
        Console.WriteLine($"[START] {activity.Source.Name}: {activity.DisplayName}");
    },
    
    ActivityStopped = activity =>
    {
        Console.WriteLine($"[STOP]  {activity.Source.Name}: {activity.DisplayName} ({activity.Duration.TotalMilliseconds}ms)");
    }
};

ActivitySource.AddActivityListener(listener);

// Clean up when done
listener.Dispose();

Key Changes

  • New Property Bag Key: "encryption-json-processor" in RequestOptions.Properties
  • New Enum Value: JsonProcessor.Stream (available in .NET 8.0+)
  • Backward Compatible: Defaults to JsonProcessor.Newtonsoft when not specified
  • Algorithm Support: Only works with MdeAeadAes256CbcHmac256Randomized algorithm

Introduces JsonProcessorPropertyBag to centralize handling of JsonProcessor overrides using RequestOptions.Properties. Updates encryption and decryption workflows to support processor selection, adds related tests and helpers, and refactors code to use the new override mechanism for both production and test scenarios.
Introduces CosmosDiagnosticsContext for lightweight diagnostics and scope tracking, with ActivitySource integration for telemetry. Updates EncryptionProcessor to record diagnostic scopes and enforce streaming mode restrictions. Adds comprehensive unit tests for diagnostics context and processor, and merges stream processor end-to-end tests into MdeCustomEncryptionTests.
Moved diagnostics scope creation for MDE decryption to cover the entire decryption logic in EncryptionProcessor. Updated tests to assert scope presence/absence more robustly and removed conditional compilation for ENCRYPTION_CUSTOM_PREVIEW. Project files now always define ENCRYPTION_CUSTOM_PREVIEW constant.
Introduces direct serialization to output streams via WriteToStream to reduce intermediate memory allocations. Refactors decryption logic to streamline processor selection and diagnostics scope naming, and adds EncryptionDiagnostics constants for improved diagnostics context management. Updates MDE encryption processor to support direct stream handling and processor selection for .NET 8+ preview builds.
Introduces TestEncryptorFactory to centralize and simplify the creation of mock Encryptor and DataEncryptionKey instances in tests, reducing repetitive Moq setup code. Updates all relevant test classes to use this factory. Refactors MdeEncryptionProcessor to improve fallback handling for legacy and unencrypted streams, ensuring correct stream positioning and graceful fallback to legacy decryption paths.
Replaces separate diagnostic prefixes for JSON processor selection with a unified prefix in EncryptionDiagnostics and updates all usages and tests accordingly. Adds documentation for the JSON processor property bag override and introduces new tests for large payloads, concurrency, and cancellation scenarios with the streaming JSON processor.
Adds validation to ensure that streaming encryption only processes JSON documents with an object as the root element. Root arrays and other root types are now explicitly rejected, improving contract enforcement and error handling.
Refactored root validation logic to ensure streaming encryption only accepts JSON documents with an object root. Added tests to verify rejection of root arrays and primitive values, improving contract enforcement and error messaging.
Expanded EncryptionProcessorDiagnosticsTests to cover diagnostics scope emission for various decryption paths, including MDE payloads, provided-output, stream override, malformed JSON, duplicate scope prevention, null input, cancellation, and encryption. Added a fake MDE DataEncryptionKey implementation and a SlowCancelableStream helper for cancellation tests. Conditional assertions ensure correct scope behavior for NET8 preview builds.
Introduces IMdeJsonProcessorAdapter and concrete adapters for Newtonsoft and Stream processors to unify JSON processing logic in MdeEncryptionProcessor. Diagnostic scope tracking for encryption and decryption is improved, and related tests are updated to verify scope tracking for both processors. Legacy logic for processor selection and property inspection is moved into the respective adapters for better maintainability.
Unifies and updates diagnostics scope naming for encryption and decryption selection, removing legacy/impl scopes and ensuring only one selection scope is emitted per operation. Refactors MDE processor logic to directly handle Newtonsoft and Stream processors, improving fallback and error handling. Adds comprehensive unit tests for NewtonsoftAdapter and StreamAdapter, covering encryption, decryption, legacy/unencrypted payloads, and diagnostics scope assertions.
…rypted stream used; sync request options with processor; add legacy override guard and fallback JObject decrypt
Added a test to verify that explicitly overriding the JSON processor with Newtonsoft maintains the expected encryption/decryption behavior. Also added assertions to change feed processor tests to ensure both documents are processed and the processor does not time out.
Changed expected diagnostics.Scopes.Count from 1 to 0 in multiple StreamAdapterTests to reflect updated behavior. Ensures tests align with current diagnostics tracking implementation.
Introduces EncryptionRequestOptionsExperimental for configuring the JSON processor pipeline via request options, including helper methods and tests. Refactors JsonProcessorPropertyBag for improved property handling and updates test helpers to use the new API.
@adamnova adamnova marked this pull request as ready for review September 26, 2025 09:45
- Created PR_DESCRIPTION.md with detailed usage examples, benefits, compatibility matrix
- Added KIRAN_FEEDBACK_TODOS.md with in-depth analysis of all review comments
- Addresses Kiran's comment: 'Please add the details of usage in the PR description'
- Includes code examples for opt-in via RequestOptions and EncryptionOptions
- Documents property bag key, migration path, and architecture details
- Provides performance guidance and diagnostic scope information
- Addresses Kiran's comment: 'null check for owner'
- Added defensive null check before calling this.owner.Record()
- Prevents potential NullReferenceException if struct state becomes inconsistent
- All 202 unit tests pass
- Minimal performance cost (one null check)
- No behavioral changes for normal usage
- Addresses Kiran's comment: 'Its better for the caller to reset position not the responsibility of the write method'
- Removed automatic position reset from CosmosJsonDotNetSerializer.WriteToStream()
- Updated XML documentation to clarify caller responsibility for position management
- Updated both callers in NewtonsoftAdapter to explicitly manage stream position
- Caller 1: Added explicit position reset after write for returned stream
- Caller 2: Removed redundant pre-write position set, kept post-write reset
- Fixes Single Responsibility Principle violation
- All 202 unit tests pass
- No behavioral changes - callers now have explicit control over stream position
…cope

Addresses Kiran's code review comment about Activity disposal when Scope is copied.

Changes:
- Added comprehensive XML documentation to Scope struct explaining proper usage
- Documented that Scope should only be used with 'using' pattern
- Added inline comment explaining Activity.Dispose() idempotency
- Simplified null check to use null-conditional operator for consistency

Analysis:
- Activity.Dispose() is idempotent per .NET framework documentation
- Multiple disposals are safe but not the intended usage pattern
- Current implementation is already correct and safe
- This change improves code clarity and prevents misuse

The Scope struct is designed for 'using' statement usage which ensures
single disposal. While copying the struct would result in multiple
Activity.Dispose() calls, this is safe due to idempotency, though
not recommended. Documentation now clearly communicates this contract.

Tests: All 202 tests pass on net6.0 and net8.0
Addresses Kiran's code review comment about startTicks being an odd parameter.

Changes:
- Removed startTicks parameter from Scope constructor
- Moved Stopwatch.GetTimestamp() call from CreateScope into Scope constructor
- Updated CreateScope to only pass owner, name, and activity to Scope

Benefits:
- Better timing accuracy - timestamp captured closer to actual scope start
- Cleaner design - Scope is now more self-contained
- Reduced time gap between Activity start and tick capture
- More intuitive API - Scope owns its own start time

The previous design had a small timing gap where Activity was started,
then startTicks was captured, then Scope was constructed. Now the
timestamp is captured immediately during Scope construction, providing
more accurate timing measurements.

Tests: All 202 tests pass on net6.0 and net8.0
Addresses Kiran's code review comment about scope being self-contained
and not capturing nested spans (limitation).

Changes:
- Added comprehensive XML remarks to CosmosDiagnosticsContext class
- Documented that scopes are recorded in flat list, not hierarchically
- Added example showing LIFO disposal order for nested scopes
- Updated CreateScope method documentation to clarify nested behavior
- Recommended alternatives for hierarchical span tracking

Documentation now clearly explains:
- Nested scopes like Outer { Inner1 { } Inner2 { } } record as [Inner1, Inner2, Outer]
- No parent-child relationships are maintained
- This is a design limitation of the lightweight implementation
- For hierarchical tracking, use ActivitySource/Activity or structured logging

This improves understanding of the diagnostics system's capabilities
and helps developers make informed decisions about when to use
alternative tracking mechanisms.

Tests: All 202 tests pass on net6.0 and net8.0
Addresses Kiran's code review question about why owner=null is supported
instead of throwing an ArgumentException.

Changes:
- Added detailed XML remarks to Scope struct explaining null owner design
- Documented Scope.Noop property with rationale for default(Scope) pattern
- Updated CreateScope documentation to clarify no-op return value
- Enhanced inline comment explaining intentional null owner for no-op pattern

Rationale for null owner design:
- Performance: Avoids allocations for empty scope names or disabled diagnostics
- Simplicity: enabled flag (owner != null) provides clean early-exit in Dispose
- Safety: Null-conditional operator usage prevents NullReferenceException
- Zero-allocation no-op pattern: default(Scope) requires no memory allocation

The null owner pattern is intentional and safe:
- enabled=false when owner is null
- Dispose() checks enabled flag first and returns early
- No operations performed for no-op scopes
- This is a common pattern for lightweight diagnostic scopes

Alternative considered but rejected: Throwing ArgumentException would require
a different no-op pattern and add complexity without meaningful benefit.

Tests: All 202 tests pass on net6.0 and net8.0
Addresses Kiran's code review question about whether EncryptionDiagnostics
should be a separate type vs inlined into CosmosDiagnosticsContext.

Changes:
- Added comprehensive XML summary documenting purpose and usage
- Added XML remarks explaining separation of concerns rationale
- Documented relationship with CosmosDiagnosticsContext
- Added XML documentation and examples for both scope name prefix constants
- Clarified usage patterns with concrete examples

Design rationale (kept as separate class):
- Separation of concerns:
  * CosmosDiagnosticsContext = diagnostic infrastructure (scope creation, recording, Activity)
  * EncryptionDiagnostics = domain-specific scope names for encryption operations
- Better discoverability of all encryption-related scope names
- Reusability across encryption pipeline components
- Clear naming showing these are encryption-specific constants
- Maintains single responsibility principle

Alternative considered but not chosen: Inlining into CosmosDiagnosticsContext
would mix infrastructure code with domain constants and reduce discoverability.

The documentation now clearly explains why the separate class design was
chosen and how it improves code organization.

Tests: All 202 tests pass on net6.0 and net8.0
Removed verbose explanations for:
- Null-conditional operator behavior (self-explanatory)
- Activity.Dispose() idempotency (common .NET knowledge)

The code remains functionally identical.
@kirankumarkolli
Copy link
Member

var requestOptions = new ItemRequestOptions
{
    Properties = new Dictionary<string, object>
    {
        { "encryption-json-processor", "JsonProcessor.Stream" }
    }
};

@kirankumarkolli
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@adamnova adamnova force-pushed the feature/encryptionprocessor-stream-switch branch from c449f17 to 1a4d4b8 Compare November 3, 2025 07:53
- Enhance documentation to explicitly state that only the most recently created scope is tracked
- Clarify that parent-child relationships are not captured (flat list design)
- Explain this is an intentional design decision for simplicity and lightweight implementation
- Add guidance to use Activity objects directly for hierarchical tracing if needed
…e EncryptionDiagnostics

- Move ScopeEncryptModeSelectionPrefix and ScopeDecryptModeSelectionPrefix constants from EncryptionDiagnostics to CosmosDiagnosticsContext
- Delete EncryptionDiagnostics.cs file and empty Diagnostics folder
- Update all references in MdeEncryptionProcessor.cs to use CosmosDiagnosticsContext constants
- Update all test file references (EncryptionProcessorTests.cs and MdeCustomEncryptionTests.cs)
- Addresses Kiran's feedback to inline constants into infrastructure class
- Improves code organization by keeping diagnostic scope names with the diagnostics infrastructure
…text to use fail-fast validation

- Add ThrowIfNullOrEmpty method to ArgumentValidation for string validation
- Add ThrowIfNullOrWhiteSpace method to ArgumentValidation for comprehensive string validation
- Update CreateScope method in CosmosDiagnosticsContext to use ArgumentValidation.ThrowIfNullOrEmpty
- Remove defensive null/empty handling in favor of fail-fast approach
- Remove unused Scope.Noop property
- All methods include backward compatibility for .NET Standard 2.0, .NET 6.0, and .NET 8.0+
Improves CosmosDiagnosticsContext by tracking disposed scopes to ensure idempotent disposal and thread safety, and refines scope name validation. Adds comprehensive unit tests and a testable wrapper to verify scope creation, disposal, timing, and concurrency behaviors.
… CosmosDiagnosticsContext improvements

- Replace manual null check patterns with ArgumentValidation.ThrowIfNull/ThrowIfNullOrEmpty/ThrowIfNullOrWhiteSpace
  - MdeEncryptionAlgorithm.cs: dekProperties and encryptionKeyStoreProvider parameters
  - DataEncryptionKeyContainerCore.cs: id parameter
  - DataEncryptionKeyContainerInlineCore.cs: id parameter (2 methods)
  - EncryptionOptionsExtensions.cs: DataEncryptionKeyId, EncryptionAlgorithm, PathsToEncrypt
  - MdeKeyWrapProvider.cs: metadata parameter (2 methods)
  - SymmetricKey.cs: rootKey parameter
  - MemoryTextReader.cs: buffer parameter
- Remove #pragma warning disable CA2208 directives from EncryptionOptionsExtensions
- Simplify CosmosDiagnosticsContext.Scope to class-based implementation for proper idempotent disposal
- Add comprehensive XML documentation to IMdeJsonProcessorAdapter interface
- Add Activity integration tests for CosmosDiagnosticsContext
- Add diagnostic scope validation to EncryptionProcessorTests
- All 257 tests passing
…ull-conditional operators

- Added ArgumentValidation.ThrowIfNull(diagnosticsContext) to all methods accepting diagnosticsContext parameter
- Replaced diagnosticsContext?.CreateScope() with diagnosticsContext.CreateScope()
- Ensures contract clarity since diagnosticsContext is never null in production code paths
- Add ThrowIfNegative(int) to ArgumentValidation for validating non-negative values
- Add ThrowIfGreaterThan(int, int) to ArgumentValidation for validating upper bounds
- Both methods use .NET 8+ ArgumentOutOfRangeException helpers when available
- Replace manual ArgumentOutOfRangeException checks in MemoryTextReader.Read() with ArgumentValidation calls
- Ensures consistency with ArgumentValidation pattern used throughout the codebase
- All 257 tests passing
Introduces comprehensive tests for ArgumentValidation methods, covering null, empty, whitespace, negative, and comparison scenarios. Ensures correct exception types and param names are thrown for invalid arguments, and verifies valid inputs are accepted.
@kirankumarkolli
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@kirankumarkolli kirankumarkolli merged commit 14c1265 into Azure:master Nov 4, 2025
25 of 29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants