Conversation

@priyansh4320 (Collaborator) commented Jan 19, 2026

Why are these changes needed?

The current Bedrock client (api_type: "bedrock") uses the legacy ModelClient interface, which returns flattened ChatCompletion responses. This limits access to rich content types and creates inconsistencies across providers.

import os

# Import paths assumed from this PR's module layout (autogen/llm_clients/)
from autogen.llm_clients import BedrockV2Client, UnifiedResponse

# Create Bedrock V2 client directly
client = BedrockV2Client(
    aws_region=os.getenv("AWS_REGION", "eu-north-1"),
    aws_access_key=os.getenv("AWS_ACCESS_KEY"),
    aws_secret_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

# Make a request
response = client.create({
    "model": "qwen.qwen3-coder-480b-a35b-v1:0",
    "messages": [{"role": "user", "content": "Explain quantum computing in 2 sentences."}],
})

# Verify it's a UnifiedResponse
print(f"Response type: {type(response)}")
print(f"Is UnifiedResponse: {isinstance(response, UnifiedResponse)}")
print(f"\nProvider: {response.provider}")
print(f"Model: {response.model}")
print(f"\nText content: {response.text}")
print(f"\nUsage: {response.usage}")
print(f"Cost: ${response.cost:.6f}" if response.cost is not None else "Cost: N/A")

Related issue number

#2346

Checks

joggrbot bot (Contributor) commented Jan 19, 2026

📝 Documentation Analysis

All docs are up to date! 🎉


✅ Latest commit analyzed: efec957 | Powered by Joggr

@priyansh4320 priyansh4320 self-assigned this Jan 19, 2026
@priyansh4320 priyansh4320 marked this pull request as ready for review January 20, 2026 07:43
@priyansh4320 (Collaborator, Author) commented:

@claude Review this PR. Summarize the features supported from bedrock v2 client and compare with https://github.com/ag2ai/ag2/blob/main/autogen/oai/bedrock.py

claude bot commented Jan 20, 2026

Claude finished @priyansh4320's task in 3m 3s


PR Review: Bedrock V2 Client Implementation

Summary

This PR successfully implements a next-generation Bedrock V2 client (api_type: "bedrock_v2") that returns rich UnifiedResponse objects instead of flattened ChatCompletion responses. The implementation is well-structured, thoroughly tested, and properly documented.

Recommendation: ✅ Approve with minor suggestions


Feature Comparison: V2 vs Legacy Client

Bedrock V2 Client Features (autogen/llm_clients/bedrock_v2.py)

✨ New Features in V2

  1. Rich Content Preservation

    • Returns UnifiedResponse with typed content blocks (TextContent, ImageContent, ToolCallContent, GenericContent)
    • Preserves all response metadata and provider-specific information
    • Forward-compatible with unknown content types via GenericContent
  2. Structured Outputs Support

    • Full support for response_format with Pydantic models or JSON schemas
    • Implements structured output via internal tool calling (__structured_output tool)
    • Schema normalization and $ref resolution for Pydantic models
  3. Direct Property Access

    • response.text - Direct text access
    • response.get_content_by_type("image") - Type-based content filtering
    • response.get_tool_calls() - Tool call extraction
    • response.cost, response.usage, response.provider - Rich metadata
  4. Enhanced Type Safety

    • Pydantic-based models with validation
    • Enum-based roles (UserRoleEnum)
    • Type hints throughout
  5. Backward Compatibility

    • create_v1_compatible() method for legacy format
    • Works with existing AG2 agent system
  6. Improved Developer Experience

    • Cleaner API design
    • Better error messages
    • Consistent interface across providers
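The direct-property-access pattern described above can be sketched with minimal stand-in types. This is illustrative only: the real TextContent, ToolCallContent, and UnifiedResponse are Pydantic models defined in this PR, and their actual fields may differ.

```python
from dataclasses import dataclass, field


# Stand-ins for the typed content blocks (the real ones are Pydantic models)
@dataclass
class TextContent:
    type: str = "text"
    text: str = ""


@dataclass
class ToolCallContent:
    type: str = "tool_call"
    name: str = ""
    arguments: dict = field(default_factory=dict)


@dataclass
class UnifiedResponseSketch:
    """Illustrates the access pattern: .text, get_content_by_type, get_tool_calls."""

    contents: list

    @property
    def text(self) -> str:
        # Concatenate all text blocks into one string
        return "".join(c.text for c in self.contents if c.type == "text")

    def get_content_by_type(self, content_type: str) -> list:
        return [c for c in self.contents if c.type == content_type]

    def get_tool_calls(self) -> list:
        return self.get_content_by_type("tool_call")
```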

🔄 Shared Features (Both V1 and V2)

  1. AWS Authentication

    • Multiple credential methods (access key, session token, profile, IAM role)
    • Environment variable support
    • Same credential handling as V1
  2. Retry Configuration

    • total_max_attempts, max_attempts, mode parameters
    • Exponential backoff (standard, adaptive, legacy modes)
    • Identical to V1
  3. Model Parameters

    • temperature, top_p, max_tokens (base params)
    • top_k, k, seed, cache_seed (additional params)
    • additional_model_request_fields for model-specific features
    • Same as V1
  4. Tool Calling

    • Bedrock Converse API tool support
    • Tool definition formatting
    • Tool result handling
    • Same underlying implementation (shared utilities)
  5. System Prompt Support

    • supports_system_prompts parameter
    • Model-aware system message handling
    • Same as V1
  6. Image Support

    • Multimodal input (images in messages)
    • Image output (image content blocks)
    • Base64 and URL formats
    • V2 has better output handling with ImageContent blocks
  7. Cost Calculation

    • Custom pricing via price parameter
    • Fallback to hardcoded model pricing
    • Same pricing database
  8. Streaming Warning

    • Both warn that streaming is not supported
    • Both disable streaming if requested

Legacy Bedrock Client Features (autogen/oai/bedrock.py)

📦 V1-Only Characteristics

  1. Response Format

    • Returns ChatCompletion (OpenAI-compatible format)
    • Flattened structure loses rich content information
    • Text-only content in message body
  2. Limited Structured Output Support

    • Has response_format support (lines 151, 273-291, 337-361, 463-522)
    • Uses same internal __structured_output tool approach
    • BUT: Returns as text in flattened ChatCompletion format
  3. Message Retrieval

    • message_retrieval() returns list of Choice.message objects
    • Requires manual parsing to extract content
  4. No Rich Content Blocks

    • Images returned but not as typed blocks
    • Tool calls in OpenAI format (flat dict)
    • No content type filtering

Code Review Findings

✅ Strengths

  1. Excellent Code Organization (autogen/llm_clients/bedrock_v2.py:1-793)

    • Clear separation of concerns
    • Well-documented with extensive docstrings
    • Logical method grouping
  2. Comprehensive Testing (test/llm_clients/test_bedrock_v2.py:1-1049)

    • 1049 lines of tests covering all major features
    • Good use of fixtures and mocks
    • Tests for edge cases and error conditions
    • Protocol compliance tests
  3. Shared Utility Reuse (autogen/llm_clients/bedrock_v2.py:66-74)

    • Reuses functions from legacy client: calculate_cost, convert_stop_reason_to_finish_reason, extract_system_messages, format_tool_calls, format_tools, oai_messages_to_bedrock_messages
    • Avoids code duplication
    • Maintains consistency
  4. Documentation (website/docs/user-guide/models/amazon-bedrock.mdx:40-138)

    • Clear comparison between V1 and V2
    • Migration guide
    • Good examples
  5. Proper Error Handling

    • Import guards for optional dependencies (autogen/llm_clients/bedrock_v2.py:77-91)
    • Validation of required parameters (autogen/llm_clients/bedrock_v2.py:262-263)
    • Graceful degradation for unknown content types (autogen/llm_clients/bedrock_v2.py:609-610)

🔍 Minor Issues & Suggestions

1. Low Test Coverage (21.64%)

Issue: Codecov shows 228 lines missing coverage in bedrock_v2.py

Suggestion: The test file has 1049 lines but coverage is only 21.64%. This appears to be because integration tests (test/llm_clients/test_bedrock_v2_integration.py) require real AWS credentials and are skipped in CI. Consider:

  • Adding # pragma: no cover to integration-only code paths
  • Mock more edge cases in unit tests
  • Document which features require integration tests

2. Duplicate Schema Normalization Code

Locations:

  • autogen/llm_clients/bedrock_v2.py:318-356
  • autogen/oai/bedrock.py:196-270

Issue: The _normalize_pydantic_schema_to_dict() method is duplicated between V1 and V2 clients.

Suggestion: Extract to a shared utility module to reduce duplication:

# autogen/oai/bedrock_utils.py (or similar)
from typing import Any
from pydantic import BaseModel

def normalize_pydantic_schema_to_dict(schema: dict[str, Any] | type[BaseModel]) -> dict[str, Any]:
    """Shared normalization logic used by both the V1 and V2 clients."""
    ...

3. Structured Output Logic Duplication

Locations:

  • V2: Lines 301-410, 508-544, 569-613
  • V1: Lines 170-361, 473-522

Issue: Structured output handling (_get_response_format_schema, _create_structured_output_tool, _merge_tools_with_structured_output, _extract_structured_output_from_tool_call, _validate_and_format_structured_output) is largely duplicated.

Suggestion: Consider extracting shared logic to a base class or utility module.
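For reference, the internal `__structured_output` tool approach that both clients use can be sketched in the Bedrock Converse toolSpec format (the field layout follows the Converse API; the helper itself is illustrative, not code from this PR):

```python
from typing import Any


def make_structured_output_tool(json_schema: dict[str, Any]) -> dict[str, Any]:
    # Wrap the target JSON schema as a Converse API tool; the model is then
    # steered to "call" this tool, and the tool arguments it produces are
    # parsed back out as the structured output.
    return {
        "toolSpec": {
            "name": "__structured_output",
            "description": "Return the final answer as JSON matching the schema.",
            "inputSchema": {"json": json_schema},
        }
    }
```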

4. Response Format Handling in __init__

Location: autogen/llm_clients/bedrock_v2.py:260

Issue: Response format is stored in __init__ but can also be passed in create() params (line 439). This could lead to confusion.

Suggestion: Document the precedence clearly or remove instance-level response_format in favor of per-request specification.
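One way to make that precedence explicit, assuming the per-request value should win (a sketch of the suggested rule, not the PR's current behavior):

```python
def resolve_response_format(instance_default, request_params: dict):
    # Per-request response_format overrides the instance-level default;
    # returns None when neither is set.
    return request_params.get("response_format", instance_default)
```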

5. Tool Call Response Format Inconsistency

Location: autogen/llm_clients/bedrock_v2.py:674-684

Issue: In create_v1_compatible(), tool calls are formatted as dicts, but in the main response they're ToolCallContent objects. The transformation is correct but could be more explicit.

Suggestion: Add a comment explaining the format transformation for clarity.

6. Empty Content Block Handling

Location: autogen/llm_clients/bedrock_v2.py:615-616

Issue: If response has no content, an empty TextContent is added. This is good defensive coding but worth documenting.

Suggestion: Add a comment explaining why empty text is added (to ensure response.messages[0] always exists).

7. Hardcoded Model Pricing is Outdated

Location: autogen/oai/bedrock.py:892-899

Issue: The legacy client has hardcoded prices for only 6 models, all from 2024 or earlier. V2 client references this via calculate_cost() import.

Suggestion: Add a comment in documentation encouraging use of price parameter for accurate pricing, as model prices change and new models are added frequently.
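If the `price` parameter follows AG2's usual `[input_price_per_1k_tokens, output_price_per_1k_tokens]` convention (an assumption worth verifying against the docs), the cost math is simply:

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int, price: list[float]) -> float:
    # price = [input price per 1K tokens, output price per 1K tokens]
    input_cost = prompt_tokens / 1000 * price[0]
    output_cost = completion_tokens / 1000 * price[1]
    return input_cost + output_cost
```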

📊 Architecture Assessment

V2 Client Architecture: ✅ Excellent

  • Clean separation between protocol implementation and business logic
  • Type-safe with Pydantic models
  • Forward-compatible design
  • Proper abstraction layers

Code Reuse: ✅ Good

  • Shares utilities with legacy client where appropriate
  • Avoids unnecessary duplication of AWS/Bedrock-specific logic

Testing Strategy: ⚠️ Needs Improvement

  • Unit tests are comprehensive
  • Integration tests exist but don't run in CI (they require AWS credentials)
  • Coverage metrics don't reflect actual test quality

Feature Support Summary

| Feature | V1 (`bedrock`) | V2 (`bedrock_v2`) | Notes |
|---|---|---|---|
| Response format | ChatCompletion | UnifiedResponse | V2 is richer |
| Text content | ✅ Flattened | ✅ Typed blocks | V2 preserves structure |
| Tool calling | ✅ | ✅ | Same underlying API |
| Structured outputs | ✅ Limited | ✅ Full support | V2 has better API |
| Image input | ✅ | ✅ | Same support |
| Image output | ✅ | ✅ | V2 has ImageContent blocks |
| System prompts | ✅ | ✅ | Same support |
| Cost tracking | ✅ | ✅ | Same pricing data |
| Retry config | ✅ | ✅ | Identical |
| Auth methods | ✅ | ✅ | Identical |
| Model parameters | ✅ | ✅ | Same parameters |
| Streaming | ❌ | ❌ | Both warn and disable |
| Rich content access | ❌ | ✅ | V2 only |
| Forward compatibility | ❌ | ✅ | V2 has GenericContent |
| Direct property access | ❌ | ✅ | V2 only |
| Backward compat | N/A | ✅ | V2 has create_v1_compatible() |

Recommendations

For Approval

Approve this PR - The implementation is solid, well-tested, and provides significant value:

  • Modernized architecture with UnifiedResponse
  • Better developer experience
  • Forward compatibility
  • Maintains backward compatibility via create_v1_compatible()

For Future Work

  1. Improve Test Coverage Metrics

    • Add # pragma: no cover to integration-only paths
    • Expand unit test edge case coverage
    • Document coverage strategy
  2. Reduce Code Duplication

    • Extract shared structured output logic to utilities
    • Share schema normalization code
  3. Update Pricing Database

    • Add more recent model pricing
    • Add comment encouraging price parameter usage
  4. Consider Deprecation Path for V1

    • Add deprecation warning to V1 client in future release
    • Provide migration tools/documentation

Questions for Author

  1. Is the low coverage (21.64%) acceptable given that many tests require AWS credentials?
  2. Should we extract the shared structured output logic now or in a follow-up PR?
  3. Are there plans to eventually deprecate the V1 client?

Files Reviewed:

  • ✅ autogen/llm_clients/bedrock_v2.py (793 lines)
  • ✅ autogen/oai/bedrock.py (915 lines)
  • ✅ test/llm_clients/test_bedrock_v2.py (1049 lines)
  • ✅ website/docs/user-guide/models/amazon-bedrock.mdx
  • ✅ autogen/oai/client.py (registration logic)

codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 21.64948% with 228 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| autogen/llm_clients/bedrock_v2.py | 21.47% | 222 Missing and 1 partial ⚠️ |
| autogen/oai/client.py | 0.00% | 4 Missing and 1 partial ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| autogen/llm_clients/__init__.py | 100.00% <100.00%> (ø) |
| autogen/llm_config/types.py | 100.00% <100.00%> (ø) |
| autogen/oai/client.py | 53.72% <0.00%> (-4.04%) ⬇️ |
| autogen/llm_clients/bedrock_v2.py | 21.47% <21.47%> (ø) |

... and 20 files with indirect coverage changes




Development

Successfully merging this pull request may close these issues.

[Feature Request]: Implement Bedrock V2 Client (ModelClientV2)
