
Conversation

priyansh4320 (Collaborator) commented Jan 13, 2026

Why are these changes needed?

This PR implements a V2 client for Anthropic.

Related issue number

Checks

priyansh4320 self-assigned this Jan 13, 2026
joggrbot (Contributor) commented Jan 13, 2026

📝 Documentation Analysis

Joggr found 1 outdated doc in the pull request.

Autofix

Joggr opened 1 pull request to fix the outdated doc.

Outdated

| File | Reason | Confidence |
| --- | --- | --- |
| website/docs/_blogs/2024-06-24-AltModels-Classes/index.mdx | The document appears mostly up-to-date and conceptually accurate, but it omits mention of the newly introduced 'anthropic_v2' client class for Anthropic that is now visible in both code and documentation, and all examples reference only the older 'anthropic' client. This may cause confusion for users, given the significance and feature differences between the V1 and V2 Anthropic clients. | 65.3% |

✅ Latest commit analyzed: 0cd3f6f | Powered by Joggr

priyansh4320 marked this pull request as ready for review January 15, 2026 04:12
priyansh4320 requested review from Lancetnik and randombet and removed the request for Lancetnik January 15, 2026 04:42
randombet (Collaborator) commented:

@claude Review this PR. Summarize the features supported by the Anthropic V2 client and compare it with https://github.com/ag2ai/ag2/blob/main/autogen/oai/anthropic.py

claude bot commented Jan 16, 2026

Claude finished @randombet's task in 1m 58s


PR Review: Anthropic V2 Client Implementation

Overview

This PR introduces a new Anthropic V2 Client (AnthropicV2Client) that implements the ModelClient protocol with enhanced support for structured responses and better integration with AG2's agent system. The V2 client returns UnifiedResponse objects instead of ChatCompletion, enabling richer feature support.


Feature Comparison: V1 vs V2

Architecture Differences

| Feature | V1 (AnthropicClient) | V2 (AnthropicV2Client) |
| --- | --- | --- |
| Response format | ChatCompletion (OpenAI-compatible) | UnifiedResponse (rich typed content) |
| Location | autogen/oai/anthropic.py | autogen/llm_clients/anthropic_v2.py |
| Lines of code | ~1,508 | ~1,110 |
| API type | api_type: "anthropic" | api_type: "anthropic_v2" |
| Content blocks | Combined into a string with [Thinking] tags | Typed blocks (ReasoningContent, TextContent, ToolCallContent, etc.) |

Shared Features (Both V1 and V2)

Both implementations support:

Native Structured Outputs (beta API for Claude Sonnet 4.5+; dispatch sketch below)

  • Uses client.beta.messages.parse() for Pydantic models without tools
  • Uses client.beta.messages.create() with output_format for dict schemas or when tools are present
  • Validates SDK version ≥ 0.74.1 via validate_structured_outputs_version()
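A minimal sketch of that dispatch, based purely on the bullets above; the keyword name passed to parse() is an assumption, not a confirmed signature:

```python
# Hedged sketch of the structured-output dispatch both clients share, per this
# review. beta.messages.parse/create and validate_structured_outputs_version are
# named above; the output_format kwarg on parse() is an assumption.
from pydantic import BaseModel

def structured_create(client, params: dict, response_format):
    validate_structured_outputs_version()  # raises if the anthropic SDK is < 0.74.1
    is_model = isinstance(response_format, type) and issubclass(response_format, BaseModel)
    if is_model and not params.get("tools"):
        # Pydantic model, no tools: the typed parse endpoint
        return client.beta.messages.parse(output_format=response_format, **params)
    # Dict schema, or tools present: plain create with an explicit output_format
    return client.beta.messages.create(output_format=response_format, **params)
```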

JSON Mode Fallback (for older models; extraction sketch below)

  • Prompt-based structured outputs with <json_response> tags
  • Extracts and validates JSON from response
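For illustration, the extraction step can be as simple as this (the tag name comes from this review; the code is a sketch, not the client's actual implementation):

```python
# Illustrative only: pull the JSON payload out of the <json_response> tags the
# fallback prompt asks the model to emit, then validate it by parsing.
import json
import re

def extract_json_response(text: str) -> dict:
    match = re.search(r"<json_response>(.*?)</json_response>", text, re.DOTALL)
    if match is None:
        raise ValueError("No <json_response> block found in model output")
    return json.loads(match.group(1))
```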

Extended Thinking (thinking blocks)

  • Both support the thinking parameter
  • V1 combines thinking into the message text with a [Thinking] prefix
  • V2 preserves thinking as separate ReasoningContent blocks

Tool/Function Calling (example tool definition below)

  • Full support for tool definitions and tool results
  • Strict mode support for guaranteed schema validation
  • Handles both legacy and beta tool APIs
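For reference, a tool definition in Anthropic's Messages API format looks like this (a generic example, not code from this PR; the conversion from OpenAI-style specs is assumed):

```python
# Standard Anthropic tool shape (generic example). Both clients accept
# OpenAI-style function specs and presumably target this format internally.
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
```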

Multiple Authentication Methods (config example below)

  • Standard API key (ANTHROPIC_API_KEY)
  • AWS Bedrock (via AnthropicBedrock client)
  • GCP Vertex AI (via AnthropicVertex client)
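In config terms, the three paths correspond roughly to entries like these; api_type and ANTHROPIC_API_KEY are from this review, while the aws_*/gcp_* key names are assumptions modeled on the V1 client:

```python
# Hedged example config entries for the three V1 auth paths; key names other
# than api_type/api_key/model are assumptions and should be checked against the docs.
import os

config_list = [
    {"api_type": "anthropic", "model": "claude-sonnet-4-5", "api_key": os.environ["ANTHROPIC_API_KEY"]},
    {"api_type": "anthropic", "model": "claude-sonnet-4-5", "aws_region": "us-east-1"},    # Bedrock
    {"api_type": "anthropic", "model": "claude-sonnet-4-5", "gcp_region": "us-central1"},  # Vertex AI
]
```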

Message Conversion (before/after example below)

  • Both use oai_messages_to_anthropic_messages() helper
  • Handles role alternation requirements
  • Supports vision (images) in messages
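A concrete before/after for that helper; the merge behavior shown is inferred from Anthropic's alternation requirement, not verified against the code:

```python
# Illustrative input/output for oai_messages_to_anthropic_messages(). Anthropic's
# Messages API takes the system prompt separately and requires user/assistant
# turns to alternate, so consecutive same-role messages must be combined.
oai_messages = [
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hi."},
    {"role": "user", "content": "What is 2 + 2?"},
]
# Expected shape after conversion (merge format assumed):
system_prompt = "Be terse."
anthropic_messages = [{"role": "user", "content": "Hi.\nWhat is 2 + 2?"}]
```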

Cost Calculation (sketch below)

  • Uses same pricing table (ANTHROPIC_PRICING_1k)
  • Same cost calculation logic via _calculate_cost()
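The arithmetic itself is simple; a sketch, assuming the table maps model names to per-1k-token (input, output) USD prices:

```python
# Sketch of the shared cost logic. ANTHROPIC_PRICING_1k and _calculate_cost are
# named in this review; the table layout and price values below are illustrative
# stand-ins, not the verified schema.
ANTHROPIC_PRICING_1k = {"claude-sonnet-4-5": (0.003, 0.015)}  # USD per 1k tokens (assumed)

def calculate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    input_price, output_price = ANTHROPIC_PRICING_1k[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1000.0
```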

V2-Specific Advantages

1. Typed Content Blocks

V2 returns a rich UnifiedResponse with properly typed content:

```python
# V2 Response Structure
UnifiedResponse(
    messages=[
        UnifiedMessage(
            role="assistant",
            content=[
                ReasoningContent(type="reasoning", reasoning="..."),
                TextContent(type="text", text="..."),
                ToolCallContent(type="tool_call", id="...", name="...", arguments="..."),
                GenericContent(type="parsed", parsed={...}),  # For structured outputs
            ],
        )
    ]
)
```

2. Access to Individual Content Types

```python
response = client.create(params)

# Get reasoning blocks
for reasoning in response.messages[0].get_reasoning():
    print(f"Thinking: {reasoning.reasoning}")

# Get text content
text = response.messages[0].get_text()

# Get tool calls
tool_calls = response.messages[0].get_tool_calls()
```

3. Backward Compatibility

V2 provides a create_v1_compatible() method that returns responses in ChatCompletion format:

```python
# Get V1-style response when needed
chat_completion = client.create_v1_compatible(params)
```

4. Forward Compatibility

V2 uses GenericContent for unknown content types, making it forward-compatible with future Anthropic API features.
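A consumer-side sketch of what that buys you (block type names are from this review; the helper functions are hypothetical):

```python
# Sketch: dispatching on typed content blocks, with GenericContent as the
# catch-all for types the client does not yet model. dispatch_tool and
# log_unhandled are hypothetical helpers, shown only for illustration.
for block in response.messages[0].content:
    if block.type == "text":
        print(block.text)
    elif block.type == "reasoning":
        print(f"[thinking] {block.reasoning}")
    elif block.type == "tool_call":
        dispatch_tool(block.name, block.arguments)
    else:
        log_unhandled(block)  # unknown/future types still round-trip as GenericContent
```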

5. Better Structured Output Handling

V2 handles structured outputs more cleanly:

  • Stores parsed Pydantic models in GenericContent(type="parsed")
  • Preserves both parsed object and JSON text representation
  • Better error handling when parsing fails

Code Quality Review

✅ Strengths

  1. Excellent Code Reuse: V2 reuses most helper functions from V1 (oai_messages_to_anthropic_messages, _calculate_cost, validation helpers, etc.)

  2. Comprehensive Testing: The PR includes:

    • test/llm_clients/test_anthropic_v2.py (586 lines) - unit tests with mocks
    • test/llm_clients/test_anthropic_v2_integration.py (494 lines) - integration tests
    • Example notebook demonstrating V2 usage
  3. Well-Documented: Clear docstrings explaining the differences and usage patterns

  4. Clean Separation: V2 doesn't modify V1, allowing gradual migration

⚠️ Issues & Suggestions

1. Missing AWS Bedrock & GCP Vertex Support (High Priority)

Location: autogen/llm_clients/anthropic_v2.py:152-189

The V2 client supports only standard Anthropic API-key authentication, whereas V1 also supports AWS Bedrock and GCP Vertex AI.

```python
# V1 has this logic (autogen/oai/anthropic.py:436-458)
if self._api_key is not None:
    self._client = Anthropic(**client_kwargs)
elif self._gcp_region is not None:
    self._client = AnthropicVertex(**kw)
else:
    self._client = AnthropicBedrock(**client_kwargs)

# V2 only does this (autogen/llm_clients/anthropic_v2.py:182-188)
client_kwargs = {"api_key": self._api_key}
if base_url:
    client_kwargs["base_url"] = base_url
if timeout:
    client_kwargs["timeout"] = timeout
self._client = Anthropic(**client_kwargs, **kwargs)
```

Recommendation: Add AWS Bedrock and GCP Vertex AI support to V2 client to achieve feature parity.
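A sketch of the parity fix, mirroring the V1 selection logic quoted above; it assumes V2 gains the same _gcp_region and credential attributes V1 already has:

```python
# Hedged sketch: mirror V1's client selection in V2's initialization.
if self._api_key is not None:
    self._client = Anthropic(**client_kwargs, **kwargs)
elif self._gcp_region is not None:
    self._client = AnthropicVertex(**client_kwargs, **kwargs)
else:
    self._client = AnthropicBedrock(**client_kwargs, **kwargs)
```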


2. Code Coverage Gap (Medium Priority)

Coverage: patch coverage is 59.95%, with 153 lines missing coverage across the patch; 144 of those (104 missing, 40 partial) are in anthropic_v2.py.

The main gaps are likely in:

  • Error handling paths (fallback to JSON Mode)
  • AWS/GCP authentication paths (currently missing)
  • Edge cases in structured output parsing

Recommendation: Add tests for:

  • Native structured output fallback scenarios
  • Error handling when SDK version is too old
  • Dict schema vs Pydantic model paths

3. Documentation Gaps (Medium Priority)

Location: website/docs/user-guide/models/anthropic.mdx

The documentation has been updated but doesn't clearly explain:

  • When to use V1 vs V2
  • Migration guide from V1 to V2
  • Benefits of UnifiedResponse over ChatCompletion

Recommendation: Add a section explaining:

```markdown
## Anthropic V2 Client (New)

AG2 now offers an Anthropic V2 client that returns richer response objects:

**When to use V2:**
- When you need access to reasoning blocks separately
- When building custom agents that process structured content
- For forward compatibility with future Anthropic features

**When to use V1:**
- When you need AWS Bedrock or GCP Vertex AI support (V2 support coming soon)
- When your existing code expects the ChatCompletion format
- For simple use cases where UnifiedResponse is overkill
```

4. Type Annotation Improvements (Low Priority)

Location: Multiple locations with # type: ignore comments

```python
# autogen/llm_clients/anthropic_v2.py:188
self._client = Anthropic(**client_kwargs, **kwargs)  # type: ignore[misc]

# autogen/llm_clients/anthropic_v2.py:266
response = self._client.beta.messages.create(**anthropic_params)  # type: ignore[misc]
```

While these are necessary due to dynamic typing, consider adding more specific type hints where possible.

5. Minor: Inconsistent Naming (Low Priority)

The class is named AnthropicV2Client but the config entry is AnthropicV2LLMConfigEntry. V1 uses AnthropicClient and AnthropicLLMConfigEntry. Consider consistent naming such as:

  • AnthropicClientV2 (matches pattern better)
  • Or keep as is and document the rationale

Summary of New Features in V2

✨ What's New

  1. Rich Response Objects: Returns UnifiedResponse with typed content blocks instead of plain text
  2. Separate Reasoning Access: Thinking blocks available as ReasoningContent instead of embedded in text
  3. Better Tool Call Representation: Tool calls as typed ToolCallContent objects
  4. Parsed Output Preservation: Structured outputs stored as GenericContent(type="parsed")
  5. Forward Compatibility: Generic content blocks for unknown future types
  6. Cleaner API: Helper methods like response.messages[0].get_reasoning()

🔄 Migration Path

V2 provides full backward compatibility via create_v1_compatible():

```python
# Old code using V1
response = client.create(params)
text = response.choices[0].message.content

# New code using V2 (backward compatible)
chat_completion = client.create_v1_compatible(params)
text = chat_completion.choices[0].message.content

# Or use new API
unified_response = client.create(params)
text = unified_response.messages[0].get_text()
```

📋 What's the Same

  • Structured outputs (both native and JSON Mode)
  • Extended thinking support
  • Tool/function calling
  • Cost calculation
  • All authentication methods (except that V2 currently lacks Bedrock/Vertex; see the fix above)
  • Message format conversion

Final Recommendation

Approve with minor fixes required:

  1. ⚠️ MUST FIX: Add AWS Bedrock and GCP Vertex AI support to V2
  2. 📝 SHOULD FIX: Improve documentation about V1 vs V2 choice
  3. OPTIONAL: Increase test coverage for edge cases

The V2 implementation is well-architected and provides valuable improvements for users who need richer response objects. The code quality is good with excellent reuse of V1 helpers. Once AWS Bedrock and GCP Vertex AI support is added, this will be ready to merge.


Test Coverage

From codecov report:

  • V2 Client: 61.18% coverage (104 missing, 40 partial)
  • Modified V1 files: 0% coverage for new lines (AWS/GCP credential handling)

Priority: Add integration tests for AWS Bedrock and GCP Vertex AI paths in both V1 and V2.


codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 59.94764% with 153 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| autogen/llm_clients/anthropic_v2.py | 61.18% | 104 Missing and 40 partials ⚠️ |
| autogen/oai/client.py | 0.00% | 4 Missing and 1 partial ⚠️ |
| autogen/oai/anthropic.py | 0.00% | 4 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| autogen/llm_clients/__init__.py | 100.00% <100.00%> (ø) |
| autogen/llm_config/types.py | 100.00% <100.00%> (ø) |
| autogen/oai/anthropic.py | 29.44% <0.00%> (+9.61%) ⬆️ |
| autogen/oai/client.py | 53.72% <0.00%> (-20.19%) ⬇️ |
| autogen/llm_clients/anthropic_v2.py | 61.18% <61.18%> (ø) |

... and 114 files with indirect coverage changes

