
OpenRouter API leads to pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation. Deeper dive shown: "Plain text responses are not permitted, please call one of the functions instead" #822

Closed as not planned
@HomenShum


Title: OpenRouter API Not Providing Structured Response - Exceeded Maximum Retries for Result Validation


Description:

When using the OpenRouter API with meta-llama/llama-3.3-70b-instruct, I encounter an issue where the structured response is not returned, resulting in an exception after exceeding the maximum number of retries. However, the same code works as expected when using the openai:gpt-4o-mini model.


Steps to Reproduce:

  1. Setup Code:

    import asyncio
    import json
    from pydantic import BaseModel
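    from pydantic_ai import Agent
    from pydantic_ai.models.openai import OpenAIModel
    # search_company_summary and OPENROUTER_API_KEY come from the reporter's own module
    from some_module import (
        search_company_summary,
        OPENROUTER_API_KEY,
    )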
    
    class CityLocation(BaseModel):
        city: str
        country: str
    
    async def main():
        company_name_to_search = "20n Bio"
        # Use test data
        search_results_with_verification = await search_company_summary(
            company_name=company_name_to_search, use_test_data=True
        )
        print("\n--- Final Results with Verification ---")
        print(json.dumps(search_results_with_verification, indent=2))
    
        # Using OpenRouter API model
        openrouter_model = OpenAIModel(
            'meta-llama/llama-3.3-70b-instruct',
            base_url='https://openrouter.ai/api/v1',
            api_key=OPENROUTER_API_KEY,
        )
        agent = Agent(openrouter_model, result_type=CityLocation, result_tool_name='city_location')
    
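        # Using GPT 4o MINI model. As written, this assignment replaces the
        # OpenRouter agent above; comment it out to reproduce the failure.
        agent = Agent('openai:gpt-4o-mini', result_type=CityLocation, result_tool_name='city_location')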
        result = await agent.run('Where the olympics held in 2012?')
        print(result.data)
        # Expected output:
        #   city='London' country='United Kingdom'
    
        print(result.cost())
    
    if __name__ == "__main__":
        asyncio.run(main())
  2. Run the script with OpenRouter API:

    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
    d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
    result = await agent.run('Where the olympics held in 2012?')
    INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
    INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
    Traceback (most recent call last):
      File "d:\VSCode Projects\parsely_Jan25\test.py", line 382, in <module>
        asyncio.run(main())
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 190, in run
        return runner.run(main)
                 ^^^^^^^^^^^^^^^
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 118, in run
        return self._loop.run_until_complete(task)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
        return future.result()
                 ^^^^^^^^^^^^^^
      File "d:\VSCode Projects\parsely_Jan25\test.py", line 375, in main
        result = await agent.run('Where the olympics held in 2012?')
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 311, in run
        final_result, tool_responses = await self._handle_model_response(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1108, in _handle_model_response
        return await self._handle_text_response(text, run_context, result_schema)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1126, in _handle_text_response
        self._incr_result_retry(run_context)
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1300, in _incr_result_retry
        raise exceptions.UnexpectedModelBehavior(
    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25>
    
  3. Run the script with GPT 4o MINI (Successful Case):

    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
    d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
    result = await agent.run('Where the olympics held in 2012?')
    INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
    city='London' country='United Kingdom'
    

Additional Context:

The issue seems to be related to the way the OpenRouter API handles structured responses. The error message indicates that the model behavior is unexpected, and the code exceeds the maximum retries for result validation:

    raise exceptions.UnexpectedModelBehavior(
    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation

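I suspect that the model served through OpenRouter replies in plain text instead of calling the city_location result tool, and pydantic_ai then rejects that reply: the captured messages show the retry prompt "Plain text responses are not permitted, please call one of the functions instead". Meanwhile, the GPT 4o MINI model handles the structured response correctly.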
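One way to check this suspicion is to look at the raw chat-completions payload and see whether the assistant message carries a tool call or only text. The sketch below assumes the OpenAI-style response shape (choices[0].message with tool_calls and content); classify_response and both sample payloads are hypothetical, hand-written illustrations, not output captured from OpenRouter.

```python
import json


def classify_response(payload: dict) -> str:
    """Return 'tool_call' if the first choice calls a function, else 'plain_text'."""
    message = payload["choices"][0]["message"]
    # OpenAI-style responses put function calls in `tool_calls`; the key may be
    # missing, None, or an empty list when the model answers with text only.
    if message.get("tool_calls"):
        return "tool_call"
    return "plain_text"


# Hand-written sample: the model calls the result tool (what the agent needs).
structured = {
    "choices": [{"message": {"content": None, "tool_calls": [{
        "type": "function",
        "function": {
            "name": "city_location",
            "arguments": json.dumps({"city": "London", "country": "United Kingdom"}),
        },
    }]}}]
}

# Hand-written sample: the model answers in prose (what triggers the retry).
plain = {"choices": [{"message": {"content": "The 2012 Olympics were held in London."}}]}

print(classify_response(structured))  # tool_call
print(classify_response(plain))       # plain_text
```

If the OpenRouter replies look like the second sample, the failure comes from the model not using the tool rather than from the validation of its arguments.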


Core Function Example (Related to Entity Description):

    import logging

    from pydantic_ai import capture_run_messages
    from pydantic_ai.exceptions import UnexpectedModelBehavior

    logger = logging.getLogger(__name__)

    # EntityAndDescription and entity_and_description_agent are defined elsewhere in the project.
    async def process_entity_description(content_text: str) -> EntityAndDescription:
        """
        Processes the input content text to extract entity information and break it into sections.

        Args:
            content_text: The input text content.

        Returns:
            An EntityAndDescription object containing entity name, overall description, and section breakdowns.
        """
        try:
            # Record every message exchanged with the model so it can be logged on failure.
            with capture_run_messages() as messages:
                entity_and_description_agent_result = await entity_and_description_agent.run(
                    user_prompt=(
                        "Please break down the following text into smaller, manageable sections, "
                        "identify the key points, and provide a clear, decontextualized description for each section:\n\n"
                        f"{content_text}"
                    )
                )

                return entity_and_description_agent_result.data

        except UnexpectedModelBehavior as e:
            logger.error(f"UnexpectedModelBehavior in process_entity_description: {e}")
            logger.error(f"Cause: {repr(e.__cause__)}")
            logger.error(f"Messages: {messages}")
            raise  # re-raise with the original traceback
        except Exception as e:
            logger.error(f"Error in process_entity_description: {e}")
            raise

Expected Behavior:

  • When using the OpenRouter API, the model should return a structured response without exceeding the retry limit.
  • The agent should successfully validate the result and output the expected structured data.

Actual Behavior:

  • The OpenRouter API returns a plain text response, leading to a validation failure after one retry, and the following error is raised:

    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
    
  • In contrast, using the GPT 4o MINI model produces the expected output.
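The "validation failure after one retry" can be illustrated with a simplified model of a retry budget. This is a sketch under assumptions, not pydantic_ai's actual code: run_agent, the scripted replies, and the local UnexpectedModelBehavior class are hypothetical stand-ins. With a budget of one retry, two plain-text replies in a row are enough to raise, which is consistent with the two POST requests in the log above.

```python
class UnexpectedModelBehavior(Exception):
    """Local stand-in for pydantic_ai.exceptions.UnexpectedModelBehavior."""


def run_agent(scripted_replies, max_retries=1):
    """Consume replies until one is a tool call or the retry budget is spent."""
    retries = 0
    requests_made = 0
    for reply in scripted_replies:
        requests_made += 1
        if reply["kind"] == "tool_call":
            return reply["args"], requests_made
        # Plain text is rejected: count a retry and ask the model again.
        retries += 1
        if retries > max_retries:
            raise UnexpectedModelBehavior(
                f"Exceeded maximum retries ({max_retries}) for result validation"
            )
    raise RuntimeError("scripted replies ran out before a result was produced")


text = {"kind": "plain_text"}
tool = {"kind": "tool_call", "args": {"city": "London", "country": "United Kingdom"}}

# Two plain-text replies exhaust a budget of one retry.
try:
    run_agent([text, text])
except UnexpectedModelBehavior as exc:
    error_message = str(exc)
    print(error_message)

# One plain-text reply followed by a tool call succeeds on the second request.
print(run_agent([text, tool]))
```

If the real agent behaves like this model, passing a larger retries value to Agent would give the model more attempts, although a model that never calls the tool would still fail.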


Environment:

  • Python Version: 3.11 (as per traceback)
  • Relevant Libraries/Packages: pydantic_ai, httpx
  • APIs: OpenRouter API (https://openrouter.ai/api/v1), OpenAI API (for GPT 4o MINI)

Additional Notes:

  • The LogfireNotConfiguredWarning is noted but does not appear to be related to the structured response issue.
  • The issue was identified by closely inspecting the captured run messages and traceback.

Request:

Could you please investigate why the OpenRouter API is not returning the structured response and causing a result validation failure? Any insights into how to configure the API or modify the code to handle this behavior would be appreciated.
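As one possible code-side workaround, a plain-text reply could be parsed directly when it happens to contain a JSON object. The sketch below is an illustration under stated assumptions: extract_city_location is a hypothetical helper, a stdlib dataclass stands in for the pydantic CityLocation model so the example runs without dependencies, and the sample replies are hand-written rather than taken from OpenRouter.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CityLocation:
    # Stand-in for the pydantic model in the reproduction script.
    city: str
    country: str


def extract_city_location(text: str) -> Optional[CityLocation]:
    """Return the first JSON object in `text` that has string city/country fields."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores
            # whatever prose follows it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if (
            isinstance(obj, dict)
            and isinstance(obj.get("city"), str)
            and isinstance(obj.get("country"), str)
        ):
            return CityLocation(city=obj["city"], country=obj["country"])
        # Not a usable object: try the next opening brace, if any.
        start = text.find("{", start + 1)
    return None


reply = 'Sure! Here is the answer: {"city": "London", "country": "United Kingdom"}'
print(extract_city_location(reply))
print(extract_city_location("The games were held in London."))
```

A fallback like this only helps when the model emits JSON inside its text; a purely prose answer still yields None and would need a retry or a different model.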

