
OpenRouter API leads to pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation. Deeper dive shown: "Plain text responses are not permitted, please call one of the functions instead" #822

Open
HomenShum opened this issue Jan 30, 2025 · 4 comments
Labels
more info More information required

Comments

@HomenShum

HomenShum commented Jan 30, 2025

Title: OpenRouter API Not Providing Structured Response - Exceeded Maximum Retries for Result Validation


Description:

When using the OpenRouter API with meta-llama/llama-3.3-70b-instruct, I encounter an issue where the structured response is not returned, resulting in an exception after exceeding the maximum number of retries. However, the same code works as expected when using the openai:gpt-4o-mini model.


Steps to Reproduce:

  1. Setup Code:

    import asyncio
    import json
    from pydantic import BaseModel
    from pydantic_ai import Agent
    from pydantic_ai.models.openai import OpenAIModel
    # search_company_summary and OPENROUTER_API_KEY are project-local helpers
    from some_module import search_company_summary, OPENROUTER_API_KEY
    
    class CityLocation(BaseModel):
        city: str
        country: str
    
    async def main():
        company_name_to_search = "20n Bio"
        # Use test data
        search_results_with_verification = await search_company_summary(
            company_name=company_name_to_search, use_test_data=True
        )
        print("\n--- Final Results with Verification ---")
        print(json.dumps(search_results_with_verification, indent=2))
    
        # Failing case: llama-3.3-70b-instruct served via OpenRouter
        openrouter_model = OpenAIModel(
            'meta-llama/llama-3.3-70b-instruct',
            base_url='https://openrouter.ai/api/v1',
            api_key=OPENROUTER_API_KEY,
        )
        agent = Agent(openrouter_model, result_type=CityLocation, result_tool_name='city_location')

        # Successful case: GPT-4o mini. Note that this reassignment replaces
        # the OpenRouter agent above; comment out one of the two `agent = ...`
        # lines to switch between the two cases.
        agent = Agent('openai:gpt-4o-mini', result_type=CityLocation, result_tool_name='city_location')
        result = await agent.run('Where the olympics held in 2012?')
        print(result.data)
        # Expected output:
        #   city='London' country='United Kingdom'
    
        print(result.cost())
    
    if __name__ == "__main__":
        asyncio.run(main())
  2. Run the script with OpenRouter API:

    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
    d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
    result = await agent.run('Where the olympics held in 2012?')
    INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
    INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
    Traceback (most recent call last):
      File "d:\VSCode Projects\parsely_Jan25\test.py", line 382, in 
        asyncio.run(main())
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 190, in run
        return runner.run(main)
                 ^^^^^^^^^^^^^^^
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 118, in run
        return self._loop.run_until_complete(task)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
        return future.result()
                 ^^^^^^^^^^^^^^
      File "d:\VSCode Projects\parsely_Jan25\test.py", line 375, in main
        result = await agent.run('Where the olympics held in 2012?')
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 311, in run
        final_result, tool_responses = await self._handle_model_response(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1108, in _handle_model_response
        return await self._handle_text_response(text, run_context, result_schema)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1126, in _handle_text_response
        self._incr_result_retry(run_context)
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1300, in _incr_result_retry
        raise exceptions.UnexpectedModelBehavior(
    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25>
    
  3. Run the script with GPT-4o mini (successful case):

    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
    d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
    result = await agent.run('Where the olympics held in 2012?')
    INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
    city='London' country='United Kingdom'
    

Additional Context:

The issue appears to be related to how the model served via OpenRouter handles tool calls. The error message indicates the model's behaviour is unexpected, and the run exceeds the maximum retries for result validation:

    raise exceptions.UnexpectedModelBehavior(
    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation

I suspect the llama-3.3-70b-instruct model replies with plain text instead of calling the result tool; pydantic_ai then sends the "Plain text responses are not permitted, please call one of the functions instead" retry prompt, and when the retry also fails it raises the exception above. GPT-4o mini, by contrast, makes the tool call correctly.
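The retry flow implied by the traceback can be sketched in plain Python. This is a hypothetical illustration of the framework's behaviour, not pydantic_ai's actual implementation:

```python
# Hypothetical sketch of the validation-retry loop implied by the traceback:
# a plain-text reply (no tool call) consumes a retry; once retries are
# exhausted, UnexpectedModelBehavior is raised.
class UnexpectedModelBehavior(Exception):
    pass

def run_with_retries(call_model, max_retries=1):
    retries = 0
    while True:
        response = call_model()  # one chat-completion round trip
        if response.get("tool_call") is not None:
            return response["tool_call"]  # structured result: success
        retries += 1  # plain text instead of a tool call counts as a retry
        if retries > max_retries:
            raise UnexpectedModelBehavior(
                f"Exceeded maximum retries ({max_retries}) for result validation"
            )

# A model that always answers in plain text (as llama-3.3-70b appears to
# here) fails the first attempt and the single default retry, then raises.
```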


Core Function Example (Related to Entity Description):

# EntityAndDescription, logger, and entity_and_description_agent are defined elsewhere
from pydantic_ai import capture_run_messages
from pydantic_ai.exceptions import UnexpectedModelBehavior

async def process_entity_description(content_text: str) -> EntityAndDescription:
    """
    Processes the input content text to extract entity information and break it into sections.

    Args:
        content_text: The input text content.

    Returns:
        An EntityAndDescription object containing entity name, overall description, and section breakdowns.
    """
    try:
        with capture_run_messages() as messages:
            entity_and_description_agent_result = await entity_and_description_agent.run(
                user_prompt=(
                    "Please break down the following text into smaller, manageable sections, "
                    "identify the key points, and provide a clear, decontextualized description for each section:\n\n"
                    f"{content_text}"
                )
            )
            
            return entity_and_description_agent_result.data

    except UnexpectedModelBehavior as e:
        logger.error(f"UnexpectedModelBehavior in process_entity_description: {e}")
        logger.error(f"Cause: {repr(e.__cause__)}")
        logger.error(f"Messages: {messages}")
        raise  # bare raise preserves the original traceback
    except Exception as e:
        logger.error(f"Error in process_entity_description: {e}")
        raise

Expected Behavior:

  • When using the OpenRouter API, the model should return a structured response without exceeding the retry limit.
  • The agent should successfully validate the result and output the expected structured data.

Actual Behavior:

  • The OpenRouter API returns a plain text response, leading to a validation failure after one retry, and the following error is raised:

    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
    
  • In contrast, using the GPT-4o mini model produces the expected output.


Environment:

  • Python Version: 3.11 (as per traceback)
  • Relevant Libraries/Packages: pydantic_ai, httpx
  • APIs: OpenRouter API (https://openrouter.ai/api/v1), OpenAI API (for GPT-4o mini)

Additional Notes:

  • The warning about LogfireNotConfiguredWarning is noted but does not appear to be directly related to the structured response issue.
  • The issue was identified by closely inspecting the captured run messages and traceback.

Request:

Could you please investigate why the OpenRouter API is not returning the structured response and causing a result validation failure? Any insights into how to configure the API or modify the code to handle this behavior would be appreciated.
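One possible interim workaround, as a hedged sketch: prompt the model to answer in raw JSON and validate the plain-text reply yourself, bypassing tool calling entirely. `parse_city_location` below is a hypothetical helper, not part of pydantic_ai:

```python
import json

# Hypothetical fallback: request raw JSON in the prompt and validate the
# plain-text reply manually instead of relying on a tool/function call.
def parse_city_location(raw_text: str) -> dict:
    data = json.loads(raw_text)
    for field in ("city", "country"):
        if field not in data or not isinstance(data[field], str):
            raise ValueError(f"missing or invalid field: {field}")
    return data
```

With the agent's default string result type, the raw reply could be fed through a validator like this, at the cost of losing pydantic_ai's automatic retry prompts.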


@samuelcolvin
Member

Please format the code and terminal output as markdown properly.

This is unreadable.

@samuelcolvin samuelcolvin added the more info More information required label Jan 31, 2025
@HomenShum
Author

> Please format the code and terminal output as markdown properly.
>
> This is unreadable.

Thank you for the feedback, updated with edit.

@sydney-runkle
Member

Hmm, this just seems like a model limitation to me...

@jonchun

jonchun commented Feb 7, 2025

Agreed with @sydney-runkle - Llama 70b can struggle to properly call tools sometimes. Can you try posting some DEBUG level output so we can see the actual HTTP requests/responses from httpx? Make sure you strip out any sensitive data -- can't remember if it's included off the top of my head.
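For reference, one way to surface those httpx request/response logs with the standard library (remember to redact the Authorization header before posting):

```python
import logging

# Route all loggers through a basic handler at DEBUG level, and make sure
# the httpx logger itself is set to DEBUG so request/response lines appear.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
```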
