
OpenRouter API leads to pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation. Deeper dive shown: "Plain text responses are not permitted, please call one of the functions instead" #822

Open
HomenShum opened this issue Jan 30, 2025 · 4 comments
Labels
more info More information required

Comments

@HomenShum

HomenShum commented Jan 30, 2025

Title: OpenRouter API Not Providing Structured Response - Exceeded Maximum Retries for Result Validation


Description:

When using the OpenRouter API with meta-llama/llama-3.3-70b-instruct, I encounter an issue where the structured response is not returned, resulting in an exception after exceeding the maximum number of retries. However, the same code works as expected when using the openai:gpt-4o-mini model.


Steps to Reproduce:

  1. Setup Code:

    import asyncio
    import json
    from pydantic import BaseModel
    from pydantic_ai import Agent
    from pydantic_ai.models.openai import OpenAIModel
    # search_company_summary and OPENROUTER_API_KEY are project-local helpers
    from some_module import search_company_summary, OPENROUTER_API_KEY
    
    class CityLocation(BaseModel):
        city: str
        country: str
    
    async def main():
        company_name_to_search = "20n Bio"
        # Use test data
        search_results_with_verification = await search_company_summary(
            company_name=company_name_to_search, use_test_data=True
        )
        print("\n--- Final Results with Verification ---")
        print(json.dumps(search_results_with_verification, indent=2))
    
        # Failing case: llama-3.3-70b-instruct served via OpenRouter
        openrouter_model = OpenAIModel(
            'meta-llama/llama-3.3-70b-instruct',
            base_url='https://openrouter.ai/api/v1',
            api_key=OPENROUTER_API_KEY,
        )
        agent = Agent(openrouter_model, result_type=CityLocation, result_tool_name='city_location')

        # Successful case: GPT-4o mini. Note that this reassignment replaces
        # the OpenRouter agent above; comment out one of the two `agent = ...`
        # lines to switch between the two cases.
        agent = Agent('openai:gpt-4o-mini', result_type=CityLocation, result_tool_name='city_location')
        result = await agent.run('Where the olympics held in 2012?')
        print(result.data)
        # Expected output:
        #   city='London' country='United Kingdom'
    
        print(result.cost())
    
    if __name__ == "__main__":
        asyncio.run(main())
  2. Run the script with OpenRouter API:

    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
    d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
    result = await agent.run('Where the olympics held in 2012?')
    INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
    INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
    Traceback (most recent call last):
      File "d:\VSCode Projects\parsely_Jan25\test.py", line 382, in 
        asyncio.run(main())
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 190, in run
        return runner.run(main)
                 ^^^^^^^^^^^^^^^
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 118, in run
        return self._loop.run_until_complete(task)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
        return future.result()
                 ^^^^^^^^^^^^^^
      File "d:\VSCode Projects\parsely_Jan25\test.py", line 375, in main
        result = await agent.run('Where the olympics held in 2012?')
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 311, in run
        final_result, tool_responses = await self._handle_model_response(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1108, in _handle_model_response
        return await self._handle_text_response(text, run_context, result_schema)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1126, in _handle_text_response
        self._incr_result_retry(run_context)
      File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1300, in _incr_result_retry
        raise exceptions.UnexpectedModelBehavior(
    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25>
    
  3. Run the script with GPT-4o mini (successful case):

    (parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
    d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
    result = await agent.run('Where the olympics held in 2012?')
    INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
    city='London' country='United Kingdom'
    

Additional Context:

The issue appears to be related to how the model served via OpenRouter handles tool calls. The error message indicates the model's behaviour is unexpected, and the run exceeds the maximum retries for result validation:

    raise exceptions.UnexpectedModelBehavior(
    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation

I suspect the llama-3.3-70b-instruct model replies with plain text instead of calling the result tool; pydantic_ai then sends the "Plain text responses are not permitted, please call one of the functions instead" retry prompt, and when the retry also fails it raises the exception above. GPT-4o mini, by contrast, makes the tool call correctly.
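The retry flow implied by the traceback can be sketched in plain Python. This is a hypothetical illustration of the framework's behaviour, not pydantic_ai's actual implementation:

```python
# Hypothetical sketch of the validation-retry loop implied by the traceback:
# a plain-text reply (no tool call) consumes a retry; once retries are
# exhausted, UnexpectedModelBehavior is raised.
class UnexpectedModelBehavior(Exception):
    pass

def run_with_retries(call_model, max_retries=1):
    retries = 0
    while True:
        response = call_model()  # one chat-completion round trip
        if response.get("tool_call") is not None:
            return response["tool_call"]  # structured result: success
        retries += 1  # plain text instead of a tool call counts as a retry
        if retries > max_retries:
            raise UnexpectedModelBehavior(
                f"Exceeded maximum retries ({max_retries}) for result validation"
            )

# A model that always answers in plain text (as llama-3.3-70b appears to
# here) fails the first attempt and the single default retry, then raises.
```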


Core Function Example (Related to Entity Description):

# EntityAndDescription, logger, and entity_and_description_agent are defined elsewhere
from pydantic_ai import capture_run_messages
from pydantic_ai.exceptions import UnexpectedModelBehavior

async def process_entity_description(content_text: str) -> EntityAndDescription:
    """
    Processes the input content text to extract entity information and break it into sections.

    Args:
        content_text: The input text content.

    Returns:
        An EntityAndDescription object containing entity name, overall description, and section breakdowns.
    """
    try:
        with capture_run_messages() as messages:
            entity_and_description_agent_result = await entity_and_description_agent.run(
                user_prompt=(
                    "Please break down the following text into smaller, manageable sections, "
                    "identify the key points, and provide a clear, decontextualized description for each section:\n\n"
                    f"{content_text}"
                )
            )
            
            return entity_and_description_agent_result.data

    except UnexpectedModelBehavior as e:
        logger.error(f"UnexpectedModelBehavior in process_entity_description: {e}")
        logger.error(f"Cause: {repr(e.__cause__)}")
        logger.error(f"Messages: {messages}")
        raise  # bare raise preserves the original traceback
    except Exception as e:
        logger.error(f"Error in process_entity_description: {e}")
        raise

Expected Behavior:

  • When using the OpenRouter API, the model should return a structured response without exceeding the retry limit.
  • The agent should successfully validate the result and output the expected structured data.

Actual Behavior:

  • The OpenRouter API returns a plain text response, leading to a validation failure after one retry, and the following error is raised:

    pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
    
  • In contrast, using the GPT-4o mini model produces the expected output.


Environment:

  • Python Version: 3.11 (as per traceback)
  • Relevant Libraries/Packages: pydantic_ai, httpx
  • APIs: OpenRouter API (https://openrouter.ai/api/v1), OpenAI API (for GPT-4o mini)

Additional Notes:

  • The warning about LogfireNotConfiguredWarning is noted but does not appear to be directly related to the structured response issue.
  • The issue was identified by closely inspecting the captured run messages and traceback.

Request:

Could you please investigate why the OpenRouter API is not returning the structured response and causing a result validation failure? Any insights into how to configure the API or modify the code to handle this behavior would be appreciated.
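One possible interim workaround, as a hedged sketch: prompt the model to answer in raw JSON and validate the plain-text reply yourself, bypassing tool calling entirely. `parse_city_location` below is a hypothetical helper, not part of pydantic_ai:

```python
import json

# Hypothetical fallback: request raw JSON in the prompt and validate the
# plain-text reply manually instead of relying on a tool/function call.
def parse_city_location(raw_text: str) -> dict:
    data = json.loads(raw_text)
    for field in ("city", "country"):
        if field not in data or not isinstance(data[field], str):
            raise ValueError(f"missing or invalid field: {field}")
    return data
```

With the agent's default string result type, the raw reply could be fed through a validator like this, at the cost of losing pydantic_ai's automatic retry prompts.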


@samuelcolvin
Member

Please format the code and terminal output as markdown properly.

This is unreadable.

@samuelcolvin samuelcolvin added the more info More information required label Jan 31, 2025
@HomenShum
Author

> Please format the code and terminal output as markdown properly.
>
> This is unreadable.

Thank you for the feedback, updated with edit.

@sydney-runkle
Member

Hmm, this just seems like a model limitation to me...

@jonchun

jonchun commented Feb 7, 2025

Agreed with @sydney-runkle - Llama 70b can struggle to properly call tools sometimes. Can you try posting some DEBUG level output so we can see the actual HTTP requests/responses from httpx? Make sure you strip out any sensitive data -- can't remember if it's included off the top of my head.
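For reference, one way to surface those httpx request/response logs with the standard library (remember to redact the Authorization header before posting):

```python
import logging

# Route all loggers through a basic handler at DEBUG level, and make sure
# the httpx logger itself is set to DEBUG so request/response lines appear.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
```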
