Title: OpenRouter API Not Providing Structured Response - Exceeded Maximum Retries for Result Validation
Description:
When using the OpenRouter API with `meta-llama/llama-3.3-70b-instruct`, I encounter an issue where the structured response is not returned, resulting in an exception after the maximum number of retries is exceeded. The same code works as expected with the `openai:gpt-4o-mini` model.
Steps to Reproduce:

- Setup Code:

```python
import asyncio
import json

from pydantic import BaseModel
from some_module import (
    search_company_summary,
    OpenAIModel,
    Agent,
    OPENROUTER_API_KEY,
)


class CityLocation(BaseModel):
    city: str
    country: str


async def main():
    company_name_to_search = "20n Bio"

    # Use test data
    search_results_with_verification = await search_company_summary(
        company_name=company_name_to_search, use_test_data=True
    )
    print("\n--- Final Results with Verification ---")
    print(json.dumps(search_results_with_verification, indent=2))

    # Using OpenRouter API model
    openrouter_model = OpenAIModel(
        'meta-llama/llama-3.3-70b-instruct',
        base_url='https://openrouter.ai/api/v1',
        api_key=OPENROUTER_API_KEY,
    )
    agent = Agent(openrouter_model, result_type=CityLocation, result_tool_name='city_location')

    # Using GPT 4o MINI model
    # (note: this assignment overwrites the OpenRouter agent above;
    # comment out whichever one you are not testing)
    agent = Agent('openai:gpt-4o-mini', result_type=CityLocation, result_tool_name='city_location')

    result = await agent.run('Where the olympics held in 2012?')
    print(result.data)
    # Expected output:
    # city='London' country='United Kingdom'
    print(result.cost())


if __name__ == "__main__":
    asyncio.run(main())
```
- Run the script with OpenRouter API:

```
(parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
  result = await agent.run('Where the olympics held in 2012?')
INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
Traceback (most recent call last):
  File "d:\VSCode Projects\parsely_Jan25\test.py", line 382, in <module>
    asyncio.run(main())
  File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\hshum\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "d:\VSCode Projects\parsely_Jan25\test.py", line 375, in main
    result = await agent.run('Where the olympics held in 2012?')
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 311, in run
    final_result, tool_responses = await self._handle_model_response(
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1108, in _handle_model_response
    return await self._handle_text_response(text, run_context, result_schema)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1126, in _handle_text_response
    self._incr_result_retry(run_context)
  File "D:\VSCode Projects\parsely_Jan25\parsely_v012325\Lib\site-packages\pydantic_ai\agent.py", line 1300, in _incr_result_retry
    raise exceptions.UnexpectedModelBehavior(
pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
(parsely_v012325) PS D:\VSCode Projects\parsely_Jan25>
```
- Run the script with GPT 4o MINI (Successful Case):

```
(parsely_v012325) PS D:\VSCode Projects\parsely_Jan25> & "d:/VSCode Projects/parsely_Jan25/parsely_v012325/Scripts/python.exe" "d:/VSCode Projects/parsely_Jan25/test.py"
d:\VSCode Projects\parsely_Jan25\test.py:375: LogfireNotConfiguredWarning: No logs or spans will be created until logfire.configure() has been called. Set the environment variable LOGFIRE_IGNORE_NO_CONFIG=1 or add ignore_no_config=true in pyproject.toml to suppress this warning.
  result = await agent.run('Where the olympics held in 2012?')
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
city='London' country='United Kingdom'
```
Additional Context:
The issue appears to be related to how the model served via OpenRouter handles structured responses. The agent exceeds the maximum number of retries for result validation:

```
raise exceptions.UnexpectedModelBehavior(
pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
```

I suspect the OpenRouter-served model returns a plain text answer instead of the structured function-call (tool) response that the agent expects, so result validation fails on every attempt. GPT 4o MINI, in contrast, makes the tool call and the structured response is handled correctly.
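To illustrate the suspected failure mode, here is a minimal sketch (with hypothetical reply payloads, not actual API responses) of why tool-call arguments validate against the `CityLocation` schema while a plain prose reply does not:

```python
from pydantic import BaseModel, ValidationError


class CityLocation(BaseModel):
    city: str
    country: str


# What GPT 4o MINI appears to return: JSON arguments of a tool call.
structured_arguments = '{"city": "London", "country": "United Kingdom"}'
print(CityLocation.model_validate_json(structured_arguments))
# city='London' country='United Kingdom'

# What the OpenRouter-served model appears to return: plain prose,
# which is not valid JSON and so fails schema validation outright.
plain_text_reply = "The 2012 Olympics were held in London, United Kingdom."
try:
    CityLocation.model_validate_json(plain_text_reply)
except ValidationError as exc:
    print(f"validation failed: {exc.error_count()} error(s)")
```

If the model never switches to a tool call on retry, every attempt fails the same way, which would explain the retry limit being exhausted.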
Core Function Example (Related to Entity Description):

```python
async def process_entity_description(content_text: str) -> EntityAndDescription:
    """
    Processes the input content text to extract entity information and break it into sections.

    Args:
        content_text: The input text content.

    Returns:
        An EntityAndDescription object containing entity name, overall description,
        and section breakdowns.
    """
    try:
        with capture_run_messages() as messages:
            entity_and_description_agent_result = await entity_and_description_agent.run(
                user_prompt=(
                    "Please break down the following text into smaller, manageable sections, "
                    "identify the key points, and provide a clear, decontextualized description "
                    "for each section:\n\n"
                    f"{content_text}"
                )
            )
            return entity_and_description_agent_result.data
    except UnexpectedModelBehavior as e:
        logger.error(f"UnexpectedModelBehavior in process_entity_description: {e}")
        logger.error(f"Cause: {e.__cause__!r}")
        logger.error(f"Messages: {messages}")
        raise
    except Exception as e:
        logger.error(f"Error in process_entity_description: {e}")
        raise
```
Expected Behavior:
- When using the OpenRouter API, the model should return a structured response without exceeding the retry limit.
- The agent should successfully validate the result and output the expected structured data.

Actual Behavior:
- The OpenRouter API returns a plain text response, leading to a validation failure after one retry, and the following error is raised:

  ```
  pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation
  ```

- In contrast, using the GPT 4o MINI model produces the expected output.
Environment:
- Python Version: 3.11 (as per traceback)
- Relevant Libraries/Packages: pydantic_ai, httpx
- APIs: OpenRouter API (https://openrouter.ai/api/v1), OpenAI API (for GPT 4o MINI)
Additional Notes:
- The `LogfireNotConfiguredWarning` is noted but does not appear to be directly related to the structured response issue.
- The issue was identified by closely inspecting the captured run messages and the traceback.
Request:
Could you please investigate why the OpenRouter API is not returning a structured response, causing the result validation failure? Any insights into how to configure the API or modify the code to handle this behavior would be appreciated.