Pydantic conversion logic for structured outputs is broken for models containing dictionaries #2004

dbczumar · 2025-01-10T01:38:33Z

Confirm this is an issue with the Python library and not an underlying OpenAI API

This is an issue with the Python library

Describe the bug

There's a bug in the OpenAI Python client's logic for translating Pydantic models that contain dictionaries into structured outputs JSON schema definitions: dictionaries are always required to be empty in the resulting JSON schema. This makes dictionary outputs significantly less useful, since the LLM is never allowed to populate them.

I've filed a small PR to fix this and introduce test coverage: #2003

To Reproduce

import json
from typing import Any, Dict

import pydantic

from openai.lib._pydantic import to_strict_json_schema

class GenerateToolCallArguments(pydantic.BaseModel):
    arguments: Dict[str, Any] = pydantic.Field(description="The arguments to pass to the tool")

print(json.dumps(to_strict_json_schema(GenerateToolCallArguments), indent=4))

Observe that the output inserts "additionalProperties": false into the schema for the arguments field. Because that field declares no properties of its own, the dictionary must always be empty:

{
    "properties": {
        "arguments": {
            "description": "The arguments to pass to the tool",
            "title": "Arguments",
            "type": "object",
            # THE INSERTION OF THIS LINE IS A BUG
            "additionalProperties": false
        }
    },
    "required": [
        "arguments"
    ],
    "title": "GenerateToolCallArguments",
    "type": "object",
    "additionalProperties": false
}
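To make the effect concrete, here is a minimal sketch (not part of the library) that walks a schema like the one above and lists every object node that can only ever be `{}`: a node that declares no `properties` and sets `additionalProperties` to false. The helper name `find_forced_empty_objects` is hypothetical, the schema literal is copied from the output above, and the walk only follows `properties` and `items`, so it ignores `$defs`, `anyOf`, and similar keywords.

```python
from typing import Any, Dict, List


def find_forced_empty_objects(schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Return paths of object nodes that cannot hold any keys (hypothetical helper)."""
    found: List[str] = []
    if schema.get("type") == "object":
        has_properties = bool(schema.get("properties"))
        # No declared properties plus additionalProperties=false means only {} validates.
        if not has_properties and schema.get("additionalProperties") is False:
            found.append(path)
    # Recurse into declared properties.
    for name, sub_schema in schema.get("properties", {}).items():
        if isinstance(sub_schema, dict):
            found.extend(find_forced_empty_objects(sub_schema, f"{path}.{name}"))
    # Recurse into array item schemas.
    items = schema.get("items")
    if isinstance(items, dict):
        found.extend(find_forced_empty_objects(items, f"{path}[]"))
    return found


# Schema copied from the output shown in this issue.
schema = {
    "properties": {
        "arguments": {
            "description": "The arguments to pass to the tool",
            "title": "Arguments",
            "type": "object",
            "additionalProperties": False,
        }
    },
    "required": ["arguments"],
    "title": "GenerateToolCallArguments",
    "type": "object",
    "additionalProperties": False,
}

forced_empty = find_forced_empty_objects(schema)
print(forced_empty)
```

The root object is not flagged because it declares `arguments` as a property; only the nested dictionary field is.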

Code snippets

No response

OS

macOS

Python version

Python v3.10.12

Library version

1.59.6


dbczumar · 2025-01-10T01:48:10Z

Tagging @RobertCraigie for visibility, just in case (saw that you've been active on recent issues) :)

BrunoScaglione · 2025-01-11T23:20:44Z

I'm having the same issue and can confirm that models with dictionaries are the root problem. But I checked the documentation again, and it does say that only additionalProperties=false is allowed.
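If strict mode does require `additionalProperties: false` on every object, one possible workaround is to avoid open-ended dictionaries altogether: ask for a list of key/value entries and rebuild the dict on the client. The sketch below is an illustration under assumptions, not the library's behavior. The schema literal, the field names `key` and `value_json`, and the helper `entries_to_dict` are all hypothetical; values are carried as JSON-encoded strings because `Any` has no fixed shape to describe.

```python
import json
from typing import Any, Dict, List

# Hypothetical strict-friendly schema: every object lists its properties,
# marks them all required, and sets additionalProperties to False.
ARGUMENTS_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "arguments": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "key": {"type": "string"},
                    "value_json": {"type": "string"},
                },
                "required": ["key", "value_json"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["arguments"],
    "additionalProperties": False,
}


def entries_to_dict(entries: List[Dict[str, str]]) -> Dict[str, Any]:
    """Rebuild a plain dict from key/value entries (hypothetical helper)."""
    return {entry["key"]: json.loads(entry["value_json"]) for entry in entries}


# Example payload shaped like the "arguments" array above.
sample_entries = [
    {"key": "city", "value_json": '"Paris"'},
    {"key": "days", "value_json": "3"},
]
rebuilt = entries_to_dict(sample_entries)
print(rebuilt)
```

The trade-off is that the model must emit valid JSON inside each `value_json` string, which the schema itself cannot enforce, so `json.loads` may raise on malformed values.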

dbczumar · 2025-01-15T22:19:21Z

@RobertCraigie Any updates or additional thoughts here?

dvschuyl · 2025-01-24T12:08:43Z

I have also encountered the same issue. After some tinkering, I found more types that result in errors. The only built-in collection type that doesn't seem to be affected is list.

My code (Python 3.13.1):

import json
from pydantic import BaseModel
from openai.lib._pydantic import to_strict_json_schema
from openai import OpenAI


class Schema(BaseModel):
    # Python built-in collections
    # `range` and `bytearray` are not supported types, so I didn't include them

    # tuple_field: tuple[int, int, int]
    list_field: list[int]
    # dict_field: dict[int, int]
    # set_field: set[int]
    # frozenset_field: frozenset[int]
    # bytes_field: bytes


print(json.dumps(to_strict_json_schema(Schema), indent=4))

api_key = ...
with OpenAI(api_key=api_key) as client:
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Fill the schema with random values"}],
        response_format=Schema,
    )

print(json.dumps(response.choices[0].message.model_dump()["content"], indent=4))

dbczumar · 2025-02-19T08:43:13Z

@RobertCraigie Any updates or additional thoughts here?

dbczumar added the bug Something isn't working label Jan 10, 2025

dbczumar linked a pull request Jan 10, 2025 that will close this issue

BUG FIX: Pydantic conversion logic for structured outputs is broken for models containing dictionaries #2003

Open

1 task
