how to use custom json schema for agent's output? #528

Open
x0day opened this issue Apr 16, 2025 · 7 comments
@x0day

x0day commented Apr 16, 2025

Currently, the output_type parameter of Agent only accepts a Pydantic model, which is used to generate the JSON schema.

But Pydantic has some limitations when generating a JSON schema for an LLM, such as multi-layer $defs references and the way Optional fields are converted. This hurts the performance of some third-party LLMs.

x0day added the "question" label (Question about using the SDK) on Apr 16, 2025
@rm-openai
Collaborator

You don't need to pass a Pydantic object. Most Python types will work, for example a dataclass:

import asyncio
from dataclasses import dataclass

from agents import Agent, Runner


@dataclass
class Output:
    joke: str


async def main():
    agent = Agent(
        name="Assistant",
        instructions="You only respond in haikus.",
        output_type=Output,
    )

    result = await Runner.run(agent, "Tell me a joke about programming.")
    print(result.final_output)


asyncio.run(main())

Would that work for you?

@domino14

Hi @rm-openai. I'm not the original poster, but I just can't get this to work with any model that is even slightly complex. I converted all my Pydantic models to dataclasses, and I have this model:

from dataclasses import dataclass, field
from typing import Any


@dataclass
class OnError:
    error_type: str
    next: str = ""


@dataclass
class ConditionalBranch:
    condition: str
    next: str


@dataclass
class BaseNode:
    id: str
    type: str
    next: str = ""
    description: str = ""
    on_error: list[OnError] = field(default_factory=list)
    service: str = ""
    action: str = ""
    # inputs: dict[str, str] = field(default_factory=dict)
    # parameters: dict[str, str] = field(default_factory=dict)
    agent: str = ""
    branches: list[ConditionalBranch] = field(default_factory=list)
    parallel_branches: list[list[Any]] = field(default_factory=list)

Notice that I commented out "inputs" and "parameters" because I was getting this error:

agents.exceptions.UserError: additionalProperties should not be set for object types. This could be because you're using an older version of Pydantic, or because you configured additional properties to be allowed. If you really need this, update the function or output tool to not use a strict schema.

After commenting those out, I get this error:

openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'final_output': In context=('properties', 'description'), 'default' is not permitted.", 'type': 'invalid_request_error', 'param': 'text.format.schema', 'code': 'invalid_json_schema'}}

It doesn't like the default empty string value for my description field of BaseNode? What am I doing wrong?

@rm-openai
Collaborator

@domino14 ah yes, sorry about this. TL;DR: you're running into a limitation of Structured Outputs, which only accepts a subset of JSON Schema. You can read more about the requirements here: https://platform.openai.com/docs/guides/structured-outputs#supported-schemas

I know that not every schema fits these requirements, so I'll look into a fix that will make this possible.

@domino14

@rm-openai thanks for getting back to me.

How do my default values of "" break the schema?

If I am understanding correctly, I can't use arbitrary dictionaries like dict[str, str] because Structured Outputs requires all keys to be specified. Is this true?

@rm-openai
Collaborator

@domino14 yeah, that's right: we can't guarantee valid JSON for that case. This PR (#539) should fix things for you; you'll just need to pass output_schema_strict=False.
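To see which parts of a schema trip strict mode before sending it, a small standard-library check can help. This is only a sketch, not part of the SDK: `find_strict_violations` is a hypothetical helper, and it covers just the three constraints that come up in this thread (no `default`, `additionalProperties` must be false when present, every property listed in `required`). The sample schema is hand-written to mirror the `description` and `inputs` fields above.

```python
from typing import Any


def find_strict_violations(schema: dict[str, Any], path: str = "#") -> list[str]:
    """Collect the strict-mode problems discussed in this thread (hypothetical helper)."""
    problems: list[str] = []

    # Default values are rejected, as in the 400 error quoted above.
    if "default" in schema:
        problems.append(f"{path}: 'default' is not permitted")

    # dict[str, str] produces additionalProperties: {"type": "string"}.
    extra = schema.get("additionalProperties")
    if extra is not None and extra is not False:
        problems.append(f"{path}: 'additionalProperties' must be false")

    # Every declared property has to appear in "required".
    props = schema.get("properties", {})
    missing = sorted(set(props) - set(schema.get("required", [])))
    if missing:
        problems.append(f"{path}: properties missing from 'required': {missing}")

    # Recurse into the places where child schemas can live.
    for name, sub in props.items():
        problems += find_strict_violations(sub, f"{path}/properties/{name}")
    for name, sub in schema.get("$defs", {}).items():
        problems += find_strict_violations(sub, f"{path}/$defs/{name}")
    if isinstance(schema.get("items"), dict):
        problems += find_strict_violations(schema["items"], f"{path}/items")
    for index, sub in enumerate(schema.get("anyOf", [])):
        problems += find_strict_violations(sub, f"{path}/anyOf/{index}")
    return problems


# Hand-written schema mirroring two fields of BaseNode from this thread.
sample_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "description": {"type": "string", "default": ""},
        "inputs": {"type": "object", "additionalProperties": {"type": "string"}},
    },
    "required": ["id"],
}

problems = find_strict_violations(sample_schema)
for problem in problems:
    print(problem)
```

Running it on the output of `TypeAdapter(YourType).json_schema()` should point at the same fields the API complains about, though the API's own validation remains the authority.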

@x0day
Author

x0day commented Apr 18, 2025

@rm-openai Thanks for your reply. Following your suggestion, the generated schema looks like this:

import json
from dataclasses import dataclass, field
from typing import Any, Optional
from pydantic import TypeAdapter

@dataclass
class OnError:
    error_type: str
    next: str = ""


@dataclass
class ConditionalBranch:
    condition: str
    next: str


@dataclass
class BaseNode:
    on_error: list[OnError] = field(default_factory=list)
    action: Optional[str] = ""
    branches: list[ConditionalBranch] = field(default_factory=list)
    parallel_branches: list[list[Any]] = field(default_factory=list)

print(json.dumps(TypeAdapter(BaseNode).json_schema(), indent=2))

Output:

{
  "$defs": {
    "ConditionalBranch": {
      "properties": {
        "condition": {
          "title": "Condition",
          "type": "string"
        },
        "next": {
          "title": "Next",
          "type": "string"
        }
      },
      "required": [
        "condition",
        "next"
      ],
      "title": "ConditionalBranch",
      "type": "object"
    },
    "OnError": {
      "properties": {
        "error_type": {
          "title": "Error Type",
          "type": "string"
        },
        "next": {
          "default": "",
          "title": "Next",
          "type": "string"
        }
      },
      "required": [
        "error_type"
      ],
      "title": "OnError",
      "type": "object"
    }
  },
  "properties": {
    "on_error": {
      "items": {
        "$ref": "#/$defs/OnError"
      },
      "title": "On Error",
      "type": "array"
    },
    "action": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": "",
      "title": "Action"
    },
    "branches": {
      "items": {
        "$ref": "#/$defs/ConditionalBranch"
      },
      "title": "Branches",
      "type": "array"
    },
    "parallel_branches": {
      "items": {
        "items": {},
        "type": "array"
      },
      "title": "Parallel Branches",
      "type": "array"
    }
  },
  "title": "BaseNode",
  "type": "object"
}

But models like Qwen cannot understand the "$ref": "#/$defs/ConditionalBranch" in the JSON schema, and they also don't support {"type": "null"} in the schema.
So we would like an output_schema option for Agent, similar to params_json_schema in FunctionTool.
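Until such an option exists, one possible workaround is to post-process the generated schema by hand. The sketch below is an illustration under stated assumptions, not SDK behavior: `flatten_schema` is a hypothetical helper that inlines local `#/$defs/...` references and collapses `anyOf: [X, {"type": "null"}]` down to `X`. It assumes the schema has no recursive references (those would loop forever), and dropping the null branch means the field is no longer nullable. The input is a trimmed copy of the schema above.

```python
import json
from typing import Any

REF_PREFIX = "#/$defs/"
NULL_SCHEMA = {"type": "null"}


def flatten_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Inline local $defs references and drop null alternatives (hypothetical helper)."""
    defs = schema.get("$defs", {})

    def resolve(node: Any) -> Any:
        if isinstance(node, list):
            return [resolve(item) for item in node]
        if not isinstance(node, dict):
            return node

        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(REF_PREFIX):
            target = defs[ref[len(REF_PREFIX):]]
            # Keys written next to $ref (e.g. a description) override the target's.
            siblings = {k: v for k, v in node.items() if k != "$ref"}
            return resolve({**target, **siblings})

        out = {key: resolve(value) for key, value in node.items()}

        # Collapse anyOf: [X, null] to X, keeping sibling keys such as "title".
        options = out.get("anyOf")
        if isinstance(options, list):
            non_null = [option for option in options if option != NULL_SCHEMA]
            if len(non_null) == 1 and len(non_null) < len(options):
                del out["anyOf"]
                out = {**non_null[0], **out}
        return out

    flat = resolve(schema)
    flat.pop("$defs", None)  # Nothing refers to the definitions any more.
    return flat


# Trimmed copy of the schema printed earlier in this comment.
schema = {
    "$defs": {
        "ConditionalBranch": {
            "properties": {
                "condition": {"title": "Condition", "type": "string"},
                "next": {"title": "Next", "type": "string"},
            },
            "required": ["condition", "next"],
            "title": "ConditionalBranch",
            "type": "object",
        }
    },
    "properties": {
        "action": {
            "anyOf": [{"type": "string"}, {"type": "null"}],
            "default": "",
            "title": "Action",
        },
        "branches": {
            "items": {"$ref": "#/$defs/ConditionalBranch"},
            "title": "Branches",
            "type": "array",
        },
    },
    "title": "BaseNode",
    "type": "object",
}

flat = flatten_schema(schema)
print(json.dumps(flat, indent=2))
```

The helper builds new dictionaries rather than mutating the input, so the original schema stays usable for validation on the Python side. Whether a given model handles the flattened form better is something to verify per model.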

@rm-openai
Collaborator

@x0day makes sense. Will send a PR soon.

rm-openai self-assigned this on Apr 18, 2025