Add typed structured-output round-trip helpers #196

Open
SalimELMARDI wants to merge 2 commits into exa-labs:master from SalimELMARDI:feat/typed-structured-output-roundtrip

@SalimELMARDI

Summary

This PR adds first-class typed structured-output round-tripping across the Python SDK’s core schema-enabled surfaces.

Today the SDK already accepts structured schemas in several places, but callers must still parse and validate the returned payloads themselves. This PR makes that flow consistent and ergonomic: when a caller passes a schema, the SDK also exposes the validated, typed object directly on the response.

What’s Included

New typed response fields

  • AnswerResponse.parsed_output
  • Result.parsed_summary
  • DeepSearchOutput.parsed_content

Supported round-trip flows

  • answer(..., output_schema=MyModel)
  • get_contents(..., summary={"schema": MyModel})
  • search(..., contents={"summary": {"schema": MyModel}})
  • search(..., type="deep" | "deep-reasoning", output_schema=MyModel)
  • async equivalents for the same surfaces

Additional improvements

  • Deep search now accepts Pydantic models directly as output_schema, not just raw dict schemas
  • Nested summary schemas in contents={"summary": {"schema": ...}} preserve field names correctly during request serialization
  • Added tests, docs, and a focused example for the typed round-trip workflow
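
To illustrate the field-name preservation point: if request options are converted from snake_case to camelCase on the way out, a naive recursive conversion would also rewrite the property names inside a user-supplied schema. The sketch below is not the SDK's serializer; `to_camel`, `camelize_options`, and the `max_characters` option are hypothetical names used only to show the idea of copying the `schema` subtree verbatim.

```python
from typing import Any, Tuple


def to_camel(key: str) -> str:
    """Convert a snake_case key to camelCase."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camelize_options(value: Any, preserve: Tuple[str, ...] = ("schema",)) -> Any:
    """Recursively camelCase option keys, leaving preserved subtrees untouched.

    Hypothetical helper: anything stored under a key listed in `preserve`
    is copied as-is, so user-defined schema property names survive.
    """
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            if key in preserve:
                out[key] = item  # user schema: do not rewrite its keys
            else:
                out[to_camel(key)] = camelize_options(item, preserve)
        return out
    if isinstance(value, list):
        return [camelize_options(item, preserve) for item in value]
    return value


options = {
    "summary": {
        "schema": {
            "type": "object",
            "properties": {"company_name": {"type": "string"}},
            "required": ["company_name"],
        }
    },
    "max_characters": 500,  # illustrative option name
}

wire = camelize_options(options)                # schema subtree preserved
naive = camelize_options(options, preserve=())  # schema keys get rewritten
```

With the preserved subtree, `wire` still contains `company_name` under the schema's `properties`, while `naive` would send `companyName`, which no longer matches the model's field.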

Why This Matters

This improves a very common research / eval / agent workflow:

Before:

  1. pass a schema into the SDK
  2. receive raw dict / str
  3. manually call json.loads(...), model_validate(...), or model_validate_json(...)

After:

  1. pass a schema into the SDK
  2. read the typed object directly from parsed_output, parsed_summary, or parsed_content

That makes the SDK more useful for:

  • evaluation pipelines
  • research notebooks
  • literature synthesis
  • agent workflows
  • structured automation
  • reproducible downstream processing
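
For reference, the manual step in the "Before" flow usually looks something like the sketch below. It assumes Pydantic v2, and `raw_text` is a made-up payload standing in for whatever string the SDK returns, not real API output.

```python
import json

from pydantic import BaseModel


class CompanyAnswer(BaseModel):
    company_name: str
    core_focus: str


# Made-up JSON string standing in for a raw structured response.
raw_text = '{"company_name": "Exa", "core_focus": "search for AI"}'

# Option 1: decode the JSON yourself, then validate the resulting dict.
via_dict = CompanyAnswer.model_validate(json.loads(raw_text))

# Option 2: let Pydantic decode and validate the JSON string in one call.
via_json = CompanyAnswer.model_validate_json(raw_text)
```

Both calls raise `pydantic.ValidationError` if the payload does not match the model, which is the check the typed fields are meant to take off the caller's hands.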

Before / After

Measured locally with the smoke scripts in personal/structured_output_roundtrip_smoke/:

Before

  • surfaces tested: 3
  • native typed surfaces: 0
  • manual validation required: 3

After

  • surfaces tested: 3
  • native typed surfaces: 3
  • manual validation required: 0

Tested surfaces:

  • answer
  • summary
  • deep_search

This PR is about improving structured-output ergonomics and consistency, not backend latency or model quality.

Backwards Compatibility

This change is fully additive:

  • existing response fields like answer, summary, and output.content are unchanged
  • existing dict-schema workflows continue to work
  • new typed fields are optional convenience surfaces layered on top of existing behavior
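
As a rough illustration of what "additive" means here, the sketch below models a response whose existing field is returned untouched and whose typed field defaults to `None`. `SketchAnswerResponse` and `build_response` are hypothetical stand-ins written with standard-library dataclasses; they are not the SDK's classes or its implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class SketchAnswerResponse(Generic[T]):
    """Hypothetical stand-in for a response object, not the SDK's class."""

    answer: Any                        # existing field, returned as before
    parsed_output: Optional[T] = None  # new field, None unless a schema is given


@dataclass
class CompanyAnswer:
    company_name: str
    core_focus: str


def build_response(
    raw_answer: Any,
    parser: Optional[Callable[[Any], T]] = None,
) -> SketchAnswerResponse[T]:
    """Attach a typed object only when the caller supplied a parser."""
    parsed = parser(raw_answer) if parser is not None else None
    return SketchAnswerResponse(answer=raw_answer, parsed_output=parsed)


payload = {"company_name": "Exa", "core_focus": "search for AI"}

legacy = build_response(payload)  # caller passed no schema
typed = build_response(payload, parser=lambda d: CompanyAnswer(**d))
```

Code that only reads `answer` behaves the same in both cases; only callers that opt in ever look at `parsed_output`.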

Validation

  • Added unit coverage for sync and async typed round-trip behavior
  • Added schema field-name preservation coverage for summary schemas
  • Full unit suite passes locally

Example

from pydantic import BaseModel, Field
from exa_py import Exa


# Schema describing the structured answer we want back.
class CompanyAnswer(BaseModel):
    company_name: str = Field(description="Company being described")
    core_focus: str = Field(description="Primary product or market focus")


exa = Exa(api_key="your-api-key")

# Pass the Pydantic model itself as the output schema.
response = exa.answer(
    "Summarize what Exa does for AI teams.",
    output_schema=CompanyAnswer,
)

# Guard against a missing parsed_output before reading a field from it.
print(response.parsed_output.company_name if response.parsed_output else None)
