Skip to content

Conversation

@Occupying-Mars
Copy link

Description

Adds ART ↔ verifiers integration under integrations/art_framework:

  • art_framework.py: loads ART task configs, converts tools to ToolEnv, exact-match rubric by default, optional JudgeRubric.
  • utils/art_adapter.py: ART→verifiers tool conversion with strict JSON schemas (no additionalProperties).
  • utils/verifiers_adapter.py: verifiers→ART export (schema-only).
  • examples/calculator.json: small, real, runnable example.
  • test_env.py: unit test for conversion, parser, and reward.

Key behaviors:

  • Enforces strict JSON tool schemas compatible with agents’ validator.
  • ARTParser extracts the final answer from the configured completion tool.
  • Works as an integration (installed via vf-install, not a core env).

Type of Change

  • New feature (non-breaking change which adds functionality)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

What I ran:

  • Unit test (no network):
    • uv run pytest integrations/art_framework/test_env.py -q
  • Local install + eval (exact-match rubric):
    • uv run vf-install art_framework -p integrations
    • UV_NO_PROJECT=1 uv run vf-eval -s art_framework -a '{"task_config_path":"integrations/art_framework/examples/calculator.json"}' -m gpt-5-nano -n 2 -r 1
  • Optional judge (requires OPENAI_API_KEY):
    • export OPENAI_API_KEY=sk-...
    • UV_NO_PROJECT=1 uv run vf-eval -s art_framework -a '{"task_config_path":"integrations/art_framework/examples/calculator.json","use_llm_judge":true,"judge_model":"gpt-5-nano"}' -m gpt-5-nano -n 2 -r 1

i tested it myself since it's just integration didn't really need to see rollouts but it works
Screenshot 2025-10-28 at 5 27 57 PM

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
    • ruff clean; explicit types where useful; fail-fast error handling
  • I have performed a self-review of my own code
  • I have commented code where non-obvious (conversion, strict schema rationale)
  • I have made corresponding changes to the documentation
    • integrations/art_framework/README.md with Overview, Quickstart, Env Args, ART config format, Portability
  • My changes generate no new warnings
  • Any dependent changes have been merged and published
    • None; integration is self-contained and optional

Additional Notes

  • Alignment: Follows integrations layout and roadmap; mirrors patterns from MCP and wiki_search (ToolEnv + optional JudgeRubric).
  • Strict schema: Tools are generated with explicit parameters (no **kwargs) to satisfy agents’ strict JSON schema validation (no additionalProperties).
  • Portability: Includes export helper and runnable example config.
  • Known limitations: Example implementation supports simple lambdas for demos; real deployments should provide proper functions/modules.

@CLAassistant
Copy link

CLAassistant commented Oct 28, 2025

CLA assistant check
All committers have signed the CLA.

@Occupying-Mars
Copy link
Author

@willccbb (sorry for the tag again just looking for a review)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants