Problem
TOON cuts bytes, but LLMs are trained on JSON. Switching `text` to TOON could lower task-success rates. This is the go/no-go gate for #913's TOON PR.
Proposed solution
- Run the existing workflow eval harness (`pnpm run evals:workflow`, `evals/workflows/results.json`) on master vs a TOON branch.
- Gate: no task-success regression. Record byte savings on the same runs.
- Capture results for the blog post.
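
The gate could be checked mechanically once both runs exist. The sketch below is only an illustration under stated assumptions: the `EvalRun` shape (`task`, `success`, `bytes`) and the `compareRuns` helper are hypothetical, since the real schema of `evals/workflows/results.json` is not described here, and it assumes one run per task. It flags any task that succeeded on master but not on the TOON branch and reports byte savings over the same runs.

```typescript
// Hypothetical result shape; the real results.json schema may differ.
interface EvalRun {
  task: string; // task identifier (assumed field)
  success: boolean; // did the workflow task succeed? (assumed field)
  bytes: number; // payload bytes for this run (assumed field)
}

interface GateReport {
  baselineSuccessRate: number;
  candidateSuccessRate: number;
  regressedTasks: string[];
  byteSavingsPct: number;
  pass: boolean;
}

// Fraction of runs that succeeded; 0 for an empty run list.
function successRate(runs: EvalRun[]): number {
  if (runs.length === 0) return 0;
  return runs.filter((r) => r.success).length / runs.length;
}

function totalBytes(runs: EvalRun[]): number {
  return runs.reduce((sum, r) => sum + r.bytes, 0);
}

// Compare a baseline (master) run against a candidate (TOON branch) run.
function compareRuns(baseline: EvalRun[], candidate: EvalRun[]): GateReport {
  const candidateByTask = new Map<string, EvalRun>();
  for (const run of candidate) candidateByTask.set(run.task, run);

  // A task regresses if it passed on the baseline but failed,
  // or is missing, on the candidate.
  const regressedTasks = baseline
    .filter((b) => b.success && !(candidateByTask.get(b.task)?.success ?? false))
    .map((b) => b.task);

  const baseBytes = totalBytes(baseline);
  const byteSavingsPct =
    baseBytes === 0 ? 0 : (1 - totalBytes(candidate) / baseBytes) * 100;

  const baselineSuccessRate = successRate(baseline);
  const candidateSuccessRate = successRate(candidate);

  return {
    baselineSuccessRate,
    candidateSuccessRate,
    regressedTasks,
    byteSavingsPct,
    // Gate: no per-task regression and no drop in the overall rate.
    pass: regressedTasks.length === 0 && candidateSuccessRate >= baselineSuccessRate,
  };
}
```

Checking per-task regressions as well as the aggregate rate keeps a newly failing task from being hidden by a different task that newly passes. Because LLM evals are noisy, a real gate would probably want several runs per branch before treating a single failure as a regression.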
Part of #941. Blocks #913 PR 2.