write results.json for structured agent consumption, harden crash diagnostics#331
write results.json for structured agent consumption, harden crash diagnostics#331shichangs wants to merge 2 commits intokarpathy:masterfrom
Conversation
…gnostics train.py now writes a results.json file after evaluation with the same metrics already printed to stdout. This gives agents a structured, parseable results channel instead of relying on grepping free-form stdout. program.md updated to read results.json first (fallback to grep), and to use filtered grep for crash diagnostics instead of raw tail, reducing the surface for indirect prompt injection via training output. Fixes karpathy#64 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
cc @johnwaldo (issue author) @mvanhorn (PR #79 author) — would appreciate your review. This takes a similar approach to #79 but rebased cleanly on current main. Offline verificationThe import json
# Simulate what train.py writes
results = {
"val_bpb": 0.9979,
"training_seconds": 300.1,
"total_seconds": 325.9,
"peak_vram_mb": 45060.2,
"mfu_percent": 39.8,
"total_tokens_M": 499.6,
"num_steps": 953,
"num_params_M": 50.3,
"depth": 8,
}
with open("results.json", "w") as f:
json.dump(results, f, indent=2)
# Verify roundtrip
with open("results.json") as f:
loaded = json.load(f)
assert loaded == results
assert isinstance(loaded["val_bpb"], float)
assert isinstance(loaded["num_steps"], int)
print("✓ results.json roundtrip OK")The key behavioral changes:
|
johnwaldo
left a comment
There was a problem hiding this comment.
Clean, minimal PR — correct mitigation layer for #64. Structured JSON eliminates the agent's need to parse free-form stdout, and the crash-path grep filter is a meaningful improvement over raw tail. A few observations:
Stale results.json on killed runs
If train.py is killed mid-run (the 10-minute timeout described in program.md), a results.json from a previous successful run would still exist on disk. The agent would read valid JSON with stale metrics and treat it as current — a silent wrong-answer bug.
Suggest deleting results.json at the top of train.py before training starts:
# Clear previous results so a killed run leaves no file
if os.path.exists("results.json"):
os.remove("results.json")This preserves the "no file = crash" contract that program.md step 5/6 relies on.
Crash-path grep is narrower but still an injection surface
The grep -i "error|exception|traceback" filter is better than raw tail -n 50, but an attacker can still craft output containing those words to get text into the agent's context. Worth noting this is mitigation, not elimination — a structured error.json on the crash path would close it fully, but that's arguably out of scope for this PR.
PR #79 cleanup
Since this supersedes #79 (same approach, rebased on current main), it would be good to close #79 with a pointer here to avoid confusion.
If train.py is killed mid-run (e.g. timeout), a results.json from a previous successful run would still exist on disk, causing the agent to read valid JSON with stale metrics. Delete it early so that "no file = crash" contract holds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for the thorough review @johnwaldo!
@mvanhorn Closing #79 in favor of this one — same approach, rebased on current main. Thanks for the original work there! |
|
Nice rebase. The crash diagnostics filter is a good addition over my original #79. Closing mine in favor of this. |
Summary
Fixes #64
train.py: writesresults.jsonafter evaluation with the same metrics already printed to stdout. Gives agents a structured, parseable results channel instead of relying on grepping free-form stdout fromrun.log.program.md: instructs agent to readresults.jsonfirst (fallback to grep). Crash diagnostics now use filteredgrep -i "error|exception|traceback"instead of rawtail -n 50, reducing the surface for indirect prompt injection via training output..gitignore: addsresults.jsonandrun.logas runtime artifacts.Existing stdout output is unchanged. No new dependencies (
jsonis stdlib).Similar to #79 but rebased on current main with no merge conflicts.
Test plan
uv run train.pyand verifyresults.jsonis written with correct metricsresults.jsonvalues match stdout summaryresults.jsonis NOT written🤖 Generated with Claude Code