feat: real-time web dashboard for experiment monitoring by Death-Incarnate · Pull Request #334 · karpathy/autoresearch

Death-Incarnate · 2026-03-18T18:37:00Z

What this adds

A single-file FastAPI dashboard (dashboard.py) that gives autoresearch a proper UI for monitoring and controlling experiments in real time.

Start it

cd autoresearch
uv run dashboard.py
# open http://localhost:7788

Set REPO_DIR env var to point at any autoresearch clone. Set PORT to change from 7788.
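
The two environment variables could be read along these lines. This is a minimal sketch, not the code in dashboard.py: the `load_config` helper and the fallback to the current directory when REPO_DIR is unset are assumptions. Only the variable names and the default port 7788 come from the description above.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Return (repo_dir, port) from an environment-style mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: fall back to the current directory when REPO_DIR is unset.
    repo_dir = Path(env.get("REPO_DIR", ".")).expanduser().resolve()
    # 7788 is the default port stated in the PR description.
    port = int(env.get("PORT", "7788"))
    return repo_dir, port
```

With a reader like this, `PORT=9000 uv run dashboard.py` would serve on port 9000 instead of 7788.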

Tabs

Overview

KPI cards: best val_bpb, baseline, improvement %, total/kept/discarded run counts
val_bpb scatter chart colored by status (keep=green, discard=red, crash=gray)
Status pie chart
Live GPU stats: utilization %, VRAM used, temperature (polls nvidia-smi)
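
For the GPU stats, one plausible way to poll nvidia-smi is its CSV query mode. The sketch below is an assumption about the approach, not the PR's actual code; the function names and dictionary keys are hypothetical. The `--query-gpu` fields and `--format=csv,noheader,nounits` are standard nvidia-smi options, and memory is reported in MiB when units are suppressed.

```python
import subprocess

# Query fields matching the three stats listed above.
QUERY = "utilization.gpu,memory.used,temperature.gpu"


def _num(text):
    """Convert a field to float, or None for values such as "[N/A]"."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    stats = []
    for line in text.strip().splitlines():
        util, mem_used, temp = (field.strip() for field in line.split(","))
        stats.append({
            "util_pct": _num(util),
            "vram_used_mib": _num(mem_used),
            "temp_c": _num(temp),
        })
    return stats


def read_gpu_stats():
    """Run nvidia-smi once; return [] if it is missing, fails, or times out."""
    cmd = ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=5)
    except (OSError, subprocess.SubprocessError):
        return []
    return parse_gpu_csv(out.stdout)
```

Returning an empty list on failure lets the Overview tab render on machines without an NVIDIA GPU instead of raising.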

Live Training

Step progress bar with % complete
Loss + mfu% dual-axis chart, updates every 2 seconds via polling
SSE log stream from run.log — color-coded (amber for step lines, green for summary blocks, red for errors)
Autoscroll toggle
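
The log stream combines two small pieces: classifying each line for color, and framing it as a Server-Sent Events message. The sketch below illustrates both under stated assumptions. The regular expressions are placeholders, since the PR does not show the run.log format, and `classify` and `sse_event` are hypothetical names. The `data:` framing itself follows the SSE wire format.

```python
import re

# Assumption: placeholder patterns; the real run.log format is not shown in the PR.
STEP_RE = re.compile(r"^step\s+\d+")
SUMMARY_RE = re.compile(r"^val_bpb\b")
ERROR_RE = re.compile(r"error|traceback|exception", re.IGNORECASE)


def classify(line):
    """Map a log line to the color named in the description."""
    if ERROR_RE.search(line):
        return "red"
    if STEP_RE.match(line):
        return "amber"
    if SUMMARY_RE.match(line):
        return "green"
    return "default"


def sse_event(line):
    """Frame one log line as a Server-Sent Events message."""
    payload = line.rstrip("\n")
    # Each payload line gets its own "data:" field; a blank line ends the event.
    body = "".join(f"data: {part}\n" for part in payload.split("\n"))
    return body + "\n"
```

On the browser side, an `EventSource` pointed at the streaming endpoint would receive each framed line as a `message` event.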

Experiments

Full results table sorted newest-first, best run highlighted in gold
Click any commit hash → inline diff of that experiment's train.py changes
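
A click-to-diff handler could shell out to git for the selected commit. This is a sketch of one way to do it, not necessarily what dashboard.py does: `diff_command` and `commit_diff` are hypothetical helpers. Because the hash arrives from the browser, the sketch validates it before passing it to git.

```python
import re
import subprocess

# Accept only abbreviated or full hexadecimal commit hashes.
HASH_RE = re.compile(r"^[0-9a-fA-F]{7,40}$")


def diff_command(commit, path="train.py"):
    """Build the git command showing one commit's changes to one file."""
    if not HASH_RE.match(commit):
        raise ValueError(f"not a commit hash: {commit!r}")
    # An empty --format suppresses the commit header so only the patch remains.
    return ["git", "show", "--format=", commit, "--", path]


def commit_diff(repo_dir, commit, path="train.py"):
    """Return the patch text for `path` in `commit`, run inside repo_dir."""
    result = subprocess.run(
        diff_command(commit, path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the command as a list rather than a shell string means the commit argument is never interpreted by a shell.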

Git History

Last 40 commits with click-to-diff

Controls

Launch form: run tag, model picker, max experiments
Stop button for the running experiment
Live hyperparameter table parsed directly from train.py
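
Parsing hyperparameters "directly from train.py" can be done without importing or executing the file. The sketch below uses Python's `ast` module and assumes hyperparameters are top-level, upper-case assignments to literal values. That naming convention is an assumption about train.py, and `parse_hyperparams` is a hypothetical name, not the PR's function.

```python
import ast


def parse_hyperparams(source):
    """Collect top-level UPPER_CASE = <literal> assignments from Python source."""
    params = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign) or len(node.targets) != 1:
            continue
        target = node.targets[0]
        if not isinstance(target, ast.Name) or not target.id.isupper():
            continue
        try:
            params[target.id] = ast.literal_eval(node.value)
        except (ValueError, TypeError):
            # Skip values that are expressions rather than plain literals.
            continue
    return params
```

Because the file is only parsed, never run, the table can refresh while an experiment is editing train.py without triggering any training code.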

Settings

Edit program.md in-browser with save
Read-only train.py viewer

Implementation notes

Zero frontend dependencies — all HTML/CSS/JS served inline from the single Python file
FastAPI + uvicorn only (added to pyproject.toml)
SSE endpoint for streaming log tail to the browser
All file reads are relative to REPO_DIR so it works with any clone location

Single-file FastAPI dashboard (dashboard.py) with:
- Overview: KPI cards (best val_bpb, baseline, improvement %, run counts), val_bpb scatter chart colored by keep/discard/crash, status pie chart, live GPU stats (util %, VRAM, temperature)
- Live Training: step progress bar, loss + mfu% dual-axis chart (2s poll), SSE log stream with color-coded output, autoscroll toggle
- Experiments: sortable table, best run highlighted in gold, click commit hash to see inline train.py diff
- Git History: last 40 commits with click-to-diff
- Controls: launch form (run tag, model picker), stop button, live hyperparams table parsed from train.py
- Settings: edit program.md in-browser, read-only train.py viewer

Zero frontend deps: pure HTML/CSS/JS served inline from the single Python file. Configure repo path via REPO_DIR env var, port via PORT env var (default 7788).

Start with: cd autoresearch && uv run dashboard.py

Adds fastapi>=0.135.1 and uvicorn>=0.42.0 to pyproject.toml.

Death-Incarnate · 2026-03-18T19:03:58Z

As context for why this matters right now: Alexey Grigorev just wrote a breakdown of autoresearch that's getting traction — https://alexeyondata.substack.com/p/karpathys-autoresearch-went-viral

One of the main things people hit when they clone this repo and start running experiments is zero visibility into what's happening. The training loop is a black box unless you're tailing logs manually.

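The dashboard fills that gap with a live loss curve, GPU stats, a log stream, and experiment history, all without adding any dependencies to the core research loop. It's opt-in, and the monitoring tabs only read experiment state; the Controls and Settings tabs are the exceptions, since they can launch or stop runs and edit program.md.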

Death-Incarnate wants to merge 1 commit into karpathy:master from Death-Incarnate:feature/dashboard
