You are an AI agent reading this repo to understand it or change it. This file is your front door: what DOS is in three lines, how to build and check your work, the short list of files actually worth reading, and the rules the kernel enforces on its own contributors. It is deliberately short and navigational — the detail lives behind the links.
Fitting, given what DOS is: the kernel exists to not believe an agent's self-report. So don't take this file on faith either: every claim here is one you can check with a
`dos` command or a `git` read, and where that's the point, the command is shown. (Human-oriented? Read README.md instead; it is the same story written for a person browsing GitHub.)
DOS is a small, deterministic kernel that referees a fleet of AI agents working on a shared git repo. Every agent narrates — "I shipped it," "tests pass," "still making progress." DOS treats all of that as a claim, not a fact, and hands back a verdict read from ground truth the agent could not have authored: git history, the file tree, a clock, an environment's own state.
- `verify`: did `(plan, phase)` actually ship? (from git ancestry, never the agent's word)
- `arbitrate`: may this agent take this lane, or does it collide with one already running?
- `liveness`: is the run ADVANCING, or just SPINNING / STALLED?
- `refuse`: say no with a reason a machine can act on.
Nothing here is coding-specific: a repo declares its own rules (lanes, paths,
ship-stamp grammar) as data in dos.toml; the kernel supplies only the
machinery. Reach it from the dos CLI, an MCP server, or import dos.
Before reading anything, watch the core idea happen:

    pip install -e .     # editable install from this clone (PyYAML is the only runtime dep)
    dos quickstart       # scaffolds a throwaway repo, commits, then:
    #   SHIPPED      AUTH AUTH1 … (via grep-subject)   exit 0
    #   NOT_SHIPPED  AUTH AUTH2 … (via none)           exit 1

One SHIPPED, one NOT_SHIPPED, from git alone: that contrast is DOS. The
hand-typed version of the same thing is docs/QUICKSTART.md
(5 minutes, every line is real output).
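
For an agent host that wants to consume the same verdict programmatically, a minimal sketch follows. It assumes only what the demo output above shows: exit 0 accompanies SHIPPED and exit 1 accompanies NOT_SHIPPED. The helper names (`verdict_from_exit_code`, `run_verify`) are hypothetical, the ERROR label for any other exit code is an assumption of this sketch, and the argument order mirrors the `dos verify --workspace . PLAN PHASE` form used elsewhere in this file.

```python
import subprocess


def verdict_from_exit_code(code: int) -> str:
    """Map a `dos verify` exit code to a verdict label.

    0 -> SHIPPED and 1 -> NOT_SHIPPED come from the quickstart output;
    labelling everything else ERROR is an assumption of this sketch.
    """
    if code == 0:
        return "SHIPPED"
    if code == 1:
        return "NOT_SHIPPED"
    return "ERROR"


def run_verify(plan: str, phase: str, workspace: str = ".") -> str:
    """Run `dos verify` in the foreground and return the verdict label."""
    result = subprocess.run(
        ["dos", "verify", "--workspace", workspace, plan, phase],
        capture_output=True,
        text=True,
        check=False,  # a NOT_SHIPPED exit is an answer, not a crash
    )
    return verdict_from_exit_code(result.returncode)
```

The point of keying off the exit code rather than parsing the printed line is that the caller never has to trust text an agent could imitate.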
The person who cloned this is often not a kernel contributor — they want to know what they fetched, see it work, install it, or wire it into their agent host. These are the verified moves; answer from this table instead of re-deriving them from the long README:
| The user asks | The move |
|---|---|
| "What is this? Show me." | dos quickstart — the 60-second caught-lie demo above. The hand-typed version is docs/QUICKSTART.md. |
| "Install it" (to use) | dos-kernel is on PyPI (since 2026-06-10) — pip install dos-kernel (runtime, PyYAML-only) or pip install "dos-kernel[mcp]" (adds the MCP server); tracking unreleased master is pip install "dos-kernel @ git+https://github.com/anthony-chaudhary/dos-kernel.git", and inside this clone pip install -e . works the same. Never pip install dos — that bare name is an unrelated squatter package. The full matrix (uv, pipx, wrappers, WSL) is docs/INSTALL.md. |
| "Install it" (to work on it) | pip install -e ".[dev,mcp]" — exactly what CI installs; brings pytest/ruff/mypy. |
| "Wire it into Claude Code" (or Cursor / Codex / Gemini / Antigravity / Claude Cowork) | Enforcement (hooks): dos init --hooks auto <their repo> — detects the runtime(s) the repo already uses and wires them all; or name one (--hooks claude-code, cursor, codex, gemini, antigravity, claude-cowork). Advisory (MCP): register dos-mcp in the host config — or install the bundled plugin, claude-plugin/README.md (prerequisite: the [mcp] install above). Hooks enforce, MCP advises; the repo recommends both. (Trae is the advisory-only exception: it has no hook seam, so it gets MCP + rules + skills and deliberately no --hooks trae — docs/294. Claude Cowork shares Claude Code's surfaces — same .claude/settings.json, same harness — but the app doesn't fire hooks yet, so its working surface is MCP + skills — docs/298.) |
| "Use it on MY repo" | cd <their repo> && dos init . && dos doctor — then dos verify PLAN PHASE answers from their git history. Works on a plain git repo; the one dos.toml is all the config. |
| "Wire it into LangGraph / CrewAI / AutoGen / the OpenAI or Claude Agents SDK" | examples/playbooks/cookbook-fleet-frameworks.md — one function at that framework's believe-the-agent seam (a referee node, a termination condition, an output guardrail); every recipe's seam was executed against the real framework, versions + verbatim output in the file. |
| "Run the tests" | The [dev] install in the next section, then python -m pytest -q — and read that section's foreground note before you start. |

    pip install -e ".[dev,mcp]"          # editable + the test/lint toolchain (exactly what CI installs)
    python -m pytest -q                  # the full kernel suite, must stay green (~4,900 tests, ~4–5 min)
    python scripts/dev.py fast           # the inner loop: pytest -m "not slow" (skips the ~150s of heavies)
    python scripts/dev.py verify-self    # doctor --check + a real SHIPPED/NOT_SHIPPED round-trip (the CI smoke)
    dos doctor --workspace .             # what IS this workspace? (the config seam, made visible)
    ruff check src/dos src/dos_mcp       # lint exactly as CI does (the wider tree is NOT lint-clean; don't "fix" it)

scripts/dev.py (test / fast / lint / verify-self / all) mirrors the CI
steps so green-local implies green-CI. Use fast while editing one module — it
skips the @pytest.mark.slow heavies (the poisoned-pool replays + the real-install
suite); the full pytest -q is still the pre-commit gate.
Two traps that bite an agent here. (1) A bare pip install -e . deliberately
installs only PyYAML — pytest comes from the [dev] extra, so the suite command
above fails without it. (2) Run the suite in the foreground and wait for its
verdict. It takes a few minutes; in a one-shot/headless session do NOT launch it
in the background and end your turn — your session ends before the suite does, and
the user receives a promise instead of a verdict.
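
The second trap can be sketched in a few lines. This is an illustrative wrapper, not part of the repo: `run_suite_in_foreground` is a hypothetical name, and the GREEN/RED labels are this sketch's own shorthand for a zero or non-zero exit.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_suite_in_foreground(cmd: Optional[Sequence[str]] = None) -> tuple[int, str]:
    """Run the suite, block until it exits, and return (exit_code, verdict)."""
    if cmd is None:
        # Default to the full-suite command shown above.
        cmd = [sys.executable, "-m", "pytest", "-q"]
    # subprocess.run waits for the child to finish; a bare Popen would return
    # immediately and leave the caller holding a promise instead of a verdict.
    completed = subprocess.run(list(cmd), check=False)
    verdict = "GREEN" if completed.returncode == 0 else "RED"
    return completed.returncode, verdict
```

Calling it with no arguments runs the full suite and only returns once pytest has exited, so the turn cannot end before the verdict exists.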
This repo is itself a DOS workspace (dos doctor reports
is_kernel_repo: true), so adjudicate your own work with the kernel — don't trust
your own narration any more than the kernel trusts an agent's:

    dos verify --workspace . docs/82_liveness-oracle-plan liveness   # did a phase actually ship? (asks git)
    dos commit-audit --workspace . HEAD                              # does a commit's SUBJECT match its own diff?
    dos arbitrate --workspace . --lane src                           # may I take this lane right now?

The full working ritual (doctor → arbitrate → edit → verify → commit-audit) is
the "DOS on DOS" section of CLAUDE.md. Use it for real, not just
as a demo: before you claim a docs/NN_*.md phase is done, dos verify it; after
you commit, dos commit-audit it. The oracle answers from git, so let the oracle
close the phase, not your prose.
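
As a rough sketch, the ritual's ordering can be written down as data. The function name `ritual_commands` is hypothetical; the invocations themselves are the ones shown in this file, and the edit step is left as a comment because it is not a `dos` command.

```python
def ritual_commands(
    plan: str, phase: str, lane: str, workspace: str = "."
) -> list[list[str]]:
    """Return the dos invocations of the working ritual, in order."""
    ws = ["--workspace", workspace]
    return [
        ["dos", "doctor", *ws],
        ["dos", "arbitrate", *ws, "--lane", lane],
        # (the edit itself happens here; it is not a dos command)
        ["dos", "verify", *ws, plan, phase],
        ["dos", "commit-audit", *ws, "HEAD"],
    ]
```

Keeping the sequence as a list of argv lists makes it easy for a harness to run each step with `subprocess.run` and stop at the first non-zero exit.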
The tree has ~213 files under docs/ and ~265 under benchmark/. Almost none
of that is required to understand or use DOS — it is a dated build journal and
research record. Do not try to read it all. Start with exactly these, in order:
| Read this | To learn |
|---|---|
| README.md | What DOS is, the syscall ABI, the full CLI, how to adopt it. The front door for humans. |
| docs/QUICKSTART.md | The runnable 5-minute hello-world. |
| CLAUDE.md | The architecture contract — the 4 layers, the one-way import rule, where code is allowed to live. Read this before editing any src/dos/ file. |
| docs/HACKING.md | Extend DOS without forking it — reasons, lanes, judges, renderers as workspace data (7 extension axes). |
| CONTRIBUTING.md | How to send a change: the layering rule and the CI-enforced litmus tests. |
Need to go deeper into the why or the research?
- The design notes (docs/79,102,108,138,182,204…) are essays explaining the thinking the code rests on. The curated index (guides vs. design notes vs. the dated journal) is docs/README.md. The numbers are chronology, not a reading order, and a few collide (there are two docs/191), so prefer the index over guessing a number.
- The benchmarks are six independent research programs that measure DOS claims; they are consumers of the kernel, never part of it. Start at benchmark/README.md / benchmark/BENCHMARKS.md, not by listing the directory.
- The per-module map (every kernel leaf, its docs/NN lineage) is the cold tier, docs/ARCHITECTURE.md. Read it before touching a specific leaf.
| Path | What it is |
|---|---|
| `src/dos/` | The kernel: pure verdict modules (oracle, arbiter, liveness, …). The thing you are mostly here to understand. |
| `src/dos/drivers/` | Drivers: the only place provider/host/IO policy lives (a host's lanes, an LLM judge). Outside the kernel boundary. |
| `src/dos_mcp/` | The MCP server (a separate top-level package on purpose; the kernel never imports it). |
| `src/dos/skills/` | The generic skill pack: package data, not code (nothing imports it; the files shell dos verbs). |
| `tests/` | The kernel suite. Many tests are litmus tests that pin the architecture rules below. |
| `examples/` | Runnable playbooks, copy-me extension skeletons (dos_ext/, drivers/), example workspaces. The fastest way to see real usage. |
| `docs/` | Guides (QUICKSTART, HACKING, ARCHITECTURE) + the numbered design-note / build journal. |
| `benchmark/` | Six research programs measuring DOS claims. Consumers, not kernel. |
| `paper/`, `scripts/`, `claude-plugin/`, `.github/` | The paper (generated; never hand-edit the .tex), release/dev tooling, the bundled Claude Code plugin, CI. All operate on the package; none is imported by it. |
DOS has a strict 4-layer architecture with a one-directional import rule, and
several of its rules are enforced by tests in tests/ (a violation turns the suite
red, so you'll find out fast). The ones most likely to bite an edit:
- The kernel imports no host and no vendor. No module under `src/dos/` (except `drivers/`) may name a host (`job`, `apply`, …) or a vendor (`claude`, `gemini`, `cursor`, …) as a code identifier. Host/vendor specifics live in a driver or come from `dos.toml`. (Pinned by `test_vendor_agnostic_kernel.py`, and the host litmus in `CLAUDE.md`.)
- `verify` needs no plan. The truth syscall must answer against a plain git repo with no plan and no registry. (Pinned by `test_verify_no_plan.py`.)
- Every verdict is a pure `classify(evidence, policy)`. I/O is gathered at the CLI boundary and passed in as data; a verdict function does no disk/network I/O. This is what makes the kernel testable.
- The package never assumes it lives in the repo it serves. Every path resolves against `SubstrateConfig.root` (`--workspace` › `$DISPATCH_WORKSPACE` › cwd), never `__file__`.
- A policy/scorer/judge can only refuse MORE, never admit a collision. Extensions are conjunctive under a deterministic floor: a buggy or hostile one degrades to the safe default; it cannot loosen safety.
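
Two of the rules above (the pure `classify(evidence, policy)` shape and the workspace-root precedence) lend themselves to a small illustration. This is a sketch under stated assumptions, not the kernel's code: the `Evidence` and `Policy` dataclasses, the `resolve_root` helper, and the regex-based stamp pattern are all hypothetical stand-ins, and the real signatures live in `src/dos/`.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional, Sequence


def resolve_root(
    cli_workspace: Optional[str], env: Mapping[str, str], cwd: Path
) -> Path:
    """Boundary step: --workspace, then $DISPATCH_WORKSPACE, then cwd.

    Note what is absent: nothing here consults __file__.
    """
    if cli_workspace:
        return Path(cli_workspace)
    if env.get("DISPATCH_WORKSPACE"):
        return Path(env["DISPATCH_WORKSPACE"])
    return cwd


@dataclass(frozen=True)
class Evidence:
    # Hypothetical: commit subjects already read from git at the CLI boundary.
    commit_subjects: Sequence[str]


@dataclass(frozen=True)
class Policy:
    # Hypothetical: the workspace's ship-stamp grammar expressed as a regex.
    stamp_pattern: str


def classify(evidence: Evidence, policy: Policy) -> str:
    """Pure verdict: no disk, no network; same inputs give the same answer."""
    stamp = re.compile(policy.stamp_pattern)
    shipped = any(stamp.search(s) for s in evidence.commit_subjects)
    return "SHIPPED" if shipped else "NOT_SHIPPED"
```

Because `classify` only sees data, a test can hand it a list of subjects and a pattern without creating a repository, which is the testability the rule is after.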
The canonical statement of all of this, with the full layer table and the litmus list, is CLAUDE.md. If a doc ever seems to contradict it, the doc is the stale one — CLAUDE.md is the contract.
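
The "refuse MORE, never admit" rule from the list above can be illustrated with a conjunctive check. This is a minimal sketch, assuming that the safe default is a refusal and that a judge is a plain callable returning `True` to admit; `Judge` and `admit` are hypothetical names, not the kernel's API.

```python
from typing import Callable, Iterable

# Hypothetical shape: a judge returns True to admit a lane request.
Judge = Callable[[str], bool]


def admit(lane: str, floor: Judge, extensions: Iterable[Judge]) -> bool:
    """Admit only if the floor admits AND every extension admits."""
    if not floor(lane):
        # The deterministic floor refused; no extension can override that.
        return False
    for ext in extensions:
        try:
            if not ext(lane):
                return False  # an extension may add a refusal
        except Exception:
            # Assumed safe default: a crashing extension counts as a refusal.
            return False
    return True
```

Since the result is a logical AND over the floor and every extension, adding an extension can only shrink the set of admitted requests.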
The outward-facing twin of these rules is
docs/STABILITY.md — the published promise about which
surfaces a consumer or plugin may depend on, what the version number means,
and how a deprecation is announced (DosDeprecationWarning, a
two-minor-release window). A change that breaks a surface that file calls
Stable needs the deprecation process, not just a green suite.
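
A deprecation announced this way might look like the sketch below. Everything in it is illustrative: the class is a local stand-in for the real `DosDeprecationWarning` (whose base class this file does not state), `old_surface` is a made-up function, and `earliest_removal` encodes one assumed reading of the two-minor-release window. docs/STABILITY.md is the authority on the actual process.

```python
import warnings


class DosDeprecationWarning(DeprecationWarning):
    """Local stand-in for the kernel's warning class (base class assumed)."""


def earliest_removal(deprecated_in: tuple[int, int]) -> tuple[int, int]:
    """Assumed reading: removable no sooner than two minor releases later."""
    major, minor = deprecated_in
    return (major, minor + 2)


def old_surface() -> str:
    """A hypothetical Stable surface that is being retired."""
    warnings.warn(
        "old_surface() is deprecated; see docs/STABILITY.md",
        DosDeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this function
    )
    return "still works"
```

The deprecated call keeps working during the window; the warning is the announcement, not the break.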
A commit is the ship-stamp dos verify reads, so a finished, green change
that isn't committed is a phase the kernel will call NOT_SHIPPED. When a unit of
work is complete and pytest -q is green, commit it — this trunk is master and
the preference is to land promptly, not defer. A few specifics:
- Commit only the lane you worked. Stage the specific files you touched (`git add src/dos/… docs/…`); never a blanket `git add -A`. The working tree here is often shared with another agent's in-flight edits; sweeping them into your commit is the exact `SELF_MODIFY` / disjoint-lane hazard the kernel refuses.
- Match the existing commit-subject grammar (`git log` shows it). Do not add a `Co-Authored-By` or other agent-attribution trailer: commits here carry no agent co-author, even if your harness appends one by default.
- Out of scope? File an issue, don't widen the commit. A finding that isn't your current task goes to `gh issue create`: dedupe first (`gh issue list --search "…"`), then file with a checkable done-condition, a lane guess, and where you found it. Issue text is public and the leak gate never scans it: no private paths or hostnames. When `scripts/leak_scan.py` is present, pipe the drafted body through it (`--stdin`, or `--text-file` on a draft written outside the repo) before posting; a hit is a refusal. Never close an issue off your own narration: put `Fixes #N` in the commit body and let the landing close it (or use the `issue-verify` skill for an evidenced manual close). The full rule is the "Out-of-scope findings" section of CLAUDE.md.
- Ask first only for the hard-to-reverse / outward-facing: pushing, tagging, a release, history rewrites. A local commit on `master` is none of those.
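
The first rule in the list above (stage named files, never a blanket add) can be enforced mechanically in an agent harness. A minimal sketch follows; `stage_command` and the `BLANKET` set are hypothetical, and the example path is a placeholder rather than a real file in this repo.

```python
# Arguments that would sweep in files the agent did not touch.
BLANKET = {"-A", "--all", ".", "-u", "--update"}


def stage_command(paths: list[str]) -> list[str]:
    """Build a `git add` argv that names each touched file explicitly."""
    if not paths:
        raise ValueError("name at least one file to stage")
    for p in paths:
        if p in BLANKET:
            raise ValueError(f"refusing blanket stage argument: {p!r}")
    # `--` ends option parsing, so a path can never be read as a flag.
    return ["git", "add", "--", *paths]
```

For example, `stage_command(["src/dos/example.py"])` yields an argv that stages exactly that one path, while `stage_command(["-A"])` raises before git is ever invoked.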
This file is for any agent (Claude Code, Cursor, Codex, Gemini CLI, Aider, an Agent-SDK app). CLAUDE.md holds the Claude-Code-specific working notes and the full architecture contract; it is the deeper read once this orientation has landed.