Agent OS is the agent-facing wrapper around secure-exec. It provides ACP sessions, agent adapters, quickstarts, and the public AgentOs client APIs while depending on secure-exec for the generic VM runtime.
- secure-exec dependency workflow. Manage the secure-exec dependency ONLY through `scripts/secure-exec-dep.mjs` (the `just secure-exec-*` recipes); never hand-edit the `path`/`version`/`catalog:` pins.
  - Testing against local secure-exec changes: run `just secure-exec-local` to repoint npm (`link:`) and crates (`path = "../secure-exec/..."`) at the sibling checkout, then `node scripts/secure-exec-dep.mjs set-crate-version <sibling-version>` so the Cargo version requirement matches the sibling crate version (otherwise cargo cannot resolve the path deps). Also run `pnpm install` in `../secure-exec` first, or cargo panics in `v8-runtime/build.rs` with "missing Node dependencies at .../packages/build-tools/node_modules" (the V8 bridge assets are built from there). Use `just secure-exec-status` to inspect. This mode is for local builds/tests ONLY.
  - Pushing changes that depend on secure-exec changes: NEVER push with local (`path:`/`link:`) dependencies. First preview-publish the secure-exec changes to their own secure-exec branch (the `preview-publish-secure-exec` flow), then point agent-os back at that exact published version with `just secure-exec-pinned` + `just secure-exec-set-version <version>` (and `set-crate-version <version>` for the crates). Only commit/push the pinned-to-remote state.
- Keep generic runtime, kernel, VFS, language execution, and registry software behavior in secure-exec.
- Agent OS owns ACP, sessions, agent adapters, toolkit semantics, quickstarts, and the AgentOs facade.
- Call OS instances VMs, never sandboxes.
- The protocol has no backwards compatibility. Clients and the sidecar ship in same-version lockstep, so never add protocol or config versioning, runtime negotiation, fallbacks, or converters. Configs such as `CreateVmConfig` carry no `version` field; the single same-version wire handshake is the only version check. Change the protocol freely and update both sides together.
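
As an illustration of the "single same-version handshake" rule, here is a minimal sketch. The `Hello` message shape and the `checkHandshake` helper are hypothetical and are not taken from the actual wire protocol. The check is strict equality with no negotiation or fallback path.

```typescript
// Hypothetical handshake message; the real wire types are not shown in this file.
interface Hello {
  readonly version: string;
}

type HandshakeResult =
  | { readonly ok: true }
  | { readonly ok: false; readonly reason: string };

// Accept only an exact version match. There is deliberately no range check,
// no downgrade path, and no converter: a mismatch is a hard failure.
function checkHandshake(local: Hello, remote: Hello): HandshakeResult {
  if (local.version === remote.version) {
    return { ok: true };
  }
  return {
    ok: false,
    reason: `version mismatch: client ${local.version}, sidecar ${remote.version}`,
  };
}
```

Because the check never inspects anything beyond the version string, there is no second place where compatibility logic could accumulate.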
Two independent version tracks:
- secure-exec: the `@secure-exec/*` npm packages and the `secure-exec-*` Cargo crates always share one version (npm and crates are kept in sync; pin both to the same `<v>`).
- `@agentos-software/*` software packages (registry agents / WASM commands) are on a separate track and version independently of secure-exec.
Manage them ONLY via these recipes (never hand-edit path/version/catalog: pins):
- `just secure-exec-local`: point deps at the sibling `../secure-exec` checkout for local hacking.
- `just secure-exec-set-version <v>`: pin secure-exec to a published version by setting the `@secure-exec/*` npm packages and the `secure-exec-*` crates (same `<v>`, they're in sync) and switching to pinned mode.
- `just agentos-pkgs-set-version <v>`: pin the `@agentos-software/*` software packages (separate version track).
agent-os builds against secure-exec crates + npm packages, so a secure-exec change must reach agent-os before it can be pushed. NEVER push with local (`path:`/`link:`) deps. Flow: preview-publish the secure-exec branch (the `preview-publish-secure-exec` skill), then run `just secure-exec-set-version <published-version>` (pins npm + crates + switches to pinned mode), and push only that pinned state. Caveat: a preview publishes npm but the crates.io job is dry-run/skipped, so a secure-exec crate change only flows locally (`secure-exec-local`) or via a real crates.io release.
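
The "never push with local deps" rule could be checked mechanically. The sketch below is a hypothetical pre-push guard, not part of `scripts/secure-exec-dep.mjs`. It scans an npm-style dependency map for local specifiers. The helper name, the prefix list, and the package name in the usage note are assumptions for illustration.

```typescript
// Hypothetical pre-push guard; not part of scripts/secure-exec-dep.mjs.
type DependencyMap = Readonly<Record<string, string>>;

// Specifier prefixes that point at a local checkout rather than a published version.
const LOCAL_PREFIXES = ["link:", "file:", "path:"] as const;

// Return the names of dependencies whose specifier is local, sorted for stable output.
function findLocalDeps(deps: DependencyMap): string[] {
  return Object.entries(deps)
    .filter(([, spec]) => LOCAL_PREFIXES.some((prefix) => spec.startsWith(prefix)))
    .map(([name]) => name)
    .sort();
}
```

For example, `findLocalDeps({ "@secure-exec/example": "link:../secure-exec/packages/example" })` returns `["@secure-exec/example"]`, and a CI or pre-push step could fail whenever the list is non-empty. Cargo `path` deps live in `Cargo.toml` and would need a separate check that this sketch does not cover.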
`just preview-publish <branch>` dispatches `.github/workflows/publish.yaml` to cut a preview (debug build, npm-only, dist-tag = sanitized branch name) for handing a build to an external project. Preview-publish is for previews ONLY; never cut a release with it. Releases go through `just release` (the `scripts/publish` flow).
To consume an unpublished agent-os build in another project on this machine:
- npm: `pnpm -r build`, then either `pnpm pack` the package(s) and `npm install ./rivet-dev-agentos-*.tgz` in the external project, or add a `link:`/`file:` override (e.g. `"@rivet-dev/agentos": "link:/abs/path/agent-os/packages/agentos"`). The sidecar binary ships as `@rivet-dev/agentos-sidecar`.
- cargo: point the external Cargo project at the local crate via a path dep or `[patch.crates-io]` override (e.g. `[patch.crates-io] agentos-sidecar = { path = "/abs/path/agent-os/crates/agentos-sidecar" }`).
Trust model (decide which side of the boundary something is on before judging whether it is a security bug). Three components:
- Client (trusted, except for anything it submits for execution). The AgentOs client / wire caller. The client and every value it configures are trusted: `CreateVmConfig`, mount descriptors and plugin configs (host_dir paths, S3 endpoints/credentials, Google Drive, sandbox-agent), the permission policy, network allowlist, resource limits, env, and DNS overrides. Configuration is not an attack surface. The only untrusted thing the client supplies is the code/payload it asks to run, because that runs in the executor.
- Sidecar (trusted; the TCB and enforcement point). The agent-os sidecar embeds and extends secure-exec; it brokers client requests and owns the kernel, VFS, mounts/plugins, socket table, and permission policy, and enforces the boundary against the executor.
- Executor — V8 isolates or WASM (untrusted; the adversary). Runs guest JS/Python/WASM plus any third-party/npm/agent-generated code. Assume it is actively hostile; how code reached the executor never makes it trusted.
The security boundary is sidecar ↔ executor. A defect that requires the client to supply a malicious config/endpoint/credential/policy is NOT a sandbox vulnerability (the client configures its own VM and already controls the host). Treat such hardening as defense-in-depth, not as an escape, and do not add validation that only guards trusted client-provided configuration. Corollaries: the permission policy/limits are trusted input but the guest is the subject they bind, so a guest bypassing an applied rule is in-scope; a host-backed mount's target/credentials are trusted, but confining the guest's I/O through it (symlink / .. / TOCTOU escapes) is in-scope. The wire transport is single-client over stdio, so wire authn/authz-between-clients and VM-to-VM-via-forged-id concerns are out of scope until a multi-client transport exists. See secure-exec root CLAUDE.md → Trust Model for the canonical statement.
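
To make the "mount target is trusted, guest I/O through it is in scope" corollary concrete, here is a minimal sketch of lexical path confinement. The `confineGuestPath` helper is hypothetical and is not the sidecar's actual VFS code. It treats `mountRoot` as trusted client configuration and `guestPath` as hostile input. It only handles `..` traversal. Symlink and TOCTOU escapes have to be handled at the filesystem layer and are not covered here.

```typescript
// Hypothetical helper; illustrates lexical confinement only.
// mountRoot: trusted, client-configured. guestPath: untrusted, supplied by executor code.
function confineGuestPath(mountRoot: string, guestPath: string): string | null {
  const stack: string[] = [];
  for (const segment of guestPath.split("/")) {
    // Skip empty segments (leading or duplicate slashes) and current-dir markers.
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Climbing above the mount root is an escape attempt: refuse the whole path.
      if (stack.length === 0) return null;
      stack.pop();
      continue;
    }
    stack.push(segment);
  }
  // Strip trailing slashes so the join below never produces "//".
  const root = mountRoot.replace(/\/+$/, "");
  if (stack.length === 0) {
    return root === "" ? "/" : root;
  }
  return `${root}/${stack.join("/")}`;
}
```

Note the asymmetry: nothing validates `mountRoot`, because validating trusted configuration is out of scope under this model, while every segment of `guestPath` is checked.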
- Isolation is layered (defense in depth), like Cloudflare Workers. Untrusted guest code is isolated within the host process by V8/WASM virtualization today; host-level jailing (sandboxing the process itself) is a planned additional layer. Because the in-process layer is load-bearing: keep the embedded V8 patched to current security releases, and never let one isolate take down the shared process — a per-isolate failure (heap OOM, CPU runaway) must terminate that isolate, not abort the host process.
- Match Cloudflare Workers wherever it makes sense. Use Workers' published behavior as the reference point for isolation semantics, resource limits, and egress defaults (e.g. ~128 MiB memory per isolate, bounded CPU time, default-deny network egress). Resource limits must be bounded by default (never `None`/0 for memory, heap, stack, or CPU time); operators may raise them.
- Every public method on `packages/core/src/agent-os.ts` must stay mirrored by RivetKit actor actions after the user confirms the Rivet repo path.
- Subscription methods are delivered through actor events; lifecycle behavior belongs in actor sleep/destroy hooks.
- Agent adapters must use real upstream agent SDKs. Do not replace SDK adapters with direct API-call stubs.
- Host-native agent wrappers are not allowed; agents run through the VM runtime supplied by secure-exec.
- Agent OS extension payloads use the secure-exec `Ext` envelope with Agent OS-owned namespaces and generated ACP payloads.
- Keep ACP decoding and session state in Agent OS wrapper code, not in secure-exec core sidecar code.
- The agent-os sidecar wrapper embeds and extends secure-exec; secure-exec must remain free of ACP, agent, and session dependencies.
- Prefer the agent-os sidecar wrapper for heavy lifting. Multi-step ACP/session orchestration, state machines, and anything that would otherwise cost several client→sidecar round-trips belong in the sidecar (`crates/agentos-sidecar`), exposed as a single wire request; the TypeScript (`packages/core`) and Rust (`crates/client`) clients stay thin forwarders and must BOTH expose it. Rationale: (a) keep clients simple and in parity, (b) cut client↔sidecar latency. Keep logic client-side only when it needs state the sidecar cannot reach, e.g. RivetKit actor durable storage (`ctx.db_*`/SQLite), which the sidecar has no access to. Even then, the sidecar must not pull ACP/session deps into secure-exec core.
- External/consumer usage (installing `@rivet-dev/agentos` and using it in your own project) is documented in the website quickstart + Agents/Custom Software pages under `website/`, not in this file. This `CLAUDE.md` is contributor/maintainer-only.
- The Agent OS website and docs live in `website/` (Astro + Starlight) and deploy to `agentos-sdk.dev` (docs at `agentos-sdk.dev/docs`). The marketing pages and docs were migrated out of `rivet.dev/agent-os` and `rivet.dev/docs/agent-os`, which now 301-redirect to this domain.
- Docs styling is owned by the shared `@rivet-dev/docs-theme` repo (`github.com/rivet-dev/docs-theme`), consumed via `github:rivet-dev/docs-theme#<tag>` and wired in via `...docsTheme(starlight, siteConfig)`. To change any docs styling (palette, header, sidebar, code blocks, fonts), edit that repo and follow its CLAUDE.md release workflow; never restyle docs in `website/src`. This site owns only content + `website/docs.config.mjs` (sidebar icons via each item's `attrs['data-icon']`).
- Architecture reference docs live in `website/src/content/docs/docs/architecture/` and are surfaced in `website/docs.config.mjs` under Reference → Advanced → Architecture. Treat these pages as the canonical human-facing architecture reference. When architecture behavior changes or new architecture is added, recommend the corresponding docs update to the user; do not proactively edit the docs unless the user asks for docs work or the task explicitly includes it.
- The core quickstart under `examples/quickstart/` and the RivetKit example must stay behaviorally identical.
- Every quickstart change needs a matching automated test in the same change.
- Confirm the docs repo path with the user before editing Agent OS docs.
- Keep `website/src/data/registry.ts` current when package names or registry entries change.
- Auto-skip expensive resource-saturation tests. A test that proves the absence of a bound by actually saturating a resource (a JS/WASM infinite loop pinning a CPU core for the watchdog window, a heap/alloc bomb, a fork bomb, or anything that aborts the process) must be marked `#[ignore = "expensive: <resource> saturation; run with --ignored"]` (vitest: `it.skip` or an env gate). These pin cores or crash the runner and bog down normal runs.
- Still test the expensive safeguards. A configured limit/watchdog/quota actually firing (CPU-time limit set → runaway terminated; WASM fuel set → exit 124; heap cap → bounded; fd/process/socket cap → denied) is bounded and fast because the safeguard ends it. Keep these in the default suite; they are the regression guard that the protection works.
- Rule of thumb: if the test ends only when a timeout/watchdog whose absence you are documenting fires (slow, unbounded) → `#[ignore]`. If it ends because a safeguard fires (fast, bounded) → keep it running.
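
On the vitest side, the rule of thumb above could be encoded as a small env gate. This is a sketch only: the `RUN_EXPENSIVE_TESTS` variable name and both helpers are assumptions, not existing project conventions.

```typescript
// Minimal env shape so the sketch does not depend on Node typings.
type GateEnv = Readonly<Record<string, string | undefined>>;

// "saturation": ends only when the resource is exhausted or a timeout fires (slow, unbounded).
// "safeguard": ends because a configured limit fires (fast, bounded).
type TestKind = "saturation" | "safeguard";

// Hypothetical gate: the variable name RUN_EXPENSIVE_TESTS is an assumption.
function expensiveTestsEnabled(env: GateEnv): boolean {
  const raw = env["RUN_EXPENSIVE_TESTS"];
  return raw === "1" || raw === "true";
}

// Safeguard tests always run; saturation tests run only when explicitly enabled.
function shouldRun(kind: TestKind, env: GateEnv): boolean {
  if (kind === "safeguard") return true;
  return expensiveTestsEnabled(env);
}
```

In a vitest file, a saturation test could then be declared with `it.skipIf(!shouldRun("saturation", process.env))(...)`, while tests in which a configured limit fires stay ungated in the default suite.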
All agent working files live user-scoped in ~/.agents/, never inside the repo. Override the location with the AGENTS_DIR env var. These files are not committed; .agent/ is gitignored as a safety net.
- Specs: `~/.agents/specs/` (design specs and interface definitions for planned work).
- Research: `~/.agents/research/` (research documents on external systems, prior art, and design analysis).
- Todo: `~/.agents/todo/*.md` (deferred work items with context on what needs to be done and why).
- Notes: `~/.agents/notes/` (general notes and tracking).
- Benchmarks: `~/.agents/benchmarks/` (benchmark result artifacts).
When the user asks to track something in a note, store it in ~/.agents/notes/ by default. When something is identified as "do later", add it to ~/.agents/todo/. Design documents and interface specs go in ~/.agents/specs/.
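
The location rules above can be sketched as a small resolver. The helper names are hypothetical. The sketch also makes two assumptions that this file does not state: an empty `AGENTS_DIR` is treated as unset, and the subdirectory names stay the same under an overridden location.

```typescript
// Minimal env shape so the sketch does not depend on Node typings.
type AgentEnv = Readonly<Record<string, string | undefined>>;

// The kinds of working files listed above, named after their subdirectories.
type AgentFileKind = "specs" | "research" | "todo" | "notes" | "benchmarks";

// AGENTS_DIR wins when set and non-empty (assumption: empty means unset);
// otherwise fall back to ~/.agents under the given home directory.
function resolveAgentsDir(env: AgentEnv, homeDir: string): string {
  const override = env["AGENTS_DIR"];
  if (override !== undefined && override.trim() !== "") {
    return override;
  }
  return `${homeDir.replace(/\/+$/, "")}/.agents`;
}

// Assumption: subdirectories keep the same names under an overridden location.
function agentPath(env: AgentEnv, homeDir: string, kind: AgentFileKind): string {
  return `${resolveAgentsDir(env, homeDir)}/${kind}`;
}
```

With no override, `agentPath({}, "/home/u", "notes")` yields `/home/u/.agents/notes`, which matches the default destination for notes described above.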