An agent that can execute shell commands, open URLs, and write files is a privileged process. ZeroClaw's security model sits on top of every tool call and every channel message, gating what the agent is actually allowed to do at runtime.
There are six layers. From outer to inner:
Before a message from a channel reaches the agent, the channel's pairing and allow-list are checked. allowed_users, allowed_chats, IP allowlists for webhooks β all enforced at the channel adapter, before the runtime sees the event.
Docs: each channel's page under Channels.
The coarse-grained knob. Three settings:
- ReadOnly β the agent can observe (read files, query memory, fetch URLs it's allowed to fetch) but cannot write or execute commands.
- Supervised (default) β low-risk ops run; medium-risk ask the operator; high-risk block.
- Full β no approval gates;
workspace_onlyis implicitly disabled.forbidden_paths,forbidden_commands, and the OS sandbox still enforce.
Docs: Autonomy levels.
The agent operates within a configured workspace directory. file_read, file_write, and shell (for commands that touch the filesystem) refuse paths outside it unless workspace_only = false.
Per-session sandbox roots (ACP and gateway WebSocket): When a session is opened via ACP (session/new with a cwd parameter) or via the gateway WebSocket (connect-time cwd parameter), that path becomes the SecurityPolicy workspace boundary for all file and shell tools for the lifetime of the session. The daemon's global workspace_dir remains the data directory for memory, identity, cron, and other persistent state. The model is: session cwd = project boundary the agent can touch; workspace_dir = where ZeroClaw stores its own files. Note: the agent's system prompt currently reflects the daemon's workspace_dir rather than the session cwd; enforcement is correct but the model's self-reported location may differ.
Important: the cwd parameter changes which directory on the ZeroClaw host the agent is sandboxed to β it does not affect which machine tools run on. Tool use (shell commands, file reads/writes) always executes on the machine running ZeroClaw. If you connect to a remote ZeroClaw instance over the gateway WebSocket, tool calls operate on the remote machine's filesystem, not on your local machine. For localhost-only deployments this distinction does not matter, but remote setups should account for it.
Beyond the workspace, a forbidden_paths list (default: /etc, /sys, /boot, ~/.ssh, β¦) is always blocked regardless of workspace setting.
For shell invocations:
allowed_commandsβ if non-empty, shell only runs commands whose basename is in this listforbidden_commandsβ explicit denylist (rm -rf /,shutdown, kernel operations)validate_command_executionβ a pattern-matching pass that looks for dangerous flags, pipelines, and argument shapes
The validator runs before the command hits the shell. A blocked command surfaces as a tool error the model sees and can react to.
When a sandbox backend is available, tool invocations run inside it:
| Platform | Default backend |
|---|---|
| Linux | Landlock (kernel) / Bubblewrap / Firejail / Docker β auto-detected |
| macOS | Seatbelt (native) |
| Windows | AppContainer (experimental) |
| Any | Docker (if the daemon is reachable) |
The sandbox confines filesystem access to the workspace, drops network reachability except what the tool explicitly needs, and removes access to the parent process's secrets.
Docs: Sandboxing.
Every tool invocation β whether it executed, was blocked, or required approval β produces a signed receipt in a chain. Each receipt includes the hash of the previous one, so tampering with any receipt invalidates the rest.
Receipts are the source of truth for "what did the agent do yesterday". They're readable, greppable, and durable.
Docs: Tool receipts.
Beyond the six layers:
- OTP gating β
[security.otp] gated_actions = ["shell", "browser", "file_write"]requires a one-time code before each listed action. Useful for remote-access scenarios. - Emergency stop β
zeroclaw estophalts all in-flight tool calls. With[security.estop] enabled = true, resuming requires an OTP. - Prompt injection guard β scans model output for known injection patterns before tool calls are validated.
- Leak detector β scans outbound messages for secrets (API key patterns, private keys) and blocks sends that match.
- Pairing guard β device pairing for channel auth; prevents stolen credentials from working on a new device.
A blocked tool call doesn't silently fail:
- The security validator returns an error
- The runtime wraps it as a
ToolResult::Errand hands it back to the model - The model sees "Error: Shell command blocked by policy: forbidden pattern
rm -rf /" and can retry, apologise, or ask the user
If a tool is excluded from the channel via [autonomy].non_cli_excluded_tools (which gates non-CLI channels as a group), it simply isn't advertised to the model on those channels. Model never sees a tool it can't use.
Out of the box:
- Autonomy:
Supervised - Workspace-only:
true - Sandbox: auto-detect (uses whatever the OS provides)
- Audit logging:
false(enable explicitly) - OTP:
false - E-stop:
false
This is a reasonable middle ground β safe enough for a laptop, permissive enough to not frustrate. Crank it up for production (OTP, audit, restricted tools) or down to YOLO for a dev box.