Verified by: 3 Explore agents, LSP symbol checks, direct file reads, and a /consult-codex dual-AI consultation (Codex GPT-5.3 + Code-Searcher). All findings were independently confirmed with high agreement.
Users report OpenClaw can be resource-intensive. This guide documents every resource-consuming subsystem with verified source references, plain-English explanations, and practical optimization strategies.
- A. CPU-Intensive Operations
- B. Memory-Intensive Operations
- C. Disk-Intensive Operations
- D. Optimization Strategies for Users
- E. Monitoring & Profiling Guide
- F. How OpenClaw Memory Works (Architecture Deep-Dive)
- Summary
Think of CPU usage like a chef in a kitchen. Most of the time they're waiting for orders, but some tasks — like hand-rolling 42 different sushi pieces just to pick the smallest one — keep them frantically busy. These are the operations that make your fans spin and your machine feel sluggish.
| # | Operation | Source | Impact | Plain English |
|---|---|---|---|---|
| 1 | Screenshot normalization — nested loop of up to 7 sizes × 6 qualities = 42 sharp resize ops per screenshot | `extensions/browser/src/browser/screenshot.ts:35-57` | Very High | Like resizing a photo 42 different ways to find which version fits in an envelope — each resize takes real effort |
| 2 | PNG image optimization — grid of 5 sizes × 4 compression levels = 20 sharp ops (mozjpeg is CPU-heavy) | `src/media/image-ops.ts:603-660` | Very High | Like printing the same photo at 20 different quality settings to find the smallest file — each print job takes CPU time |
| 3 | Local embedding inference — on-device GGUF model via node-llama-cpp, Promise.all over all texts | `extensions/memory-core/src/memory/embeddings.ts:103-164` | Very High (when local) | Like running a mini-ChatGPT on your own machine to understand your notes — powerful, but it demands serious CPU |
| 4 | Plugin loading via jiti — synchronous TypeScript transpilation per plugin at startup | `src/plugins/loader.ts:166-195` | High (startup) | Like compiling a recipe book from scratch every time you open the kitchen, instead of using a pre-printed copy |
| 5 | Cosine similarity fallback — O(n) full-scan vector comparison when sqlite-vec is unavailable | `extensions/memory-core/src/memory/manager-search.ts:74-149` | High (per query) | Like comparing a new photo to every single photo in your album one by one, instead of using a smart index |
| 6 | PDF-to-image rendering — per-page canvas creation + PNG encoding via @napi-rs/canvas | `src/media/pdf-extract.ts:42-103` | High (per PDF) | Like photocopying each page of a PDF into a separate image file — each page takes a rendering pass |
| 7 | Full AX tree traversal — Accessibility.getFullAXTree on complex browser pages | `extensions/browser/src/browser/cdp.ts:282-295` | Medium-High | Like reading every element on a web page aloud for accessibility — hundreds of elements on complex pages |
| 8 | Image resize via sips — macOS-specific process spawning for each HEIC conversion/resize | `src/media/image-ops.ts:136-274` | Medium | Like opening a separate program for each photo conversion — the per-process overhead adds up |
| 9 | Media understanding — sending media to AI providers (Whisper/Gemini/OpenAI) for transcription | `src/media-understanding/runner.ts:773-919` | Medium | CPU cost is mostly on the provider side, but local buffering and encoding still take cycles |
| 10 | Ed25519 keypair generation — asymmetric crypto on first run / device identity creation | `src/infra/device-identity.ts:57` | Low (one-time) | Like generating a strong password — intensive, but it happens only once |
| 11 | Memory sync — file hashing + markdown chunking + embedding + SQLite FTS5/vec indexing | `extensions/memory-core/src/memory/manager-sync-ops.ts:693` | Medium (periodic) | Like re-indexing a library catalog — scanning, categorizing, and filing every document |
| 12 | TTS generation — ElevenLabs/OpenAI/Edge TTS API calls + audio buffer handling | `src/tts/tts.ts` (barrel → `plugin-sdk/speech-runtime.js`) | Medium | The API calls are remote, but audio buffer conversion is local CPU work |
| 13 | Agent execution loop — continuous model response processing | `src/auto-reply/reply/agent-runner-execution.ts:139` | Medium (continuous) | The main "brain" loop — always running while the bot is responding |
| 14 | Cron timer loop — re-arming setTimeout for scheduled job processing | `src/cron/service/timer.ts:638` | Low (idle) | Like a clock ticking in the background — minimal CPU unless jobs are firing |
Polling and reconnection loops:

- Signal SSE event stream — long-lived HTTP connection with reconnection logic
- Telegram long-polling — continuous `getUpdates` calls to the Telegram API
- WhatsApp monitor — persistent connection with keepalive

Crypto operations:

- HMAC signature verification for webhooks
- SHA-256 hashing for content deduplication and caching

Child process stdout/stderr accumulation:

- `src/process/exec.ts:339-342` — unbounded string concatenation of process output
- `extensions/memory-core/src/memory/qmd-manager.ts` — QMD process output is capped at `MAX_QMD_OUTPUT_CHARS` (200,000 chars by default). The `resolveSpawnInvocation()` helper at `:72` handles Windows-compatible spawn routing.

Media fetch buffering:

- `src/media/fetch.ts:182-199` — media fetch is now bounded when `maxBytes` is specified: `readResponseWithLimit()` (`src/media/read-response-with-limit.ts`) streams chunk-by-chunk and aborts early on overflow, preventing unbounded memory consumption. It falls back to an unbounded `arrayBuffer()` only when no limit is specified (e.g., document fetches without size constraints).
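The bounded-read pattern is easy to demonstrate in shell. A minimal sketch (not OpenClaw code) of the same idea — stop consuming a stream once a byte limit is reached, instead of buffering the whole body:

```shell
# Cap how many bytes of a stream we keep — a shell analogue of the
# bounded-read approach: the pipeline stops at the limit instead of
# buffering a potentially huge body in memory.
MAX_BYTES=1000

# Simulate a 1 MB response body, keep only the first MAX_BYTES of it.
dd if=/dev/zero bs=1024 count=1024 2>/dev/null |
  head -c "$MAX_BYTES" > /tmp/bounded-body.bin

wc -c < /tmp/bounded-body.bin
```

`head -c` exits as soon as the limit is hit, so the upstream producer receives a broken pipe and stops early — the same "abort on overflow" behavior, at shell granularity.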
Memory (RAM) is like your desk space. The more papers, sticky notes, and browser tabs you keep open, the more crowded it gets. Some parts of OpenClaw are tidy — they clean up after themselves. Others keep piling up papers and never throw anything away, eventually leaving no room to work.
| Cache | Location | Bound | Risk |
|---|---|---|---|
| Session store cache | `src/config/sessions/store-cache.ts:11-13` | 45s TTL, structuredClone per read | Medium — each entry holds all 500 sessions |
| Discord presence cache | `src/discord/monitor/presence-cache.ts:9` | 5000/account LRU | Low |
| Telegram sent message cache | `src/telegram/sent-message-cache.ts:12` | 24h TTL, 100/chat | Low-Medium |
| History map | `src/auto-reply/reply/history.ts:7` | 1000 keys LRU | Well bounded |
| Inbound dedupe | `src/auto-reply/reply/inbound-dedupe.ts:9` | 5000 max, 20min TTL | Well bounded |
| Gateway dedupe | `src/gateway/server-constants.ts:26-27` | 1000 max, 5min TTL | Well bounded |
| Browser roleRefs | `extensions/browser/src/browser/pw-session.ts:114-115` | 50 max LRU | Well bounded |
| Followup queues | `src/auto-reply/reply/queue/state.ts:19` | 20/queue, no queue-count cap; `clearFollowupQueue()` (`queue/cleanup.ts:24`) clears individual queues during session cleanup | Partially mitigated — individual queues can be cleared, but the total queue map is still uncapped |
| Agent event seqByRun | `src/infra/agent-events.ts:23` | No cleanup (`seqByRun` never pruned; `runContextById` now cleaned via `clearAgentRunContext()` at `:49`) | Partial leak — `runContextById` fixed, `seqByRun` still leaks |
| Agent run sequence | `src/gateway/server-runtime-state.ts:234` | Bounded at `AGENT_RUN_SEQ_MAX` = 10,000 (pruned by maintenance timer) | Well bounded |
| WhatsApp group histories | `src/web/auto-reply/monitor.ts:105` | Helper has a 1000-key cap, but direct writes from the web channel bypass it | Partial leak |
| WhatsApp group member names | `src/web/auto-reply/monitor.ts:115` | No eviction at all | Leak risk |
| Cost usage cache | `src/gateway/server-methods/usage.ts:60` | 30s TTL per entry, no max entry count | Low-Medium |
| Warned contexts | `src/infra/session-maintenance-warning.ts:17` | Never pruned | Low |
| Announce queues | `src/agents/subagent-announce-queue.ts:60` | Per-queue cap, no queue-count cap | Low |
| Telegram sent msgs outer map | `src/telegram/sent-message-cache.ts:12` | Per-chat TTL, but the outer map never evicts dead chat keys | Low-Medium |
Session store cache: Like photocopying an entire filing cabinet every time you check one folder — works, but wastes desk space.
Unbounded Maps (seqByRun, agentRunSeq): Like a guest book that records every visitor but never tears out old pages — after months, it's a thick ledger eating memory for no reason.
- Chromium instance (Playwright CDP): `extensions/browser/src/browser/pw-session.ts:121` — the connection is cached (`cachedByCdpUrl`), but Chromium itself can consume 200MB to 2GB+. Like having a full web browser running invisibly in the background — it alone can use more memory than everything else combined.
- Per-page state caps: console (500), errors (200), network requests (500) — `extensions/browser/src/browser/pw-session.ts:117-120`
- WeakMaps used for page/context state (GC-friendly): `extensions/browser/src/browser/pw-session.ts:107-108`
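To see the Chromium cost on your own machine, you can sum the resident memory of its processes. A rough sketch — the process-name pattern varies by platform and browser channel, so treat the regex as an assumption to adjust:

```shell
# Total RSS (in KB) across all Chromium/Chrome processes.
# The /[Cc]hrom/ pattern is a guess — tune it for your platform.
ps axo rss=,comm= | awk '/[Cc]hrom/ { sum += $1 } END { print sum + 0 }'
```

On a busy browsing session this number routinely dwarfs the Gateway's own RSS, which is why disabling browser tools is the single biggest memory win.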
The conversation with the AI model grows with every message. Left unchecked, this would eat unlimited memory.
Like a conversation transcript that keeps growing — OpenClaw has a "summarizer" that periodically condenses old messages, like a meeting secretary writing "minutes" instead of keeping the full recording.
OpenClaw has several defenses:

- Compaction system: `src/agents/compaction.ts` — multi-stage summarization to prevent unbounded context growth
- History turn limiting: `src/agents/pi-embedded-runner/history.ts:15-36`
- Context pruning extension: `src/agents/pi-extensions/context-pruning/extension.ts`
- WebSocket payload limits: 25MB/frame, 50MB/connection buffer, 6MB history — `src/gateway/server-constants.ts:3-4`
Modules loaded via jiti persist for process lifetime. Each plugin's tools, commands, and hooks are registered in global maps and never unloaded.
Disk usage is like storage boxes in your garage. Some boxes are neatly labeled and rotated out seasonally (managed). Others just keep growing — like never deleting old text messages — until one day your phone says "Storage Full." OpenClaw has both kinds, and the unbounded ones are the ones to watch.
| Resource | Location | Risk |
|---|---|---|
| Transcript `.jsonl` files | `src/config/sessions/transcript.ts:133-209` | No rotation, no size limit — grows forever per session |
| Command logger | `src/hooks/bundled/command-logger/handler.ts:47-62` | No rotation — `commands.log` grows unbounded |
| Telegram sticker cache | `src/telegram/sticker-cache.ts:35-67` | No eviction — JSON grows with unique stickers |
| Browser user-data profiles | `extensions/browser/src/browser/chrome.ts:80-81` | Full Chromium profile — can reach GBs |
| SQLite databases | `extensions/memory-core/src/memory/manager-sync-ops.ts:88-98` | No VACUUM — the database grows unbounded without periodic vacuuming (uses the default DELETE journal mode, no WAL) |
| Per-day log file size | `src/logging/logger.ts:49,187-202` | Capped — 500MB default (`DEFAULT_MAX_LOG_FILE_BYTES`), configurable via `logging.maxFileBytes`; warns once, then suppresses writes when the cap is reached |
| Voice-call `calls.jsonl` | `extensions/voice-call/src/manager/store.ts:7-10` | Append-only, no rotation + full-file reads on load |
Transcript JSONL files: Like a chat log that records every message forever but never archives or deletes old conversations — a busy bot can accumulate gigabytes over months.
Browser user-data profiles: Like a real web browser's cache, cookies, and history — it grows the more pages the bot visits, just like your own browser.
commands.log: Like a security camera that records 24/7 but never overwrites old footage — eventually the DVR fills up.
| Resource | Limit | Location |
|---|---|---|
| Media files | 2min TTL auto-cleanup | src/media/store.ts:16,113-130 |
| Rolling logs | 24h age pruning | src/logging/logger.ts:48,357 |
| Session store | 500 entries, 30d prune, 10MB rotation, 3 backups | src/config/sessions/store-maintenance.ts:12-14 |
| Cron run logs | 2MB/2000 lines self-pruning | src/cron/run-log.ts:82-83 |
| TTS temp files | 5min delayed cleanup | src/tts/tts-core.ts:15,197-209 |
| Pairing requests | 3/channel, 1h TTL | src/pairing/pairing-store.ts:14-15 |
| Type | Limit | Location |
|---|---|---|
| Images | 6MB (10MB input files) | src/media/constants.ts:1, src/media/input-files.ts:106 |
| Audio | 16MB | src/media/constants.ts:2 |
| Video | 16MB | src/media/constants.ts:3 |
| Documents | 100MB | src/media/constants.ts:4 |
| WS frame | 25MB | src/gateway/server-constants.ts:3 |
| WS buffer | 50MB/connection | src/gateway/server-constants.ts:4 |
| Browser screenshot | 5MB | extensions/browser/src/browser/screenshot.ts:9 |
There are zero free-space checks (statvfs / disk usage monitoring) anywhere in the codebase — like filling a storage unit without ever checking how much room is left. OpenClaw will keep writing until the disk is completely full, with no warning.
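Until such a check exists upstream, you can bolt one on yourself. A hedged sketch — the 1 GB threshold and the state path are illustrative choices, not OpenClaw settings:

```shell
#!/bin/sh
# Warn when the filesystem holding OpenClaw state runs low on space.
# MIN_FREE_MB and the state path are illustrative, not OpenClaw config.
STATE_DIR="${1:-$HOME/.openclaw}"
[ -d "$STATE_DIR" ] || STATE_DIR="$HOME"   # fall back if state dir is absent
MIN_FREE_MB=1024

# POSIX df -Pk: column 4 of the data row is available space in 1 KB blocks.
free_kb=$(df -Pk "$STATE_DIR" | awk 'NR==2 {print $4}')
free_mb=$((free_kb / 1024))

if [ "$free_mb" -lt "$MIN_FREE_MB" ]; then
  echo "WARNING: only ${free_mb} MB free under $STATE_DIR"
fi
```

Run it from cron (see Section E for a cron example) so a filling disk produces a log line before it produces an outage.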
You don't need to understand the code to keep your OpenClaw instance running lean. Think of these tips like maintaining a car: you don't need to be a mechanic, but changing the oil and checking tire pressure goes a long way. Here's the equivalent for OpenClaw.
- Disable browser tools if not needed — saves Chromium CPU overhead entirely
- Use API-based embedding providers instead of local GGUF models — offloads inference to the cloud
- Minimize plugins to reduce jiti transpilation at startup
- Avoid frequent PDF processing / large screenshots — each triggers heavy image processing loops
- Use the sqlite-vec extension — avoids the O(n) cosine similarity fallback on every memory query
- Configure `dmHistoryLimit` to cap conversation context per session
- Limit active browser profiles / close unused pages — each Chromium page holds state
- Disable unused channels — each channel adapter holds connection state and caches
- Monitor WhatsApp group count if using auto-reply — group history Maps are unbounded
- Restart the Gateway periodically if it runs for weeks — clears the accumulated `seqByRun` / run-sequence leaks
- Periodically clean transcript files — transcripts are never auto-pruned:

  ```shell
  # List largest transcripts
  du -sh ~/.openclaw/agents/*/sessions/*.jsonl | sort -rh | head -20
  ```

- Periodically clean browser profiles — Chromium profiles grow indefinitely:

  ```shell
  du -sh ~/.openclaw/browser/*/user-data/
  ```

- Run SQLite VACUUM periodically — the database file bloats without periodic compaction:

  ```shell
  sqlite3 ~/.openclaw/memory/*.sqlite VACUUM
  ```

- Monitor `commands.log` size — it grows unbounded:

  ```shell
  ls -lh ~/.openclaw/logs/commands.log
  ```

- Clean orphaned temp directories:

  ```shell
  ls -lh /tmp/tts-* /tmp/openclaw-* 2>/dev/null
  ```

- Set `logging.level` to `warn` in production to reduce log volume

- Use session maintenance config:
  - `sessions.pruneAfter` — auto-prune old sessions
  - `sessions.maxEntries` — cap total session count
  - `sessions.rotateBytes` — rotate the session store file at a size threshold
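The manual checks above can be rolled into one periodic housekeeping pass. A sketch under assumptions — the default `~/.openclaw` layout, a 90-day transcript retention, and a 50 MB `commands.log` cap are all illustrative choices, not OpenClaw settings:

```shell
#!/bin/sh
# Illustrative housekeeping pass — paths assume the default ~/.openclaw layout.
STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"

# 1. Compress transcripts untouched for 90+ days (they are never auto-pruned).
find "$STATE_DIR"/agents/*/sessions -name '*.jsonl' -mtime +90 -print 2>/dev/null |
while IFS= read -r f; do
  gzip "$f"    # keeps the content, reclaims most of the space
done

# 2. Compact memory databases (OpenClaw never VACUUMs them itself).
for db in "$STATE_DIR"/memory/*.sqlite; do
  [ -f "$db" ] && command -v sqlite3 >/dev/null 2>&1 && sqlite3 "$db" 'VACUUM;'
done

# 3. Truncate commands.log once it passes ~50 MB (it grows unbounded).
log="$STATE_DIR/logs/commands.log"
if [ -f "$log" ] && [ "$(wc -c < "$log")" -gt 52428800 ]; then
  : > "$log"
fi
```

Truncation loses old command history, so swap step 3 for a `mv`-and-gzip rotation if you want an audit trail.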
Sections A-C told you what consumes CPU, memory, and disk. This section tells you how to see it happening on your actual machine. Think of it like the difference between a thermometer (one quick reading) and a thermograph (a chart that records temperature all day). You need both: quick checks to see what's happening right now, and historical tools to spot trends before they become problems.
Before monitoring anything, find the OpenClaw Gateway process:
```shell
# Find the Gateway PID (both platforms)
pgrep -f "openclaw"

# See full process details (both platforms)
ps aux | grep openclaw

# macOS — also check Activity Monitor (search "openclaw" or "node")
# The Gateway runs as a Node.js process, so look for "node" if "openclaw" doesn't appear
```

Tip: Save the PID in a variable for repeated use:

```shell
OC_PID=$(pgrep -f "openclaw")
```
These tools are like glancing at your car's dashboard — they show you what's happening right now, but don't record history.
top / htop (both platforms)

```shell
# Filter top for the OpenClaw process
# Linux:
top -p $(pgrep -f "openclaw")
# macOS (different flag):
top -pid $(pgrep -f "openclaw")

# htop — a friendlier version with color-coded bars (both platforms, same flag)
# Install: brew install htop (macOS) / apt install htop (Linux)
htop -p $(pgrep -f "openclaw")
```

Per-process one-liner (both platforms)

```shell
# Snapshot of CPU% and memory for the Gateway
ps -p $(pgrep -f "openclaw") -o pid,%cpu,%mem,rss,vsz,command
```

Column reference: `%cpu` = CPU percentage, `%mem` = RAM percentage, `rss` = resident set size (actual RAM in KB), `vsz` = virtual size (allocated, not all physical).

macOS Activity Monitor:
Open Activity Monitor from Spotlight (Cmd+Space → "Activity Monitor"), then search for openclaw or node. The Memory tab shows Real Memory (equivalent to RSS) and the CPU tab shows per-process usage.
top and htop show you what's happening right now — like glancing at a speedometer. But what if you want to know how fast you were going at 3 AM when nobody was watching? That's what sysstat does. It's a suite of tools that automatically records system performance every 10 minutes and lets you replay the data later. Think of it as a dashcam for your server.
| Tool | Purpose |
|---|---|
| `sar` | System Activity Reporter — replays historical data (CPU, memory, disk, network) |
| `pidstat` | Per-process stats (like top, but with timestamps and logging) |
| `iostat` | Disk I/O throughput and latency |
| `mpstat` | Per-CPU core breakdown |

```shell
# Debian/Ubuntu
sudo apt install sysstat

# RHEL/CentOS/Fedora
sudo dnf install sysstat   # or: sudo yum install sysstat

# Enable automatic data collection (records every 10 min by default)
sudo systemctl enable --now sysstat
```

After enabling, sysstat stores daily data files in /var/log/sa/ (or /var/log/sysstat/). You can replay any day's data — even from last week.
```shell
# CPU usage over time (today) — shows %user, %system, %idle
sar -u

# CPU usage for a specific date (e.g., February 10)
sar -u -f /var/log/sa/sa10

# CPU usage sampled every 5 seconds, 12 times (1 minute of live data)
sar -u 5 12

# Memory usage over time — shows kbmemfree, kbmemused, %memused
sar -r

# Disk I/O — shows tps (transfers per second), read/write throughput
sar -b   # block device summary
sar -d   # per-device breakdown (useful to identify which disk)

# Network interface stats — shows rxkB/s, txkB/s per interface
sar -n DEV
```

Reading sar output: Each row is a timestamp. Look for `%idle` dropping below 20 (CPU saturated), `%memused` above 90 (memory pressure), or sudden spikes in `tps` (disk I/O storms). If your Gateway was sluggish at 3 AM, run `sar -u -f /var/log/sa/sa$(date -d yesterday +%d)` to see yesterday's CPU.
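If you want to scan a day of sar data for trouble automatically, a small awk filter works. A sketch — it assumes the `sar -u` column layout where `%idle` is the last field, which can vary slightly between sysstat versions:

```shell
# Flag sar -u samples where the CPU was more than 80% busy (%idle < 20).
# Assumes %idle is the last column ($NF), as in typical sar -u output.
flag_busy() {
  awk '$NF ~ /^[0-9.]+$/ && $NF + 0 < 20 { print $1, "idle:", $NF "%" }'
}

# Replay today's data if sysstat is installed; otherwise do nothing.
if command -v sar >/dev/null 2>&1; then
  sar -u | flag_busy
fi
```

The numeric guard on `$NF` skips header and banner lines, so only real sample rows are reported.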
pidstat is like top, but it prints timestamped lines you can log to a file. Perfect for watching OpenClaw specifically without the noise of other processes.
```shell
# Per-process CPU usage for the Gateway
pidstat -p $(pgrep -f "openclaw") 1
# Prints a line every 1 second showing %usr, %system, %CPU

# Per-process memory (RSS, VSZ, %MEM)
pidstat -r -p $(pgrep -f "openclaw") 5
# Prints memory stats every 5 seconds

# Per-process disk I/O (kB read/written per second)
pidstat -d -p $(pgrep -f "openclaw") 5

# Combined: CPU + memory + disk, every 5 seconds for 1 minute (12 samples)
pidstat -urd -p $(pgrep -f "openclaw") 5 12

# Monitor ALL Node.js processes (useful if OpenClaw spawns children)
pidstat -urd -C "node" 5
```

Pro tip: Pipe pidstat output to a file for later analysis:

```shell
pidstat -urd -p $(pgrep -f "openclaw") 60 1440 > ~/openclaw-24h-stats.log &
```

This logs CPU/memory/disk every 60 seconds for 24 hours (1440 samples).
sysstat is not available on macOS. Use these alternatives instead:
```shell
# Memory pressure (macOS equivalent of sar -r)
vm_stat 5
# Columns: free, active, inactive, wired — multiply by page size (16384 on Apple Silicon)

# Disk I/O (macOS equivalent of sar -b / iostat)
iostat -w 5
# Shows KB/t, tps, MB/s per disk

# Per-process stats — use ps in a loop (macOS equivalent of pidstat)
while true; do
  date; ps -p $(pgrep -f "openclaw") -o %cpu,%mem,rss 2>/dev/null
  sleep 5
done
```

If your OpenClaw instance is slow but CPU and memory look fine, the bottleneck might be disk. These tools tell you who's reading/writing and how fast.
```shell
# iostat — disk throughput snapshot
# Linux (extended stats):
iostat -x 5   # every 5 seconds; key columns: r/s, w/s, %util
# macOS (different flags — no -x support):
iostat -w 5   # every 5 seconds; shows KB/t, tps, MB/s per disk

# iotop — who's doing the I/O (Linux, requires root)
sudo iotop -o   # only show processes with active I/O
# Look for "node" or "openclaw" — heavy writes may be transcript/log accumulation

# macOS — use fs_usage to trace file system calls
sudo fs_usage -f filesys node   # trace all filesystem calls by Node.js processes
```

When you've confirmed that the Gateway is using too much CPU or memory, these tools let you look inside the Node.js process to find exactly which function is responsible.
```shell
# Start the Gateway with the inspector enabled
openclaw --inspect=0.0.0.0:9229

# Or attach to an already-running Gateway (send SIGUSR1)
kill -USR1 $(pgrep -f "openclaw")
```

Then open chrome://inspect in Chrome/Chromium, click the OpenClaw target, and:

- CPU Profile tab → Record → trigger the slow operation → Stop → see which functions took the most time
- Memory tab → Take Heap Snapshot → see which objects are consuming memory (look for the unbounded Maps from Section B)
```shell
# Install
npm install -g clinic

# Detect CPU bottlenecks (generates an HTML flamegraph)
clinic doctor -- node /path/to/openclaw/gateway.js

# Generate a flamegraph (visual call-stack breakdown)
clinic flame -- node /path/to/openclaw/gateway.js

# Detect async bottlenecks (event loop delays)
clinic bubbleprof -- node /path/to/openclaw/gateway.js
```

```shell
# Install
npm install -g 0x

# Profile the Gateway (press Ctrl+C to stop and generate a flamegraph)
0x -- node /path/to/openclaw/gateway.js

# Auto-open the flamegraph in a browser when done
0x -o -- node /path/to/openclaw/gateway.js

# Wider bars in the flamegraph = more CPU time spent in that function
```

OpenClaw communicates via WebSocket (Gateway port 18789), HTTP APIs, and outbound connections to AI providers. These tools help you see what's connected and how much traffic is flowing.
```shell
# Who's connected to the Gateway WebSocket port? (both platforms)
lsof -i :18789
# Shows every client connected — useful to check if the browser extension / web UI is attached

# List all connections by the Gateway process (Linux)
ss -tnp | grep openclaw
# Shows established TCP connections, remote IPs, and ports

# macOS equivalent
lsof -i -P -n | grep openclaw

# Count active WebSocket connections
lsof -i :18789 | grep -c ESTABLISHED

# Watch connection count over time
watch -n 5 'lsof -i :18789 | grep -c ESTABLISHED'
```

These commands give you a quick health check on the biggest disk consumers documented in Section C.
```shell
# Check total state directory size
du -sh ~/.openclaw/

# Check transcript sizes (largest first)
du -sh ~/.openclaw/agents/*/sessions/*.jsonl | sort -rh | head -20

# Check browser profile sizes
du -sh ~/.openclaw/browser/*/user-data/

# Check SQLite sizes
ls -lh ~/.openclaw/memory/*.sqlite

# Check log sizes
ls -lh /tmp/openclaw/

# Check temp file accumulation
ls -lh /tmp/tts-* /tmp/openclaw-* 2>/dev/null

# Check commands.log
ls -lh ~/.openclaw/logs/commands.log
```
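To turn those one-off checks into a single ranked summary, a short loop over the documented default directories works (the subdirectory names are the defaults from Section C — adjust if you relocated state):

```shell
#!/bin/sh
# Rank the documented OpenClaw state directories by size, biggest first.
# Subdirectory names are the defaults described in Section C.
for d in agents browser memory logs; do
  [ -d "$HOME/.openclaw/$d" ] && du -sm "$HOME/.openclaw/$d"
done | sort -rn | head -5
```

If `browser` or `agents` tops the list and keeps climbing week over week, Section C tells you which files inside it are the unbounded ones.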
```shell
#!/bin/bash
# Save this as ~/openclaw-monitor.sh
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
PID=$(pgrep -f "openclaw")
if [ -n "$PID" ]; then
  RSS=$(ps -p "$PID" -o rss= 2>/dev/null | tr -d ' ')
  CPU=$(ps -p "$PID" -o %cpu= 2>/dev/null | tr -d ' ')
  DISK=$(du -sm ~/.openclaw/ 2>/dev/null | cut -f1)
  echo "$TIMESTAMP PID=$PID RSS_KB=$RSS CPU=$CPU% DISK_MB=$DISK" >> ~/openclaw-resource.log
fi
```

Make it executable, then add it to crontab (`crontab -e`) to log every 5 minutes:

```shell
chmod +x ~/openclaw-monitor.sh
```

```shell
*/5 * * * * ~/openclaw-monitor.sh
```

After a few days, review trends:
```shell
# See the memory trend (RSS column)
awk '{print $1, $2, $4}' ~/openclaw-resource.log | tail -50

# Check whether disk usage is climbing
awk '{print $1, $2, $6}' ~/openclaw-resource.log | tail -50
```

If you enabled sysstat (see above), it already collects system-wide stats every 10 minutes. No extra cron needed — just query with sar whenever you want historical data.
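One more trick for the resource log: compare the first and last RSS samples to quantify drift. A sketch that parses the `RSS_KB=` field written by the monitor script above:

```shell
# Report RSS drift between the first and last samples in the resource log.
# Parses the RSS_KB=<n> field emitted by the monitor script.
log="$HOME/openclaw-resource.log"
if [ -f "$log" ]; then
  awk 'match($0, /RSS_KB=[0-9]+/) {
         v = substr($0, RSTART + 7, RLENGTH - 7)   # the digits after "RSS_KB="
         if (first == "") first = v
         last = v
       }
       END {
         if (first != "")
           printf "RSS first=%s KB, last=%s KB, delta=%d KB\n", first, last, last - first
       }' "$log"
fi
```

A delta that grows steadily across restarts-free weeks is the signature of the unbounded Maps described in Section B.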
For most single-instance deployments (Mac mini, small VPS), the tools above are sufficient. Consider Prometheus + Grafana only if:
- You run multiple OpenClaw instances and need centralized dashboards
- You want alerting (e.g., "notify me if RSS exceeds 2GB")
- You're already running a monitoring stack for other services
Setting up Prometheus/Grafana is beyond the scope of this guide — see the Prometheus docs for getting started.
Sections A–E covered CPU, RAM, disk, optimizations, and monitoring. This section zooms in on OpenClaw's memory system — the persistent knowledge layer that lets the AI remember things across conversations. Understanding how it works helps you tune resource usage (embedding inference, SQLite storage, sync frequency) and explains why certain CPU/disk costs from earlier sections exist.
Plain English: Memory reaches the AI through two separate paths — like having both a briefcase of essential notes you always carry AND a searchable filing cabinet you can query on demand.
Path 1 — Bootstrap injection (the briefcase)
At the start of every session, OpenClaw loads MEMORY.md (or memory.md) from the workspace directory and injects its contents into the AI's first message as a context file. This happens unconditionally for all primary sessions — the only filtering is for subagent sessions, which receive only AGENTS.md and TOOLS.md.
- Resolution: `src/agents/workspace.ts:467-484` — scans for MEMORY.md and memory.md, deduplicates
- Loading: `src/agents/workspace.ts:487-547` — reads file contents into `WorkspaceBootstrapFile[]`
- Filtering: `src/agents/workspace.ts:557-565` — `filterBootstrapFilesForSession()` only filters subagent sessions via an allowlist; all other sessions (including group chats) receive the full set
- Context building: `src/agents/pi-embedded-helpers/bootstrap.ts:187-239` — trims to `bootstrapMaxChars` (default 20,000 chars) using a head/tail strategy, with a `totalMaxChars` cap (default 24,000)
- Orchestration: `src/agents/bootstrap-files.ts:64-96` — wires resolution → filtering → context building
Correction vs. third-party articles: Some sources claim MEMORY.md is "never injected in group chats, for privacy." This is inaccurate. Bootstrap injection happens for all non-subagent sessions regardless of chat type. What is suppressed in groups is memory search citations (see F5).
Path 2 — On-demand search tools (the filing cabinet)
The AI can actively query its memory index using two tools:
| Tool | Purpose | Source |
|---|---|---|
| `memory_search` | Semantic hybrid search across all indexed memory files; returns top snippets with path + line numbers | `src/agents/tools/memory-tool.ts:106-161` |
| `memory_get` | Read a specific file or line range from MEMORY.md or memory/*.md; use after `memory_search` to pull exact content | `src/agents/tools/memory-tool.ts:163-214` |
The memory_search tool description instructs the AI to use it as a "mandatory recall step" before answering questions about prior work, decisions, preferences, or dates.
| Resource | Default path | Source |
|---|---|---|
| Workspace directory | `~/.openclaw/workspace/` | `src/agents/workspace.ts:12-21` |
| Primary memory file | `~/.openclaw/workspace/MEMORY.md` (or `memory.md`) | `src/agents/workspace.ts:32-33` |
| Memory subdirectory | `~/.openclaw/workspace/memory/*.md` (recursive) | `extensions/memory-core/src/memory/internal.ts:80-146` |
| SQLite index database | `~/.openclaw/memory/{agentId}.sqlite` | `src/agents/memory-search.ts:132-140` |
| Additional paths | Configured via `memorySearch.extraPaths[]` | `extensions/memory-core/src/memory/internal.ts:35-46` |
The listMemoryFiles() function (internal.ts:79-146) scans these locations, skips symlinks, filters for .md extensions only, and deduplicates by resolved path.
Memory sources (configurable via `memorySearch.sources`):

- `"memory"` (default) — files from MEMORY.md + the memory/ directory + extraPaths
- `"sessions"` — session transcript JSONL files, indexed as searchable content (requires `experimental.sessionMemory: true`)
Plain English: Like cutting a book into overlapping pages so you can search for any phrase, even one that falls on a page boundary. Each "page" shares a few lines with its neighbors to avoid losing context at the edges.
The chunkMarkdown() function (extensions/memory-core/src/memory/internal.ts:167-260) splits memory file content into searchable chunks:
| Parameter | Default | Effect |
|---|---|---|
| `chunking.tokens` | 400 | Max chunk size in tokens (≈ 1,600 chars, since maxChars = tokens × 4) |
| `chunking.overlap` | 80 | Overlap between adjacent chunks in tokens (≈ 320 chars) |
How it works:
1. Split the content into lines
2. Accumulate lines until `currentChars + lineSize > maxChars`
3. Flush the current chunk (recording `startLine`, `endLine`, and a SHA-256 `hash`)
4. Walk-backward overlap (`internal.ts:201-222`): carry the last N chars' worth of lines from the flushed chunk into the next chunk's starting state
5. Split long lines (> `maxChars`) into segments, each preserving the original line number
6. Repeat until all lines are processed
Source references: defaults at src/agents/memory-search.ts:99-100
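The accumulate-and-flush loop is easy to mimic in shell. A toy sketch — it illustrates only the accumulate/flush part, with no overlap carry-back and no long-line splitting:

```shell
# Toy version of the accumulate-and-flush chunking loop: gather lines
# until adding the next one would exceed maxChars, then emit the chunk
# with its line range. (The real chunker also does overlap + line splits.)
chunk() {
  awk -v max="$1" '
    {
      size = length($0) + 1                    # +1 for the newline
      if (buf_chars + size > max && buf_chars > 0) {
        printf "chunk: lines %d-%d (%d chars)\n", start, NR - 1, buf_chars
        buf_chars = 0
      }
      if (buf_chars == 0) start = NR
      buf_chars += size
    }
    END {
      if (buf_chars > 0)
        printf "chunk: lines %d-%d (%d chars)\n", start, NR, buf_chars
    }'
}

# Example: ~1,600-char chunks (the default 400 tokens × 4 chars/token):
#   chunk 1600 < ~/.openclaw/workspace/MEMORY.md
```

Running it over your own MEMORY.md gives a feel for how many chunks (and therefore embeddings) a given file produces.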
Plain English: To search by meaning (not just keywords), OpenClaw converts text into numerical "fingerprints" called embeddings. It can do this locally on your machine or send text to a cloud API.
| Provider | Model | Max tokens | Dimensions | Runs where |
|---|---|---|---|---|
| OpenAI | `text-embedding-3-small` | 8,192 | 1,536 | Cloud API |
| Gemini | `gemini-embedding-001` | 2,048 | 768 | Cloud API |
| Voyage | `voyage-4-large` | 32,000 | 1,024 | Cloud API |
| Local | `embeddinggemma-300M` (GGUF) | varies | ~300 | On-device via node-llama-cpp |
Source: model defaults at src/agents/memory-search.ts:99-100, local model at extensions/memory-core/src/memory/embeddings.ts:73-74
Auto-selection order (extensions/memory-core/src/memory/embeddings.ts:166-286):
- Local — but only if the model file already exists on disk (won't auto-download)
- OpenAI — if API key is available
- Gemini — if API key is available
- Voyage — if API key is available
- If none work, throws an error
A configurable fallback provider (memorySearch.fallback) is tried if the primary provider fails (embeddings.ts:190-213).
Cross-reference: Local embedding inference is CPU item #3 in Section A. Use `provider: "openai"` to offload this cost to the cloud.
Plain English: Like searching a library with both a card catalog (meaning-based) and a word index (exact term matches), then combining the results. The meaning search finds conceptually related content; the keyword search catches exact phrases the meaning search might miss.
How a search query is processed:
- Embed the query — convert to a vector using the configured embedding provider
- Vector search — find similar chunks via
vec_distance_cosine()in sqlite-vec, or fall back to O(n) cosine similarity scan if sqlite-vec is unavailable (extensions/memory-core/src/memory/manager-search.ts:22-148) - Keyword search — FTS5 BM25 ranking via the
chunks_ftsvirtual table (extensions/memory-core/src/memory/manager-search.ts:190-263) - Merge results — combined score:
vectorWeight × vectorScore + textWeight × textScore(extensions/memory-core/src/memory/hybrid.ts:128) - Filter and cap — discard results below
minScore, return topmaxResults
Default parameters:
| Parameter | Default | Source |
|---|---|---|
| `vectorWeight` | 0.7 | `src/agents/memory-search.ts:102` |
| `textWeight` | 0.3 | `src/agents/memory-search.ts:103` |
| `candidateMultiplier` | 4 (fetch 4× candidates, then trim) | `src/agents/memory-search.ts:104` |
| `maxResults` | 6 | `src/agents/memory-search.ts:99` |
| `minScore` | 0.35 | `src/agents/memory-search.ts:100` |
| Snippet cap | 700 chars | `extensions/memory-core/src/memory/manager.ts:33` |
With defaults: 24 candidates are fetched (6 × 4), merged and scored, then the top 6 with score ≥ 0.35 are returned, each snippet capped at 700 characters.
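The merge arithmetic can be reproduced with a one-liner. A sketch with made-up candidate scores (`noteA` etc. are hypothetical), using the default 0.7/0.3 weights and the 0.35 `minScore`:

```shell
# Score three hypothetical candidates: "<id> <vectorScore> <textScore>".
printf 'noteA 0.90 0.10\nnoteB 0.40 0.95\nnoteC 0.20 0.10\n' |
  awk '{ print $1, 0.7 * $2 + 0.3 * $3 }' |   # combined = 0.7·vec + 0.3·text
  sort -k2 -rn |                              # rank by combined score
  awk '$2 >= 0.35'                            # drop anything below minScore
# noteC's combined score (0.17) falls under minScore and is filtered out.
```

Note how noteB survives mostly on its keyword score — exactly the case hybrid search exists for: an exact-phrase match the vector side undervalues.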
Citation suppression in groups: The memory_search tool suppresses source citations in group/channel chats by default (memory-tool.ts:190-218). In auto mode, citations appear only in direct chats. This is configurable via memory.citations ("on", "off", or "auto").
Cross-reference: The O(n) cosine fallback (when sqlite-vec is unavailable) is CPU item #5 in Section A. Install sqlite-vec to avoid it.
The memory index is stored in a SQLite database with six tables:

| Table | Type | Purpose |
|---|---|---|
| `meta` | Regular | Stores index metadata (model, provider, chunk settings, vector dims) |
| `files` | Regular | Tracks indexed files with path, hash, mtime, size, source |
| `chunks` | Regular | Stores chunk text, embedding vectors (as JSON), line ranges, hashes |
| `chunks_vec` | Virtual (`vec0`) | sqlite-vec index for fast cosine distance queries |
| `chunks_fts` | Virtual (`fts5`) | FTS5 full-text index for BM25 keyword search |
| `embedding_cache` | Regular | Caches embeddings to avoid re-processing unchanged content |
Source: extensions/memory-core/src/memory/memory-schema.ts:9-82
Embedding cache prevents re-embedding unchanged chunks. The composite primary key is (provider, model, provider_key, hash) (memory-schema.ts:47), so switching providers or models naturally invalidates the cache. A configurable cache.maxEntries limit can be set to cap cache growth.
Cross-reference: SQLite databases have no automatic `VACUUM` — WAL files can bloat over time. See Section C and the `VACUUM` tip in Section D.
Plain English: Like a librarian who notices when you edit a book and automatically re-catalogs it — but waits a moment in case you're still editing before starting the work.
File watcher (chokidar): watches `MEMORY.md`, `memory.md`, `memory/`, and any `extraPaths` for changes.

| Setting | Default | Effect |
|---|---|---|
| `sync.watch` | `true` | Enable file watching |
| `sync.watchDebounceMs` | 1,500 ms | Wait this long after the last file change before syncing |
| `sync.onSessionStart` | `true` | Sync when a new session starts |
| `sync.onSearch` | `true` | Sync before search if the dirty flag is set |
| `sync.intervalMinutes` | 0 (disabled) | Periodic sync timer |
Source: extensions/memory-core/src/memory/manager-sync-ops.ts:378-432 (watcher setup), src/agents/memory-search.ts:96 (debounce default)
Session delta tracking (for session memory source):
| Setting | Default | Effect |
|---|---|---|
| `sync.sessions.deltaBytes` | 100,000 (100 KB) | Re-index session after this many new bytes |
| `sync.sessions.deltaMessages` | 50 | Re-index session after this many new messages |
Source: src/agents/memory-search.ts:97-98, extensions/memory-core/src/memory/manager-sync-ops.ts:464-500
Sync triggers in order of priority:

- Session start — if `sync.onSessionStart` is true (manager.ts:299-315)
- Before search — if the dirty flag is set and `sync.onSearch` is true (manager.ts:315-333)
- File watch — after the debounce period (manager-sync-ops.ts:657-668)
- Session delta — when the byte/message threshold is exceeded (manager-sync-ops.ts:439-500)
- Interval timer — if `intervalMinutes > 0` (manager-sync-ops.ts:646-655)
During sync, unchanged files are skipped (hash comparison against the files table), and stale files are removed from the index.
Plain English: Before the AI's conversation notebook gets summarized (compacted) to free up space, it gets one special turn to save important notes to disk — like a student quickly writing down key formulas before an exam booklet is collected.
When the conversation approaches the context window limit, OpenClaw inserts a silent "memory flush" turn before running compaction:
Trigger condition (src/auto-reply/reply/memory-flush.ts:173-215):

`totalTokens >= contextWindow - reserveTokens - softThreshold`

| Parameter | Default | Source |
|---|---|---|
| `softThresholdTokens` | 4,000 | memory-flush.ts:10 |
| Flush prompt | "Store durable memories now (use memory/YYYY-MM-DD.md)" | memory-flush.ts:13-19 |
Double-flush prevention: The system tracks memoryFlushCompactionCount per session. If it already matches the current compactionCount, the flush is skipped (memory-flush.ts:102-106). This prevents the same flush from running twice for the same compaction cycle.
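Putting the trigger condition and the double-flush guard together, a hedged sketch (names are illustrative, not the memory-flush.ts internals):

```typescript
// Hypothetical session state: compactionCount increments per compaction;
// memoryFlushCompactionCount records which cycle last flushed.
interface SessionState {
  compactionCount: number;
  memoryFlushCompactionCount?: number;
}

function shouldFlush(
  totalTokens: number,
  contextWindow: number,
  reserveTokens: number,
  session: SessionState,
  softThresholdTokens = 4000, // default from memory-flush.ts:10
): boolean {
  // Trigger: conversation is within softThreshold of the usable window.
  const nearLimit =
    totalTokens >= contextWindow - reserveTokens - softThresholdTokens;
  // Guard: skip if this compaction cycle already flushed.
  const alreadyFlushed =
    session.memoryFlushCompactionCount === session.compactionCount;
  return nearLimit && !alreadyFlushed;
}
```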
Important clarification: Some third-party articles describe "daily log files" (`memory/YYYY-MM-DD.md`) as a built-in automatic system. This is inaccurate. The flush prompt instructs the AI agent to write to that filename pattern — the agent decides what (if anything) to save. The system creates the turn; the agent creates the file. If there's nothing worth saving, the agent replies with a silent token and no file is created.
Plain English: When you type /new to start a fresh conversation, the old conversation gets saved as a dated memory file — like tearing out your notepad page and filing it before starting a blank one.
The session memory hook (src/hooks/bundled/session-memory/handler.ts:53-225) triggers on the /new command:
- Reads the last N messages from the current session's JSONL transcript file (default: 15 messages, handler.ts:132)
- Generates a descriptive slug via LLM (e.g., `"debugging-auth-flow"`) or falls back to an HHMM timestamp (handler.ts:146-150)
- Creates `memory/YYYY-MM-DD-{slug}.md` with session metadata and conversation content (handler.ts:153-185)
Configurable settings (via hook config session-memory):
| Setting | Default | Effect |
|---|---|---|
messages |
15 | Number of recent messages to include |
llmSlug |
true (unless test env) |
Whether to use LLM to generate a descriptive filename slug |
This hook is not automatic — it only runs when the user explicitly types /new. It does not run on session timeout or process restart.
OpenClaw includes an experimental alternative memory backend using the external qmd CLI tool (extensions/memory-core/src/memory/qmd-manager.ts). Instead of the built-in SQLite + sqlite-vec approach, QMD provides:
- BM25 keyword search + vector similarity + reranking in a single external process
- Managed as a sidecar process spawned by OpenClaw
- Configured via `agents.defaults.memoryBackend: "qmd"` with its own config block
This is experimental and not the default backend. The built-in SQLite backend (Sections F3–F8) is the standard path.
All settings live under agents.defaults.memorySearch in the OpenClaw config. Per-agent overrides are supported via agents.{agentId}.memorySearch.
| Key | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `true` | Enable/disable memory search entirely |
| `provider` | string | `"auto"` | Embedding provider: `"openai"`, `"gemini"`, `"voyage"`, `"local"`, or `"auto"` |
| `fallback` | string | `"none"` | Fallback provider if primary fails |
| `model` | string | (per provider) | Embedding model name (see F4 table) |
| `sources` | string[] | `["memory"]` | Data sources: `"memory"`, `"sessions"` |
| `extraPaths` | string[] | `[]` | Additional directories/files to index |
| `local.modelPath` | string | `"hf:ggml-org/..."` | Path or HuggingFace URI for local GGUF model |
| `local.modelCacheDir` | string | — | Directory for cached model files |
| `store.driver` | string | `"sqlite"` | Storage driver (only `"sqlite"` currently) |
| `store.path` | string | `~/.openclaw/memory/{agentId}.sqlite` | SQLite database path |
| `store.vector.enabled` | boolean | `true` | Enable sqlite-vec for fast vector search |
| `store.vector.extensionPath` | string | — | Custom path to sqlite-vec shared library |
| `chunking.tokens` | number | `400` | Max chunk size in tokens |
| `chunking.overlap` | number | `80` | Overlap between chunks in tokens |
| `sync.onSessionStart` | boolean | `true` | Sync index when session starts |
| `sync.onSearch` | boolean | `true` | Sync before search if files changed |
| `sync.watch` | boolean | `true` | Watch memory files for changes |
| `sync.watchDebounceMs` | number | `1500` | Debounce delay for file watcher |
| `sync.intervalMinutes` | number | `0` | Periodic sync interval (0 = disabled) |
| `sync.sessions.deltaBytes` | number | `100000` | Re-index session after this many new bytes |
| `sync.sessions.deltaMessages` | number | `50` | Re-index session after this many new messages |
| `query.maxResults` | number | `6` | Max results returned per search |
| `query.minScore` | number | `0.35` | Minimum score threshold |
| `query.hybrid.enabled` | boolean | `true` | Enable hybrid vector + keyword search |
| `query.hybrid.vectorWeight` | number | `0.7` | Weight for vector similarity score |
| `query.hybrid.textWeight` | number | `0.3` | Weight for BM25 keyword score |
| `query.hybrid.candidateMultiplier` | number | `4` | Fetch this many × maxResults candidates |
| `cache.enabled` | boolean | `true` | Enable embedding cache |
| `cache.maxEntries` | number | — | Max cached embeddings (unlimited if unset) |
| `experimental.sessionMemory` | boolean | `false` | Enable session transcript indexing |
Source: src/agents/memory-search.ts:8-398
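A config fragment combining several of these keys may look like the sketch below. The nesting is inferred from the key paths; the surrounding file format and location vary by install. `provider: "openai"` and the `cache.maxEntries` value are illustrative choices (offloading embedding CPU and capping cache growth), not defaults:

```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": true,
        "provider": "openai",
        "fallback": "none",
        "sources": ["memory"],
        "chunking": { "tokens": 400, "overlap": 80 },
        "sync": {
          "onSessionStart": true,
          "onSearch": true,
          "watch": true,
          "watchDebounceMs": 1500,
          "intervalMinutes": 0
        },
        "query": {
          "maxResults": 6,
          "minScore": 0.35,
          "hybrid": {
            "enabled": true,
            "vectorWeight": 0.7,
            "textWeight": 0.3,
            "candidateMultiplier": 4
          }
        },
        "cache": { "enabled": true, "maxEntries": 50000 }
      }
    }
  }
}
```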
This section cross-references memory system costs back to Sections A–C:
| Resource | Memory system cost | Cross-reference |
|---|---|---|
| CPU | Local embedding inference (GGUF model on device) | A #3 |
| CPU | O(n) cosine similarity fallback without sqlite-vec | A #5 |
| CPU | Markdown chunking + SHA-256 hashing during sync | A #11 |
| Disk | SQLite database per agent (chunks, vectors, cache) | C — unbounded growth |
| Disk | WAL files can bloat without periodic VACUUM | C — unbounded growth |
| Disk | `memory/*.md` files accumulate from flush + session hook | C — unbounded growth |
| RAM | Embedding vectors held in SQLite during sync/search | B — in-memory caches |
| RAM | Index manager singleton cached per agent+workspace | B — in-memory caches |
| Network | API calls to OpenAI/Gemini/Voyage for cloud embeddings | N/A (external) |
Key optimization levers:
- Set `provider: "openai"` (or another cloud provider) to eliminate local CPU cost
- Install sqlite-vec to avoid the O(n) cosine fallback
- Set `sync.intervalMinutes: 0` and `sync.watch: false` if memory files rarely change
- Set `cache.maxEntries` to limit embedding cache disk growth
- Periodically run `VACUUM` on each memory database, e.g. `sqlite3 ~/.openclaw/memory/{agentId}.sqlite "VACUUM;"` — one invocation per database, since the SQL argument applies to a single file (see Section D)
| Resource | Biggest risks | Key mitigation |
|---|---|---|
| CPU | Screenshot normalization (42 ops), image optimization (20 ops), local embeddings | Disable browser tools, use API embeddings, install sqlite-vec |
| Memory | Chromium (200 MB-2 GB), unbounded Maps (`seqByRun`, group histories), session store cloning | Periodic Gateway restart, limit browser usage, cap history |
| Disk | Transcript JSONL (never pruned), browser profiles (GBs), `commands.log` (unbounded) | Manual cleanup, SQLite VACUUM, session maintenance config |