Unbounded duplicated thread metadata can crash Desktop startup reconciliation via better-sqlite3 .all() #29007

@kangminlee-maker

Summary

Codex Desktop can crash during startup, before the window is ready, when the local profile contains a large ~/.codex/sqlite/state_5.sqlite whose threads table holds oversized display metadata.

The immediate crash trigger is in the packaged Electron main-process startup reconciliation path that copies/merges temporary app-server state from:

  • source: ~/.codex/sqlite/state_5.sqlite
  • target: ~/.codex/state_5.sqlite
  • marker: ~/.codex/.app-server-state-reconciled-v1

The broader structural issue is that bounded UI/list metadata and unbounded conversation content are mixed together. Fields such as title, preview, and first_user_message are used by startup/list/navigation hot paths, but can contain full prompt-sized content and can be duplicated across all three fields.

In the current packaged app, the startup reconciliation code still eagerly materializes all thread rows:

for (let row of sourceDb.prepare(`SELECT ${columns} FROM threads`).all()) {
  ...
}

If columns includes very large title, preview, and first_user_message values, this loads the full thread metadata table into V8 memory and can terminate the Electron main process with a V8 OOM. macOS records this as EXC_BREAKPOINT / SIGTRAP.
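A minimal sketch of the streaming alternative, assuming `sourceDb` is a better-sqlite3 `Database` and using its `Statement#iterate()` method. The names `copyThreadsStreaming` and `handleRow` are hypothetical and are not taken from the app bundle:

```javascript
// Sketch only: `copyThreadsStreaming` and `handleRow` are hypothetical names.
// `sourceDb` is assumed to expose better-sqlite3's prepare().iterate() shape.
function copyThreadsStreaming(sourceDb, columns, handleRow) {
  const stmt = sourceDb.prepare(`SELECT ${columns} FROM threads`);
  let count = 0;
  // iterate() yields one row at a time, so only the current row's strings
  // have to be live in the V8 heap, instead of the whole table as with all().
  for (const row of stmt.iterate()) {
    handleRow(row);
    count += 1;
  }
  return count;
}
```

This bounds peak memory to roughly one row at a time, but each oversized value is still converted into a V8 string, so it addresses the OOM rather than the size of the stored metadata.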

Environment

Original crash:

  • Codex Desktop: 26.611.62324 (4028)
  • Codex Framework / Chromium: 149.0.7827.115
  • macOS: 26.4.1 (25E253)
  • Architecture: Apple Silicon / arm64
  • Process: /Applications/Codex.app/Contents/MacOS/Codex

After reinstall/update, I rechecked the current app:

  • Codex Desktop: 26.616.31447 (4133)
  • app-server / CLI: 0.142.0-alpha.1
  • better-sqlite3: 12.9.0
  • Codex Framework / Chromium: 149.0.7827.115 (unchanged)

The app currently launches after manually repairing/trimming the affected DB state, but the same vulnerable startup reconciliation code path is still present in the updated app bundle.

Crash signature

Relevant macOS crash stack excerpt:

Exception Type: EXC_BREAKPOINT (SIGTRAP)
Termination Reason: SIGNAL, Trace/BPT trap: 5
Triggered by Thread: 0 CrBrowserMain

v8::String::NewFromUtf8(...)
better_sqlite3.node Data::GetValueJS(v8::Isolate*, sqlite3_stmt*, int, bool)
better_sqlite3.node Statement::JS_all(...)
...
ChromeMain
main

Controlled launch attempts also produced V8 OOM stderr:

OOM error in V8: Scavenger: semi-space copy Allocation failed - JavaScript heap out of memory

Local DB evidence

The source DB read by the startup reconciliation path was valid SQLite, not corrupt:

PRAGMA quick_check = ok
threads = 29141

Before manual repair, the threads table contained very large duplicated display metadata:

max(length(title))              = 657278
max(length(preview))            = 657278
max(length(first_user_message)) = 657278

rows where title = preview = first_user_message:
29137 / 29141

sum(length(title) + length(preview) + length(first_user_message)):
~2152.4 MB raw SQLite text

Most of this was active exec thread metadata:

archived=0 rows: 28438, ~2150.1 MB metadata
source=exec rows: 27985, ~2147.6 MB metadata
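Figures like these can be gathered with aggregate queries that run entirely inside SQLite, so the measurement itself never pulls the large strings into V8. The following is a hedged sketch: `buildStatsSql` and `threadMetadataStats` are hypothetical helper names, and `db` is assumed to be a better-sqlite3 `Database`. SQLite's `length()` counts characters for text values, so the total is a character count rather than an exact byte size.

```javascript
// Sketch only: helper names are hypothetical. `db` is assumed to expose
// better-sqlite3's prepare().get() shape.
const STAT_FIELDS = ["title", "preview", "first_user_message"];

function buildStatsSql(fields = STAT_FIELDS) {
  // One max(length(...)) column per metadata field.
  const maxCols = fields.map((f) => `max(length(${f})) AS max_${f}`).join(", ");
  // coalesce() keeps NULL fields from turning a row's total into NULL.
  const total = fields.map((f) => `coalesce(length(${f}), 0)`).join(" + ");
  return (
    `SELECT count(*) AS threads, ${maxCols}, ` +
    `sum(${total}) AS total_chars, ` +
    // Comparisons evaluate to 0/1 in SQLite, so sum() counts matching rows.
    `sum(title = preview AND preview = first_user_message) AS identical_rows ` +
    `FROM threads`
  );
}

function threadMetadataStats(db) {
  // Aggregation happens in SQLite; only one small result row reaches JS.
  return db.prepare(buildStatsSql()).get();
}
```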

After trimming the source DB to bounded metadata and creating/allowing the reconciliation marker, the app launched successfully.

Current updated-app check

After reinstalling/updating to 26.616.31447 (4133), I extracted the packaged app.asar and confirmed the startup reconciliation code still uses .all() over threads.

The repaired ~/.codex/sqlite/state_5.sqlite is now bounded:

threads = 29141
max title / preview / first_user_message = 200 / 4000 / 4000
over limit = 0 / 0 / 0

However, the live root DB is already accumulating oversized metadata again after normal use:

~/.codex/state_5.sqlite
threads = 31540
max title / preview / first_user_message = 76211 / 76211 / 76211
over limit rows = 39 / 18 / 18
source = exec

So the immediate startup crash is currently avoided by the repaired source DB and marker, but the underlying unbounded metadata producer still appears active.

Why this is not just a one-line .all() bug

The .all() call is the immediate trigger, but the deeper invariant violation is:

thread display metadata is unbounded and duplicated, while multiple hot paths assume it is small.

title, preview, and first_user_message are used like lightweight projections for sidebar/list/startup/reconciliation paths. In my affected DB they had become transcript-sized payloads, and the same large string was usually stored three times.

That turns a valid local DB into a poisoned state: any path that treats those columns as small metadata can become a memory, CPU, IPC, or renderer problem.

Related issues

This appears related to the local-history / unbounded metadata family of issues.

This report is a more severe startup-specific failure mode:

Codex Desktop can crash before the UI is ready because Electron main-process startup reconciliation calls better-sqlite3 .all() over all threads rows, including unbounded metadata columns.

Expected behavior

  • Local thread metadata should not be able to crash the Desktop app during startup.
  • Startup reconciliation should not materialize all thread rows and all large text columns into V8 at once.
  • Display metadata fields such as title, preview, and list/read summaries should be bounded.
  • Existing oversized local state should be repaired, truncated to safe projections, or degraded gracefully.

Actual behavior

  • Codex Desktop crashed during startup with EXC_BREAKPOINT / SIGTRAP.
  • The meaningful native stack points to better_sqlite3.node Statement::JS_all converting SQLite text into V8 strings.
  • Manual DB trimming was required to recover the app.

Suggested fixes

  1. Replace startup reconciliation .all() calls with streaming or paged iteration, e.g. iterate() or keyset paging.
  2. Avoid selecting large display fields unless needed.
  3. In reconciliation SQL, cap projected metadata fields, for example:
    • title: short display label
    • preview: bounded preview
    • first_user_message: bounded snapshot or excluded from hot paths
  4. Enforce metadata caps at write time and migration/reconciliation boundaries.
  5. Treat raw conversation history/session JSONL as the source of truth for full content; keep threads metadata as bounded projections.
  6. Add a repair/doctor path for existing oversized threads rows.
  7. Keep startup recovery non-fatal where possible: log and continue, or show a repair prompt, instead of letting V8 OOM terminate the app.
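Fixes 1 to 3 can be combined. The sketch below pages through threads by key and caps each display field inside SQLite with `substr()`, so oversized text is never converted into a V8 string. It rests on assumptions: that `threads` has a unique, orderable `id` column (the real schema is not shown in this report), that `db` is a better-sqlite3 `Database`, and that the caps reuse the 200 / 4000 / 4000 limits from my local workaround. All helper names are hypothetical.

```javascript
// Sketch only. Assumes a unique, orderable `id` column on `threads`.
const PROJECTION_CAPS = { title: 200, preview: 4000, first_user_message: 4000 };

function cappedProjection(caps = PROJECTION_CAPS) {
  // substr(x, 1, n) keeps at most the first n characters of a text value.
  return Object.entries(caps)
    .map(([col, max]) => `substr(${col}, 1, ${max}) AS ${col}`)
    .join(", ");
}

function* pagedThreads(db, pageSize = 500, caps = PROJECTION_CAPS) {
  const cols = cappedProjection(caps);
  // The first page has no lower bound; later pages resume after the last id
  // seen (keyset paging), which avoids OFFSET scans over a large table.
  const first = db.prepare(`SELECT id, ${cols} FROM threads ORDER BY id LIMIT ?`);
  const next = db.prepare(
    `SELECT id, ${cols} FROM threads WHERE id > ? ORDER BY id LIMIT ?`
  );
  let page = first.all(pageSize);
  while (page.length > 0) {
    yield* page;
    const lastId = page[page.length - 1].id;
    page = next.all(lastId, pageSize);
  }
}
```

With this shape, at most `pageSize` rows of bounded size are held in memory at once, and no statement stays open between pages.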

Workaround used locally

I backed up both DBs, then trimmed:

  • title <= 200
  • preview <= 4000
  • first_user_message <= 4000

After that, creating/allowing the .app-server-state-reconciled-v1 marker let the app launch normally.
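The same limits could also be enforced in application code before a row is written, which is what suggested fix 4 asks for. This is a minimal sketch, not the app's actual write path: `capText` and `capThreadMetadata` are hypothetical helpers. JavaScript string lengths count UTF-16 code units, so the cut is adjusted to avoid leaving half of a surrogate pair.

```javascript
// Sketch only: helper names are hypothetical; limits mirror the workaround.
const METADATA_LIMITS = { title: 200, preview: 4000, first_user_message: 4000 };

function capText(value, max) {
  // Non-strings (e.g. null) and already-short strings pass through unchanged.
  if (typeof value !== "string" || value.length <= max) return value;
  let cut = value.slice(0, max);
  // If the cut landed between the two halves of a surrogate pair, drop the
  // dangling high surrogate so the result stays well-formed.
  const last = cut.charCodeAt(cut.length - 1);
  if (last >= 0xd800 && last <= 0xdbff) cut = cut.slice(0, -1);
  return cut;
}

function capThreadMetadata(row, limits = METADATA_LIMITS) {
  // Returns a shallow copy so the caller's row object is not mutated.
  const capped = { ...row };
  for (const [field, max] of Object.entries(limits)) {
    if (field in capped) capped[field] = capText(capped[field], max);
  }
  return capped;
}
```

Applying `capThreadMetadata` at every write and reconciliation boundary would keep full prompt text out of the threads table while leaving the session history as the source of truth for complete content.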

I am not attaching raw DB rows or full crash reports publicly because they may contain private local paths and conversation content, but I can provide additional sanitized excerpts if useful.

Labels: app, app-server, bug, performance