Interpretable, drive‑based agent that develops without external rewards: event‑sourced, auditable milestones, caregiver loop, eval harness.

SOMA — Self‑Organizing Mechanomorphic Agent

A research scaffold for building an interpretable, drive‑based agent that develops without external reward. SOMA proceeds in small, auditable milestones (M0→M11) with structured logs, a replay tool, and a lightweight evaluation harness.

Status: Core build complete through M11 (Evaluation Harness v1). M12 (richer sandbox/causal puzzles) is available as an optional extension. See Definition of Done below.


Quickstart (Windows 11 + PowerShell)

# from repo root
python -m venv .venv
. .\.venv\Scripts\Activate.ps1
python -m pip install -U pip
pip install -r requirements.txt
# optional for editable installs
# pip install -e .

Run SOMA

# Grid v0 (baseline)
python -m scripts.run --env grid-v0 --ticks 60 --seed 123 --size 9 --n-objects 18 --view-radius 1

# Grid v1 (richer affordances / state toggles)
python -m scripts.run --env grid-v1 --ticks 60 --seed 123 --size 9 --n-objects 16 --view-radius 1

Each run writes a timestamped folder under runs/ with:

  • meta.json — run metadata
  • events.jsonl — append‑only event stream
  • events.sqlite — structured store (events table)
  • report.md — autogenerated short run report (M11)
  • caregiver_*.jsonl/json — query/answer/tag files (M10)

Replay notes/symbols

# Most recent run
python -m scripts.replay --kind note
# Or specify a folder
python -m scripts.replay --run runs\m10care_YYYYMMDDTHHMMSSZ --kind symbol
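The same run data can also be queried straight from events.sqlite. In this sketch only the table name (events) is taken from the artifact list above; the column layout is not assumed, so rows come back as plain dicts:

```python
import sqlite3
from contextlib import closing

def fetch_events(db_path, limit=10):
    """Return the first rows of the events table as dicts.

    Table name "events" comes from the run artifacts; columns are not assumed.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        conn.row_factory = sqlite3.Row
        rows = conn.execute("SELECT * FROM events LIMIT ?", (limit,)).fetchall()
    return [dict(r) for r in rows]
```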

Caregiver interface (M10)

List pending queries and answer with token→gloss tags.

# list queries
python -m scripts.caregiver ls runs\m10care_YYYYMMDDTHHMMSSZ

# answer a query (repeat --tag for multiple)
python -m scripts.caregiver answer runs\m10care_YYYYMMDDTHHMMSSZ --qid m10care_...:41 `
  --tag N!=sudden-color-change --note "looked totally new"

SOMA ingests answers on subsequent ticks; merged tags persist in caregiver_tags.json and appear in later symbol notes and reports.
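To inspect the merged tags programmatically, something like the following works. It assumes caregiver_tags.json holds a JSON object of token→gloss pairs (e.g. {"N!": "sudden-color-change"}); verify against your run's actual file:

```python
import json
from pathlib import Path

def load_caregiver_tags(run_dir):
    """Load merged caregiver tags for a run, or {} if none exist yet.

    Assumes a JSON object of token -> gloss pairs; check your run's file.
    """
    path = Path(run_dir) / "caregiver_tags.json"
    if not path.exists():
        return {}
    return json.loads(path.read_text())
```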

Evaluate a run (M11)

# evaluate the most recent run
python -m scripts.eval last

# or a specific folder
python -m scripts.eval runs\m10care_YYYYMMDDTHHMMSSZ

Outputs report.md with novelty stats, memory‑reuse ratios, symbol diversity, and caregiver‑gloss usage.
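As an illustration, a symbol‑diversity metric of the kind report.md includes could be computed roughly like this (a stand‑in; the harness's actual formula may differ):

```python
def symbol_diversity(utterances):
    """Unique symbols over total utterances; 0.0 when nothing was emitted.

    Rough stand-in for the report metric; the real formula may differ.
    """
    if not utterances:
        return 0.0
    return len(set(utterances)) / len(utterances)
```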


Environments

  • grid‑v0 — static colored shapes; no rewards; small viewport.
  • grid‑v1 — adds affordances: toggleable objects (e.g., buttons/gates), simple multi‑step interactions, and persistent state.

Both environments expose the same observation schema to SOMA.

Actions: up, down, left, right, noop, ping
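For reference, the six‑action space can be sketched as a simple enum with clamped grid movement. This is illustrative only (the names match the list above, but the real classes and edge handling in grid‑v0/v1 may differ):

```python
from enum import Enum

class Action(Enum):
    """The grid action space listed above (sketch; real class may differ)."""
    UP = "up"
    DOWN = "down"
    LEFT = "left"
    RIGHT = "right"
    NOOP = "noop"
    PING = "ping"

def apply_move(pos, action, size):
    """Apply a movement action on a size x size grid, clamping at the edges.

    noop and ping leave the position unchanged. Illustrative only: the
    actual environments may handle walls and interactions differently.
    """
    x, y = pos
    dx, dy = {
        Action.UP: (0, -1),
        Action.DOWN: (0, 1),
        Action.LEFT: (-1, 0),
        Action.RIGHT: (1, 0),
    }.get(action, (0, 0))
    return (min(max(x + dx, 0), size - 1), min(max(y + dy, 0), size - 1))
```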


Milestones implemented

  • M0 — deterministic tick loop; JSONL events
  • M1 — SQLite event store; SelfNotes; replay CLI
  • M2 — Sandbox v0 (grid‑v0)
  • M3 — Reflex engine (overload/loop safe‑guards)
  • M4 — Memory v1 (episodic + vector store), similarity recall notes
  • M5 — Curiosity v1 (novelty + attention)
  • M6 — Motivation v1 (drive pressures)
  • M7 — Behavior planner v1 (drive→policy)
  • M8 — State tracker v1 (self‑model for interpretability)
  • M9 — Symbolic channel v0 (compact utterances: N!, Stab↓, ?, …)
  • M10 — Caregiver interface v0 (query/answer/tag loop)
  • M11 — Evaluation harness v1 (markdown report)
  • M12 (optional) — Sandbox v1 (richer affordances) with baseline planner support

Project layout

soma/
  configs/              # (reserved)
  runs/                 # per‑run artifacts
  scripts/              # CLI entrypoints: run, replay, caregiver, eval
  soma/
    core/               # loop, state, events, store
    cogs/               # reflex, memory, curiosity, motivation, planner, …
    sandbox/            # envs: grid‑v0, grid‑v1
    ...
  tests/                # (minimal stubs; expand as needed)

Definition of Done (project)

  • One CLI: python -m scripts.run with flags for env/ticks/seed ✅
  • Structured logs: JSONL + SQLite; per‑run report.md
  • Evaluation: scripts.eval computes core developmental metrics ✅
  • Replay: scripts.replay surfaces self‑notes & symbols ✅
  • Caregiver loop: scripts.caregiver for queries/answers/tags ✅
  • Docs: this README ✅
  • Tests: minimal smoke tests in tests/; add more unit and scenario tests as follow‑ups

With M0–M11 shipped and the v1 sandbox available, v0 SOMA is ready for experimentation and evaluation. Recommended follow‑ups: strengthen the unit tests, add a couple of scenario tests (overload, contradiction), and write one or two Architecture Decision Records.


Troubleshooting

  • Module not found (soma): run commands from the repo root: python -m scripts.run.
  • Typer/Click errors: ensure typer>=0.12 installed; prefer python -m style invocations.
  • Stale imports / odd behavior: clear __pycache__ and re‑run.
  • Late‑run “noop lock”: tuned via staleness and planner parameters; see soma/cogs/working_memory/staleness.py.

License

MIT (or your preferred license).
