Feature Request: Agent-recordable, verifiable procedural skills ("executable memory")

### What variant of Codex are you using?

CLI

### What feature would you like to see?

### Summary

Codex can already *read* procedural knowledge (skills) and *recall* context (memories), but it has no first-class way for the agent to **turn one of its own successful, repeatable runs into a reusable, executable, verifiable procedure** — and to **repair that procedure cheaply when the environment drifts**, instead of re-deriving the whole thing from scratch.

I'd like to propose an **agent-recordable procedural skill** capability (call it "executable memory"): a small, surface-consistent extension that sits between today's static skills and background-extracted memories.

### Additional information

### The gap (grounded in the current tree)

Three adjacent primitives already exist, but none closes the loop:

- **Skills are static and human-authored.** `SKILL.md` is parsed at load time (`codex-rs/core-skills/src/loader.rs`), and `SkillMetadata` is a read-only snapshot (`codex-rs/core-skills/src/model.rs`). There is no agent-facing path to *write* or *update* a skill from a successful run.
- **Memories are unstructured context, not executable procedure.** The agent can append a free-form markdown note via `AddAdHocNote` (`codex-rs/ext/memories/src/tools/ad_hoc_note.rs`), and memories are auto-injected into context by a `ContextContributor`. But a note is prose, not a replayable step sequence with success criteria.
- **Traces are diagnostic, not re-executable.** `rollout-trace` records everything and exposes `replay_bundle` (`codex-rs/rollout-trace/src/lib.rs`), but explicitly for *offline reconstruction / debugging* — there is no production pathway to replay a recorded sequence against new inputs.

So the agent re-discovers the same multi-step procedure (a chain of shell/tool calls) on every repeat of a recurring task, paying full exploration cost in tokens and latency each time.

### Proposed capability (minimal slice)

Reuse what already exists; add only the missing loop. Concretely:

1. **Record** — an agent-callable tool (e.g. `skill.record`) that, after a successful repeatable task, persists the procedure as a `SKILL.md` under `./.codex/skills/<name>` (existing format), where the body captures *steps = the tool-call sequence* plus *postconditions = how to verify success*. This is the agent-write counterpart to today's human-authored skills, and a structured counterpart to `AddAdHocNote`.
2. **Recall + replay** — surface matching recorded skills the way memories already auto-surface (`ContextContributor`), and let the agent replay the stored steps instead of re-exploring.
3. **Verify** — on replay, check the recorded postconditions rather than assuming success.
4. **Minimal repair** — when a postcondition fails, patch the failing step (and persist the patch) instead of discarding and re-deriving the whole procedure. Keeping runtime patches separate from the trusted baseline keeps a recorded skill auditable.
5. **Safety gate** — route any step the agent flags as high-risk through the existing approval flow before replay, so a recorded procedure can't silently execute a destructive step.

Capture could optionally be assisted by existing `PostToolUse` / `Stop` hook events (`codex-rs/hooks/src/lib.rs`) and the `rollout-trace` reducer as the source of the step sequence — no new recording infrastructure required.

### Why this fits Codex (and not a separate product)

- **Surface-consistent.** It's pure text/tool-call procedure persisted as files. It works headless and applies equally to CLI, IDE, and cloud surfaces — no new platform-specific dependencies, no display requirement.
- **Directly on the existing roadmap surface.** It's an incremental extension of `core-skills` + `memories`, not a new subsystem. The novel pieces are narrow: an agent **write** path for skills, **postcondition verification**, and **minimal repair**.
- **Concrete payoff.** Recurring tasks (repeated build/test/debug loops, multi-step setup/migration chores, fixed tool pipelines) stop paying full re-exploration cost on every run — fewer tokens, lower latency, more determinism.

### Open design questions (happy to discuss before any code)

- Should a recorded skill be a `SKILL.md` superset (add an optional executable/verify block) or a sibling artifact type, to avoid confusing human-authored vs. agent-recorded skills?
- What's the right trigger for **record** — an explicit tool the model calls, a user command, or a `Stop`-hook prompt at end of a successful task?
- How should postconditions be expressed so they're cheap to evaluate and safe (no side effects)?
- What's the minimal-repair contract — patch-and-persist vs. propose-patch-for-confirmation?
- Where does this sit relative to the memories Phase-1 background pipeline — complementary, or should recording feed that pipeline?

I'm glad to iterate on the design in this thread before anything is implemented; identifying the right shape here seems like the hard part.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature Request: Agent-recordable, verifiable procedural skills ("executable memory") #28999

What variant of Codex are you using?

What feature would you like to see?

Summary

Additional information

The gap (grounded in the current tree)

Proposed capability (minimal slice)

Why this fits Codex (and not a separate product)

Open design questions (happy to discuss before any code)

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Feature Request: Agent-recordable, verifiable procedural skills ("executable memory") #28999

Description

What variant of Codex are you using?

What feature would you like to see?

Summary

Additional information

The gap (grounded in the current tree)

Proposed capability (minimal slice)

Why this fits Codex (and not a separate product)

Open design questions (happy to discuss before any code)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions