Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
129 changes: 32 additions & 97 deletions .agents/aidevops/graduated-learnings.md
Original file line number Diff line number Diff line change
@@ -1,134 +1,69 @@
---
description: Shared learnings graduated from local memory across all users
mode: subagent
tools:
read: true
write: false
edit: false
bash: false
glob: false
grep: false
webfetch: false
tools: { read: true, write: false, edit: false, bash: false, glob: false, grep: false, webfetch: false }
---

# Graduated Learnings

Validated patterns promoted from local memory (3+ accesses or high confidence).
Graduated by `memory-graduate-helper.sh`, timestamped per batch.

**Categories**:

- **Solutions & Fixes**: Working solutions to real problems
- **Anti-Patterns**: Approaches that failed (avoid repeating)
- **Patterns & Best Practices**: Proven approaches
- **Architecture Decisions**: Key design choices and rationale
- **Configuration & Preferences**: Tool and workflow settings
- **Context & Background**: Important background information

**Usage**: `memory-helper.sh graduate [candidates|graduate|status]`
Graduated by `memory-graduate-helper.sh`. Usage: `memory-helper.sh graduate [candidates|graduate|status]`

## Graduated: 2026-02-08

### Anti-Patterns (What NOT to Do)
### Anti-Patterns

- **[FAILED_APPROACH]** Tried using PostgreSQL for memory but it adds deployment complexity - SQLite FTS5 is simpler
*(confidence: high, validated: 9x)*

- **[FAILURE_PATTERN]** [task:refactor] Haiku missed edge cases when refactoring complex shell scripts with many conditionals [model:haiku]
*(confidence: high, validated: 3x)*
- PostgreSQL for memory adds deployment complexity — SQLite FTS5 is simpler *(high, 9x)*
- Haiku missed edge cases refactoring complex shell scripts with many conditionals *(high, 3x)*

### Architecture Decisions

- **[ARCHITECTURAL_DECISION]** YAML handoffs are more token-efficient than markdown (~400 vs ~2000 tokens)
*(confidence: high, validated: 0x)*

- **[DECISION]** Mailbox uses SQLite (`mailbox.db`) not TOON files. Prune shows storage report by default, `--force` to delete. Migration from TOON runs automatically on `aidevops update` via `setup.sh`.
*(confidence: medium, validated: 8x)*

- **[DECISION]** Agent lifecycle uses three tiers: `draft/` (R&D, orchestration-created), `custom/` (private, permanent), shared (`.agents/` via PR). Both `draft/` and `custom/` survive `setup.sh` deployments. Orchestration agents (Build+, Ralph loop, runners) know they can create drafts for reusable parallel processing context and propose them for inclusion in aidevops.
*(confidence: medium, validated: 3x)*
- YAML handoffs more token-efficient than markdown (~400 vs ~2000 tokens) *(high, 0x)*
- Mailbox uses SQLite (`mailbox.db`) not TOON files. Prune shows storage report by default, `--force` to delete. Migration from TOON runs automatically on `aidevops update` via `setup.sh` *(medium, 8x)*
- Agent lifecycle: three tiers — `draft/` (R&D, orchestration-created), `custom/` (private, permanent), shared (`.agents/` via PR). Both `draft/` and `custom/` survive `setup.sh`. Orchestration agents can create drafts for reusable parallel processing context and propose them for inclusion *(medium, 3x)*

### Configuration & Preferences

- **[USER_PREFERENCE]** Prefer conventional commits with scope: feat(memory): description
*(confidence: medium, validated: 4x)*
- Prefer conventional commits with scope: `feat(memory): description` *(medium, 4x)*

### Patterns & Best Practices

- **[SUCCESS_PATTERN]** [task:feature] Breaking task into 4 phases with separate commits worked well for Claude-Flow feature adoption [model:sonnet]
*(confidence: high, validated: 3x)*

- **[SUCCESS_PATTERN]** [task:bugfix] Opus identified root cause of race condition by reasoning through concurrent execution paths [model:opus]
*(confidence: high, validated: 2x)*

- **[CODEBASE_PATTERN]** Memory daemon should auto-extract learnings from thinking blocks when sessions end
*(confidence: medium, validated: 5x)*
- Breaking task into 4 phases with separate commits worked well for feature adoption *(high, 3x)*
- Opus identified root cause of race condition by reasoning through concurrent execution paths *(high, 2x)*
- Memory daemon should auto-extract learnings from thinking blocks when sessions end *(medium, 5x)*

## Graduated: 2026-02-11

### Anti-Patterns (What NOT to Do)
### Anti-Patterns

- **[FAILURE_PATTERN]** Mentioning issues in summary text without logging TODOs or fixing them. Fix: log issues immediately, don't batch to end.
*(confidence: high, validated: 1x)*
- Mentioning issues in summary text without logging TODOs or fixing them. Fix: log issues immediately, don't batch to end *(high, 1x)*

### Architecture Decisions

- **[DECISION]** Log discovered issues as TODOs immediately, not at session end. Deferring loses context.
*(confidence: high, validated: 1x)*

- **[DECISION]** For content generation (images, video, UGC, ads): read domain subagents first. Structured templates (Nanobanana Pro, 7-component video format, hook frameworks) produce better results than freehand.
*(confidence: high, validated: 1x)*

- **[DECISION]** UGC content: generate all shots, assemble with `ffmpeg` transitions, output final sequence as primary deliverable (not individual clips).
*(confidence: high, validated: 1x)*

- **[DECISION]** CRITICAL: Supervisor needs orphaned PR scanner (Phase 3c). Pattern: workers emit `TASK_COMPLETE` before PR, or `evaluate_worker` fails to parse PR URL. Fix: scan `gh pr list --state open --head feature/tXXX` for tasks with `task_only`/`no_pr`/NULL `pr_url`. Would catch t199.2 (PR #849), t199.3 (PR #846), t199.5 (PR #872) automatically.
*(confidence: high, validated: 0x)*
- Log discovered issues as TODOs immediately, not at session end. Deferring loses context *(high, 1x)*
- Content generation (images, video, UGC, ads): read domain subagents first. Structured templates (Nanobanana Pro, 7-component video format, hook frameworks) produce better results than freehand *(high, 1x)*
- UGC content: generate all shots, assemble with `ffmpeg` transitions, output final sequence as primary deliverable (not individual clips) *(high, 1x)*
- CRITICAL: Supervisor needs orphaned PR scanner (Phase 3c). Pattern: workers emit `TASK_COMPLETE` before PR, or `evaluate_worker` fails to parse PR URL. Fix: scan `gh pr list --state open --head feature/tXXX` for tasks with `task_only`/`no_pr`/NULL `pr_url`. Would catch t199.2 (PR #849), t199.3 (PR #846), t199.5 (PR #872) automatically *(high, 0x)*

### Configuration & Preferences

- **[USER_PREFERENCE]** User-facing generated assets (images, videos, documents) should be output to `~/Downloads/` so the user can immediately review them in Finder. Do NOT bury outputs in `~/.aidevops/.agent-workspace/` for interactive sessions — that path is invisible to the user. Reserve `.agent-workspace` for headless/pipeline runs only.
*(confidence: high, validated: 0x)*

- **[USER_PREFERENCE]** Runtime identity: use version-check output, don't guess. Wrong identity → wrong config paths, CLI commands, prompt loading.
*(confidence: high, validated: 0x)*
- User-facing generated assets (images, videos, documents) → `~/Downloads/` for immediate review. Do NOT bury outputs in `~/.aidevops/.agent-workspace/` for interactive sessions. Reserve `.agent-workspace` for headless/pipeline runs only *(high, 0x)*
- Runtime identity: use version-check output, don't guess. Wrong identity → wrong config paths, CLI commands, prompt loading *(high, 0x)*

### Patterns & Best Practices

- **[CODEBASE_PATTERN]** OpenCode: `prompt` field in `opencode.json` replaces (not appends) `anthropic_default`. All active agents must have `build.txt` set or fall back to upstream `anthropic.txt`, losing aidevops overrides.
*(confidence: high, validated: 1x)*

- **[CODEBASE_PATTERN]** Task ID collision: t264 assigned by two sessions simultaneously (PR #1040 vs version-manager fix). Prevention: `git pull` and re-read TODO.md before assigning IDs.
*(confidence: high, validated: 1x)*

- **[CODEBASE_PATTERN]** Stale TODO.md: completed tasks (t231 PR #955, t247 subtasks, t259 PR #1020) remain open because `update_todo_on_complete()` only runs post-PR. Fix: run `supervisor-helper.sh reconcile-todo` periodically; workers check if work is done and report `task_obsolete`.
*(confidence: high, validated: 0x)*

- **[SUCCESS_PATTERN]** [task:feature] t136.5: Scaffold aidevops-pro/anon repos | PR #792 | [model:opus] [duration:1206s]
*(confidence: medium, validated: 51x)*
- OpenCode: `prompt` field in `opencode.json` replaces (not appends) `anthropic_default`. All active agents must have `build.txt` set or fall back to upstream `anthropic.txt`, losing aidevops overrides *(high, 1x)*
- Task ID collision: t264 assigned by two sessions simultaneously (PR #1040 vs version-manager fix). Prevention: `git pull` and re-read TODO.md before assigning IDs *(high, 1x)*
- Stale TODO.md: completed tasks (t231 PR #955, t247 subtasks, t259 PR #1020) remain open because `update_todo_on_complete()` only runs post-PR. Fix: run `supervisor-helper.sh reconcile-todo` periodically; workers check if work is done and report `task_obsolete` *(high, 0x)*
- t136.5: Scaffold aidevops-pro/anon repos | PR #792 *(medium, 51x)*
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While the goal of making this machine-readable document more concise is good, these changes remove valuable technical metadata, which contradicts the PR's claim of "zero information loss". This data appears to be important for tracking agent performance and behavior.

Specifically, the following information has been removed throughout the file:

  • Task and Model Metadata: Tags like [task:refactor], [model:haiku], [model:sonnet], and [model:opus] are removed. This data is crucial for analyzing which models are effective for which types of tasks.
  • Duration Metadata: The [duration:1206s] tag has been removed, losing a key performance metric.
  • Specific Nouns: In one case, "Claude-Flow feature adoption" was shortened to "feature adoption", losing specific context.

This lost information seems to fall under the category of "institutional memory" that should be preserved. If this is a machine-consumed document, removing these structured tags could also break the parser.

Please consider restoring this metadata, perhaps in a more structured but still concise format if desired.

References
  1. Data that justifies a rule's existence (e.g., statistics) should be preserved in documentation as it is considered institutional memory.


### Solutions & Fixes

- **[ERROR_FIX]** Deploying auto-recovery infinite loop: `retry_count` was LOCAL, reset every pulse cycle. If both `deployed` and `failed` transitions fail, task stuck forever. Fixed t263 (PR #1036): persistent `deploying_recovery_attempts` DB column, max 10 attempts, fallback SQL UPDATE.
*(confidence: high, validated: 0x)*

- **[ERROR_FIX]** Pulse silent failure: Phase 3 called with `2>/dev/null || true` masks crashes. Symptom: only header printed. Diagnosis: check `post-pr.log` for repeated entries.
*(confidence: high, validated: 0x)*

- **[WORKING_SOLUTION]** Bash associative arrays (`declare -A`) + `set -u` = unbound variable on empty arrays and subscript access. Use newline-delimited string + grep instead for portable `set -u`-safe lookups. Fixed in `issue-sync-helper.sh` PR #1086.
*(confidence: high, validated: 0x)*

- **[WORKING_SOLUTION]** Parallel workers on blocked-by chains create merge conflicts (t008.1-4, t012.3-5). Solution: dispatch sequentially or use single worker.
*(confidence: high, validated: 0x)*

- **[WORKING_SOLUTION]** Decomposition bug (t278): parents marked [x] while subtasks still [ ]. Verify subtask completion before marking parent done.
*(confidence: high, validated: 0x)*

- **[WORKING_SOLUTION]** issue-sync `find_closing_pr()`: format mismatch (`pr:#NNN` vs `PR #NNN`) silently omits PR reference. Ensure regex matches TODO.md format. Fixed t291/PR#1129.
*(confidence: high, validated: 0x)*

- **[WORKING_SOLUTION]** CRITICAL: Cron pulse on macOS needs (1) `/usr/sbin` in PATH, (2) `GH_TOKEN` cached to file (keyring inaccessible), (3) `get_aidevops_identity` validates `gh api` output. Fixed PR #780.
*(confidence: medium, validated: 52x)*

- **[WORKING_SOLUTION]** SYSTEMIC: Deployed scripts at `~/.aidevops/agents/scripts/` NOT auto-updated after merging. Cron pulse runs deployed copy. Run `aidevops update` or `rsync` after script changes. Consider auto-deploy in post-merge hook.
*(confidence: medium, validated: 37x)*
- Deploying auto-recovery infinite loop: `retry_count` was LOCAL, reset every pulse cycle. If both `deployed` and `failed` transitions fail, task stuck forever. Fixed t263 (PR #1036): persistent `deploying_recovery_attempts` DB column, max 10 attempts, fallback SQL UPDATE *(high, 0x)*
- Pulse silent failure: Phase 3 called with `2>/dev/null || true` masks crashes. Symptom: only header printed. Diagnosis: check `post-pr.log` for repeated entries *(high, 0x)*
- Bash `declare -A` + `set -u` = unbound variable on empty arrays and subscript access. Use newline-delimited string + grep instead for portable `set -u`-safe lookups. Fixed in `issue-sync-helper.sh` PR #1086 *(high, 0x)*
- Parallel workers on blocked-by chains create merge conflicts (t008.1-4, t012.3-5). Solution: dispatch sequentially or use single worker *(high, 0x)*
- Decomposition bug (t278): parents marked [x] while subtasks still [ ]. Verify subtask completion before marking parent done *(high, 0x)*
- issue-sync `find_closing_pr()`: format mismatch (`pr:#NNN` vs `PR #NNN`) silently omits PR reference. Ensure regex matches TODO.md format. Fixed t291/PR#1129 *(high, 0x)*
- CRITICAL: Cron pulse on macOS needs (1) `/usr/sbin` in PATH, (2) `GH_TOKEN` cached to file (keyring inaccessible), (3) `get_aidevops_identity` validates `gh api` output. Fixed PR #780 *(medium, 52x)*
- SYSTEMIC: Deployed scripts at `~/.aidevops/agents/scripts/` NOT auto-updated after merging. Cron pulse runs deployed copy. Run `aidevops update` or `rsync` after script changes. Consider auto-deploy in post-merge hook *(medium, 37x)*
Loading