
Bosun/codex self improvement loop commits#469

Open
jaeko44 wants to merge 87 commits into main from bosun/codex-self-improvement-loop-commits

Conversation

@jaeko44
Member

@jaeko44 jaeko44 commented Mar 31, 2026

No description provided.

jaeko44 added 30 commits March 31, 2026 16:13
…ecovery routes to inreview

Three critical bugs fixed:

1. checkMergedPRsAndUpdateTasks() ignored 'blocked' tasks entirely — merged-PR
   detection only covered todo/inprogress/inreview. Now includes blocked tasks
   in mergedRecoveryCandidates and allowsMergedRecovery.

2. safeRecoverTask() always routed recovered tasks to 'todo' regardless of
   whether they had a linked PR. Now checks for prNumber/prUrl and routes
   to 'inreview' when a PR exists.

3. recoverTimedBlockedWorkflowTasks() and recoverAutoBlockedTasks() both
   hardcoded 'todo' as the recovery status. Now check for linked PRs and
   recover to 'inreview' when appropriate.

Live store impact: 8 blocked tasks with merged PRs moved to done,
16 blocked tasks without PRs unblocked to todo. Zero blocked tasks remain.
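
A minimal sketch of the routing rule described in fixes 2 and 3, assuming a task object that carries `status`, `prNumber`, and `prUrl` fields. `resolveRecoveryStatus` and `MERGED_RECOVERY_STATUSES` are hypothetical names, and the shape given to `allowsMergedRecovery` is a guess; the real bodies of `safeRecoverTask()` and the other functions are not shown in this PR conversation.

```javascript
// Hypothetical helper: picks the status a recovered task should return to.
// A task with a linked PR goes back to review instead of the backlog.
function resolveRecoveryStatus(task) {
  const prNumber = Number(task?.prNumber);
  const hasPrNumber = Number.isInteger(prNumber) && prNumber > 0;
  const hasPrUrl = typeof task?.prUrl === "string" && task.prUrl.trim() !== "";
  return hasPrNumber || hasPrUrl ? "inreview" : "todo";
}

// Statuses scanned for merged PRs after fix 1 ("blocked" is the new entry).
// The set itself is an assumption about how the check is organised.
const MERGED_RECOVERY_STATUSES = new Set(["todo", "inprogress", "inreview", "blocked"]);

// Assumed shape: true when a task in this status may be closed out by a merged PR.
function allowsMergedRecovery(task) {
  const status = String(task?.status || "").trim().toLowerCase();
  return MERGED_RECOVERY_STATUSES.has(status);
}

console.log(resolveRecoveryStatus({ prNumber: 469 }));   // inreview
console.log(resolveRecoveryStatus({}));                  // todo
console.log(allowsMergedRecovery({ status: "blocked" })); // true
```

Keeping the decision in one small function would let all three recovery paths share it, which is one way to avoid the hardcoded 'todo' creeping back in.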
jaeko44 and others added 3 commits April 1, 2026 05:28
- Added header scope label to the dashboard for better context on task counts.
- Updated task detail fetching to include workflow run information.
- Improved task classification for blocked tasks based on worktree failures.
- Adjusted batch size for Vitest runner on Windows to optimize performance.
- Enhanced PR fix templates to validate PR state before cloning and releasing claims.
- Introduced new validation node for PR state in workflow templates.
- Updated workflow nodes to support provider and provider configuration for OpenCode executor runs.
- Refactored executor resolution logic to include provider and provider configuration.

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
…dex-self-improvement-loop-commits' and 'bosun/codex-self-improvement-loop-commits' of https://github.com/virtengine/bosun into bosun/codex-self-improvement-loop-commits

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
@jaeko44
Member Author

jaeko44 commented Mar 31, 2026

@copilot 2 failing tests remain:

Test Files 2 failed | 260 passed | 1 skipped (263)
Tests 2 failed | 4851 passed | 3 skipped (4856)
Start at 18:29:59
Duration 103.83s (transform 8.97s, setup 25.02s, import 32.49s, tests 211.48s, environment 50ms)

Error: AssertionError: expected '/* Auto-generated by tools/generate-d…' to be '/* Auto-generated by tools/generate-d…' // Object.is equality

  • Expected
  • Received

@@ -49894,19 +49894,19 @@
"voicePersona": "neutral",
"voiceInstructions": "You are Bosun, a voice assistant for the VirtEngine development platform. Be helpful, concise, and professional. Use tools to answer questions and execute tasks.",
"enabledTools": null,
"enabledMcpServers": []
},

  • "background-task-execution": "# Skill: Background Task Execution\n\n- Send heartbeat updates before work starts, after major milestones, and during long operations.\n- Post status notes when a step runs long enough to look stalled.\n- Verify the affected code path before marking the task complete.\n- Finish only after build, tests, and any required push or PR handoff succeed.\n",
    
  • "pr-workflow": "# Skill: Pull Request Workflow\n\n- Fetch and merge the base branch plus `origin/main` before pushing.\n- Push the task branch and hand off PR lifecycle steps to Bosun.\n- Use a short PR description with Summary, Changes, Testing, and Notes.\n- Never bypass git hooks with `--no-verify`; fix the failing check or note unrelated breakage.\n",
    
  • "error-recovery": "# Skill: Error Recovery Patterns\n\n- Classify the failure first: syntax, test, dependency, git, network, config, or resource limits.\n- Fix the first real error before chasing downstream noise.\n- Prefer the smallest safe change that resolves the root cause.\n- If the error is external or flaky, retry with limits and stop rather than papering over it.\n",
    
  • "tdd-pattern": "# Skill: Test-Driven Development\n\n- Start with the smallest failing test that proves the target behavior.\n- Implement the minimum code required to pass that test.\n- Refactor only after the test is green.\n- Keep tests deterministic: no real network, random data, or timer-based synchronization.\n",
    
  • "commit-conventions": "# Skill: Conventional Commits\n\n- Use Conventional Commits such as `feat:`, `fix:`, `chore:`, or `test:`.\n- Keep the subject short, imperative, and scoped to the actual change.\n- Mention validation in the handoff or PR notes, not in the subject line.\n- Do not bundle unrelated work into the same commit message or commit.\n",
    
  • "agent-coordination": "# Skill: Multi-Agent Coordination\n\n- Keep scope ownership clear before editing shared files.\n- Prefer small, isolated changes that reduce merge conflict risk.\n- Leave concise status notes when handing work to another agent or retry.\n- Re-check git status before finalizing so no unrelated edits leak into the task.\n",
    
  • "bosun-agent-api": "# Skill: Bosun Agent Status API\n\n- POST `/status` when starting a new phase or when context changes.\n- POST `/heartbeat` during active work so Bosun does not requeue the task.\n- POST `/error` with concise failure context before aborting.\n- POST `/complete` only after verification is done and the task is truly finished.\n",
    
  • "code-quality-anti-patterns": "# Skill: Code Quality Anti-Patterns\n\n- Keep caches, lazy singletons, and loaded flags at module scope.\n- Await async work or attach `.catch()`; never leave floating promises.\n- Wrap hot-path callbacks and handlers in error boundaries.\n- Mock external boundaries only; avoid over-mocking the module under test.\n- Keep tests deterministic and remove dead branches instead of layering guard code.\n",
    
  • "skill-codebase-audit": "# Skill: Codebase Annotation Audit\n\n## Purpose\nSystematically audit and annotate a codebase so that *future* AI agents can\nnavigate it 4× faster, use 20% fewer tokens, and avoid false-positive changes.\nThis skill is **documentation-only** — it MUST NOT fix bugs, refactor code,\nor change program behavior.\n\n## Philosophy — LEAN Annotations\n\nModern AI coding SDKs (Copilot, Codex, Claude Code) already auto-compact\ncontext. Adding a memory/compaction layer on top is wasteful. What *does* help\nis **repo-level documentation** that agents read at the start of a session:\nsummaries, warnings, archit
    

Error: AssertionError: expected '/* ──────────────────────────────────…' to contain 'return status === "active" || status …'

  • Expected
  • Received
  • return status === "active" || status === "running" || status === "busy" || status === "inprogress";
  • /* ─────────────────────────────────────────────────────────────
    • Tab: Agents — thread/slot cards, capacity, detail expansion
    • ────────────────────────────────────────────────────────────── */
  • import { h } from "preact";
  • import { useState, useCallback, useEffect, useRef, useMemo } from "preact/hooks";
  • import htm from "htm";
  • const html = htm.bind(h);
  • import {
  • Typography, Box, Stack, CardContent, Button, IconButton, Chip,
  • Divider, Paper, TextField, CircularProgress, Alert, Tooltip,
  • Tabs, Tab, Skeleton, Avatar, LinearProgress,
  • List, ListItem, ListItemButton, ListItemText, ListItemIcon,
  • Menu, MenuItem, Accordion, AccordionSummary, AccordionDetails,
  • Select, FormControl, InputLabel,
  • } from "@mui/material";
  • import { haptic, showConfirm } from "../modules/telegram.js";
  • import { apiFetch, sendCommandToChat } from "../modules/api.js";
  • import { iconText, resolveIcon } from "../modules/icon-utils.js";
  • import {
  • executorData,
  • agentsData,
  • statusData,
  • telemetrySummary,
  • agentLogQuery,
  • agentLogFile,
  • agentWorkspaceTarget,
  • showToast,
  • refreshTab,
  • scheduleRefresh,
  • } from "../modules/state.js";
  • import { navigateTo, routeParams, setRouteParams } from "../modules/router.js";
  • import {
  • activeWorkspaceId,
  • loadWorkspaces,
  • workspaces as managedWorkspaces,
  • } from "../components/workspace-switcher.js";
  • import { ICONS } from "../modules/icons.js";
  • import { formatCompactCount } from "../modules/session-insights.js";
  • import { formatRelative, truncate } from "../modules/utils.js";
  • import { getSessionRuntimeState, resolveSessionWorkspaceHint } from "../modules/session-api.js";
  • import {
  • Card,
  • Badge,
  • StatCard,
  • SkeletonCard,
  • EmptyState,
  • } from "../components/shared.js";
  • import { ProgressBar } from "../components/charts.js";
  • import { Collapsible } from "../components/forms.js";

…nerate demo-defaults

Agent-Logs-Url: https://github.com/virtengine/bosun/sessions/61d2cb29-29a9-45db-8a9e-6a26e464ae67

Co-authored-by: jaeko44 <9289791+jaeko44@users.noreply.github.com>
Contributor

Copilot AI commented Mar 31, 2026

@copilot 2 failing tests remain:

...

Fixed both remaining failures in commit 8593011:

  • fleet-tab-render.test.mjs: Restored isFleetEntryActive in ui/tabs/agents.js to use the direct status comparison pattern (return status === "active" || status === "running" || status === "busy" || status === "inprogress") expected by the test — the function had been refactored to delegate to getFleetEntryStatusMeta in the branch.
  • demo-defaults-sync.test.mjs: Regenerated ui/demo-defaults.js and site/ui/demo-defaults.js via node tools/generate-demo-defaults.mjs to sync with the current workflow template definitions (skill content had drifted).
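
For reference, a sketch of what the restored function might look like. The `return` line is quoted from the failing assertion; the `entry` parameter and the normalization line above it are assumptions, since the real body of `isFleetEntryActive` in ui/tabs/agents.js is not shown here. The failure message ("expected '/* ───…' to contain 'return status === …'") suggests the test checks the source text itself, which would explain why delegating to another helper broke it even if the behavior was unchanged.

```javascript
// Assumed signature: takes a fleet entry with a `status` string.
function isFleetEntryActive(entry) {
  // Normalization is an assumption; the original may read status differently.
  const status = String(entry?.status || "").trim().toLowerCase();
  // This exact line is the pattern the test expects to find in the source.
  return status === "active" || status === "running" || status === "busy" || status === "inprogress";
}

console.log(isFleetEntryActive({ status: "running" })); // true
console.log(isFleetEntryActive({ status: "idle" }));    // false
```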

…ithub.com/virtengine/bosun into bosun/codex-self-improvement-loop-commits

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
jaeko44 and others added 3 commits April 1, 2026 05:54
- Removed delegation watchdog settings and related parameters from agent phases.
- Eliminated auto-fix validation remediation paths and associated nodes.
- Updated task status handling to set tasks as "todo" instead of "blocked" on validation failure.
- Streamlined PR creation process by simplifying the PR body generation.
- Adjusted metadata versioning to reflect changes in the task lifecycle template.

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
- Updated task fetching in DispatchSection to use pagination with a limit of 200 tasks.
- Added recovery strategy limit to the error recovery template and introduced a new node to load recovery strategies.
- Enhanced error analysis by including reusable prior strategies in the analysis prompt.
- Expanded health check template to include recent run evaluations and apply self-improvement ratchets based on the latest run outcomes.
- Improved task lifecycle template with enhanced validation handling, including auto-fix attempts and logging for validation failures.
- Updated workflow engine to provide clearer summaries and recommendations based on workflow completion status.

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
@jaeko44
Member Author

jaeko44 commented Mar 31, 2026

@copilot

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/1]⎯

Test Files 1 failed | 243 passed | 1 skipped (245)
Tests 1 failed | 3930 passed | 3 skipped (3934)
Start at 18:52:13
Duration 294.98s (transform 7.51s, setup 20.98s, import 22.32s, tests 350.55s, environment 43ms)

Error: AssertionError: expected '/* Auto-generated by tools/generate-d…' to be '/* Auto-generated by tools/generate-d…' // Object.is equality

  • Expected
  • Received

@@ -2426,12 +2426,12 @@
"fix",
"single-pr",
"session-tracked",
"worktree-managed"
],

  •   "nodeCount": 16,
    
  •   "edgeCount": 16,
    
  •   "nodeCount": 15,
    
  •   "edgeCount": 14,
      "recommended": true,
      "enabled": true,
      "trigger": "trigger.manual",
      "variables": {},
      "metadata": {
    

@@ -2531,60 +2531,31 @@
"outputs": [
"default"
]
},
{

  •       "id": "validate-pr-state",
    
  •       "id": "setup-worktree",
          "type": "action.run_command",
    
  •       "label": "Validate PR Is Still Open",
    
  •       "label": "Clone & Checkout PR Branch",
          "config": {
            "command": "node",
            "args": [
              "-e",
    
  •           "const {execFileSync}=require('child_process'); const repo=String(process.env.PR_REPO||'').trim(); const num=String(process.env.PR_NUMBER||'0').trim(); const fallbackBranch=String(process.env.PR_BRANCH||'').trim(); const fallbackBase=String(process.env.PR_BASE||'main').trim(); if(!repo||!num){console.log(JSON.stringify({ok:false,open:false,skip:true,reason:'missing_repo_or_number',repo,number:num,branch:fallbackBranch,base:fallbackBase}));process.exit(0);} try{   const raw=execFileSync('gh',['pr','view',num,'--repo',repo,'--json','state,isDraft,headRefName,baseRefName,url'],{encoding:'utf8',stdio:['pipe','pipe','pipe'],timeout:30000}).trim();   const view=JSON.parse(raw||'{}');   const state=String(view?.state||'').trim().toUpperCase();   const isDraft=view?.isDraft===true;   const open=state==='OPEN'&&!isDraft;   const branch=String(view?.headRefName||fallbackBranch||'').trim();   const base=String(view?.baseRefName||fallbackBase||'main').trim()||'main';   console.log(JSON.stringify({ok:open,open,skip:!open,reason:open?'open':(isDraft?'draft_pr':'pr_not_open'),state,isDraft,repo,number:num,branch,base,url:String(view?.url||'').trim()||null})); }catch(err){   console.log(JSON.stringify({ok:false,open:false,skip:true,reason:'pr_view_failed',error:String(err?.message||err),repo,number:num,branch:fallbackBranch,base:fallbackBase})); }"
    
  •           "const os=require('os'); const path=require('path'); const fs=require('fs'); const {execFileSync}=require('child_process'); const repo=String(process.env.PR_REPO||'').trim(); const branch=String(process.env.PR_BRANCH||'').trim(); const base=String(process.env.PR_BASE||'main').trim(); const num=String(process.env.PR_NUMBER||'0').trim(); if(!repo||!branch){console.log(JSON.stringify({error:'missing repo or branch',repo,branch}));process.exit(1);} let wt=path.join(os.tmpdir(),'bosun-prfix-'+num.replace(/[^0-9a-z]/gi,'-')); let reused=false; if(fs.existsSync(path.join(wt,'.git'))){   try{     const cur=execFileSync('git',['rev-parse','--abbrev-ref','HEAD'],{cwd:wt,encoding:'utf8',timeout:10000}).trim();     if(cur===branch){       execFileSync('git',['fetch','origin',branch],{cwd:wt,encoding:'utf8',timeout:120000,stdio:['ignore','pipe','pipe']});       execFileSync('git',['reset','--hard','origin/'+branch],{cwd:wt,encoding:'utf8',timeout:30000});       execFileSync('git',['clean','-fd','-e','.bosun/'],{cwd:wt,encoding:'utf8',timeout:30000});       try{execFileSync('git',['fetch','origin',base],{cwd:wt,encoding:'utf8',timeout:60000,stdio:['ignore','pipe','pipe']});}catch{}       reused=true;     }else{try{fs.rmSync(wt,{recursive:true,force:true});}catch{}}   }catch{try{fs.rmSync(wt,{recursive:true,force:true});}catch{}} } if(!reused){   if(fs.existsSync(wt)){try{fs.rmSync(wt,{recursive:true,force:true});}catch{wt=wt+'-'+Date.now().toString(36);}}   execFileSync('gh',['repo','clone',repo,wt,'--','--branch',branch],{encoding:'utf8',timeout:300000,stdio:'inherit'});   execFileSync('git',['fetch','origin',branch],{cwd:wt,encoding:'utf8',timeout:120000,stdio:['ignore','pipe','pipe']});   execFileSync('git',['reset','--hard','o
    

Error: Process completed with exit code 1.

Contributor

Copilot AI commented Mar 31, 2026

@copilot

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/1]⎯
...

Fixed in commit 1429ad6. The workflow template was updated (the validate-pr-state node was removed, reducing nodeCount 16→15 and edgeCount 16→14) after the previous demo-defaults regeneration. Ran node tools/generate-demo-defaults.mjs again to resync both ui/demo-defaults.js and site/ui/demo-defaults.js.

jaeko44 and others added 3 commits April 1, 2026 09:11
…sages by falling back to sendDirect

test(telegram): add test for editDirect fallback when message to edit is not found

test(workflow): add tests for condition.expression resolving template placeholders as JS literals

test(workflow-templates): verify recovery of blocked worktrees summary uses precomputed count variable

fix(ui): update demo defaults for node and edge counts, and enhance PR body configuration

feat(workflow-templates): implement detailed validation failure handling with retries and notifications

fix(workflow-nodes): improve condition.expression execution to prevent unsupported syntax and enhance error handling

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
…ithub.com/virtengine/bosun into bosun/codex-self-improvement-loop-commits

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
…lates with dynamic environment variables

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
@jaeko44
Member Author

jaeko44 commented Mar 31, 2026

@copilot

⎯⎯⎯⎯⎯⎯⎯ Failed Tests 6 ⎯⎯⎯⎯⎯⎯⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: emits parse-safe summary with mixed telemetry quality
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996332026}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:413:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/6]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: emits artifact and survives degraded telemetry
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996339429}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:444:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[2/6]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: mixed degraded + healthy sources produce parse-safe trend deltas
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996346818}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:505:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[3/6]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: partially parsed task telemetry still computes best-effort throughput
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996354206}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:549:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[4/6]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: non-json task telemetry is marked degraded and suppressed from trend alerts
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996361597}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:575:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[5/6]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: parses wrapped canonical task payloads and preserves planner artifact
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996368984}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:614:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[6/6]⎯

Test Files 1 failed (1)
Tests 6 failed | 81 passed (87)
Start at 22:30:35
Duration 189.20s (transform 1.53s, setup 283ms, import 862ms, tests 187.94s, environment 0ms)

JSON report written to /home/runner/work/bosun/bosun/test-results-n22-s4.json
Error: Process completed with exit code 1.

@jaeko44
Member Author

jaeko44 commented Mar 31, 2026

@copilot

❯ tests/demo-defaults-sync.test.mjs:20:41
18| const data = await buildDemoDefaultsData();
19| const expected = renderDefaultsScript(data);
20| expect(read("ui/demo-defaults.js")).toBe(expected);
| ^
21| expect(read("site/ui/demo-defaults.js")).toBe(expected);
22| });

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/8]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: all templates execute without engine errors > template-weekly-fitness-summary
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996262215}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:211:26

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[2/8]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: emits parse-safe summary with mixed telemetry quality
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996398028}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:413:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[3/8]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: emits artifact and survives degraded telemetry
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996405441}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:444:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[4/8]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: mixed degraded + healthy sources produce parse-safe trend deltas
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996412840}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:505:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[5/8]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: partially parsed task telemetry still computes best-effort throughput
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996420235}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:549:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[6/8]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: non-json task telemetry is marked degraded and suppressed from trend alerts
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996427626}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:575:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[7/8]⎯

FAIL tests/workflow-guaranteed.test.mjs > guaranteed: behavioral contracts > template-weekly-fitness-summary: parses wrapped canonical task payloads and preserves planner artifact
Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996435013}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
285| noEngineErrors(ctx, label = templateId) {
286| if (ctx.errors.length > 0) {
287| throw new Error(
| ^
288| `${label}: ${ctx.errors.length} engine-level error(s):\n` +
289| ctx.errors.map((e) => ` • ${e.nodeId ?? "?"}: ${e.message…
❯ tests/workflow-guaranteed.test.mjs:614:24

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[8/8]⎯

Test Files 2 failed | 242 passed | 1 skipped (245)
Tests 8 failed | 3931 passed | 3 skipped (3942)
Start at 22:30:08
Duration 275.95s (transform 7.48s, setup 21.44s, import 22.52s, tests 329.70s, environment 44ms)

Error: AssertionError: expected '/* Auto-generated by tools/generate-d…' to be '/* Auto-generated by tools/generate-d…' // Object.is equality

  • Expected
  • Received

@@ -2426,12 +2426,12 @@
"fix",
"single-pr",
"session-tracked",
"worktree-managed"
],

  •   "nodeCount": 17,
    
  •   "edgeCount": 19,
    
  •   "nodeCount": 16,
    
  •   "edgeCount": 16,
      "recommended": true,
      "enabled": true,
      "trigger": "trigger.manual",
      "variables": {},
      "metadata": {
    

@@ -2538,11 +2538,11 @@
"label": "Validate PR Is Still Open",
"config": {
"command": "node",
"args": [
"-e",

  •           "const {execFileSync}=require('child_process'); const repo=String(process.env.PR_REPO||'').trim(); const num=String(process.env.PR_NUMBER||'0').trim(); const fallbackBranch=String(process.env.PR_BRANCH||'').trim(); const fallbackBase=String(process.env.PR_BASE||'main').trim(); if(!repo||!num){console.log(JSON.stringify({ok:false,open:false,skip:true,reason:'missing_repo_or_number',repo,number:num,branch:fallbackBranch,base:fallbackBase}));process.exit(0);} try{   const raw=execFileSync('gh',['pr','view',num,'--repo',repo,'--json','state,isDraft,headRefName,baseRefName,url,mergedAt,closedAt'],{encoding:'utf8',stdio:['pipe','pipe','pipe'],timeout:30000}).trim();   const view=JSON.parse(raw||'{}');   const state=String(view?.state||'').trim().toUpperCase();   const isDraft=view?.isDraft===true;   const mergedAt=String(view?.mergedAt||'').trim()||null;   const closedAt=String(view?.closedAt||'').trim()||null;   const merged=state==='MERGED'||Boolean(mergedAt);   const open=state==='OPEN'&&!isDraft;   const branch=String(view?.headRefName||fallbackBranch||'').trim();   const base=String(view?.baseRefName||fallbackBase||'main').trim()||'main';   const targetTaskStatus=merged?'done':(state==='CLOSED'?'cancelled':null);   const shouldResolveTask=Boolean(targetTaskStatus);   const reason=open?'open':(merged?'pr_merged':(state==='CLOSED'?'pr_closed':(isDraft?'draft_pr':'pr_not_open')));   console.log(JSON.stringify({ok:open,open,skip:!open,reason,state,isDraft,merged,mergedAt,closedAt,shouldResolveTask,targetTaskStatus,repo,number:num,branch,base,url:String(view?.url||'').trim()||null})); }catch(err){   console.log(JSON.stringify({ok:false,open:false,skip:true,reason:'pr_view_failed',error:String(err?.message||err),repo,number:num,branch:fallbackBranch,base:fallbackBase})); }"
    
  •           "const {execFileSync}=require('child_process'); const repo=String(process.env.PR_REPO||'').trim(); const num=String(process.env.PR_NUMBER||'0').trim(); const fallbackBranch=String(process.env.PR_BRANCH||'').trim(); const fallbackBase=String(process.env.PR_BASE||'main').trim(); if(!repo||!num){console.log(JSON.stringify({ok:false,open:false,skip:true,reason:'missing_repo_or_number',repo,number:num,branch:fallbackBranch,base:fallbackBase}));process.exit(0);} try{   const raw=execFileSync('gh',['pr','view',num,'--repo',repo,'--json','state,isDraft,headRefName,baseRefName,url'],{encoding:'utf8',stdio:['pipe','pipe','pipe'],timeout:30000}).trim();   const view=JSON.parse(raw||'{}');   const state=String(view?.state||'').trim().toUpperCase();   const isDraft=view?.isDraft===true;   const open=state==='OPEN'&&!isDraft;   const branch=String(view?.headRefName||fallbackBranch||'').trim();   const base=String(view?.baseRefName||fallbackBase||'main').trim()||'main';   console.log(JSON.stringify({ok:open,open,skip:!open,reason:open?'open':(isDraft?'draft_pr':'pr_not_open'),state,isDraft,repo,number:num,branch,base,url:String(view?.url||'').trim()||null})); }catch(err){   console.log(JSON.stringify({ok:false,open:false,skip:true,reason:'pr_view_failed',error:String(err?.message||err),repo,number:num,branch:fallbackBranch,base:fallbackBase})); }"
            ],
            "parseJson": true,
            "continueOnError": true,
            "failOnError": false,
            "timeoutMs": 60000,
    

@@ -2553,45 +2553,10 @@
"PR_N

Error: Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996262215}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
❯ tests/workflow-guaranteed.test.mjs:211:26

Error: Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996398028}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
❯ tests/workflow-guaranteed.test.mjs:413:24

Error: Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996405441}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
❯ tests/workflow-guaranteed.test.mjs:444:24

Error: Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996412840}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
❯ tests/workflow-guaranteed.test.mjs:505:24

Error: Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996420235}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
❯ tests/workflow-guaranteed.test.mjs:549:24

Error: Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996427626}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
❯ tests/workflow-guaranteed.test.mjs:575:24

Error: Error: template-weekly-fitness-summary: 1 engine-level error(s):
• has-followups: {"nodeId":"has-followups","error":"Expression contains unsupported syntax","timestamp":1774996435013}
❯ Object.noEngineErrors tests/sandbox/template-harness.mjs:287:17
❯ tests/workflow-guaranteed.test.mjs:614:24
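
The one-line `node -e` script in the "Validate PR Is Still Open" diff above is hard to read in its minified form. Below is a rough unpacking of the classification logic from the longer of the two variants (the one that also requests `mergedAt` and `closedAt`). `classifyPrView` is a hypothetical name, the `gh pr view` call and the branch/base fallbacks are left out, and `view` stands for the parsed JSON that call returns.

```javascript
// Hypothetical helper name; the decision logic mirrors the inline script in the diff.
function classifyPrView(view) {
  const state = String(view?.state || "").trim().toUpperCase();
  const isDraft = view?.isDraft === true;
  const mergedAt = String(view?.mergedAt || "").trim() || null;
  const merged = state === "MERGED" || Boolean(mergedAt);
  // Only an open, non-draft PR is eligible for the fix workflow.
  const open = state === "OPEN" && !isDraft;

  // Merged PRs resolve the task as done, closed ones as cancelled.
  let targetTaskStatus = null;
  if (merged) targetTaskStatus = "done";
  else if (state === "CLOSED") targetTaskStatus = "cancelled";

  // Same precedence as the nested ternary in the original script.
  let reason = "pr_not_open";
  if (open) reason = "open";
  else if (merged) reason = "pr_merged";
  else if (state === "CLOSED") reason = "pr_closed";
  else if (isDraft) reason = "draft_pr";

  return {
    ok: open,
    open,
    skip: !open,
    reason,
    state,
    isDraft,
    merged,
    targetTaskStatus,
    shouldResolveTask: Boolean(targetTaskStatus),
  };
}

console.log(classifyPrView({ state: "MERGED" }).targetTaskStatus); // done
console.log(classifyPrView({ state: "OPEN", isDraft: true }).reason); // draft_pr
```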

Contributor

Copilot AI commented Mar 31, 2026

@copilot

⎯⎯⎯⎯⎯⎯⎯ Failed Tests 6 ⎯⎯⎯⎯⎯⎯⎯
...

Fixed in commit 01f22cf. The blockedPatterns array in condition.expression used [^\\w$.] in regex literals. In a JS regex literal, \\w is a literal backslash followed by w, not the word-character shorthand \w, so the class excluded only the backslash, w, $, and dot characters. As a result the eval pattern matched 'evaluate-fitness' (the ' before eval and the u after it both satisfied "not a backslash or w"), causing all template-weekly-fitness-summary tests to throw "Expression contains unsupported syntax". Fixed by changing [^\\w$.] → [^\w$.] and [^\\w$] → [^\w$].
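
A small reproduction of the difference, as a sketch. The exact entries of `blockedPatterns` are not shown in this conversation, so the pattern shape (a boundary class, `eval`, a trailing class) and the sample expression are assumptions chosen to match the description above.

```javascript
// Broken: inside a regex literal, \\ is an escaped backslash, so the class
// [^\\w$.] means "anything except backslash, w, $, or dot".
const brokenPattern = /(?:^|[^\\w$.])eval(?:[^\\w$]|$)/;

// Fixed: \w is the word-character shorthand, so the classes act as
// identifier boundaries and only a standalone `eval` is flagged.
const fixedPattern = /(?:^|[^\w$.])eval(?:[^\w$]|$)/;

// Hypothetical condition expression containing the string 'evaluate-fitness'.
const expression = "$data?.node === 'evaluate-fitness'";

console.log(brokenPattern.test(expression));   // true  (false positive)
console.log(fixedPattern.test(expression));    // false
console.log(fixedPattern.test("eval('1+1')")); // true  (a real eval call is still blocked)
```

In the broken version the quote before `eval` and the `u` after it both fall outside the excluded characters, so the harmless string is rejected as unsupported syntax.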

jaeko44 and others added 3 commits April 1, 2026 10:08
- Updated the workflow completion summary to provide more context when there are pending nodes on inactive branches.
- Improved messaging for terminal completed states to clarify when no further action is required.
- Introduced a new function `collectTaskBranchBindingCandidates` to gather potential branch candidates from task metadata.
- Enhanced the `trigger.task_available` node to log task branch binding details, including resolved branches and candidates.
- Refined regex patterns in condition expressions to improve security by preventing unsupported syntax.

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
…ithub.com/virtengine/bosun into bosun/codex-self-improvement-loop-commits

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>
… runtime metrics

Co-authored-by: bosun-ve[bot] <262908237+bosun-ve[bot]@users.noreply.github.com>

Labels

bosun-attached: Bosun PR attachment marker
bosun-pr-public: PR observed by Bosun but not trusted for high-risk automation


3 participants