feat(updater): tier 3 — auto update with grace window (#7607) (#7720)
* feat(updater): scheduled execution state + graceStartTag dedupe field (#7607)
Preparation for Tier 3 of the auto-update subsystem:
- ExecutionStatus gains `scheduled` (targetTag, scheduledFor, startedAt).
- EmailSendLog gains `graceStartTag` for one-shot grace-start email dedupe.
- state validator accepts the new shape, requires per-status fields,
and backfills graceStartTag=null on a Tier 1/2 state file.
Plus the implementation plan at
docs/superpowers/plans/2026-05-11-auto-update-pr3-tier3-auto.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): decideSchedule pure decision function (#7607)
Adds src/node/updater/Scheduler.ts with the Tier 3 pure decision logic:
- schedules when canAuto + idle/verified/terminal-cleared
- reschedules when a newer tag appears mid-grace
- emits a grace-start email (once per tag) when adminEmail is set
- cancels a stale schedule when policy flips canAuto off
- no-ops during in-flight / terminal states
- clamps preApplyGraceMinutes to [0, 7 days]
Also extends Notifier's EmailKind union with 'grace-start' so the
decision result types correctly.
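
A minimal sketch of the decision rules listed above, under a simplified state shape. The type and field names (`SketchInput`, `SketchDecision`, `latestTag`, `nowMs`, `sendGraceStartEmail`) are illustrative assumptions and are not taken from Scheduler.ts; only the rules and the `[0, 7 days]` clamp come from this commit message. It also assumes `latestTag` is non-null only when a newer release exists.

```typescript
// Hypothetical, simplified shapes; the real Scheduler.ts types are not shown here.
type SketchExecution =
  | { status: 'idle' }
  | { status: 'scheduled'; targetTag: string; scheduledFor: string }
  | { status: 'in-flight' };

type SketchDecision =
  | { action: 'none' }
  | { action: 'cancel-schedule' }
  | { action: 'schedule'; targetTag: string; scheduledFor: string; sendGraceStartEmail: boolean };

interface SketchInput {
  execution: SketchExecution;
  canAuto: boolean;
  latestTag: string | null; // assumed non-null only when a newer release exists
  preApplyGraceMinutes: number;
  adminEmail: string | null;
  graceStartTag: string | null; // tag of the last grace-start email (dedupe)
  nowMs: number;
}

const MAX_GRACE_MINUTES = 7 * 24 * 60; // one week

// Clamp to [0, 7 days]; treating non-finite input as 0 is an assumption of this sketch.
function clampGraceMinutes(minutes: number): number {
  if (!Number.isFinite(minutes)) return 0;
  return Math.min(Math.max(minutes, 0), MAX_GRACE_MINUTES);
}

function decideScheduleSketch(input: SketchInput): SketchDecision {
  const { execution, canAuto, latestTag } = input;
  // In-flight states are never touched by the scheduler.
  if (execution.status === 'in-flight') return { action: 'none' };
  // Policy no longer allows auto: drop a stale schedule, otherwise do nothing.
  if (!canAuto) {
    return execution.status === 'scheduled' ? { action: 'cancel-schedule' } : { action: 'none' };
  }
  if (latestTag === null) return { action: 'none' };
  // Already scheduled for this tag: keep the existing timer.
  if (execution.status === 'scheduled' && execution.targetTag === latestTag) {
    return { action: 'none' };
  }
  // Fresh schedule, or reschedule because a newer tag appeared mid-grace.
  const graceMs = clampGraceMinutes(input.preApplyGraceMinutes) * 60_000;
  return {
    action: 'schedule',
    targetTag: latestTag,
    scheduledFor: new Date(input.nowMs + graceMs).toISOString(),
    sendGraceStartEmail: input.adminEmail !== null && input.graceStartTag !== latestTag,
  };
}
```

Keeping the function pure (time and state are passed in, nothing is read or written) is what lets the caller apply the state transition, email, and timer arming as separate side effects.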
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): scheduler timer runner with arm/cancel (#7607)
Adds createSchedulerRunner to Scheduler.ts:
- arm(): clears any prior timer, sets a fresh one for scheduledFor
- cancel(): clears the pending timer, idempotent
- past scheduledFor → fires with delay=0 (rehydrate after restart-in-grace)
- single-fire-per-arm semantics; armedFor cleared on fire
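
The arm/cancel semantics above can be sketched roughly as follows. This is not the code in Scheduler.ts: the `TimerApi` injection, the `now` parameter, and every name ending in `Sketch` are assumptions made so the example can run without real delays. It also assumes `scheduledFor` is already a valid timestamp.

```typescript
// Hypothetical timer abstraction; defaults to the global setTimeout/clearTimeout.
interface TimerApi {
  set(callback: () => void, delayMs: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: TimerApi = {
  set: (callback, delayMs) => setTimeout(callback, delayMs),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

interface SchedulerRunnerSketch {
  arm(scheduledFor: string): void;
  cancel(): void;
  armedFor(): string | null;
}

function createSchedulerRunnerSketch(
  onFire: () => void,
  timers: TimerApi = realTimers,
  now: () => number = Date.now,
): SchedulerRunnerSketch {
  let pending: { handle: unknown } | null = null;
  let armedFor: string | null = null;

  // Idempotent: calling cancel() with nothing pending is a no-op.
  const cancel = (): void => {
    if (pending !== null) timers.clear(pending.handle);
    pending = null;
    armedFor = null;
  };

  const arm = (scheduledFor: string): void => {
    cancel(); // clear any prior timer before setting a fresh one
    // A scheduledFor in the past yields delay 0 (rehydrate after a restart in grace).
    const delayMs = Math.max(0, Date.parse(scheduledFor) - now());
    armedFor = scheduledFor;
    pending = {
      handle: timers.set(() => {
        // Single fire per arm: clear bookkeeping before invoking the callback.
        pending = null;
        armedFor = null;
        onFire();
      }, delayMs),
    };
  };

  return { arm, cancel, armedFor: () => armedFor };
}
```

Injecting the timer functions keeps the runner testable with a fake clock; production code would simply rely on the defaults.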
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(updater): extract apply pipeline shared by HTTP + scheduler (#7607)
Lifts the preflight → drain → execute orchestration out of the
/admin/update/apply HTTP handler into src/node/updater/applyPipeline.ts.
The HTTP handler keeps its 4xx status mapping; the pipeline owns the
state transitions, lock release, drain coordination, and rollback hand-
off. The new ApplyPipelineDeps interface accepts an onAccepted callback
so the HTTP path can still 202 mid-flow while the Tier 3 scheduler path
(next commit) can no-op.
Adds `scheduled` to the apply allowed-entry list so an admin can "Apply
now" during the Tier 3 grace window.
13 vitest cases cover happy / preflight-failed / cancelled / busy /
lock-held / scheduled-entry / rollback / lock-release. Existing 12
mocha integration tests still pass without change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): wire Tier 3 scheduler into boot + performCheck (#7607)
- expressCreateServer instantiates the scheduler runner and rehydrates
the timer when a prior boot left state.execution = scheduled
- performCheck evaluates decideSchedule after the notifier pass:
schedule transitions state + sends grace-start email + arms timer;
cancel-schedule resets to idle + cancels timer
- shutdown cancels the timer
- exposes cancelScheduler() so the cancel endpoint (next commit) can
drop the pending schedule
- buildSchedulerApplyDeps() supplies the full production-wired pipeline
deps (preflight, executor, rollback) for the scheduler-triggered apply
Adds tests/backend/specs/updater-scheduler-integration.ts covering
boot-rehydrate fire-on-past and the decision-to-state round-trip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): cancel handler supports Tier 3 scheduled state (#7607)
POST /admin/update/cancel now accepts execution.status === 'scheduled'
in addition to preflight/draining. The handler calls cancelScheduler()
to drop the pending in-process timer, then transitions state to idle
with lastResult.outcome = 'cancelled' (mirroring the existing pattern).
Adds a Tier 3 integration test that seeds a scheduled state, calls
/admin/update/cancel, and asserts the state machine landed correctly.
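
As a rough illustration of the cancel transition described above, the sketch below models it as a function over a simplified state object. The state shape and the names `CancelStateSketch` and `cancelUpdateSketch` are hypothetical, and calling `cancelScheduler()` only when the status is `scheduled` is an assumption of the sketch.

```typescript
// Hypothetical, simplified state shape; the real state file carries more fields.
interface CancelStateSketch {
  execution: { status: string; targetTag?: string; scheduledFor?: string };
  lastResult: { outcome: string } | null;
}

// Statuses the commit message names as cancellable.
const CANCELLABLE_STATUSES = new Set(['preflight', 'draining', 'scheduled']);

function cancelUpdateSketch(
  state: CancelStateSketch,
  cancelScheduler: () => void,
): { accepted: boolean; state: CancelStateSketch } {
  if (!CANCELLABLE_STATUSES.has(state.execution.status)) {
    // Nothing cancellable is in progress; leave state untouched.
    return { accepted: false, state };
  }
  // Drop the pending in-process timer first so it cannot fire after the reset.
  if (state.execution.status === 'scheduled') cancelScheduler();
  return {
    accepted: true,
    state: { execution: { status: 'idle' }, lastResult: { outcome: 'cancelled' } },
  };
}
```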
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(admin): countdown + cancel UI for Tier 3 scheduled updates (#7607)
- store.ts: extend Execution union with the scheduled variant
- UpdatePage.tsx: render countdown panel during scheduled; Apply button
is relabelled "Apply now" so the admin can skip the remaining grace;
Cancel button accepts scheduled state
- UpdateBanner.tsx: dedicated scheduled banner with live remaining time
- en.json: new i18n keys (execution.scheduled, banner.scheduled,
page.scheduled.{title,countdown,apply_now})
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(updater): playwright spec for Tier 3 scheduled UI (#7607)
Three cases against a mocked /admin/update/status:
- countdown panel + Apply now + Cancel render when execution is scheduled
- Cancel button posts /admin/update/cancel and triggers re-fetch
- /admin (banner) shows "Auto-update to <tag> scheduled" copy
Mirrors the existing update-page-actions.spec.ts mock pattern (page.route).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(updater): document Tier 3 auto with grace window (#7607)
- doc/admin/updates.md: flip Tier 3 from "designed, not yet implemented"
to current; expand preApplyGraceMinutes table row; add a Tier 3
section explaining schedule / cancel / Apply now / restart-in-grace
and the grace-start email
- settings.json.template: clarify the preApplyGraceMinutes comment
- CHANGELOG.md: Unreleased entry for Tier 3
- runbook §11: full Tier 3 smoke (happy, cancel, apply-now, restart-in-
grace, email) plus the additional sign-off checkboxes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(admin): UpdatePage handles missing execution field; scope spec locator (#7607)
Two CI fixes for PR #7720:
1. UpdatePage.tsx — optional-chain us.execution.status. Integration test
stubs (update-banner.spec.ts) ship payloads without the Tier 2/3
execution / lastResult / lockHeld fields; without optional chaining
on the new scheduled-derivation line the whole page crashed before
the h1 rendered, breaking the unrelated "renders current version"
test.
2. update-scheduled.spec.ts — scope the v2.7.2 assertion to the
.update-scheduled section. The regex was matching three elements
(banner, countdown panel, changelog link) and tripped Playwright's
strict-mode locator check.
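
The first fix comes down to the difference between `us.execution.status` and `us.execution?.status`. A small sketch, assuming a payload type in which the Tier 2/3 fields are optional (the interface and function names here are hypothetical, not the ones in UpdatePage.tsx):

```typescript
// Hypothetical payload shape: older stubs omit the Tier 2/3 fields entirely.
interface StatusPayloadSketch {
  currentVersion: string;
  execution?: { status: string; targetTag?: string; scheduledFor?: string };
}

// Optional chaining evaluates to undefined when `execution` is absent,
// so the comparison is simply false instead of throwing a TypeError.
function isScheduledSketch(us: StatusPayloadSketch): boolean {
  return us.execution?.status === 'scheduled';
}
```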
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(updater): address Qodo review (Tier 3 race conditions + tier-off bypass) (#7607)
Four fixes for bugs flagged by Qodo's review of PR #7720:
1. **Tier=off bypasses scheduler** (correctness). expressCreateServer
used to instantiate the scheduler and rehydrate any persisted
`scheduled` state regardless of `updates.tier`. A user who set
`tier: "off"` after a schedule had been persisted would still see
the timer fire after restart. The boot path now skips scheduler
creation when tier is off and explicitly clears a stale scheduled
state to idle (logged so the admin sees what happened).
2. **Timer fire skips state recheck** (reliability). The scheduler's
timer callback called applyUpdate() directly. Race: admin clicks
Cancel at the same instant the timer fires, or the tier flips
during the grace window. Now schedulerTriggerApply re-loads state
and re-evaluates policy via a new pure decideTriggerApply() helper
in Scheduler.ts. If state is no longer scheduled (or scheduled for a
different tag), aborts. If policy now denies auto, persists state
back to idle and aborts.
3. **Apply-now leaves scheduler timer armed** (correctness). The apply
endpoint accepts `scheduled` as an entry status but didn't cancel
the in-process scheduler timer. After the admin clicks Apply now,
the still-armed timer could later fire and attempt another apply
(especially if the manual one finishes in preflight-failed, which
is also an allowed-entry status). Apply handler now calls
cancelScheduler() when entering from `scheduled`.
4. **scheduledFor not validated as timestamp** (reliability). State
validator only required scheduledFor / startedAt etc. to be
non-empty strings; a hand-edited "scheduledFor": "garbage" would
pass validation and yield NaN delay → immediate fire. The
validator now requires known timestamp fields to be parseable
via Date.parse().
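
The recheck described in fix 2 could look something like the sketch below. The real `decideTriggerApply()` signature is not shown in this commit message, so the input and result types here are assumptions; only the three outcomes (apply, abort, reset to idle) follow the description.

```typescript
// Hypothetical result type for the fire-time recheck.
type TriggerDecisionSketch =
  | { action: 'apply'; targetTag: string }
  | { action: 'abort'; reason: 'not-scheduled' | 'tag-mismatch' }
  | { action: 'reset-to-idle'; reason: 'policy-denied' };

interface TriggerInputSketch {
  execution: { status: string; targetTag?: string }; // freshly re-loaded state
  armedTag: string; // tag the timer was armed for
  canAuto: boolean; // policy re-evaluated at fire time
}

function decideTriggerApplySketch(input: TriggerInputSketch): TriggerDecisionSketch {
  // Admin cancelled (or another path moved the state) while the timer was pending.
  if (input.execution.status !== 'scheduled') {
    return { action: 'abort', reason: 'not-scheduled' };
  }
  // A reschedule replaced the tag; a newer timer owns the apply.
  if (input.execution.targetTag !== input.armedTag) {
    return { action: 'abort', reason: 'tag-mismatch' };
  }
  // Tier flipped during the grace window: caller persists idle and stops.
  if (!input.canAuto) {
    return { action: 'reset-to-idle', reason: 'policy-denied' };
  }
  return { action: 'apply', targetTag: input.armedTag };
}
```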
Tests: 6 new decideTriggerApply cases + 3 new state.ts validation
cases. 189 vitest pass / 29 mocha integration pass / ts-check clean.
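
For the timestamp check in fix 4, a minimal sketch of a `Date.parse()`-based guard. The helper names and the choice of fields (`scheduledFor`, `startedAt`) are illustrative; the actual validator in state.ts may be structured differently.

```typescript
// True only for non-empty strings that Date.parse() turns into a real timestamp.
// Date.parse() returns NaN for unparseable input such as "garbage".
function isParseableTimestampSketch(value: unknown): value is string {
  return typeof value === 'string' && value.length > 0 && !Number.isNaN(Date.parse(value));
}

// Hypothetical check over the timestamp fields of a `scheduled` execution.
function validateScheduledSketch(execution: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const field of ['scheduledFor', 'startedAt']) {
    if (!isParseableTimestampSketch(execution[field])) {
      errors.push(`${field} must be a parseable timestamp`);
    }
  }
  return errors;
}
```

Rejecting the value at load time means the timer code never has to compute a delay from `NaN`.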
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CHANGELOG.md (+4 -1)
@@ -8,7 +8,10 @@
 - Terminal `rollback-failed` state surfaces a strong banner; the admin clicks Acknowledge once they've manually recovered to clear the lock and re-allow Tier 2 attempts.
 - New settings under `updates.*`: `preApplyGraceMinutes`, `drainSeconds`, `rollbackHealthCheckSeconds`, `diskSpaceMinMB`, `requireSignature`, `trustedKeysPath`. Tag signature verification is opt-in (default `false`) — see `doc/admin/updates.md` for the keyring setup.
 - **A process supervisor (systemd / pm2 / docker `--restart=unless-stopped`) is required to apply updates.** Without one, exit 75 leaves the instance down.
-- Tiers 3 (auto with grace window) and 4 (autonomous in maintenance window) remain designed but unimplemented and will land in subsequent releases.
+- **Self-update subsystem — Tier 3 (auto with grace window).**
+- On a git install, set `updates.tier: "auto"` to have new releases applied automatically after `preApplyGraceMinutes`. During the grace window, `/admin/update` shows a live countdown plus Cancel and Apply now buttons. Schedules are persisted to `var/update-state.json`, so an Etherpad restart during the grace window rehydrates the timer instead of losing the schedule. A new release tag detected mid-grace re-arms the timer; if `adminEmail` is set, a one-shot `grace-start` notification fires per scheduled tag (issue #7607).
+- The terminal `rollback-failed` state continues to disable auto/autonomous attempts globally until acknowledged; manual click stays available because an admin click *is* the intervention the terminal state requires.
+- Tier 4 (autonomous in a maintenance window) remains designed but unimplemented and will land in a subsequent release.
doc/admin/updates.md (+32 -2)
@@ -4,7 +4,7 @@ Etherpad ships with a built-in update subsystem.
 
 - **Tier 1 (notify)** — default. A banner appears in the admin UI when a new release is available, and pad users see a discreet badge if the running version is severely outdated or flagged as vulnerable. No execution.
 - **Tier 2 (manual click)** — admins on a git install can click "Apply update" at `/admin/update`. Etherpad drains active sessions, runs `git fetch / checkout / pnpm install / pnpm run build:ui`, and exits with code 75 so a process supervisor restarts it on the new version. Auto-rolls back on failure.
-- **Tier 3 (auto with grace window)** — designed, not yet implemented.
+- **Tier 3 (auto with grace window)** — opt-in. On a git install, a newly detected release transitions execution state to `scheduled` and is applied after `preApplyGraceMinutes`. During the grace window, `/admin/update` shows a live countdown plus Cancel and Apply now buttons; an admin email (if `adminEmail` is set) fires once per scheduled tag.
 - **Tier 4 (autonomous in maintenance window)** — designed, not yet implemented.
 
 ## Settings
@@ -42,7 +42,7 @@ In `settings.json`:
 | `updates.checkIntervalHours` | `6` | How often to poll GitHub Releases. |
 | `updates.githubRepo` | `"ether/etherpad"` | Override for forks. |
 | `updates.requireAdminForStatus` | `false` | Lock the `/admin/update/status` endpoint to authenticated admin sessions. Default `false` matches existing Etherpad behavior — `/health` already exposes `releaseId` publicly, and changelog data comes from a public GitHub release. Set `true` to hide the full update payload from non-admins without disabling the updater (`tier: "off"` is the heavier opt-out that removes the endpoints entirely). |
-| `updates.preApplyGraceMinutes` | `0` | **Tier 3 only.** Wait this many minutes between detecting a new release and starting the drain so the admin can cancel. Has no effect at tier `"manual"`. |
+| `updates.preApplyGraceMinutes` | `0` | **Tier 3 only.** Wait this many minutes between detecting a new release and starting the drain so the admin can cancel via `/admin/update`. `0` applies immediately when allowed. Clamped to `[0, 7*24*60]` (one week). Has no effect at tier `"manual"`. |
 | `updates.drainSeconds` | `60` | How long to broadcast "restart imminent" announcements to active pads before exiting. T-60 / T-30 / T-10 broadcasts fire automatically at the matching offsets within this window. |
 | `updates.rollbackHealthCheckSeconds` | `60` | After a fresh boot post-update, give `/health` this long to come up. If it doesn't, RollbackHandler restores the previous SHA. |
 | `updates.diskSpaceMinMB` | `500` | Pre-flight refuses to start an update unless the install volume has at least this many MB free. |
@@ -156,6 +156,36 @@ The check shells out to `git verify-tag <tag>`. The keyring at `trustedKeysPath`
 
 Tier 2 deliberately refuses to apply on `installMethod: "docker"` because in-container `git fetch / pnpm install / build:ui` doesn't survive a container restart — the orchestrator brings the container back up on the same image tag and the work is lost. Docker installs stay on Tier 1 (banner + version status) for now.
 
+## Tier 3 — auto with grace window
+
+Tier 3 builds on Tier 2 by scheduling the apply automatically when a new release is detected. The same `git fetch / checkout / pnpm install / build:ui / exit 75` pipeline runs — only the trigger changes.
+
+To enable, on a git install: set `updates.tier: "auto"` and (optionally) `updates.preApplyGraceMinutes` to the grace duration you want.
+
+### What happens when a new release lands
+
+1. The periodic version checker (`updates.checkIntervalHours`) hits GitHub Releases.
+2. If `policy.canAuto` is true (install is git, no terminal `rollback-failed` state, tier is `"auto"` or `"autonomous"`), the scheduler transitions `execution.status` to `scheduled` with `scheduledFor = now + preApplyGraceMinutes`.
+3. The schedule is persisted to `var/update-state.json`, so an Etherpad restart inside the grace window rehydrates the timer rather than losing the schedule.
+4. `/admin/update` shows a live countdown panel plus two buttons:
+- **Cancel** — `POST /admin/update/cancel` returns the state to `idle` and drops the in-process timer.
+- **Apply now** — `POST /admin/update/apply` skips the remaining grace; the regular Tier 2 pipeline runs immediately.
+5. When the timer fires, the scheduler runs the exact same pipeline as a manual Tier 2 click: pre-flight → drain → execute → exit 75.
+
+### Re-scheduling and stale state
+
+- If a newer release tag appears while a schedule is pending, the scheduler re-arms the timer for the new tag. The `email.graceStartTag` dedupe field guards against duplicate `grace-start` notifications.
+- If `updates.tier` is flipped back to `"manual"` or `"notify"` while a schedule is pending, the next periodic check cancels the schedule (state back to `idle`).
+- `rollback-failed` disables Tier 3 globally. The admin must `POST /admin/update/acknowledge` (or visit `/admin/update` and click Acknowledge) before any further auto-schedules are armed. Tier 2 manual click stays available because the admin click *is* the intervention the terminal state requires.
+
+### Email (`adminEmail` set)
+
+A single `grace-start` notification fires per scheduled tag:
+
+> [Etherpad] Auto-update scheduled for 2.7.2
+
+with the `scheduledFor` timestamp. Etherpad core does not yet wire SMTP; the message logs as `(would send email)` until a future PR adds a transport. Cadence and dedupe still update correctly.
+
 The right way to give docker admins an in-product Apply button is to delegate to the orchestrator rather than mutate the container. Two patterns to consider in a follow-up PR:
 
 - **Instructions-only.** When the page detects `installMethod: docker` *and* a newer release exists, swap the policy-denial copy for actionable instructions (`docker pull etherpad/etherpad:<tag>` for plain docker; `docker compose pull && docker compose up -d` for compose). Cheap, no new attack surface.