
Commit 421b83c

k8s: shared-cluster safety checks and deployment-id decoupling (#748)
- **Kind extraMount compatibility**: fail fast at `deployment start` when a new deployment's mounts don't match the running cluster; warn when the first cluster is created without a `kind-mount-root` umbrella; replace the cryptic `ConfigException` with readable errors when the cluster is missing
- **Auto-ConfigMap for file-level host-path compose volumes** (so-7fc): `../config/foo.sh:/opt/foo.sh`-style binds become per-namespace ConfigMaps at deploy start instead of aliasing via the kind extraMount chain. `deploy create` rejects `:rw`, subdirs, and over-budget sources. Deployment-dir layout unchanged
- **Namespace ownership**: stamp the namespace with `laconic.com/deployment-dir` on create; fail loudly if another deployment tries to land in the same namespace. Pre-existing namespaces adopt ownership on next start
- **deployment-id / cluster-id decoupling**: split the two roles (kube context vs resource-name prefix) into separate `deployment.yml` fields. Backward-compat fallback keeps existing resource names stable
- Close stale pebbles `so-n1n` and `so-ad7`
1 parent eb4704b commit 421b83c

10 files changed

Lines changed: 772 additions & 65 deletions


.pebbles/events.jsonl

Lines changed: 6 additions & 0 deletions
@@ -46,3 +46,9 @@
{"type":"comment","timestamp":"2026-04-17T08:13:32.753112339Z","issue_id":"so-o2o","payload":{"body":"Tested the version-detection fix (commit 832ab66d) locally. Fix works for its scope but surfaces two more bugs downstream. Current approach is broken at the architectural level, not just one-bug-fixable.\n\nWhat 832ab66d does: captures etcd image ref from crictl after cluster create, writes to {backup_dir}/etcd-image.txt, reads it on subsequent cleanup runs. Self-adapts to Kind upgrades. No more hardcoded v3.5.9. Confirmed locally: etcd-image.txt is written after first create, cleanup on second start uses it, member.backup-YYYYMMDD-HHMMSS dir is produced (proves cleanup ran end-to-end).\n\nWhat still fails after version fix: kubeadm init on cluster recreate. apiserver comes up but returns:\n- 403 Forbidden: User \"kubernetes-admin\" cannot get path /livez\n- 500: Body was not decodable ... json: cannot unmarshal array into Go value of type struct\n- eventually times out waiting for apiserver /livez\n\nTwo new bugs behind those:\n\n(a) Restore step corrupts binary values. In _clean_etcd_keeping_certs the restore loop is:\n    key=$(echo $encoded | base64 -d | jq -r .key | base64 -d)\n    val=$(echo $encoded | base64 -d | jq -r .value | base64 -d)\n    echo \"$val\" | /backup/etcdctl put \"$key\"\nk8s stores objects as protobuf. Piping raw protobuf through bash variable expansion + echo mangles non-printable bytes, truncates at null bytes, and appends a trailing newline. Explains the \"cannot unmarshal\" from apiserver — the kubernetes Service/Endpoints objects in /registry are corrupted on re-put.\n\n(b) Whitelist is too narrow. We keep only /registry/secrets/caddy-system and the /registry/services entries for kubernetes. Everything else is deleted — including /registry/clusterrolebindings (cluster-admin is gone), /registry/serviceaccounts, /registry/secrets/kube-system (bootstrap tokens), RBAC roles, apiserver's auth config. Explains the 403 for kubernetes-admin — cluster-admin binding doesn't exist yet and kubeadm's pre-addon health check can't authorize.\n\nFixing (a) would mean rewriting the restore step to not use shell piping — either use a proper etcdctl-based Go tool, or write directly to the on-disk snapshot format. Fixing (b) means exhaustively whitelisting everything kubeadm/apiserver bootstrapping needs — a moving target across k8s versions. Both together are a significant undertaking for the actual requirement (\"keep 4 Caddy secrets across cluster recreate\").\n\nDecision: merge 832ab66d for the narrow version-detection fix + diagnosis trail, then implement the kubectl-level backup/restore on a separate branch. The etcd approach is not salvageable at reasonable cost."}}
{"type":"comment","timestamp":"2026-04-17T11:04:26.542659482Z","issue_id":"so-o2o","payload":{"body":"Shipped in PR #746. Etcd-persistence approach replaced with a kubectl-level Caddy Secret backup/restore gated on kind-mount-root.\n\nSummary of what landed:\n- components/ingress/caddy-cert-backup.yaml: SA/Role/RoleBinding + CronJob (alpine/kubectl:1.35.3) firing every 5min, writes {kind-mount-root}/caddy-cert-backup/caddy-secrets.yaml via atomic tmp+rename.\n- install_ingress_for_kind splits into 3 phases: pre-Deployment manifests → _restore_caddy_certs (kubectl apply from backup file) → Caddy Deployment → _install_caddy_cert_backup. Caddy pod can't exist until phase 3, so certs are always in place before secret_store startup.\n- Deleted _clean_etcd_keeping_certs, _get_etcd_host_path_from_kind_config, _capture_etcd_image, _read_etcd_image_ref, _etcd_image_ref_path and the etcd+PKI block in _generate_kind_mounts.\n- No new spec keys.\n\nTest coverage in tests/k8s-deploy/run-deploy-test.sh: install assertion after first --perform-cluster-management start, plus full E2E (seed fake manager=caddy Secret → trigger CronJob → verify backup file → stop/start --perform-cluster-management for cluster recreate → assert secret restored with matching decoded value).\n\nWoodburn migration: one-shot host-kubectl export to seed {kind-mount-root}/caddy-cert-backup/caddy-secrets.yaml was done manually on the running cluster (the in-cluster CronJob couldn't reach the host because the /srv/kind → /mnt extraMount was staged in kind-config.yml but never applied to the running cluster — it was added after cluster creation). File is in place for the eventual cluster recreate."}}
{"type":"close","timestamp":"2026-04-17T11:04:26.999711375Z","issue_id":"so-o2o","payload":{}}
{"type":"create","timestamp":"2026-04-20T13:14:26.312724048Z","issue_id":"so-7fc","payload":{"description":"## Problem\n\nFile-level host-path compose volumes (e.g. `../config/foo.sh:/opt/foo.sh`) were synthesized into a kind extraMount + k8s hostPath PV chain with a sanitized containerPath (`/mnt/host-path-\u003csanitized\u003e`).\n\n- On kind: two deployments of the same stack sharing a cluster collide at that containerPath — kind only honors the first deployment's bind, so subsequent deployments' pods silently read the first's file. No error, no warning.\n- On real k8s: the same code emits `hostPath: /mnt/host-path-*` but nothing populates that path on worker nodes — effectively broken.\n\nFile-level host-path binds are conceptually k8s ConfigMaps. The `snowballtools-base-backend` stack already uses the ConfigMap-backed named-volume pattern manually; this issue is to make that automatic for all stacks.\n\n## Resolution\n\nImplemented on branch `feat/so-b86-auto-configmap-host-path` (commit `cb84388d`), stacked on top of `feat/kind-mount-invariant-check`.\n\n**No deployment-dir file rewriting.** Compose files, spec.yml, and `{deployment_dir}/config/\u003cpod\u003e/` are untouched — trivially diffable against stack source, no synthetic volume names. ConfigMaps are materialized at deploy start and visible only in k8s (`kubectl get cm -n \u003cns\u003e`).\n\n### Deploy create — validation only\n\n| Source shape | Behavior |\n|---|---|\n| Single file | Accepted |\n| Flat directory, no subdirs, ≤ ~700 KiB | Accepted |\n| Directory with subdirs | `DeployerException` — guidance: embed in image / split configmaps / initContainer |\n| File or directory \u003e ~700 KiB | `DeployerException` — ConfigMap budget (accounts for base64 + metadata) |\n| `:rw` on any host-path bind | `DeployerException` — use a named volume for writable data |\n\n### Deploy start — k8s object generation\n\n- `cluster_info.get_configmaps()` walks pod + job compose volumes and emits a `V1ConfigMap` per host-path bind (deduped by sanitized name), content read from `{deployment_dir}/config/\u003cpod\u003e/\u003cfile\u003e`.\n- `volumes_for_pod_files` emits `V1ConfigMapVolumeSource` instead of `V1HostPathVolumeSource` for host-path binds.\n- `volume_mounts_for_service` stats the source and sets `V1VolumeMount.sub_path` to the filename when source is a regular file.\n- `_generate_kind_mounts` no longer emits `/mnt/host-path-*` extraMounts — ConfigMap path bypasses the kind node FS entirely.\n\n### Transition\n\nThe `/mnt/host-path-*` skip in `check_mounts_compatible` is retained as a transition tolerance for deployments created before this change. Test coverage in `tests/k8s-deploy/run-deploy-test.sh` asserts host-path ConfigMaps exist in the namespace, compose/spec in deployment dir unchanged, and no `/mnt/host-path-*` entries in kind-config.yml.","priority":"2","title":"File-level host-path compose volumes alias across deployments sharing a kind cluster","type":"bug"}}
{"type":"status_update","timestamp":"2026-04-20T13:14:26.833816262Z","issue_id":"so-7fc","payload":{"status":"closed"}}
{"type":"comment","timestamp":"2026-04-21T05:57:12.476299839Z","issue_id":"so-n1n","payload":{"body":"Already merged: 929bdab8 is an ancestor of origin/main; all four extraMount emit sites in helpers.py carry `propagation: HostToContainer` (umbrella, per-volume named, per-volume host-path, high-memlock spec)."}}
{"type":"status_update","timestamp":"2026-04-21T05:57:12.928842469Z","issue_id":"so-n1n","payload":{"status":"closed"}}
{"type":"comment","timestamp":"2026-04-21T06:08:13.933886638Z","issue_id":"so-ad7","payload":{"body":"Fixed in PR #744 (cf8b7533). get_services() now includes the maintenance pod in the container-ports map so its per-pod Service is built and available for the Ingress swap."}}
{"type":"status_update","timestamp":"2026-04-21T06:08:14.457815115Z","issue_id":"so-ad7","payload":{"status":"closed"}}

docs/deployment_patterns.md

Lines changed: 138 additions & 1 deletion
@@ -164,6 +164,44 @@ To stop a single deployment without affecting the cluster:
laconic-so deployment --dir my-deployment stop --skip-cluster-management
```

Stacks sharing a cluster must agree on mount topology. See
[Volume Persistence in k8s-kind](#volume-persistence-in-k8s-kind).

### cluster-id vs deployment-id

Each deployment's `deployment.yml` carries two identifiers with
different roles:

- **`cluster-id`** — which kind cluster this deployment attaches to.
  Used for the kube-config context name (`kind-{cluster-id}`) and for
  kind lifecycle ops. Inherited from the running cluster at
  `deploy create` time when one exists; freshly generated otherwise.
  Shared across every deployment that joins the same cluster.
- **`deployment-id`** — this particular deployment's identity.
  Generated fresh on every `deploy create` and never inherited. Flows
  into `app_name`, the prefix on every k8s resource name this
  deployment creates (PVs, ConfigMaps, Deployments, PVCs, …). Distinct
  per deployment even when the cluster is shared.

The split prevents silent resource-name collisions between
deployments sharing a cluster: two deployments of the same stack,
or any two deployments that happen to declare a volume with the same
name, still produce distinct `{deployment-id}-{vol}` PV names.

**Backward compatibility**: `deployment.yml` files written before the
`deployment-id` field existed fall back to using `cluster-id` as the
deployment-id. Existing resource names stay stable across this
upgrade — no PV renames, no re-bind, no data orphaning. The next
`deploy create` writes both fields going forward.

**Namespace ownership**: on top of distinct resource names, SO stamps
the k8s namespace with a `laconic.com/deployment-dir` annotation on
first creation. A subsequent `deployment start` from a different
deployment directory that would land in the same namespace fails
with a `DeployerException` pointing at the `namespace:` spec
override. Catches operator-error cases where the same deployment dir
is effectively registered twice.
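The namespace-ownership flow (stamp on first creation, adopt when unstamped, fail loudly on mismatch) can be sketched in a few lines. This is an illustrative model, not the SO source: `claim_namespace` and the plain-dict `annotations` stand in for the real k8s API calls, and the error text is invented; only the `laconic.com/deployment-dir` key and the `DeployerException` name come from the change itself.

```python
# Illustrative sketch of the namespace-ownership check; claim_namespace
# and the in-memory "annotations" dict are hypothetical stand-ins.
OWNER_KEY = "laconic.com/deployment-dir"


class DeployerException(Exception):
    pass


def claim_namespace(annotations: dict, deployment_dir: str) -> None:
    """Stamp ownership on first use; adopt if unstamped; reject a mismatch."""
    owner = annotations.get(OWNER_KEY)
    if owner is None:
        # First creation, or a pre-existing namespace: adopt ownership.
        annotations[OWNER_KEY] = deployment_dir
    elif owner != deployment_dir:
        raise DeployerException(
            f"namespace already owned by deployment dir {owner}; "
            "use a distinct 'namespace:' override in the spec"
        )


ns: dict = {}
claim_namespace(ns, "/srv/deployments/stack-a")  # stamps ownership
claim_namespace(ns, "/srv/deployments/stack-a")  # same dir: no-op
try:
    claim_namespace(ns, "/srv/deployments/stack-b")  # different dir: fails
except DeployerException as e:
    print("rejected:", e)
```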

## Volume Persistence in k8s-kind

k8s-kind has 3 storage layers:
@@ -172,7 +210,9 @@ k8s-kind has 3 storage layers:
- **Kind Node**: A Docker container simulating a k8s node
- **Pod Container**: Your workload

-For k8s-kind, volumes with paths are mounted from Docker Host → Kind Node → Pod via extraMounts.
+Volumes with paths are mounted from Docker Host → Kind Node → Pod via kind
+`extraMounts`. Kind applies `extraMounts` only at cluster creation — they
+cannot be added to a running cluster.

| spec.yml volume | Storage Location | Survives Pod Restart | Survives Cluster Restart |
|-----------------|------------------|---------------------|-------------------------|
@@ -200,3 +240,100 @@ Empty-path volumes appear persistent because they survive pod restarts (data lives
in Kind Node container). However, this data is lost when the kind cluster is
recreated. This "false persistence" has caused data loss when operators assumed
their data was safe.

### Shared Clusters: Use `kind-mount-root`

Because kind `extraMounts` can only be set at cluster creation, the first
deployment to start locks in the mount topology. Later deployments that
declare new `extraMounts` have them silently ignored — their PVs fall
through to the kind node's overlay filesystem and lose data on cluster
destroy.

The fix is an umbrella mount. Set `kind-mount-root` in the spec, pointing
at a host directory all stacks will share:

```yaml
# spec.yml
kind-mount-root: /srv/kind

volumes:
  my-data: /srv/kind/my-stack/data  # visible at /mnt/my-stack/data in-node
```

SO emits a single `extraMount` (`<kind-mount-root>` → `/mnt`). Any new
host subdirectory under the root is visible in the node immediately — no
cluster recreate needed to add stacks.

**All stacks sharing a cluster must agree on `kind-mount-root`** and keep
their host paths under it.
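The umbrella-mount path translation can be sketched as follows. This is a hypothetical helper, not the SO implementation; the only facts taken from the doc are the `<kind-mount-root>` → `/mnt` mapping and the rule that host paths must sit under the root.

```python
from pathlib import PurePosixPath


def in_node_path(kind_mount_root: str, host_path: str) -> str:
    """Map a host path under kind-mount-root to its in-node path under /mnt."""
    root = PurePosixPath(kind_mount_root)
    path = PurePosixPath(host_path)
    if not path.is_relative_to(root):
        raise ValueError(f"{host_path} is not under kind-mount-root {root}")
    # Replace the kind-mount-root prefix with the node-side mount point.
    return str(PurePosixPath("/mnt") / path.relative_to(root))


print(in_node_path("/srv/kind", "/srv/kind/my-stack/data"))  # /mnt/my-stack/data
```

A path outside the root raises, mirroring the requirement that all stacks keep their host paths under the shared `kind-mount-root`.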

### Mount Compatibility Enforcement

`laconic-so deployment start` validates mount topology:

- **On first cluster creation** without an umbrella mount: prints a
  warning (future stacks may require a full recreate to add mounts).
- **On cluster reuse**: compares the new deployment's `extraMounts`
  against the live mounts on the control-plane container. Any mismatch
  (wrong host path, or mount missing) fails the deploy.
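The cluster-reuse comparison reduces to a dictionary diff. A minimal sketch, assuming mounts are modeled as `containerPath` to `hostPath` maps; the real `check_mounts_compatible` inspects the live control-plane container, and the helper name here is invented.

```python
def find_mount_mismatches(wanted: dict, live: dict) -> list:
    """Return one message per extraMount that is missing or points elsewhere."""
    problems = []
    for container_path, host_path in wanted.items():
        if container_path not in live:
            problems.append(f"mount missing from cluster: {container_path}")
        elif live[container_path] != host_path:
            problems.append(
                f"wrong host path for {container_path}: "
                f"cluster has {live[container_path]}, deployment wants {host_path}"
            )
    return problems


live = {"/mnt": "/srv/kind"}
assert find_mount_mismatches({"/mnt": "/srv/kind"}, live) == []     # compatible
assert len(find_mount_mismatches({"/mnt": "/data"}, live)) == 1     # wrong host path
assert len(find_mount_mismatches({"/opt/x": "/srv/x"}, live)) == 1  # missing mount
```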

### Static files in compose volumes → auto-ConfigMap

Compose volumes that bind a host file or flat directory into a container
(e.g. `../config/test/script.sh:/opt/run.sh`) are used to inject static
content that ships with the stack. k8s doesn't have a native notion of
this — the canonical way to inject static content is a ConfigMap.

At `deploy start`, laconic-so auto-generates a namespace-scoped
ConfigMap per host-path compose volume (deduped by source) and mounts
it into the pod instead of routing the bind through the kind node:

| Source shape | Behavior |
|---|---|
| Single file | ConfigMap with one key (the filename); pod mount uses `subPath` so the single key lands at the compose target path |
| Flat directory (no subdirs, ≤ ~700 KiB) | ConfigMap with one key per file; pod mount exposes all keys at the target path |
| Directory with subdirs, or over budget | Rejected at `deploy create` — embed in the container image, split into multiple ConfigMaps, or use an initContainer |
| `:rw` on any host-path bind | Rejected at `deploy create` — use a named volume with a spec-configured host path for writable data |

The deployment dir layout is unchanged: compose files stay verbatim and
`spec.yml` is not rewritten. Source files remain under
`{deployment_dir}/config/{pod}/` (as copied by `deploy create`); the
ConfigMap is built from them at deploy start and no kind extraMount is
emitted for these paths.

This works identically on kind and real k8s (ConfigMaps are
cluster-native; no node-side landing pad required), and two deployments
of the same stack sharing a cluster get their own per-namespace
ConfigMaps — no aliasing.
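The `deploy create` validation rules in the table can be sketched as one function. This is hypothetical code, not the laconic-so source: the exact budget arithmetic and messages differ, and the real checks raise `DeployerException` rather than `ValueError`.

```python
import os

# ~700 KiB, leaving headroom under the 1 MiB ConfigMap cap for base64 + metadata.
CONFIGMAP_BUDGET = 700 * 1024


def validate_host_path_bind(source: str, mode: str = "ro") -> None:
    """Reject bind sources that cannot become a single ConfigMap."""
    if mode == "rw":
        raise ValueError(":rw bind: use a named volume for writable data")
    if os.path.isfile(source):
        if os.path.getsize(source) > CONFIGMAP_BUDGET:
            raise ValueError("file exceeds ConfigMap budget")
        return
    total = 0
    for entry in os.scandir(source):
        if entry.is_dir():
            raise ValueError(
                "directory has subdirs: embed in the image, split into "
                "multiple configmaps, or use an initContainer"
            )
        total += entry.stat().st_size
    if total > CONFIGMAP_BUDGET:
        raise ValueError("directory exceeds ConfigMap budget")
```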

### Writable / generated data → named volume + host path

For volumes the workload *writes to* (databases, ledgers, caches, logs),
use a named volume backed by a spec-configured host path under
`kind-mount-root`:

```yaml
# compose
volumes:
  - my-data:/var/lib/foo

# spec.yml
kind-mount-root: /srv/kind
volumes:
  my-data: /srv/kind/my-stack/data
```

Works on both kind (via the umbrella mount) and real k8s (operator
provisions `/srv/kind/my-stack/data` on each node).

### Migrating an Existing Cluster

If a cluster was created without an umbrella mount and you need to add a
stack that requires new host-path mounts, the cluster must be recreated:

1. Back up ephemeral state (DBs, caches) from PVs that lack host mounts —
   these are in the kind node overlay FS and do not survive `kind delete`.
2. Update every stack's spec to set a shared `kind-mount-root` and place
   host paths under it.
3. Stop all deployments, destroy the cluster, recreate it by starting any
   stack (umbrella now active), and restore state.

stack_orchestrator/constants.py

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@
k8s_kind_deploy_type = "k8s-kind"
k8s_deploy_type = "k8s"
cluster_id_key = "cluster-id"
deployment_id_key = "deployment-id"
kube_config_key = "kube-config"
deploy_to_key = "deploy-to"
network_key = "network"

stack_orchestrator/deploy/deployment_context.py

Lines changed: 27 additions & 0 deletions
@@ -26,6 +26,7 @@
class DeploymentContext:
    deployment_dir: Path
    id: str
    deployment_id: str
    spec: Spec
    stack: Stack

@@ -48,8 +49,27 @@ def get_compose_file(self, name: str):
        return self.get_compose_dir() / f"docker-compose-{name}.yml"

    def get_cluster_id(self):
        """Identifier of the kind cluster this deployment attaches to.

        Shared across deployments that join the same kind cluster. Used
        for the kube-config context name (`kind-{cluster-id}`) and for
        kind cluster lifecycle ops.
        """
        return self.id

    def get_deployment_id(self):
        """Identifier of this particular deployment's k8s resources.

        Distinct per deployment even when multiple deployments share a
        cluster. Used as compose_project_name → app_name → prefix for
        all k8s resource names (PVs, ConfigMaps, Deployments, …).

        Backward compat: for deployment.yml files written before this
        field existed, falls back to cluster-id so existing on-disk
        resource names remain stable (no PV renames, no re-bind).
        """
        return self.deployment_id

    def init(self, dir: Path):
        self.deployment_dir = dir.absolute()
        self.spec = Spec()

@@ -60,6 +80,12 @@ def init(self, dir: Path):
        if deployment_file_path.exists():
            obj = get_yaml().load(open(deployment_file_path, "r"))
            self.id = obj[constants.cluster_id_key]
            # Fallback to cluster-id for deployments created before the
            # deployment-id field was introduced. Keeps existing resource
            # names stable across this upgrade.
            self.deployment_id = obj.get(
                constants.deployment_id_key, self.id
            )
        # Handle the case of a legacy deployment with no file
        # Code below is intended to match the output from _make_default_cluster_name()
        # TODO: remove when we no longer need to support legacy deployments

@@ -68,6 +94,7 @@ def init(self, dir: Path):
            unique_cluster_descriptor = f"{path},{self.get_stack_file()},None,None"
            hash = hashlib.md5(unique_cluster_descriptor.encode()).hexdigest()[:16]
            self.id = f"{constants.cluster_name_prefix}{hash}"
            self.deployment_id = self.id

    def modify_yaml(self, file_path: Path, modifier_func):
        """Load a YAML, apply a modification function, and write it back."""
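The backward-compat fallback in `init` above is just `dict.get` with the cluster-id as default. A standalone demonstration (the literal keys come from `deployment.yml`; the ID values are made up):

```python
# deployment.yml written before the deployment-id field existed:
legacy = {"cluster-id": "laconic-abc123"}
# deployment.yml written by a current `deploy create`:
current = {"cluster-id": "laconic-abc123", "deployment-id": "laconic-def456"}


def deployment_id(obj: dict) -> str:
    # Same shape as the init() fallback: prefer deployment-id, else cluster-id.
    return obj.get("deployment-id", obj["cluster-id"])


assert deployment_id(legacy) == "laconic-abc123"   # legacy: resource names stable
assert deployment_id(current) == "laconic-def456"  # new: distinct per deployment
```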
