Draft
41 commits
c419451
Add Helm charts for FASTDB deployment
fifteen3 Jan 29, 2026
c147129
Add deploy script, Helm-managed registry secret, and documentation
fifteen3 Jan 29, 2026
a172720
Add a script for configuring kubectl to access the SLAC desc-fastdb v…
fifteen3 Jan 30, 2026
4dd9b1c
Add --external-url flag for subdirectory frontend builds
fifteen3 Jan 30, 2026
5e52543
Update HELM_HOWTO.md to be consistent with helm-deploy.sh
fifteen3 Feb 2, 2026
68dafd4
Clarify helm-deploy.sh is a convenience script, not required
fifteen3 Feb 2, 2026
3ea38e5
Add --context and --create-cluster flags to helm-deploy.sh
fifteen3 Feb 3, 2026
25543f4
Make shared PVC access modes configurable, fix Kind local deploy
fifteen3 Feb 3, 2026
1833312
fix: Add / as the value for EXTERNAL URL so that javacsript imports …
fifteen3 Feb 3, 2026
e2c0aed
Add podman support, --load-images flag, and MetalLB annotation
Feb 4, 2026
97fd7f5
Add HOWTO_LOGIN.md for user creation and password-reset workflow
Feb 4, 2026
126ce7b
Intermediate state with kind cluster and postgres replication via s3 …
Mar 27, 2026
875c926
Match setting namespace for new yaml files
Mar 27, 2026
39be605
Updates to get postgres working at slac.
Mar 27, 2026
835b0a6
Nersc setup with external ingress is working now for one site
Apr 3, 2026
a0c9081
Tweaks to get SLAC working
Apr 3, 2026
780628f
add missing files
Apr 3, 2026
e7d2fc6
creating a new filters overview page in the docs
jscora Apr 13, 2026
ecde55b
adding details on how to create a fink filter
jscora Apr 13, 2026
59b6e5f
working on what are filters
jscora Apr 13, 2026
ae7c9e6
isort formatting
taiwithers Apr 10, 2026
cc346d8
spell/sphinx warnings
taiwithers Apr 10, 2026
633cfc8
minor formatting
jscora Apr 13, 2026
784990d
Merge branch 'main' of https://github.com/LSSTDESC/FASTDB into sidrat…
jscora Apr 13, 2026
4b00909
working on ANTARES documentation
jscora Apr 14, 2026
2030998
Update docs/filters.rst to use schema instead of table
jscora Apr 14, 2026
e7ec4cc
working on ANTARES filter creation and testing
jscora Apr 14, 2026
878a4e9
Merge branch 'sidrat-staging' of https://github.com/sidratresearch/FA…
jscora Apr 14, 2026
b6e3357
added in more required alert data to address comments, some other min…
jscora Apr 14, 2026
da9e637
minor formatting change
jscora Apr 14, 2026
9172f2a
removed brokers that aren't of interest at the moment
jscora Apr 15, 2026
58e097b
minor wording edit
jscora Apr 15, 2026
40cc949
wording updates
jscora Apr 15, 2026
87cfd57
removed vscode settings file
jscora Apr 15, 2026
99d06e7
misc wording updates
jscora Apr 15, 2026
217a48f
filters - add description for Pitt-Google
taiwithers Apr 16, 2026
e897b90
removed not required parameters
jscora Apr 16, 2026
9d831cb
adding a little more information for each broker
jscora Apr 16, 2026
7be3ff0
added back in the mostly required parameters
jscora Apr 16, 2026
af3ead1
Merge pull request #80 from sidratresearch/sidrat-staging
rknop Apr 16, 2026
5b19636
Update code to work with tests and docker-compose.yaml
Apr 28, 2026
177 changes: 177 additions & 0 deletions ADR-HELM-DEPLOY.md
@@ -0,0 +1,177 @@
# ADR: Deploy Script with kubectl exec Tar Pipe for Helm Code Delivery

**Status:** Accepted
**Date:** 2026-01-29
**Decision:** Use a deploy script that copies build artifacts to a shared PVC via `kubectl exec` + tar pipe, rather than baking code into images or using Helm hooks

## Context

FASTDB's Helm deployment needs application code (Python, SQL migrations, config) available at `/fastdb` inside pods. The `install/` directory is a build artifact produced by Automake (`./configure && make install` via `docker-compose run makeinstall`) — it is not in git and not baked into Docker images.

After `helm install`, pods that mount the code PVC crash because the PVC is empty. We need a mechanism to populate the code PVC as part of the deploy workflow.

### Constraints

- `install/` is a build artifact, not in version control
- SLAC S3DF vclusters do not support hostPath mounts (unlike local Kind clusters)
- The code PVC (`ReadWriteMany`) is shared across webap, queryrunner, shell, and createdb pods
- Docker images are generic base images (Python runtime, PostgreSQL client, etc.) — code is mounted in, not copied during build
- The shell pod runs `sleep infinity` with no code dependency, making it available immediately after deploy

## Decision

**Use a deploy script (`scripts/helm-deploy.sh`) that orchestrates the full deploy cycle:**

1. Build `install/` via `docker-compose run makeinstall`
2. Run `helm upgrade --install` to create/update all Kubernetes resources
3. Wait for the shell pod to be ready
4. Copy `install/` and `db/` contents to the code PVC via tar pipe through the shell pod
5. Restart webap and queryrunner deployments to pick up the new code

Code is copied using tar pipes for reliable content-level transfer:

```bash
# $NS is the target namespace; $SHELL_POD is the name of the running shell pod.
tar cf - -C install . | kubectl exec -i -n "$NS" "$SHELL_POD" -- tar xf - -C /fastdb/
kubectl exec -n "$NS" "$SHELL_POD" -- mkdir -p /fastdb/db
tar cf - -C db . | kubectl exec -i -n "$NS" "$SHELL_POD" -- tar xf - -C /fastdb/db/
```

## Rationale

### 1. No Template Changes Required

The existing Helm templates already mount the code PVC at `/fastdb`. The shell pod runs `sleep infinity` with no startup dependency on code being present. No init containers, Helm hooks, or sidecar changes are needed.

### 2. Shell Pod is the Natural Entry Point

The shell deployment is already part of the chart for debugging. It starts immediately (no code dependency), mounts the code PVC read-write, and is always available. Using it as the copy target is a zero-cost approach.

### 3. Tar Pipe is Reliable

`kubectl cp` uses tar under the hood but has known issues with symlinks and permissions. Using explicit `tar cf - | kubectl exec -i -- tar xf -` gives direct control over what gets copied and where, handles all file types correctly, and streams without intermediate files.
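The behaviour of the pipe can be checked locally without a cluster. The following is a minimal sketch, assuming two temporary directories stand in for `install/` and the PVC mount; the file names are made up for illustration. The only difference from the deploy command is that the receiving `tar` runs in the same shell rather than inside the pod.

```bash
set -eu

# Local stand-ins for install/ and the PVC mount; no cluster is involved.
src="$(mktemp -d)"
dst="$(mktemp -d)"

# Build a small tree with a regular file, a restrictive mode, and a symlink.
mkdir -p "$src/bin"
printf 'hello\n' > "$src/bin/tool"
chmod 750 "$src/bin/tool"
ln -s bin/tool "$src/tool-link"

# Same shape as the deploy pipe: the left side archives to stdout, the right
# side unpacks from stdin. In the real command the right side runs in the pod.
tar cf - -C "$src" . | tar xf - -C "$dst"

# The symlink arrives as a symlink and still resolves inside the copy.
ls -l "$dst"
```

Because nothing is written to an intermediate archive file, the same pattern works when the right-hand side is wrapped in `kubectl exec -i`.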

### 4. createdb Job Self-Heals

The createdb Job uses `restartPolicy: OnFailure`. If it starts before code is on the PVC, it fails and retries automatically. Once the script copies code, the next retry succeeds. No ordering dependency needs to be encoded in Helm.
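The retry behaviour can be illustrated with a local simulation. This is only a sketch of the idea, not how Kubernetes schedules restarts: the temporary directory stands in for the code PVC, `createdb_attempt` is a hypothetical stand-in for the container, and the copy is simulated by creating a directory after the second failure.

```bash
set -eu

# Hypothetical stand-in for the code PVC.
pvc="$(mktemp -d)"

createdb_attempt() {
    # Fails until the db directory exists, like a container that exits
    # non-zero because /fastdb is still empty.
    [ -d "$pvc/db" ]
}

attempts=0
until createdb_attempt; do
    attempts=$((attempts + 1))
    echo "attempt $attempts failed: no code on the volume yet"
    # Simulate the deploy script finishing its copy after the second failure.
    if [ "$attempts" -eq 2 ]; then
        mkdir -p "$pvc/db"
    fi
done
echo "createdb succeeded after $attempts failed attempts"
```

Kubernetes also applies an increasing back-off between restarts and enforces the Job's `backoffLimit`, neither of which this loop models, so the copy still needs to finish within a reasonable time.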

### 5. Separating Build from Deploy is Correct

The build step (`docker-compose run makeinstall`) runs Automake in a container with the full toolchain. The deploy step copies the result. This separation means:

- Builds are reproducible (same container, same toolchain)
- Deploy doesn't need the build toolchain
- `--skip-build` allows redeploying the same code (e.g., after a config change)
- `--skip-helm` allows updating just the code without touching Kubernetes resources
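How the two flags select steps can be sketched as a dry-run planner. This is not the content of `scripts/helm-deploy.sh`; `deploy_plan` is a hypothetical function, and the release name `fastdb` and chart path `./helm/fastdb` are assumptions made for the example. It only prints the steps a run would perform.

```bash
set -eu

# Print the steps a deploy run would perform, without executing any of them.
# Usage: deploy_plan <namespace> <values-file> [--skip-build] [--skip-helm]
deploy_plan() {
    local ns="$1" values="$2"
    shift 2
    local build=1 helm=1 flag
    for flag in "$@"; do
        case "$flag" in
            --skip-build) build=0 ;;
            --skip-helm)  helm=0 ;;
            *) echo "unknown flag: $flag" >&2; return 2 ;;
        esac
    done

    if [ "$build" -eq 1 ]; then
        echo "build: docker-compose run makeinstall"
    fi
    if [ "$helm" -eq 1 ]; then
        # Release name and chart path are assumptions for this sketch.
        echo "helm: helm upgrade --install fastdb ./helm/fastdb -n $ns -f $values"
    fi
    # The remaining steps always run, whichever flags are given.
    echo "wait: shell pod ready in $ns"
    echo "copy: install/ and db/ to /fastdb via tar pipe"
    echo "restart: webap and queryrunner in $ns"
}

deploy_plan ccosta-dev ./helm/fastdb/values-ccosta-dev.yaml --skip-build
```

With both flags set, only the wait, copy, and restart steps remain, which is the code-only update path.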

## Alternatives Considered

### Alternative 1: Bake Code into Docker Images

Build `install/` during `docker build` so images contain the code.

**Rejected because:**
- The current architecture deliberately separates runtime images from application code
- Every code change would require rebuilding and pushing all images (webap, queryrunner, shell, createdb)
- Image sizes would increase significantly
- Local development workflow uses host mounts for fast iteration — baking code into images would break this pattern
- Would require changing the existing `docker-compose` build structure

### Alternative 2: Helm Hook (pre-install/post-install Job)

Use a Helm hook Job that runs `kubectl cp` or pulls code from a git repo.

**Rejected because:**
- Hook Jobs can't access the host filesystem to copy local build artifacts
- A git-clone hook would need git credentials in the cluster and wouldn't have the Automake build step
- Hook ordering with PVC creation is fragile
- Adds template complexity for something that's better handled outside Helm

### Alternative 3: Init Container with Git Clone

Add an init container to webap/queryrunner that clones the repo and runs `make install`.

**Rejected because:**
- Requires git credentials in the cluster
- Requires the full Automake toolchain in the init container image
- Dramatically increases pod startup time (clone + configure + make)
- Network dependency at pod startup (git clone can fail)
- Every pod restart rebuilds from source
- `install/` should be built once, not per-pod

### Alternative 4: S3/Object Storage Artifact

Upload `install/` to S3, download in init container.

**Rejected because:**
- Adds infrastructure dependency (S3 bucket, credentials)
- Over-engineered for a development/small-team deployment
- Still needs an init container or sidecar
- SLAC S3DF may not have S3 access from vclusters

### Alternative 5: hostPath Volume (Kind-only approach)

Mount the host filesystem directly into pods.

**Rejected because:**
- Only works with Kind (local development)
- SLAC S3DF vclusters do not allow hostPath mounts
- Not portable across deployment environments
- Already used as the Kind-specific path (`volumes.codeHostPath: true`)

## Trade-off Analysis

| Concern | Deploy Script (chosen) | Bake into Images | Init Container | Helm Hook |
|---------|----------------------|-------------------|----------------|-----------|
| **Template changes** | None | Major | Moderate | Moderate |
| **Build/deploy separation** | Clean | Coupled | Coupled | Partial |
| **Pod startup time** | Normal | Normal | Slow (build) | Normal |
| **Network dependency** | kubectl only | Registry | Git/network | Varies |
| **Code update speed** | Fast (copy) | Slow (rebuild all images) | Slow (rebuild) | Moderate |
| **Complexity** | One shell script | Dockerfile changes | Init container config | Hook job config |
| **Works in vcluster** | Yes | Yes | Yes | Yes |

## Consequences

### Positive

- Zero changes to existing Helm templates
- Fast code updates (`--skip-build --skip-helm` for code-only redeploy)
- Clear separation between build and deploy steps
- Works in all target environments (Kind, SLAC S3DF, NERSC Spin)
- Script is self-documenting with `--help`

### Negative

- Requires running the script (not pure `helm install`)
- Code is not part of the Helm release — `helm rollback` doesn't roll back code
- Brief window after `helm install` where pods have no code (createdb retries; webap/queryrunner restart after copy)
- Depends on shell pod being enabled and healthy

### Mitigations

1. **Script is the documented deploy path** — `HELM_HOWTO.md` references the script as the primary method
2. **createdb retries automatically** — `restartPolicy: OnFailure` handles the timing gap
3. **webap/queryrunner restart after copy** — ensures they always start with fresh code
4. **`--skip-build` and `--skip-helm` flags** — allow partial runs for specific scenarios

## Usage

```bash
# Full deploy (build + helm + copy + restart)
./scripts/helm-deploy.sh ccosta-dev ./helm/fastdb/values-ccosta-dev.yaml

# Code-only update (skip build and helm, just copy and restart)
./scripts/helm-deploy.sh ccosta-dev ./helm/fastdb/values-ccosta-dev.yaml --skip-build --skip-helm

# Config change (skip build, run helm upgrade, copy, restart)
./scripts/helm-deploy.sh ccosta-dev ./helm/fastdb/values-ccosta-dev.yaml --skip-build
```

## When to Reconsider

Reconsider this decision if:

1. **Images are rebuilt to include code** — If the team decides to bake code into images, the copy step becomes unnecessary
2. **CI/CD pipeline is added** — A pipeline could build images with code baked in, making the script unnecessary for automated deploys
3. **Code PVC is replaced** — If the architecture moves away from shared PVCs for code delivery
4. **Team grows significantly** — Larger teams may need more formal artifact management (container registry, artifact storage)
115 changes: 115 additions & 0 deletions HOWTO_LOGIN.md
@@ -0,0 +1,115 @@
# How to Create a User and Log In

FASTDB uses challenge-response authentication (RKAuth). Users must exist
in the `authuser` PostgreSQL table before they can log in. A password is
set through a reset-password link delivered by email (captured by Mailhog
in development).

## Prerequisites

- Mailhog must be enabled in your deployment so the password-reset email
can be captured. In Helm values, set `mailhog.enabled: true`.
- The `webap` pod must be running.

## 1. Create the user in PostgreSQL

Run `psql` on the primary Postgres pod to insert a row into
`authuser`. Only `username`, `displayname`, and `email` are required;
the key fields (`pubkey`, `privkey`) are populated later by the
password-reset flow.

### Kubernetes (Helm deployment)

```bash
# Replace <namespace> with your namespace (e.g. fastdb-local, ccosta-dev)
kubectl exec -it -n <namespace> deployment/postgres -- \
  psql -U postgres -d fastdb -c \
  "INSERT INTO authuser (username, displayname, email)
   VALUES ('<username>', '<Display Name>', '<user>@mailhog');"
```

### Docker Compose (local development)

```bash
docker compose exec postgres \
  psql -U postgres -d fastdb -c \
  "INSERT INTO authuser (username, displayname, email)
   VALUES ('<username>', '<Display Name>', '<user>@mailhog');"
```

Replace `<username>`, `<Display Name>`, and `<user>` with the desired
values. The email domain does not matter for Mailhog — anything will be
delivered — but `<user>@mailhog` is the convention used in tests.
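When the values come from variables, a display name containing an apostrophe would break the hand-written statement. The following sketch builds the same `INSERT` with single quotes doubled. The helper names `sql_quote` and `authuser_insert_sql` are hypothetical, and the escaping assumes PostgreSQL's default `standard_conforming_strings = on`.

```bash
set -eu

# Double any single quote so the value is safe inside a SQL string literal.
sql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Build the INSERT statement from step 1 for the given user.
# Usage: authuser_insert_sql <username> <display name> <email>
authuser_insert_sql() {
    printf 'INSERT INTO authuser (username, displayname, email) VALUES (%s, %s, %s);\n' \
        "$(sql_quote "$1")" "$(sql_quote "$2")" "$(sql_quote "$3")"
}

authuser_insert_sql "sobrien" "Sam O'Brien" "sobrien@mailhog"
```

The printed statement can be passed to either command above in place of the literal string, for example `-c "$(authuser_insert_sql sobrien "Sam O'Brien" sobrien@mailhog)"`.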

## 2. Trigger a password-reset email

1. Open the FASTDB web application in your browser.
2. On the login page, click **"Request Password Reset"**.
3. Enter either the **username** or **email** you used in step 1.
4. Click **"Email Password Reset Link"**.

The application sends an email containing a password-reset URL to the
address on file. In development this email is captured by Mailhog.

## 3. Retrieve the reset link from Mailhog

### Option A: Mailhog Web UI

If Mailhog has external access enabled, open the web UI in a browser:

| Deployment | URL |
|---|---|
| Docker Compose | `http://localhost:8025` |
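| Kind (NodePort) | `http://localhost:30025` (or `http://localhost:8025` if the cluster was created from `admin/dev/kind-config.yaml`, which maps host port 8025 to NodePort 30025) |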

Find the email titled **"fastdb password reset"** and copy the reset
URL from the message body.

### Option B: kubectl logs

If the Mailhog web UI is not exposed, the reset URL appears in the
pod logs:

```bash
# Kubernetes
kubectl logs -n <namespace> deployment/mailhog | grep resetpassword
```

```bash
# Docker Compose
docker compose logs mailhog | grep resetpassword
```

The log line contains a URL of the form:

```
https://<host>/auth/resetpassword?uuid=<uuid>
```
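If the logs contain several reset emails, only the most recent URL is usually wanted. The following is a minimal sketch, assuming the URL appears verbatim on a single log line in the form shown above. The function name `latest_reset_url` and the sample log line are made up for illustration; if the mail body is encoded in the logs (for example quoted-printable), the pattern would need adjusting.

```bash
set -eu

# Print the most recent reset URL found on stdin; prints nothing if none match.
latest_reset_url() {
    grep -oE 'https?://[^[:space:]"<>]+/auth/resetpassword\?uuid=[0-9A-Fa-f-]+' | tail -n 1
}

# Hypothetical log excerpt; real Mailhog output will look different.
sample_log='body: Reset here: https://fastdb.example.org/auth/resetpassword?uuid=123e4567-e89b-12d3-a456-426614174000 soon'
printf '%s\n' "$sample_log" | latest_reset_url
```

In practice the input would come from one of the log commands above, for example `kubectl logs -n <namespace> deployment/mailhog | latest_reset_url`.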

## 4. Set the password

1. Open the reset URL from step 3 in your browser.
2. Enter and confirm a new password.
3. Click the submit button.

The browser generates an RSA key pair, encrypts the private key with
your password, and stores both keys in the database. You can now log in
with your username and password.

## Quick-reference: full workflow in one go

```bash
NAMESPACE=fastdb-local # adjust to your namespace

# Create user
kubectl exec -it -n "$NAMESPACE" deployment/postgres -- \
  psql -U postgres -d fastdb -c \
  "INSERT INTO authuser (username, displayname, email)
   VALUES ('alice', 'Alice Developer', 'alice@mailhog');"

# (In browser: go to the login page → "Request Password Reset" →
# enter "alice" → "Email Password Reset Link")

# Grab the reset URL from mailhog logs
kubectl logs -n "$NAMESPACE" deployment/mailhog | grep resetpassword
```
33 changes: 33 additions & 0 deletions admin/dev/kind-config.yaml
@@ -0,0 +1,33 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4

nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30080
        hostPort: 8080
        protocol: TCP

      - containerPort: 30090
        hostPort: 9000
        protocol: TCP

      - containerPort: 30091
        hostPort: 9001
        protocol: TCP

      - containerPort: 30092
        hostPort: 9010
        protocol: TCP

      - containerPort: 30093
        hostPort: 9011
        protocol: TCP

      - containerPort: 30432
        hostPort: 5432
        protocol: TCP

      - containerPort: 30025
        hostPort: 8025
        protocol: TCP