feat: add Cloud Run support with scale-to-zero#9
Conversation
There was a problem hiding this comment.
Code Review — PR #9
Summary: Adds Cloud Run infrastructure: HTTP health server, SIGTERM handler, multi-stage Dockerfile, and BOT_NAME env var support.
Changes reviewed:
src/server.ts— new minimal HTTP health-check serversrc/index.ts— wires up health server and SIGTERM handlerDockerfile— multi-stage build with separate build and runtime stages
Quality Assessment:
| Aspect | Status | Notes |
|---|---|---|
| Code Quality | pass | Clean, well-documented, minimal footprint |
| Tests | warning | No tests in the project (pre-existing); health server is simple enough |
| Security | pass | No secrets exposed; health endpoint returns static "ok" |
| Performance | pass | Lightweight HTTP server, no overhead concern |
Issues found:
-
Dockerfile runtime image size (minor): The runtime stage uses
node:20(~1 GB) while the build stage usesnode:20-slim(~200 MB). The PR description says this is intentional forlivekit-rtcnative binaries, which is a valid reason. However, worth confirming thatnode:20-slimtruly fails at runtime — if it works, switching would cut ~800 MB from the deployed image. -
SIGTERM handler doesn't close the health server or LiveKit connection: The handler logs and calls
process.exit(0), which is fine for Cloud Run's 10-second grace period since Node will forcibly close sockets on exit. But if you ever need a longer drain period (e.g., finishing a chat response in flight), you'd wantserver.close()and to signal the LiveKit worker to disconnect. Not blocking, just flagging for future consideration. -
Health server
servervariable is unused outsidestartHealthServer: If you later need to close the server (see above), you'd need to return or export it. The current fire-and-forget pattern is fine for now.
Suggestions:
- Consider adding a
/.well-known/readyor/healthzroute distinction so you can differentiate Cloud Run startup probes from liveness checks in the future. Not needed now, but a common pattern. - The comment
"no-op locally if PORT unset"inindex.tsis slightly misleading — the server will start on port 8080 by default even locally (sincePORT || "8080"always resolves). If the intent is truly no-op when running locally, you could gate onprocess.env.PORTbeing defined. Low priority since it's harmless.
Verdict: APPROVE
Clean, well-scoped PR. The health server is minimal, the SIGTERM handler covers the Cloud Run lifecycle, and the multi-stage Dockerfile is a good pattern. The minor items above are non-blocking suggestions for future iteration.
c39b6ba to
2f9849e
Compare
Add minimal HTTP health server (server.ts) for Cloud Run container lifecycle — responds 200 to all requests for health checks and wake-up pings. Add SIGTERM handler for graceful shutdown. Multi-stage Dockerfile builds TypeScript in-container using node:20 (required for livekit-rtc native binary). BOT_NAME env var configures bot persona when CLI arg not present. Co-Authored-By: Claude <noreply@anthropic.com>
2f9849e to
29ed8d1
Compare
Summary
server.ts) for Cloud Run container lifecycleBOT_NAMEenv var for configuring bot persona on Cloud RunTest plan
PORT=8080 node dist/index.js startstarts health server and LiveKit workercurl localhost:8080returns 200BOT_NAME=clawdresolves correct configStack: 2/3 — depends on #8
Generated with Claude Code