This playbook describes how to run Flamehaven FileSearch in a secured, observable environment. Adjust to match your infrastructure.
[Client] -> [Reverse Proxy / WAF] -> [Flamehaven API (Uvicorn/Gunicorn)] -> [Document Storage]
                                                      |
                                                      +--> [Prometheus/Grafana]
- Reverse proxy terminates TLS (nginx, Traefik, Cloudflare).
- API runs as non-root user.
- Persistent storage mounted at `/data/documents`.
docker build -t flamehaven-filesearch:1.4.0 .
docker run -d --name flamehaven \
  -p 8000:8000 \
  -e GEMINI_API_KEY="$GEMINI_API_KEY" \
  -e DEFAULT_MODEL=gemini-2.5-flash \
  -e ENVIRONMENT=production \
  -v /srv/flamehaven/data:/app/data \
  flamehaven-filesearch:1.4.0
Tips
- Use `--restart unless-stopped`.
- Set `LOG_LEVEL=info` to reduce noise.
- On Kubernetes, add a readiness probe hitting `/health`.
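Before routing traffic to a freshly started container, it can help to poll the health endpoint the same way a readiness probe would. The sketch below assumes `/health` answers HTTP 200 once the API is ready (the response body is not inspected); `http_status` and `wait_until_healthy` are illustrative names, not part of Flamehaven.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    probe: Callable[[str], int] = http_status,
) -> bool:
    """Poll `url` until it answers 200 or the attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example (assumed address: the port published by the docker run command):
#   wait_until_healthy("http://127.0.0.1:8000/health")
```

The `probe` parameter exists only so the polling logic can be exercised without a running server.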
/etc/systemd/system/flamehaven.service
[Unit]
Description=Flamehaven FileSearch API
After=network.target
[Service]
# Assumes /etc/secrets/gemini.key holds a line of the form GEMINI_API_KEY=<key>
EnvironmentFile=/etc/secrets/gemini.key
Environment="ENVIRONMENT=production"
WorkingDirectory=/opt/flamehaven
ExecStart=/opt/flamehaven/.venv/bin/flamehaven-api
User=flamehaven
Group=flamehaven
Restart=on-failure
[Install]
WantedBy=multi-user.target
Enable with `systemctl enable --now flamehaven`.
Example nginx snippet:
server {
    listen 443 ssl;
    server_name search.example.com;
    ssl_certificate /etc/letsencrypt/live/search/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/search/privkey.pem;
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $host;
    }
}
Enable HTTP/2 and OCSP stapling for best performance.
| Scenario | Recommendation |
|---|---|
| < 50 req/min | Single instance (Uvicorn, 2 workers) |
| 50–500 req/min | Gunicorn with --workers $(CPU*2) + --worker-class uvicorn.workers.UvicornWorker |
| > 500 req/min | Horizontal scaling behind load balancer; externalize cache (Redis) |
When running multiple replicas, set stricter per-instance rate limits so that the replicas together do not exhaust the Gemini quota.
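The sizing table and the quota advice reduce to simple arithmetic. A minimal sketch, assuming the upstream quota is expressed in requests per minute and that a 20% headroom is acceptable; the function names and the headroom default are illustrative, while the thresholds and the CPU*2 rule come from the table above.

```python
import os
from typing import Optional


def gunicorn_workers(cpus: Optional[int] = None) -> int:
    """Worker count from the table's CPU*2 rule (at least 1)."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    return max(1, cpus * 2)


def deployment_tier(req_per_min: float) -> str:
    """Map expected load to the recommendation in the scaling table."""
    if req_per_min < 50:
        return "single instance (Uvicorn, 2 workers)"
    if req_per_min <= 500:
        return "Gunicorn with Uvicorn workers"
    return "horizontal scaling behind a load balancer"


def per_replica_limit(quota_per_min: int, replicas: int, headroom_pct: int = 20) -> int:
    """Split a shared quota across replicas, leaving some of it unused."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in [0, 100)")
    usable = quota_per_min * (100 - headroom_pct) // 100
    return max(1, usable // replicas)
```

For example, a shared quota of 600 requests per minute across 3 replicas with the default headroom gives each replica a limit of 160 requests per minute.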
- Metrics – Scrape `/prometheus` (requires `FLAMEHAVEN_METRICS_ENABLED=1` and admin access unless internal). Example configuration for Prometheus:
      - job_name: flamehaven
        scrape_interval: 15s
        metrics_path: /prometheus
        static_configs:
          - targets: ['flamehaven:8000']
- Logging – Send STDOUT to Loki/ELK. JSON logs contain `service`, `version`, `request_id`, `environment`.
- Tracing – Propagate `X-Request-ID` through the reverse proxy to correlate requests across services.
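Because the JSON logs carry `request_id`, records collected from several services can be grouped per request after the fact. A minimal sketch, assuming one JSON object per log line; the `message` field in the usage comment is hypothetical, since only `service`, `version`, `request_id`, and `environment` are named above.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_request_id(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group parsed JSON log records by their request_id field."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as startup banners
        if isinstance(record, dict) and "request_id" in record:
            groups[str(record["request_id"])].append(record)
    return dict(groups)


# Example:
#   with open("flamehaven.log") as fh:
#       for request_id, records in group_by_request_id(fh).items():
#           print(request_id, [r.get("message") for r in records])
```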
Use PostgreSQL to persist local fallback metadata across restarts and optionally back the vector store with pgvector. Metadata persistence is only used in offline/local fallback mode (no external LLM).
Environment variables
POSTGRES_ENABLED=1
POSTGRES_DSN=postgresql://user:pass@host:5432/flamehaven
POSTGRES_SCHEMA=public
VECTOR_BACKEND=postgres
VECTOR_POSTGRES_TABLE=flamehaven_vectors
VISION_ENABLED=0
VISION_PROVIDER=auto
Notes
- Tables are auto-created on startup.
- The vector backend requires the `pgvector` extension.
- Set `VECTOR_BACKEND=memory` to keep vectors in-process.
- Use a dedicated schema to isolate Flamehaven metadata.
- Restrict network access to the database host only.
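A small preflight check can catch a malformed DSN before the service starts. This is a sketch under stated assumptions: it treats a DSN as required whenever `POSTGRES_ENABLED=1` or `VECTOR_BACKEND=postgres` is set, and the defaults used for missing variables are guesses rather than Flamehaven's documented behavior. `check_postgres_env` is an illustrative name.

```python
import os
from typing import List, Mapping
from urllib.parse import urlsplit


def check_postgres_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of configuration problems (empty means none found)."""
    problems: List[str] = []
    # Assumed defaults: PostgreSQL disabled, in-memory vectors.
    enabled = env.get("POSTGRES_ENABLED", "0") == "1"
    backend = env.get("VECTOR_BACKEND", "memory")
    if not (enabled or backend == "postgres"):
        return problems  # PostgreSQL is not in use; nothing to check

    parts = urlsplit(env.get("POSTGRES_DSN", ""))
    if parts.scheme not in ("postgresql", "postgres"):
        problems.append("POSTGRES_DSN must start with postgresql://")
    elif not parts.hostname:
        problems.append("POSTGRES_DSN has no host")
    elif not parts.path.strip("/"):
        problems.append("POSTGRES_DSN has no database name")
    return problems


# Example: check the current process environment before starting the API.
#   for problem in check_postgres_env(os.environ):
#       print("config error:", problem)
```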
- Documents: If using local storage, back up `/srv/flamehaven/data`. Consider object storage (S3, GCS) for durability.
- Configuration: Store `.env.production` in a secure secret manager.
- Rotation: Regenerate the Gemini API key every 90 days. Update secrets via the CI/CD pipeline.
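For local storage, the document backup can be as simple as a timestamped archive of the data directory. A minimal sketch using only the standard library; the archive naming scheme and both function names are illustrative, and the destination directory is assumed to live outside the directory being archived.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def backup_documents(source: Path, dest_dir: Path) -> Path:
    """Write a timestamped .tar.gz of `source` into `dest_dir` and return its path."""
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = dest_dir / f"flamehaven-data-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the data directory's own name.
        tar.add(source, arcname=source.name)
    return archive


def verify_backup(archive: Path) -> int:
    """Re-open the archive and return the number of regular files it contains."""
    with tarfile.open(archive, "r:gz") as tar:
        return sum(1 for member in tar.getmembers() if member.isfile())


# Example (destination path is an assumption):
#   archive = backup_documents(Path("/srv/flamehaven/data"), Path("/srv/backups"))
#   print(archive, verify_backup(archive))
```

Re-opening the archive is only a basic sanity check; a restore into a scratch directory is a stronger test.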
- Run the container as a non-root user (`USER 1000`).
- Enable `ufw`/`iptables` to allow ingress only on 80/443.
- Add WAF rules for `/api/upload/*`.
- Periodically run `gitleaks` and `trufflehog` (already configured in CI).
- Keep dependencies up to date (`pip install -U flamehaven-filesearch`).
- TLS certificate valid and auto-renewed.
- Rate limits tuned for workload.
- Prometheus scrape working; alerts configured for `errors_total` and `rate_limit_exceeded`.
- Backups tested.
- Runbook stored with the operations team.
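The first checklist item can be spot-checked from a script. The sketch below reads the certificate's `notAfter` field over a TLS connection and converts it to days remaining; it assumes the host is reachable on port 443 with a certificate that validates against the system trust store, and both function names are illustrative.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter string of the certificate served by host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    `not_after` uses the format returned by getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


# Example, using the server name from the nginx snippet:
#   remaining = days_until_expiry(fetch_not_after("search.example.com"))
#   if remaining < 14:
#       print(f"certificate expires in {remaining:.1f} days")
```

A negative result means the certificate has already expired, although in that case `fetch_not_after` would normally fail certificate validation first.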
With these steps you can safely run Flamehaven FileSearch in production for internal knowledge bases or customer-facing search experiences.