Open-source failure intelligence for founders, investors, and researchers.
Startup Graveyard turns startup postmortems into structured, queryable, explainable research assets.
In other words: it upgrades startup failure from "reading stories" into "structured research assets."
Runnable alpha · Open source · 40+ published seed cases
Visual placeholders and image prompts for the README live under docs/assets and docs/IMAGE_PROMPTS.md.
Most startup failure content is still trapped in anecdotal postmortems, scattered news coverage, and founder folklore.
That format is useful for reading, but weak for research. It is hard to compare cases across sectors, markets, business models, and failure modes. It is even harder to ask practical questions like:
- Which failure patterns repeat across adjacent startups?
- What signals usually show up before a shutdown?
- How do marketplace, fintech, or climate startups fail differently?
- What should a founder, investor, or product team study to avoid repeating the same path?
Startup Graveyard exists to answer those questions with structure instead of vibes.
Startup Graveyard is an open-source failure intelligence platform.
It combines:
- A structured case library of startup shutdowns and postmortems
- Research workflows for filtering, saving, exporting, and sharing insight slices
- Grounded analysis through Failure Copilot
- Admin workflows for ingestion, review, evidence management, and publication
- Platform primitives for search, indexing, taxonomy normalization, and research operations
This repository is not a slide deck or static mock. It is a runnable alpha product with a working public surface, admin workflows, and an evolving commercial foundation.
Startup Graveyard is built for:

- Founders who want to study failure patterns before they scale into them
- Investors who want structured downside pattern recognition, not isolated anecdotes
- Researchers and analysts who need reusable case intelligence, not one-off reading notes
- Product, strategy, and operating teams building internal research workflows around startup failure
What you can do in the current alpha:

- Explore a structured case dataset with filters across industry, country, business model, closure year, and primary failure reason (a query sketch follows this list)
- Open the Research Hub to start from reusable research questions instead of random browsing
- Ask Failure Copilot grounded questions across the archive
- Save case filters as reusable research views
- Export Markdown and PDF research briefs
- Publish shareable public brief links
- Collaborate through Team Workspaces, shared saved views, and shared cases
- Run admin review and ingestion flows from source snapshot to published case
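As a taste of what the research surface looks like programmatically, the sketch below queries the case archive with structured filters. The endpoint path and parameter names are illustrative assumptions; the authoritative contract is `packages/contracts/openapi/startup-graveyard.v1.yaml`.

```ts
// Minimal sketch of filtering the case archive over the API.
// NOTE: the path and query parameter names are hypothetical.
const API_BASE_URL = process.env.API_BASE_URL ?? "http://127.0.0.1:18080";

async function fetchFintechShutdowns(): Promise<unknown> {
  const params = new URLSearchParams({
    industry: "fintech",
    closureYear: "2023",
    primaryFailureReason: "ran-out-of-cash", // hypothetical taxonomy value
  });
  const res = await fetch(`${API_BASE_URL}/v1/cases?${params}`);
  if (!res.ok) throw new Error(`API request failed: ${res.status}`);
  return res.json();
}
```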
The product already supports a real content production loop (sketched in code after the list):
- Capture or draft a case.
- Attach evidence and snapshots.
- Extract and normalize structured signals.
- Review and publish.
- Index for search, similarity, Copilot, and research outputs.
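Read as a typed pipeline, the loop looks roughly like the sketch below. Every name here is an assumption for explanation; the real handlers live under `services/api/src/ingestion/`.

```ts
// Illustrative types for each stage of the case production loop.
interface DraftCase { companyName: string; summary: string; }
interface EvidencedCase extends DraftCase { snapshotUrls: string[]; }
interface StructuredCase extends EvidencedCase {
  industry: string;
  primaryFailureReason: string;
  closureYear: number;
}
interface PublishedCase extends StructuredCase { publishedAt: Date; }

// Stage stubs; the real implementations persist to PostgreSQL and enqueue jobs.
async function attachEvidence(c: DraftCase): Promise<EvidencedCase> {
  return { ...c, snapshotUrls: [] };
}
async function extractSignals(c: EvidencedCase): Promise<StructuredCase> {
  return { ...c, industry: "unknown", primaryFailureReason: "unknown", closureYear: 0 };
}
async function reviewAndPublish(c: StructuredCase): Promise<PublishedCase> {
  return { ...c, publishedAt: new Date() };
}
async function indexForSearch(_c: PublishedCase): Promise<void> {
  // embeddings, trigram search, Copilot grounding
}

async function produceCase(draft: DraftCase): Promise<PublishedCase> {
  const evidenced = await attachEvidence(draft);        // capture + evidence
  const structured = await extractSignals(evidenced);   // normalize signals
  const published = await reviewAndPublish(structured); // review + publish
  await indexForSearch(published);                      // search, similarity, Copilot
  return published;
}
```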
The current architecture is organized around three layers.

Public product surface:

- Homepage and case explorer
- Research Hub
- Case detail pages
- Failure Copilot
- Saved views, report export, and public brief shares

Admin and operations surface:

- Draft and review workflow
- Evidence and source snapshot management
- Ingestion handlers and scheduler triggers
- Team billing recovery, outreach, and recovery playbooks
- Ops dashboard for research, billing, and recovery workflows

Platform layer:

- PostgreSQL with `pgvector`, `pg_trgm`, and `citext` (similarity-query sketch after this list)
- Shared schema and OpenAPI contract layer
- Search index and embedding pipelines
- Taxonomy normalization and backfill jobs
- Eval, telemetry, and commercial operations primitives
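To make the platform layer concrete, a case-similarity lookup on top of pgvector might look like the sketch below. The table and column names are assumptions for illustration only; the real schema lives in the shared schema package.

```ts
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Hypothetical similarity lookup: find published cases whose embedding is
// closest to a query embedding, using pgvector's cosine-distance operator.
// The `cases` table and its columns are illustrative, not the real schema.
async function similarCases(queryEmbedding: number[], limit = 5) {
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;
  const { rows } = await pool.query(
    `SELECT id, company_name, embedding <=> $1::vector AS distance
       FROM cases
      WHERE status = 'published'
      ORDER BY embedding <=> $1::vector
      LIMIT $2`,
    [vectorLiteral, limit],
  );
  return rows;
}
```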
Startup Graveyard is not trying to be a content farm, a meme archive, or a collection of startup horror stories.
Its differentiation is structural:
- The unit of value is a reusable research asset, not a pageview
- Cases are normalized into comparable entities, factors, timelines, and lessons (see the schema sketch after this list)
- Copilot is grounded in the case archive instead of free-form speculation
- Outputs are shareable and operational: saved views, briefs, PDF exports, and team workflows
- The repo includes both the public product and the operational machinery behind it
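As an illustration of what "normalized into comparable entities" means, a case record might validate against a Zod schema along these lines. Field names and shapes are assumptions; the authoritative definitions live in the shared schema package.

```ts
import { z } from "zod";

// Illustrative shape of a normalized case record (not the real schema).
const FailureFactor = z.object({
  label: z.string(), // e.g. "premature scaling", normalized against the taxonomy
  evidenceUrls: z.array(z.string().url()),
});

const CaseSchema = z.object({
  id: z.string().uuid(),
  companyName: z.string(),
  industry: z.string(),
  country: z.string(),
  businessModel: z.string(),
  closureYear: z.number().int(),
  primaryFailureReason: z.string(),
  factors: z.array(FailureFactor),
  timeline: z.array(z.object({ date: z.string(), event: z.string() })),
  lessons: z.array(z.string()),
});

type CaseRecord = z.infer<typeof CaseSchema>;
```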
This project is best described as a runnable alpha.
What is true today:
- The repository contains a working public product, admin surface, and platform workflows
- The seed dataset includes 40+ published cases
- Saved views, briefs, PDF exports, Team Workspaces, billing foundations, and recovery operations already exist
- The system includes structured ingestion, indexing, eval, and operational instrumentation
- The Admin Dashboard exposes platform diagnostics for runtime state, recent failed ingestion jobs, and derived operational alerts
- The diagnostics layer flags stale running ingestion jobs and gives operators a direct reclaim path from the dashboard
- The platform layer exposes ingestion worker health, recent heartbeat history, last tick / last processed job, and stalled or erroring worker alerts
- The dashboard shows ingestion queue backlog age and recent throughput, so operators can tell whether the worker is healthy but the queue is still accumulating
- Operators can capture point-in-time platform snapshots, keep a short diagnostics history, and schedule recurring snapshot capture for queue / worker / alert drift
- The dashboard summarizes snapshot trends, so operators can tell whether backlog, alert volume, or worker errors are actually improving over recent captures
- Snapshots roll up into recent hourly windows, so operators can separate sustained degradation from a single noisy spike
- The diagnostics layer flags `snapshot_trend_regressing` when the latest rollup window is measurably worse than the previous one, so operators get an actionable warning instead of just raw history (a minimal sketch of the comparison follows this list)
- The platform layer tracks snapshot cadence, missed intervals, regression streak/severity, and alert suppression, so operators can tell whether a trend warning is new, severe, or already covered by more specific runtime alerts
- Platform diagnostics expose a 24h snapshot metrics surface, including scheduled coverage, cadence adherence, regression-window count, and recent peak queue/alert/worker-error pressure
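A minimal sketch of the `snapshot_trend_regressing` comparison, assuming each hourly rollup window carries backlog, alert, and worker-error counts. The field names and the 20% tolerance are assumptions, not the real thresholds.

```ts
// Illustrative regression check between two hourly rollup windows.
interface RollupWindow {
  maxQueueBacklog: number;
  alertCount: number;
  workerErrorCount: number;
}

function isTrendRegressing(
  prev: RollupWindow,
  latest: RollupWindow,
  tolerance = 0.2, // hypothetical: flag when a metric worsens by more than 20%
): boolean {
  const worse = (before: number, after: number) =>
    after > before * (1 + tolerance) && after - before >= 1;
  return (
    worse(prev.maxQueueBacklog, latest.maxQueueBacklog) ||
    worse(prev.alertCount, latest.alertCount) ||
    worse(prev.workerErrorCount, latest.workerErrorCount)
  );
}
```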
What is not true today:
- This is not yet a polished production SaaS
- The dataset is not yet at large-scale coverage
- Some commercial and operational flows are still maturing behind the scenes
Detailed maturity tracking lives in docs/PRODUCT_MATURITY_PLAN.md.
Prerequisites:

- Node.js 20+
- pnpm 9+
- Docker Desktop
Install dependencies and copy the example environment file:

```bash
pnpm install
cp .env.example .env
```

Recommended minimum local setup:

```dotenv
NODE_ENV=development
PORT=18080
DATABASE_URL=postgresql://postgres:postgres@127.0.0.1:5433/sg
WEB_BASE_URL=http://127.0.0.1:3000
API_BASE_URL=http://127.0.0.1:18080
ADMIN_API_KEY=dev-admin-key
JWT_SECRET=change-me-in-production
```

Optional integrations:

- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `STRIPE_*`
- recovery outreach / CRM / webhook / Slack env vars from `.env.example`
Create a clean local database:
```bash
make db-reset
```

Or run the pieces separately:

```bash
make db-up
make db-migrate
make db-seed
```

Start the dev stack:

```bash
make dev
```

Useful variants:

```bash
make dev-api
make dev-web
```

Production-style build and start:

```bash
pnpm build
pnpm --filter @sg/api start
pnpm --filter @sg/web start
```

Key local URLs:

- Web home: http://127.0.0.1:3000/
- Research Hub: http://127.0.0.1:3000/research
- Failure Copilot: http://127.0.0.1:3000/copilot
- Account: http://127.0.0.1:3000/auth/account
- Ops dashboard: http://127.0.0.1:3000/admin/dashboard
- Review queue: http://127.0.0.1:3000/admin/reviews
- Cases admin: http://127.0.0.1:3000/admin/cases
- API docs: http://127.0.0.1:18080/docs
- Health: http://127.0.0.1:18080/health
Tech stack:

Web:

- Next.js 16 App Router
- React 19
- TypeScript

API:

- Fastify 5
- Zod
- TypeScript

Data:

- PostgreSQL 16
- `pgvector`, `pg_trgm`, and `citext`
- shared schema package
- OpenAPI contract package

AI and research ops:

- OpenAI / Anthropic providers
- vector indexing
- eval datasets and replayable regression runs
- prompt telemetry and cost tracking
The test strategy is layered.
- Fast feedback through mock-repository API tests (example sketch after this list)
- PostgreSQL integration coverage for the main data and workflow paths
- Contract and type safety through shared schemas and OpenAPI alignment
- Build validation across API and web apps
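As an example of the fast layer, a mock-repository test can exercise filtering logic without PostgreSQL. The sketch below uses Node's built-in test runner for illustration; the repository interface is an assumption, and the real suites run via `pnpm --filter @sg/api test`.

```ts
import { test } from "node:test";
import assert from "node:assert/strict";

// Hypothetical repository interface standing in for the real implementations
// under services/api/src/repositories/.
interface CaseRepository {
  findByIndustry(industry: string): Promise<Array<{ id: string; industry: string }>>;
}

const mockRepo: CaseRepository = {
  async findByIndustry(industry) {
    return [{ id: "case-1", industry }];
  },
};

test("filters cases by industry", async () => {
  const cases = await mockRepo.findByIndustry("fintech");
  assert.equal(cases.length, 1);
  assert.equal(cases[0].industry, "fintech");
});
```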
Common commands:

```bash
pnpm format:check
pnpm lint
pnpm typecheck
pnpm --filter @sg/api test
pnpm --filter @sg/api test:pg
pnpm build
```

The current roadmap is staged around four layers:

- M1: trusted data foundation
- M2: research product loop
- M3: commercial product loop
- M4: platform hardening and operational maturity: runtime diagnostics, failed-job visibility, alerting, tracing, runbooks, and production safety rails
See the detailed plan in docs/PRODUCT_MATURITY_PLAN.md.
Where to look in the repo:

- `apps/web/app/page.tsx`: homepage and research entry
- `apps/web/app/research/page.tsx`: Research Hub
- `apps/web/app/copilot/page.tsx`: Failure Copilot workbench
- `apps/web/app/components/SavedViewsManager.tsx`: reusable research asset workflows
- `services/api/src/ingestion/`: snapshots, extraction, indexing, scheduler handlers
- `services/api/src/recoveryOutreach/`: recovery operations channels and playbooks
- `services/api/src/repositories/`: mock and PostgreSQL implementations
- `packages/contracts/openapi/startup-graveyard.v1.yaml`: API contract
Contributions are welcome, especially in these areas:
- New structured cases and evidence
- Taxonomy, normalization, and labeling quality
- Research workflows and insight UX
- Copilot quality, eval coverage, and prompt iteration
- Platform reliability and local developer experience
Before changing outward-facing copy, keep the following in mind:
- The dataset is still intentionally small and seed-stage
- The product is still alpha, even though many flows are already runnable
- Some internal ops and commercial workflows are more mature than the outward product surface
- The current public positioning should stay disciplined: open-source, research-oriented, and credible
