Restore monorepo split: separate calculator (app.policyengine.org) and website (policyengine.org) #495

@MaxGhenis

Summary

Re-implement the monorepo split that was reverted due to production issues. The goal is to deploy:

  • app.policyengine.org → Calculator app (policy simulations, household builder)
  • www.policyengine.org → Website (homepage, blog, team, research)

What Was Built (PR #494, reverted)

Architecture

  • Turborepo monorepo with npm workspaces
  • @policyengine/design-system package for shared tokens/components
  • Separate entry points: CalculatorApp.tsx, WebsiteApp.tsx
  • Separate HTML files: calculator.html, website.html
  • VITE_APP_MODE environment variable controls which app to build
  • Three Vercel projects connected to same repo:
    • policyengine-calculator → app.policyengine.org
    • policyengine-website → www.policyengine.org
    • policyengine-app-v2 → legacy combined build
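
As a rough illustration of how VITE_APP_MODE could select the HTML entry point, here is a minimal sketch. The helper name and the mode values "calculator" and "website" are assumptions for illustration; the actual values used in PR #494 are not stated here.

```shell
# Hypothetical helper: map a VITE_APP_MODE value to its HTML entry file.
# The mode names "calculator" and "website" are assumed, not taken from the PR.
entry_html_for_mode() {
  case "$1" in
    calculator) echo "calculator.html" ;;
    website)    echo "website.html" ;;
    *)
      # Fail loudly rather than silently building the wrong app.
      echo "Unknown VITE_APP_MODE: '$1'" >&2
      return 1
      ;;
  esac
}
```

A build script could call `entry_html_for_mode "$VITE_APP_MODE"` and abort when it fails, so a misconfigured Vercel project does not ship the wrong app.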

Key Files Created

  • vercel.json - Website project config
  • vercel.calculator.json - Calculator project config
  • packages/design-system/ - Shared design tokens
  • app/src/CalculatorRouter.tsx - Calculator routes
  • app/src/WebsiteRouter.tsx - Website routes
  • app/src/CalculatorApp.tsx - Calculator app with all providers
  • app/src/WebsiteApp.tsx - Website app with all providers

Postmortem: What Went Wrong

Root Cause: Deployed Unreviewed Code to Production Domains

The site broke BEFORE the PR was even merged. The mistake was using vercel alias set to point production domains at deployments from an unmerged feature branch.

Actions that broke production:

vercel --prod --scope policy-engine           # Deployed feature branch
vercel alias set <deployment> www.policyengine.org   # Pointed PRODUCTION at it
vercel alias set <deployment> app.policyengine.org   # Pointed PRODUCTION at it

The correct approach would have been:

vercel --scope policy-engine                  # Deploy to preview URL (no --prod)
# Test at preview URL: https://policyengine-website-xxxxx.vercel.app
# Only after PR merge does production update automatically
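
One way to make the wrong command harder to run is a small guard in front of the CLI. The sketch below is hypothetical: the function name is made up, and it assumes the production branch is called main. It only inspects arguments and does not call vercel itself.

```shell
# Assumption: production deploys should only ever come from this branch.
PRODUCTION_BRANCH="${PRODUCTION_BRANCH:-main}"

# Usage: vercel_args_allowed <current-branch> <vercel args...>
# Returns 0 when the arguments are safe to run from the given branch,
# and 1 when --prod is requested from a non-production branch.
vercel_args_allowed() {
  local branch="$1" arg
  shift
  for arg in "$@"; do
    if [ "$arg" = "--prod" ] && [ "$branch" != "$PRODUCTION_BRANCH" ]; then
      echo "Refusing --prod from branch '$branch' (production branch: $PRODUCTION_BRANCH)" >&2
      return 1
    fi
  done
  return 0
}
```

For example, `vercel_args_allowed "$(git rev-parse --abbrev-ref HEAD)" --prod --scope policy-engine` would have failed on the monorepo-split branch, while the same call without --prod passes.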

Secondary Issue: Missing Providers

WebsiteApp.tsx was initially created without the required React context providers (Redux, QueryClient), causing runtime errors. This would have been caught during PR review had we tested the preview URL instead of pushing to production.

Timeline

  1. PR #494 ("Add monorepo structure with design system and app split") created with the monorepo split code
  2. Production domains pointed at feature branch deployments - bypassing PR review
  3. Site immediately showed React errors
  4. Team noticed production was running from monorepo-split branch before PR was merged
  5. Decision made to revert to restore service

Required Controls Before Re-implementing

1. Vercel CLI Protocol & Verification Script

Create a verification script to run before ANY manual Vercel CLI changes that affect production:

#!/bin/bash
# scripts/verify-vercel-deployment.sh
set -e

# Strip a trailing slash so the route checks below don't request "//us"
DEPLOY_URL="${1%/}"

if [ -z "$DEPLOY_URL" ]; then
  echo "Usage: ./verify-vercel-deployment.sh <deployment-url>"
  exit 1
fi

echo "🔍 Verifying deployment: $DEPLOY_URL"

# 1. Check HTTP 200
echo "Checking HTTP status..."
STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$DEPLOY_URL")
if [ "$STATUS" != "200" ]; then
  echo "❌ FAILED: HTTP status $STATUS (expected 200)"
  exit 1
fi
echo "✅ HTTP 200 OK"

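# 2. Check for React error indicators
# Note: curl does not execute JavaScript, so this only catches error text
# present in the served HTML. Client-side render errors still need the
# manual browser check listed at the end of this script.
echo "Checking for React errors..."
BODY=$(curl -s "$DEPLOY_URL")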
if echo "$BODY" | grep -q "Unexpected Application Error"; then
  echo "❌ FAILED: React error boundary triggered"
  exit 1
fi
if echo "$BODY" | grep -q "Cannot destructure property"; then
  echo "❌ FAILED: Provider/context error detected"
  exit 1
fi
echo "✅ No React errors detected"

# 3. Check key elements exist
echo "Checking page structure..."
if ! echo "$BODY" | grep -q "<title>PolicyEngine</title>"; then
  echo "❌ FAILED: Missing expected title"
  exit 1
fi
echo "✅ Page structure OK"

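# 4. Check key routes (fetch a few paths); failures here are warnings only
WARNINGS=0
for path in "/" "/us" "/uk"; do
  ROUTE_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$DEPLOY_URL$path")
  if [ "$ROUTE_STATUS" != "200" ]; then
    echo "⚠️  Warning: Route $path returned $ROUTE_STATUS"
    WARNINGS=$((WARNINGS + 1))
  else
    echo "✅ Route $path OK"
  fi
done

echo ""
if [ "$WARNINGS" -gt 0 ]; then
  echo "⚠️  Required checks passed for $DEPLOY_URL, with $WARNINGS route warning(s)"
else
  echo "✅ All checks passed for $DEPLOY_URL"
fi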
echo ""
echo "⚠️  BEFORE aliasing to production:"
echo "   1. Manually test the deployment in a browser"
echo "   2. Check console for JavaScript errors"
echo "   3. Test key user flows (homepage, calculator)"
echo "   4. Get team sign-off if this is a major change"

Protocol for manual Vercel changes:

  1. Deploy to preview (no --prod): vercel --scope policy-engine
  2. Run verification: ./scripts/verify-vercel-deployment.sh <preview-url>
  3. Manually test in browser
  4. For major changes, get team sign-off
  5. Only then alias to production (if needed outside of normal Git flow)
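
Steps 2 and 5 of this protocol could be tied together so that aliasing is impossible without a passing verification. A minimal sketch follows, assuming the verification script lives at the path above. VERIFY_CMD and VERCEL_CMD are hypothetical overrides added so the gate can be dry-run without touching production.

```shell
# Hypothetical overrides: set VERCEL_CMD=echo to dry-run the alias step.
VERIFY_CMD="${VERIFY_CMD:-./scripts/verify-vercel-deployment.sh}"
VERCEL_CMD="${VERCEL_CMD:-vercel}"

# Usage: alias_if_verified <deployment-url> <domain>
# Runs the verification script first and only aliases when it succeeds.
alias_if_verified() {
  local deployment_url="$1" domain="$2"
  if [ -z "$deployment_url" ] || [ -z "$domain" ]; then
    echo "Usage: alias_if_verified <deployment-url> <domain>" >&2
    return 2
  fi
  if ! "$VERIFY_CMD" "$deployment_url"; then
    echo "Verification failed; not aliasing $domain" >&2
    return 1
  fi
  "$VERCEL_CMD" alias set "$deployment_url" "$domain"
}
```

This does not replace the manual browser test or team sign-off (steps 3 and 4); it only ensures the automated checks cannot be skipped by accident.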

2. Pre-merge Verification

  • Mandatory preview URL check - Before merging PRs that touch app entry points, manually verify the Vercel preview loads without errors
  • Add PR template checkbox: "I have verified the Vercel preview URL loads correctly"
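
The checkbox could also be enforced in CI rather than left to convention. The sketch below assumes the PR body is piped in on stdin and that the checkbox uses the exact wording proposed above; the function name is hypothetical.

```shell
# Sketch of a CI gate: succeed only if the PR body (read from stdin) contains
# the ticked checkbox. How the PR body reaches stdin is an assumption left to
# the CI setup.
preview_checkbox_ticked() {
  grep -qiE '^[[:space:]]*[-*] \[x\] I have verified the Vercel preview URL loads correctly'
}
```

With the GitHub CLI, for example, this could be run as `gh pr view --json body --jq .body | preview_checkbox_ticked`. An unticked `- [ ]` box makes the function return non-zero.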

3. Automated E2E Smoke Tests in CI

  • Add a Playwright or Cypress test that:
    • Loads the app
    • Verifies no React error boundary is triggered
    • Checks that key routes render
  • Run against Vercel preview URL in CI
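
Running against a preview URL in CI usually means waiting until the preview has finished building. The browser test itself would be written in Playwright or Cypress. The sketch below only covers the waiting step, using a generic retry helper. The helper name and the attempt and delay values in the usage note are assumptions.

```shell
# Usage: retry <attempts> <delay-seconds> <command...>
# Re-runs the command until it succeeds or the attempts are exhausted.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final failure.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

A CI step might run `retry 30 10 curl -sf -o /dev/null "$PREVIEW_URL"` before starting the smoke tests, where PREVIEW_URL is whatever variable the CI setup uses for the preview deployment.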

4. Integration Tests for App Entry Points

  • Add tests that render CalculatorApp and WebsiteApp fully (not mocked)
  • Verify all required providers are present
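
A full render test is the real fix, but a crude static check can complement it. The sketch below greps an entry file for provider tags and reports any that are absent. The tag names in the usage note (Provider from react-redux, QueryClientProvider from React Query) are assumptions about how the providers are mounted; the function name is hypothetical.

```shell
# Usage: missing_providers <file> <ProviderTag...>
# Prints each tag with no opening JSX element in the file; returns 1 if any
# are missing. This is a text match only and does not prove correct nesting.
missing_providers() {
  local file="$1" tag status=0
  shift
  for tag in "$@"; do
    if ! grep -qE "<${tag}([[:space:]/>]|\$)" "$file"; then
      echo "$tag"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_providers app/src/WebsiteApp.tsx Provider QueryClientProvider` would print the name of any provider that the file never opens.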

5. Monitoring

  • Set up error monitoring (Sentry) to catch React errors quickly
  • Alert on error rate spikes
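
An error monitoring service would normally provide its own alert rules. Purely to illustrate what "error rate spike" could mean, here is a sketch of a threshold check. The function name and the defaults (5% threshold, minimum of 50 requests) are assumptions, not agreed values.

```shell
# Usage: is_error_spike <errors> <requests> [threshold-percent] [min-requests]
# Returns 0 when the error rate exceeds the threshold, 1 otherwise.
# Windows with fewer than min-requests are ignored to avoid noisy alerts.
is_error_spike() {
  local errors="$1" requests="$2" threshold_pct="${3:-5}" min_requests="${4:-50}"
  if [ "$requests" -lt "$min_requests" ]; then
    return 1
  fi
  # Integer comparison of errors/requests > threshold/100 without division.
  [ $((errors * 100)) -gt $((threshold_pct * requests)) ]
}
```

With these defaults, 10 errors in 100 requests counts as a spike, while 2 errors in 100 requests or 5 errors in 10 requests does not.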

Vercel Cleanup Needed

The following were created and may need cleanup:

  • Vercel project: policyengine-calculator
  • Vercel project: policyengine-website
  • DNS record in Squarespace: app CNAME → cname.vercel-dns.com
  • Domain alias: app.policyengine.org

Implementation Plan (When Ready)

  1. Create verification script - scripts/verify-vercel-deployment.sh
  2. Add E2E smoke tests to CI - Before any split work
  3. Fix and test both apps - Using preview URLs only
  4. Run verification script - On preview deployments
  5. Get team sign-off - Before any production changes
  6. Merge PR - Let Vercel Git integration handle production deployment
  7. Monitor - Watch for errors after merge

Key Lessons

  1. Always test on preview URLs first - Vercel gives every deployment a unique URL
  2. Run verification script before production changes - Automated sanity checks catch obvious failures
  3. Don't point production domains at unreviewed code - Even if tests pass, get human verification
