Conversation

Owner

@fzheng fzheng commented Dec 7, 2025

No description provided.

fzheng and others added 19 commits December 5, 2025 19:03
Phase 3b implementation:
- Thompson Sampling in hl-sage with NIG posterior sampling
- ATR-based dynamic stops (BTC 2.0x, ETH 1.5x multipliers)
- Trader correlation computation with phi correlation
- Episode tracking with R-multiple calculation
- Comprehensive integration tests (22 new tests)
- Operational runbook for system monitoring
- Fix E2E tests baseURL for Windows compatibility
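
To make the Thompson Sampling item above concrete, here is a minimal sketch of drawing from a Normal-Inverse-Gamma (NIG) posterior and picking the trader with the highest draw. The `NIGPosterior` class, its field names, and `pick_trader` are hypothetical names chosen for illustration; the actual hl-sage code may be structured differently.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class NIGPosterior:
    """Hypothetical NIG posterior over a trader's mean R-multiple."""
    m: float      # posterior mean of mu
    kappa: float  # pseudo-observation count behind the mean
    alpha: float  # shape of the inverse-gamma prior on sigma^2
    beta: float   # scale of the inverse-gamma prior on sigma^2

    def sample_mean(self, rng: random.Random) -> float:
        # sigma^2 ~ InvGamma(alpha, beta): draw Gamma(shape=alpha, scale=1/beta), invert.
        sigma2 = 1.0 / rng.gammavariate(self.alpha, 1.0 / self.beta)
        # mu | sigma^2 ~ Normal(m, sigma^2 / kappa)
        return rng.gauss(self.m, math.sqrt(sigma2 / self.kappa))


def pick_trader(posteriors: dict[str, NIGPosterior], rng: random.Random) -> str:
    """One Thompson step: sample once per trader, then take the best draw."""
    draws = {addr: p.sample_mean(rng) for addr, p in posteriors.items()}
    return max(draws, key=draws.get)
```

Traders with few observations have wide posteriors and occasionally produce the top draw (exploration), while well-sampled traders win on their mean (exploitation). Passing a seeded `random.Random` makes a selection reproducible.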

Test coverage:
- test_thompson_sampling.py: NIG sampling, explore/exploit tradeoff
- test_correlation.py: Bucket IDs, phi correlation, effK calculation
- test_atr.py: ATR computation, stop distance bounds
- test_episode.py: VWAP entry/exit, R-multiple calculation
- test_integration.py: Full signal flow integration
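
As an illustration of what the ATR and episode tests exercise, the sketch below derives a stop distance from ATR and expresses an episode's result as an R-multiple. The multipliers come from the commit message; the function names and the exact formula (realized return divided by the stop fraction) are assumptions, and the stop distance bounds mentioned above are omitted because their values are not stated here.

```python
# Multipliers as stated in the commit message.
ATR_MULTIPLIERS = {"BTC": 2.0, "ETH": 1.5}


def stop_fraction(asset: str, atr: float, price: float) -> float:
    """Stop distance as a fraction of price: multiplier * ATR / price."""
    return ATR_MULTIPLIERS[asset] * atr / price


def r_multiple(entry_vwap: float, exit_vwap: float, direction: int, stop_frac: float) -> float:
    """Realized return divided by the return that would be lost at the stop.

    direction is +1 for a long episode and -1 for a short one.
    """
    realized = direction * (exit_vwap - entry_vwap) / entry_vwap
    return realized / stop_frac
```

With a BTC ATR of 500 at a price of 50,000 the stop fraction is 2%, so a long episode that exits 2% above its VWAP entry scores +1R and the same move against a short scores -1R.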

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Peer review findings addressed:
- check_episode_consensus() now uses detector.get_stop_fraction(asset)
  instead of hardcoded ASSUMED_STOP_FRACTION (1%)
- Added startup trigger for correlation job when no correlations exist
- Removed dead test file (test_pnl.py) that referenced removed function

Tests:
- Added TestATRToConsensusFlow class (4 new tests)
- Total: 101 Python tests passing, 128 E2E tests passing

Docs:
- Added "Peer Review Findings & Fixes" section to DEVELOPMENT_PLAN.md
- Updated Phase 3b status to Complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fix performPrime() to await upsertCurrentPosition() calls instead of fire-and-forget
- Fix Legacy tab to clear cache and refresh fills on tab activation
- Fix Alpha Pool auto-refresh to properly set is_running state
- Fix refreshAlphaPool() to not show generic loading during refresh progress
- Add 14 unit tests for position priming and tab switching fixes
- Add E2E tests for tab switching data refresh behavior
- Update test count to 977, update docs with December 2025 bug fixes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Dashboard & API:
- Add Alpha Pool refresh progress UI with spinner and step display
- Add /dashboard/api/alpha-pool/refresh/status endpoint for progress polling
- Add /dashboard/api/alpha-pool/fills endpoint for pool-specific fills
- Add /dashboard/api/legacy/fills endpoint for leaderboard-only fills
- Improve Alpha Pool table CSS styling and responsiveness

Backend:
- Add listLiveFillsForAddresses() for filtered fills queries
- Add SAGE_URL env var for hl-stream to proxy Alpha Pool requests
- Add DECIDE_URL env var for hl-sage to fetch consensus outcomes
- Wire hl-stream -> hl-sage dependency in docker-compose
- Add reconciliation support for EpisodeFill in hl-decide

Tests:
- Add integration tests for listLiveFillsForAddresses

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix selectors to use actual HTML IDs (tab-alpha-pool, tab-legacy-leaderboard)
- Fix selectors to use actual data-testid attributes (tab-content-alpha-pool, tab-content-legacy-leaderboard)
- Add refresh button as valid Alpha Pool UI state

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…y, dynamic price bands

- ATR-based dynamic price bands using R-units instead of fixed 8 BPS
- ATR staleness checks with configurable max age (300s default)
- Asset-specific ATR fallbacks (BTC: 0.4%, ETH: 0.6%)
- Correlation time decay with an exponential half-life (default 3 days)
- Correlation freshness checks with 7-day staleness threshold
- 24 new tests for ATR and correlation enhancements
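
The correlation decay and staleness items above can be sketched as follows. The 3-day half-life and 7-day staleness threshold are taken from the commit message; the function names, and the choice to shrink the correlation toward zero and return `None` once stale, are assumptions made for illustration.

```python
from typing import Optional


def decay_factor(age_days: float, half_life_days: float = 3.0) -> float:
    """Exponential decay: the factor halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def decayed_correlation(
    rho: float,
    age_days: float,
    half_life_days: float = 3.0,
    max_age_days: float = 7.0,
) -> Optional[float]:
    """Return the decayed correlation, or None when the estimate is stale."""
    if age_days > max_age_days:
        return None  # the caller would fall back to a default correlation
    return rho * decay_factor(age_days, half_life_days)
```

Under these assumptions a correlation of 0.8 measured three days ago counts as 0.4, and one measured eight days ago is treated as missing.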

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ents

ATR Enhancements:
- ATR_STRICT_MODE (default true) blocks gating on hardcoded fallback
- 24h realized vol fallback from marks_1m before using hardcoded values
- ATR validity gate integrated into ConsensusDetector

Vote Weighting Improvements:
- Logarithmic scaling (log(1 + notional/base)) as default mode
- Equity-normalized mode (sqrt(notional/equity)) when data available
- Configurable via VOTE_WEIGHT_MODE, VOTE_WEIGHT_LOG_BASE, VOTE_WEIGHT_MAX
- Track notional and equity in Vote dataclass
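
A minimal sketch of the three weighting modes, assuming the formulas quoted above. The default values for `log_base` and `max_weight` are placeholders, not the project's actual defaults, and the linear branch reuses the `notional/100000` rule that a later commit in this PR describes replacing.

```python
import math
from typing import Optional


def calculate_vote_weight(
    notional: float,
    equity: Optional[float] = None,
    mode: str = "log",
    log_base: float = 10_000.0,  # placeholder for VOTE_WEIGHT_LOG_BASE
    max_weight: float = 1.0,     # placeholder for VOTE_WEIGHT_MAX
) -> float:
    """Map a position's notional to a capped vote weight."""
    notional = abs(notional)
    if mode == "equity" and equity is not None and equity > 0:
        raw = math.sqrt(notional / equity)       # equity-normalized mode
    elif mode == "linear":
        raw = notional / 100_000.0               # legacy linear rule
    else:
        raw = math.log(1.0 + notional / log_base)  # default log scaling
    return min(raw, max_weight)
```

Log scaling compresses large positions: with a base of 10,000, a 10,000 notional maps to ln 2 (about 0.69), while a position ten times larger maps to about 2.4 before the cap rather than ten times as much.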

Developer Experience:
- Added Makefile with test/docker/dev commands
- npm scripts: test:all, test:py, docker:restart, docker:rebuild, docker:wipe
- 39 new tests for ATR and vote weighting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rewrite README with high-level business overview
- Add website link (sigmapilot.ai)
- Move technical details to docs/ARCHITECTURE.md
- Remove deprecated fetchUserBtcFills function and tests
- Add test dependencies to Python requirements.txt

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add periodic_correlation_refresh_task() to recompute correlations daily
- Configurable via CORR_REFRESH_INTERVAL_HOURS env var (default 24h)
- Ensures correlation matrix stays fresh with new trading data
- Works with existing decay mechanism (3-day half-life)
- Document new env vars in ARCHITECTURE.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add 7 new integration tests for correlation refresh and hydration
- Test decay application during detector hydration
- Test effK calculation with hydrated correlations
- Test background task configuration defaults
- Update test counts: 151 Python tests, 973 TypeScript tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a Hyperliquid WebSocket connection closes (due to network issues,
server-side disconnect, etc.), the RealtimeTracker now automatically:
1. Detects the disconnection
2. Saves the list of subscribed users
3. Reconnects with a new WebSocket
4. Resubscribes all users to userEvents
5. Retries on failure with exponential backoff

This fixes a bug where fills would stop being recorded after a WebSocket
disconnect, causing addresses to show outdated fills in the Legacy tab.

Note: Addresses beyond the first 10 in the watchlist still rely on manual
backfill due to Hyperliquid's 10-user-per-WebSocket-per-IP limit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changed /dashboard/api/legacy/fills/fetch-history to fetch ALL available
fills from Hyperliquid API (up to 2000 per address) instead of just the
first N fills.

This ensures that any gaps between what's stored in the DB and the current
time are properly filled when triggering a manual backfill.

The `limit` parameter now only controls how many fills to return in the
API response, not how many to fetch from Hyperliquid.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace hardcoded weight = min(notional/100000, 1.0) with
  calculate_vote_weight() which respects VOTE_WEIGHT_MODE (log/equity/linear)
- Replace fixed 8 bps price band check with
  consensus_detector.passes_latency_and_price_gates() which uses
  ATR-based R-unit drift thresholds
- Use proper Vote namedtuple instead of dicts for type safety
- Add docstring noting centralized function usage to prevent future drift

This addresses the peer review feedback about consensus logic drift
between main.py and consensus.py.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rate Limiting (ts-lib/hyperliquid.ts):
- Add RateLimiter class with configurable calls/second (default 2/s)
- Wrap all Hyperliquid SDK calls in withRateLimitAndRetry()
- Handle 429 errors with exponential backoff (3 retries)
- Configurable via HL_SDK_CALLS_PER_SECOND, HL_SDK_MAX_RETRIES

Observability Metrics (hl-decide):
- ATR metrics: staleness counter, fallback usage, age gauge, blocked counter
- Correlation metrics: staleness gauge, age, decay factor, pairs loaded
- EffK metrics: value histogram, default fallback counter
- Weight distribution: histogram, max gauge, Gini coefficient gauge

Helper functions:
- calculate_gini() for weight distribution inequality measurement
- update_weight_metrics() and update_correlation_metrics()
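
For reference, one standard way to compute a Gini coefficient over vote weights is shown below. This is a sketch of the textbook rank-based formula; the body of the real `calculate_gini()` is not shown in this conversation and may differ.

```python
from typing import Iterable


def calculate_gini(weights: Iterable[float]) -> float:
    """Gini coefficient of non-negative weights: 0 is equal, near 1 is concentrated."""
    values = sorted(w for w in weights if w >= 0)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n, ranks starting at 1.
    ranked_sum = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1) / n
```

Four equal weights give 0.0, and a pool where one of four traders holds all the weight gives 0.75, which is the maximum for a sample of four.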

These metrics enable alerting on:
- Trading with stale ATR/correlation data
- Over-reliance on default correlation (ρ=0.3)
- Weight concentration (high Gini = few traders dominating)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Risk Configuration (consensus.py):
- MAX_POSITION_SIZE_PCT: 2% per position (conservative)
- MAX_TOTAL_EXPOSURE_PCT: 10% total (up to 5 positions)
- MAX_DAILY_LOSS_PCT: 5% drawdown halt
- MIN_SIGNAL_CONFIDENCE: 55% win probability required
- MIN_SIGNAL_EV_R: Positive EV required (configurable)
- MAX_LEVERAGE: 1x until Kelly sizing implemented
- SIGNAL_COOLDOWN_SECONDS: 5 minutes between same-symbol signals

Gate 5 - Risk Limits:
- check_risk_limits() validates signals before generation
- Rejects low confidence (<55%) or low EV signals
- Metrics: signal_risk_rejected_total, signal_generated_total
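
The gate described above could look roughly like this. The thresholds mirror the stated defaults (55% confidence, positive EV); the return shape and the reason strings are hypothetical, and the real constants are read from environment variables rather than hardcoded.

```python
from typing import Optional, Tuple

# Defaults as described in the commit message; environment-configurable in the PR.
MIN_SIGNAL_CONFIDENCE = 0.55
MIN_SIGNAL_EV_R = 0.0


def check_risk_limits(confidence: float, ev_r: float) -> Tuple[bool, Optional[str]]:
    """Return (passed, rejection_reason) for a candidate signal."""
    if confidence < MIN_SIGNAL_CONFIDENCE:
        return False, "low_confidence"
    if ev_r <= MIN_SIGNAL_EV_R:
        return False, "low_ev"  # EV must be strictly positive
    return True, None
```

A rejection reason like this is what a counter such as signal_risk_rejected_total could use as a label.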

Tests (test_risk_limits.py):
- 13 tests for risk limit logic and default bounds
- Verifies conservative defaults are properly bounded

These are static fail-safes until Phase 4 implements proper Kelly
criterion and dynamic position sizing. All limits configurable via
environment variables for tuning.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Reduced from 600+ lines to ~300 lines
- Added Phase 3c (Observability & Hardening) as complete
- Clearly marked Phase 4 (Risk Management) as next
- Consolidated configuration reference
- Added architecture diagram
- Removed redundant historical details

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rate Limiter Improvements:
- Weight-aware rate limiting (tracks API weight consumption)
- API weight table per Hyperliquid docs (userFills: 20, clearinghouseState: 2)
- Budget-based throttling at 70%/90% thresholds
- Telemetry: 429 counter, backoff delays, calls by operation

Data Quality Metrics:
- effK fallback counter when default ρ is used
- Correlation coverage % (pairs with data vs pool size)
- Pool size gauge for correlation monitoring

Weight Distribution Monitoring:
- Saturation counter (weights at/near cap)
- Saturation percentage gauge
- Per-asset tracking (BTC/ETH)

Quant Pipeline Integration Tests (9 new):
- Low vol + high correlation regime
- High vol + low correlation regime
- Weight disparity effects on effK
- effK fallback tracking
- NIG weight formula verification
- R-multiple sensitivity to stop
- Consensus gates progression
- ATR staleness gating
- Correlation decay effects

Test count: 173 Python tests (was 164)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add getLastActivityPerAddress() function using DISTINCT ON for efficient per-address queries
- Add /dashboard/api/alpha-pool/last-activity endpoint to get most recent fill per trader
- Add click-to-toggle functionality for Last Activity column (relative/absolute format)
- Add 15 E2E tests for Last Activity toggle feature
- Update API documentation with new Dashboard API endpoints
- Add data-testid for Last Activity column header

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add createCounter and createGauge helpers to ts-lib metrics
- Add initRateLimiterMetrics to expose rate limiting stats via /metrics
- Add /data-health endpoint to hl-decide for operational dashboards
  - ATR freshness status (stale/fresh/missing) per asset
  - Correlation coverage health (loaded pairs vs expected)
  - Weight distribution health (Gini, saturation %)
  - Aggregated warnings and overall status

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Owner Author

fzheng commented Dec 9, 2025

High-level summary

Feature: Adds “Alpha Pool” support across services (hl-stream, hl-scout, hl-sage, ts-lib/hyperliquid). New endpoints for alpha-pool data, backfill/holdings/last-activity, and UI work in the dashboard.
Risk/ops: Introduces a weight-aware rate limiter and retry wrapper for Hyperliquid SDK calls (packages/ts-lib/hyperliquid.ts) — good for robustness but needs careful logging and observability tuning.
Tests: Adds unit and E2E tests (pytest in hl-decide and Playwright e2e tests) and test deps to requirements.
UX: Dashboard HTML/CSS updated to surface Alpha Pool refresh status, holdings, last-activity toggle, and new tab separation (Legacy vs Alpha).
Backwards compatibility: Several endpoints are split into explicit Legacy endpoints (e.g. /legacy/fills/) and new /alpha-pool/ endpoints.
High-priority issues and suggestions (actionable)

Missing request body when proxying non-GET requests (likely breaks proxied POST endpoints)
Where you changed: fetch(target, { method: req.method, headers: { 'x-owner-key': OWNER_TOKEN }, });
Problem: When req.method is POST or PUT, the handler forwards neither the request body nor Content-Type and other request headers, and the original request stream is probably not piped either. That will break proxied POST endpoints (e.g. adding pinned accounts, or the alpha-pool refresh if proxySage/proxyScout use the same pattern).
Fix: For proxying, forward method, headers (including Content-Type), and the request body (or pipe the stream). Example (Express + node-fetch): include body: JSON.stringify(req.body) for JSON, or use req.pipe when streaming. Also copy other headers that matter (authorization, content-type). If using fetch with RequestInit, for large bodies prefer streaming.
Authorization when proxying to hl-sage
Observation: The new hl-stream alpha-pool endpoints call sageUrl directly (fetch(`${sageUrl}/alpha-pool/addresses?...`)) without the OWNER_TOKEN header. If hl-sage requires the owner key, these calls will fail; if it does not, those endpoints are reachable without authentication.
Suggestion: Either use the same proxy function that includes OWNER_TOKEN or ensure proxySage forwards appropriate credentials. Audit all fetch() calls to internal services and ensure owner-key is forwarded where required.
SQL address case-sensitivity bug in getLastActivityPerAddress
Where: packages/ts-lib/src/persist.ts -> getLastActivityPerAddress
Issue: You normalize addresses to lowercase in JS, but the SQL uses WHERE address = ANY($1) (no LOWER(address)). If the DB stored addresses in mixed-case, the match may fail.
Fix: Change the WHERE clause to use LOWER(address) = ANY($1), or pass addresses in the same casing the DB stores. Example: WHERE type = 'trade' AND LOWER(address) = ANY($1), queried with normalizedAddresses.
Missing import: math used in consensus.calculate_vote_weight
Where: services/hl-decide/app/consensus.py
Issue: calculate_vote_weight uses math.sqrt and math.log but I don’t see math imported in the diff. This will raise NameError at runtime.
Fix: Add "import math" at the top.
Inconsistent logging / print usage
Where: consensus.py now includes print() calls for gating warnings.
Suggestion: Replace print() with your structured logger (logger.xxx) to keep consistency and observability.
ATR gating: strict mode and fallback semantics
Observation: You introduced ATR-based gating and an "is_atr_valid_for_gating" check. You also kept a legacy fallback constant CONSENSUS_MAX_PRICE_BAND_BPS and a new CONSENSUS_MAX_PRICE_DRIFT_R. This is good, but:
Ensure ATR provider updates are robust and ATR_STRICT_MODE behavior is documented/configurable in staging vs prod.
In strict mode (ATR_STRICT_MODE=true) you block signals if ATR data is missing; consider an explicit admin/ops fallback metric or safe-mode flag to avoid accidental full stoppage.
Suggest adding clear logging and metrics around "atr validity" to detect when signals are blocked because ATR unavailable.
eff_k_from_corr fallback_counter_callback semantics
Issue: The code tracks fallback_count for unknown correlation pairs and calls fallback_counter_callback() when fallback_count > 0, but it neither passes fallback_count to the callback nor corrects for double-counting. A comment mentions dividing by 2, yet the callback receives no value at all.
Suggestion: Either pass the fallback_count (or unique-pair count) to the callback or adjust the logic so it reports the correct number. Example: compute fallback_count_pairs = fallback_count / 2 then call callback(fallback_count_pairs) or increment a metric with that value.
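
A sketch of the suggested shape follows. It iterates over each unordered pair once, so no division by 2 is needed, and passes the count to the callback. The effective-K formula used here, (Σw)² / ΣᵢΣⱼ wᵢwⱼρᵢⱼ, and the key layout of `corr` are assumptions for illustration; only the default ρ = 0.3 and the callback name come from this PR.

```python
from typing import Callable, Dict, Optional, Tuple

DEFAULT_RHO = 0.3  # default correlation mentioned in the PR


def eff_k_from_corr(
    weights: Dict[str, float],
    corr: Dict[Tuple[str, str], float],  # assumed keyed by a sorted address pair
    fallback_counter_callback: Optional[Callable[[int], None]] = None,
) -> float:
    """Effective number of independent voters, reporting default-rho usage."""
    addrs = list(weights)
    numerator = sum(weights.values()) ** 2
    denominator = 0.0
    fallback_pairs = 0
    for i, a in enumerate(addrs):
        denominator += weights[a] ** 2  # self-correlation is 1
        for b in addrs[i + 1:]:         # visit each unordered pair exactly once
            key = (a, b) if a < b else (b, a)
            rho = corr.get(key)
            if rho is None:
                rho = DEFAULT_RHO
                fallback_pairs += 1
            denominator += 2.0 * weights[a] * weights[b] * rho
    if fallback_pairs and fallback_counter_callback is not None:
        fallback_counter_callback(fallback_pairs)  # unique pairs, not a bare signal
    return numerator / denominator if denominator > 0 else 0.0
```

The callback can then increment a counter by the number of unique pairs that used the default, which keeps the metric proportional to how much of the matrix is missing.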
withRateLimitAndRetry: swallowing non-429 errors
Where: packages/ts-lib/src/hyperliquid.ts
Issue: On any non-rate-limit error, withRateLimitAndRetry returns the fallback silently. That can mask real errors and make debugging harder.
Suggestion: At a minimum log the error with operationName and attempt count. Consider rethrowing non-rate-limit errors in dev/CI, or make a configurable behavior: either return fallback or rethrow/log. E.g. log error stack and operation details.
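
The behavior suggested above is sketched below in Python for consistency with the other examples here; the real withRateLimitAndRetry lives in TypeScript. `RateLimitError`, the `rethrow` flag, and the injectable `sleep` are hypothetical stand-ins, not part of the existing code.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("hyperliquid")


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def with_rate_limit_and_retry(
    operation_name: str,
    call: Callable[[], T],
    fallback: T,
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    rethrow: bool = False,  # e.g. enabled in dev/CI
) -> T:
    """Retry 429s with exponential backoff; log every other error before falling back."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                logger.warning("%s: still rate limited after %d retries",
                               operation_name, max_retries)
                return fallback
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
        except Exception:
            # Non-429 failures are logged with a stack trace instead of being swallowed.
            logger.exception("%s failed on attempt %d", operation_name, attempt + 1)
            if rethrow:
                raise
            return fallback
    return fallback
```

Keeping the fallback return while always logging preserves the current caller contract, and the `rethrow` switch lets development and CI surface real errors.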
SQL casting of payload->>'at' to timestamptz may raise on bad format
Where: multiple queries using COALESCE((payload->>'at')::timestamptz, at)
Issue: If payload->>'at' is present but in an unexpected format, the cast (::timestamptz) will throw and the query will error (COALESCE won’t rescue cast exceptions).
Suggestion: PostgreSQL has no built-in TRY_CAST, so guard the cast explicitly: apply it only when the value matches a known format (for example, a CASE expression with a regex check), or wrap it in an SQL function that returns NULL on parse failure. Alternatively, ensure all payload timestamps are normalized before insert.
Adding many new endpoints — auth, rate-limiting, and monitoring
Recommendation: New endpoints that trigger heavy DB or cross-service work (alpha-pool/refresh, holdings, fetch-history) should be:
Protected (owner tokens or admin-only)
Rate-limited
Instrumented with metrics and structured logs to track usage and failures
For example, alpha-pool/fetch-history may cause a big DB scan — ensure it’s guarded.
Proxy fetch change (method-only) — cross-site headers & CORS
The change to fetch with method but no headers beyond owner-key may drop other important headers (e.g., X-Forwarded-For, content-type). Ensure required headers are forwarded and CORS considerations covered if requests come from the browser.
Tests: heavy new E2E tests may be flaky
The Playwright tests are comprehensive, but many use page.waitForTimeout and rely on default timeouts. Consider replacing fixed timeouts with explicit waits for relevant network requests / DOM states to reduce flakiness.
Also ensure the E2E tests are runnable in CI (baseURL was changed to localhost, which is appropriate for dev; confirm that CI's network and container routing match).
Minor/UX/Style notes

In dashboard.css there are duplicate @keyframes pulse definitions; consolidate them into a single definition to avoid confusion.
Some CSS rules use animations and spinner sizes changed — verify visual regression on small screens.
In packages/ts-lib listLiveFillsForAddresses, whereClause building is OK but ensure you sanitize placeholders properly and guard against empty address arrays (you already handle that).
Consider adding more metrics/logs for the new alpha-pool refresh endpoint: start/finish durations, per-step progress.
Suggested follow-ups (I can do these if you want)

I can produce patch snippets for:
forwarding request body in proxy fetch (Express code)
fix persist.getLastActivityPerAddress SQL to use LOWER(address)
add missing import math to consensus.py and replace print with logger
improve withRateLimitAndRetry to log non-429 errors
Run a quick code search for other places where request proxying was changed to ensure all proxies forward bodies/headers.
Run a quick grep for uses of "OWNER_TOKEN" / proxy patterns to ensure authentication is consistently forwarded.

Implement centralized WebSocket slot management with priority-based allocation:
- SubscriptionManager class in ts-lib for managing limited WebSocket slots
- Priority order: pinned (0) > legacy (1) > alpha-pool (2)
- Manual demote: free a WebSocket slot without auto-filling it
- Manual promote: use a freed slot for a polling address
- Slot reservation: demoted slots stay empty until manually promoted
- Dashboard UI: clickable icons with popover for promote/demote actions
- API endpoints: /subscriptions/status, /methods, /demote, /promote

Tests:
- 48 unit tests for SubscriptionManager (slot reservation, demote/promote)
- E2E tests for dashboard UI (popover, API calls, slot display)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@fzheng fzheng deployed to phase-3 - hlbot-db December 10, 2025 15:54 — with Render Active
@fzheng fzheng temporarily deployed to phase-3 - nats December 10, 2025 15:54 — with Render Inactive
@fzheng fzheng temporarily deployed to phase-3 - hl-scout December 10, 2025 15:56 — with Render Inactive
@fzheng fzheng temporarily deployed to phase-3 - hl-decide December 10, 2025 15:56 — with Render Inactive
@fzheng fzheng temporarily deployed to phase-3 - hl-sage December 10, 2025 15:56 — with Render Inactive
@fzheng fzheng temporarily deployed to phase-3 - hl-stream December 10, 2025 15:58 — with Render Inactive
Alpha Pool Improvements:
- Auto-backfill historical fills when addresses first added to pool
- Add manual backfill endpoint: POST /alpha-pool/backfill/{address}
- Track newly inserted vs updated addresses separately during refresh

WebSocket Fill Routing Fix:
- Fix data integrity bug where fills were broadcast to ALL handlers
- Route fills to specific user's handler based on msg.data.user
- Add comprehensive tests for fill routing logic

E2E Test Improvements:
- Extract mock data and setup into reusable fixtures
- Fix WebSocket test to filter out TradingView's connections
- Fix External Links test to navigate to Legacy tab first
- Increase navigation timeouts for mobile-chrome (60s)
- Add isMobile checks for table visibility assertions

Documentation:
- Update DEVELOPMENT_PLAN.md with backfill feature details
- Fix E2E test count (110 tests across 6 spec files)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: Add Phase 3e design documentation

Analysis Documents:
- sigmapilot-current-architecture.md: Complete architecture overview
  including services, message bus, database schema, signal pipeline
- nofx-ux-and-architecture-notes.md: Reference study of NOFX project
  with UX patterns and architectural ideas (no code copied)

UX Specification:
- dashboard-redesign.md: Complete redesign with 6 main sections
  (Overview, Positions, Signals, Decisions, Traders, Settings)
  including user stories, wireframes, and data requirements

Technical Design:
- multi-exchange-architecture.md: Exchange adapter interface,
  Hyperliquid and Binance implementations, credential encryption,
  rate limiting, auto-trade routing
- decision-logging.md: AI decision log schema, reasoning generator,
  database design, API endpoints, WebSocket events

These documents provide the foundation for implementing:
- Unified multi-exchange P&L tracking
- Human-readable AI decision explanations
- Web-based strategy configuration
- Auto-trade with per-exchange limits

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: Remove external reference document

* feat: Implement Phase 3f Selection Integrity with Shadow Ledger

## Shadow Ledger & Snapshot System
- Add trader_snapshots table (migration 025) for daily snapshots
- Implement snapshot.py with:
  - Thompson sampling with stored seeds for reproducibility
  - Benjamini-Hochberg FDR control (correct k* finding)
  - Skill p-value computation with winsorization
  - Cost-adjusted R-multiple estimation
  - TraderSnapshot dataclass with full state
- Add walk-forward replay for backtesting snapshots
- Create /snapshots/* API endpoints for snapshot management
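
For readers unfamiliar with the "correct k* finding" detail, a minimal sketch of the Benjamini-Hochberg step-up procedure follows. The function name and return shape are illustrative and are not taken from snapshot.py; the default alpha matches SNAPSHOT_FDR_ALPHA.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.10) -> List[bool]:
    """Return a selection mask controlling the false discovery rate at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_star = rank  # keep the LARGEST passing rank; do not stop at a failure
    selected = [False] * m
    for idx in order[:k_star]:
        selected[idx] = True  # everything ranked at or below k* is selected
    return selected
```

The step-up detail matters: with p-values 0.03, 0.04, 0.045 at alpha 0.05, the first two ranks miss their thresholds, but the third passes, so all three are selected. Stopping at the first failure would select none.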

## Risk Governor (hl-decide)
- Add risk_governor.py with drawdown tracking and circuit breakers
- Implement portfolio.py for position/exposure management
- Add decision_logger.py for audit trail
- Add executor.py for paper trading simulation
- Create risk_governor_state table (migration 026)
- Add decision_logs table (migration 023)
- Add execution_config table (migration 024)

## Configurable Thresholds
- Make FDR thresholds configurable via environment variables:
  - SNAPSHOT_MIN_EPISODES (default: 5, prod: 30)
  - SNAPSHOT_FDR_ALPHA (default: 0.10)
  - SNAPSHOT_MIN_AVG_R_NET (default: 0.0, prod: 0.05)
  - ROUND_TRIP_COST_BPS (default: 30)
  - DEATH_DRAWDOWN_PCT (default: 0.80)
  - DEATH_ACCOUNT_FLOOR (default: 10000)
  - CENSOR_INACTIVE_DAYS (default: 30)

## Fresh Install Support
- Add /alpha-pool/backfill-all endpoint for bulk historical data
- Create scripts/init-alpha-pool.sh for initialization after docker compose up
- Add npm run init:alpha-pool and make init commands

## Tests
- 49 tests for snapshot module (test_snapshot.py, test_snapshot_extended.py)
- 13 tests for walk-forward replay (test_walkforward.py)
- 87 tests for risk governor (test_risk_governor.py, test_risk_governor_extended.py)
- 6 tests for decision logger (test_decision_logger.py)

## Documentation
- Update DEVELOPMENT_PLAN.md with Phase 3f details
- Add PHASE_3F_TEST_CASES.md with comprehensive test plan
- Add phase-3e-tech-plan.md design document
- Update README.md and CLAUDE.md with fresh install instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: Add automatic Alpha Pool initialization on fresh install

## Changes
- hl-sage now auto-detects fresh installs (empty Alpha Pool)
- Automatically runs initialization: refresh → backfill → snapshot
- Watch progress with: docker compose logs -f hl-sage

## New Environment Variables
- ALPHA_POOL_AUTO_INIT: Enable/disable auto-init (default: true)
- ALPHA_POOL_AUTO_INIT_DELAY_MS: Rate limit delay (default: 500ms)

## Cross-Platform Support
- Added scripts/init-alpha-pool.mjs (Node.js version)
- npm run init:alpha-pool now uses Node.js (works on Windows)
- Bash script still available for Linux/Mac

## Documentation
- Updated DEVELOPMENT_PLAN.md with auto-init details
- Updated PHASE_3F_TEST_CASES.md with auto-init instructions
- Makefile init target now uses npm for cross-platform support

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: Update DEVELOPMENT_PLAN.md with latest test counts and files

- Update test counts: 1,035 TS, 391 Python, 220 E2E tests
- Add Phase 3f key files: snapshot.py, walkforward.py, decision_logger.py, risk_governor.py
- Add init script to key files
- Update last modified date

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
@fzheng fzheng merged commit 004aec9 into main Dec 13, 2025
1 check passed
@fzheng fzheng deleted the phase-3 branch December 13, 2025 03:27
