Phase 3 #13
Phase 3b implementation:
- Thompson Sampling in hl-sage with NIG posterior sampling
- ATR-based dynamic stops (BTC 2.0x, ETH 1.5x multipliers)
- Trader correlation computation with phi correlation
- Episode tracking with R-multiple calculation
- Comprehensive integration tests (22 new tests)
- Operational runbook for system monitoring
- Fix E2E tests baseURL for Windows compatibility
Test coverage:
- test_thompson_sampling.py: NIG sampling, explore/exploit tradeoff
- test_correlation.py: Bucket IDs, phi correlation, effK calculation
- test_atr.py: ATR computation, stop distance bounds
- test_episode.py: VWAP entry/exit, R-multiple calculation
- test_integration.py: Full signal flow integration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
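For readers unfamiliar with the "NIG posterior sampling" item above, the following is a minimal sketch of Thompson sampling over a Normal-Inverse-Gamma posterior. It is not the hl-sage code: the `NIGPosterior` fields, `sample_mean`, and `thompson_select` are hypothetical names, and the sketch assumes each trader's mean R-multiple is the quantity being sampled.

```python
import random
from dataclasses import dataclass


@dataclass
class NIGPosterior:
    """Normal-Inverse-Gamma posterior over a mean (hypothetical field names)."""
    m: float      # posterior location of the mean
    kappa: float  # pseudo-observation count behind the mean
    alpha: float  # inverse-gamma shape for the variance
    beta: float   # inverse-gamma scale for the variance


def sample_mean(post: NIGPosterior, rng: random.Random) -> float:
    """Draw sigma^2 ~ InvGamma(alpha, beta), then mu ~ N(m, sigma^2 / kappa)."""
    # gammavariate takes (shape, scale); precision ~ Gamma(alpha, rate=beta).
    precision = rng.gammavariate(post.alpha, 1.0 / post.beta)
    sigma2 = 1.0 / precision
    return rng.gauss(post.m, (sigma2 / post.kappa) ** 0.5)


def thompson_select(posteriors: dict, k: int, rng: random.Random) -> list:
    """Sample once per trader and keep the k highest draws."""
    draws = {name: sample_mean(p, rng) for name, p in posteriors.items()}
    return sorted(draws, key=draws.get, reverse=True)[:k]
```

Traders with little data have wide posteriors, so they are occasionally drawn above well-known traders; that is where the explore/exploit tradeoff comes from.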
Peer review findings addressed:
- check_episode_consensus() now uses detector.get_stop_fraction(asset) instead of hardcoded ASSUMED_STOP_FRACTION (1%)
- Added startup trigger for correlation job when no correlations exist
- Removed dead test file (test_pnl.py) that referenced removed function
Tests:
- Added TestATRToConsensusFlow class (4 new tests)
- Total: 101 Python tests passing, 128 E2E tests passing
Docs:
- Added "Peer Review Findings & Fixes" section to DEVELOPMENT_PLAN.md
- Updated Phase 3b status to Complete
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix performPrime() to await upsertCurrentPosition() calls instead of fire-and-forget
- Fix Legacy tab to clear cache and refresh fills on tab activation
- Fix Alpha Pool auto-refresh to properly set is_running state
- Fix refreshAlphaPool() to not show generic loading during refresh progress
- Add 14 unit tests for position priming and tab switching fixes
- Add E2E tests for tab switching data refresh behavior
- Update test count to 977, update docs with December 2025 bug fixes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Dashboard & API:
- Add Alpha Pool refresh progress UI with spinner and step display
- Add /dashboard/api/alpha-pool/refresh/status endpoint for progress polling
- Add /dashboard/api/alpha-pool/fills endpoint for pool-specific fills
- Add /dashboard/api/legacy/fills endpoint for leaderboard-only fills
- Improve Alpha Pool table CSS styling and responsiveness
Backend:
- Add listLiveFillsForAddresses() for filtered fills queries
- Add SAGE_URL env var for hl-stream to proxy Alpha Pool requests
- Add DECIDE_URL env var for hl-sage to fetch consensus outcomes
- Wire hl-stream -> hl-sage dependency in docker-compose
- Add reconciliation support for EpisodeFill in hl-decide
Tests:
- Add integration tests for listLiveFillsForAddresses
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix selectors to use actual HTML IDs (tab-alpha-pool, tab-legacy-leaderboard)
- Fix selectors to use actual data-testid attributes (tab-content-alpha-pool, tab-content-legacy-leaderboard)
- Add refresh button as valid Alpha Pool UI state
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…y, dynamic price bands
- ATR-based dynamic price bands using R-units instead of fixed 8 BPS
- ATR staleness checks with configurable max age (300s default)
- Asset-specific ATR fallbacks (BTC: 0.4%, ETH: 0.6%)
- Correlation time-decay with exponential halflife (3 days default)
- Correlation freshness checks with 7-day staleness threshold
- 24 new tests for ATR and correlation enhancements
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
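The half-life decay and staleness check in this commit can be illustrated with a short sketch. The 3-day half-life and 7-day staleness threshold come from the commit message; the function names are hypothetical, and shrinking a stale correlation toward the default ρ=0.3 (mentioned elsewhere in this PR) is an assumption about how the decay is applied, not a confirmed detail.

```python
def decay_factor(age_days: float, halflife_days: float = 3.0) -> float:
    """Exponential decay: the weight halves every `halflife_days`."""
    return 0.5 ** (age_days / halflife_days)


def decayed_correlation(rho: float, age_days: float,
                        default_rho: float = 0.3,
                        halflife_days: float = 3.0) -> float:
    """ASSUMPTION: an aging estimate is shrunk toward the default correlation."""
    w = decay_factor(age_days, halflife_days)
    return default_rho + w * (rho - default_rho)


def is_stale(age_days: float, max_age_days: float = 7.0) -> bool:
    """A correlation older than the staleness threshold is treated as stale."""
    return age_days > max_age_days
```

With these defaults, a measured ρ of 0.9 that is three days old contributes half of its distance from the default, giving roughly 0.6.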
…ents
ATR Enhancements:
- ATR_STRICT_MODE (default true) blocks gating on hardcoded fallback
- 24h realized vol fallback from marks_1m before using hardcoded values
- ATR validity gate integrated into ConsensusDetector
Vote Weighting Improvements:
- Logarithmic scaling (log(1 + notional/base)) as default mode
- Equity-normalized mode (sqrt(notional/equity)) when data available
- Configurable via VOTE_WEIGHT_MODE, VOTE_WEIGHT_LOG_BASE, VOTE_WEIGHT_MAX
- Track notional and equity in Vote dataclass
Developer Experience:
- Added Makefile with test/docker/dev commands
- npm scripts: test:all, test:py, docker:restart, docker:rebuild, docker:wipe
- 39 new tests for ATR and vote weighting
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
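The two weighting formulas above can be sketched as follows. This is an illustration only: `vote_weight_sketch` is a hypothetical name, the `log_base` and `max_weight` defaults are placeholders (the commit does not state the real defaults for VOTE_WEIGHT_LOG_BASE or VOTE_WEIGHT_MAX), and falling back to log mode when equity is missing is an assumption drawn from "when data available".

```python
import math
from typing import Optional


def vote_weight_sketch(notional: float,
                       equity: Optional[float] = None,
                       mode: str = "log",
                       log_base: float = 10_000.0,   # placeholder default
                       max_weight: float = 1.0) -> float:  # placeholder cap
    """Compute a capped vote weight from position notional."""
    if notional <= 0:
        return 0.0
    if mode == "equity" and equity is not None and equity > 0:
        # Equity-normalized mode: sqrt(notional / equity)
        raw = math.sqrt(notional / equity)
    else:
        # Logarithmic mode (also the assumed fallback without equity data)
        raw = math.log(1.0 + notional / log_base)
    return min(raw, max_weight)
```

Both modes are concave in notional, so doubling a position adds less than double the weight, which limits how much a single large trader can dominate a vote.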
- Rewrite README with high-level business overview
- Add website link (sigmapilot.ai)
- Move technical details to docs/ARCHITECTURE.md
- Remove deprecated fetchUserBtcFills function and tests
- Add test dependencies to Python requirements.txt
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add periodic_correlation_refresh_task() to recompute correlations daily
- Configurable via CORR_REFRESH_INTERVAL_HOURS env var (default 24h)
- Ensures correlation matrix stays fresh with new trading data
- Works with existing decay mechanism (3-day half-life)
- Document new env vars in ARCHITECTURE.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add 7 new integration tests for correlation refresh and hydration
- Test decay application during detector hydration
- Test effK calculation with hydrated correlations
- Test background task configuration defaults
- Update test counts: 151 Python tests, 973 TypeScript tests
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a Hyperliquid WebSocket connection closes (due to network issues, server-side disconnect, etc.), the RealtimeTracker now automatically:
1. Detects the disconnection
2. Saves the list of subscribed users
3. Reconnects with a new WebSocket
4. Resubscribes all users to userEvents
5. Retries on failure with exponential backoff
This fixes a bug where fills would stop being recorded after a WebSocket disconnect, causing addresses to show outdated fills in the Legacy tab.
Note: Addresses beyond the first 10 in the watchlist still rely on manual backfill due to Hyperliquid's 10-user-per-WebSocket-per-IP limit.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
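The reconnect steps above can be sketched in a language-neutral way. The real RealtimeTracker lives in the TypeScript code and is not shown in this PR page, so everything here is hypothetical: `connect` and `subscribe` are injected callables standing in for the WebSocket client, and the delay parameters are placeholders.

```python
import time
from typing import Callable, Iterable, List


def backoff_delays(base: float, factor: float, max_delay: float, retries: int) -> List[float]:
    """Exponentially growing delays, capped at max_delay."""
    return [min(base * factor ** i, max_delay) for i in range(retries)]


def reconnect_and_resubscribe(connect: Callable[[], object],
                              subscribe: Callable[[object, str], None],
                              users: Iterable[str],
                              delays: Iterable[float],
                              sleep: Callable[[float], None] = time.sleep) -> object:
    """Save the user list, then retry connect + resubscribe with backoff."""
    saved_users = list(users)  # step 2: remember who was subscribed
    last_error = None
    for delay in delays:
        try:
            conn = connect()             # step 3: open a new connection
            for user in saved_users:     # step 4: resubscribe every user
                subscribe(conn, user)
            return conn
        except ConnectionError as exc:   # step 5: wait, then try again
            last_error = exc
            sleep(delay)
    raise RuntimeError("reconnect failed after all retries") from last_error
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting.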
Changed /dashboard/api/legacy/fills/fetch-history to fetch ALL available fills from Hyperliquid API (up to 2000 per address) instead of just the first N fills.
This ensures that any gaps between what's stored in the DB and the current time are properly filled when triggering a manual backfill.
The `limit` parameter now only controls how many fills to return in the API response, not how many to fetch from Hyperliquid.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace hardcoded weight = min(notional/100000, 1.0) with calculate_vote_weight() which respects VOTE_WEIGHT_MODE (log/equity/linear)
- Replace fixed 8 bps price band check with consensus_detector.passes_latency_and_price_gates() which uses ATR-based R-unit drift thresholds
- Use proper Vote namedtuple instead of dicts for type safety
- Add docstring noting centralized function usage to prevent future drift
This addresses the peer review feedback about consensus logic drift between main.py and consensus.py.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rate Limiting (ts-lib/hyperliquid.ts):
- Add RateLimiter class with configurable calls/second (default 2/s)
- Wrap all Hyperliquid SDK calls in withRateLimitAndRetry()
- Handle 429 errors with exponential backoff (3 retries)
- Configurable via HL_SDK_CALLS_PER_SECOND, HL_SDK_MAX_RETRIES
Observability Metrics (hl-decide):
- ATR metrics: staleness counter, fallback usage, age gauge, blocked counter
- Correlation metrics: staleness gauge, age, decay factor, pairs loaded
- EffK metrics: value histogram, default fallback counter
- Weight distribution: histogram, max gauge, Gini coefficient gauge
Helper functions:
- calculate_gini() for weight distribution inequality measurement
- update_weight_metrics() and update_correlation_metrics()
These metrics enable alerting on:
- Trading with stale ATR/correlation data
- Over-reliance on default correlation (ρ=0.3)
- Weight concentration (high Gini = few traders dominating)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
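As a reference for the Gini gauge mentioned above, here is a minimal sketch of the standard Gini coefficient over non-negative weights. The body of the repository's calculate_gini() is not shown in this PR, so `gini_sketch` is a stand-in that may differ from it in edge-case handling.

```python
from typing import Sequence


def gini_sketch(weights: Sequence[float]) -> float:
    """Gini coefficient of non-negative weights: 0 = equal, near 1 = concentrated."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        return 0.0  # ASSUMPTION: treat empty or all-zero input as perfectly equal
    ordered = sorted(weights)
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1
    ranked_sum = sum(i * x for i, x in enumerate(ordered, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1) / n
```

Four equal weights give 0.0, while one trader holding all the weight among four gives 0.75, which is the concentration case the alerting is meant to catch.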
Risk Configuration (consensus.py):
- MAX_POSITION_SIZE_PCT: 2% per position (conservative)
- MAX_TOTAL_EXPOSURE_PCT: 10% total (up to 5 positions)
- MAX_DAILY_LOSS_PCT: 5% drawdown halt
- MIN_SIGNAL_CONFIDENCE: 55% win probability required
- MIN_SIGNAL_EV_R: Positive EV required (configurable)
- MAX_LEVERAGE: 1x until Kelly sizing implemented
- SIGNAL_COOLDOWN_SECONDS: 5 minutes between same-symbol signals
Gate 5 - Risk Limits:
- check_risk_limits() validates signals before generation
- Rejects low confidence (<55%) or low EV signals
- Metrics: signal_risk_rejected_total, signal_generated_total
Tests (test_risk_limits.py):
- 13 tests for risk limit logic and default bounds
- Verifies conservative defaults are properly bounded
These are static fail-safes until Phase 4 implements proper Kelly criterion and dynamic position sizing. All limits configurable via environment variables for tuning.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
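A rough sketch of how a gate like this might be structured, using the thresholds listed above. It is not the check_risk_limits() implementation: `RiskLimitsSketch` and `check_signal_sketch` are hypothetical names, the EV threshold of 0.0 is an assumed reading of "positive EV required", and folding the cooldown into the same check is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RiskLimitsSketch:
    min_confidence: float = 0.55     # MIN_SIGNAL_CONFIDENCE from the commit
    min_ev_r: float = 0.0            # ASSUMPTION: EV must be strictly above this
    cooldown_seconds: float = 300.0  # SIGNAL_COOLDOWN_SECONDS (5 minutes)


def check_signal_sketch(win_prob: float, ev_r: float, now: float,
                        last_signal_at: Optional[float],
                        limits: RiskLimitsSketch = RiskLimitsSketch()) -> Tuple[bool, str]:
    """Return (allowed, reason); the first failing limit decides the reason."""
    if win_prob < limits.min_confidence:
        return False, "low_confidence"
    if ev_r <= limits.min_ev_r:
        return False, "low_ev"
    if last_signal_at is not None and now - last_signal_at < limits.cooldown_seconds:
        return False, "cooldown"
    return True, "ok"
```

Returning a reason string alongside the boolean makes it straightforward to label a rejection counter such as signal_risk_rejected_total.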
- Reduced from 600+ lines to ~300 lines
- Added Phase 3c (Observability & Hardening) as complete
- Clearly marked Phase 4 (Risk Management) as next
- Consolidated configuration reference
- Added architecture diagram
- Removed redundant historical details
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rate Limiter Improvements:
- Weight-aware rate limiting (tracks API weight consumption)
- API weight table per Hyperliquid docs (userFills: 20, clearinghouseState: 2)
- Budget-based throttling at 70%/90% thresholds
- Telemetry: 429 counter, backoff delays, calls by operation
Data Quality Metrics:
- effK fallback counter when default ρ is used
- Correlation coverage % (pairs with data vs pool size)
- Pool size gauge for correlation monitoring
Weight Distribution Monitoring:
- Saturation counter (weights at/near cap)
- Saturation percentage gauge
- Per-asset tracking (BTC/ETH)
Quant Pipeline Integration Tests (9 new):
- Low vol + high correlation regime
- High vol + low correlation regime
- Weight disparity effects on effK
- effK fallback tracking
- NIG weight formula verification
- R-multiple sensitivity to stop
- Consensus gates progression
- ATR staleness gating
- Correlation decay effects
Test count: 173 Python tests (was 164)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
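One way to picture weight-aware throttling is a sliding-window budget, sketched below. The two API weights and the 70%/90% thresholds come from the commit; everything else is an assumption made for illustration: the `WeightBudgetSketch` class, the 1200-per-minute default budget, the weight of 1 for unlisted operations, and the "slow" versus "block" behaviour at the two thresholds. The actual limiter is in ts-lib and is not shown here.

```python
import time
from collections import deque
from typing import Callable

# Weights quoted in the commit message; other operations default to 1 (assumption).
API_WEIGHTS = {"userFills": 20, "clearinghouseState": 2}


class WeightBudgetSketch:
    def __init__(self, budget: int = 1200, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.budget = budget
        self.window = window_seconds
        self.clock = clock
        self._events = deque()  # (timestamp, weight) pairs inside the window

    def _used(self, now: float) -> int:
        # Drop calls that have aged out of the sliding window.
        while self._events and now - self._events[0][0] >= self.window:
            self._events.popleft()
        return sum(weight for _, weight in self._events)

    def decide(self, operation: str) -> str:
        """Return 'ok', 'slow' (above 70% of budget) or 'block' (above 90%)."""
        now = self.clock()
        weight = API_WEIGHTS.get(operation, 1)
        projected = (self._used(now) + weight) / self.budget
        if projected > 0.9:
            return "block"  # the call is not recorded because it is not made
        self._events.append((now, weight))
        return "slow" if projected > 0.7 else "ok"
```

A caller would add a delay on "slow" and defer the request on "block"; the injected `clock` keeps the window logic testable.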
- Add getLastActivityPerAddress() function using DISTINCT ON for efficient per-address queries
- Add /dashboard/api/alpha-pool/last-activity endpoint to get most recent fill per trader
- Add click-to-toggle functionality for Last Activity column (relative/absolute format)
- Add 15 E2E tests for Last Activity toggle feature
- Update API documentation with new Dashboard API endpoints
- Add data-testid for Last Activity column header
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add createCounter and createGauge helpers to ts-lib metrics
- Add initRateLimiterMetrics to expose rate limiting stats via /metrics
- Add /data-health endpoint to hl-decide for operational dashboards
  - ATR freshness status (stale/fresh/missing) per asset
  - Correlation coverage health (loaded pairs vs expected)
  - Weight distribution health (Gini, saturation %)
  - Aggregated warnings and overall status
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
High-level summary
Feature: Adds “Alpha Pool” support across services (hl-stream, hl-scout, hl-sage, ts-lib/hyperliquid). New endpoints for alpha-pool data, backfill/holdings/last-activity, and UI work in the dashboard.
- Missing request body when proxying non-GET requests (likely breaks POST proxying).
- dashboard.css adds duplicate @keyframes pulse definitions; consolidate them into a single definition to avoid confusion.
I can produce patch snippets for:
Implement centralized WebSocket slot management with priority-based allocation:
- SubscriptionManager class in ts-lib for managing limited WebSocket slots
- Priority order: pinned (0) > legacy (1) > alpha-pool (2)
- Manual demote: free a WebSocket slot without auto-filling it
- Manual promote: use a freed slot for a polling address
- Slot reservation: demoted slots stay empty until manually promoted
- Dashboard UI: clickable icons with popover for promote/demote actions
- API endpoints: /subscriptions/status, /methods, /demote, /promote
Tests:
- 48 unit tests for SubscriptionManager (slot reservation, demote/promote)
- E2E tests for dashboard UI (popover, API calls, slot display)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
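The priority rule and slot reservation described above can be sketched as a pure function. The priority values come from the commit, and the 10-slot default reflects the 10-user WebSocket limit noted earlier in this PR; `allocate_slots_sketch` and its `reserved` parameter are hypothetical and far simpler than the SubscriptionManager class in ts-lib.

```python
from typing import List, Sequence, Tuple

# Lower number = higher priority, as listed in the commit message.
PRIORITY = {"pinned": 0, "legacy": 1, "alpha-pool": 2}


def allocate_slots_sketch(addresses: Sequence[Tuple[str, str]],
                          max_slots: int = 10,
                          reserved: int = 0) -> Tuple[List[str], List[str]]:
    """Split (address, source) pairs into WebSocket and polling lists.

    `reserved` models demoted slots that stay empty until manually promoted.
    """
    # sorted() is stable, so insertion order breaks ties within a source.
    ranked = sorted(addresses, key=lambda item: PRIORITY[item[1]])
    usable = max(max_slots - reserved, 0)
    websocket = [addr for addr, _ in ranked[:usable]]
    polling = [addr for addr, _ in ranked[usable:]]
    return websocket, polling
```

With three slots and one reserved, only the two highest-priority addresses get WebSocket subscriptions and the rest fall back to polling.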
Alpha Pool Improvements:
- Auto-backfill historical fills when addresses first added to pool
- Add manual backfill endpoint: POST /alpha-pool/backfill/{address}
- Track newly inserted vs updated addresses separately during refresh
WebSocket Fill Routing Fix:
- Fix data integrity bug where fills were broadcast to ALL handlers
- Route fills to specific user's handler based on msg.data.user
- Add comprehensive tests for fill routing logic
E2E Test Improvements:
- Extract mock data and setup into reusable fixtures
- Fix WebSocket test to filter out TradingView's connections
- Fix External Links test to navigate to Legacy tab first
- Increase navigation timeouts for mobile-chrome (60s)
- Add isMobile checks for table visibility assertions
Documentation:
- Update DEVELOPMENT_PLAN.md with backfill feature details
- Fix E2E test count (110 tests across 6 spec files)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: Add Phase 3e design documentation
Analysis Documents:
- sigmapilot-current-architecture.md: Complete architecture overview including services, message bus, database schema, signal pipeline
- nofx-ux-and-architecture-notes.md: Reference study of NOFX project with UX patterns and architectural ideas (no code copied)
UX Specification:
- dashboard-redesign.md: Complete redesign with 6 main sections (Overview, Positions, Signals, Decisions, Traders, Settings) including user stories, wireframes, and data requirements
Technical Design:
- multi-exchange-architecture.md: Exchange adapter interface, Hyperliquid and Binance implementations, credential encryption, rate limiting, auto-trade routing
- decision-logging.md: AI decision log schema, reasoning generator, database design, API endpoints, WebSocket events
These documents provide the foundation for implementing:
- Unified multi-exchange P&L tracking
- Human-readable AI decision explanations
- Web-based strategy configuration
- Auto-trade with per-exchange limits
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: Remove external reference document
* feat: Implement Phase 3f Selection Integrity with Shadow Ledger
## Shadow Ledger & Snapshot System
- Add trader_snapshots table (migration 025) for daily snapshots
- Implement snapshot.py with:
  - Thompson sampling with stored seeds for reproducibility
  - Benjamini-Hochberg FDR control (correct k* finding)
  - Skill p-value computation with winsorization
  - Cost-adjusted R-multiple estimation
  - TraderSnapshot dataclass with full state
- Add walk-forward replay for backtesting snapshots
- Create /snapshots/* API endpoints for snapshot management
## Risk Governor (hl-decide)
- Add risk_governor.py with drawdown tracking and circuit breakers
- Implement portfolio.py for position/exposure management
- Add decision_logger.py for audit trail
- Add executor.py for paper trading simulation
- Create risk_governor_state table (migration 026)
- Add decision_logs table (migration 023)
- Add execution_config table (migration 024)
## Configurable Thresholds
- Make FDR thresholds configurable via environment variables:
  - SNAPSHOT_MIN_EPISODES (default: 5, prod: 30)
  - SNAPSHOT_FDR_ALPHA (default: 0.10)
  - SNAPSHOT_MIN_AVG_R_NET (default: 0.0, prod: 0.05)
  - ROUND_TRIP_COST_BPS (default: 30)
  - DEATH_DRAWDOWN_PCT (default: 0.80)
  - DEATH_ACCOUNT_FLOOR (default: 10000)
  - CENSOR_INACTIVE_DAYS (default: 30)
## Fresh Install Support
- Add /alpha-pool/backfill-all endpoint for bulk historical data
- Create scripts/init-alpha-pool.sh for initialization after docker compose up
- Add npm run init:alpha-pool and make init commands
## Tests
- 49 tests for snapshot module (test_snapshot.py, test_snapshot_extended.py)
- 13 tests for walk-forward replay (test_walkforward.py)
- 87 tests for risk governor (test_risk_governor.py, test_risk_governor_extended.py)
- 6 tests for decision logger (test_decision_logger.py)
## Documentation
- Update DEVELOPMENT_PLAN.md with Phase 3f details
- Add PHASE_3F_TEST_CASES.md with comprehensive test plan
- Add phase-3e-tech-plan.md design document
- Update README.md and CLAUDE.md with fresh install instructions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat: Add automatic Alpha Pool initialization on fresh install
## Changes
- hl-sage now auto-detects fresh installs (empty Alpha Pool)
- Automatically runs initialization: refresh → backfill → snapshot
- Watch progress with: docker compose logs -f hl-sage
## New Environment Variables
- ALPHA_POOL_AUTO_INIT: Enable/disable auto-init (default: true)
- ALPHA_POOL_AUTO_INIT_DELAY_MS: Rate limit delay (default: 500ms)
## Cross-Platform Support
- Added scripts/init-alpha-pool.mjs (Node.js version)
- npm run init:alpha-pool now uses Node.js (works on Windows)
- Bash script still available for Linux/Mac
## Documentation
- Updated DEVELOPMENT_PLAN.md with auto-init details
- Updated PHASE_3F_TEST_CASES.md with auto-init instructions
- Makefile init target now uses npm for cross-platform support
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: Update DEVELOPMENT_PLAN.md with latest test counts and files
- Update test counts: 1,035 TS, 391 Python, 220 E2E tests
- Add Phase 3f key files: snapshot.py, walkforward.py, decision_logger.py, risk_governor.py
- Add init script to key files
- Update last modified date
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
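The "correct k* finding" note in the Phase 3f commit refers to a detail of the Benjamini-Hochberg procedure that is easy to get wrong: k* is the largest rank whose p-value passes its threshold, not the first rank that fails. A minimal sketch follows, using the SNAPSHOT_FDR_ALPHA default of 0.10; `benjamini_hochberg_sketch` is a hypothetical name and not the code in snapshot.py.

```python
from typing import List, Sequence


def benjamini_hochberg_sketch(p_values: Sequence[float], alpha: float = 0.10) -> List[int]:
    """Return the indices of hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        # Keep scanning after a failure: a later rank may still pass.
        if p_values[idx] <= rank / m * alpha:
            k_star = rank
    # Everything ranked at or below k* is rejected, even if it failed its own test.
    return sorted(order[:k_star])
```

For p-values `[0.04, 0.045, 0.5]` the smallest one misses its own threshold (0.0333) but the second passes (0.0667), so k* is 2 and both of the first two traders are selected. Stopping at the first failure would select none.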
No description provided.