Commit 90a185e

feat: add extraction analysis admin dashboard tab (#272)
* feat: mount discogs/musicbrainz data volumes on API service (read-only)

* feat: add extraction analysis router with versions endpoint (Task 2)

  Add GET /api/admin/extraction-analysis/versions that scans flagged/ directories in the Discogs and MusicBrainz data roots and returns versions with their entity types. Includes input validation, configure() wiring in api.py, and full TDD test coverage (15 tests passing).

* feat: add extraction analysis summary endpoint and configure in test fixture (Task 3)

  Wire _extraction_analysis_router.configure() into the shared test_client fixture so the router is properly initialised in integration tests. The summary endpoint (GET /api/admin/extraction-analysis/{version}/summary) with all its helpers (_find_version_root, _read_violations, _load_state_marker, _build_violation_summary) was introduced alongside Task 2; this commit captures the test-fixture wiring that belongs to Task 3.

* feat: add violations list and detail endpoints (Task 4)

  Adds GET /api/admin/extraction-analysis/{version}/violations with pagination and entity_type/severity/rule filters, and GET /api/admin/extraction-analysis/{version}/violations/{record_id} returning raw XML and parsed JSON alongside violation records. Includes _load_record_files helper and _make_flagged_version test fixture shared by Tasks 4-6. 11 new tests.

* feat: add parsing-errors endpoint with defusedxml classification (Task 5)

  Adds GET /api/admin/extraction-analysis/{version}/parsing-errors that classifies each violation as parsing_error (XML has value, JSON doesn't), source_issue (both lack the field), or indeterminate (files missing). Results are cached in memory with a 5-minute TTL. Adds defusedxml and pyyaml as api dependencies with matching type stubs. 7 new tests.

* feat: add compare and prompt-context endpoints (Task 6)

  Adds GET /api/admin/extraction-analysis/{version}/compare/{other_version} returning per-rule direction deltas (improved/worsened/unchanged) between two versions, and POST /api/admin/extraction-analysis/{version}/prompt-context assembling violation records with raw XML, parsed JSON, and optional rule definition loaded from extraction-rules.yaml. Also adds defusedxml, pyyaml, and their type stubs as project dependencies. 11 new tests.

* feat(dashboard): add extraction analysis proxy routes (Task 7)

  Add 7 proxy routes for the extraction analysis API under /admin/api/extraction-analysis/*, update path validation regex to allow dots for version strings (e.g. 20240101.0), update affected tests, and add full test coverage for all new routes.

* feat(dashboard): add Extraction Analysis tab HTML (Task 8)

  Add tab button, sub-view CSS, and full panel HTML for the Extraction Analysis tab: single-version report with pipeline status, violation entity cards, and rule breakdown table; version comparison view; AI prompt generator view; and record detail modal.

* feat(dashboard): add Extraction Analysis JavaScript logic (Task 9)

  Add _esc() helper, EA state properties, event bindings, and all EA methods: version fetching, report rendering (pipeline status, entity cards, rule breakdown), record detail modal, version comparison, select-all, prompt generation and clipboard copy. All dynamic content uses textContent or programmatic DOM construction, never innerHTML with API data.

* test: improve extraction analysis coverage to satisfy Codecov requirements

  Add targeted tests for all 35 missing lines across extraction_analysis.py and admin_proxy.py, bringing both files from 92%/96% to 100% coverage.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
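
The Task 5 classification rule (parsing_error when the XML has a value and the JSON does not, source_issue when both lack the field, indeterminate when files are missing) could be sketched roughly as below. This is an illustration only, not the code in extraction_analysis.py: the function name `classify_violation`, its parameters, and the single-tag field lookup are hypothetical. The commit names defusedxml for parsing; the sketch uses the standard-library parser only to stay self-contained, and it treats the case the message does not describe (JSON already has a value) as indeterminate by assumption.

```python
import xml.etree.ElementTree as ET  # stand-in for defusedxml, which is safer for untrusted XML
from typing import Any, Optional


def classify_violation(
    xml_text: Optional[str],
    parsed: Optional[dict[str, Any]],
    field: str,
) -> str:
    """Hypothetical helper: classify one violation for a single field name."""
    # Either source file is missing: nothing can be compared.
    if xml_text is None or parsed is None:
        return "indeterminate"

    # Look for the first element named `field` anywhere in the raw record.
    node = ET.fromstring(xml_text).find(f".//{field}")
    xml_has_value = node is not None and bool((node.text or "").strip())

    # Treat None and empty containers/strings as "no value" in the parsed JSON.
    json_has_value = parsed.get(field) not in (None, "", [], {})

    if xml_has_value and not json_has_value:
        return "parsing_error"  # the value existed upstream but was lost in parsing
    if not xml_has_value and not json_has_value:
        return "source_issue"  # the source data never had the field
    # Assumption: the commit message does not cover this case.
    return "indeterminate"
```

For example, a record whose XML contains `<title>Abbey Road</title>` but whose parsed JSON has `"title": null` would be reported as a parsing error under this sketch.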
1 parent e66b5bd commit 90a185e

13 files changed

Lines changed: 3148 additions & 11 deletions

api/api.py

Lines changed: 6 additions & 0 deletions
@@ -41,6 +41,7 @@
 import api.routers.collection as _collection_router
 import api.routers.credits as _credits_router
 import api.routers.explore as _explore_router
+import api.routers.extraction_analysis as _extraction_analysis_router
 import api.routers.insights as _insights_router
 import api.routers.insights_compute as _insights_compute_router
 import api.routers.label_dna as _label_dna_router
@@ -256,6 +257,10 @@ async def lifespan(_app: FastAPI) -> AsyncGenerator[None]:  # pragma: no cover
     _search_router.configure(_pool, _redis)
     _insights_compute_router.configure(_neo4j, _pool, _redis)
     _admin_router.configure(_pool, _redis, _config, neo4j_driver=_neo4j)
+    _extraction_analysis_router.configure(
+        discogs_root=os.environ.get("DISCOGS_DATA_ROOT"),
+        musicbrainz_root=os.environ.get("MUSICBRAINZ_DATA_ROOT"),
+    )
     _musicbrainz_router.configure(_pool, _neo4j)
     _network_router.configure(_neo4j, _redis)
     _rarity_router.configure(_neo4j, _pool, _redis)
@@ -397,6 +402,7 @@ async def metrics_middleware(request: Request, call_next: Any) -> Any:
 app.include_router(_collection_router.router)
 app.include_router(_recommend_router.router)
 app.include_router(_admin_router.router)
+app.include_router(_extraction_analysis_router.router)
 app.include_router(_nlq_router.router)
 app.include_router(_rarity_router.router)
 app.include_router(_network_router.router)
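
For context, the router side of the `configure(discogs_root=..., musicbrainz_root=...)` call above might look something like the following. This is a minimal sketch, not the contents of api/routers/extraction_analysis.py: the `_roots` dict, the `list_versions` helper, the returned dict shape, and the assumed directory layout `<root>/flagged/<version>/<entity_type>/` are all hypothetical. It only illustrates the pattern of storing the data roots at startup and scanning flagged/ directories for versions and their entity types, while tolerating unset environment variables (`os.environ.get` returns `None`).

```python
from pathlib import Path
from typing import Optional

# Module-level state, set once at application startup by configure().
_roots: dict[str, Optional[Path]] = {"discogs": None, "musicbrainz": None}


def configure(discogs_root: Optional[str], musicbrainz_root: Optional[str]) -> None:
    """Store the data roots; None (or empty) means the source is not mounted."""
    _roots["discogs"] = Path(discogs_root) if discogs_root else None
    _roots["musicbrainz"] = Path(musicbrainz_root) if musicbrainz_root else None


def list_versions() -> list[dict]:
    """Scan <root>/flagged/<version>/<entity_type>/ (an assumed layout)."""
    results: list[dict] = []
    for source, root in _roots.items():
        if root is None:
            continue  # this source was not configured
        flagged = root / "flagged"
        if not flagged.is_dir():
            continue  # nothing has been flagged for this source
        for version_dir in sorted(p for p in flagged.iterdir() if p.is_dir()):
            entity_types = sorted(c.name for c in version_dir.iterdir() if c.is_dir())
            results.append(
                {
                    "source": source,
                    "version": version_dir.name,
                    "entity_types": entity_types,
                }
            )
    return results
```

Keeping the roots optional means the API can still start when a data volume is not mounted; in that case the scan simply returns no versions for that source.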
