…and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

_Nothing yet._

## [0.9.0] - 2026-05-06

### Breaking

- **Phase 13.5 production LLM runtime cutover (port 8101 → 8102)** — `mlx_lm.server` on port 8101 has been replaced by `llama-server` (llama.cpp b9000+, OpenAI-compatible API) on port 8102, serving `mdemg-llm-v1.Q5_K_M.gguf`. After upgrading, installs with `LLM_ENDPOINT=http://127.0.0.1:8101/v1` (or `http://host.docker.internal:8101/v1` in a containerized `.env`) will fail mdemg's startup preflight. **Migration**: edit `.env` and change port `8101` to `8102` everywhere it appears. The `com.mdemg.llama-server.plist` LaunchAgent is auto-installed by the formula's `post_install` hook (`mdemg upgrade --docker-only`). **Rollback**: restore `~/Library/LaunchAgents/com.mdemg.mlx-server.plist.disabled-phase13_5`, set `LLM_ENDPOINT=http://127.0.0.1:8101/v1`, boot out llama-server, and bootstrap mlx-server; note that this reintroduces the ~14-min Metal-OOM crash cycle that motivated the cutover. Full rationale in CLAUDE.md under "Local LLM Runtime — llama.cpp llama-server (Phase 13.5 cutover)".
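  A minimal migration sketch, assuming a flat `.env` in the working directory (the pre-cutover value written below is illustrative, not your real config):

  ```shell
  # Hedged sketch: swap the old mlx_lm.server port (8101) for the llama-server
  # port (8102) in .env, keeping a rollback copy. Paths are assumptions.
  ENV_FILE=".env"
  printf 'LLM_ENDPOINT=http://127.0.0.1:8101/v1\n' > "$ENV_FILE"   # example legacy entry
  cp "$ENV_FILE" "$ENV_FILE.bak-phase13_5"                         # rollback copy
  sed 's|:8101/v1|:8102/v1|g' "$ENV_FILE.bak-phase13_5" > "$ENV_FILE"
  cat "$ENV_FILE"   # LLM_ENDPOINT now points at llama-server on 8102
  ```

  The backup file doubles as the rollback value if you need to boot mlx-server back up.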
- **`MLX_*` env vars deprecated in favor of `LLM_*`** — Phase 13.6 renamed the watchdog/preflight env-var family. Legacy `MLX_*` names continue to work but emit `WARN config: env var deprecated, please rename` at boot; the aliases may be removed one release cycle or more after this commit. Migration: rename `MLX_WATCHDOG_ENABLED` → `LLM_WATCHDOG_ENABLED`, `MLX_PROBE_INTERVAL_SEC` → `LLM_PROBE_INTERVAL_SEC`, `MLX_PROBE_TIMEOUT_SEC` → `LLM_PROBE_TIMEOUT_SEC`, `MLX_FAIL_FAST_ENABLED` → `LLM_FAIL_FAST_ENABLED`, and `MDEMG_ALLOW_NO_MLX` → `MDEMG_ALLOW_NO_LLM`. The internal Go package (`internal/mlxprobe/`) and the Prometheus metric prefix (`mdemg_mlx_*`) keep their names: the former is operator-invisible, the latter is dashboard-coupled.
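  The renames above can be applied in bulk; a hedged sketch, assuming a flat `.env` (the file contents written below are illustrative, and only the `sed` rules reflect the renames in this entry):

  ```shell
  # Hedged sketch: bulk-rename deprecated MLX_* vars to LLM_*, with the one
  # special case MDEMG_ALLOW_NO_MLX -> MDEMG_ALLOW_NO_LLM handled separately.
  ENV_FILE=".env"
  printf 'MLX_WATCHDOG_ENABLED=true\nMLX_PROBE_INTERVAL_SEC=30\nMDEMG_ALLOW_NO_MLX=false\n' > "$ENV_FILE"
  sed -e 's/^MLX_/LLM_/' \
      -e 's/^MDEMG_ALLOW_NO_MLX=/MDEMG_ALLOW_NO_LLM=/' \
      "$ENV_FILE" > "$ENV_FILE.new" && mv "$ENV_FILE.new" "$ENV_FILE"
  cat "$ENV_FILE"
  ```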

### Added

- **POST-FT-LORA-PHASE10.5: UBENCH framework promotion** (2026-05-06, commit `0389b49`) — Promotes `neural.benchmarks.run_benchmark` to a UxTS-pattern framework. New `docs/tests/ubench/{schema,specs,runners,contracts}/` tree with lint / contract / run modes; `mdemg.ubench.json` is pinned at config sha `76c97eb8` + holdout sha `b2004783` (17 specs / 108 rows / `min_rows_per_task=3`). New Makefile targets: `make test-ubench{,-lint,-contract,-run}`. Pytest contract entry point: `pytest docs/tests/ubench/contracts/`. Closes #215. Phase 10.5 sub-part 3 (the Phase 5 SFT re-pass on `guardrail.evaluate`, ~4-8h on MLX) remains operator-deferred (#216); the UBENCH contract is the framework-level mitigation while that stays open. Feature doc: `docs/features/ubench-framework.md`.
- **Claude Code GitHub App workflows** (2026-05-06, PRs #378 + #379) — `.github/workflows/claude.yml` handles `@claude` mentions in issues and PR comments; `.github/workflows/claude-code-review.yml` auto-reviews PRs on `opened`/`synchronize`/`ready_for_review`/`reopened`. Powered by `anthropics/claude-code-action@v1`, the `CLAUDE_CODE_OAUTH_TOKEN` repo secret, and Claude Code GitHub App installation on `reh3376/mdemg`.

### Changed

- **POST-FT-LORA-PHASE14.2.3: Per-category context-column weight (default-on)** (2026-05-06) — **120q full A/B PASSED merge gate** (mean +0.009, std -0.023, 11 improvements, **0 regressions**).
  - The Phase 14.2.2 forensic identified 3 categories where the column consistently regressed: `service_relationships` -0.043, `business_logic_constraints` -0.023, `relationship` -0.017. This sprint zero-weights the column for those 3 categories while keeping the default `RETRIEVAL_CONTEXT_COLUMN_WEIGHT=0.10` elsewhere, mirroring Phase 14.1's per-category sparse-gate dispatch.
  - New env knob `RETRIEVAL_CONTEXT_COLUMN_CATEGORY_WEIGHTS` (JSON map) ships with the 3-category zero-weight seed as its default; an operator-supplied JSON value REPLACES the seed rather than merging with it.
  - **Defaults flipped**: `CONTEXT_FINGERPRINT_ENABLED` `false→true`, `RETRIEVAL_CONTEXT_COLUMN_ENABLED` `false→true`. Operator opt-out: `RETRIEVAL_CONTEXT_COLUMN_ENABLED=false`.
  - Per-category breakdown post-flip: `architecture_structure` +0.030, `computed_value` +0.070, `data_flow_integration` +0.010, `business_logic_constraints` +0.005 (was -0.023, now neutral), `service_relationships` -0.010 (was -0.043). std dropped 0.072→0.049 (much tighter retrieval scores); min jumped 0.000→0.350 (zero-score rescues stick).
  - Post: [`phase_14_2_3_post.md`](docs/development/post-ft-lora/phase_14_2_3_post.md). OpenAI spend: ~$10 (one B_full; A_full reused from 14.2.2).
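  The knob's default seed, written out as an explicit operator setting; a hedged sketch (the `python3` call is only a local JSON validity check, not part of mdemg):

  ```shell
  # Hedged sketch: the 3-category zero-weight default seed, expressed as the
  # JSON map RETRIEVAL_CONTEXT_COLUMN_CATEGORY_WEIGHTS accepts. An operator
  # value REPLACES this seed (no merge), so re-list every override you want.
  export RETRIEVAL_CONTEXT_COLUMN_WEIGHT=0.10
  export RETRIEVAL_CONTEXT_COLUMN_CATEGORY_WEIGHTS='{"service_relationships":0.0,"business_logic_constraints":0.0,"relationship":0.0}'
  echo "$RETRIEVAL_CONTEXT_COLUMN_CATEGORY_WEIGHTS" | python3 -m json.tool   # sanity-check the JSON parses
  ```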