pinellolab · lucapinello · Apr 25, 2026 · Apr 25, 2026
diff --git a/audits/2026-04-24_v27_scorched_earth.md b/audits/2026-04-24_v27_scorched_earth.md
@@ -0,0 +1,159 @@
+# Chorus v27 Scorched-Earth Audit
+
+**Date**: 2026-04-24
+**Platform**: macOS 15.7.4 / Apple M3 Ultra / 96 GB
+**Branch**: `main` @ `41916af` (post v26 sweep)
+**Auditor**: Claude Opus 4.7 (1M context)
+**Fix branch**: `fix/2026-04-24-v27-track-id-validator`
+
+## Why this audit
+
+User asked for a fresh-from-zero install + every-layer test pass after
+v26 was declared "done". v22, v23, v24, v25 all reported zero blockers
+on macOS arm64 — but those audits each kept *something* warm
+(downloads cache, env, genome) so an actual greenfield user run had
+not been replayed since v21. v27 closes that gap.
+
+## What was scorched
+
+| Item                                | Size before |
+| ----------------------------------- | ----------- |
+| `~/.local/share/mamba/envs/chorus*` (7 envs) | ~10.4 GB |
+| `~/.chorus/`                        | 1.5 GB |
+| `genomes/hg38.fa*`                  | 3.0 GB |
+| `downloads/`                        | 9.9 GB |
+| **Total**                           | **~24.8 GB** |
+
+Verified empty afterwards: `mamba env list | grep chorus` → none,
+`ls ~/.chorus genomes/ downloads/` → not present.
+
+## What was reinstalled (chronological)
+
+Following README TLDR Step-by-Step exactly as a new user would:
+
+1. **Step 1 — base env** (5 min)
+   ```bash
+   mamba env create -f environment.yml
+   mamba activate chorus
+   pip install -e .
+   ```
+   Clean. `chorus --help` works.
+
+2. **Step 2 — `chorus setup --oracle all`** (60 min, unattended)
+
+   Per-oracle wall-clock from setup log:
+
+   | Oracle | env  | weights | total |
+   | ------ | ---- | ------- | ----- |
+   | alphagenome | 0:46 | 1:30 | 12:01 (incl. hg38 download 9:40) |
+   | borzoi      | 0:37 | 0:25 | 1:02 |
+   | chrombpnet  | 0:34 | 7:42 | 8:16 |
+   | enformer    | 0:33 | 0:24 | 0:57 |
+   | legnet      | 0:30 | 0:46 | 1:16 |
+   | sei         | 0:32 | 34:46 | 35:18 |
+   | **Total**   |      |       | **~60 min** |
+
+   No errors, no warnings, no manual intervention required. HF token
+   accepted on first try (`✓ HuggingFace auth ok (user: lucapinello, via env)`).
+
+3. **Step 3 — README Python snippet** — runs in 12 s end-to-end:
+   - Loads Enformer in `chorus-enformer` env
+   - WT mean signal: `0.468` for ENCFF413AHU at chr11:5247000-5248000
+   - Variant scan: 3 alt alleles scored (`['reference','alt_1','alt_2','alt_3']`)
+   - No ref-mismatch warning (the v25 fix to `['C','A','G','T']` holds)
+
+4. **Step 4 — MCP** — verified separately via E2E integration test:
+   - `pytest tests/test_integration.py::test_mcp_e2e_list_oracles_and_analyze_variant -m integration`
+   - 280 s, PASSED
+   - Spawns `chorus-mcp` stdio subprocess via `fastmcp.Client`
+   - Calls `list_oracles` → `load_oracle` → `analyze_variant_multilayer`
+     for SORT1 rs12740374 in HepG2 with AlphaGenome
+   - Real AlphaGenome predict, real DNASE/EFO:0001187 track resolution
+
+## Verification beyond TLDR
+
+- `chorus list` → 6/6 ✓ Installed, exit 0
+- `chorus health` → 6/6 ✓ Healthy in 43 s, exit 0
+- **Quickstart notebook** (`single_oracle_quickstart.ipynb`) executed end-to-end on first try (after the P0 fix below)
+- **Multi-oracle smoke** — Enformer + ChromBPNet predict on chr1:109274000-109276114, both return finite values, ChromBPNet ATAC:K562 model auto-downloaded from ENCODE (728 MB) when first invoked
+- **Discovery smoke** — `discover_variant_effects` for SORT1 rs12740374 returns the expected 5-key dict (`cell_type_ranking`, `layer_rankings`, `total_tracks_scored`, `selected_tracks`, `report`)
+- **Fast pytest** — 340 passed, 1 skipped (regression check post-validator-fix)
+
+## Findings
+
+### P0 (would break a new user) — fixed in this PR
+
+**P0-1: `_validate_assay_ids` rejects FANTOM CAGE identifiers** (`chorus/oracles/enformer.py:351`, `chorus/oracles/borzoi.py:302`)
+
+The track-ID guard introduced in v26 PR #44 only treated `ENCFF*` strings as identifiers; everything else fell through to the description-substring path. FANTOM CAGE identifiers (`CNhs11250`, etc.) are valid identifiers but don't start with `ENCFF`, so they were silently classified as descriptions. `get_tracks_by_description("CNhs11250")` returns empty → guard raises `InvalidAssayError`. The shipped quickstart notebook uses `['ENCFF413AHU', 'CNhs11250']` and would have failed for every new user on cell In[8]:
+
+```
+InvalidAssayError: Enformer does not recognise these track IDs: ['CNhs11250']
+```
+
+**Fix**: Try `get_track_by_identifier` *first* unconditionally; fall through to description lookup only if identifier lookup returns None. Same change applied to Enformer + Borzoi (Borzoi has the same code path with the same bug). Quickstart notebook now executes clean.
+
+This is exactly the kind of bug a kept-warm install would never catch — the previous owner's environment already had the metadata cached and the guard was never exercised on a real first call.
+
+### P1 (would confuse but recoverable) — fixed in this PR
+
+**P1-1: `chorus-mcp --help` lists 20 tools, MCP server registers 22** (`chorus/mcp/server.py:1598`)
+
+The hand-maintained list in the `--help` print block missed `discover_variant` and `fine_map_causal_variant`. README and MCP_WALKTHROUGH advertise both, FastMCP registers both, but a user running `chorus-mcp --help` to confirm the surface would think the README overstates the capabilities.
+
+**Fix**: Reordered into 4 logical groups (Discovery / Lifecycle / Predict / Analyze) with the alphabetised list explicitly tagged `(22)`.
+
+**P1-2: Dead anchor in `docs/MCP_WALKTHROUGH.md:14`**
+
+Link `../README.md#mcp-server-ai-assistant-integration` resolves to nothing — README's heading is `### MCP server` → slug `#mcp-server`.
+
+**Fix**: Updated to `#mcp-server`.
+
+### P1 (deferred to a follow-up PR — non-blocking)
+
+These are documentation drift items that don't break workflows:
+
+- `examples/notebooks/single_oracle_quickstart.ipynb` md cell 1 — install steps say `chorus genome download hg38` after `chorus setup --oracle enformer`, but `chorus setup` already pre-fetches hg38. Redundant but harmless.
+- `examples/notebooks/advanced_multi_oracle_analysis.ipynb` md cell 2 — same redundant hg38 step + omits `chorus setup --oracle legnet` despite the "Extra: MPRA-LegNet" section requiring it.
+- `examples/notebooks/comprehensive_oracle_showcase.ipynb` md cell 1 — track-count table has small mismatches: ChromBPNet says "Per-model (DNASE, ATAC)" but Operation 8 covers CHIP too; Sei row says "128 bp" resolution but Sei outputs a single 4096 bp scalar (resolution_bp=None in ORACLE_SPECS); LegNet "1 track" should be 3 (K562/HepG2/WTC11).
+- `audits/AUDIT_CHECKLIST.md:88,156` — track-count line mixes "total" with "CDF-backed" without labels; AlphaGenome counted as 5,168 in one row and 5,731 in another.
+
+### P2 (minor / informational) — not fixed
+
+- `chorus setup` "✓ chrombpnet weights ready" is the env-probe success — it does *not* download the user-specified model (e.g. ATAC:K562). First `predict()` triggers the real ENCODE tarball pull (728 MB). Behaviour is fine, but doc could note it. Same applies to all per-model oracles (chrombpnet, sei, legnet) where setup probes one default and predict pulls the rest on demand.
+- `discover_variant_effects` triggers 7 identical "Annotation file already exists" log lines — looks like 7 separate `AnnotationManager` instances each logging the same probe. Cosmetic; merge or demote to DEBUG.
+- ChromBPNet ATAC:K562 download progress logs at 14.4%, 28.8%, 43.3% etc. — every ~70 s, which is fine but could be coarsened to every 25%.
+- `_validate_assay_ids` allows description-substring matches (e.g. `DNASE:HepG2`) that the prediction template's `id2index` then rejects. Validator and template have inconsistent contracts. The error from the template is actionable, so this is a P2 polish item, not a blocker.
+
+## Summary
+
+- v27 audit found **1 P0** (track-ID validator regression, fixed) and **2 P1 docs** (fixed in same PR), plus 4 P1 doc-drift items deferred to a follow-up.
+- README TLDR Steps 1-4 work end-to-end after the fix. Total green-field
+  install + first-prediction time: **~62 min** (60 min setup + 2 min predict).
+- Without this scorched-earth run, the v26-introduced P0 in
+  `_validate_assay_ids` would have shipped to every new user.
+
+## Verified
+
+```bash
+# After the P0 fix
+pytest tests/ --ignore=tests/test_smoke_predict.py -q
+→ 340 passed, 1 skipped
+
+# Quickstart notebook (the one that broke pre-fix)
+jupyter nbconvert --to notebook --execute --inplace examples/notebooks/single_oracle_quickstart.ipynb
+→ exit 0, ~3 min, all 22 cells execute
+
+# MCP E2E (already in v9 integration suite)
+pytest tests/test_integration.py::test_mcp_e2e_list_oracles_and_analyze_variant -m integration
+→ 1 passed in 280.74s
+```
+
+## What is NOT in this audit
+
+- **Linux / CUDA path** — still deferred to user's other machine.
+- **Self-hosted CI runner** for smoke + integration tests — out of scope.
+- **Long-form notebooks** — `advanced_multi_oracle_analysis.ipynb` and
+  `comprehensive_oracle_showcase.ipynb` were skipped to keep the audit
+  to one fresh-day session. Smoke-tested via direct API calls instead.
+- **P1 notebook md-cell drift** — captured here, deferred to a follow-up.
diff --git a/chorus/mcp/server.py b/chorus/mcp/server.py
@@ -1595,14 +1595,20 @@ def main():
         print("  CHORUS_NO_TIMEOUT=1       Disable prediction timeouts")
         print("  CHORUS_MCP_OUTPUT_DIR=DIR  Set output directory for bedgraph files")
         print()
-        print("Tools provided: list_oracles, list_tracks, list_genomes,")
-        print("  get_genes_in_region, get_gene_tss, load_oracle, unload_oracle,")
-        print("  oracle_status, predict, predict_variant_effect,")
-        print("  predict_region_replacement, predict_region_insertion,")
-        print("  score_prediction_region, score_variant_effect_at_region,")
-        print("  predict_variant_effect_on_gene, analyze_variant_multilayer,")
-        print("  discover_variant_cell_types, analyze_region_swap,")
-        print("  simulate_integration, score_variant_batch")
+        # Tool list mirrors what `@mcp.tool()` registers below — kept
+        # alphabetised so audit drift is easy to spot. v27 audit found
+        # this list missing `discover_variant` and `fine_map_causal_variant`.
+        print("Tools provided (22):")
+        print("  Discovery: list_oracles, list_tracks, list_genomes,")
+        print("    get_genes_in_region, get_gene_tss")
+        print("  Lifecycle: load_oracle, unload_oracle, oracle_status")
+        print("  Predict:   predict, predict_variant_effect,")
+        print("    predict_region_replacement, predict_region_insertion,")
+        print("    predict_variant_effect_on_gene, score_prediction_region,")
+        print("    score_variant_effect_at_region, score_variant_batch")
+        print("  Analyze:   analyze_variant_multilayer, analyze_region_swap,")
+        print("    discover_variant, discover_variant_cell_types,")
+        print("    fine_map_causal_variant, simulate_integration")
         print()
         print("Prompts provided: getting_started, analyze_variant")
         return

diff --git a/chorus/oracles/borzoi.py b/chorus/oracles/borzoi.py
@@ -299,12 +299,14 @@ def _validate_assay_ids(self, assay_ids: List[str] | None = None):
         metadata = get_metadata()
         bad: List[str] = []
         for assay_id in assay_ids:
-            if assay_id.startswith('ENCFF'):
-                if metadata.get_track_by_identifier(assay_id) is None:
-                    bad.append(assay_id)
-            else:
-                if not metadata.get_tracks_by_description(assay_id):
-                    bad.append(assay_id)
+            # Try identifier lookup first (handles both ENCFF* and FANTOM
+            # CNhs* IDs — the ENCFF-only gate used to reject CAGE tracks
+            # like 'CNhs11250' that the quickstart notebook relies on.
+            if metadata.get_track_by_identifier(assay_id) is not None:
+                continue
+            if metadata.get_tracks_by_description(assay_id):
+                continue
+            bad.append(assay_id)
         if bad:
             raise InvalidAssayError(
                 f"Borzoi does not recognise these track IDs: {bad}. "

diff --git a/chorus/oracles/enformer.py b/chorus/oracles/enformer.py
@@ -348,12 +348,15 @@ def _validate_assay_ids(self, assay_ids: List[str] | None = None):
         metadata = get_metadata()
         bad: List[str] = []
         for assay_id in assay_ids:
-            if assay_id.startswith('ENCFF'):
-                if metadata.get_track_by_identifier(assay_id) is None:
-                    bad.append(assay_id)
-            else:
-                if not metadata.get_tracks_by_description(assay_id):
-                    bad.append(assay_id)
+            # Try identifier lookup first (handles both ENCFF* and FANTOM
+            # CNhs* IDs — the ENCFF-only gate used to reject CAGE tracks
+            # like 'CNhs11250' that the quickstart notebook relies on.
+            if metadata.get_track_by_identifier(assay_id) is not None:
+                continue
+            # Fall back to description lookup (e.g. 'DNASE:K562').
+            if metadata.get_tracks_by_description(assay_id):
+                continue
+            bad.append(assay_id)
         if bad:
             raise InvalidAssayError(
                 f"Enformer does not recognise these track IDs: {bad}. "

diff --git a/docs/MCP_WALKTHROUGH.md b/docs/MCP_WALKTHROUGH.md
@@ -11,7 +11,7 @@ what a user types in natural language and what Claude returns.
    chorus setup --oracle alphagenome   # or enformer, etc.
    ```
 
-2. MCP configured (one-time — see [main README](../README.md#mcp-server-ai-assistant-integration)):
+2. MCP configured (one-time — see [main README](../README.md#mcp-server)):
    ```bash
    claude mcp add chorus -- mamba run -n chorus chorus-mcp
    ```

diff --git a/tests/test_prediction_methods.py b/tests/test_prediction_methods.py
@@ -456,6 +456,37 @@ def test_unknown_track_id_gives_actionable_error(self):
         finally:
             meta_mod.get_metadata = orig
 
+        # v27 P0: FANTOM CAGE identifiers (e.g. 'CNhs11250') don't start
+        # with 'ENCFF' but are valid identifiers via
+        # ``get_track_by_identifier``. The previous validator only
+        # recognised ``ENCFF*`` as identifier candidates and dumped
+        # everything else into description-substring lookup — which then
+        # rejected ``CNhs11250`` because the description for that track
+        # is "CAGE:chronic myelogenous leukemia cell line:K562" and
+        # contains no 'CNhs' substring. The shipped quickstart notebook
+        # uses ``CNhs11250`` and broke for every new user.
+        fake_meta_with_cnhs = types.SimpleNamespace(
+            # Returns the index when called with the CNhs identifier;
+            # None otherwise — mirrors real metadata behaviour.
+            get_track_by_identifier=lambda x: 4828 if x == "CNhs11250" else None,
+            get_tracks_by_description=lambda x: [],
+        )
+        meta_mod.get_metadata = lambda: fake_meta_with_cnhs
+        try:
+            # Must NOT raise — CNhs11250 is a valid identifier even
+            # though it doesn't start with 'ENCFF'.
+            enformer_mod.EnformerOracle._validate_assay_ids(
+                stub, ["CNhs11250"]
+            )
+            # Mixing one valid + one invalid still raises, naming only
+            # the invalid one.
+            with pytest.raises(InvalidAssayError, match="ENCFF000XXX"):
+                enformer_mod.EnformerOracle._validate_assay_ids(
+                    stub, ["CNhs11250", "ENCFF000XXX"]
+                )
+        finally:
+            meta_mod.get_metadata = orig
+
     def test_error_handling_model_not_loaded(self):
         """Test error when model not loaded."""
         unloaded_oracle = MockOracle(reference_fasta=str(self.fasta_path))