
Commit 4b9df48

lp698 and claude committed
v23 addendum: notebooks + MCP + P2 fix for pre-v19-calibrated hardcoded position
Pushback from user on v23 initial report: "did you run all the notebooks? run all the MCP walkthroughs? check tracks, findings?" v23 had only installed enformer + run pytest. Re-opened the audit to cover the missing pieces.

## Notebooks run end-to-end (scorched-earth env)

| Notebook | Cells | Errors | Warnings |
|---|---|---|---|
| single_oracle_quickstart.ipynb | 49 | 0 | 0 |
| advanced_multi_oracle_analysis.ipynb (pre-fix) | 127 (57 code) | 0 | **1** |
| advanced_multi_oracle_analysis.ipynb (post-fix) | 127 (57 code) | 0 | **0** |
| comprehensive_oracle_showcase.ipynb | aborts cell 9 | 1 (expected) | n/a |

comprehensive aborts on `No module named 'borzoi_pytorch'`: Borzoi env not installed in this scorched-earth scope. Expected.

## P2 fix landed in this PR

examples/notebooks/advanced_multi_oracle_analysis.ipynb cell 67 had `first_G_position_in_int = 108`. That hardcoded offset was calibrated to the pre-v19 off-by-one in predict_variant_effect: 108 only pointed at the G of `CCAGAGGGC` because the ref-check was reading one position to the right of what the user said. Post-PR #32, the code correctly reads the base at the user-given position, and interval-offset 108 is the A in `CCAGAGGGC`, not the first G.

The warning

> Provided reference allele 'G' does not match the genome at this position ('A'). Chorus will use the provided allele.

was scientifically real: Chorus was substituting G at the A position and predicting "mutating the A before the motif" while the notebook claimed it was predicting "mutating the first G of the CTCF motif". Shipped notebook text said one thing; the actual computation tested another.

Fix: `108 → 109` so variant_pos lands on 1-based chr2:246676 = the first G of CCAGAGGGC. Verified via extract_sequence. Re-ran the notebook post-fix: 0 errors, 0 warnings.

## MCP server end-to-end

Spawned chorus-mcp over stdio via fastmcp.Client + StdioTransport. 3 tool calls succeeded: list_oracles (6 oracles, correct specs), list_tracks('enformer') (4 assay types + 1267 cell types), oracle_status ({"loaded_oracles":[]}).

## Extra oracles installed

- chorus setup --oracle chrombpnet → ✓ 9m 2s (env + ATAC:K562 fold 0 from ENCODE + background + hg38 already present); marker written.
- chorus setup --oracle legnet → ✓ ~2m (tiny weights); marker written.

Both end with chorus health → Healthy.

## Walkthrough spot-check

scripts/regenerate_multioracle.py --oracle chrombpnet reproduces the committed chrombpnet effect size for rs12740374 G>T within 2e-6 (CPU non-determinism); reverted the regen.

## Docs consistency

Every tool name referenced in examples/walkthroughs/**/README.md (9 unique) exists in the MCP registry. No orphans.

## Deferred (not exercised in this v23 scope)

- borzoi, sei, alphagenome setup/predict/walkthroughs: need `chorus setup --oracle all` (2–4h) + HF_TOKEN for AG
- comprehensive_oracle_showcase.ipynb (needs all 6 oracles)
- AG-primary walkthroughs (variant_analysis/SORT1/BCL11A/FTO, validation/CEBP/TERT, discovery/SORT1, causal/SORT1_locus, sequence_engineering, batch_scoring); previously verified in v21/v22

Fast suite in fresh env: 338 passed / 2 skipped (unchanged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
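The off-by-one described in the P2 fix can be reproduced on a toy sequence. The sketch below is illustrative only: the padded `interval` string is made up (it is not the real chr2 sequence), and `ref_base_buggy` / `ref_base_fixed` are hypothetical stand-ins for the pre-v19 and post-PR #32 ref-check behaviour as the commit message describes it, not Chorus functions.

```python
MOTIF = "CCAGAGGGC"

# Toy interval (assumption, NOT real genome sequence): pad with T so that the
# first G of the motif sits at 0-based interval offset 109.
first_g_in_motif = MOTIF.index("G")  # the first G is the 4th base of the motif
interval = "T" * (109 - first_g_in_motif) + MOTIF + "T" * 20


def ref_base_buggy(seq: str, offset: int) -> str:
    """Hypothetical pre-fix check: reads one position to the right."""
    return seq[offset + 1]


def ref_base_fixed(seq: str, offset: int) -> str:
    """Hypothetical post-fix check: reads exactly the position given."""
    return seq[offset]


for offset in (108, 109):
    print(offset, "buggy:", ref_base_buggy(interval, offset),
          "fixed:", ref_base_fixed(interval, offset))
```

With the buggy reader, offset 108 appears to hit the G, which is why the old hardcoded value went unnoticed; with the fixed reader, 108 lands on the A and only 109 lands on the first G.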
1 parent a645404 commit 4b9df48

15 files changed

Lines changed: 10738 additions & 1 deletion
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+[NbConvertApp] Converting notebook examples/notebooks/single_oracle_quickstart.ipynb to notebook
+[NbConvertApp] Writing 671837 bytes to /tmp/v23_quickstart.ipynb

audits/2026-04-23_v23_scorched_earth/logs/09_setup_chrombpnet.txt

Lines changed: 328 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+2026-04-23 08:05:55,571 - httpx - INFO - HTTP Request: GET https://pypi.org/pypi/fastmcp/json "HTTP/1.1 200 OK"
+
+
+╭──────────────────────────────────────────────────────────────────────────────╮
+│ │
+│ │
+│ ▄▀▀ ▄▀█ █▀▀ ▀█▀ █▀▄▀█ █▀▀ █▀█ │
+│ █▀ █▀█ ▄▄█ █ █ ▀ █ █▄▄ █▀▀ │
+│ │
+│ │
+│ │
+│ FastMCP 3.2.4 │
+│ https://gofastmcp.com │
+│ │
+│ 🖥 Server: Chorus Genomics, 3.2.4 │
+│ 🚀 Deploy free: https://horizon.prefect.io │
+│ │
+╰──────────────────────────────────────────────────────────────────────────────╯
+
+
+[04/23/26 08:05:55] INFO Starting MCP server 'Chorus transport.py:209
+Genomics' with transport 'stdio'
+2026-04-23 08:05:55,608 - mcp.server.lowlevel.server - INFO - Processing request of type CallToolRequest
+2026-04-23 08:05:55,610 - chorus.mcp.state - INFO - Auto-detected hg38 at /Users/lp698/chorus_test/chorus/genomes/hg38.fa
+2026-04-23 08:05:55,657 - chorus.core.environment.manager - INFO - Found mamba: 2.4.0
+2026-04-23 08:05:55,657 - chorus.core.platform - INFO - Detected platform: Darwin arm64 (key=macos_arm64, cuda=False)
+2026-04-23 08:05:55,840 - mcp.server.lowlevel.server - INFO - Processing request of type ListToolsRequest
+2026-04-23 08:05:55,844 - mcp.server.lowlevel.server - INFO - Processing request of type CallToolRequest
+2026-04-23 08:05:55,932 - chorus.oracles.enformer_source.enformer_metadata - INFO - Loaded 5313 track metadata entries
+2026-04-23 08:05:55,939 - mcp.server.lowlevel.server - INFO - Processing request of type CallToolRequest
+Spawning chorus-mcp ...
+list_oracles [first 500 chars]:
+{"oracles":[{"name":"enformer","description":"Enformer (DeepMind) — predict chromatin & gene expression from DNA sequence","framework":"TensorFlow","input_size_bp":393216,"output_bins":896,"resolution_bp":128,"assay_types":["DNASE","ATAC","CAGE","CHIP","RNA"],"environment_installed":true,"loaded":false},{"name":"borzoi","description":"Borzoi — high-resolution gene expression & chromatin prediction","framework":"PyTorch","input_size_bp":524288,"output_bins":6144,"resolution_bp":32,"assay_types":[
+
+list_tracks(enformer) [first 300 chars]:
+{"oracle":"enformer","assay_types":["ATAC","CAGE","CHIP","DNASE"],"cell_types":["3xFLAG-AHR","3xFLAG-ARID4B","3xFLAG-ATF1","3xFLAG-ATF4","3xFLAG-BCL6","3xFLAG-CEBPA","3xFLAG-CEBPG","3xFLAG-CREB1","3xFLAG-DMAP1","3xFLAG-DNMT3B","3xFLAG-DRAP1","3xFLAG-ELF3","3xFLAG-ERF","3xFLAG-ETV5","3xFLAG-FOXA3","3
+
+oracle_status [first 300 chars]:
+{"loaded_oracles":[]}
+
+✓ MCP E2E works
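A smoke test along the lines of this log could look roughly like the sketch below. It is an approximation, not the script that produced the log: it assumes `chorus-mcp` is on `PATH`, that `list_oracles` and `oracle_status` accept an empty argument dict, and it uses a hypothetical `preview` helper to mimic the "[first N chars]" truncation. The `fastmcp` import is deferred so the helper can be used without the package installed.

```python
import asyncio


def preview(text: str, limit: int) -> str:
    """Return at most `limit` characters, like the '[first N chars]' log lines."""
    return text[:limit]


async def smoke_test(command: str = "chorus-mcp") -> None:
    # Imported lazily: only needed when the server is actually spawned.
    from fastmcp import Client
    from fastmcp.client.transports import StdioTransport

    # Spawn the server as a subprocess and talk to it over stdio.
    transport = StdioTransport(command=command, args=[])
    async with Client(transport) as client:
        tools = await client.list_tools()
        print(f"{len(tools)} tools registered")
        # Assumption: both tools can be called with no arguments.
        for name, limit in (("list_oracles", 500), ("oracle_status", 300)):
            result = await client.call_tool(name, {})
            print(f"{name} [first {limit} chars]:")
            print(preview(str(result), limit))


# To run against a real install (requires fastmcp and chorus-mcp):
# asyncio.run(smoke_test())
```

The exact shape of the value returned by `call_tool` differs between fastmcp releases, so the sketch only prints its string form rather than reaching into specific attributes.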

audits/2026-04-23_v23_scorched_earth/logs/11_setup_legnet.txt

Lines changed: 494 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+2026-04-23 08:15:10,795 - INFO - Device: auto-detect (GPU if available, else CPU)
+2026-04-23 08:15:10,796 - WARNING - Environment management not available. Oracle will run in current environment.
+2026-04-23 08:15:10,796 - INFO - Dowloading ChromBPNet into /Users/lp698/chorus_test/chorus/downloads/chrombpnet/ATAC_HepG2...
+2026-04-23 08:15:10,796 - INFO - Downloading https://www.encodeproject.org/files/ENCFF137WCM/@@download/ENCFF137WCM.tar.gz...
+2026-04-23 08:16:14,929 - INFO - ChromBPNet ATAC:HepG2: 0.10/0.72 GB (14.5%)
+2026-04-23 08:17:16,698 - INFO - ChromBPNet ATAC:HepG2: 0.21/0.72 GB (29.1%)
+2026-04-23 08:18:18,389 - INFO - ChromBPNet ATAC:HepG2: 0.31/0.72 GB (43.6%)
+2026-04-23 08:19:20,096 - INFO - ChromBPNet ATAC:HepG2: 0.42/0.72 GB (58.1%)
+2026-04-23 08:20:21,814 - INFO - ChromBPNet ATAC:HepG2: 0.52/0.72 GB (72.7%)
+2026-04-23 08:21:23,595 - INFO - ChromBPNet ATAC:HepG2: 0.63/0.72 GB (87.2%)
+2026-04-23 08:22:17,937 - INFO - Download completed!
+2026-04-23 08:22:21,386 - INFO - Loading ChromBPNet model...
+2026-04-23 08:22:23,156 - INFO - Auto-detected 1 GPU(s)
+2026-04-23 08:22:23.167345: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Ultra
+2026-04-23 08:22:23.167371: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 96.00 GB
+2026-04-23 08:22:23.167376: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 36.00 GB
+2026-04-23 08:22:23.167409: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
+2026-04-23 08:22:23.167423: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
+2026-04-23 08:22:23,814 - INFO - Fingerprint not found. Saved model loading will continue.
+2026-04-23 08:22:23,815 - INFO - path_and_singleprint metric could not be logged. Saved model loading will continue.
+2026-04-23 08:22:23,930 - INFO - ChromBPNet model loaded successfully!
+2026-04-23 08:22:23,946 - INFO - Loaded per-track CDFs for 'chrombpnet': 24 tracks, CDFs: effect_cdfs, summary_cdfs, perbin_cdfs
+2026-04-23 08:22:23,946 - INFO - Predicting variant effect with chrombpnet ...
+2026-04-23 08:22:24.039616: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
+2026-04-23 08:22:24,279 - INFO - Building variant report ...
+2026-04-23 08:22:24,280 - INFO - Annotation file already exists: /Users/lp698/chorus_test/chorus/annotations/gencode.v48.basic.annotation.gtf
+2026-04-23 08:22:24,280 - INFO - Loading GTF features (gene) from gencode.v48.basic.annotation.gtf (one-time)...
+2026-04-23 08:22:25,867 - INFO - Cached 78686 gene features from GTF
+2026-04-23 08:22:25,875 - INFO - Genes in prediction window: SORT1, CELSR2
+2026-04-23 08:22:25,875 - INFO - Annotation file already exists: /Users/lp698/chorus_test/chorus/annotations/gencode.v48.basic.annotation.gtf
+2026-04-23 08:22:25,875 - INFO - Loading GTF features (exon) from gencode.v48.basic.annotation.gtf (one-time)...
+2026-04-23 08:22:33,270 - INFO - Cached 1007879 exon features from GTF
+2026-04-23 08:22:33,521 - INFO - Annotation file already exists: /Users/lp698/chorus_test/chorus/annotations/gencode.v48.basic.annotation.gtf
+2026-04-23 08:22:33,521 - INFO - Loading GTF features (transcript) from gencode.v48.basic.annotation.gtf (one-time)...
+2026-04-23 08:22:35,790 - INFO - Cached 158367 transcript features from GTF
+2026-04-23 08:22:35,837 - INFO - Annotation file already exists: /Users/lp698/chorus_test/chorus/annotations/gencode.v48.basic.annotation.gtf
+2026-04-23 08:22:35,838 - INFO - Annotation file already exists: /Users/lp698/chorus_test/chorus/annotations/gencode.v48.basic.annotation.gtf
+2026-04-23 08:22:35,865 - INFO - HTML report written to /Users/lp698/chorus_test/chorus/examples/walkthroughs/validation/SORT1_rs12740374_multioracle/rs12740374_SORT1_chrombpnet_report.html
+2026-04-23 08:22:35,865 - INFO - ✓ wrote chrombpnet_variant_report.json, chrombpnet_variant_report.pkl, and rs12740374_SORT1_chrombpnet_report.html
+Loading directly
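The commit message states that this regeneration reproduces the committed chrombpnet effect size within 2e-6. A comparison of that kind can be sketched with an absolute tolerance, as below. The function name `reproduces` and the two example values are placeholders chosen for illustration; they are not the committed or regenerated numbers, and this is not necessarily how the audit performed the check.

```python
import math

ABS_TOL = 2e-6  # tolerance quoted in the commit message


def reproduces(committed: float, regenerated: float,
               abs_tol: float = ABS_TOL) -> bool:
    """True if the regenerated value is within `abs_tol` of the committed one.

    rel_tol is set to 0 so that only the absolute difference is considered.
    """
    return math.isclose(committed, regenerated, rel_tol=0.0, abs_tol=abs_tol)


# Placeholder values (assumptions, not real effect sizes).
print(reproduces(0.5, 0.500001))  # difference of about 1e-6
print(reproduces(0.5, 0.50001))   # difference of about 1e-5
```

An absolute rather than relative tolerance is used here because the commit message quotes a fixed bound; a relative bound would behave differently for effect sizes near zero.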
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+2026-04-23 08:23:07,237 - chorus.core.base - INFO - CHORUS_NO_TIMEOUT is set - all timeouts disabled
+2026-04-23 08:23:07,237 - chorus.core.base - INFO - Device: auto-detect (GPU if available, else CPU)
+2026-04-23 08:23:07,238 - chorus.oracles.chrombpnet - INFO - Dowloading ChromBPNet into /Users/lp698/chorus_test/chorus/downloads/chrombpnet/DNASE_HepG2...
+2026-04-23 08:23:07,238 - chorus.oracles.chrombpnet - INFO - Downloading https://www.encodeproject.org/files/ENCFF615AKY/@@download/ENCFF615AKY.tar.gz...
+2026-04-23 08:24:17,424 - chorus.utils.http - INFO - ChromBPNet DNASE:HepG2: 0.10/0.72 GB (14.5%)
+2026-04-23 08:25:25,665 - chorus.utils.http - INFO - ChromBPNet DNASE:HepG2: 0.21/0.72 GB (29.0%)
+2026-04-23 08:26:34,103 - chorus.utils.http - INFO - ChromBPNet DNASE:HepG2: 0.31/0.72 GB (43.5%)
+2026-04-23 08:27:42,325 - chorus.utils.http - INFO - ChromBPNet DNASE:HepG2: 0.42/0.72 GB (58.0%)
+2026-04-23 08:28:50,504 - chorus.utils.http - INFO - ChromBPNet DNASE:HepG2: 0.52/0.72 GB (72.5%)
+2026-04-23 08:29:58,653 - chorus.utils.http - INFO - ChromBPNet DNASE:HepG2: 0.63/0.72 GB (87.0%)
+2026-04-23 08:30:59,619 - chorus.oracles.chrombpnet - INFO - Download completed!
+2026-04-23 08:31:03,125 - chorus.oracles.chrombpnet - INFO - Loading ChromBPNet model...
+2026-04-23 08:31:04,812 - chorus.oracles.chrombpnet - INFO - Auto-detected 1 GPU(s)
+2026-04-23 08:31:04.823728: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Ultra
+2026-04-23 08:31:04.823757: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 96.00 GB
+2026-04-23 08:31:04.823762: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 36.00 GB
+2026-04-23 08:31:04.823792: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
+2026-04-23 08:31:04.823807: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
+2026-04-23 08:31:05,464 - absl - INFO - Fingerprint not found. Saved model loading will continue.
+2026-04-23 08:31:05,465 - absl - INFO - path_and_singleprint metric could not be logged. Saved model loading will continue.
+2026-04-23 08:31:05,584 - chorus.oracles.chrombpnet - INFO - ChromBPNet model loaded successfully!
+2026-04-23 08:31:05.665981: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
+Loading directly
+alt_1/DNASE:HepG2: mean_delta=0.2374, max_abs_delta=15.0409
+chrombpnet SORT1 rs12740374 G>T — predict_variant_effect succeeded
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+[NbConvertApp] Converting notebook examples/notebooks/advanced_multi_oracle_analysis.ipynb to notebook
+[NbConvertApp] Writing 2154017 bytes to /tmp/v23_advanced.ipynb

audits/2026-04-23_v23_scorched_earth/logs/15_nb_comprehensive.txt

Whitespace-only changes.
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+=== MCP walkthrough docs — do they align with current tool list? ===
+`analyze_variant_multilayer` on the top cell types for a focused report.
+
+=== tools referenced in walkthrough READMEs ===
+analyze_region_swap
+analyze_variant_multilayer
+discover_variant_cell_types
+fine_map_causal_variant
+list_tracks
+load_oracle
+predict
+score_variant_batch
+simulate_integration
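The "no orphans" claim amounts to a set difference between the tool names referenced in the READMEs and the names the MCP server registers. A minimal sketch of that check follows. The `REGISTRY` set is a stand-in assembled from names that appear on this page (the real registry would come from the server's tool listing), and the backtick regex is one assumed way of pulling candidate names out of markdown, not the audit's actual extraction step.

```python
import re

# The nine tool names listed in the log above.
REFERENCED = {
    "analyze_region_swap", "analyze_variant_multilayer",
    "discover_variant_cell_types", "fine_map_causal_variant",
    "list_tracks", "load_oracle", "predict",
    "score_variant_batch", "simulate_integration",
}

# Stand-in registry (assumption): the referenced tools plus two other tools
# named in the commit message. A real check would query the MCP server.
REGISTRY = REFERENCED | {"list_oracles", "oracle_status"}

# Assumed extraction rule: snake_case identifiers wrapped in backticks.
BACKTICKED = re.compile(r"`([a-z][a-z0-9_]*)`")


def backticked_names(markdown: str) -> set[str]:
    """Collect backticked snake_case identifiers from README text."""
    return set(BACKTICKED.findall(markdown))


def orphans(referenced: set[str], registry: set[str]) -> list[str]:
    """Names mentioned in docs that the registry does not know about."""
    return sorted(referenced - registry)


line = "`analyze_variant_multilayer` on the top cell types for a focused report."
print(backticked_names(line))
print(orphans(REFERENCED, REGISTRY))
```

An empty result from `orphans` corresponds to the "No orphans" outcome reported in the commit message; a renamed or removed tool would show up in the returned list.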
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+[NbConvertApp] Converting notebook examples/notebooks/advanced_multi_oracle_analysis.ipynb to notebook
+[NbConvertApp] Writing 2153108 bytes to /tmp/v23_advanced_fixed.ipynb
