Multi-oracle report: add missing noise-floor/p99 scale block (6ed117f gap) #39

Merged
lucapinello merged 1 commit into chorus-applications from fix/2026-04-22-multioracle-scale-block
Apr 23, 2026

@lucapinello
Contributor

What I found

Commit 6ed117f ("IGV normalization explanation: causal report + LegNet stub parity") fixed the causal report and the LegNet stub but missed the multi-oracle report, which had its own terser intro.

Independent re-audit of the 4-part IGV contract (IGV embedded / ymax=3.0 signal tracks / scale-explanation block / assay:cell_type provenance) across all 17 shipped IGV-containing HTMLs flagged rs12740374_SORT1_multioracle_report.html at 1/5 scale-explanation markers — every other report had 5/5.
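
The marker count in part (c) can be sketched as a simple substring check. This is a minimal illustration, not the audit script itself: the five marker strings are the ones quoted in this PR's commit message, while the function name `count_scale_markers` and the case-insensitive matching are assumptions.

```python
# Marker strings as quoted in the commit message of this PR.
# ASSUMPTION: plain, case-insensitive substring matching.
SCALE_MARKERS = ("noise floor", "p95", "peak threshold", "rescaled using", "p99")


def count_scale_markers(html: str) -> tuple[int, list[str]]:
    """Return (number of markers found, list of markers that are missing)."""
    text = html.lower()
    missing = [marker for marker in SCALE_MARKERS if marker not in text]
    return len(SCALE_MARKERS) - len(missing), missing


# The pre-fix multi-oracle intro, as quoted in this PR.
old_intro = (
    "Signals are floor-rescaled to [0, 3.0] where 1.0 is the "
    "genome-wide p99 peak for that assay."
)
found, missing = count_scale_markers(old_intro)
print(f"{found}/{len(SCALE_MARKERS)} markers; missing: {missing}")
# 1/5 markers; missing: ['noise floor', 'p95', 'peak threshold', 'rescaled using']
```

Run against the old intro text, the check reproduces the 1/5 result: only "p99" is present.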

What users saw

Before this fix, the multi-oracle IGV block read only:

Signals are floor-rescaled to [0, 3.0] where 1.0 is the genome-wide p99 peak for that assay.

No mention of the p95 noise floor or the "top 1% of bins" framing the other reports ship. Users opening the multi-oracle view in isolation got a less-explicit explanation than in the per-oracle reports.
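
To make the missing explanation concrete, the sketch below assumes a linear rescale that pins the p95 noise floor to 0.0 and the p99 peak threshold to 1.0, then clips at the 3.0 track ceiling. The formula and the name `floor_rescale` are assumptions for illustration only; this PR states the anchor points and the [0, 3.0] range, not how chorus computes the rescale.

```python
def floor_rescale(value: float, p95: float, p99: float, ymax: float = 3.0) -> float:
    """Rescale a raw signal so p95 maps to 0.0 and p99 maps to 1.0.

    ASSUMPTION: linear interpolation between the two percentiles, clipped
    to [0, ymax]. The PR text does not give the actual formula.
    """
    if p99 <= p95:
        raise ValueError("p99 must be greater than p95")
    scaled = (value - p95) / (p99 - p95)
    # Values at or below the noise floor flatten to 0; very tall peaks cap at ymax.
    return min(max(scaled, 0.0), ymax)
```

Under this assumed mapping, a bin sitting exactly at the genome-wide p99 plots at 1.0, anything at or below p95 plots at 0, and only bins far above the top-1% threshold reach the 3.0 ceiling.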

Fix

  • chorus/analysis/multi_oracle_report.py:712-724 — split the one-paragraph intro into two: the window/coverage caveat stays, plus the same "noise floor (p95) and peak threshold (p99)" block the single-oracle + causal reports emit.
  • Shipped rs12740374_SORT1_multioracle_report.html — hand-patched to insert the same paragraph at the matching location, preserving the existing IGV tracks. scripts/regenerate_multioracle.py --consolidate from committed JSONs silently drops the IGV section (per-oracle prediction pickles aren't persisted), so a naive regen would make things worse — hand-patch was the right call.
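
The hand-patch amounts to inserting one paragraph after a known anchor without touching the rest of the file. A minimal sketch of that idea follows; the helper `insert_scale_block` and the anchor and block strings are hypothetical, and the actual patch was applied by hand rather than with this code.

```python
def insert_scale_block(html: str, anchor: str, block: str) -> str:
    """Insert `block` directly after the first occurrence of `anchor`.

    Idempotent: if `block` is already present, the input is returned unchanged.
    Raises instead of guessing when the anchor cannot be found.
    """
    if block in html:
        return html
    idx = html.find(anchor)
    if idx == -1:
        raise ValueError("anchor not found; refusing to patch")
    end = idx + len(anchor)
    return html[:end] + "\n" + block + html[end:]


# Hypothetical stand-ins for the real intro paragraph and IGV container.
page = '<p>window/coverage caveat</p><div id="igv"></div>'
patched = insert_scale_block(
    page,
    anchor="<p>window/coverage caveat</p>",
    block="<p>noise floor (p95) and peak threshold (p99)</p>",
)
```

Because everything outside the insertion point is copied through verbatim, the existing IGV markup survives, which is the property the regeneration path currently lacks.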

4-part audit result (after fix)

All 17 IGV-containing HTMLs now pass:

| Check | Result |
| --- | --- |
| (a) IGV embedded | 17/17 |
| (b) ymax=3.0 signal / 1.0 summary | 17/17 consistent |
| (c) scale explanation (5/5 markers) | 17/17 (was 16/17) |
| (d) assay:cell_type track provenance | 17/17 (bare SPLICE_SITES labels are correctly cell-type-agnostic per AG metadata; investigated and confirmed not drift) |

Flagged follow-up (not blocking this fix)

scripts/regenerate_multioracle.py --consolidate from committed JSONs silently drops the IGV section because per-oracle prediction pickles aren't persisted. Either persist the pickles or mark IGV-rebuild as requiring the --oracle <name> runs first. Worth a separate PR; doesn't affect shipped reports today because they're regenerated with the full pipeline.
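
One way the second option could look is a guard that fails loudly when the pickles are absent instead of dropping the section. This is a sketch only: the function, the exception class, and the `<oracle>.pkl` file layout are all hypothetical, and nothing here reflects how `scripts/regenerate_multioracle.py` is actually structured.

```python
from pathlib import Path


class MissingPredictionsError(RuntimeError):
    """Raised when the IGV section cannot be rebuilt from persisted data."""


def require_prediction_pickles(pickle_dir: Path, oracles: list[str]) -> list[Path]:
    """Return the pickle paths for `oracles`, or raise if any are missing.

    ASSUMPTION: one pickle per oracle, named `<oracle>.pkl` (hypothetical layout).
    """
    expected = [pickle_dir / f"{name}.pkl" for name in oracles]
    missing = [path for path in expected if not path.is_file()]
    if missing:
        names = ", ".join(path.stem for path in missing)
        raise MissingPredictionsError(
            f"Cannot rebuild IGV section: no prediction pickle for {names}. "
            "Run the per-oracle step first (--oracle <name>)."
        )
    return expected
```

Calling a check like this before consolidation would turn the silent drop into an explicit error that names the oracle runs still needed.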

Test plan

  • pytest tests/ --ignore=tests/test_smoke_predict.py -q: 339 passed / 1 skipped
  • Selenium re-render: multi-oracle HTML at 1600×4500 — new scale block visible, IGV container present, 0 JS console errors
  • 4-part audit across all 17 IGV-containing HTMLs — all pass
  • Track provenance: every track label matches either the oracle · assay:cell_type or the assay:cell_type pattern; the cell-type-agnostic SPLICE_SITES tracks were verified against AG metadata (739 SPLICE_SITES tracks total; 5 have empty cell_type by design: donor/acceptor +/− strand + padding)
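
The provenance check in the last item can be approximated with a regular expression over track labels. The sketch below is an assumption about how such a check might be written: the regex, the `has_provenance` helper, and the example labels are illustrative, while the two label patterns and the bare SPLICE_SITES exception come from this PR.

```python
import re

# Accepts "oracle · assay:cell_type" or "assay:cell_type".
LABEL_RE = re.compile(
    r"^(?:(?P<oracle>[^·:]+?)\s*·\s*)?(?P<assay>[^·:\s]+):(?P<cell_type>[^·:]+)$"
)
# Assays that are cell-type-agnostic by design, per this PR.
AGNOSTIC_ASSAYS = {"SPLICE_SITES"}


def has_provenance(label: str) -> bool:
    """True if the label carries assay:cell_type, or is a known agnostic assay."""
    label = label.strip()
    assay_part = label.split("·")[-1].strip()
    if assay_part in AGNOSTIC_ASSAYS:
        return True
    return LABEL_RE.match(label) is not None


# Illustrative labels only; not taken from the shipped reports.
labels = ["AlphaGenome · DNASE:HepG2", "DNASE:HepG2", "SPLICE_SITES", "DNASE"]
print([has_provenance(label) for label in labels])
# [True, True, True, False]
```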

🤖 Generated with Claude Code

…gap)

Commit 6ed117f ("IGV normalization explanation: causal report + LegNet
stub parity") fixed the causal report and LegNet stub but missed the
multi-oracle report, which had its own terser IGV intro. Re-audit of
the 4-part IGV contract across all 17 shipped HTMLs found
rs12740374_SORT1_multioracle_report.html was missing 4/5 of the scale-
explanation markers the other reports ship ("noise floor", "p95",
"peak threshold", "rescaled using") — only "p99" appeared, in a
different phrasing.

## What users saw

Before: the multi-oracle IGV block said only
    "Signals are floor-rescaled to [0, 3.0] where 1.0 is the
     genome-wide p99 peak for that assay."
No mention of the p95 noise floor or "top 1% of bins" framing that
every single-oracle + causal report uses, so users opening the
multi-oracle view in isolation got a less-explicit explanation than
in the per-oracle reports.

## Fix

chorus/analysis/multi_oracle_report.py:712-724 — split the existing
single-paragraph intro into two paragraphs: the window/coverage caveat
stays, plus the same "noise floor (p95) and peak threshold (p99)"
block the single-oracle and causal reports emit. Future regenerations
of the multi-oracle HTML will ship the new block automatically.

Also hand-patched the shipped
  examples/walkthroughs/validation/SORT1_rs12740374_multioracle/rs12740374_SORT1_multioracle_report.html
to insert the same paragraph at the matching location, since
regenerating via scripts/regenerate_multioracle.py --consolidate
would also rebuild the IGV section from scratch — but without the
prediction pickles (not committed), that rebuild drops the IGV
entirely. The hand-patch preserves the existing IGV tracks while
adding the new explanation.

## 4-part audit result

All 17 IGV-containing HTMLs now pass:
- (a) IGV embedded   — 17/17
- (b) ymax=3.0 signal tracks / 1.0 summary tracks — 17/17 consistent
- (c) scale explanation (5/5 markers)   — 17/17 (was 16/17)
- (d) assay:cell_type track provenance  — 17/17
     (bare "SPLICE_SITES" labels on AlphaGenome donor/acceptor/padding
      tracks are correctly cell-type-agnostic per AG metadata)

## Known follow-up

scripts/regenerate_multioracle.py --consolidate from committed JSONs
silently drops the IGV section because the per-oracle prediction
pickles aren't persisted. Either persist them or mark IGV-rebuild as
requiring --oracle runs. Filed as a note; not blocking this fix.

Tests: 339 passed / 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lucapinello lucapinello merged commit 5cd7dae into chorus-applications Apr 23, 2026
1 check passed
@lucapinello lucapinello deleted the fix/2026-04-22-multioracle-scale-block branch April 23, 2026 10:57