Skip to content

perf(engine): faster shader transitions via page-side WebGL compositing#832

Merged
vanceingalls merged 17 commits into
mainfrom
vai/perf-page-side-compositing
May 15, 2026
Merged

perf(engine): faster shader transitions via page-side WebGL compositing#832
vanceingalls merged 17 commits into
mainfrom
vai/perf-page-side-compositing

Conversation

@vanceingalls
Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls commented May 14, 2026

Summary

Moves shader transition blending from the Node-side layered pipeline into Chrome's WebGL, using a two-phase drawElementImage capture protocol. Default-on for SDR shader-transition renders — no flag needed.

Performance (15 scenes, 14 shader transitions, 28s @ 30fps, Mac)

Workers Published CLI (v0.6.6) This branch Speedup
1 135s 20.9s 6.5×
6 139s 6.9s 20.1×

Published CLI gets slower with more workers on shader transitions (coordination overhead dominates). Page-side compositing scales linearly with workers.

  • Native fidelity: uses Chromium's drawElementImage (same capture path as preview mode)
  • Auto-disables when unsafe: HDR, alpha output, video content, beginFrame mode

How it works

  1. Seek phase (page-side): GSAP seek + clone scenes into visible layoutsubtree staging canvases + set pending flag
  2. Paint force (engine-side): micro Page.captureScreenshot (1×1 clip) forces browser compositor to paint staging canvas children
  3. Composite phase (page-side): drawElementImage reads paint records → WebGL shader blend → GL overlay displayed
  4. Screenshot: engine captures the composited result in one opaque RGB frame

Why two phases?

Under virtual time, requestAnimationFrame is shimmed and the browser compositor doesn't paint new elements. drawElementImage requires a cached paint record, which only exists after a compositor pass. The micro-screenshot forces that pass without needing rAF.

Gating

Auto-enabled when all conditions met:

  • compiled.hasShaderTransitions
  • !hasHdrContent (HDR needs per-layer alpha compositing in Node)
  • !needsAlpha (WebM/MOV/PNG-sequence need transparent captures)
  • composition.videos.length === 0 (cloneNode loses video playback state)
  • captureMode !== "beginframe" (CDP micro-screenshot conflicts with beginFrame compositor)

Use --no-page-side-compositing to force the Node-side layered path.

Changes by package

engine

  • frameCapture.ts: two-phase protocol — detect pending flag after seek, micro-screenshot paint force, call resolve
  • config.ts: enablePageSideCompositing defaults to true

shader-transitions

  • engineModePageComposite.ts: full rewrite — two-phase drawElementImage compositor replacing the original broken drawElementImage attempt and the intermediate html2canvas approach
  • capture.ts: export stabilizeTransformedBoxShadows

producer

  • renderOrchestrator.ts: unified gating predicate (usePageSideCompositingForTransitions) gates both stub injection and layered-path bypass; addPreHeadScript for probe-created fileServer; !needsAlpha and videos.length === 0 exclusions
  • fileServer.ts: addPreHeadScript method on FileServerHandle for post-creation stub injection

cli

  • render.ts: flag default flipped to true; --no-page-side-compositing to disable
  • dockerRunArgs.ts: forwards --no-page-side-compositing when disabled

Test plan

  • Unit tests pass (shader-transitions: 10, engine: 594, cli: 309)
  • Render shader-perf fixture (15 scenes, 14 transitions) — correct output at 1 and 6 workers
  • Benchmarked against published v0.6.6 via npx hyperframes (both 1 and 6 workers)
  • Static frames identical to baseline
  • Transition frames show correct shader blending (domain-warp, glitch, etc.)
  • --no-page-side-compositing falls back to layered path
  • Docker flag forwarding tested (2 assertions in dockerRunArgs.test.ts)
  • CI integration tests
  • Test with video-containing composition (should auto-disable)

🤖 Generated with Claude Code

@vanceingalls vanceingalls force-pushed the vai/perf-page-side-compositing branch from f4ee937 to 759b399 Compare May 14, 2026 15:55
@vanceingalls vanceingalls changed the base branch from main to vai/cli-chrome-binary-fix May 14, 2026 15:55
@vanceingalls vanceingalls force-pushed the vai/cli-chrome-binary-fix branch from 5bc730a to c6a9818 Compare May 14, 2026 23:42
Copy link
Copy Markdown
Collaborator Author

vanceingalls commented May 14, 2026

Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Page-side compositing for shader transitions

Why it's 6.5x faster

The architectural change is elegant. Before: each transition frame required TWO transparent-alpha screenshots (one per scene layer), pixel data transfer from Chrome → Node, color space conversion (rgb48le/rgba8), then CPU-bound GLSL shader blending in a Node worker_threads pool. Two round-trips through the screenshot pipeline plus CPU pixel math per frame.

After: a WebGL canvas overlay inside Chrome captures both scenes via drawElementImage, uploads them as textures, and runs the GLSL fragment shader on Chrome's GPU — all in-process. The engine takes ONE opaque RGB screenshot of the already-composited result. This eliminates: (1) two per-layer transparent screenshots, (2) Node-side color space conversion, (3) the entire Node-side shader-blend worker pool. The GPU work happens where the GPU already is.

One issue to resolve

Default mismatch: PR description says "default OFF" (opt-in), but the code sets default: true in the CLI flag and enablePageSideCompositing: true in DEFAULT_CONFIG. These need to agree — if ON is intended, existing CI fixtures may need PSNR-based pinning since f32 (WebGL) vs f64 (Node) precision means output is not bit-identical.

Notes (non-blocking)

  • composition.videos.length === 0 gate silently disables page-side compositing for any composition with embedded video. Worth documenting.
  • No PSNR-pinned visual regression test yet (documented as follow-up) — acceptable given the opt-in framing, less so if it defaults ON.
  • Error handling and fallback are solid — probes for drawElementImage + WebGL, falls back to opacity-flip with warning.

@vanceingalls vanceingalls changed the title perf(engine): page-side compositing for shader transitions (opt-in) perf(engine): 6.5× faster shader transitions via page-side WebGL compositing May 14, 2026
miguel-heygen
miguel-heygen previously approved these changes May 14, 2026
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description updated — default ON matches the code now. LGTM.

@vanceingalls vanceingalls changed the base branch from vai/cli-chrome-binary-fix to graphite-base/832 May 15, 2026 00:09
vanceingalls and others added 8 commits May 15, 2026 00:09
Two correctness fixes from PR #821 self-review:

1. Cache priority order. Previous order was hyperframes-managed cache →
   puppeteer cache. HF cache is pinned to CHROME_VERSION (131-era) which
   lags 17+ releases behind upstream; if a user separately installed a
   newer chrome-headless-shell via @puppeteer/browsers install, the CLI
   would silently hand engine the older HF-cache binary while engine's
   own resolveHeadlessShellPath would have picked the newer one. Flip
   the priority so puppeteer cache wins, matching engine semantics.

2. Numeric (not lexicographic) version sort. `readdirSync.sort().reverse()`
   over names like `linux-148.0.7778.97` and `linux-99.0.6533.123` would
   return `linux-99...` first because character '9' outranks '1'. Parse
   each name into integer segments and compare them numerically.

Tests: add both-caches-populated and linux-148-beats-linux-99 cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ike)

Add an opt-in `--page-side-compositing` flag (CLI) backed by a new engine
config field `enablePageSideCompositing` and env var `HF_PAGE_SIDE_COMPOSITING`.
When set, SDR shader-transition compositions skip the Node-side layered blend
(the hf#677 chain) and instead run the shader inside Chrome via a page-side
WebGL canvas; the engine then captures ONE opaque RGB frame per output frame
via the existing streaming capture path.

This is the strongest non-beginFrame perf lever for Mac users, who cannot
take the beginFrame `~5×` path (Chromium structural limit, crbug.com/40656275).
Stacks on top of the hf#677 1.95× baseline.

Default OFF — existing fixture pins (byte-exact MP4 output) are preserved.
Opt-in path is intentionally PSNR-pinned, not byte-equal (WebGL is f32; Node
is f64). HDR content forces the existing layered path regardless.

Implementation:
- engine: new `EngineConfig.enablePageSideCompositing` (default false).
- producer/fileServer: new `HF_PAGE_SIDE_COMPOSITING_STUB` early-page script
  injected into the served HTML head when the flag is on.
- producer/renderOrchestrator: when the flag + no HDR + no png-sequence,
  route SDR transitions through the streaming path instead of the layered
  HDR stage.
- shader-transitions: new `engineModePageComposite.ts` installs a fullscreen
  WebGL compositor overlay and wraps `window.__hf.seek` so each seek inside
  a transition window captures both scenes via the Chromium
  `drawElementImage` API to GL textures, runs the fragment shader, and
  displays the composited result on the overlay canvas. The engine takes
  one screenshot per frame and sees the composited overlay.
- cli: new `--page-side-compositing` flag sets `HF_PAGE_SIDE_COMPOSITING=true`
  before producer load.
- scripts/page-side-compositing-smoke: bundled-CLI smoke that renders a
  representative fixture with and without the flag, validates the canary
  strings are in the shipped bundles, and writes a wall-time pair.

Determinism trade documented in the engine config doc-comment. The smoke
script enforces the bundled-CLI validation discipline from prior perf work
(see internal feedback note `validate_bundled_cli_not_dev_path`).

Runtime requirement: Chromium's `CanvasDrawElement` feature (already
enabled by the engine's `--enable-features=CanvasDrawElement` launch flag).
When the runtime feature is unavailable, the page-side installer logs a
warning and falls back to opacity-flip mode — the engine still takes the
streaming path; the transition window degrades to a hard scene swap. Vance
will validate on Mac Chrome where the feature is supported.

Co-Authored-By: Vai <vai@heygen.com>
…ture

The original drawElementImage approach fails in engine render mode because
the virtual-time shim prevents Chromium from generating paint records for
cloned elements. drawElementImage requires a cached paint record from the
browser's compositor — clones created at capture time never receive one
because (a) shimmed rAFs deadlock inside the seek wrapper, (b) original
rAFs don't produce real paints under virtual-time control, and
(c) layoutsubtree canvases don't apply CSS stylesheet rules to children.

Switch scene capture to html2canvas (foreignObjectRendering: false), the
same JS-based renderer already used by the preview-mode fallback path in
capture.ts. html2canvas reads computed styles and renders via its own
canvas drawing pipeline with no dependency on the browser paint cycle.

Also fixes:
- Engine seek must return the result so Puppeteer awaits async seek
  promises (frameCapture.ts).
- GSAP opacity cache: compositor must restore scene opacity before seek,
  not after — GSAP caches inline values and skips re-writes.
- Support check gates on WebGL availability, not drawElementImage.

Perf: 15-scene shader-perf fixture (28s, 14 transitions, 30fps)
  Baseline (Node-side layered): 137s
  Page-side (html2canvas+WebGL): 33s → 4.1× speedup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…positor

- Use uploadTexture (zeroes canvas backing store after upload) to prevent
  ~2.2GB transient memory pressure across 280 html2canvas calls per render
- Add ignoreElements + stabilizeTransformedBoxShadows to html2canvas call,
  matching the preview-path capture.ts behavior
- Parallelize from/to scene captures with Promise.all
- Wrap post-capture render in try/finally so opacity is always restored
- Fix WebGL context leak in isPageSideCompositingSupported probe
- Remove dead ResolvedTransition.index field
- Export stabilizeTransformedBoxShadows from capture.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses three issues from staff review:

1. ignoreElements filter stripped all in-scene canvases (Chart.js, D3,
   p5.js) — narrowed to data-no-capture only since the compositor canvas
   is a body sibling never in the scene subtree.

2. Docker mode silently dropped --page-side-compositing — thread
   pageSideCompositing through DockerRenderOptions/buildDockerRunArgs
   with regression tests.

3. Fragmented gating across 4 independent sites could disagree:
   - Stub injection gated only on cfg flag (leaked into HDR/alpha)
   - Probe-created fileServer never got the stub
   - needsAlpha (WebM/MOV) not excluded from the gate
   - WebGL-unavailable fallback claimed layered path would run but
     orchestrator had already disabled it

   Fix: compute stub injection at the same site as the layered-bypass
   decision (after hasHdrContent is known), using addPreHeadScript on
   the already-running fileServer. Single predicate now gates both
   decisions, including !needsAlpha.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…iting

Replace html2canvas with native drawElementImage for scene capture in
the page-side compositor. drawElementImage reads from the browser's own
paint cache, giving pixel-identical output to the preview path.

The blocker was that cloned elements inside layoutsubtree canvases have
no cached paint record under virtual time — the compositor only paints
when explicitly triggered. Fix: split the seek+composite into two phases
with an engine-forced paint between them.

Phase 1 (seek wrapper, page-side):
  - GSAP seek positions the timeline
  - Clone FROM/TO scenes into visible layoutsubtree staging canvases
  - Set window.__hf_page_composite_pending flag

Engine paint force (frameCapture.ts):
  - Detect pending flag after seek returns
  - Fire micro Page.captureScreenshot (1x1 clip) via CDP to force the
    browser compositor to paint all visible elements including staging
    canvas children

Phase 2 (page.evaluate, page-side):
  - drawElementImage reads the now-valid paint records
  - Upload textures to WebGL, run shader, show GL overlay

Key insight: staging canvases must be visible (not opacity:0) for the
browser to paint their children. They sit at z-index:-9998, behind
the main DOM and covered by the GL overlay during transitions.

Perf: 15-scene fixture (28s, 14 transitions, 30fps):
  Baseline (Node-side layered): 137s
  html2canvas + WebGL:           33s (3.7×)
  drawElementImage + WebGL:      21s (6.6×)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- uploadTextureSource instead of uploadTexture: eliminates ~2.3GB of
  canvas buffer alloc/dealloc churn (persistent staging canvases don't
  need the one-shot zeroing behavior)
- Fold hasPending check into seek page.evaluate: eliminates one CDP
  round-trip per frame (~700 unnecessary IPC calls on non-transition
  frames)
- Fix renderShader error handling: on failure, leave source scenes
  visible as fallback instead of hiding both scenes + GL overlay
  (which produced black frames)
- Move mutable state declarations above resolveComposite to prevent
  TDZ risk on refactor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… guard

- Clear staging canvas children when leaving transition window (prevents
  visible clone bleed-through on transparent compositions)
- Clear __hf_page_composite_pending on all resolveComposite exit paths
- Guard micro-screenshot paint force against beginFrame mode (CDP
  Page.captureScreenshot conflicts with beginFrame compositor control)
- Update CLI flag description: document video/canvas limitation, remove
  stale PSNR claim

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vanceingalls vanceingalls force-pushed the vai/perf-page-side-compositing branch from 4c11123 to c878919 Compare May 15, 2026 00:09
@graphite-app graphite-app Bot changed the base branch from graphite-base/832 to main May 15, 2026 00:10
@graphite-app graphite-app Bot dismissed miguel-heygen’s stale review May 15, 2026 00:10

The base branch was changed.

…ions

Page-side compositing is now enabled by default for SDR shader-transition
renders without video content. The 6.6× speedup applies automatically —
no flag needed.

Auto-disables when:
- HDR content detected
- Alpha output (WebM/MOV/PNG-sequence)
- Composition contains <video> elements (cloneNode loses playback state)
- beginFrame capture mode (Linux headless)

Use --no-page-side-compositing to force the Node-side layered path.

Changes:
- Engine config: enablePageSideCompositing defaults to true
- CLI: flag default flipped to true; --no-page-side-compositing disables
- Orchestrator: added composition.videos.length === 0 gate
- Docker: forwards --no-page-side-compositing when explicitly disabled
- Config tests updated for new default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vanceingalls vanceingalls force-pushed the vai/perf-page-side-compositing branch from c878919 to 63eb5bb Compare May 15, 2026 00:10
@vanceingalls vanceingalls changed the title perf(engine): 6.5× faster shader transitions via page-side WebGL compositing perf(engine): 20.1× faster shader transitions via page-side WebGL compositing May 15, 2026
@vanceingalls vanceingalls changed the title perf(engine): 20.1× faster shader transitions via page-side WebGL compositing perf(engine): faster shader transitions via page-side WebGL compositing May 15, 2026
Three-phase capture protocol lets shader transitions render video scenes
without falling back to the slow Node-side layered pipeline:

1. Seek → compositor records transition metadata, sets pending flag
2. onBeforeCapture → video frame injector updates <img> replacements
3. prepare → cloneNode picks up current video frames, img.decode() awaits
4. micro-screenshot → forces browser to paint cloned elements
5. resolve → drawElementImage reads paint records, shader composites

Key changes:
- Remove `composition.videos.length === 0` gate from orchestrator
- Split compositor resolve into prepare (clone) + resolve (shader)
- Move onBeforeCapture before compositor prepare in frameCapture.ts
- Await img.decode() on cloned data-URI images to prevent stale frames
- Stop manipulating scene opacity in compositor (GL canvas overlay suffices)
- Add gsap.set declaration for shader-transitions ambient types
- Add video_missing_timing_attrs lint rule for <video> without id/data-start/data-end

Performance: compositions with video now render at 7.5s (6 workers) instead
of 2m38s on the layered path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread scripts/page-side-compositing-smoke/run.mjs Fixed
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed with the new commit (a56ef01d). The three-phase protocol for video support is architecturally sound — moving video frame injection before cloneNode so clones contain the injected <img> replacements is the right fix. The img.decode() awaiting on data-URI images in clones is a good catch for stale paint cache.

One blocking issue — lint rule checks wrong attribute:

The video_missing_timing_attrs rule checks for data-end:

const dataEnd = readAttr(tag.raw, "data-end");

But data-end is computed by the timing compiler from data-start + data-duration — it never exists in source HTML. Authors write data-start + data-duration (optional for video, defaults to media duration). Every other lint rule in media.ts references data-duration, not data-end.

This rule will false-positive on every valid video element:

<!-- Valid source HTML — has data-start + data-duration, no data-end -->
<video id="hero" src="clip.mp4" data-start="0" data-duration="5" muted></video>
<!-- ↑ lint error: "missing data-end" — false positive -->

Fix: check data-duration instead of data-end. And since data-duration is optional for video (defaults to media duration per the spec), consider making it a warning when both data-start and data-duration are missing, not an error when just data-end is missing.

The test cases also use data-end="5" — should be data-duration="5" to match the actual authoring contract.

Rest of the commit looks good:

  • Removing the composition.videos.length === 0 gate is correct now that three-phase handles video
  • Removing the opacity toggling on source scenes is cleaner — GL canvas overlay covers them
  • gsap.set type addition is fine

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architectural design is sound — Miguel's prior review dismissed once the default-on/off framing was reconciled, and the perf math holds up (the two-screenshot + Node-side blend → one-screenshot + Chrome-GPU blend swap is exactly what removes the 20× coordination overhead at higher worker counts). Most of my notes here are about gates that need confirmation before merge.

Audited

  • frameCapture.ts three-phase protocol — clean: seek + pending-flag check folded into one round-trip; before-capture hook runs first so injected video <img> replacements are picked up by the page-side compositor's cloneNode; the 1×1 Page.captureScreenshot paint-force is gated on hasPendingComposite && session.captureMode !== "beginframe". Reading sequence is correct for the documented invariants. ✓
  • config.tsenablePageSideCompositing: true default, HF_PAGE_SIDE_COMPOSITING env override. Matches the description's "default ON for SDR shader-transition renders." ✓
  • Determinism comment in config docstring — explicit "page-side WebGL is f32, not f64. Byte-equality fixture pins are NOT compatible with this path; the new path's correctness pin is PSNR-based." That's exactly the right thing to call out, and matches the regression-fixture pattern. ✓
  • Capture-mode auto-disableframeCapture.ts: session.captureMode !== "beginframe" is the right pivot. ✓
  • manager.ts puppeteer cache version ordering — the lexicographic .sort().reverse() → numeric compareVersionDirsDescending fix is genuinely useful (linux-99.x was sorting ahead of linux-148.x because '9' > '1') and the comment is sharp. But it's orthogonal to the perf work — see scope note below.

Worth confirming before merge — body-vs-diff mismatch on the video gate

The PR body lists the auto-disable conditions:

"Auto-disables when unsafe: HDR, alpha output, video content, beginFrame mode"

and later in the gating section:

"composition.videos.length === 0 (cloneNode loses video playback state)"

But the usePageSideCompositingForTransitions predicate in renderOrchestrator.ts:1752-1762 is:

cfg.enablePageSideCompositing &&
compiled.hasShaderTransitions &&
!hasHdrContent &&
!isPngSequence &&
!needsAlpha;

— I don't see composition.videos.length === 0 in there. Three of the four body-promised gates are present (HDR via !hasHdrContent, alpha via !needsAlpha, beginFrame via the frameCapture.ts site), but the video-count gate is missing from this predicate. Two possibilities:

  1. The gate exists at a different layer — e.g., engineModePageComposite.ts probes <video> elements at page-side init and bails out (no __hf_page_composite_pending flag → orchestrator's stub injection is a no-op). If so, please confirm — the orchestrator-level early-out would still be cleaner because it would skip injecting the stub script entirely.
  2. The gate isn't there yetcloneNode on a playing <video> produces a paused clone at the source's currentTime, and drawElementImage would read that paused frame. Page-side compositing on a video-bearing composition would silently freeze the video plane on every transition frame. The test plan's unchecked "Test with video-containing composition (should auto-disable)" row suggests this hasn't been verified yet.

If it's #2, the body's "auto-disables when unsafe: video content" claim is a Rule-3 mismatch and worth a one-line addition: && composition.videos.length === 0 (or the equivalent — compiled.hasVideos if that derived flag exists).

Required-check failures gating merge

  1. CLI smoke (required) — failure (job 76221755878). Couldn't pull the log, but the diff adds a new --page-side-compositing CLI flag + the new scripts/page-side-compositing-smoke/run.mjs invokes the bundled CLI. If packages/cli/dist/cli.js hasn't been rebuilt to include the flag yet, the smoke script's assertCliCanary would fail. Worth checking whether the CI's CLI smoke job is running this new smoke script or the standard one.

  2. CodeQL — failure + advanced-security bot review surfaced one alert: "Insecure creation of file in the os temp dir" at scripts/page-side-compositing-smoke/run.mjs. The script hardcodes /tmp/hf-page-side-smoke/ as WORK_DIR — predictable path in shared /tmp can allow symlink-attack pre-creation. Low-risk for a local smoke test, but easy to clear: replace the hardcoded /tmp/hf-page-side-smoke with mkdtempSync(path.join(os.tmpdir(), "hf-page-side-smoke-")). CodeQL alert closes automatically once the predictable path is gone.

  3. Analyze (python) — failure — likely a CodeQL Python-analyzer workflow issue (this PR has no Python). Probably the same root as the CodeQL failure above; check the workflow log.

Non-blocking notes

  • Scopemanager.ts puppeteer-cache version ordering (+~60 LoC) and the system-Chrome warn-string nudge are genuinely good fixes but unrelated to page-side compositing. Reviewable as-is, but a separate PR would have a cleaner revert surface if someone needs to roll back the perf work without losing the cache-ordering fix.

  • frameCapture.ts:hasPendingComposite read order — the second page.evaluate (resolve) doesn't check cancelled/teardown state. If the render job is aborted between the micro-screenshot and the resolve call, the compositor stays in pending state for the next frame. Probably benign because the next seek + prepare resets it, but worth a one-line if (!session.active) return guard if session.active exists in this scope.

  • The 1×1 clip paint-force — semantically the same trick the engine uses for Page.captureScreenshot elsewhere; just noting it overlaps slightly with the always-on-clip change from hf#838 (scale: 1 explicit clip changes Chrome compositor's geometry resolution path). The two changes coexist fine but if one regresses, the other shares the failure mode.

  • drawElementImage paint-record requirement — the body explains "drawElementImage requires a cached paint record, which only exists after a compositor pass". The micro-screenshot is the forcing function. Worth noting in frameCapture.ts:711-728 that this isn't a bandaid — it's the documented protocol. The inline comment is good; consider promoting to a doc-comment block since it's the non-obvious linchpin.

Once the video gate is confirmed (or added) and CI failures are sorted, happy to flip to APPROVE.

Review by Rames Jusso (pr-review)

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-correction on the video-gate flag.

Magi's review noted "Removing the composition.videos.length === 0 gate is correct now that three-phase handles video" — and that's right. The three-phase protocol runs onBeforeCapture (video frame injection) BEFORE the page-side prepare() does cloneNode, so the clone picks up the injected <img> replacement (a static frame from the video) instead of a <video> clone that would freeze at the source's currentTime. The architecture replaces the need for the gate.

So my prior "missing video gate" framing was wrong — the gate is intentionally absent because the architectural change makes it unnecessary. What's left is a stale PR body: the body still lists composition.videos.length === 0 and "auto-disables when unsafe: video content" among the gates, but those describe an older version of the design. Worth a body update so reviewers (and future readers) aren't pointed at a check that no longer needs to exist.

My other flags stand:

  • CLI smoke (required) failure — gates merge regardless
  • CodeQL "Insecure temporary file" on scripts/page-side-compositing-smoke/run.mjs — quick mkdtempSync fix
  • Magi's video_missing_timing_attrs blocker (lint rule checks data-end instead of data-duration — would false-positive every valid video element) is the substantive blocker

Once those three are addressed + the body's gate list is brought in sync with the three-phase architecture, happy to APPROVE.

Self-correction by Rames Jusso (pr-review)

vanceingalls and others added 5 commits May 15, 2026 10:57
…works without explicit attrs

The timing compiler now injects data-start="0" on <video> and <audio>
elements that lack it. This makes discoverMediaFromBrowser() find the
element (it queries video[data-start]), so the frame extraction pipeline
activates automatically. Videos "just work" without requiring authors to
add data-start, data-end, or id attributes.

Also removes the video_missing_timing_attrs lint rule — the compiler
handles the missing attributes automatically, so the lint rule would
only false-positive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ming

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ideo discovery

Seeks the GSAP timeline in Puppeteer to discover when each video's parent
scene is visible (opacity > 0). Uses coarse sampling at 100ms steps followed
by binary search refinement to frame-level precision (1/60s). Only processes
videos with the data-hf-auto-start sentinel so author-specified timing is
never overridden.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…be stage

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…bug logging

The probe stage was skipping browser launch when composition duration was
already known, which meant discoverVideoVisibilityFromTimeline never ran.
Now needsBrowser also checks for data-hf-auto-start sentinel in compiled HTML.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving on dbac2e7c. Magi's blocker is resolved, and the response is more complete than I expected.

Rather than fixing the buggy lint rule, you removed it entirely and replaced it with a different architectural shape:

Audited — the new commits

  • media.ts -28 / media.test.ts -42 — the video_missing_timing_attrs rule (Magi's blocker) is gone. The whole "warn on missing timing attrs" angle is replaced by the next two changes. ✓
  • timingCompiler.ts +7 / timingCompiler.test.ts +26 — compile-time auto-injection of data-start="0" when missing, plus a data-hf-auto-start sentinel to mark which videos got auto-injection. Three new tests pin: (a) auto-inject when missing, (b) sentinel on auto-inject, (c) NO sentinel when author provides data-start. Clean three-way distinction. ✓ This is the right shape — the compiler resolves the missing-attr problem at the source instead of warning about it.
  • htmlCompiler.ts:discoverVideoVisibilityFromTimeline +103 — runtime video visibility probing: seeks GSAP timeline at 0.1s intervals to find when each data-hf-auto-start video's parent .scene has opacity > 0, then binary-searches the boundaries to 1/60s frame precision. Suppresses events on seek (tl.totalTime(t, true)), resets to 0 before returning. ✓ Good cleanup discipline. Engine then writes the discovered window back to composition.videos[i].start/end.
  • probeStage.ts +26 — wires the visibility discovery into the probe stage; needsBrowser now also fires on hasAutoStartVideos so renders that previously skipped the probe (known duration, fully-resolved compositions) now run it when auto-injected timing is in play. Sensible — the alternative is rendering with the auto-injected start=0, duration=composition.duration defaults which would write video frames where the scene isn't visible.

Body claim verification

  • CodeQL now reports neutral (was failure on a56ef01). The temp-file alert on scripts/page-side-compositing-smoke/run.mjs likely became stale-but-still-open rather than newly-failing — worth a quick glance at the security tab to confirm the alert is dismissed/resolved before merge, but no longer build-gating.
  • CLI smoke (required) still in-progress on this commit; will gate merge if it fails again.
  • Body still mentions the composition.videos.length === 0 gate (in the "Gating" section) and "auto-disables when unsafe: video content." Stale text — the gate was intentionally removed when the three-phase protocol with pre-clone video frame injection landed. Worth a body update so future readers aren't pointed at a check that no longer needs to exist.
  • Test plan — still has "Test with video-containing composition (should auto-disable)" unchecked. With the new auto-injection + visibility-discovery path, the test should be re-framed as "Render a composition with videos and shader transitions — auto-injected timing windows discovered correctly, output preserves video frames at scene-visibility boundaries."

Non-blocking notes on the new runtime-discovery code

  1. videoEl.closest(".scene") || videoEl assumes a CSS class convention (htmlCompiler.ts:1196). If a composition uses different scene wrapping (div.layer, section.slide, etc.), the discovery falls back to the video element itself — which typically inherits opacity from its parent stacking context, so it may still produce the right answer. But if <video> has opacity: 1 while parent has opacity: 0, computed opacity on the <video> is unaffected — getComputedStyle(videoEl).opacity returns the video's own opacity, not the effective stacked opacity. Worth verifying compositions that don't use .scene still discover correctly, or use getBoundingClientRect() + IntersectionObserver-style visibility instead. Probably fine for current HF authoring conventions; brittle to future composition shapes.

  2. hasAutoStartVideos = compiled.html.includes("data-hf-auto-start") (probeStage.ts) — substring check could false-positive on string literals containing that token in the compiled HTML. A regex against attribute syntax (e.g., /data-hf-auto-start[=>\s]/) would be more robust. Minor.

  3. Probe perf cost — ~10Hz sampling means ~280 seekTl + getComputedStyle calls per video on a 28-second composition. At ~5ms per round-trip across the CDP boundary, that's ~1.4s added to the probe stage per video. For multi-video compositions, multiplicative. Acceptable given the discovery benefit, but worth a perf note in the comment or scaling SAMPLE_STEP with compositionDuration for very long renders.

  4. Edge: video in a scene that's never visiblefirstVisible === nullcontinue, no window written, video keeps its auto-injected start=0 default. Likely results in the engine rendering frames for a video the user can't see. Probably benign (the engine just generates output that gets composited at opacity: 0) but worth a log warning so authors notice their scene structure is busted.

Once the body's composition.videos.length === 0 gate text is updated (or removed) and CLI smoke comes back green, this is fully clear.

Re-review by Rames Jusso (pr-review)

Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed on dbac2e7c. My original blocker (lint rule checking data-end instead of data-duration) is fully resolved — you went further than just fixing the attribute name:

  1. Removed the lint rule entirely — correct call. The problem was architectural, not a lint gap.
  2. Compile-time auto-injection — timing compiler injects data-start="0" + data-hf-auto-start sentinel when videos lack data-start. This is the right layer to solve it.
  3. Runtime visibility discoverydiscoverVideoVisibilityFromTimeline seeks the GSAP timeline at 10Hz to find when each auto-start video's parent scene is visible, then binary-searches to 1/60s. Clever approach.
  4. Probe triggerneedsBrowser now fires on hasAutoStartVideos so the browser probe runs even when duration is known.

Rames' non-blocking notes are worth tracking (.scene class assumption, substring check, scaling SAMPLE_STEP with duration) but none are blocking.

CLI smoke green. Three-phase compositor protocol + video support + auto-start pipeline all land together cleanly.

vanceingalls and others added 2 commits May 15, 2026 13:51
Replaces hardcoded /tmp/hf-page-side-smoke with a unique temp directory
via mkdtempSync to resolve CodeQL "insecure temporary file" alert.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving on a995e304. All my prior flags now resolved + CI fully green.

The two new commits address the CodeQL temp-file finding I'd flagged: WORK_DIR = mkdtempSync(join(tmpdir(), "hf-page-side-smoke-")) is exactly the pattern I'd suggested — randomized temp dir per invocation, no symlink-attack window. Doc comment updated to use the <tmpdir> placeholder.

CI

  • All required checks now green: Test, Typecheck, Lint, Build, CodeQL (was neutral/failure on prior commits → now success), Test: runtime contract, CLI smoke (required), Smoke: global install, Tests on windows-latest, Render on windows-latest, Preview parity, preview-regression, Perf:drift/fps/load/parity/scrub, player-perf, regression, all 8 regression-shards green
  • mergeable_state: blocked is reviewer-gate

Recap of resolved items

  1. CodeQL "Insecure temporary file" → fixed via mkdtempSync
  2. CLI smoke (required) failure on earlier commits → green now ✓
  3. Magi's video_missing_timing_attrs lint-rule blocker → resolved via lint-rule removal + compile-time auto-injection + runtime visibility discovery ✓
  4. My video-gate "missing" framing → self-corrected (architectural change makes the gate unnecessary) ✓

Clear to merge.

Re-review by Rames Jusso (pr-review)

Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed on a995e304 (2 commits past the prior approvals on dbac2e7c).

New commits since last approval:

  • 1cbf3f4e — fixes smoke test to use mkdtempSync instead of a static temp path (prevents collisions on concurrent runs)
  • a995e304 — oxfmt formatting on the smoke script

Both are trivial and don't touch the core architecture. The substance reviewed:

Architecture — page-side WebGL compositing (344-line engineModePageComposite.ts):

  • Three-phase capture protocol (prepare → micro-screenshot → resolve) is clean. The clone-before-paint approach ensures drawElementImage reads current bitmaps, not stale paint cache entries.
  • Seek wrapper installed via setInterval polling for window.__hf.seek (up to 10s) — reasonable given the async bridge initialization order. The opacity-flip timeline still runs underneath, so fallback is automatic if the compositor fails.
  • Staging canvases use layoutsubtree + fixed z-index -9998 — positioned off-viewport-layer to avoid capturing in the final screenshot.

Browser cache priority fix (manager.ts):

  • Puppeteer cache now checked before HF-managed cache — correct, since HF cache is pinned to a stale Chrome version.
  • parseVersionSegments + compareVersionDirsDescending fix the lexicographic sort bug (linux-99 > linux-148 character-wise). Good regression tests covering the exact scenario.
  • Note: the same lexicographic bug still exists in engine's resolveHeadlessShellPath (mentioned in the comment). Worth a follow-up to fix there too.

Video timing auto-injection (timingCompiler.ts):

  • data-hf-auto-start sentinel cleanly separates author-provided from auto-injected timing, so discoverVideoVisibilityFromTimeline only processes auto-injected videos.
  • Binary search for visibility boundaries (1/60s precision) is a smart approach — coarse linear scan + binary refinement keeps the seek count reasonable.

Orchestrator wiring:

  • addPreHeadScript injected after HDR check — correct order, since HDR forces layered path regardless.
  • needsAlpha and isPngSequence also bypass page-side compositing — all edge cases accounted for.

Smoke test:

  • Tests bundled CLI path (not dev path) — validates the full pipeline including tsup bundling. Canary string check ensures the compositor code survived tree-shaking.

Clean, well-architected feature. The two post-approval commits are trivial fixes.

@vanceingalls vanceingalls merged commit f84cc49 into main May 15, 2026
41 checks passed
@vanceingalls vanceingalls deleted the vai/perf-page-side-compositing branch May 15, 2026 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants