perf(engine): faster shader transitions via page-side WebGL compositing#832
Conversation
f4ee937 to
759b399
Compare
5bc730a to
c6a9818
Compare
This stack of pull requests is managed by Graphite. Learn more about stacking. |
miguel-heygen
left a comment
There was a problem hiding this comment.
Review: Page-side compositing for shader transitions
Why it's 6.5x faster
The architectural change is elegant. Before: each transition frame required TWO transparent-alpha screenshots (one per scene layer), pixel data transfer from Chrome → Node, color space conversion (rgb48le/rgba8), then CPU-bound GLSL shader blending in a Node worker_threads pool. Two round-trips through the screenshot pipeline plus CPU pixel math per frame.
After: a WebGL canvas overlay inside Chrome captures both scenes via drawElementImage, uploads them as textures, and runs the GLSL fragment shader on Chrome's GPU — all in-process. The engine takes ONE opaque RGB screenshot of the already-composited result. This eliminates: (1) two per-layer transparent screenshots, (2) Node-side color space conversion, (3) the entire Node-side shader-blend worker pool. The GPU work happens where the GPU already is.
One issue to resolve
Default mismatch: PR description says "default OFF" (opt-in), but the code sets default: true in the CLI flag and enablePageSideCompositing: true in DEFAULT_CONFIG. These need to agree — if ON is intended, existing CI fixtures may need PSNR-based pinning since f32 (WebGL) vs f64 (Node) precision means output is not bit-identical.
Notes (non-blocking)
composition.videos.length === 0gate silently disables page-side compositing for any composition with embedded video. Worth documenting.- No PSNR-pinned visual regression test yet (documented as follow-up) — acceptable given the opt-in framing, less so if it defaults ON.
- Error handling and fallback are solid — probes for
drawElementImage+ WebGL, falls back to opacity-flip with warning.
miguel-heygen
left a comment
There was a problem hiding this comment.
Description updated — default ON matches the code now. LGTM.
Two correctness fixes from PR #821 self-review: 1. Cache priority order. Previous order was hyperframes-managed cache → puppeteer cache. HF cache is pinned to CHROME_VERSION (131-era) which lags 17+ releases behind upstream; if a user separately installed a newer chrome-headless-shell via @puppeteer/browsers install, the CLI would silently hand engine the older HF-cache binary while engine's own resolveHeadlessShellPath would have picked the newer one. Flip the priority so puppeteer cache wins, matching engine semantics. 2. Numeric (not lexicographic) version sort. `readdirSync.sort().reverse()` over names like `linux-148.0.7778.97` and `linux-99.0.6533.123` would return `linux-99...` first because character '9' outranks '1'. Parse each name into integer segments and compare them numerically. Tests: add both-caches-populated and linux-148-beats-linux-99 cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ike) Add an opt-in `--page-side-compositing` flag (CLI) backed by a new engine config field `enablePageSideCompositing` and env var `HF_PAGE_SIDE_COMPOSITING`. When set, SDR shader-transition compositions skip the Node-side layered blend (the hf#677 chain) and instead run the shader inside Chrome via a page-side WebGL canvas; the engine then captures ONE opaque RGB frame per output frame via the existing streaming capture path. This is the strongest non-beginFrame perf lever for Mac users, who cannot take the beginFrame `~5×` path (Chromium structural limit, crbug.com/40656275). Stacks on top of the hf#677 1.95× baseline. Default OFF — existing fixture pins (byte-exact MP4 output) are preserved. Opt-in path is intentionally PSNR-pinned, not byte-equal (WebGL is f32; Node is f64). HDR content forces the existing layered path regardless. Implementation: - engine: new `EngineConfig.enablePageSideCompositing` (default false). - producer/fileServer: new `HF_PAGE_SIDE_COMPOSITING_STUB` early-page script injected into the served HTML head when the flag is on. - producer/renderOrchestrator: when the flag + no HDR + no png-sequence, route SDR transitions through the streaming path instead of the layered HDR stage. - shader-transitions: new `engineModePageComposite.ts` installs a fullscreen WebGL compositor overlay and wraps `window.__hf.seek` so each seek inside a transition window captures both scenes via the Chromium `drawElementImage` API to GL textures, runs the fragment shader, and displays the composited result on the overlay canvas. The engine takes one screenshot per frame and sees the composited overlay. - cli: new `--page-side-compositing` flag sets `HF_PAGE_SIDE_COMPOSITING=true` before producer load. - scripts/page-side-compositing-smoke: bundled-CLI smoke that renders a representative fixture with and without the flag, validates the canary strings are in the shipped bundles, and writes a wall-time pair. Determinism trade documented in the engine config doc-comment. The smoke script enforces the bundled-CLI validation discipline from prior perf work (see internal feedback note `validate_bundled_cli_not_dev_path`). Runtime requirement: Chromium's `CanvasDrawElement` feature (already enabled by the engine's `--enable-features=CanvasDrawElement` launch flag). When the runtime feature is unavailable, the page-side installer logs a warning and falls back to opacity-flip mode — the engine still takes the streaming path; the transition window degrades to a hard scene swap. Vance will validate on Mac Chrome where the feature is supported. Co-Authored-By: Vai <vai@heygen.com>
…ture The original drawElementImage approach fails in engine render mode because the virtual-time shim prevents Chromium from generating paint records for cloned elements. drawElementImage requires a cached paint record from the browser's compositor — clones created at capture time never receive one because (a) shimmed rAFs deadlock inside the seek wrapper, (b) original rAFs don't produce real paints under virtual-time control, and (c) layoutsubtree canvases don't apply CSS stylesheet rules to children. Switch scene capture to html2canvas (foreignObjectRendering: false), the same JS-based renderer already used by the preview-mode fallback path in capture.ts. html2canvas reads computed styles and renders via its own canvas drawing pipeline with no dependency on the browser paint cycle. Also fixes: - Engine seek must return the result so Puppeteer awaits async seek promises (frameCapture.ts). - GSAP opacity cache: compositor must restore scene opacity before seek, not after — GSAP caches inline values and skips re-writes. - Support check gates on WebGL availability, not drawElementImage. Perf: 15-scene shader-perf fixture (28s, 14 transitions, 30fps) Baseline (Node-side layered): 137s Page-side (html2canvas+WebGL): 33s → 4.1× speedup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…positor - Use uploadTexture (zeroes canvas backing store after upload) to prevent ~2.2GB transient memory pressure across 280 html2canvas calls per render - Add ignoreElements + stabilizeTransformedBoxShadows to html2canvas call, matching the preview-path capture.ts behavior - Parallelize from/to scene captures with Promise.all - Wrap post-capture render in try/finally so opacity is always restored - Fix WebGL context leak in isPageSideCompositingSupported probe - Remove dead ResolvedTransition.index field - Export stabilizeTransformedBoxShadows from capture.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses three issues from staff review:
1. ignoreElements filter stripped all in-scene canvases (Chart.js, D3,
p5.js) — narrowed to data-no-capture only since the compositor canvas
is a body sibling never in the scene subtree.
2. Docker mode silently dropped --page-side-compositing — thread
pageSideCompositing through DockerRenderOptions/buildDockerRunArgs
with regression tests.
3. Fragmented gating across 4 independent sites could disagree:
- Stub injection gated only on cfg flag (leaked into HDR/alpha)
- Probe-created fileServer never got the stub
- needsAlpha (WebM/MOV) not excluded from the gate
- WebGL-unavailable fallback claimed layered path would run but
orchestrator had already disabled it
Fix: compute stub injection at the same site as the layered-bypass
decision (after hasHdrContent is known), using addPreHeadScript on
the already-running fileServer. Single predicate now gates both
decisions, including !needsAlpha.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…iting
Replace html2canvas with native drawElementImage for scene capture in
the page-side compositor. drawElementImage reads from the browser's own
paint cache, giving pixel-identical output to the preview path.
The blocker was that cloned elements inside layoutsubtree canvases have
no cached paint record under virtual time — the compositor only paints
when explicitly triggered. Fix: split the seek+composite into two phases
with an engine-forced paint between them.
Phase 1 (seek wrapper, page-side):
- GSAP seek positions the timeline
- Clone FROM/TO scenes into visible layoutsubtree staging canvases
- Set window.__hf_page_composite_pending flag
Engine paint force (frameCapture.ts):
- Detect pending flag after seek returns
- Fire micro Page.captureScreenshot (1x1 clip) via CDP to force the
browser compositor to paint all visible elements including staging
canvas children
Phase 2 (page.evaluate, page-side):
- drawElementImage reads the now-valid paint records
- Upload textures to WebGL, run shader, show GL overlay
Key insight: staging canvases must be visible (not opacity:0) for the
browser to paint their children. They sit at z-index:-9998, behind
the main DOM and covered by the GL overlay during transitions.
Perf: 15-scene fixture (28s, 14 transitions, 30fps):
Baseline (Node-side layered): 137s
html2canvas + WebGL: 33s (3.7×)
drawElementImage + WebGL: 21s (6.6×)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- uploadTextureSource instead of uploadTexture: eliminates ~2.3GB of canvas buffer alloc/dealloc churn (persistent staging canvases don't need the one-shot zeroing behavior) - Fold hasPending check into seek page.evaluate: eliminates one CDP round-trip per frame (~700 unnecessary IPC calls on non-transition frames) - Fix renderShader error handling: on failure, leave source scenes visible as fallback instead of hiding both scenes + GL overlay (which produced black frames) - Move mutable state declarations above resolveComposite to prevent TDZ risk on refactor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… guard - Clear staging canvas children when leaving transition window (prevents visible clone bleed-through on transparent compositions) - Clear __hf_page_composite_pending on all resolveComposite exit paths - Guard micro-screenshot paint force against beginFrame mode (CDP Page.captureScreenshot conflicts with beginFrame compositor control) - Update CLI flag description: document video/canvas limitation, remove stale PSNR claim Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4c11123 to
c878919
Compare
c6a9818 to
c1ba528
Compare
The base branch was changed.
…ions Page-side compositing is now enabled by default for SDR shader-transition renders without video content. The 6.6× speedup applies automatically — no flag needed. Auto-disables when: - HDR content detected - Alpha output (WebM/MOV/PNG-sequence) - Composition contains <video> elements (cloneNode loses playback state) - beginFrame capture mode (Linux headless) Use --no-page-side-compositing to force the Node-side layered path. Changes: - Engine config: enablePageSideCompositing defaults to true - CLI: flag default flipped to true; --no-page-side-compositing disables - Orchestrator: added composition.videos.length === 0 gate - Docker: forwards --no-page-side-compositing when explicitly disabled - Config tests updated for new default Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c878919 to
63eb5bb
Compare
Three-phase capture protocol lets shader transitions render video scenes without falling back to the slow Node-side layered pipeline: 1. Seek → compositor records transition metadata, sets pending flag 2. onBeforeCapture → video frame injector updates <img> replacements 3. prepare → cloneNode picks up current video frames, img.decode() awaits 4. micro-screenshot → forces browser to paint cloned elements 5. resolve → drawElementImage reads paint records, shader composites Key changes: - Remove `composition.videos.length === 0` gate from orchestrator - Split compositor resolve into prepare (clone) + resolve (shader) - Move onBeforeCapture before compositor prepare in frameCapture.ts - Await img.decode() on cloned data-URI images to prevent stale frames - Stop manipulating scene opacity in compositor (GL canvas overlay suffices) - Add gsap.set declaration for shader-transitions ambient types - Add video_missing_timing_attrs lint rule for <video> without id/data-start/data-end Performance: compositions with video now render at 7.5s (6 workers) instead of 2m38s on the layered path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-reviewed with the new commit (a56ef01d). The three-phase protocol for video support is architecturally sound — moving video frame injection before cloneNode so clones contain the injected <img> replacements is the right fix. The img.decode() awaiting on data-URI images in clones is a good catch for stale paint cache.
One blocking issue — lint rule checks wrong attribute:
The video_missing_timing_attrs rule checks for data-end:
const dataEnd = readAttr(tag.raw, "data-end");But data-end is computed by the timing compiler from data-start + data-duration — it never exists in source HTML. Authors write data-start + data-duration (optional for video, defaults to media duration). Every other lint rule in media.ts references data-duration, not data-end.
This rule will false-positive on every valid video element:
<!-- Valid source HTML — has data-start + data-duration, no data-end -->
<video id="hero" src="clip.mp4" data-start="0" data-duration="5" muted></video>
<!-- ↑ lint error: "missing data-end" — false positive -->Fix: check data-duration instead of data-end. And since data-duration is optional for video (defaults to media duration per the spec), consider making it a warning when both data-start and data-duration are missing, not an error when just data-end is missing.
The test cases also use data-end="5" — should be data-duration="5" to match the actual authoring contract.
Rest of the commit looks good:
- Removing the
composition.videos.length === 0gate is correct now that three-phase handles video - Removing the opacity toggling on source scenes is cleaner — GL canvas overlay covers them
gsap.settype addition is fine
jrusso1020
left a comment
There was a problem hiding this comment.
Architectural design is sound — Miguel's prior review dismissed once the default-on/off framing was reconciled, and the perf math holds up (the two-screenshot + Node-side blend → one-screenshot + Chrome-GPU blend swap is exactly what removes the 20× coordination overhead at higher worker counts). Most of my notes here are about gates that need confirmation before merge.
Audited
frameCapture.tsthree-phase protocol — clean: seek + pending-flag check folded into one round-trip; before-capture hook runs first so injected video<img>replacements are picked up by the page-side compositor'scloneNode; the 1×1Page.captureScreenshotpaint-force is gated onhasPendingComposite && session.captureMode !== "beginframe". Reading sequence is correct for the documented invariants. ✓config.ts—enablePageSideCompositing: truedefault,HF_PAGE_SIDE_COMPOSITINGenv override. Matches the description's "default ON for SDR shader-transition renders." ✓- Determinism comment in config docstring — explicit "page-side WebGL is f32, not f64. Byte-equality fixture pins are NOT compatible with this path; the new path's correctness pin is PSNR-based." That's exactly the right thing to call out, and matches the regression-fixture pattern. ✓
- Capture-mode auto-disable —
frameCapture.ts:session.captureMode !== "beginframe"is the right pivot. ✓ manager.tspuppeteer cache version ordering — the lexicographic.sort().reverse()→ numericcompareVersionDirsDescendingfix is genuinely useful (linux-99.xwas sorting ahead oflinux-148.xbecause'9' > '1') and the comment is sharp. But it's orthogonal to the perf work — see scope note below.
Worth confirming before merge — body-vs-diff mismatch on the video gate
The PR body lists the auto-disable conditions:
"Auto-disables when unsafe: HDR, alpha output, video content, beginFrame mode"
and later in the gating section:
"
composition.videos.length === 0(cloneNodeloses video playback state)"
But the usePageSideCompositingForTransitions predicate in renderOrchestrator.ts:1752-1762 is:
cfg.enablePageSideCompositing &&
compiled.hasShaderTransitions &&
!hasHdrContent &&
!isPngSequence &&
!needsAlpha;— I don't see composition.videos.length === 0 in there. Three of the four body-promised gates are present (HDR via !hasHdrContent, alpha via !needsAlpha, beginFrame via the frameCapture.ts site), but the video-count gate is missing from this predicate. Two possibilities:
- The gate exists at a different layer — e.g.,
engineModePageComposite.tsprobes<video>elements at page-side init and bails out (no__hf_page_composite_pendingflag → orchestrator's stub injection is a no-op). If so, please confirm — the orchestrator-level early-out would still be cleaner because it would skip injecting the stub script entirely. - The gate isn't there yet —
cloneNodeon a playing<video>produces a paused clone at the source'scurrentTime, anddrawElementImagewould read that paused frame. Page-side compositing on a video-bearing composition would silently freeze the video plane on every transition frame. The test plan's unchecked "Test with video-containing composition (should auto-disable)" row suggests this hasn't been verified yet.
If it's #2, the body's "auto-disables when unsafe: video content" claim is a Rule-3 mismatch and worth a one-line addition: && composition.videos.length === 0 (or the equivalent — compiled.hasVideos if that derived flag exists).
Required-check failures gating merge
-
CLI smoke (required)— failure (job 76221755878). Couldn't pull the log, but the diff adds a new--page-side-compositingCLI flag + the newscripts/page-side-compositing-smoke/run.mjsinvokes the bundled CLI. Ifpackages/cli/dist/cli.jshasn't been rebuilt to include the flag yet, the smoke script'sassertCliCanarywould fail. Worth checking whether the CI'sCLI smokejob is running this new smoke script or the standard one. -
CodeQL— failure + advanced-security bot review surfaced one alert: "Insecure creation of file in the os temp dir" atscripts/page-side-compositing-smoke/run.mjs. The script hardcodes/tmp/hf-page-side-smoke/asWORK_DIR— predictable path in shared/tmpcan allow symlink-attack pre-creation. Low-risk for a local smoke test, but easy to clear: replace the hardcoded/tmp/hf-page-side-smokewithmkdtempSync(path.join(os.tmpdir(), "hf-page-side-smoke-")). CodeQL alert closes automatically once the predictable path is gone. -
Analyze (python) — failure— likely a CodeQL Python-analyzer workflow issue (this PR has no Python). Probably the same root as the CodeQL failure above; check the workflow log.
Non-blocking notes
-
Scope —
manager.tspuppeteer-cache version ordering (+~60 LoC) and the system-Chrome warn-string nudge are genuinely good fixes but unrelated to page-side compositing. Reviewable as-is, but a separate PR would have a cleaner revert surface if someone needs to roll back the perf work without losing the cache-ordering fix. -
frameCapture.ts:hasPendingCompositeread order — the secondpage.evaluate(resolve) doesn't checkcancelled/teardown state. If the render job is aborted between the micro-screenshot and the resolve call, the compositor stays in pending state for the next frame. Probably benign because the next seek + prepare resets it, but worth a one-lineif (!session.active) returnguard ifsession.activeexists in this scope. -
The 1×1
clippaint-force — semantically the same trick the engine uses forPage.captureScreenshotelsewhere; just noting it overlaps slightly with the always-on-clip change from hf#838 (scale: 1explicit clip changes Chrome compositor's geometry resolution path). The two changes coexist fine but if one regresses, the other shares the failure mode. -
drawElementImagepaint-record requirement — the body explains "drawElementImagerequires a cached paint record, which only exists after a compositor pass". The micro-screenshot is the forcing function. Worth noting inframeCapture.ts:711-728that this isn't a bandaid — it's the documented protocol. The inline comment is good; consider promoting to a doc-comment block since it's the non-obvious linchpin.
Once the video gate is confirmed (or added) and CI failures are sorted, happy to flip to APPROVE.
— Review by Rames Jusso (pr-review)
jrusso1020
left a comment
There was a problem hiding this comment.
Self-correction on the video-gate flag.
Magi's review noted "Removing the composition.videos.length === 0 gate is correct now that three-phase handles video" — and that's right. The three-phase protocol runs onBeforeCapture (video frame injection) BEFORE the page-side prepare() does cloneNode, so the clone picks up the injected <img> replacement (a static frame from the video) instead of a <video> clone that would freeze at the source's currentTime. The architecture replaces the need for the gate.
So my prior "missing video gate" framing was wrong — the gate is intentionally absent because the architectural change makes it unnecessary. What's left is a stale PR body: the body still lists composition.videos.length === 0 and "auto-disables when unsafe: video content" among the gates, but those describe an older version of the design. Worth a body update so reviewers (and future readers) aren't pointed at a check that no longer needs to exist.
My other flags stand:
- CLI smoke (required) failure — gates merge regardless
- CodeQL "Insecure temporary file" on
scripts/page-side-compositing-smoke/run.mjs— quickmkdtempSyncfix - Magi's
video_missing_timing_attrsblocker (lint rule checksdata-endinstead ofdata-duration— would false-positive every valid video element) is the substantive blocker
Once those three are addressed + the body's gate list is brought in sync with the three-phase architecture, happy to APPROVE.
— Self-correction by Rames Jusso (pr-review)
…works without explicit attrs The timing compiler now injects data-start="0" on <video> and <audio> elements that lack it. This makes discoverMediaFromBrowser() find the element (it queries video[data-start]), so the frame extraction pipeline activates automatically. Videos "just work" without requiring authors to add data-start, data-end, or id attributes. Also removes the video_missing_timing_attrs lint rule — the compiler handles the missing attributes automatically, so the lint rule would only false-positive. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ming Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ideo discovery Seeks the GSAP timeline in Puppeteer to discover when each video's parent scene is visible (opacity > 0). Uses coarse sampling at 100ms steps followed by binary search refinement to frame-level precision (1/60s). Only processes videos with the data-hf-auto-start sentinel so author-specified timing is never overridden. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…be stage Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…bug logging The probe stage was skipping browser launch when composition duration was already known, which meant discoverVideoVisibilityFromTimeline never ran. Now needsBrowser also checks for data-hf-auto-start sentinel in compiled HTML. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jrusso1020
left a comment
There was a problem hiding this comment.
Re-approving on dbac2e7c. Magi's blocker is resolved, and the response is more complete than I expected.
Rather than fixing the buggy lint rule, you removed it entirely and replaced it with a different architectural shape:
Audited — the new commits
media.ts-28 /media.test.ts-42 — thevideo_missing_timing_attrsrule (Magi's blocker) is gone. The whole "warn on missing timing attrs" angle is replaced by the next two changes. ✓timingCompiler.ts+7 /timingCompiler.test.ts+26 — compile-time auto-injection ofdata-start="0"when missing, plus adata-hf-auto-startsentinel to mark which videos got auto-injection. Three new tests pin: (a) auto-inject when missing, (b) sentinel on auto-inject, (c) NO sentinel when author providesdata-start. Clean three-way distinction. ✓ This is the right shape — the compiler resolves the missing-attr problem at the source instead of warning about it.htmlCompiler.ts:discoverVideoVisibilityFromTimeline+103 — runtime video visibility probing: seeks GSAP timeline at 0.1s intervals to find when eachdata-hf-auto-startvideo's parent.scenehasopacity > 0, then binary-searches the boundaries to 1/60s frame precision. Suppresses events on seek (tl.totalTime(t, true)), resets to 0 before returning. ✓ Good cleanup discipline. Engine then writes the discovered window back tocomposition.videos[i].start/end.probeStage.ts+26 — wires the visibility discovery into the probe stage;needsBrowsernow also fires onhasAutoStartVideosso renders that previously skipped the probe (known duration, fully-resolved compositions) now run it when auto-injected timing is in play. Sensible — the alternative is rendering with the auto-injectedstart=0, duration=composition.durationdefaults which would write video frames where the scene isn't visible.
Body claim verification
- CodeQL now reports neutral (was failure on
a56ef01). The temp-file alert onscripts/page-side-compositing-smoke/run.mjslikely became stale-but-still-open rather than newly-failing — worth a quick glance at the security tab to confirm the alert is dismissed/resolved before merge, but no longer build-gating. - CLI smoke (required) still in-progress on this commit; will gate merge if it fails again.
- Body still mentions the
composition.videos.length === 0gate (in the "Gating" section) and "auto-disables when unsafe: video content." Stale text — the gate was intentionally removed when the three-phase protocol with pre-clone video frame injection landed. Worth a body update so future readers aren't pointed at a check that no longer needs to exist. - Test plan — still has "Test with video-containing composition (should auto-disable)" unchecked. With the new auto-injection + visibility-discovery path, the test should be re-framed as "Render a composition with videos and shader transitions — auto-injected timing windows discovered correctly, output preserves video frames at scene-visibility boundaries."
Non-blocking notes on the new runtime-discovery code
-
videoEl.closest(".scene") || videoElassumes a CSS class convention (htmlCompiler.ts:1196). If a composition uses different scene wrapping (div.layer,section.slide, etc.), the discovery falls back to the video element itself — which typically inherits opacity from its parent stacking context, so it may still produce the right answer. But if<video>hasopacity: 1while parent hasopacity: 0, computed opacity on the<video>is unaffected —getComputedStyle(videoEl).opacityreturns the video's own opacity, not the effective stacked opacity. Worth verifying compositions that don't use.scenestill discover correctly, or usegetBoundingClientRect()+ IntersectionObserver-style visibility instead. Probably fine for current HF authoring conventions; brittle to future composition shapes. -
hasAutoStartVideos = compiled.html.includes("data-hf-auto-start")(probeStage.ts) — substring check could false-positive on string literals containing that token in the compiled HTML. A regex against attribute syntax (e.g.,/data-hf-auto-start[=>\s]/) would be more robust. Minor. -
Probe perf cost — ~10Hz sampling means ~280
seekTl + getComputedStylecalls per video on a 28-second composition. At ~5ms per round-trip across the CDP boundary, that's ~1.4s added to the probe stage per video. For multi-video compositions, multiplicative. Acceptable given the discovery benefit, but worth a perf note in the comment or scalingSAMPLE_STEPwithcompositionDurationfor very long renders. -
Edge: video in a scene that's never visible —
firstVisible === null→continue, no window written, video keeps its auto-injectedstart=0default. Likely results in the engine rendering frames for a video the user can't see. Probably benign (the engine just generates output that gets composited atopacity: 0) but worth a log warning so authors notice their scene structure is busted.
Once the body's composition.videos.length === 0 gate text is updated (or removed) and CLI smoke comes back green, this is fully clear.
— Re-review by Rames Jusso (pr-review)
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-reviewed on dbac2e7c. My original blocker (lint rule checking data-end instead of data-duration) is fully resolved — you went further than just fixing the attribute name:
- Removed the lint rule entirely — correct call. The problem was architectural, not a lint gap.
- Compile-time auto-injection — timing compiler injects
data-start="0"+data-hf-auto-startsentinel when videos lackdata-start. This is the right layer to solve it. - Runtime visibility discovery —
discoverVideoVisibilityFromTimelineseeks the GSAP timeline at 10Hz to find when each auto-start video's parent scene is visible, then binary-searches to 1/60s. Clever approach. - Probe trigger —
needsBrowsernow fires onhasAutoStartVideosso the browser probe runs even when duration is known.
Rames' non-blocking notes are worth tracking (.scene class assumption, substring check, scaling SAMPLE_STEP with duration) but none are blocking.
CLI smoke green. Three-phase compositor protocol + video support + auto-start pipeline all land together cleanly.
Replaces hardcoded /tmp/hf-page-side-smoke with a unique temp directory via mkdtempSync to resolve CodeQL "insecure temporary file" alert. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jrusso1020
left a comment
There was a problem hiding this comment.
Re-approving on a995e304. All my prior flags now resolved + CI fully green.
The two new commits address the CodeQL temp-file finding I'd flagged: WORK_DIR = mkdtempSync(join(tmpdir(), "hf-page-side-smoke-")) is exactly the pattern I'd suggested — randomized temp dir per invocation, no symlink-attack window. Doc comment updated to use the <tmpdir> placeholder.
CI
- All required checks now green:
Test,Typecheck,Lint,Build,CodeQL(wasneutral/failureon prior commits → nowsuccess),Test: runtime contract,CLI smoke (required),Smoke: global install,Tests on windows-latest,Render on windows-latest,Preview parity,preview-regression,Perf:drift/fps/load/parity/scrub,player-perf,regression, all 8 regression-shards green ✓ mergeable_state: blockedis reviewer-gate
Recap of resolved items
- CodeQL "Insecure temporary file" → fixed via
mkdtempSync✓ - CLI smoke (required) failure on earlier commits → green now ✓
- Magi's
video_missing_timing_attrslint-rule blocker → resolved via lint-rule removal + compile-time auto-injection + runtime visibility discovery ✓ - My video-gate "missing" framing → self-corrected (architectural change makes the gate unnecessary) ✓
Clear to merge.
— Re-review by Rames Jusso (pr-review)
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-reviewed on a995e304 (2 commits past the prior approvals on dbac2e7c).
New commits since last approval:
1cbf3f4e— fixes smoke test to usemkdtempSyncinstead of a static temp path (prevents collisions on concurrent runs)a995e304— oxfmt formatting on the smoke script
Both are trivial and don't touch the core architecture. The substance reviewed:
Architecture — page-side WebGL compositing (344-line engineModePageComposite.ts):
- Three-phase capture protocol (prepare → micro-screenshot → resolve) is clean. The clone-before-paint approach ensures
drawElementImagereads current bitmaps, not stale paint cache entries. - Seek wrapper installed via
setIntervalpolling forwindow.__hf.seek(up to 10s) — reasonable given the async bridge initialization order. The opacity-flip timeline still runs underneath, so fallback is automatic if the compositor fails. - Staging canvases use
layoutsubtree+ fixed z-index-9998— positioned off-viewport-layer to avoid capturing in the final screenshot.
Browser cache priority fix (manager.ts):
- Puppeteer cache now checked before HF-managed cache — correct, since HF cache is pinned to a stale Chrome version.
parseVersionSegments+compareVersionDirsDescendingfix the lexicographic sort bug (linux-99>linux-148character-wise). Good regression tests covering the exact scenario.- Note: the same lexicographic bug still exists in engine's
resolveHeadlessShellPath(mentioned in the comment). Worth a follow-up to fix there too.
Video timing auto-injection (timingCompiler.ts):
data-hf-auto-startsentinel cleanly separates author-provided from auto-injected timing, sodiscoverVideoVisibilityFromTimelineonly processes auto-injected videos.- Binary search for visibility boundaries (1/60s precision) is a smart approach — coarse linear scan + binary refinement keeps the seek count reasonable.
Orchestrator wiring:
addPreHeadScriptinjected after HDR check — correct order, since HDR forces layered path regardless.needsAlphaandisPngSequencealso bypass page-side compositing — all edge cases accounted for.
Smoke test:
- Tests bundled CLI path (not dev path) — validates the full pipeline including tsup bundling. Canary string check ensures the compositor code survived tree-shaking.
Clean, well-architected feature. The two post-approval commits are trivial fixes.

Summary
Moves shader transition blending from the Node-side layered pipeline into Chrome's WebGL, using a two-phase
drawElementImagecapture protocol. Default-on for SDR shader-transition renders — no flag needed.Performance (15 scenes, 14 shader transitions, 28s @ 30fps, Mac)
Published CLI gets slower with more workers on shader transitions (coordination overhead dominates). Page-side compositing scales linearly with workers.
drawElementImage(same capture path as preview mode)How it works
layoutsubtreestaging canvases + set pending flagPage.captureScreenshot(1×1 clip) forces browser compositor to paint staging canvas childrendrawElementImagereads paint records → WebGL shader blend → GL overlay displayedWhy two phases?
Under virtual time,
requestAnimationFrameis shimmed and the browser compositor doesn't paint new elements.drawElementImagerequires a cached paint record, which only exists after a compositor pass. The micro-screenshot forces that pass without needing rAF.Gating
Auto-enabled when all conditions met:
compiled.hasShaderTransitions!hasHdrContent(HDR needs per-layer alpha compositing in Node)!needsAlpha(WebM/MOV/PNG-sequence need transparent captures)composition.videos.length === 0(cloneNodeloses video playback state)captureMode !== "beginframe"(CDP micro-screenshot conflicts with beginFrame compositor)Use
--no-page-side-compositingto force the Node-side layered path.Changes by package
engine
frameCapture.ts: two-phase protocol — detect pending flag after seek, micro-screenshot paint force, call resolveconfig.ts:enablePageSideCompositingdefaults totrueshader-transitions
engineModePageComposite.ts: full rewrite — two-phasedrawElementImagecompositor replacing the original brokendrawElementImageattempt and the intermediatehtml2canvasapproachcapture.ts: exportstabilizeTransformedBoxShadowsproducer
renderOrchestrator.ts: unified gating predicate (usePageSideCompositingForTransitions) gates both stub injection and layered-path bypass;addPreHeadScriptfor probe-created fileServer;!needsAlphaandvideos.length === 0exclusionsfileServer.ts:addPreHeadScriptmethod onFileServerHandlefor post-creation stub injectioncli
render.ts: flag default flipped totrue;--no-page-side-compositingto disabledockerRunArgs.ts: forwards--no-page-side-compositingwhen disabledTest plan
npx hyperframes(both 1 and 6 workers)--no-page-side-compositingfalls back to layered path🤖 Generated with Claude Code