
Prefer NVENC for WebRTC H264 encoding#2329

Open
balthazur wants to merge 8 commits into main from codex/h264-nvenc-webrtc

Conversation

@balthazur
Contributor

balthazur commented May 12, 2026

Summary

  • Prefer NVIDIA h264_nvenc for aiortc H.264 WebRTC encoding when available.
  • Fall back to the existing libx264 encoder when NVENC is unavailable or fails.
  • Add scoped [WEBRTC_NVENC] logs for patch activation, encoder selection, fallback, and resolution-triggered encoder recreation.
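The resolution-triggered encoder recreation mentioned above follows a common caching pattern: remember the dimensions the encoder was opened with, and rebuild it when the incoming frame size changes. A minimal stdlib-only sketch of that pattern (class and parameter names are illustrative, not the PR's actual code):

```python
class EncoderCache:
    """Recreate the underlying encoder whenever the frame size changes."""

    def __init__(self, open_encoder):
        # open_encoder: callable (width, height) -> encoder instance
        self._open_encoder = open_encoder
        self._encoder = None
        self._size = None

    def get(self, width, height):
        # Reopen only when no encoder exists or the resolution changed.
        if self._encoder is None or self._size != (width, height):
            self._encoder = self._open_encoder(width, height)
            self._size = (width, height)
        return self._encoder
```

In the real patch this is where an NVENC `av.CodecContext` would be (re)opened; the sketch only shows the cache-and-invalidate shape.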

Why

Modal GPU workers run on NVIDIA GPUs, so H.264 hardware encoding can reduce server-side encode latency without changing the negotiated browser codec. H.264 stays the WebRTC-compatible path; this does not introduce H.265/HEVC or bitrate/congestion-control changes.

Testing

  • python3 -m py_compile inference/core/interfaces/webrtc_worker/h264_nvenc.py inference/core/interfaces/webrtc_worker/webrtc.py inference/core/interfaces/stream_manager/manager_app/webrtc.py
  • Local smoke test with repo venv and aiortc 1.14.0: confirmed h264_nvenc fallback to libx264 on a non-NVIDIA Mac still produces H.264 packets.
  • git diff --check

Staging benchmark

| Encoder | Hardware | 1080p encode time | Relative speed | Notes |
| --- | --- | --- | --- | --- |
| libx264 | CPU | ~14-17 ms/frame steady-state | baseline | Works everywhere, but uses CPU and can become expensive at higher resolution/bitrate. |
| h264_nvenc | NVIDIA GPU | avg ~2.6 ms/frame, min ~2.1 ms, max ~3.1 ms | ~5-6x faster | Uses the dedicated NVIDIA video encoder on T4/L4/L40S. Frees CPU and gives much more headroom for 1080p WebRTC output. |

In staging, software libx264 encoding took roughly 14-17 ms per 1080p frame. With h264_nvenc, sampled encode time dropped to about 2.6 ms per frame, roughly a 5-6x improvement. This does not by itself solve all WebRTC congestion/ramp-up behavior, but it removes server-side H.264 encoding as a major bottleneck on NVIDIA GPU Modal workers.
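Per-frame encode times like those above can be sampled with a simple wall-clock harness around the encode call. This is a generic sketch, not the benchmark code used in staging; `encode_frame` and the warmup count are assumptions:

```python
import statistics
import time


def sample_encode_times(encode_frame, frames, warmup=5):
    """Time encode_frame over frames and return (avg_ms, min_ms, max_ms)
    for steady-state samples, skipping the first `warmup` iterations."""
    samples_ms = []
    for i, frame in enumerate(frames):
        start = time.perf_counter()
        encode_frame(frame)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Early encodes include one-off context/allocation overhead.
        if i >= warmup:
            samples_ms.append(elapsed_ms)
    return statistics.mean(samples_ms), min(samples_ms), max(samples_ms)
```

Skipping warmup frames matters here because the first NVENC encode pays the cost of opening the hardware context, which would skew the average.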

For comparison, I also tested VP8: backend VP8 encode sampled around 10-17 ms/frame, while browser-side encode/decode stayed in the same low-ms range as H.264, so VP8 did not look like the better backend output codec.


Note

Medium Risk
Monkey-patches aiortc's H264Encoder to prefer GPU NVENC, which can affect WebRTC video stability/compatibility and may surface runtime/driver-specific failures despite the fallback path.

Overview
Prefers NVIDIA NVENC for WebRTC H.264 video encoding when available.

Adds h264_nvenc support via a runtime patch (prefer_h264_nvenc_encoder) that overrides aiortc.codecs.h264.H264Encoder._encode_frame. The patched method tries to open an NVENC av.CodecContext, updates the bitrate, recreates the encoder on resolution change, and falls back to the original libx264 path when NVENC is unavailable or an encode error occurs, emitting [WEBRTC_NVENC] scoped logs throughout.

Enables this behavior by invoking prefer_h264_nvenc_encoder() at import time in both the stream manager WebRTC app and the WebRTC worker entrypoints.
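The patch-and-fall-back shape described above can be illustrated in isolation. This is a minimal stdlib-only sketch of the pattern; the `H264Encoder` stand-in and `prefer_hw_encoder` helper are placeholders, not aiortc's real API:

```python
import logging

logger = logging.getLogger("WEBRTC_NVENC")


class H264Encoder:
    """Stand-in for the encoder class being patched (placeholder, not aiortc)."""

    def _encode_frame(self, frame, force_keyframe):
        return ["libx264-packet"]  # pretend software-encoded output


def prefer_hw_encoder(encode_hw):
    """Monkey-patch H264Encoder._encode_frame to try a hardware encode first,
    falling back to the original software path on any failure."""
    original = H264Encoder._encode_frame

    def patched(self, frame, force_keyframe):
        try:
            return encode_hw(frame, force_keyframe)
        except Exception as exc:
            # Keep the stream alive: log and use the original encoder.
            logger.warning("[WEBRTC_NVENC] falling back to software: %s", exc)
            return original(self, frame, force_keyframe)

    H264Encoder._encode_frame = patched
```

Capturing `original` before assignment is what makes the fallback safe: even after patching, the software path remains reachable on every frame, so a transient NVENC failure degrades performance rather than breaking the track.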

Reviewed by Cursor Bugbot for commit 8805940.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 6a9096f.

Comment thread inference/core/interfaces/webrtc_worker/h264_nvenc.py Outdated
@grzegorz-roboflow
Collaborator

Let's measure this; the CPU->GPU->CPU round trip might actually make this slower.

@balthazur
Contributor Author

Let's measure this; the CPU->GPU->CPU round trip might actually make this slower.

@grzegorz-roboflow what exactly do you want to measure? So far I've measured the raw encoding time. Let me know and I can run a comparison.

