Skip to content

docs: design convergence optimization plan#60

Merged
0x1abin merged 18 commits intomainfrom
cursor/design-optimization-plan-85ea
May 7, 2026
Merged

docs: design convergence optimization plan#60
0x1abin merged 18 commits intomainfrom
cursor/design-optimization-plan-85ea

Conversation

@0x1abin
Copy link
Copy Markdown
Owner

@0x1abin 0x1abin commented Apr 28, 2026

Summary

  • Update the design draft to match the current feature matrix, resource profile, and RFC-first protocol guidance.
  • Replace the stale week-based implementation schedule with the current phase status.
  • Add Phase 10 as an active design convergence plan covering documentation convergence, output lifetimes, timeout API, and RTC orchestration split.
  • Merge latest origin/main into this branch.
  • Remove the public nanortc_t layout / API-ABI boundary optimization item from the plan.

Verification

  • CC=gcc CXX=g++ ./scripts/ci-check.sh --clean — 42/42 passed before merging latest main.
  • CC=gcc CXX=g++ ./scripts/ci-check.sh --fast — 15/15 passed after merging latest main.
  • CC=gcc CXX=g++ ./scripts/ci-check.sh --fast — 15/15 passed after removing the public layout optimization item.
Open in Web Open in Cursor 

Co-authored-by: 0x1abin <0x1abin@gmail.com>
@cursor cursor Bot force-pushed the cursor/design-optimization-plan-85ea branch from 2a09417 to c219f40 Compare April 28, 2026 03:28
@cursor cursor Bot changed the title docs: 设计收敛优化计划 docs: design convergence optimization plan Apr 28, 2026
…ation-plan-85ea

Co-authored-by: 0x1abin <0x1abin@gmail.com>
@0x1abin 0x1abin marked this pull request as ready for review April 28, 2026 03:41
Copilot AI review requested due to automatic review settings April 28, 2026 03:41
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates NanoRTC’s documentation set to reflect current implementation reality and adds an active Phase 10 plan to drive “design convergence” work into small, reviewable follow-up PRs.

Changes:

  • Added Phase 10 execution plan covering API boundary/lifetime rules, timeout API, accessor cleanup, and nano_rtc.c refactor strategy.
  • Refreshed the design draft’s feature matrix/module mapping, resource references, and roadmap/status section to match current phases and “RFC-first” guidance.
  • Updated the plans index (docs/PLANS.md) to include Phase 10 and adjust total session estimates.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
docs/exec-plans/active/phase10-design-convergence.md New active plan defining Phase 10 scope and sub-PR breakdown for convergence/API boundary work
docs/design-docs/nanortc-design-draft.md Updates feature flags/mapping, resource pointers, and roadmap/status summary to match current state and RFC-first guidance
docs/PLANS.md Adds Phase 10 to the active plans index and updates total effort estimate

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docs/exec-plans/active/phase10-design-convergence.md Outdated
cursoragent and others added 16 commits April 28, 2026 03:56
Co-authored-by: 0x1abin <0x1abin@gmail.com>
Use the standard `**Estimated effort:**` label and reflect the
2–3 session estimate already in `docs/PLANS.md` so the index
and the phase doc do not contradict each other.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…etime contract

Adds a public deadline-aggregator API so embedded event loops can block
in select()/poll()/epoll_wait()/RTOS waits up to the next protocol tick
instead of a fixed cadence. Aggregates per-subsystem deadlines from ICE
(check pacing + consent), STUN srflx retry, TURN refresh timers, SCTP
RTO/heartbeat, and RTCP SR cadence; clamps to NANORTC_MIN_POLL_INTERVAL_MS
during DTLS handshake (crypto-provider-driven retransmit) and falls back
to a 1000 ms idle cap so callers never sleep indefinitely.

Per-module helpers: ice_next_timeout_ms / turn_next_timeout_ms /
nsctp_next_timeout_ms expose the smallest armed deadline of each layer.

Documents the output queue's pointer-lifetime contract on
nanortc_output_t: payload pointers (transmit.data, event payloads) are
valid only until the next nanortc_poll_output() / nanortc_handle_input()
/ nanortc_destroy() call. Internal scratch (turn_buf, video pkt_ring,
DTLS/SRTP buffers) is reused, so callers must drain or memcpy before
re-entering the library.

Real callsites: examples/common/run_loop_linux.c and run_loop_esp.c
shorten their select() timeout via the aggregator before blocking, so
DTLS retransmits / ICE checks / SCTP RTOs that fall inside the
max_poll_ms ceiling fire on the library's actual deadlines.

Coverage:
- tests/test_next_timeout.c — idle cap, DTLS handshake clamp, ICE
  pacing, ICE consent, STUN srflx retry, TURN refresh, multi-source min
- tests/test_output_lifetime.c — pkt_ring drain-before-wrap, pkt_ring
  overrun aliasing, TURN lazy wrap rewrite, oversized wrap drop

Closes Phase 10 PR-2 + PR-3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…negotiate.c

Moves the SDP-driven negotiation entry points (nanortc_create_offer /
nanortc_accept_offer / nanortc_accept_answer), trickle-ICE candidate
helpers (nanortc_add_local_candidate / nanortc_add_remote_candidate /
nanortc_end_of_candidates plus the candidate-string builder), and the
WebRTC-style iceServers configuration (nanortc_set_ice_servers /
rtc_apply_ice_servers) out of nano_rtc.c into a sibling translation
unit nano_negotiate.c.

The split is deliberately shallow: the receive/timer backbone
(rtc_process_receive, rtc_process_timers, output queue helpers, media
send paths) stays in nano_rtc.c because it shares state — out_queue,
TURN wrap meta, SRTP session — that splitting would either fragment
or force into widely-exposed internal helpers.

Cross-file contract lives in src/nano_rtc_internal.h (not part of the
public API): rtc_emit_event_full, rtc_build_candidate_str,
rtc_emit_ice_candidate, rtc_cache_fingerprint, rtc_apply_ice_servers.

No public API change. Module dependency graph unchanged. CMake source
list updated for both host and ESP-IDF builds.

Closes Phase 10 PR-4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 10 PR-1 documentation pass:

- design-docs/nanortc-design-draft.md — drop the inline event-loop
  template, the per-module line-count estimates, the duplicated phase
  status table, and the verbose crypto provider / SDP-conditional code
  walkthroughs. Pivot to a "Further reading" pointer block (ARCHITECTURE,
  PLANS, QUALITY_SCORE, RFC index, memory profiles, run_loop.c) so the
  draft is the *concept* document and the authoritative implementation
  details live where the code is.
- engineering/development-workflow.md — add a Design Freshness Checklist
  so PRs that change protocol behavior, public API, feature flags, or
  resource footprint update the four canonical docs in the same PR.
- exec-plans/active/phase10-design-convergence.md — mark PR-3 LANDED
  with concrete file references (include/nanortc.h, src/nano_rtc.c,
  tests/test_next_timeout.c, run_loop_{linux,esp}.c) and remove the
  duplicated API contract / verification list (single source of truth
  is the public-header doxygen).
- exec-plans/active/phase8-continued-optimization.md — collapse P2-C
  into a one-line redirect; design context lives in Phase 10 PR-3.
- PLANS.md — phase 10 status row reflects PR-3 landed 2026-04-29.

Closes Phase 10 PR-1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's "Architecture Constraints / symbol naming" check enforces an
allowlist of internal module prefixes (nano_, nanortc_, stun_, ice_,
turn_, …). The Phase 10 PR-4 split exposed five helpers that were
previously file-private static symbols and used a bare rtc_ prefix —
not on the allowlist:

  rtc_apply_ice_servers
  rtc_build_candidate_str
  rtc_cache_fingerprint
  rtc_emit_event_full
  rtc_emit_ice_candidate

Rename them to nano_rtc_* so they fit the existing nano_ allowed
prefix and read as the rtc orchestration sub-symbols, matching the
nano_rtc_internal.h filename convention.

No behavior change. Run clang-format on the touched files for the
one line that overran the column limit after the longer name.

Verification: cmake build with all features (DC+AUDIO+VIDEO+H265),
nm symbol scan with the CI allowlist (zero violations), 26/26 ctest
pass, clang-format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the rtc_enqueue_transmit() out_queue overflow counter out of the
NANORTC_FEATURE_TURN guard so CORE_ONLY/DATA/AUDIO builds also see
caller-side drain bugs surface as a non-zero counter instead of a
silent drop. The accompanying NANORTC_LOGW already fired universally;
only the stats field was TURN-only.

Also add a tests/CMakeLists.txt small-ring variant
(test_output_lifetime_min) that compiles the existing lifetime burst
test against NANORTC_OUT_QUEUE_SIZE=4 + NANORTC_VIDEO_PKT_RING_SIZE=4
so the aliasing window is small enough for the cases
(drain-before-wrap, overrun-aliases-pre-drain, lazy-wrap-rewrites,
oversized-payload-drops) to actually trip in CI. At default 32/32
sizes the burst the test generates fits without wrap, which masked
regressions in the dispatch path.

Library struct layout is fixed at compile time, so the variant
re-compiles NANORTC_SOURCES into the test executable with the
overrides; ccache makes the rebuild near-free.

PR-2 §Verification — phase 10 design convergence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sk log

Wire the PR-2 audit signals into the example so the bench operator can
sample them without a working serial monitor:

- /debug HTTP endpoint adds a "lifetime" line carrying out_q occupancy,
  pkt_overrun, wrap_drop, tx_full, via_turn, and direct counters
  (TURN-only counters fall back to 0 when NANORTC_FEATURE_TURN is off).
  Bumps the response buffer 512 -> 768 to fit the new fields.
- webrtc_task emits the same line via ESP_LOGI every 5 s while
  s_connected, so a passive serial capture is enough to verify the
  contract holds across a full session.

5 s cadence keeps the serial console quiet under static scenes while
pinpointing under-sized rings or caller drain bugs before they reach
the wire. Used during the phase 10 PR-2 hardware bench on ESP32-P4
nano.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…p HW H.264

Phase 10 PR-2 swept smaller ring sizes (16/16, then 32/16) on the
ESP32-P4 nano camera bench. Both initially looked clean over a 5-min
static-scene capture but regressed once the camera saw real motion:
1080p P-frames in motion hit 12-15 KB → 11-13 FU-A fragments per
nanortc_send_video() call, which combined with concurrent audio
(50 pkt/s) + RTCP + ICE consent leaves no headroom in the dispatch
queue under FreeRTOS scheduling jitter, surfacing as repeated
"tx queue full, dropping output" log lines and visible playback
freezes.

- sdkconfig.defaults: keep CONFIG_NANORTC_OUT_QUEUE_SIZE=32 (pkt_ring
  inherits) and replace the comment with a one-liner pointing at the
  PR-2 sweep finding so future tuning attempts see the prior failure
  mode without re-reading the doc.
- main/Kconfig.projbuild: bump CONFIG_EXAMPLE_H264_BITRATE_KBPS default
  1024 -> 2048. The encoder isn't starving detail in moving scenes at
  the higher target, and the rings are sized to absorb the resulting
  bursts; the help text records the tradeoff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…izing caveat

- PLANS.md: mark phase 10 PR-2 (output lifetime + ESP32-P4 nano
  hardware bench) as landed 2026-04-30.
- exec-plans/active/phase10-design-convergence.md: replace the prior
  "Resource sweep (follow-up)" + "Correction" sections with a single
  tight paragraph documenting that 16/16 and 32/16 both broke under
  bursty 1080p P-frames, why the 5-min static-scene methodology
  missed it, and why the example now pins 32/32 + 2048 kbps.
- engineering/memory-profiles.md: split the OUT_QUEUE_SIZE column to
  show the per-example tier (8 audio/DC, 16 esp32_video, 32
  esp32_camera) and add a callout under the NACK ring section that
  OUT_QUEUE_SIZE itself is not freely shrinkable on HD profiles —
  give the sizing formula
  (>= ceil(max_p_frame_bytes/MTU) + audio + control_headroom).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 10 PR-4 slice 2: the trailing NANORTC_HAVE_MEDIA_TRANSPORT region
of nano_rtc.c (track add APIs, audio/video send + packetization, video
pkt_ring, H.265 sprop, keyframe/PLI feedback, BWE knobs, per-track stats)
only touched media-owned state on nanortc_t and is moved verbatim into
src/nano_rtc_media.c. The file-static rtc_enqueue_transmit helper is
promoted to nano_rtc_enqueue_transmit and declared in
nano_rtc_internal.h so the new TU can enqueue without re-implementing
the lazy TURN wrap path or the output lifetime contract. CMake source
lists (host + ESP-IDF) updated; nano_rtc.c shrinks from 2,623 to 1,930
lines. Public API, module dependency graph, and feature-matrix behavior
are unchanged. Phase 10 doc records the slice as landed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rtc_media.c

Phase 10 PR-4 slice 3 — pure move, no behavior change.

- Add nano_rtc_media_handle_rtp_or_rtcp() and nano_rtc_media_emit_rtcp_sr_cadence()
  to nano_rtc_internal.h.
- rtc_process_receive() and rtc_process_timers() shrink to one-line dispatchers
  under #if NANORTC_HAVE_MEDIA_TRANSPORT.
- Prune now-unused #include (nano_rtcp.h, nano_twcc.h, nano_h264.h) from nano_rtc.c.
- nano_rtc.c now owns the transport backbone (ICE/TURN/DTLS/SCTP demux, output
  queue, timer dispatcher, lifecycle); nano_rtc_media.c owns every media-specific
  path (track APIs, send packetization, pkt_ring, BWE knobs, RTP/RTCP receive,
  RTCP SR cadence).

./scripts/ci-check.sh passed 42/42 incl. libdatachannel interop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure rename for naming consistency with nano_rtc_media.c. Updates
CMakeLists.txt source lists (host + ESP-IDF), comment refs in
nano_rtc.c and nano_rtc_internal.h, and the phase10 plan log entry.
No logic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sync ARCHITECTURE.md, design draft, QUALITY_SCORE, and PLANS with the
post-PR-4 orchestration split (nano_rtc.c + nano_rtc_negotiate.c +
nano_rtc_media.c) and current 7-combo feature matrix.

- ARCHITECTURE.md: extend module graph with nano_h265 / nano_twcc /
  nano_log / nano_media / nano_annex_b / nano_base64; add explicit rows
  for TURN, H265, ICE_SRFLX in the feature table; rewrite State Machine
  layer + Key Files to cover the orchestration trio and
  src/nano_rtc_internal.h.
- design-draft: drop the stale "~60KB RAM" line in §1.1, add a
  MEDIA_H265 row (TBD pending Phase 3.5 wiring) to §1.3, and rewrite
  §6.1 so it no longer contradicts §8's reference table.
- QUALITY_SCORE: add Media common (nano_media), nano_twcc, and the
  logging shim; rewrite the Main FSM row for the three-TU split; note
  the matrix grew to 7-combo x 2-crypto.
- development-workflow: promote ARCHITECTURE.md to the first item in
  the Design Freshness Checklist (5 docs, not 4).
- PLANS / phase10: mark PR-1 landed 2026-05-03 and record the landed
  work subsection.
```
@0x1abin 0x1abin merged commit 23c9c01 into main May 7, 2026
25 checks passed
@0x1abin 0x1abin deleted the cursor/design-optimization-plan-85ea branch May 7, 2026 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants