Add optimizations for connection & handshake#820

Open
noboruma wants to merge 4 commits into pion:main from noboruma:mem-opt

Conversation

@noboruma
Contributor

@noboruma noboruma commented Mar 28, 2026

Description

- Add MarshalInto functions to allow for buffer pre-allocation of headers
- Add pooling around timers
- Avoid unnecessary copies in compact scenarios where the packets are already MTU size
- Add cheap cache for MessageCertificate
- Avoid unnecessary copies/allocations in Content ApplicationData

@noboruma noboruma force-pushed the mem-opt branch 3 times, most recently from 160fcda to 4b744a7 Compare March 29, 2026 15:36
@noboruma noboruma marked this pull request as draft March 30, 2026 01:22
@noboruma noboruma force-pushed the mem-opt branch 11 times, most recently from 6e083b7 to ca3feea Compare March 30, 2026 08:06
@theodorsm
Member

Thanks for working on optimizing the library, really cool!

Just a few comments on the draft so far:

I don't know if we should export a new API with the MarshalInto functions. I am slightly against it, but I can be persuaded. Is that a common pattern?

Could you provide some before/after benchmarks for your optimizations?
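
One possible way to get before/after allocation counts is sketched below with `testing.AllocsPerRun`. The functions `marshalFresh` and `marshalInto` are hypothetical stand-ins for an allocating and a buffer-reusing marshal path; they are not functions from pion/dtls. For real numbers, `go test -bench . -benchmem` on the actual code paths would be more representative.

```go
package main

import (
	"fmt"
	"testing"
)

// headerLen is the size of a DTLS 1.2 record header in bytes.
const headerLen = 13

// sink keeps results reachable so the allocation is not optimized away.
var sink []byte

// marshalFresh allocates a new buffer on every call (hypothetical "before").
func marshalFresh(contentType byte) []byte {
	out := make([]byte, headerLen)
	out[0] = contentType
	return out
}

// marshalInto writes into a caller-supplied buffer (hypothetical "after").
func marshalInto(buf []byte, contentType byte) int {
	buf[0] = contentType
	return headerLen
}

// measure reports the average allocations per call for both variants.
func measure() (fresh, reused float64) {
	fresh = testing.AllocsPerRun(1000, func() {
		sink = marshalFresh(22)
	})
	buf := make([]byte, headerLen)
	reused = testing.AllocsPerRun(1000, func() {
		marshalInto(buf, 22)
	})
	return fresh, reused
}

func main() {
	fresh, reused := measure()
	fmt.Printf("allocs/op: fresh=%.0f reused=%.0f\n", fresh, reused)
}
```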

@JoTurk
Member

JoTurk commented Mar 30, 2026

Our RTP library has MarshalTo for packets [1] and headers [2], and we recently introduced it for extensions [3]. I think we should look into a standardized pattern across Pion for zero allocation (I see a lot of PRs for reducing allocations and improving APIs).

  1. https://github.com/pion/rtp/blob/8331114bb459ef73fc51fd710aaedd1b78f407c6/packet.go#L527
  2. https://github.com/pion/rtp/blob/8331114bb459ef73fc51fd710aaedd1b78f407c6/packet.go#L272
  3. pion/rtp@1577354

+1 for MarshalTo, but since we're planning a major release for RTP soon, we can change it to something else.

@noboruma noboruma force-pushed the mem-opt branch 3 times, most recently from 694ce4d to 1ebfd5b Compare April 2, 2026 01:00
@noboruma
Contributor Author

noboruma commented Apr 2, 2026

@JoTurk Do you suggest renaming MarshalInto to MarshalTo and adopting the same API (returning the number of bytes copied)?
I am also proposing another function, Size(), but I find it a bit too generic. MarshalSize() might be better?
This helps identify how many bytes are needed before marshalling, so the number of allocations goes from O(n) to O(1).

Just a heads-up @theodorsm: we are doing handshakes at scale (>100k connections) and this change reduces CPU usage significantly (50% reduction in spikes, GC duration down 80%), and memory usage is smoother overall. However, we have detected a leak, so benchmarking is a bit tricky right now. I will definitely give more accurate numbers soon.
There are many paths where we used to do 2+ allocs and copies and now only need 1 alloc. This helps the GC overall.
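
To illustrate the pattern discussed here, the following is a rough sketch of how a `MarshalSize()` / `MarshalTo()` pair lets a nested structure be serialized with a single allocation. The `header` and `record` types are hypothetical simplifications written for this example; they are not the pion/dtls types, and the signatures in the PR may differ. Only the 13-byte DTLS 1.2 record header layout is taken from the protocol itself.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize is the DTLS 1.2 record header size:
// type(1) + version(2) + epoch(2) + sequence(6) + length(2).
const headerSize = 13

var errBufferTooSmall = errors.New("buffer too small")

// header is a hypothetical, simplified record header.
type header struct {
	ContentType  uint8
	VersionMajor uint8
	VersionMinor uint8
	Epoch        uint16
	Sequence     uint64 // only the low 48 bits are encoded
	Length       uint16
}

// MarshalSize reports how many bytes MarshalTo will write.
func (h *header) MarshalSize() int { return headerSize }

// MarshalTo writes the header into buf and returns the bytes written.
func (h *header) MarshalTo(buf []byte) (int, error) {
	if len(buf) < headerSize {
		return 0, errBufferTooSmall
	}
	buf[0] = h.ContentType
	buf[1] = h.VersionMajor
	buf[2] = h.VersionMinor
	binary.BigEndian.PutUint16(buf[3:], h.Epoch)
	buf[5] = byte(h.Sequence >> 40)
	buf[6] = byte(h.Sequence >> 32)
	binary.BigEndian.PutUint32(buf[7:], uint32(h.Sequence))
	binary.BigEndian.PutUint16(buf[11:], h.Length)
	return headerSize, nil
}

// record is a hypothetical header-plus-payload container.
type record struct {
	Header  header
	Payload []byte
}

// MarshalSize is the sum of the parts, computed without allocating.
func (r *record) MarshalSize() int { return r.Header.MarshalSize() + len(r.Payload) }

// MarshalTo writes header and payload back to back into buf.
func (r *record) MarshalTo(buf []byte) (int, error) {
	if len(buf) < r.MarshalSize() {
		return 0, errBufferTooSmall
	}
	n, err := r.Header.MarshalTo(buf)
	if err != nil {
		return 0, err
	}
	n += copy(buf[n:], r.Payload)
	return n, nil
}

func main() {
	payload := []byte("hi")
	r := &record{
		Header:  header{ContentType: 23, VersionMajor: 0xfe, VersionMinor: 0xfd, Epoch: 1, Sequence: 5, Length: uint16(len(payload))},
		Payload: payload,
	}
	buf := make([]byte, r.MarshalSize()) // the only allocation for the whole record
	n, err := r.MarshalTo(buf)
	fmt.Println(n, err)
}
```

Without the size query, each layer would typically marshal into its own slice and the caller would then append or copy the pieces together, which is where the extra allocations and copies come from.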

@noboruma noboruma marked this pull request as ready for review April 2, 2026 01:08
@noboruma noboruma force-pushed the mem-opt branch 2 times, most recently from 36aa143 to 17e4f90 Compare April 2, 2026 01:23
@codecov

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 81.38075% with 89 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.18%. Comparing base (d7a09d4) to head (d26cbc8).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...otocol/handshake/message_certificate_request_13.go | 62.50% | 9 Missing and 6 partials ⚠️ |
| pkg/protocol/handshake/message_certificate_13.go | 77.55% | 6 Missing and 5 partials ⚠️ |
| .../protocol/handshake/message_server_key_exchange.go | 79.62% | 10 Missing and 1 partial ⚠️ |
| pkg/protocol/handshake/handshake.go | 52.63% | 5 Missing and 4 partials ⚠️ |
| pkg/protocol/handshake/message_client_hello.go | 87.75% | 4 Missing and 2 partials ⚠️ |
| pkg/protocol/handshake/message_server_hello.go | 84.21% | 3 Missing and 3 partials ⚠️ |
| .../protocol/handshake/message_client_key_exchange.go | 83.87% | 3 Missing and 2 partials ⚠️ |
| ...g/protocol/handshake/message_certificate_verify.go | 69.23% | 2 Missing and 2 partials ⚠️ |
| ...protocol/handshake/message_hello_verify_request.go | 63.63% | 2 Missing and 2 partials ⚠️ |
| pkg/protocol/recordlayer/header.go | 60.00% | 2 Missing and 2 partials ⚠️ |

... and 8 more

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #820      +/-   ##
==========================================
- Coverage   82.47%   82.18%   -0.30%     
==========================================
  Files         121      121              
  Lines        6928     7240     +312     
==========================================
+ Hits         5714     5950     +236     
- Misses        803      846      +43     
- Partials      411      444      +33     
Flag Coverage Δ
go 82.18% <81.38%> (-0.30%) ⬇️

@JoTurk
Member

JoTurk commented Apr 2, 2026

> @JoTurk Do you suggest renaming MarshalInto to MarshalTo and adopting the same API (returning the number of bytes copied)?
> I am also proposing another function, Size(), but I find it a bit too generic. MarshalSize() might be better?

Yeah, Size might be too generic. We have many MarshalSize methods in Pion, and we use it with MarshalTo in RTP.

@noboruma noboruma force-pushed the mem-opt branch 2 times, most recently from f77901b to cfa33b3 Compare April 6, 2026 04:49
@noboruma
Contributor Author

noboruma commented Apr 10, 2026

@theodorsm @JoTurk I have updated the API.
Any blockers on your side? At our scale (200k DTLS connections) we see a huge difference with this change. CPU and RAM usage both benefit a lot, as you can see here:
[Screenshot 2026-04-10 at 22 06 33]

^ This is a 20h run. The stable part on the left is with the proposed changes; the part on the right, after 12h, is with origin/main directly (there is one reboot in between).

@JoTurk
Member

JoTurk commented Apr 10, 2026

Direction looks good to me, thank you so much for upstreaming this, btw. I'll need to review the details, though.
Can you fix the failing tests?

@noboruma noboruma force-pushed the mem-opt branch 5 times, most recently from 09d21b8 to 169bbdd Compare April 11, 2026 12:11
@noboruma
Contributor Author

@JoTurk I have fixed the tests. Only the API break is still being reported; should that one be ignored?

@JoTurk
Member

JoTurk commented Apr 11, 2026

> I have fixed the tests. Only the API break is still being reported; should that one be ignored?

No, sadly. We can't break the API yet (we plan to do that once DTLS 1.3 is ready), so we'll have to do a workaround or introduce another API.

@JoTurk
Member

JoTurk commented Apr 28, 2026

@noboruma Just a heads-up: we've started introducing breaking changes on main for DTLS 1.3, and we won't be cutting tags from main until v4. If you're fine with waiting, you can keep the breaking API and proceed with review and merging into main.

If you'd prefer to get this feature into v3 now instead, feel free to re-target the PR to the dtls-1.2 branch.

thank you.
