perf(dso): release bpfAllocationMutex during binaryParser.Parse — eliminate serialization#100
Open
KorsarOfficial wants to merge 1 commit into yandex:main
📄 Full analysis report (PDF): 08-perforator-optimizations.pdf. Covers complexity analysis, concurrency audit, and verification for all 7 optimizations.
Closes #94
Problem
`populateDSO` holds `bpfAllocationMutex` for the entire function body, including the expensive `binaryParser.Parse(ctx, f)` call. When multiple goroutines load different DSOs concurrently, they serialize on this lock even though each parses an independent ELF file.
Concurrency analysis
Let k = concurrent DSO loads, T_parse = ELF parse latency.
For k = 8 concurrent loads, T_parse ≈ 50 ms each:
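As a rough worked example of this model (derived from the definitions above, not a measurement): assume parsing dominates, that the parses can actually run in parallel, and that the time spent inside each critical section, written here as the hypothetical symbol T_lock, is much smaller than T_parse.

```latex
\begin{align*}
T_{\text{before}} &\approx k \cdot T_{\text{parse}} = 8 \times 50\,\text{ms} = 400\,\text{ms} \\
T_{\text{after}}  &\approx T_{\text{parse}} + k \cdot T_{\text{lock}} \approx 50\,\text{ms}
  \qquad (T_{\text{lock}} \ll T_{\text{parse}})
\end{align*}
```

Under those assumptions the wall-clock time for the batch no longer grows with k through the parse term; only the short locked sections remain serialized.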
Implementation
Split into two critical sections with an unlocked Parse gap:
Key correctness details:
- `BinaryClass` is computed as a local variable during the unlocked gap, then written to `dso.BinaryClass` only under the lock, so there is no data race.
- […] `bpfBinaryManager.Add` when two goroutines race through `Parse`.
- `MoveFromCache`/`Release` remain under the exclusive lock (they mutate BPF manager state).
- `sync.Mutex` is retained (not `RWMutex`; no read-lock callers exist).