
Fix race in tunnel waitGroup causing negative counter panic #586

Merged
jpillora merged 1 commit into master from fix-wg-negative-counter on Apr 30, 2026

Conversation

@jpillora
Owner

Summary

  • Replace the lock-free atomic.Int32 counter in share/tunnel/wg.go with a sync.Mutex so Add/Done/DoneAll observe consistent state
  • Add() previously bumped its atomic counter before calling inner.Add(). A concurrent DoneAll() (from BindSSH's ctx-cancel goroutine) could observe n>0 and call inner.Done() in that window, panicking with sync: negative WaitGroup counter, which is exactly the stack trace reported in #585
  • Wait() stays outside the mutex to avoid deadlock against blocked waiters

Fixes #585

Test plan

  • go test -race ./... passes
  • Added share/tunnel/wg_test.go covering: idempotent Done, DoneAll drain, and concurrent Add/DoneAll stress under -race

🤖 Generated with Claude Code

Add() bumped its atomic counter before calling inner.Add(), so a
concurrent DoneAll() could observe n>0 and call inner.Done() before
inner.Add() ran, panicking with "sync: negative WaitGroup counter".
Serialize Add/Done/DoneAll under a mutex; Wait stays outside to avoid
deadlock.

Fixes #585

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 30, 2026 22:10
@jpillora jpillora merged commit b9d1219 into master Apr 30, 2026
12 checks passed
@jpillora jpillora deleted the fix-wg-negative-counter branch April 30, 2026 22:14

Copilot AI left a comment


Pull request overview

This PR fixes a concurrency race in the tunnel’s custom waitGroup wrapper that could panic with sync: negative WaitGroup counter under concurrent Add() and DoneAll() usage (as reported in #585).

Changes:

  • Replace the lock-free atomic counter with a sync.Mutex-protected counter to make Add/Done/DoneAll observe consistent state.
  • Adjust Done/DoneAll to update the counter and underlying sync.WaitGroup while holding the same mutex.
  • Add tests covering idempotent Done, DoneAll drain behavior, and a concurrent Add/DoneAll stress scenario.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

  • share/tunnel/wg.go: Makes the wrapper’s counter updates and underlying sync.WaitGroup operations mutually consistent via a mutex, eliminating the observed race window.
  • share/tunnel/wg_test.go: Adds regression/stress tests to exercise the previously failing concurrent behavior and validate wrapper semantics.


Comment thread share/tunnel/wg.go
Comment on lines +13 to +15
	w.n += n
	w.inner.Add(n)
	w.mu.Unlock()
Comment thread share/tunnel/wg.go
Comment on lines +28 to +33
w.mu.Lock()
for w.n > 0 {
	w.n--
	w.inner.Done()
}
w.mu.Unlock()


Development

Successfully merging this pull request may close these issues.

panic: sync: negative WaitGroup counter

2 participants