Skip to content

fix: detect and log WASM guard traps; mark module failed after panic#2698

Merged
lpcox merged 2 commits intomainfrom
copilot/integrity-audit-fix-warnings
Mar 28, 2026
Merged

fix: detect and log WASM guard traps; mark module failed after panic#2698
lpcox merged 2 commits intomainfrom
copilot/integrity-audit-fix-warnings

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 27, 2026

The integrity audit (W-2) observed a wasm error: unreachable (Rust guard panic) that was only visible in logs under DEBUG=*, and after the trap the guard's internal state (global policy context set by label_agent) could be silently corrupted for subsequent calls.

Changes

internal/guard/wasm.go

  • Add failed/failedErr fields to WasmGuard (accessed only while mu is held) to track post-trap state
  • Add isWasmTrap(err error) bool — detects errors containing wasm error: (the prefix wazero/Rust produces on any trap)
  • callWasmFunction now:
    • Checks failed at entry and returns immediately with a wrapped error referencing the original trap
    • On a trap: calls logger.LogError("backend", ...) (visible in mcp-gateway.log without DEBUG=*), sets failed = true
if isWasmTrap(err) {
    logger.LogError("backend", "WASM guard trap: guard=%s, func=%s, error=%v", g.name, funcName, err)
    g.failed = true
    g.failedErr = err
}

internal/guard/wasm_test.go

  • TestIsWasmTrap: nil, generic, and various trap error strings
  • TestWasmGuardFailedState: module refuses further calls once failed; original error is reachable via errors.Is

.github/workflows/integrity-filtering-audit.md

  • Add wasm error: / WASM guard trap to Step 3 grep patterns
  • Add explicit Step 4 for WASM panic detection
  • Classify WASM guard traps as 🟡 Warning in findings

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • example.com
    • Triggering command: /tmp/go-build2016825021/b334/launcher.test /tmp/go-build2016825021/b334/launcher.test -test.testlogfile=/tmp/go-build2016825021/b334/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build2016825021/b297/vet.cfg ache/go/1.25.8/x-c=4 ache/go/1.25.8/x-nolocalimports x_amd64/vet .rs ssions.rs s hUvueDyGfjP- go_.�� 64/src/net ternal/boring/bbig/big.go 64/pkg/tool/linux_amd64/compile -I /tmp/go-build389-atomic -I 64/pkg/tool/linu-buildtags (dns block)
  • invalid-host-that-does-not-exist-12345.com
    • Triggering command: /tmp/go-build2016825021/b319/config.test /tmp/go-build2016825021/b319/config.test -test.testlogfile=/tmp/go-build2016825021/b319/testlog.txt -test.paniconexit0 -test.timeout=10m0s --no�� Xs_UNqwZq /home/REDACTED/work/gh-aw-mcpg/gh--c=4 x_amd64/vet (dns block)
  • nonexistent.local
    • Triggering command: /tmp/go-build2016825021/b334/launcher.test /tmp/go-build2016825021/b334/launcher.test -test.testlogfile=/tmp/go-build2016825021/b334/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build2016825021/b297/vet.cfg ache/go/1.25.8/x-c=4 ache/go/1.25.8/x-nolocalimports x_amd64/vet .rs ssions.rs s hUvueDyGfjP- go_.�� 64/src/net ternal/boring/bbig/big.go 64/pkg/tool/linux_amd64/compile -I /tmp/go-build389-atomic -I 64/pkg/tool/linu-buildtags (dns block)
  • slow.example.com
    • Triggering command: /tmp/go-build2016825021/b334/launcher.test /tmp/go-build2016825021/b334/launcher.test -test.testlogfile=/tmp/go-build2016825021/b334/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build2016825021/b297/vet.cfg ache/go/1.25.8/x-c=4 ache/go/1.25.8/x-nolocalimports x_amd64/vet .rs ssions.rs s hUvueDyGfjP- go_.�� 64/src/net ternal/boring/bbig/big.go 64/pkg/tool/linux_amd64/compile -I /tmp/go-build389-atomic -I 64/pkg/tool/linu-buildtags (dns block)
  • this-host-does-not-exist-12345.com
    • Triggering command: /tmp/go-build2016825021/b343/mcp.test /tmp/go-build2016825021/b343/mcp.test -test.testlogfile=/tmp/go-build2016825021/b343/testlog.txt -test.paniconexit0 -test.timeout=10m0s -I /opt/hostedtoolcache/go/1.25.8/x-c=4 cfg x_amd64/vet --gdwarf-5 --64 (dns block)

If you need me to access, download, or install something from one of these locations, you can either:


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI changed the title [WIP] Fix warnings in integrity filtering audit report fix: detect and log WASM guard traps; mark module failed after panic Mar 28, 2026
Copilot AI requested a review from lpcox March 28, 2026 00:10
@lpcox lpcox marked this pull request as ready for review March 28, 2026 14:39
Copilot AI review requested due to automatic review settings March 28, 2026 14:39
@lpcox lpcox merged commit 40df91f into main Mar 28, 2026
27 checks passed
@lpcox lpcox deleted the copilot/integrity-audit-fix-warnings branch March 28, 2026 14:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the MCP gateway’s WASM-based guard reliability by detecting WASM execution traps (e.g., Rust panics), logging them at the normal error level, and preventing further use of a potentially corrupted module state.

Changes:

  • Track a “failed” state on WasmGuard after a detected WASM trap and short-circuit subsequent calls with an error referencing the original trap.
  • Add isWasmTrap(err) to identify trap-style failures via the wasm error: prefix and log traps via logger.LogError.
  • Add unit tests for trap-string detection and for the post-failure short-circuit behavior; update the integrity audit workflow to grep for these indicators.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
internal/guard/wasm.go Detects/logs WASM traps and marks a guard instance permanently failed after a trap to prevent state corruption.
internal/guard/wasm_test.go Adds tests for trap detection and for the guard’s “failed state” short-circuit behavior.
.github/workflows/integrity-filtering-audit.md Updates audit instructions/grep patterns to surface WASM traps/panics in routine log review.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// policy context stored by label_agent), so all subsequent calls are
// unsafe until the guard is reloaded.
if g.failed {
return nil, fmt.Errorf("WASM guard '%s' is unavailable after a previous trap: %w", g.name, g.failedErr)
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the g.failed early-return, fmt.Errorf(... %w, g.failedErr) can accidentally return a nil error if g.failedErr is ever nil (Go's fmt.Errorf returns nil when wrapping a nil error). That would make callWasmFunction return (nil, nil) even though the guard is failed. Consider ensuring failedErr is always non-nil when failed is true, or avoid %w when failedErr == nil (e.g., use a sentinel error).

Suggested change
return nil, fmt.Errorf("WASM guard '%s' is unavailable after a previous trap: %w", g.name, g.failedErr)
if g.failedErr != nil {
return nil, fmt.Errorf("WASM guard '%s' is unavailable after a previous trap: %w", g.name, g.failedErr)
}
return nil, fmt.Errorf("WASM guard '%s' is unavailable after a previous trap", g.name)

Copilot uses AI. Check for mistakes.
Comment on lines +1062 to +1064
ctx := context.Background()
_, err := g.callWasmFunction(ctx, "label_response", []byte(`{}`))
require.Error(t, err)
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

callWasmFunction now documents the precondition that g.mu must be held by the caller, but these tests call it directly without taking the mutex. To keep tests aligned with the contract (and avoid future races if the test expands), consider locking g.mu around the call or switching the test to exercise the behavior via LabelAgent/LabelResponse which already hold the lock.

Copilot uses AI. Check for mistakes.
Comment on lines +1052 to +1056
t.Run("failed guard returns error immediately for callWasmFunction", func(t *testing.T) {
// Build a minimal valid WasmGuard by hand to exercise the failed-state path
// without needing a full WASM binary.
originalErr := errors.New("WASM function call failed: wasm error: unreachable")
g := &WasmGuard{
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new trap-handling behavior in callWasmFunction (detect trap, log, set g.failed=true, persist failedErr) isn't directly exercised: the tests only validate isWasmTrap and the pre-set failed-state short-circuit. Consider adding an end-to-end test using a tiny WASM module whose exported function executes unreachable so you can assert that a real trap flips failed and that subsequent calls immediately fail with an error that unwraps to the original trap.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[integrity-audit] Integrity Filtering Audit — github/gh-aw (2026-03-27)

3 participants