Standalone Activity: preserve server-generated request IDs across restarts#9724
Open
dandavison wants to merge 2 commits intomainfrom
Open
Standalone Activity: preserve server-generated request IDs across restarts#9724dandavison wants to merge 2 commits intomainfrom
dandavison wants to merge 2 commits intomainfrom
Conversation
Calls validateAndPopulateStartRequest twice with the same request pointer (simulating RetryableInterceptor retry behavior) and asserts both clones carry the same server-generated RequestId. Fails without the fix in the preceding commit: each clone got a fresh UUID because the RequestId was generated after the clone.
Move RequestId generation before the proto clone so that the
RetryableInterceptor, which retries with the same request pointer,
carries the same server-generated RequestId on every attempt. Previously
the RequestId was generated on the clone, so each retry got a fresh UUID
that could not match the stored RequestId, breaking dedup and potentially
causing a double-create if the activity completed between attempts.
Fix Terminate/CancelActivity request-ID dedup across server-side retries
Same issue as StartActivityExecution: the RequestId was generated on a
clone, so RetryableInterceptor retries got a fresh UUID each time. For
Terminate, this caused a spurious FailedPrecondition error on retry
("already terminated with request ID ..."). For RequestCancel, same
pattern ("cancellation already requested with request ID ...").
Fix: set RequestId on the original request before any clone, matching
the OSS pattern used by workflow handlers.
These handlers had no other mutations requiring a clone, so the clone is
simply removed.
f39f920 to
38bd190
Compare
stephanos
reviewed
Mar 27, 2026
|
|
||
| if req.GetRequestId() == "" { | ||
| // Since this mutates the request, we clone it first so that any retries use the original request. | ||
| req = common.CloneProto(req) |
Contributor
There was a problem hiding this comment.
fwiw, in my PR now I've moved this into the validation and normalization methods like this:
if req.GetRequestId() == "" {
req.RequestId = uuid.NewString()
} else if len(req.GetRequestId()) > config.MaxIDLengthLimit() {
return serviceerror.NewInvalidArgumentf("request_id exceeds length limit. Length=%d Limit=%d",
len(req.GetRequestId()), config.MaxIDLengthLimit())
}
bergundy
approved these changes
Mar 27, 2026
Member
bergundy
left a comment
There was a problem hiding this comment.
Just FTR, the duplicates can easily happen if a client retries and doesn't specify a request ID. This makes it a bit worse.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changed?
When generating a request ID server-side, set it on the request struct before any cloning so that the mutation is re-used by all retries.
Why?
Without this, there is a bug, although I have not attempted to repro it:
requestIDrequestIDis generated and set on a cloned copyRetryableInterceptornot to receive the Ack (it sees a context expiry)How did you test it?
It would be hard to repro that bug.
Potential risks
Could introduce incorrectness into Standalone Activity
Note
Medium Risk
Touches standalone activity request-id generation/dedup behavior; mistakes could affect idempotency (duplicate starts/cancels/terminations) but the change is small and covered by a unit test for the start path.
Overview
Ensures standalone activity frontend operations re-use a single
RequestIdacross retry attempts by generating the ID on the original request before cloning/mutation invalidateAndPopulateStartRequest, and by no longer cloning requests when auto-populatingRequestIdforTerminateActivityExecutionandRequestCancelActivityExecution.Adds a unit test (
TestRequestIdStableAcrossRetries) that callsvalidateAndPopulateStartRequesttwice with the same request pointer and asserts theRequestIdstays stable for both server-generated and client-provided IDs.Written by Cursor Bugbot for commit 38bd190. This will update automatically on new commits. Configure here.