[q] fix: wrap addLabels in try-catch with exponential backoff retry to prevent PR creation fallback on label race condition (#23316) #23328

Merged
pelikhan merged 4 commits into main from fix/pr-creation-label-race-condition-64fed35187d57080
Mar 28, 2026

Conversation

@github-actions
Contributor

@github-actions github-actions bot commented Mar 28, 2026

Root Cause (from live data)

Investigated run #23677438752 which triggered issue #23316.

Exact log sequence:

Created pull request #23315: https://github.com/github/gh-aw/pull/23315
##[warning]Failed to create pull request: Validation Failed: {"resource":"Label","code":"unprocessable","field":"data","message":"Could not resolve to a node with the global id of 'PR_kwDOPc1QR87OOJzM'."}
Falling back to creating an issue instead
Created fallback issue #23316: https://github.com/github/gh-aw/issues/23316

The bug: In create_pull_request.cjs, the addLabels REST call was not wrapped in a try-catch. When it failed due to a GitHub API race condition (PR node ID not immediately resolvable after creation), the exception propagated to the outer catch block, which misidentified it as a PR creation failure and triggered the fallback-to-issue path — even though the PR was already successfully created.

The label failure error is a known GitHub API eventual consistency issue: the REST API internally uses GraphQL, and the newly created PR's node ID isn't immediately resolvable.
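The control flow described above can be sketched as follows. This is a minimal illustration, not the code in create_pull_request.cjs: `createPullRequestFlow`, the `api` object, and its `createPull`, `addLabels`, and `createIssue` methods are hypothetical stand-ins for the real calls. Removing the inner try-catch reproduces the bug, because a label error would then reach the outer catch and trigger the issue fallback.

```javascript
// Hypothetical sketch: `api` stands in for the real GitHub client calls.
async function createPullRequestFlow(api, log) {
  try {
    const pullRequest = await api.createPull();
    log.push(`Created pull request #${pullRequest.number}`);

    // Label failures are handled locally, so they never reach the outer catch.
    try {
      await api.addLabels(pullRequest.number);
    } catch (labelError) {
      log.push(`warning: Failed to add labels to PR #${pullRequest.number}: ${labelError.message}`);
    }

    return { type: "pull_request", number: pullRequest.number };
  } catch (error) {
    // Only genuine PR creation failures should land here.
    log.push(`warning: Failed to create pull request: ${error.message}`);
    log.push("Falling back to creating an issue instead");
    const issue = await api.createIssue();
    return { type: "issue", number: issue.number };
  }
}
```

With this shape, a failing `addLabels` produces a warning and the function still reports the pull request as the result.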

Changes

actions/setup/js/create_pull_request.cjs

  • Wrapped addLabels in its own try-catch — matching the existing pattern used for requestReviewers
  • Added retry with exponential backoff using the existing withRetry helper from error_recovery.cjs: 3 retries, 3 s initial delay, ×2 multiplier
  • Added isLabelTransientError predicate that extends the standard transient-error check (isTransientError) to also match the specific PR node-ID race condition ("Could not resolve to a node with the global id")
  • Non-transient label errors (e.g. label not found) fail immediately without retrying
  • After all retries are exhausted, label failures log a warning but don't abort the PR creation flow
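The behavior listed above might look roughly like the sketch below. It is written under assumptions: the real `withRetry` and `isTransientError` live in error_recovery.cjs and their signatures and matching rules are not shown in this PR, so `withRetrySketch`, `isTransientErrorSketch`, `messageOf`, `addLabelsNonFatal`, and all option names here are hypothetical. Only the retry parameters (3 retries, 3 s initial delay, ×2 multiplier) and the matched error text come from the description.

```javascript
// Hypothetical sketch; not the helpers from error_recovery.cjs.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));
const messageOf = error => (error instanceof Error ? error.message : String(error));

// Stand-in for the generic transient check (the patterns are assumed).
function isTransientErrorSketch(error) {
  return /timeout|ECONNRESET|rate limit/i.test(messageOf(error));
}

// Extends the generic check with the PR node-ID race condition message.
function isLabelTransientError(error) {
  return isTransientErrorSketch(error) || messageOf(error).includes("Could not resolve to a node with the global id");
}

async function withRetrySketch(operation, options = {}) {
  const { maxRetries = 3, initialDelayMs = 3000, backoffMultiplier = 2, shouldRetry = () => true, delay = sleep } = options;
  let waitMs = initialDelayMs;
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up once retries are exhausted or the error is not retryable.
      if (attempt >= maxRetries || !shouldRetry(error)) throw error;
      await delay(waitMs);
      waitMs *= backoffMultiplier;
    }
  }
}

// Label failures are downgraded to a warning so the PR flow continues.
async function addLabelsNonFatal(addLabels, warn, options = {}) {
  try {
    await withRetrySketch(addLabels, { shouldRetry: isLabelTransientError, ...options });
    return true;
  } catch (labelError) {
    warn(`Failed to add labels: ${messageOf(labelError)}`);
    return false;
  }
}
```

Under these assumptions the waits before the three retries are 3 s, 6 s, and 12 s, so a label call that never recovers adds about 21 s to the job before the warning is logged. The `delay` option is injectable only so the waits can be replaced in tests.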

actions/setup/js/create_pull_request.test.cjs

  • Updated test for the race condition to use vi.useFakeTimers() / vi.runAllTimersAsync() so retry delays don't slow the suite
  • Added assertion that addLabels is called 4 times (1 initial + 3 retries) when all attempts fail
  • Added test: "should succeed when addLabels recovers on a retry" — verifies success and only 2 calls when the second attempt succeeds
  • Added test: "should not retry addLabels for non-transient errors" — verifies exactly 1 call for errors that are not retryable
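The three call-count expectations above can be illustrated without a test framework. The sketch below does not use Vitest: `withInstantTimers` is a hand-rolled substitute for the effect of vi.useFakeTimers() / vi.runAllTimersAsync() (retry delays complete without real waiting), and `addLabelsWithRetry`, `countingStub`, and `runScenarios` are hypothetical names invented for this example.

```javascript
// Hand-rolled stand-in for fake timers: every timer fires with zero delay.
async function withInstantTimers(fn) {
  const realSetTimeout = globalThis.setTimeout;
  globalThis.setTimeout = (callback, _ms, ...args) => realSetTimeout(callback, 0, ...args);
  try {
    return await fn();
  } finally {
    globalThis.setTimeout = realSetTimeout;
  }
}

// Hypothetical code under test: up to 3 retries with 3 s, 6 s, 12 s waits.
async function addLabelsWithRetry(addLabels, isRetryable) {
  let waitMs = 3000;
  for (let attempt = 0; ; attempt++) {
    try {
      return await addLabels();
    } catch (error) {
      if (attempt >= 3 || !isRetryable(error)) throw error;
      await new Promise(resolve => setTimeout(resolve, waitMs));
      waitMs *= 2;
    }
  }
}

// Stub that fails a fixed number of times and records how often it was called.
function countingStub(failuresBeforeSuccess, message) {
  const stub = async () => {
    stub.calls++;
    if (stub.calls <= failuresBeforeSuccess) throw new Error(message);
    return "ok";
  };
  stub.calls = 0;
  return stub;
}

async function runScenarios() {
  const raceMessage = "Could not resolve to a node with the global id";
  const isRetryable = error => error.message.includes(raceMessage);
  return withInstantTimers(async () => {
    const alwaysFails = countingStub(Infinity, raceMessage);
    await addLabelsWithRetry(alwaysFails, isRetryable).catch(() => {});

    const recovers = countingStub(1, raceMessage);
    await addLabelsWithRetry(recovers, isRetryable);

    const notFound = countingStub(Infinity, "Label not found");
    await addLabelsWithRetry(notFound, isRetryable).catch(() => {});

    return { exhausted: alwaysFails.calls, recovered: recovers.calls, nonTransient: notFound.calls };
  });
}
```

Calling `runScenarios()` yields the call counts the tests assert on: 4 when every attempt fails, 2 when the second attempt succeeds, and 1 for a non-retryable error.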

Validation

🎩 Equipped by Q ·

  • expires on Mar 30, 2026, 11:09 AM UTC

…ace condition

GitHub's API can transiently fail to resolve a newly created PR's node ID
when label operations are called immediately after PR creation. This caused
the safe_outputs job to fall back to creating an issue instead of the PR
(seen in run #23677438752 / issue #23316).

The fix wraps addLabels in its own try-catch (non-fatal warning), consistent
with how reviewer requests are already handled. A test is added to verify
that a label failure does not trigger the fallback-to-issue path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@pelikhan
Collaborator

@copilot

  • add a default 3-second delay with 3 retries when adding labels before giving up
  • use a helper function with exponential backoff if possible

…initial delay)

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/8c665314-60a8-42ae-8442-5b824d988f93

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor

Copilot AI commented Mar 28, 2026

@copilot

  • add a default 3-second delay with 3 retries when adding labels before giving up...

Done in 515aff8. addLabels now uses the existing withRetry helper from error_recovery.cjs with exponential backoff: 3 retries, 3 s initial delay, ×2 multiplier. A custom isLabelTransientError predicate extends the standard transient-error check to also match the PR node-ID race condition. Non-transient label errors (e.g. 404 label not found) are not retried. Three new tests cover: all-retries-exhausted, success-on-retry, and no-retry-for-non-transient.

Copilot AI changed the title from "[q] fix: wrap addLabels in try-catch to prevent PR creation fallback on label race condition (#23316)" to "[q] fix: wrap addLabels in try-catch with exponential backoff retry to prevent PR creation fallback on label race condition (#23316)" Mar 28, 2026
Copilot AI requested a review from pelikhan March 28, 2026 11:26
// Label addition is non-critical - warn but don't fail the PR creation.
// GitHub's API may transiently fail to resolve the PR node ID immediately
// after creation, which causes label operations to fail with an unprocessable error.
core.warning(`Failed to add labels to PR #${pullRequest.number}: ${labelError instanceof Error ? labelError.message : String(labelError)}`);
Collaborator

@copilot getErrorMessage

Contributor

Fixed in 206f829. isLabelTransientError now calls getErrorMessage(error) instead of the inline ternary.

Copilot AI requested a review from pelikhan March 28, 2026 11:33
@pelikhan pelikhan merged commit e67f2f8 into main Mar 28, 2026
53 checks passed
@pelikhan pelikhan deleted the fix/pr-creation-label-race-condition-64fed35187d57080 branch March 28, 2026 11:38