ci: bound every job with timeout-minutes so a stalled runner fails fast #1436
Open
xuyushun441-sys wants to merge 1 commit into
The Test job hung in_progress for ~19 min (then ~11 min on re-run) on PR #1433: a stalled runner / non-exiting vitest worker (the cancelled job's cleanup terminated orphaned turbo + node test processes). No job declared `timeout-minutes`, so such a stall runs unbounded up to GitHub's 6h default instead of failing fast.

Add per-job timeouts (test 20m, build/e2e 30m, docs/dev-server 15m, changeset-check 10m); the suites normally finish in well under these. A future stall now fails quickly and is retryable rather than blocking the PR for hours.

This bounds the symptom; the underlying intermittent worker-exit hang (not introduced by any one PR, since it has passed on the same suite) is a separate follow-up (pinpoint via a hanging-process reporter in CI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Why
On PR #1433 the Test job hung `in_progress` for ~19 min, and again for ~11 min on re-run. Diagnosis: `vitest.config.mts` sets `testTimeout: 15000`, so a hanging test fails at 15s; a 19-min hang can't be a test body. It's a stalled runner / non-exiting vitest worker: the cancelled job's cleanup step terminated orphaned `turbo` + several `node`/`sh` (vitest) processes that never exited. No job declared `timeout-minutes`, so such a stall runs unbounded up to GitHub's 6-hour default instead of failing fast.

What

Add per-job `timeout-minutes` to every job in `ci.yml`: test 20 min, build/e2e 30 min, docs/dev-server 15 min, changeset-check 10 min. The suites normally finish well under these limits. A future stall now fails quickly and is retryable instead of blocking a PR for hours.
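For orientation, a minimal sketch of what a per-job timeout looks like in a GitHub Actions workflow. The job ids, runner label, and commands below are assumptions for illustration and are not copied from this repository's `ci.yml`; only the `timeout-minutes` values come from the description above.

```yaml
# Illustrative sketch only: job ids, runner labels and run commands are assumed.
name: CI
on: pull_request

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 20        # GitHub stops the job once it runs past 20 minutes
    steps:
      - uses: actions/checkout@v4
      - run: npx turbo run test    # assumed test command

  build:
    runs-on: ubuntu-latest
    timeout-minutes: 30        # without this key the job-level default is 360 minutes
    steps:
      - uses: actions/checkout@v4
      - run: npx turbo run build   # assumed build command
```

The remaining jobs would get the same key with their own limits (15 for docs/dev-server, 10 for changeset-check).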
Scope
This bounds the symptom. The underlying intermittent worker-exit hang (an open handle keeping a vitest worker alive under the turbo fan-out) is a separate follow-up, best pinpointed by running CI once with a hanging-process reporter (`why-is-node-running` / `--reporter=hanging-process`) to name the leaking package.

🤖 Generated with Claude Code
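The follow-up diagnostic mentioned under Scope could look roughly like the job below. This is a hypothetical sketch: the job id, install command, and direct `vitest` invocation are assumptions, and in this repository the flag would more likely need to be passed through the turbo task for each package.

```yaml
# Hypothetical one-off diagnostic job; ids and commands are assumptions.
jobs:
  test-diagnose:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci                # assumed install command
      # Keep the default reporter and add vitest's hanging-process reporter,
      # which lists handles still keeping the process alive after the run.
      - run: npx vitest run --reporter=default --reporter=hanging-process
```

Running this once on a branch that reproduces the stall should be enough to see which package leaves a handle open, after which the job can be removed.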