test_runner: add flaky option to retry on failure#63661
Conversation
|
Review requested:
|
Han5991
left a comment
There was a problem hiding this comment.
A few self-review notes on the trickier parts, anchored to the exact lines.
| // (at first `.assert` access), preserving the historical counting behavior | ||
| // exactly - opting out of `flaky` changes nothing. | ||
| const capturedPlan = test.plan; | ||
| const currentPlan = () => (test.flakyRetries > 0 ? test.plan : capturedPlan); |
There was a problem hiding this comment.
This is the one change that touches the non-flaky path: the assertion plan is now read at use-time, but only for flaky tests (test.flakyRetries > 0); non-flaky tests keep the registration-time capture (capturedPlan). Please confirm non-flaky plan counting is unchanged — the regression guard is the assert-before-plan case in test/parallel/test-runner-flaky.js.
| // final attempt creates no subtest - leaving outputSubtestCount > 0 with | ||
| // an empty `subtests`, which crashed report() (see the guard there). | ||
| this.subtests = []; | ||
| this.outputSubtestCount = 0; |
There was a problem hiding this comment.
Subtest crash fix (1/2): on retry we clear subtests and reset outputSubtestCount. Without the reset the count stayed > 0 while subtests was emptied, which crashed the reporter (see the guard in 2/2).
| // `subtests` can be empty here when a flaky retry cleared it even though | ||
| // outputSubtestCount was non-zero; fall back to this test's own nesting | ||
| // rather than dereferencing subtests[0] (which would crash the run). | ||
| const subtestNesting = this.subtests.length ? |
There was a problem hiding this comment.
Subtest crash fix (2/2) + known limitation: this guards the empty-subtests dereference that aborted the run. Known limitation: intermediate-attempt subtests are emitted to the stream before the retry decision, so they cannot be recalled — the output can still contain leaked subtest lines and a slightly inflated # tests. Full fidelity would need buffering each attempt's events and replaying only the surviving one. Is the crash-fix-only version OK to land, with fidelity as a follow-up?
| cancelled: kCanceledTests.has(item.data.details?.error?.failureType), | ||
| // A retried timeout that exhausted (retryCount > 0) is a failure, not a | ||
| // cancellation only an un-retried timeout/abort stays cancelled. | ||
| cancelled: kCanceledTests.has(item.data.details?.error?.failureType) && |
There was a problem hiding this comment.
run() process-isolation parent re-counts the child's serialized events, so two things are handled here: an exhausted flaky timeout (retryCount > 0) is promoted out of cancelled into a failure, and any flaky-marked test (retryCount present, even 0) counts toward flaky. This cross-process re-derivation is easy to get subtly wrong — please sanity-check.
| publishError(err); | ||
| } | ||
| if (isTestFailureError(err)) { | ||
| if (err.failureType === kTestTimeoutFailure && this.flakyRetries === 0) { |
There was a problem hiding this comment.
Timeout handling: a non-flaky timeout stays cancelled (#cancel), but a flaky one goes through fail() so it can retry and, when the budget is exhausted, counts as a failure rather than a cancellation. The flakyRetries === 0 guard is what preserves the existing non-flaky behavior.
| this.expectedAssertions = plan; | ||
| this.cancelled = false; | ||
| if (flaky === undefined || flaky === null) { | ||
| this.flakyRetries = this.parent?.flakyRetries ?? 0; |
There was a problem hiding this comment.
Semantics per the proposal: nearest-wins inheritance (this parent fallback), flaky: true → 20 retries (kDefaultFlakyRetries, L106; set just below), an explicit positive integer is used as-is, and 0/negative/non-integer values throw via validateInteger. Confirming these defaults match the proposal's intent would help.
8ec070e to
b9fa345
Compare
Add a `flaky` option that re-runs a failing test until it passes, intended for tests with unavoidable nondeterminism. Setting `flaky: true` retries up to 20 times; `flaky: <positive integer>` sets an explicit retry budget. The option is accepted on tests and suites and via the it.flaky/test.flaky/describe.flaky/suite.flaky shorthands; a test-case value overrides an inherited suite value (nearest wins), and `flaky: false` opts a test out. Only the final attempt is observable: intermediate failures emit no test:fail, no per-test diagnostics, and nothing on the node.test error channel. Each result carries a new `retryCount` field on the test:pass, test:fail, and test:complete events (the number of retries performed, `undefined` for non-flaky tests), reporters print a `# FLAKY` directive, and the run summary gains a `flaky` counter. beforeEach/afterEach re-run on every attempt while before/after run once, so per-attempt state is reset and retries do not leak state. An externally aborted test and an expectFailure are not retried. A flaky test whose timeout is exhausted is reported as a failure rather than cancelled. Co-authored-by: vespa7 <98526766+vespa7@users.noreply.github.com> Signed-off-by: sangwook <rewq5991@gmail.com>
b9fa345 to
454ccc0
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #63661 +/- ##
==========================================
- Coverage 91.95% 90.31% -1.64%
==========================================
Files 379 732 +353
Lines 166454 236652 +70198
Branches 25427 44575 +19148
==========================================
+ Hits 153058 213730 +60672
- Misses 13104 14617 +1513
- Partials 292 8305 +8013
🚀 New features to boost your workflow:
|
Add failing regression tests for nine defects found while reviewing the flaky feature, each asserting the corrected behavior: - a flaky + expectFailure test that unexpectedly passes must not retry - a flaky parent must retry when only a subtest fails - node.test tracing start/end must stay balanced across retries - flaky must not propagate from a test to its own subtests - the MockTracker must be reset between retries - a test-level after hook must run after the final attempt - a retried attempt must abort the previous attempt's signal - an in-flight subtest from a discarded attempt must not corrupt the run - a flaky + expectFailure error must still reach the node.test channel Signed-off-by: sangwook <rewq5991@gmail.com>
Fix eight defects in the flaky retry loop, each covered by a regression test added in the previous commit: - do not retry a flaky expectFailure test that unexpectedly passes - retry a flaky parent when a failure surfaces only through a subtest - publish the tracing channel end per attempt so spans stay balanced - inherit flaky only from a suite parent, not a test to its subtests - reset the MockTracker between retries - run a test-level after hook only after the final attempt - abort the previous attempt's signal so its work is torn down - publish a flaky expectFailure throw's error to the node.test channel The remaining intermediate-attempt subtest output inflation across retries is the documented known limitation and is unchanged. Signed-off-by: sangwook <rewq5991@gmail.com>
3ff2a6b to
a1989d8
Compare
This feature is based on the
flakyproposal in the nodejs/test-runner repo, which I used as the reference for the API surface and semantics. A test can be markedflakyso the runner re-runs it until it passes — useful for tests with unavoidable nondeterminism.Supersedes #61746 (original work by @vespa7, credited as co-author); this is a fresh implementation on current
main.What it adds
flaky: true→ retry up to 20 times;flaky: <positive integer>→ explicit budget;flaky: false→ off. Invalid values throw at registration.it.flaky/test.flaky/describe.flaky/suite.flakyshorthands. Nearest value wins (a test-case overrides its suite).test:fail, no per-test diagnostics, and nothing on thenode.testerror channel.retryCountfield ontest:pass/test:fail/test:complete; reporters print a# FLAKY …directive; the run summary gains aflakycounter.beforeEach/afterEachre-run per attempt (state reset between retries);before/afterrun once. External abort andexpectFailureare not retried.Points I'd appreciate eyes on
I've left inline review comments on the specific lines that would most benefit from review — the flaky-scoped assertion-plan read, the subtest crash fix and its known limitation, and the
run()process-isolation parent re-counting.Testing
New
test/parallel/test-runner-flaky.jswith deterministic, counter-based fixtures, plus a reporter snapshot test. Fulltest_runnersuite passes; lint clean.