feat: implement flag to fail flaky tests #30618
Conversation
@microsoft-github-policy-service agree company="BJSS"
Thank you for the PR!
I'd like to understand the feature a little better. Do we want to stop recognizing flaky tests at all? Are we after the test runner exit code? Or some changes in the terminal/html report?
Looking at the issue, my initial understanding is "introduce --fail-on-flaky cli option that will set exitCode=1 when at least one test was flaky". However, this PR introduces many more changes, so let's align on the intended behavior first.
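The exit-code behavior described above can be sketched as follows. This is a hypothetical illustration of the agreed semantics, not Playwright's actual implementation; the `failOnFlaky` parameter and `RunStats` shape are assumptions for the sketch.

```typescript
// Minimal sketch of the discussed semantics: flaky tests only fail the
// run when the opt-in flag is set. Names are illustrative, not the
// actual Playwright internals.
interface RunStats {
  failed: number;
  flaky: number;
}

function computeExitCode(stats: RunStats, failOnFlaky: boolean): number {
  if (stats.failed > 0) return 1;
  if (failOnFlaky && stats.flaky > 0) return 1;
  return 0;
}

console.log(computeExitCode({ failed: 0, flaky: 2 }, false)); // 0: default, flaky run passes
console.log(computeExitCode({ failed: 0, flaky: 2 }, true));  // 1: opt-in fails the run
```

Because the flag only tightens the success condition, enabling it can never turn a failing run into a passing one.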
Hi there! The intended behaviour of this PR matches your understanding of the issue; any extra changes in the code reflect my unfamiliarity with the codebase. I think Russell's comment and the discussion around behaviour reflect the user / manual tester perspective, where the change in exit code isn't immediately visible.
Sounds great! In this case, I think we should plumb the new CLI option in a similar way.

@BJSS-russell-pollock I think this should be clear, since the reporter shows the "Flaky" count as non-zero. I'd assume anyone using this option knows that flakiness means "failed test run", because the behavior is opt-in. Let me know what you think.
Absolutely makes sense.
Excellent. I'll have a look and update it accordingly.
Force-pushed from 5e27e31 to cf2dada.
Force-pushed from be62f16 to 013b61f.
Force-pushed from 2af62ce to faf8d5d.
Okay, reworked as suggested - much less additional code!
Force-pushed from faf8d5d to f60861e.
Force-pushed from f60861e to 2b7ee0a.
Force-pushed from 2b7ee0a to 26c32a2.
@dgozman This should be ready for re-review when you are. The failed tests in previous pipeline runs look like noise to me, but I'm happy to dig in if you think it's something deeper.
Test results for "tests 1": 1 failed, 2 flaky, 27346 passed, 662 skipped. Merge workflow run.
Looks great, merging in. Thank you for the PR!
Implements feature requested in #30457
When the flag is enabled, the test runner treats flaky tests as failures but still labels them as flaky in the reporting interface. This feels worth discussing: the behaviour makes sense to me, but it looked a bit odd to @BJSS-russell-pollock when I ran it past him.
Closes #30457.