
Conversation

@boomanaiden154
Contributor

This patch adds functionality to the premerge advisor to explain away
flaky failures. For this, we simply check whether the observed failure
has been failing over a sufficiently long range of history, in our case 200 commits.

boomanaiden154 added a commit to boomanaiden154/llvm-zorg that referenced this pull request Nov 4, 2025
Pull Request: llvm#641

boomanaiden154 added a commit to boomanaiden154/llvm-zorg that referenced this pull request Nov 5, 2025
return None


def _try_explain_flaky_failure(
Contributor

Maybe a comment describing the particular definition of "flaky" this function is using would be handy?

It looks to me like it's testing whether this failure has been seen before over at least a 200-commit range between the oldest and newest sighting? What's to stop that from being "it's been failing at HEAD for at least 200 revisions"?

Contributor Author

Added a function-level docstring.

I've never seen a non-flaky test failure that kept failing for more than ~20 commits, so I think an order of magnitude more of range should be safe. I thought about also adding a heuristic around the longest failing run of commits to catch cases like this, but I think it adds complexity for little benefit:

  1. As mentioned, 200 is about an order of magnitude bigger than the ranges we've seen in practice.
  2. We're matching the failure message exactly, so the test would need to fail in exactly the same way to trigger this.
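
For illustration, a minimal sketch of the heuristic as described in this thread (exact message match plus a span of at least 200 commits between the oldest and newest sighting). The names FailureSighting, looks_flaky, and FLAKY_COMMIT_THRESHOLD are hypothetical and not taken from the actual advisor code:

    from dataclasses import dataclass

    # Assumption: the 200-commit margin discussed above; the constant name is hypothetical.
    FLAKY_COMMIT_THRESHOLD = 200

    @dataclass
    class FailureSighting:
        commit_position: int  # monotonically increasing commit number
        message: str          # the exact failure message that was observed

    def looks_flaky(current_message: str, history: list[FailureSighting]) -> bool:
        """Return True if this exact failure message has been seen over a span of
        at least FLAKY_COMMIT_THRESHOLD commits between its oldest and newest sightings."""
        positions = [s.commit_position for s in history if s.message == current_message]
        if len(positions) < 2:
            return False
        return max(positions) - min(positions) >= FLAKY_COMMIT_THRESHOLD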

Contributor

I'd probably still go for a test that verifies the failures are discontiguous from ToT, rather than a heuristic that guesses something seems unlikely to be a long-standing failure at HEAD; it seems cheap/simple enough to do, and bad advice can be pretty confusing.

But I guess it's a start.
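
A minimal sketch of the stricter check being suggested here, assuming the advisor can look up per-commit failure sightings near ToT. The function name, the set-of-positions representation, and the 20-commit window are all hypothetical:

    def is_discontiguous_from_tot(
        failing_positions: set[int], tot_position: int, window: int = 20
    ) -> bool:
        """Return True if the failure does not form an unbroken run ending at ToT.

        Walk back from ToT; any commit inside the window where the failure was
        not observed breaks the run, making the failure more plausibly flaky."""
        for position in range(tot_position, tot_position - window, -1):
            if position not in failing_positions:
                return True
        return False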

Contributor Author

Yeah, I'll add that functionality soon. It would be good to have it for the failing-at-HEAD case too.

boomanaiden154 added a commit to boomanaiden154/llvm-zorg that referenced this pull request Nov 14, 2025
ldionne and others added 2 commits November 14, 2025 09:42
@boomanaiden154 boomanaiden154 changed the base branch from users/boomanaiden154/main.ci-explain-flaky-failures to main November 14, 2025 17:43
@boomanaiden154 boomanaiden154 merged commit 3ae62e0 into main Nov 14, 2025
9 checks passed
@boomanaiden154 boomanaiden154 deleted the users/boomanaiden154/ci-explain-flaky-failures branch November 14, 2025 17:43