@valeriy42 (Contributor) commented Nov 13, 2025

Both testWithDatastream in RegressionIT.java and testWithDatastreams in ClassificationIT.java fail intermittently because assertThatAuditMessagesMatch does not reliably wait for all audit messages to be written. Audit messages are written asynchronously (fire-and-forget), so the assertion can run before all of them have been written. The tests finish quickly because they use a small dataset (300 training + 50 non-training rows), which makes the race more likely to surface. Both tests use the same assertThatAuditMessagesMatch method from MlNativeDataFrameAnalyticsIntegTestCase, so fixing that method resolves both failures.

This PR changes the assertion to verify that every expected message prefix is present, to wait with a timeout and retries, to refresh the notifications index before each check, and to restore a more lenient size check. Together these make the assertion both reliable and thorough.
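The waiting strategy described above can be illustrated with a minimal, standalone sketch. This is not the code from the PR: the class name `AuditMessageWaiter`, both method names, the message strings, and the timeout values are hypothetical, and the `Supplier` stands in for "refresh the notifications index, then search it".

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical helper, not the actual MlNativeDataFrameAnalyticsIntegTestCase code.
final class AuditMessageWaiter {

    /** Returns the expected prefixes that no fetched message starts with. */
    static List<String> missingPrefixes(List<String> expectedPrefixes, List<String> fetchedMessages) {
        return expectedPrefixes.stream()
            .filter(prefix -> fetchedMessages.stream().noneMatch(msg -> msg.startsWith(prefix)))
            .collect(Collectors.toList());
    }

    /**
     * Re-fetches messages until every expected prefix is matched or the timeout elapses.
     * The check is lenient about size: extra messages are allowed, missing ones are not.
     */
    static void awaitAllPrefixes(
        List<String> expectedPrefixes,
        Supplier<List<String>> fetchMessages, // assumed to refresh and then search
        long timeoutMillis,
        long pollIntervalMillis
    ) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            List<String> missing = missingPrefixes(expectedPrefixes, fetchMessages.get());
            if (missing.isEmpty()) {
                return;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError("Timed out waiting for audit messages with prefixes " + missing);
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting for audit messages", e);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulates a fire-and-forget writer that delivers the last message late.
        List<String> store = new CopyOnWriteArrayList<>();
        store.add("Started job [demo]");
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            store.add("Finished job [demo]");
        });
        writer.start();

        awaitAllPrefixes(List.of("Started job", "Finished job"), () -> List.copyOf(store), 5_000, 50);
        writer.join();
        System.out.println("all expected audit messages found");
    }
}
```

A single check right after the job completes would miss the late message in this simulation, while the polling loop keeps retrying until it appears or the deadline passes.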

Fixes #128166
Fixes #129457

…rtion in MlNativeDataFrameAnalyticsIntegTestCase to ensure all expected prefixes are found in fetched messages.
@valeriy42 added the >test and :ml labels Nov 13, 2025
@valeriy42 self-assigned this Nov 13, 2025
…stCase to utilize streams for improved readability and efficiency in checking expected prefixes against fetched messages.
@valeriy42 added the Team:ML label Nov 14, 2025
@valeriy42 marked this pull request as ready for review November 14, 2025 09:34
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@valeriy42 merged commit 14b5736 into elastic:main Nov 14, 2025
34 checks passed
@valeriy42 deleted the tests/is-128166 branch November 14, 2025 14:37
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Nov 16, 2025
* main: (135 commits)
  Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=1} elastic#138130
  Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=2} elastic#138129
  Mute org.elasticsearch.search.basic.SearchWithRandomDisconnectsIT testSearchWithRandomDisconnects elastic#138128
  [DiskBBQ] avoid EsAcceptDocs bug by calling cost before building iterator (elastic#138127)
  Log NOT_PREFERRED shard movements (elastic#138069)
  Improve bulk loading of binary doc values (elastic#137995)
  Add internal action for getting inference fields and inference results for those fields (elastic#137680)
  Address issue with DateFieldMapper#isFieldWithinQuery(...) (elastic#138032)
  WriteLoadConstraintDecider: Have separate rate limiting for canRemain and canAllocate decisions (elastic#138067)
  Adding NodeContext to TransportBroadcastByNodeAction (elastic#138057)
  Mute org.elasticsearch.simdvec.ESVectorUtilTests testSoarDistanceBulk elastic#138117
  Mute org.elasticsearch.xpack.esql.qa.single_node.GenerativeIT test elastic#137909
  Backport batched_response_might_include_reduction_failure version to 8.19 (elastic#138046)
  Add summary metrics for tdigest fields (elastic#137982)
  Add gp-llm-v2 model ID and inference endpoint (elastic#138045)
  Various tracing fixes (elastic#137908)
  [ML] Fixing KDE evaluate() to return correct ValueAndMagnitude object (elastic#128602)
  Mute org.elasticsearch.xpack.shutdown.NodeShutdownIT testStalledShardMigrationProperlyDetected elastic#115697
  [ML] Fix Flaky Audit Message Assertion in testWithDatastream for RegressionIT and ClassificationIT (elastic#138065)
  [ML] Fix Non-Deterministic Training Set Selection in RegressionIT testTwoJobsWithSameRandomizeSeedUseSameTrainingSet (elastic#138063)
  ...

# Conflicts:
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/200_dense_vector_docvalue_fields.yml

Labels

:ml (Machine learning), Team:ML (Meta label for the ML team), >test (Issues or PRs that are addressing/adding tests), v9.3.0

Development

Successfully merging this pull request may close these issues.

[CI] ClassificationIT testWithDatastreams failing
[CI] RegressionIT testWithDatastream failing

3 participants