[ML] Fix Non-Deterministic Training Set Selection in RegressionIT testTwoJobsWithSameRandomizeSeedUseSameTrainingSet #138063

valeriy42 · 2025-11-13T20:41:24Z

The test testTwoJobsWithSameRandomizeSeedUseSameTrainingSet fails intermittently because documents may be processed in different orders during reindexing. Since we use an online reservoir sampling algorithm, the selected training set depends on that order, even when the seed is the same. For the document sequence to be reindexed deterministically, the source index must have exactly 1 shard and exactly 1 segment.

This PR fixes the test by creating the source index with only 1 segment. This ensures deterministic document order during reindexing, resulting in consistent ID assignments and training set selection when using the same seed.
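To illustrate why the order matters, here is a minimal, self-contained sketch. It is not the sampler used by data frame analytics: it assumes the textbook "Algorithm R" variant of reservoir sampling, and the class name ReservoirOrderDemo and the method sample are hypothetical names chosen for this example. With a fixed seed, the random draws decide which stream positions end up in the reservoir, so feeding the same documents in a different order selects different documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo class; not part of Elasticsearch.
public class ReservoirOrderDemo {

    /** Textbook Algorithm R: keep k items from a stream that is read once. */
    public static List<Integer> sample(List<Integer> stream, int k, long seed) {
        Random random = new Random(seed);
        List<Integer> reservoir = new ArrayList<>(k);
        for (int i = 0; i < stream.size(); i++) {
            if (i < k) {
                // The first k items fill the reservoir.
                reservoir.add(stream.get(i));
            } else {
                // Item i replaces a random slot with probability k / (i + 1).
                int j = random.nextInt(i + 1);
                if (j < k) {
                    reservoir.set(j, stream.get(i));
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> inOrder = new ArrayList<>();
        for (int id = 0; id < 100; id++) {
            inOrder.add(id);
        }
        // Same documents, different arrival order.
        List<Integer> reversed = new ArrayList<>(inOrder);
        Collections.reverse(reversed);

        long seed = 42L;
        // Same seed, same order: identical sample. Prints true.
        System.out.println(sample(inOrder, 10, seed).equals(sample(inOrder, 10, seed)));
        // Same seed, different order: different sample. Prints false.
        System.out.println(sample(inOrder, 10, seed).equals(sample(reversed, 10, seed)));
    }
}
```

In this toy model the seed fixes the positions that are kept, not the documents, which is why pinning the seed alone is not enough and the arrival order has to be pinned as well.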

Fixes #117805

valeriy42 · 2025-11-14T09:54:38Z

...ti-node-tests/src/javaRestTest/java/org/elasticsearch/xpack/ml/integration/RegressionIT.java


    public void testTwoJobsWithSameRandomizeSeedUseSameTrainingSet() throws Exception {
        String sourceIndex = "regression_two_jobs_with_same_randomize_seed_source";
        indexData(sourceIndex, 100, 0);


indexData calls client().admin().indices().prepareCreate() directly instead of the test framework's prepareCreate(). This ensures that the index always has 1 shard.

However, the index can still have multiple segments, which leads to a non-deterministic order in which the reservoir sampling receives the documents. Hence, we need to fix both the number of shards and the number of segments to 1.

elasticsearchmachine · 2025-11-14T09:57:35Z

Pinging @elastic/ml-core (Team:ML)

davidkyle

LGTM

* main: (135 commits)
  Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=1} elastic#138130
  Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=2} elastic#138129
  Mute org.elasticsearch.search.basic.SearchWithRandomDisconnectsIT testSearchWithRandomDisconnects elastic#138128
  [DiskBBQ] avoid EsAcceptDocs bug by calling cost before building iterator (elastic#138127)
  Log NOT_PREFERRED shard movements (elastic#138069)
  Improve bulk loading of binary doc values (elastic#137995)
  Add internal action for getting inference fields and inference results for those fields (elastic#137680)
  Address issue with DateFieldMapper#isFieldWithinQuery(...) (elastic#138032)
  WriteLoadConstraintDecider: Have separate rate limiting for canRemain and canAllocate decisions (elastic#138067)
  Adding NodeContext to TransportBroadcastByNodeAction (elastic#138057)
  Mute org.elasticsearch.simdvec.ESVectorUtilTests testSoarDistanceBulk elastic#138117
  Mute org.elasticsearch.xpack.esql.qa.single_node.GenerativeIT test elastic#137909
  Backport batched_response_might_include_reduction_failure version to 8.19 (elastic#138046)
  Add summary metrics for tdigest fields (elastic#137982)
  Add gp-llm-v2 model ID and inference endpoint (elastic#138045)
  Various tracing fixes (elastic#137908)
  [ML] Fixing KDE evaluate() to return correct ValueAndMagnitude object (elastic#128602)
  Mute org.elasticsearch.xpack.shutdown.NodeShutdownIT testStalledShardMigrationProperlyDetected elastic#115697
  [ML] Fix Flaky Audit Message Assertion in testWithDatastream for RegressionIT and ClassificationIT (elastic#138065)
  [ML] Fix Non-Deterministic Training Set Selection in RegressionIT testTwoJobsWithSameRandomizeSeedUseSameTrainingSet (elastic#138063)
  ...

# Conflicts:
# rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/200_dense_vector_docvalue_fields.yml

Use single shard for index data.

d6c37f8

valeriy42 added the >test Issues or PRs that are addressing/adding tests label Nov 13, 2025

elasticsearchmachine added the v9.3.0 label Nov 13, 2025

remove muted test

61bacc9

valeriy42 added the :ml Machine learning label Nov 13, 2025

Refactor RegressionIT to use force merge for deterministic document p…

3276b62

…rocessing order

valeriy42 commented Nov 14, 2025

valeriy42 marked this pull request as ready for review November 14, 2025 09:57

Merge branch 'main' into tests/is-117805

3c850fd

elasticsearchmachine added the Team:ML Meta label for the ML team label Nov 14, 2025

davidkyle approved these changes Nov 14, 2025

valeriy42 merged commit c0e0bda into elastic:main Nov 14, 2025
34 checks passed

valeriy42 deleted the tests/is-117805 branch November 14, 2025 14:17

mark-vieira mentioned this pull request Nov 19, 2025

[CI] RegressionIT testTwoJobsWithSameRandomizeSeedUseSameTrainingSet failing #138319

Open
