[fix](regression) Stabilize rowsets timestamp filters by mrhhsg · Pull Request #65050 · apache/doris

mrhhsg · 2026-06-30T14:39:03Z

What problem does this PR solve?

Issue Number: None

Problem Summary: The system rowsets regressions used second-level current timestamps as lower bounds for NEWEST_WRITE_TIMESTAMP. In one test, the timestamp was captured after the table was created and could filter out the initial rowset on a boundary race. In the cloud-only scan variant, the timestamp could be equal to the create-table rowset timestamp and unexpectedly include version 0-1. Capture stable lower bounds on the intended side of table creation for each case, so the tests still exercise timestamp filters while avoiding second-level boundary races.
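The boundary race described above can be modelled without a real clock. The following is a minimal Python sketch, not the Groovy regression code: the helper names `to_second` and `passes_filter` and the sample timestamps are hypothetical, and it assumes only that `NEWEST_WRITE_TIMESTAMP` and the captured bound are both compared at whole-second precision.

```python
import math


def to_second(ts: float) -> int:
    """Truncate a fractional Unix timestamp to whole seconds (assumed column precision)."""
    return math.floor(ts)


def passes_filter(rowset_write_ts: float, lower_bound_ts: float) -> bool:
    """Model the predicate NEWEST_WRITE_TIMESTAMP >= lower_bound at second precision."""
    return to_second(rowset_write_ts) >= to_second(lower_bound_ts)


# Hypothetical timeline: the table (and its initial rowset) is created late in second 99.
create_ts = 99.95

# Bound captured AFTER creation, same second: the initial rowset is kept.
same_second = passes_filter(create_ts, 99.98)

# Bound captured AFTER creation, but the clock has ticked over: the initial rowset is dropped.
next_second = passes_filter(create_ts, 100.02)

# Bound captured BEFORE creation: the initial rowset is kept regardless of the tick.
before_creation = passes_filter(create_ts, 99.90)
before_creation_across_tick = passes_filter(100.02, 99.98)

print(same_second, next_second, before_creation, before_creation_across_tick)
```

The first two results differ only because of where the second boundary falls, which is the flakiness. Capturing the bound before creation removes that dependency for the test that expects the initial rowset to be included.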

Release note

None

Check List (For Author)

Test:
- Regression test: ./run-regression-test.sh --run -d query_p0/system -s test_query_sys_rowsets
- Regression test: ./run-regression-test.sh --run -d query_p0/system -s test_query_sys_rowsets (200 consecutive runs)
- Regression test: ./run-regression-test.sh --run -d query_p0/system -s test_query_sys_scan_rowsets (local non-cloud guard path)
- Regression test: ./run-regression-test.sh --run -d query_p0/system -s test_query_sys_rowsets
Behavior changed: No
Does this need documentation: No

hello-stephen · 2026-06-30T14:39:09Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

mrhhsg · 2026-06-30T14:39:09Z

/review

github-actions

Review result: requesting changes for one test-stability gap.

Critical checkpoint conclusions:

Goal/test proof: the changed test now captures a stable lower bound before table creation and keeps the expected rowset versions unchanged.
Scope: the edit is small, but the same timestamp-boundary pattern remains in a parallel rowsets regression.
Concurrency/lifecycle/config/compatibility/persistence: no product-code, config, protocol, lifecycle, or storage-format changes in this PR.
Parallel paths: M-1 identifies the remaining cloud-only rowsets scan test path with the same second-level >= now race.
Test results: PR body reports query_p0/system test_query_sys_rowsets runs including 20 consecutive runs; I did not run local regression here because this checkout has no thirdparty/installed.
User focus: no additional focus was supplied.

Subagent conclusions: optimizer-rewrite and tests-session-config both reported no candidates in their initial passes. After M-1 was added to the ledger/proposed comment set, convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS.

### What problem does this PR solve?

Issue Number: None

Problem Summary: The system rowsets regressions used second-level current timestamps as lower bounds for NEWEST_WRITE_TIMESTAMP. In one test, the timestamp was captured after the table was created and could filter out the initial rowset on a boundary race. In the cloud-only scan variant, the timestamp could be equal to the create-table rowset timestamp and unexpectedly include version 0-1. Capture stable lower bounds on the intended side of table creation for each case, so the tests still exercise timestamp filters while avoiding second-level boundary races.

### Release note

None

### Check List (For Author)

- Test:
    - Regression test: doris-local-regression --network 10.26.20.3/24 all -d query_p0/system -s test_query_sys_rowsets
    - Regression test: doris-local-regression --network 10.26.20.3/24 run -d query_p0/system -s test_query_sys_rowsets (200 consecutive runs)
    - Regression test: doris-local-regression --network 10.26.20.3/24 run -d query_p0/system -s test_query_sys_scan_rowsets (local non-cloud guard path)
    - Regression test: doris-local-regression --network 10.26.20.3/24 run -d query_p0/system -s test_query_sys_rowsets
- Behavior changed: No
- Does this need documentation: No

mrhhsg · 2026-06-30T15:11:34Z

/review

github-actions

Review completed for PR 65050 at f4c3543.

I did not find any new blocking issue or any new valuable inline comment beyond the existing thread that has already been addressed by the current patch.

Critical checkpoint conclusions:

Goal and proof: the PR targets second-level NEWEST_WRITE_TIMESTAMP boundary flakiness in the rowsets regressions. The normal rowsets test now captures a lower bound before table creation, preserving the expected rowsets4 output that includes 0-1 through 5-5; the cloud scan variant now waits past the create-table rowset second before capturing the lower bound, preserving the expected rowsets4 output that excludes 0-1.
Scope and focus: the live GitHub PR file list and compare API show only the two rowsets regression suites. The change is small and focused on timestamp filter stabilization.
Parallel paths: the sibling cloud-only scan test is now updated, so the existing inline concern about leaving test_query_sys_scan_rowsets.groovy flaky is covered by the current head.
Tests and expected results: the expected-output files already encode the intended difference between the local and cloud scan variants. I verified the changed queries against those expectations and the BE rowsets scanner path that exposes newest_write_timestamp as a DATETIME from Unix-second rowset metadata.
Concurrency, lifecycle, configs, compatibility, transactions, and observability: this is regression-test-only code and does not add runtime product state, config, persistence, protocol, or logging behavior.
Performance: the added 1.1s sleep is limited to the cloud-only regression path and is directly tied to the second-level timestamp boundary being tested.
Hygiene: git diff --check is clean for the true PR parent-to-head range over the changed files.
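The cloud-only variant needs the opposite guarantee: the lower bound must land strictly after the create-table rowset's second, so that a `>=` filter excludes version 0-1. A hedged Python sketch of why a wait slightly longer than one second is sufficient follows. `FakeClock` and `capture_bound_after_event` are hypothetical names used for illustration only, and the 1.1 s default mirrors the sleep mentioned in the review rather than the actual Groovy code.

```python
import math
import time
from typing import Callable


def capture_bound_after_event(
    now: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
    wait_seconds: float = 1.1,
) -> int:
    """Wait longer than one second, then return a whole-second lower bound.

    Because wait_seconds > 1, floor(t + wait_seconds) >= floor(t) + 1, so the
    bound is strictly greater than the second of any event that happened
    before the call started.
    """
    sleep(wait_seconds)
    return math.floor(now())


class FakeClock:
    """Deterministic clock so the sketch can be checked without real sleeping."""

    def __init__(self, start: float) -> None:
        self.t = start

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


# Try creation times at the start, middle, and very end of second 100.
results = []
for create_ts in (100.0, 100.5, 100.999):
    clock = FakeClock(create_ts)
    bound = capture_bound_after_event(now=clock.now, sleep=clock.sleep)
    create_second = math.floor(create_ts)
    # The create-table rowset fails ">= bound"; later writes still pass.
    results.append((create_second >= bound, math.floor(clock.now()) >= bound))

print(results)
```

Injecting the clock keeps the check deterministic. In a real test the default `time.time` and `time.sleep` would apply, at the cost of roughly one extra second per run.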

User focus: no additional user-provided review focus was present.

Subagent conclusions:

optimizer-rewrite reported NO_NEW_VALUABLE_FINDINGS; no optimizer/rewrite candidate became an inline comment.
tests-session-config reported NO_NEW_VALUABLE_FINDINGS; no test/config candidate became an inline comment.
Main merged MAIN-DUP-001 as a duplicate/resolution note for the existing inline thread about the cloud-only sibling test.
Convergence round 1 ended with both live subagents reporting NO_NEW_VALUABLE_FINDINGS for the same empty proposed final comment set after the last ledger update.

mrhhsg · 2026-07-01T01:07:10Z

run buildall

HappenLee

LGTM

github-actions · 2026-07-01T03:40:22Z

PR approved by at least one committer and no changes requested.

fixed

github-actions Bot previously requested changes Jun 30, 2026

Comment thread regression-test/suites/query_p0/system/test_query_sys_rowsets.groovy

mrhhsg force-pushed the fix/stable-rowsets-regression branch from cceab75 to f4c3543 Compare June 30, 2026 15:11

mrhhsg changed the title ~~[fix](regression) Stabilize rowsets timestamp filter~~ [fix](regression) Stabilize rowsets timestamp filters Jun 30, 2026

github-actions Bot reviewed Jun 30, 2026

mrhhsg marked this pull request as ready for review July 1, 2026 01:05

HappenLee approved these changes Jul 1, 2026

github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jul 1, 2026

mrhhsg merged commit 9d93a33 into apache:master Jul 1, 2026
37 checks passed

mrhhsg deleted the fix/stable-rowsets-regression branch July 1, 2026 06:50
