[fix](be) Validate task executor scan handles by Gabriel39 · Pull Request #65054 · apache/doris

Gabriel39 · 2026-07-01T01:43:57Z

What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Task-executor scan scheduling could pass a null or invalid task handle into TimeSharingTaskExecutor. enqueue_splits and related paths cast the base TaskHandle to TimeSharingTaskHandle and immediately dereferenced the result, so a broken ScannerContext task-handle invariant caused BE to crash with SIGSEGV instead of returning a diagnostic error. This change validates scanner context, scan task, and task handle before submitting scan splits, and validates the task handle type at TimeSharingTaskExecutor entry points before dereferencing it.
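The validation described above can be illustrated with a minimal sketch. It is not the code from this PR: `Status`, `TaskHandle`, `TimeSharingTaskHandle`, and `OtherTaskHandle` below are simplified, hypothetical stand-ins for the Doris types, `enqueue_split` is an illustrative name, and the use of `std::dynamic_pointer_cast` is an assumption based on the "null cast result" wording.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for doris::Status.
struct Status {
    bool is_ok;
    std::string msg;
    static Status OK() { return Status{true, ""}; }
    static Status InternalError(std::string m) { return Status{false, std::move(m)}; }
    bool ok() const { return is_ok; }
};

// Hypothetical stand-ins for the handle hierarchy named in the description.
struct TaskHandle {
    virtual ~TaskHandle() = default;
};

struct TimeSharingTaskHandle : TaskHandle {
    int queued_splits = 0;
};

// A handle of an unrelated type, used to exercise the type check.
struct OtherTaskHandle : TaskHandle {};

// Check the pointer and its dynamic type before dereferencing, and report
// a diagnostic error instead of crashing on a null cast result.
Status enqueue_split(const std::shared_ptr<TaskHandle>& handle) {
    if (handle == nullptr) {
        return Status::InternalError("task handle is null");
    }
    auto ts_handle = std::dynamic_pointer_cast<TimeSharingTaskHandle>(handle);
    if (ts_handle == nullptr) {
        return Status::InternalError("task handle is not a TimeSharingTaskHandle");
    }
    ++ts_handle->queued_splits;
    return Status::OK();
}
```

In this sketch a null handle and a handle of the wrong dynamic type both produce an error status, and only a genuine `TimeSharingTaskHandle` is dereferenced.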

Release note

None

Check List (For Author)

Test: Unit Test
- Added TimeSharingTaskExecutorTest coverage for null and invalid task handles.
- Tried: JDK_17=/usr/local/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home JAVA_HOME=/usr/local/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home ./run-be-ut.sh --run --filter='TimeSharingTaskExecutorTest.*'; build failed during CMake configure because thirdparty/installed is missing Snappy.
Behavior changed: Yes. Invalid task-executor scan handles now return InternalError instead of dereferencing a null cast result.
Does this need documentation: No



hello-stephen · 2026-07-01T01:44:04Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

Gabriel39 · 2026-07-01T01:49:01Z

run buildall

Gabriel39 · 2026-07-01T01:49:46Z

/review

github-actions

Found one blocking test issue: the new invalid-handle BE UT constructs TimeSharingTaskExecutor with uninitialized thread-count fields, so the added coverage is nondeterministic.

Critical checkpoint conclusions: the production goal is clear and the scanner/executor validation paths look focused; concurrency and lifecycle paths around ScannerContext scheduling, split re-enqueue, remove_task cleanup, and the one-shot submit overload were checked without finding another substantiated issue; no new config, persistence, wire/storage compatibility, FE-BE protocol, or observability concern was introduced by this patch; test coverage was added, but the new test setup must be fixed before it is reliable. Existing inline review context was empty. No additional user focus was provided. Validation limit: BE UT/build was not run because this runner is missing .worktree_initialized and thirdparty/installed; the live PR patch reverse-applied cleanly with whitespace errors enabled.

Subagent conclusions: optimizer-rewrite reported no optimizer/parallel scheduling candidate; tests-session-config reported TEST-1, which was merged into MAIN-1 and became the inline comment below. Convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same ledger/comment set.

github-actions

Anchor-repaired review submission. The substantiated issue remains MAIN-1: the new invalid-handle BE UT constructs TimeSharingTaskExecutor with uninitialized thread-count fields, so the added coverage is nondeterministic.

Critical checkpoint conclusions: the production scanner/executor validation paths are focused and I did not find another substantiated production correctness issue after tracing ScannerContext scheduling, split re-enqueue, remove_task cleanup, and the one-shot submit overload. No new config, persistence, wire/storage compatibility, FE-BE protocol, or observability concern was introduced by this patch. Test coverage was added, but the new test setup must be fixed before it is reliable. Existing inline review context was empty; no additional user focus was provided. Validation limit: BE UT/build was not run because this runner is missing .worktree_initialized and thirdparty/installed; the live PR patch reverse-applied cleanly with whitespace errors enabled.

Subagent conclusions: optimizer-rewrite reported no optimizer/parallel scheduling candidate; tests-session-config reported TEST-1, merged into MAIN-1. After repairing the inline anchor, convergence round 2 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the corrected ledger/comment set.

github-actions · 2026-07-01T02:02:21Z

+    auto ticker = std::make_shared<TestingTicker>();
+
+    TimeSharingTaskExecutor::ThreadConfig thread_config;
+    thread_config.thread_name = "invalid_task_handle";


The new test leaves thread_config.max_thread_num and thread_config.min_thread_num uninitialized here. ThreadConfig only defaults max_queue_size; the constructor copies the two indeterminate ints into _max_threads/_min_threads, and init() immediately calls _try_create_thread(_min_threads, ...). That makes this test nondeterministic: it can try to create an arbitrary number of worker threads or fail depending on stack contents. Please initialize both fields, like the adjacent tests do, before constructing the executor.
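A minimal sketch of the requested fix, assuming a config struct shaped like the one the comment describes. The `ThreadConfig` below is a hypothetical mirror rather than the real `TimeSharingTaskExecutor::ThreadConfig`; the helper name, the `max_queue_size` default, and the thread counts of 1 are illustrative choices only.

```cpp
#include <string>

// Hypothetical mirror of the struct described in the review: only
// max_queue_size has a default, so the two thread counts hold
// indeterminate values unless the caller assigns them.
struct ThreadConfig {
    std::string thread_name;
    int max_thread_num;
    int min_thread_num;
    int max_queue_size = 1024;  // assumed value, for illustration only
};

// Build a fully initialized config for a test so that the executor
// never reads an indeterminate thread count.
ThreadConfig make_test_thread_config(const std::string& name) {
    ThreadConfig cfg;
    cfg.thread_name = name;
    cfg.max_thread_num = 1;  // set explicitly before constructing the executor
    cfg.min_thread_num = 1;
    return cfg;
}
```

Giving the two fields default member initializers in the struct itself would be another way to remove the hazard, at the cost of changing production code rather than only the test.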

hello-stephen · 2026-07-01T03:37:57Z

BE UT Coverage Report

Increment line coverage 50.00% (18/36) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|---|---|
| Function Coverage | 55.12% (21729/39420) |
| Line Coverage | 38.61% (208044/538891) |
| Region Coverage | 34.69% (163947/472669) |
| Branch Coverage | 35.69% (71860/201348) |

hello-stephen · 2026-07-01T05:05:25Z

BE Regression && UT Coverage Report

Increment line coverage 66.67% (24/36) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|---|---|
| Function Coverage | 74.16% (28535/38478) |
| Line Coverage | 58.00% (310729/535700) |
| Region Coverage | 54.78% (260061/474774) |
| Branch Coverage | 56.13% (113155/201591) |


github-actions · 2026-07-01T06:07:09Z

PR approved by at least one committer and no changes requested.


github-actions Bot requested changes Jul 1, 2026


github-actions Bot previously requested changes Jul 1, 2026


zhangstar333 approved these changes Jul 1, 2026


github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jul 1, 2026

Gabriel39 merged commit 023b766 into apache:master Jul 1, 2026
33 of 34 checks passed

Gabriel39 added dev/4.0.x dev/4.1.x and removed dev/4.0.x labels Jul 1, 2026

github-actions Bot added the dev/4.0.x-conflict label Jul 1, 2026

github-actions Bot mentioned this pull request Jul 1, 2026

branch-4.1: [fix](be) Validate task executor scan handles #65054 #65074

Open
