branch-4.1: [Fix](query cache) support partition-based instance parallelism #60974 #61438

Open
github-actions[bot] wants to merge 1 commit into branch-4.1 from auto-pick-60974-branch-4.1
@github-actions
Contributor

Cherry-picked from #60974

### What problem does this PR solve?

When total tablets are much larger than pipeline capacity,
one-tablet-per-instance planning creates excessive BE concurrency
pressure in query-cache workloads.

Trigger partition-based planning when:
  total_tablets > parallel_pipeline_task_num * participating_be_num

Before:
  instance_num ~= total_tablets
After:
  instance_num ~= partitions_on_each_be
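
The trigger condition above can be sketched as a small predicate. This is only an illustration in Python, not the code in this PR; the function name `should_use_partition_planning` is hypothetical, and the parameter names simply mirror the terms used in the description.

```python
def should_use_partition_planning(total_tablets: int,
                                  parallel_pipeline_task_num: int,
                                  participating_be_num: int) -> bool:
    """Hypothetical helper: decide whether to plan instances by partition.

    Mirrors the condition in the description:
        total_tablets > parallel_pipeline_task_num * participating_be_num
    """
    # Total number of pipeline tasks the participating BEs are expected to run.
    pipeline_capacity = parallel_pipeline_task_num * participating_be_num
    return total_tablets > pipeline_capacity
```

For example, with 6 tablets, `parallel_pipeline_task_num = 2`, and 2 participating BEs, the capacity is 4, so the predicate holds and partition-based planning would be chosen.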

Per-BE planning example:
  BE1 tablets: p1[t1,t2], p2[t3]     -> instances: [p1:t1,t2], [p2:t3]
  BE2 tablets: p1[t4], p2[t5,t6]     -> instances: [p1:t4], [p2:t5,t6]

This keeps tablets from the same partition in one instance and puts
different partitions in different instances. If the partition mapping is
incomplete or partition planning fails, planning falls back to the
default path for correctness.
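
The per-BE grouping and the fallback can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the implementation in this PR: the names `plan_instances_for_be` and `plan_instances` are hypothetical, and the "default planning" is modeled here as one tablet per instance, which matches the "Before" behavior described above.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def plan_instances_for_be(tablets: List[str],
                          tablet_to_partition: Dict[str, str]
                          ) -> Optional[List[List[str]]]:
    """Group one BE's tablets by partition; None if any mapping is missing."""
    by_partition: Dict[str, List[str]] = defaultdict(list)
    for tablet in tablets:
        partition = tablet_to_partition.get(tablet)
        if partition is None:
            # Incomplete mapping: signal the caller to fall back.
            return None
        by_partition[partition].append(tablet)
    # One instance per partition, holding all of that partition's tablets.
    return list(by_partition.values())


def plan_instances(be_tablets: Dict[str, List[str]],
                   tablet_to_partition: Dict[str, str]
                   ) -> Dict[str, List[List[str]]]:
    """Plan by partition on every BE, or fall back for the whole query."""
    plan: Dict[str, List[List[str]]] = {}
    for be, tablets in be_tablets.items():
        instances = plan_instances_for_be(tablets, tablet_to_partition)
        if instances is None:
            # Assumed default path: one tablet per instance on every BE.
            return {b: [[t] for t in ts] for b, ts in be_tablets.items()}
        plan[be] = instances
    return plan
```

With the example above (`BE1` holding `t1,t2,t3` and `BE2` holding `t4,t5,t6`), this produces two instances per BE instead of three. Dropping any tablet from the mapping makes the whole plan revert to one tablet per instance.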

Tests:
- partition-based planning path
- fallback-to-default path (incomplete mapping)
- non-query-cache default planning path

github-actions bot requested a review from yiguolei as a code owner on March 17, 2026, 12:21
@Thearas
Contributor

Thearas commented Mar 17, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Mar 17, 2026
@Thearas
Contributor

Thearas commented Mar 17, 2026

run buildall

@924060929
Contributor

run buildall
