-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Opt](operator) Eliminate redundant copy in MultiCastOperator #60386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 31834 ms |
TPC-DS: Total hot run time: 190155 ms |
ClickBench: Total hot run time: 28.23 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
5193e29 to
f8c3a60
Compare
|
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR removes a redundant deep-copy in the MultiCast data path by relying on COW semantics, and updates several vectorized block/expr mutation sites to avoid in-place mutation when columns are shared. It also adds a regression suite to exercise complex MultiCast/CTE scenarios.
Changes:
- Remove per-pull column deep copy in
MultiCastDataStreamer(switch to shared blocks + COW). - Update multiple “clear/filter” paths to respect COW via
is_exclusive()checks andclone_empty()fallback. - Add a new regression test suite and expected output covering complex CTE MultiCast patterns.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| regression-test/suites/nereids_p0/cte/test_cte_multicast_complex.groovy | Adds a regression suite with multiple CTE consumers/filters to trigger MultiCast behavior. |
| regression-test/data/nereids_p0/cte/test_cte_multicast_complex.out | Expected results for the new regression suite. |
| be/src/vec/exprs/vexpr_context.cpp | Makes “filter-all” clearing COW-safe by avoiding in-place clears on shared columns. |
| be/src/vec/core/block.cpp | Makes filter_block_internal/const-filter emptying COW-safe; adds clarifying comments. |
| be/src/pipeline/exec/nested_loop_join_probe_operator.h | Makes block clearing COW-safe in the non-materialize path. |
| be/src/pipeline/exec/multi_cast_data_streamer.h / .cpp | Removes the deep-copy helper and finishes pulls without cloning columns. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
TPC-H: Total hot run time: 32031 ms |
ClickBench: Total hot run time: 28.21 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
f8c3a60 to
bf1e12d
Compare
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
we have COW and
shallow_mutate, so for most of scenarios we dont need to explicitly copy block's data. this PR fixed this for MultiCastOperator. and fixed all founded misuses. added tests.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)