[Opt](operator) Eliminate redundant copy in MultiCastOperator and fix wrong assume_mutable#60386
[Opt](operator) Eliminate redundant copy in MultiCastOperator and fix wrong assume_mutable#60386zclllyybb wants to merge 2 commits intoapache:masterfrom
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 31834 ms |
TPC-DS: Total hot run time: 190155 ms |
ClickBench: Total hot run time: 28.23 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
5193e29 to
f8c3a60
Compare
|
run buildall |
There was a problem hiding this comment.
Pull request overview
This PR removes a redundant deep-copy in the MultiCast data path by relying on COW semantics, and updates several vectorized block/expr mutation sites to avoid in-place mutation when columns are shared. It also adds a regression suite to exercise complex MultiCast/CTE scenarios.
Changes:
- Remove per-pull column deep copy in
MultiCastDataStreamer(switch to shared blocks + COW). - Update multiple “clear/filter” paths to respect COW via
is_exclusive()checks andclone_empty()fallback. - Add a new regression test suite and expected output covering complex CTE MultiCast patterns.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| regression-test/suites/nereids_p0/cte/test_cte_multicast_complex.groovy | Adds a regression suite with multiple CTE consumers/filters to trigger MultiCast behavior. |
| regression-test/data/nereids_p0/cte/test_cte_multicast_complex.out | Expected results for the new regression suite. |
| be/src/vec/exprs/vexpr_context.cpp | Makes “filter-all” clearing COW-safe by avoiding in-place clears on shared columns. |
| be/src/vec/core/block.cpp | Makes filter_block_internal/const-filter emptying COW-safe; adds clarifying comments. |
| be/src/pipeline/exec/nested_loop_join_probe_operator.h | Makes block clearing COW-safe in the non-materialize path. |
| be/src/pipeline/exec/multi_cast_data_streamer.h / .cpp | Removes the deep-copy helper and finishes pulls without cloning columns. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
TPC-H: Total hot run time: 32031 ms |
ClickBench: Total hot run time: 28.21 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
f8c3a60 to
bf1e12d
Compare
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 31578 ms |
TPC-DS: Total hot run time: 190717 ms |
ClickBench: Total hot run time: 28.29 s |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
Paste performance results |
bf1e12d to
231e35e
Compare
|
run buildall |
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
we have COW and
shallow_mutate, so for most of scenarios we dont need to explicitly copy block's data. this PR fixed this for MultiCastOperator. and fixed all founded misuses. added tests.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)