Generate a native ORCA plan for a replicated CTE in scalar subqueries#1818
Draft
Alena0704 wants to merge 1 commit into
When a CTE over a DISTRIBUTED REPLICATED table is referenced from several scalar subqueries, ORCA puts the SharedScan Producer and Consumer on different slices. That cross-slice SharedScan used to hang.
Until now we just avoided the hang: FHasCrossSliceReplicatedCTEConsumer detected this shape before DXL translation and fell back to the Postgres planner. This change lets ORCA handle the scalar-subquery case natively, so no fallback is needed: the replicated CTE is materialized once per consumer slice and shared by all references inside ORCA's own plan.
The fix is in apply_shareinput_xslice (src/backend/cdb/cdbmutate.c). When a cross-slice ShareInputScan Consumer is found inside a SubPlan and the Producer's whole subtree is replicated -- it contains no Motion and every base-relation scan is over a replicated table (shareinput_subtree_is_replicated) -- the Consumer gets its own deep copy of that subtree with a fresh share_id and becomes a local, intra-slice producer (cross_slice = false, producer_slice_id = its own slice). Because the source is replicated, every segment already holds the full data, so the local copy is equivalent and no cross-slice coordination is needed. Sibling Consumers of the same CTE in the same slice reuse this copy (tracked by a map from (orig_share_id, motId) to new_share_id), so the CTE is materialized once per slice and read by all references in that slice. cleanup_orphaned_producers then drops the original Producers that no longer have a Consumer. The reuse map, consumer counts and related state live in new ApplyShareInputContext fields in src/include/nodes/pathnodes.h.
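The reuse bookkeeping described above can be illustrated with a small standalone sketch. It is not the code from this PR: ReuseMap, ReuseEntry and reuse_map_get_or_create are hypothetical names, and a fixed-size array stands in for whatever structure the new ApplyShareInputContext fields actually use. It only shows the idea that the first Consumer of a CTE in a slice triggers a new local copy, and later Consumers in the same slice get the same share_id back.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy map: (orig_share_id, mot_id) -> new_share_id. */
#define REUSE_MAP_CAPACITY 16

typedef struct ReuseEntry
{
	int			orig_share_id;	/* share_id of the original cross-slice Producer */
	int			mot_id;			/* slice (Motion id) the Consumer lives in */
	int			new_share_id;	/* share_id of the local copy made for that slice */
} ReuseEntry;

typedef struct ReuseMap
{
	ReuseEntry	entries[REUSE_MAP_CAPACITY];
	size_t		nentries;
	int			next_share_id;	/* next unused share_id (assumed to be tracked here) */
} ReuseMap;

void
reuse_map_init(ReuseMap *map, int first_free_share_id)
{
	map->nentries = 0;
	map->next_share_id = first_free_share_id;
}

/*
 * Return the share_id of the local copy for this (CTE, slice) pair.
 * *created is true when the caller would have to build a new local copy,
 * false when an existing copy in the same slice is reused.
 * Returns -1 when the toy map is full.
 */
int
reuse_map_get_or_create(ReuseMap *map, int orig_share_id, int mot_id,
						bool *created)
{
	for (size_t i = 0; i < map->nentries; i++)
	{
		const ReuseEntry *e = &map->entries[i];

		if (e->orig_share_id == orig_share_id && e->mot_id == mot_id)
		{
			*created = false;
			return e->new_share_id;
		}
	}

	*created = false;
	if (map->nentries == REUSE_MAP_CAPACITY)
		return -1;

	/* First Consumer of this CTE in this slice: allocate a fresh share_id. */
	map->entries[map->nentries].orig_share_id = orig_share_id;
	map->entries[map->nentries].mot_id = mot_id;
	map->entries[map->nentries].new_share_id = map->next_share_id;
	map->nentries++;
	*created = true;
	return map->next_share_id++;
}
```

In this sketch, two Consumers of the same CTE in one slice resolve to one share_id, while a Consumer of the same CTE in another slice gets a different one, which matches the "once per consumer slice" behaviour the description states.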
Checking the whole Producer subtree (instead of a single base Scan leaf) covers Producers built from UNION ALL / Append, aggregates, partitioned scans and joins of replicated tables, which otherwise produced the same hanging cross-slice SharedScan. A Producer whose subtree contains a Motion is left alone -- a local copy would not be equivalent.
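As a rough illustration of why the check walks the whole subtree, here is a minimal sketch over a toy plan tree. ToyPlan, ToyNodeKind and toy_subtree_is_replicated are hypothetical stand-ins, not the real Plan nodes or the actual shareinput_subtree_is_replicated; the sketch only encodes the two conditions stated above: no Motion anywhere in the subtree, and every scan leaf reads a replicated table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified plan node kinds. */
typedef enum ToyNodeKind
{
	TOY_SCAN,
	TOY_APPEND,
	TOY_AGG,
	TOY_JOIN,
	TOY_MOTION
} ToyNodeKind;

typedef struct ToyPlan
{
	ToyNodeKind kind;
	bool		scan_is_replicated;	/* meaningful only for TOY_SCAN */
	const struct ToyPlan *children[2];	/* NULL when absent */
} ToyPlan;

/*
 * True when the subtree contains no Motion and every scan leaf is over a
 * replicated table. A missing child is treated as vacuously replicated.
 */
bool
toy_subtree_is_replicated(const ToyPlan *node)
{
	if (node == NULL)
		return true;

	/* Any Motion means rows are moved between segments: not safe to copy. */
	if (node->kind == TOY_MOTION)
		return false;

	/* Scans are leaves in this toy model. */
	if (node->kind == TOY_SCAN)
		return node->scan_is_replicated;

	/* Append, Agg, Join: every input must qualify. */
	for (size_t i = 0; i < 2; i++)
	{
		if (!toy_subtree_is_replicated(node->children[i]))
			return false;
	}
	return true;
}
```

With this shape, an Append over two replicated scans qualifies, while a join that touches one non-replicated table, or an aggregate sitting on top of a Motion, does not; checking only a single base Scan leaf would not distinguish these cases.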
The transformation only runs for ORCA plans (apply_shareinput_xslice takes an is_orca flag). In the standard planner this pass runs after set_plan_references / replace_shareinput_targetlists / slice-table construction, so rewriting the tree there is unsafe -- and the Postgres fallback never produces the problematic cross-slice replicated SharedScan anyway (it uses InitPlans).
The pre-DXL fallback check (CUtils::FHasCrossSliceReplicatedCTEConsumer) was too broad -- it fired for every cross-slice replicated CTE Consumer. Narrow it to the join case: a CTE Consumer under a duplicate-hazard / broadcast Motion (greengage 51fe92e). The scalar-subquery case no longer matches here and reaches the new materialization path instead.
Add regression tests (shared_scan, qp_orca_fallback) covering the native materialization (single scan, UNION ALL/Append, repeated references) and the join case that ORCA still handles by pinning the Producer to one segment.
Fixes #ISSUE_Number
What does this PR do?
Type of Change
Breaking Changes
Test Plan
make installcheck
make -C src/test installcheck-cbdb-parallel
Impact
Performance:
User-facing changes:
Dependencies:
Checklist
Additional Context
CI Skip Instructions