Generate a native ORCA plan for a replicated CTE in scalar subqueries #1818

Draft
Alena0704 wants to merge 1 commit into apache:main from Alena0704:teach-orca-generate-cross-slice-shared-subplans

Conversation

Alena0704 commented Jun 11, 2026

Contributor

Generate a native ORCA plan for a replicated CTE in scalar subqueries

When a CTE over a DISTRIBUTED REPLICATED table is referenced from several scalar subqueries, ORCA puts the SharedScan Producer and Consumer on different slices. That cross-slice SharedScan used to hang.

Until now we just avoided the hang: FHasCrossSliceReplicatedCTEConsumer detected this shape before DXL translation and fell back to the Postgres planner. This change lets ORCA handle the scalar-subquery case natively, so no fallback is needed: the replicated CTE is materialized once per consumer slice and shared by all references inside ORCA's own plan.

The fix is in apply_shareinput_xslice (src/backend/cdb/cdbmutate.c). It applies when a cross-slice ShareInputScan Consumer is found inside a SubPlan and the Producer's whole subtree is replicated, meaning it contains no Motion and every base-relation scan is over a replicated table (checked by shareinput_subtree_is_replicated). In that case the Consumer gets its own deep copy of that subtree with a fresh share_id and becomes a local, intra-slice producer (cross_slice = false, producer_slice_id = its own slice). Because the source is replicated, every segment already holds the full data, so the local copy is equivalent and no cross-slice coordination is needed.

Sibling Consumers of the same CTE in the same slice reuse this copy (tracked by a map from (orig_share_id, motId) to new_share_id), so the CTE is materialized once per slice and read by all references there. cleanup_orphaned_producers then drops the original Producers that no longer have a Consumer. The reuse map, consumer counts and related state live in new ApplyShareInputContext fields in src/include/nodes/pathnodes.h.
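The reuse of a local copy by sibling Consumers can be illustrated with a small standalone sketch. This is not the code from the patch: ToyShareCtx, ToyReuseEntry and toy_local_share_id are hypothetical names, and the fixed-size array stands in for whatever structure the new ApplyShareInputContext fields actually use. It only shows the idea of keying the copy by (orig_share_id, motId).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_COPIES 64

/* One local copy of a replicated Producer (hypothetical layout). */
typedef struct ToyReuseEntry
{
    int orig_share_id;  /* share_id of the original cross-slice Producer */
    int mot_id;         /* slice (Motion id) that holds the Consumer */
    int new_share_id;   /* share_id assigned to the local copy */
} ToyReuseEntry;

/* Toy stand-in for the reuse state kept during the pass. */
typedef struct ToyShareCtx
{
    ToyReuseEntry entries[TOY_MAX_COPIES];
    int           nentries;
    int           next_share_id;  /* next unused share_id */
} ToyShareCtx;

/*
 * Return the share_id a Consumer in slice mot_id should read from.
 * *created is set to true when no copy existed yet for this
 * (orig_share_id, mot_id) pair, i.e. the caller would have to deep-copy
 * the Producer subtree; it is false when a sibling already did so.
 */
int
toy_local_share_id(ToyShareCtx *ctx, int orig_share_id, int mot_id,
                   bool *created)
{
    for (int i = 0; i < ctx->nentries; i++)
    {
        ToyReuseEntry *e = &ctx->entries[i];

        if (e->orig_share_id == orig_share_id && e->mot_id == mot_id)
        {
            *created = false;
            return e->new_share_id;
        }
    }

    /* First Consumer of this CTE in this slice: register a new copy. */
    assert(ctx->nentries < TOY_MAX_COPIES);
    ctx->entries[ctx->nentries].orig_share_id = orig_share_id;
    ctx->entries[ctx->nentries].mot_id = mot_id;
    ctx->entries[ctx->nentries].new_share_id = ctx->next_share_id;
    ctx->nentries++;
    *created = true;
    return ctx->next_share_id++;
}
```

In this sketch, two Consumers of the same CTE in the same slice get the same new share_id and only the first triggers a copy, while a Consumer in a different slice gets its own copy.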

Checking the whole Producer subtree (instead of a single base Scan leaf) covers Producers built from UNION ALL / Append, aggregates, partitioned scans and joins of replicated tables, which otherwise produced the same hanging cross-slice SharedScan. A Producer whose subtree contains a Motion is left alone -- a local copy would not be equivalent.
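The whole-subtree check can be sketched on a toy plan tree. The types below (ToyPlan, ToyNodeKind) and the function toy_subtree_is_replicated are illustrative assumptions, not the real Plan nodes or the actual shareinput_subtree_is_replicated. The sketch only mirrors the rule stated above: any Motion, or any base scan over a non-replicated table, disqualifies the subtree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Toy node kinds; these are NOT the real NodeTag values. */
typedef enum ToyNodeKind
{
    TOY_SCAN,    /* base-relation scan */
    TOY_MOTION,  /* data movement between slices */
    TOY_OTHER    /* Append, Agg, Join, ... */
} ToyNodeKind;

typedef struct ToyPlan
{
    ToyNodeKind           kind;
    bool                  rel_is_replicated;  /* used only for TOY_SCAN */
    int                   nchildren;
    const struct ToyPlan *children[TOY_MAX_CHILDREN];
} ToyPlan;

/*
 * Return true when the subtree contains no Motion and every base scan
 * reads a replicated table. An absent child imposes no constraint.
 */
bool
toy_subtree_is_replicated(const ToyPlan *plan)
{
    if (plan == NULL)
        return true;

    /* A Motion means the data is not locally complete on each segment. */
    if (plan->kind == TOY_MOTION)
        return false;

    if (plan->kind == TOY_SCAN && !plan->rel_is_replicated)
        return false;

    assert(plan->nchildren >= 0 && plan->nchildren <= TOY_MAX_CHILDREN);
    for (int i = 0; i < plan->nchildren; i++)
    {
        if (!toy_subtree_is_replicated(plan->children[i]))
            return false;
    }
    return true;
}
```

With this rule, an Append over two replicated scans qualifies, whereas a join with a Motion on one side, or an Append that mixes in a non-replicated table, does not.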

The transformation only runs for ORCA plans (apply_shareinput_xslice takes an is_orca flag). In the standard planner this pass runs after set_plan_references / replace_shareinput_targetlists / slice-table construction, so rewriting the tree there is unsafe -- and the Postgres fallback never produces the problematic cross-slice replicated SharedScan anyway (it uses InitPlans).

The pre-DXL fallback check (CUtils::FHasCrossSliceReplicatedCTEConsumer) was too broad -- it fired for every cross-slice replicated CTE Consumer. Narrow it to the join case: a CTE Consumer under a duplicate-hazard / broadcast Motion (greengage 51fe92e). The scalar-subquery case no longer matches here and reaches the new materialization path instead.

Add regression tests (shared_scan, qp_orca_fallback) covering the native materialization (single scan, UNION ALL/Append, repeated references) and the join case that ORCA still handles by pinning the Producer to one segment.

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions

