[SPARK-56605][SQL] Wire resolution engine to use SQL PATH for table, function, and variable lookup #55523

Open
srielau wants to merge 1 commit into apache:master from srielau:SPARK-56605-resolution

@srielau srielau commented Apr 23, 2026

What changes were proposed in this pull request?

Switch the resolution engine from the legacy single-schema resolutionSearchPath to sqlResolutionPathEntries on CatalogManager, so that SET PATH actually affects how unqualified table names, function names, and variables are resolved.

CatalogManager (CatalogManager.scala):

  • sqlResolutionPathEntries: ordered path entries for resolving unqualified names. When PATH is enabled and set, uses the stored session path; otherwise falls back to defaultPathOrder (legacy).
  • sessionScopeUnqualifiedAllowed: gates unqualified variable access when system.session is not on the PATH.
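
The fallback described for sqlResolutionPathEntries can be sketched as a small self-contained model. None of the types below are Spark's real classes: SessionState, its fields, and the contents of defaultPathOrder (session scope first, then the current catalog/namespace) are assumptions made for illustration only.

```scala
// Self-contained model; not Spark's CatalogManager.
object PathEntriesSketch {
  // A path entry is a catalog followed by its namespace parts, e.g. Seq("cat", "db").
  type PathEntry = Seq[String]

  // Hypothetical snapshot of the session state that the lookup depends on.
  final case class SessionState(
      pathEnabled: Boolean,               // stands in for spark.sql.path.enabled
      storedPath: Option[Seq[PathEntry]], // what SET PATH stored, if anything
      currentCatalog: String,
      currentNamespace: Seq[String])

  // Assumed legacy order: session scope, then the current catalog/namespace.
  def defaultPathOrder(s: SessionState): Seq[PathEntry] =
    Seq(Seq("system", "session"), s.currentCatalog +: s.currentNamespace)

  // Use the stored path only when the feature is enabled and a path was set;
  // otherwise fall back to the legacy order.
  def sqlResolutionPathEntries(s: SessionState): Seq[PathEntry] =
    s.storedPath.filter(_ => s.pathEnabled).getOrElse(defaultPathOrder(s))
}
```

In this model, disabling the flag makes a stored path invisible to resolution without discarding it, which matches the "enabled and set" wording above.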

Relation resolution (RelationResolution.scala):

  • relationResolutionEntries / relationResolutionSteps replace relationResolutionSearchPath.
  • PersistentCatalogStep carries the catalog/namespace prefix so each path entry qualifies the object name under that entry.
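
A rough illustration of how a path walk with a prefix-carrying step could behave. This is a sketch, not the code in RelationResolution.scala: the PersistentCatalogStep here is a stand-in with an invented qualify method, the exists callback replaces a real catalog lookup, and "first match in path order wins" is an assumption drawn from the ordered-path wording.

```scala
// Self-contained model; not Spark's RelationResolution.
object RelationWalkSketch {
  type PathEntry = Seq[String]

  // Stand-in for a step that carries the catalog/namespace prefix of one path entry.
  final case class PersistentCatalogStep(prefix: PathEntry) {
    // Qualify an unqualified object name under this entry.
    def qualify(name: String): Seq[String] = prefix :+ name
  }

  // Try each step in path order; the first qualified name that exists wins.
  def resolve(
      name: String,
      steps: Seq[PersistentCatalogStep],
      exists: Seq[String] => Boolean): Option[Seq[String]] =
    steps.iterator.map(_.qualify(name)).find(exists)
}
```

For example, with steps for cat.a and cat.b and a table t that exists only in cat.b, resolve("t", ...) returns Some(Seq("cat", "b", "t")), while a name present under both entries resolves to the cat.a copy.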

Function resolution (FunctionResolution.scala):

  • sqlResolutionPathEntriesForAnalysis provides candidates for unqualified function names.
  • Procedure resolution (resolveProcedure) moved here from Analyzer.

Error context (CheckAnalysis.scala):

  • catalogPathForError now consults AnalysisContext.catalogAndNamespace when inside a view body, so error messages report the view's defining catalog/namespace.
  • UnresolvedTableOrViewSearchPathMode enum controls DDL vs query-like error paths.
  • Uses CatalogManager constants instead of string literals.
  • RelationChanges errors now include the search path.

Variable resolution (VariableResolution.scala + callers):

  • allowUnqualifiedSessionTempVariableLookup gates unqualified variable access when system.session is not on the PATH.
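
The gating can be pictured with the following minimal model. It assumes that a fully qualified system.session.name reference is always allowed and that matching is case-insensitive; neither is stated in this description, and lookupVariable and the Map-backed variable store are hypothetical.

```scala
// Self-contained model; not Spark's VariableResolution.
object VariableGateSketch {
  type PathEntry = Seq[String]
  val SessionScope: PathEntry = Seq("system", "session")

  private def isSessionScope(parts: Seq[String]): Boolean =
    parts.map(_.toLowerCase(java.util.Locale.ROOT)) == SessionScope

  // Unqualified variable access is allowed only when system.session is on the path.
  def sessionScopeUnqualifiedAllowed(path: Seq[PathEntry]): Boolean =
    path.exists(isSessionScope)

  def lookupVariable(
      nameParts: Seq[String],
      path: Seq[PathEntry],
      vars: Map[String, Int]): Option[Int] =
    nameParts match {
      // Single-part name: gated by the path.
      case Seq(name) =>
        if (sessionScopeUnqualifiedAllowed(path)) vars.get(name) else None
      // Qualified with system.session: assumed to bypass the gate.
      case qualifier :+ name if isSessionScope(qualifier) => vars.get(name)
      case _ => None
    }
}
```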

Analyzer (Analyzer.scala):

  • sessionConf constructor parameter for isolated analysis (e.g. Connect).
  • resolutionConf for path-based resolution.
  • AnalysisContext.resolutionPathEntries field (initially None; frozen path wiring comes in follow-up PR).

Single-pass resolver: Resolver, HybridAnalyzer, NameScope, ResolverGuard aligned with new resolution signatures.

Frozen path analysis for views/SQL functions (using stored path during analysis) comes in a follow-up PR. This PR only wires the resolution engine to use the live session path.

Why are the changes needed?

SET PATH (merged in SPARK-56501) stores the session path, but the resolvers still use the legacy single-schema path. This PR completes the wiring so PATH actually affects resolution.

Part of SPARK-54810.

Does this PR introduce any user-facing change?

Yes. With spark.sql.path.enabled = true and SET PATH, unqualified table names, function names, and variable references now resolve according to the stored path order.

How was this patch tested?

CI. ProtoToParsedPlanTestSuite updated with analyzer isolation conf. Connect .explain golden files updated. PlanResolutionSuite, NameScopeSuite, TimezoneAwareExpressionResolverSuite updated for new constructor signatures.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.6

@srielau srielau force-pushed the SPARK-56605-resolution branch 11 times, most recently from c9c7657 to 90d9edb on April 24, 2026 23:29

@srielau srielau left a comment

Path-aware resolution for relations, functions, and variables — wires SET PATH (SPARK-56501) through CatalogManager.sqlResolutionPathEntries so unqualified table/function/variable names actually use the stored session path. The follow-up that snapshots a frozen path into AnalysisContext.resolutionPathEntries for view bodies is referenced but not yet here.

Notes

Dead UnresolvedTableOrViewSearchPathMode.QueryLike / TempViewOnly + DESCRIBE search-path regression. The enum is introduced with the explicit intent that callers set the mode at construction (per the new @param doc), but no caller in this PR ever passes anything other than the default Ddl — the old commandName-based inference in CheckAnalysis (which routed "DESCRIBE …" to fullSearchPathForError, "… TEMPORARY VIEW …" to tempViewOnlySearchPathForError, and everything else to ddlSearchPathForError) was removed but the wire-up to the new enum is missing. Net effect: DESCRIBE TABLE nonexistent now reports only system.session + the current catalog/namespace in TABLE_OR_VIEW_NOT_FOUND instead of the full PATH (the , Ddl token in the updated describe.sql.out golden confirms the new mode is what's flowing). Either thread QueryLike through SparkSqlParser.visitDescribeRelation (and elsewhere if applicable), or keep the old commandName inference in CheckAnalysis and drop the unused enum cases. Inline at v2ResolutionPlans.scala:84.
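
To make the regression concrete, here is a small model of how the three modes could select the search path reported in TABLE_OR_VIEW_NOT_FOUND. It mirrors the old commandName-based routing described above; the sealed trait is a stand-in for the real enum, searchPathForError is an invented name, and the TempViewOnly result (system.session only) is an assumption.

```scala
// Self-contained model; not Spark's v2ResolutionPlans or CheckAnalysis.
object SearchPathModeSketch {
  sealed trait UnresolvedTableOrViewSearchPathMode
  case object Ddl extends UnresolvedTableOrViewSearchPathMode
  case object QueryLike extends UnresolvedTableOrViewSearchPathMode
  case object TempViewOnly extends UnresolvedTableOrViewSearchPathMode

  private val SessionScope: Seq[String] = Seq("system", "session")

  // Pick the search path to report when a table or view is not found.
  def searchPathForError(
      mode: UnresolvedTableOrViewSearchPathMode,
      fullPath: Seq[Seq[String]],
      currentCatalogAndNamespace: Seq[String]): Seq[Seq[String]] =
    mode match {
      case QueryLike    => fullPath                                      // e.g. DESCRIBE
      case TempViewOnly => Seq(SessionScope)                             // assumed
      case Ddl          => Seq(SessionScope, currentCatalogAndNamespace) // default
    }
}
```

If every caller leaves the mode at Ddl, the QueryLike branch is never taken, which is exactly the DESCRIBE TABLE behavior change noted above.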

Duplicated AnalysisContext path-default logic appears inline in three places (Analyzer.scala:2096, CheckAnalysis.scala:409, FunctionResolution.scala:76). Each computes if (AnalysisContext.get.catalogAndNamespace.nonEmpty) ctx else currentCatalog.name +: currentNamespace before calling the 4-arg sqlResolutionPathEntries. Worth centralizing — either a helper on CatalogManager or letting catalogPathForError itself consult AnalysisContext (the PR description claims it does, but CheckAnalysis.scala:93 still just returns currentCatalog.name +: currentNamespace).
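
One possible shape for the centralized helper, written against simplified stand-in types rather than the real AnalysisContext and CatalogManager. The name catalogPathForResolution and the Current case class are hypothetical.

```scala
// Self-contained model; the case classes only mimic the fields used here.
object PathDefaultSketch {
  final case class AnalysisContext(catalogAndNamespace: Seq[String])
  final case class Current(catalog: String, namespace: Seq[String])

  // Inside a view body the context carries the view's defining catalog/namespace;
  // otherwise fall back to the session's current catalog and namespace.
  def catalogPathForResolution(ctx: AnalysisContext, current: Current): Seq[String] =
    if (ctx.catalogAndNamespace.nonEmpty) ctx.catalogAndNamespace
    else current.catalog +: current.namespace
}
```

Each of the call sites would then pass the result to the 4-arg sqlResolutionPathEntries instead of repeating the conditional.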

Scaladoc references to [[SQLConf.sqlResolutionPathEntries]] are broken — the method lives on CatalogManager. Six occurrences across Analyzer.scala:145, CheckAnalysis.scala:83, FunctionResolution.scala:90, RelationResolution.scala:117/130/132/233. [[AnalysisContext.snapshotViewResolutionPath]] at RelationResolution.scala:128 doesn't exist anywhere in the repo. Suggestions inline.

Test coverage

No new test cases for the path-resolution behavior changes (single-part table/function/variable lookup honoring SET PATH, system.session gating of unqualified variables). The TableLookupCacheSuite change just teaches the mock about the new method. If the existing CI suites cover this transitively, fine; if not, a SET PATH end-to-end test for each of the three lookup paths would be worth adding.

Comment thread sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala Outdated
Comment thread sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala Outdated

@srielau srielau left a comment

Re-review on the same commit (90d9edbe); the design hasn't changed since the prior AI round.

Status: 0 addressed, 6 remaining, 4 new.

Remaining from prior review

All 9 inline comments and the summary findings from the prior <!-- ai-code-review --> round are still applicable on this commit. Highlights:

  • Dead UnresolvedTableOrViewSearchPathMode.QueryLike / TempViewOnly cases → DESCRIBE TABLE nonexistent regresses from full sqlResolutionPathEntries to system.session + current catalog/namespace only (see prior comment on v2ResolutionPlans.scala:75-87).
  • Path-default + 4-arg sqlResolutionPathEntries pattern duplicated in 4 places (Analyzer.scala:2096, CheckAnalysis.scala:409, FunctionResolution.scala:76+105, RelationResolution.scala:135); consolidation opportunity. The PR description claims catalogPathForError consults AnalysisContext.catalogAndNamespace, but CheckAnalysis.scala:93 still just returns currentCatalog.name +: currentNamespace.
  • Six broken [[SQLConf.sqlResolutionPathEntries]] Scaladoc refs (method is on CatalogManager); plus broken [[AnalysisContext.snapshotViewResolutionPath]] ref at RelationResolution.scala:128 (no such method exists).
  • resolveProcedure silently swallows NonFatal for unqualified names (FunctionResolution.scala:665-674), losing the underlying catalog failure.
  • Redundant || in ResolverGuard.scala:479-480 (isUnsupportedFunction(name) is defined as UNSUPPORTED_FUNCTION_NAMES.contains(name)).
  • No new functional tests for path-affected resolution (single-part table/function lookup honoring SET PATH, system.session gating of unqualified variables, procedure resolution via path).
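
For the resolveProcedure point, the difference between swallowing and surfacing a failure can be sketched as follows. The exception classes and the load callback are hypothetical, and treating "not found" as "try the next candidate" is an assumption about the intended behavior, not something taken from FunctionResolution.scala.

```scala
import scala.util.control.NonFatal

// Self-contained model; not Spark's FunctionResolution.
object ProcedureLookupSketch {
  final class NoSuchProcedure(name: String)
    extends Exception(s"procedure not found: $name")
  final class FailedToLoadRoutine(name: String, cause: Throwable)
    extends Exception(s"failed to load routine: $name", cause)

  // Try each qualified candidate in order. A clean "not found" moves on to the
  // next candidate; any other non-fatal failure is rethrown with its cause
  // attached instead of being silently dropped.
  def resolveProcedure[A](candidates: Seq[String], load: String => A): Option[A] =
    candidates.iterator.map { qualified =>
      try Some(load(qualified))
      catch {
        case _: NoSuchProcedure => None
        case NonFatal(e) => throw new FailedToLoadRoutine(qualified, e)
      }
    }.collectFirst { case Some(procedure) => procedure }
}
```

With this shape, a catalog outage on the first path entry is reported as a load failure instead of looking like a missing procedure.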

New this round

  1. confForRoutineResolution (defined CheckAnalysis.scala:49, overridden Analyzer.scala:305) is never read anywhere — pure dead code. (Inline.)
  2. Analyzer.scala:998-999 comment claims unqualified relation PATH is snapshotted in AnalysisContext.resolutionPathEntries while resolving each view body, but no code in this PR sets that field — the snapshot wiring is deferred to the follow-up PR. (Inline.)
  3. Analyzer.executeAndCheck wraps the run in SQLConf.withExistingConf(sessionConf.get), but the public Analyzer.execute (lines 353-359, called from RelationalGroupedDataset.scala:647 and several test sites) does not. So analyzer.execute(plan) runs the resolution machinery against SQLConf.get rather than the captured sessionConf. Production callers happen to be OK because the active session's conf already matches, but the asymmetry undermines the analyzer-isolation guarantee sessionConf is meant to provide. (Inline.)
  4. PR description states CheckAnalysis now "Uses CatalogManager constants instead of string literals," but ddlSearchPathForError (line 79) and tempViewOnlySearchPathForError (line 102) still use Seq("system", "session"). Either flip to Seq(CatalogManager.SYSTEM_CATALOG_NAME, CatalogManager.SESSION_NAMESPACE) or update the PR description.
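
The asymmetry in item 3 can be illustrated with a toy analyzer that pins its captured conf in one shared inner method, so that both public entry points see the same settings. Everything here is a model: Conf, the DynamicVariable-backed activeConf, and the string "plans" are stand-ins, and the only Spark name borrowed is executeSameContext as the place to do the wrapping.

```scala
import scala.util.DynamicVariable

// Self-contained model; not Spark's Analyzer or SQLConf.
object ConfScopeSketch {
  final case class Conf(pathEnabled: Boolean)

  // Stand-in for the thread-local "active" conf that SQLConf.get would return.
  private val active = new DynamicVariable[Conf](Conf(pathEnabled = false))
  def activeConf: Conf = active.value
  def withExistingConf[A](conf: Conf)(body: => A): A = active.withValue(conf)(body)

  final class Analyzer(sessionConf: Option[Conf]) {
    // Shared inner entry point: pins the captured conf for every caller.
    private def executeSameContext(plan: String): String = {
      def run(): String = s"${plan}[pathEnabled=${activeConf.pathEnabled}]"
      sessionConf.fold(run())(conf => withExistingConf(conf)(run()))
    }

    // Both public entry points go through the same wrapper, so neither
    // depends on whatever conf happens to be active on the calling thread.
    def execute(plan: String): String = executeSameContext(plan)
    def executeAndCheck(plan: String): String = executeSameContext(plan)
  }
}
```

Wrapping only executeAndCheck would leave execute reading activeConf directly, which is the gap described above.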

Comment thread sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala Outdated
Comment thread sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala Outdated
…function, and variable lookup

Switch the resolution engine from the legacy single-schema
resolutionSearchPath to sqlResolutionPathEntries on CatalogManager,
so that SET PATH actually affects how unqualified names are resolved.

Key changes:
- CatalogManager: add sqlResolutionPathEntries and
  sessionScopeUnqualifiedAllowed (reads stored session path)
- RelationResolution: walk path entries via PersistentCatalogStep,
  each entry qualifies the object name under that catalog/namespace
- FunctionResolution: use sqlResolutionPathEntriesForAnalysis for
  resolution candidates; move procedure resolution here
- Analyzer: add sessionConf/resolutionConf for isolated analysis;
  AnalysisContext gains resolutionPathEntries field (initially None)
- CheckAnalysis: view-body-aware catalogPathForError,
  UnresolvedTableOrViewSearchPathMode, CatalogManager constants
- VariableResolution: allowUnqualifiedSessionTempVariableLookup gates
  unqualified variable access when system.session not on PATH
- Single-pass resolver updates aligned with new resolution signatures

Frozen path analysis for views/SQL functions comes in a follow-up PR.

Part of SPARK-54810.
@srielau srielau force-pushed the SPARK-56605-resolution branch from 90d9edb to 4f300f7 on April 25, 2026 18:01

srielau commented Apr 25, 2026

All 13 self-review findings addressed in commit 4f300f70868:

  1. Dead enum cases -- Wired QueryLike from visitDescribeRelation
  2. Duplicated path-default logic -- Consolidated into view-body-aware catalogPathForError
  3. Broken Scaladoc refs -- All fixed to CatalogManager.sqlResolutionPathEntries
  4. resolveProcedure swallowed NonFatal -- Now always wraps in failedToLoadRoutineError
  5. Redundant || -- Simplified to single isUnsupportedFunction call
  6. No functional tests -- Added 3 resolution tests (table order, function order, not-found)
  7. Dead confForRoutineResolution -- Removed
  8. Comment about view snapshot -- Fixed to "follow-up PR"
  9. execute vs executeAndCheck asymmetry -- sessionConf wrapping in executeSameContext
  10. String literals -- Replaced with CatalogManager constants
