[SPARK-56605][SQL] Wire resolution engine to use SQL PATH for table, function, and variable lookup #55523
srielau wants to merge 1 commit into apache:master
c9c7657 to 90d9edb
srielau left a comment
Path-aware resolution for relations, functions, and variables — wires SET PATH (SPARK-56501) through CatalogManager.sqlResolutionPathEntries so unqualified table/function/variable names actually use the stored session path. The follow-up that snapshots a frozen path into AnalysisContext.resolutionPathEntries for view bodies is referenced but not yet here.
Notes
Dead UnresolvedTableOrViewSearchPathMode.QueryLike / TempViewOnly + DESCRIBE search-path regression. The enum is introduced with the explicit intent that callers set the mode at construction (per the new @param doc), but no caller in this PR ever passes anything other than the default Ddl — the old commandName-based inference in CheckAnalysis (which routed "DESCRIBE …" to fullSearchPathForError, "… TEMPORARY VIEW …" to tempViewOnlySearchPathForError, and everything else to ddlSearchPathForError) was removed but the wire-up to the new enum is missing. Net effect: DESCRIBE TABLE nonexistent now reports only system.session + the current catalog/namespace in TABLE_OR_VIEW_NOT_FOUND instead of the full PATH (the , Ddl token in the updated describe.sql.out golden confirms the new mode is what's flowing). Either thread QueryLike through SparkSqlParser.visitDescribeRelation (and elsewhere if applicable), or keep the old commandName inference in CheckAnalysis and drop the unused enum cases. Inline at v2ResolutionPlans.scala:84.
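To make the first option concrete, here is a minimal sketch of mode-driven error-path selection. Everything in it is a simplified, hypothetical model (the object, the `searchPathForError` signature, and the `UnresolvedTableOrView` case class are illustrative and are not the PR's actual code). The `Ddl` and `QueryLike` mappings follow the behavior described above; mapping `TempViewOnly` to `system.session` alone is an assumption.

```scala
// Hypothetical, simplified model; these are not Spark's actual signatures.
object SearchPathModeSketch {
  sealed trait SearchPathMode
  case object Ddl extends SearchPathMode
  case object QueryLike extends SearchPathMode
  case object TempViewOnly extends SearchPathMode

  val SessionEntry: Seq[String] = Seq("system", "session")

  // Path reported in a not-found error, chosen by the mode set at construction.
  def searchPathForError(
      mode: SearchPathMode,
      fullPath: Seq[Seq[String]],
      currentCatalogAndNamespace: Seq[String]): Seq[Seq[String]] = mode match {
    case QueryLike => fullPath                                 // e.g. DESCRIBE
    case TempViewOnly => Seq(SessionEntry)                     // assumption
    case Ddl => Seq(SessionEntry, currentCatalogAndNamespace)  // default
  }

  // A parser rule for DESCRIBE would pass QueryLike explicitly;
  // relying on the default yields the narrower Ddl path.
  final case class UnresolvedTableOrView(
      nameParts: Seq[String],
      commandName: String,
      mode: SearchPathMode = Ddl)
}
```

With this shape, a caller that forgets to pass a mode silently gets `Ddl`, which is the regression pattern described in the note.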
Duplicated AnalysisContext path-default logic appears inline in three places (Analyzer.scala:2096, CheckAnalysis.scala:409, FunctionResolution.scala:76). Each computes if (AnalysisContext.get.catalogAndNamespace.nonEmpty) ctx else currentCatalog.name +: currentNamespace before calling the 4-arg sqlResolutionPathEntries. Worth centralizing — either a helper on CatalogManager or letting catalogPathForError itself consult AnalysisContext (the PR description claims it does, but CheckAnalysis.scala:93 still just returns currentCatalog.name +: currentNamespace).
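A centralized helper could look roughly like the sketch below. `Ctx` and `Session` are hypothetical stand-ins for `AnalysisContext` and the `CatalogManager` state, and the helper name `effectiveCatalogAndNamespace` is invented for illustration; only the conditional itself is taken from the duplicated call sites.

```scala
// Hypothetical stand-ins; not the real AnalysisContext / CatalogManager types.
object PathDefaultSketch {
  final case class Ctx(catalogAndNamespace: Seq[String])
  final case class Session(currentCatalogName: String, currentNamespace: Seq[String])

  // Inside a view body the view's defining catalog/namespace wins;
  // otherwise fall back to the session's current catalog and namespace.
  def effectiveCatalogAndNamespace(ctx: Ctx, session: Session): Seq[String] =
    if (ctx.catalogAndNamespace.nonEmpty) ctx.catalogAndNamespace
    else session.currentCatalogName +: session.currentNamespace
}
```

Each call site would then pass the helper's result to the 4-arg method instead of repeating the conditional.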
Scaladoc references to [[SQLConf.sqlResolutionPathEntries]] are broken — the method lives on CatalogManager. Six occurrences across Analyzer.scala:145, CheckAnalysis.scala:83, FunctionResolution.scala:90, RelationResolution.scala:117/130/132/233. [[AnalysisContext.snapshotViewResolutionPath]] at RelationResolution.scala:128 doesn't exist anywhere in the repo. Suggestions inline.
Test coverage
No new test cases for the path-resolution behavior changes (single-part table/function/variable lookup honoring SET PATH, system.session gating of unqualified variables). The TableLookupCacheSuite change just teaches the mock about the new method. If the existing CI suites cover this transitively, fine; if not, a SET PATH end-to-end test for each of the three lookup paths would be worth adding.
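As an illustration of the gating behavior such a test would need to pin down, here is a small self-contained model. It is a sketch under assumptions: the names are hypothetical, the path match is exact (real identifier comparison may be case-insensitive), and it assumes fully qualified `system.session.<name>` lookups are never gated.

```scala
// Hypothetical model of system.session gating for unqualified variables.
object VariableGatingSketch {
  val SessionEntry: Seq[String] = Seq("system", "session")

  // Unqualified variable names are eligible only when system.session is on the path.
  def unqualifiedVariableAllowed(path: Seq[Seq[String]]): Boolean =
    path.contains(SessionEntry)

  def lookupVariable(
      nameParts: Seq[String],
      path: Seq[Seq[String]],
      sessionVars: Map[String, Int]): Option[Int] = nameParts match {
    case Seq(name) if unqualifiedVariableAllowed(path) => sessionVars.get(name)
    // Assumption: a fully qualified reference bypasses the gate.
    case Seq("system", "session", name) => sessionVars.get(name)
    case _ => None
  }
}
```

An end-to-end test would assert the same three cases against a real session: unqualified with `system.session` on the path, unqualified without it, and fully qualified without it.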
srielau left a comment
Re-review on the same commit (90d9edbe); the design hasn't changed since the prior AI round.
Status: 0 addressed, 6 remaining, 4 new.
Remaining from prior review
All 9 inline comments and the summary findings from the prior <!-- ai-code-review --> round are still applicable on this commit. Highlights:
- Dead UnresolvedTableOrViewSearchPathMode.QueryLike / TempViewOnly cases → DESCRIBE TABLE nonexistent regresses from full sqlResolutionPathEntries to system.session + current catalog/namespace only (see prior comment on v2ResolutionPlans.scala:75-87).
- Path-default + 4-arg sqlResolutionPathEntries pattern duplicated 4 places (Analyzer.scala:2096, CheckAnalysis.scala:409, FunctionResolution.scala:76+105, RelationResolution.scala:135); consolidation opportunity. The PR description claims catalogPathForError consults AnalysisContext.catalogAndNamespace, but CheckAnalysis.scala:93 still just returns currentCatalog.name +: currentNamespace.
- Six broken [[SQLConf.sqlResolutionPathEntries]] Scaladoc refs (method is on CatalogManager); plus broken [[AnalysisContext.snapshotViewResolutionPath]] ref at RelationResolution.scala:128 (no such method exists).
- resolveProcedure silently swallows NonFatal for unqualified names (FunctionResolution.scala:665-674), losing the underlying catalog failure.
- Redundant || in ResolverGuard.scala:479-480 (isUnsupportedFunction(name) is defined as UNSUPPORTED_FUNCTION_NAMES.contains(name)).
- No new functional tests for path-affected resolution (single-part table/function lookup honoring SET PATH, system.session gating of unqualified variables, procedure resolution via path).
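For the swallowed-NonFatal finding, one possible shape is to keep trying later path entries but remember the first failure and attach it as the cause of the final not-found error. The sketch below is self-contained and hypothetical: `load`, `ProcedureNotFound`, and `resolveUnqualified` are illustrative names, not the code at FunctionResolution.scala:665-674.

```scala
import scala.util.control.NonFatal

// Hypothetical model of path-based procedure lookup that preserves failures.
object ProcedureLookupSketch {
  final class ProcedureNotFound(name: String, cause: Throwable)
    extends RuntimeException(s"Procedure not found: $name", cause)

  // `load` stands in for a catalog call: None when absent, may throw on a real failure.
  def resolveUnqualified(
      name: String,
      pathEntries: Seq[Seq[String]],
      load: Seq[String] => Option[String]): String = {
    var firstFailure: Option[Throwable] = None
    val hit = pathEntries.iterator.map { entry =>
      try load(entry :+ name)
      catch {
        case NonFatal(e) =>
          // Keep walking the path, but remember why this entry failed.
          if (firstFailure.isEmpty) firstFailure = Some(e)
          None
      }
    }.collectFirst { case Some(p) => p }
    // If nothing resolved, surface the first underlying failure as the cause.
    hit.getOrElse(throw new ProcedureNotFound(name, firstFailure.orNull))
  }
}
```

Whether a later entry should still be allowed to win after an earlier catalog failure is a design decision for the PR; the sketch only shows that the failure need not be lost.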
New this round
- confForRoutineResolution (defined CheckAnalysis.scala:49, overridden Analyzer.scala:305) is never read anywhere: pure dead code. (Inline.)
- Analyzer.scala:998-999 comment claims unqualified relation PATH is snapshotted in AnalysisContext.resolutionPathEntries while resolving each view body, but no code in this PR sets that field; the snapshot wiring is deferred to the follow-up PR. (Inline.)
- Analyzer.executeAndCheck wraps the run in SQLConf.withExistingConf(sessionConf.get), but the public Analyzer.execute (line 353-359, called from RelationalGroupedDataset.scala:647 and several test sites) does not. So analyzer.execute(plan) runs the resolution machinery against SQLConf.get rather than the captured sessionConf. Production callers happen to be OK because the active session's conf already matches, but the asymmetry undermines the analyzer-isolation guarantee sessionConf is meant to provide. (Inline.)
- PR description states CheckAnalysis now "Uses CatalogManager constants instead of string literals," but ddlSearchPathForError (line 79) and tempViewOnlySearchPathForError (line 102) still use Seq("system", "session"). Either flip to Seq(CatalogManager.SYSTEM_CATALOG_NAME, CatalogManager.SESSION_NAMESPACE) or update the PR description.
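The execute / executeAndCheck asymmetry can be illustrated with a small model in which both entry points share one wrapper. This is a sketch only: `Conf`, `activeConf`, and this `Analyzer` class are hypothetical stand-ins, with `scala.util.DynamicVariable` playing the role of the thread-local active conf.

```scala
import scala.util.DynamicVariable

// Hypothetical model; not Spark's SQLConf or Analyzer.
object ConfIsolationSketch {
  final case class Conf(pathEnabled: Boolean)

  // Stand-in for the thread-local "currently active" conf.
  private val active = new DynamicVariable[Conf](Conf(pathEnabled = false))
  def activeConf: Conf = active.value

  final class Analyzer(sessionConf: Option[Conf]) {
    // Runs `body` with the captured conf active, if one was supplied.
    private def withSessionConf[T](body: => T): T = sessionConf match {
      case Some(c) => active.withValue(c)(body)
      case None => body
    }

    private def resolve(plan: String): String =
      s"$plan[pathEnabled=${activeConf.pathEnabled}]"

    // Both public entry points go through the same wrapper,
    // so they observe the same conf regardless of the caller's thread state.
    def execute(plan: String): String = withSessionConf(resolve(plan))
    def executeAndCheck(plan: String): String = withSessionConf(resolve(plan))
  }
}
```

In this model, calling `execute` from a thread whose active conf differs from the captured one still resolves against the captured conf, which is the guarantee the finding says is currently missing for the public `execute`.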
…function, and variable lookup
Switch the resolution engine from the legacy single-schema resolutionSearchPath to sqlResolutionPathEntries on CatalogManager, so that SET PATH actually affects how unqualified names are resolved.
Key changes:
- CatalogManager: add sqlResolutionPathEntries and sessionScopeUnqualifiedAllowed (reads stored session path)
- RelationResolution: walk path entries via PersistentCatalogStep, each entry qualifies the object name under that catalog/namespace
- FunctionResolution: use sqlResolutionPathEntriesForAnalysis for resolution candidates; move procedure resolution here
- Analyzer: add sessionConf/resolutionConf for isolated analysis; AnalysisContext gains resolutionPathEntries field (initially None)
- CheckAnalysis: view-body-aware catalogPathForError, UnresolvedTableOrViewSearchPathMode, CatalogManager constants
- VariableResolution: allowUnqualifiedSessionTempVariableLookup gates unqualified variable access when system.session not on PATH
- Single-pass resolver updates aligned with new resolution signatures
Frozen path analysis for views/SQL functions comes in a follow-up PR.
Part of SPARK-54810.
90d9edb to 4f300f7
All 13 self-review findings addressed in commit
What changes were proposed in this pull request?
Switch the resolution engine from the legacy single-schema resolutionSearchPath to sqlResolutionPathEntries on CatalogManager, so that SET PATH actually affects how unqualified table names, function names, and variables are resolved.
CatalogManager (CatalogManager.scala):
- sqlResolutionPathEntries: ordered path entries for resolving unqualified names. When PATH is enabled and set, uses the stored session path; otherwise falls back to defaultPathOrder (legacy).
- sessionScopeUnqualifiedAllowed: gates unqualified variable access when system.session is not on the PATH.
Relation resolution (RelationResolution.scala):
- relationResolutionEntries / relationResolutionSteps replace relationResolutionSearchPath.
- PersistentCatalogStep carries the catalog/namespace prefix so each path entry qualifies the object name under that entry.
Function resolution (FunctionResolution.scala):
- sqlResolutionPathEntriesForAnalysis provides candidates for unqualified function names.
- Procedure resolution (resolveProcedure) moved here from Analyzer.
Error context (CheckAnalysis.scala):
- catalogPathForError now consults AnalysisContext.catalogAndNamespace when inside a view body, so error messages report the view's defining catalog/namespace.
- UnresolvedTableOrViewSearchPathMode enum controls DDL vs query-like error paths.
- Uses CatalogManager constants instead of string literals.
- RelationChanges errors now include the search path.
Variable resolution (VariableResolution.scala + callers):
- allowUnqualifiedSessionTempVariableLookup gates unqualified variable access when system.session is not on the PATH.
Analyzer (Analyzer.scala):
- sessionConf constructor parameter for isolated analysis (e.g. Connect).
- resolutionConf for path-based resolution.
- AnalysisContext.resolutionPathEntries field (initially None; frozen path wiring comes in follow-up PR).
Single-pass resolver:
- Resolver, HybridAnalyzer, NameScope, ResolverGuard aligned with new resolution signatures.
Frozen path analysis for views/SQL functions (using stored path during analysis) comes in a follow-up PR. This PR only wires the resolution engine to use the live session path.
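The path walk described under relation resolution can be sketched as follows. This is an illustrative model only: the `PersistentCatalogStep` here is a simplified stand-in that shares the name used in the PR, while `resolve` and the `exists` callback are hypothetical, and first-match-wins ordering is assumed from "ordered path entries".

```scala
// Hypothetical model of resolving a single-part name against ordered path entries.
object PathWalkSketch {
  // A path entry is a catalog name followed by namespace parts, e.g. Seq("cat", "db").
  type PathEntry = Seq[String]

  // Simplified stand-in: carries the catalog/namespace prefix used to qualify a name.
  final case class PersistentCatalogStep(prefix: PathEntry) {
    def qualify(name: String): Seq[String] = prefix :+ name
  }

  // Try each entry in order; the first qualified name that exists wins (assumption).
  def resolve(
      name: String,
      path: Seq[PathEntry],
      exists: Seq[String] => Boolean): Option[Seq[String]] =
    path.iterator
      .map(entry => PersistentCatalogStep(entry).qualify(name))
      .find(exists)
}
```

For example, with path entries `cat.a`, `cat.b`, `cat.c` and a table `t` present in both `cat.b` and `cat.c`, the sketch resolves `t` to `cat.b.t` because `cat.b` comes first.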
Why are the changes needed?
SET PATH (merged in SPARK-56501) stores the session path, but the resolvers still used the legacy single-schema path. This PR completes the wiring so PATH actually affects resolution.
Part of SPARK-54810.
Does this PR introduce any user-facing change?
Yes. With spark.sql.path.enabled = true and SET PATH, unqualified table names, function names, and variable references now resolve according to the stored path order.
How was this patch tested?
CI. ProtoToParsedPlanTestSuite updated with analyzer isolation conf. Connect .explain golden files updated. PlanResolutionSuite, NameScopeSuite, TimezoneAwareExpressionResolverSuite updated for new constructor signatures.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.6