sql: post repr types: Rename output_type() to output_sql_type() on all scalar funcs #35219
Draft
mgree wants to merge 41 commits intoMaterializeInc:mainfrom
Draft
sql: post repr types: Rename output_type() to output_sql_type() on all scalar funcs #35219mgree wants to merge 41 commits intoMaterializeInc:mainfrom
output_type() to output_sql_type() on all scalar funcs #35219mgree wants to merge 41 commits intoMaterializeInc:mainfrom
Conversation
…f trusting `Get`s, which now have canonicalized types
These fire when repr-type canonicalization does not fully prevent type mismatches, providing CI visibility without crashing production. - scalar.rs: base_eq_or_repr_eq_for_assertion panic -> soft_panic_or_log - context.rs: arrangement key fallback panic -> soft_panic_or_log - relation.rs: try_union trace -> soft_panic_or_log
This tweaks `MirScalarExpr::typ` function case and `try_col_with_input_cols`'s TableFunc and AggregateFunc cases, to prevent function return types from reintroducing non-canonical types, e.g. CastOidToRegProc or AclExplode.
…ationType analysis Partially inspired by MaterializeInc#35073
…text Add a ReduceContext enum to MirScalarExpr::reduce to distinguish between SQL-level callers (which need exact types like VarChar, RegProc) and optimizer callers (which should work with repr-canonical types). When context is ReduceContext::Optimizer, column types are canonicalized via repr round-trip at the start of reduce(). This ensures consistency after ReprizeSqlTypes has run. Note: The ReduceContext approach introduced in this commit has been superseded by the reduce_repr() approach in a later commit. The ReduceContext enum and its usage have been removed.
Add infrastructure: - SqlRelationType::from_repr(&ReprRelationType) in src/repr/src/relation.rs - MirRelationExpr::repr_typ() -> ReprRelationType in src/expr/src/relation.rs - MirRelationExpr::repr_typ_with_input_types(&[ReprRelationType]) in src/expr/src/relation.rs Switch the following transform files to work with ReprRelationType/ReprColumnType: - column_knowledge.rs: optimize() now takes &[ReprColumnType], all .typ() sites produce Vec<ReprColumnType>, scalar.repr_typ() for output types. - predicate_pushdown.rs: repr_typ() for nullable checks, extract_equal_or_both_null and helpers take &[ReprColumnType], uses repr_typ()/repr_typ_with_input_types throughout. - fusion/filter.rs: uses input.repr_typ().column_types directly. - join_implementation.rs: input_types is Vec<ReprRelationType> via repr_typ(), uses new_from_input_repr_types. - redundant_join.rs: same pattern as join_implementation.rs. - literal_constraints.rs: inp_typ switched to ReprRelationType, converts back to SqlRelationType only for MIR node construction (Constant, take_safely). Convert back to SqlColumnType/SqlRelationType only where structurally required (reduce() calls, MIR node construction).
Add take_safely_repr(Option<ReprRelationType>) and take_safely_with_repr_col_types(Vec<ReprColumnType>) to MirRelationExpr. These are thin wrappers that convert to SQL types internally, providing a cleaner API for optimizer transforms that work with repr types natively. Update 4 call sites: - equivalence_propagation.rs (2): eliminate SqlColumnType::from_repr conversions - predicate_pushdown.rs: eliminate SqlRelationType::from_repr - literal_constraints.rs: eliminate SqlRelationType::from_repr
Change the signature of on_unique and its helpers (on_unique_ranking_window_funcs, on_unique_window_agg) from &[SqlColumnType] to &[ReprColumnType]. Internally, convert to SqlColumnType once at the top for self.typ() calls that need SQL types. This eliminates the SqlColumnType::from_repr conversion in reduce_elision.rs, which was the only external caller.
Do the following easy switches from reduce(&[SqlColumnType]) to reduce_repr(&[ReprColumnType]): - simplify_to_literal / simplify_to_literal_with_result: It's ok to do the constant folding in MIR, because the result is untyped. - canonicalize.rs: 3 calls in canonicalize_equivalences, canonicalize_predicates, and replace_subexpr_and_reduce. These already had ReprColumnType available and were converting to SqlColumnType just to call reduce. The conversion is now removed entirely. - plan_index_exprs in query.rs: Index key expressions should have repr types so optimizer code can find matching indexes. - WebhookValidation::reduce_expression in plan.rs: Webhook validation expressions are used at runtime where repr types are the native currency.
Several transform files were storing SqlRelationType obtained via typ() (which converts stored ReprRelationType to SqlRelationType), only to convert back to ReprRelationType later. Switch these to use repr_typ() and store ReprRelationType directly, eliminating unnecessary round-trips. Files changed: projection_lifting.rs, projection_pushdown.rs, cse/anf.rs, union_cancel.rs, demand.rs, dataflow.rs.
Switch Scope struct in expr-test-util to use ReprRelationType instead of SqlRelationType. Update all insert/set/get/iter signatures accordingly. build_get no longer needs conversion, build_let uses repr_typ(), and reverse_syntax_override stores ReprRelationType directly. Update doc comments in fusion/filter.rs and predicate_pushdown.rs to construct MirRelationExpr::Constant directly with ReprRelationType instead of using MirRelationExpr::constant() with SqlRelationType.
|
Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone. PR title guidelines
Pre-merge checklist
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
Cleanup stacked on #35084.
Description
The various scalar functions compute their SQL output types given SQL input types. This renames functions to clarify and adds a shim that converts repr types up to canonical SQL types and then converts the resulting SQL type back to a repr type.
Verification
All the tests.