Repr types: Introduce repr types in all MIR code, and then change the MIR structs#35084
Conversation
base_eq_or_repr_eq_for_assertionbase_eq_or_repr_eq_for_assertion
Pre-merge checklist
|
a9ab2a6 to
71638ef
Compare
base_eq_or_repr_eq_for_assertion8e0ab4e to
42b4805
Compare
42b4805 to
7575ddb
Compare
mgree
left a comment
There was a problem hiding this comment.
Okay---lots of suggestions/possibilities for improvement. Nothing seems wrong, so I'm marking it as approved.
Seems like there was some kind of instance/ghcr issue running nightly, but there may be benchmark issues as well.
I'll leave it up to you as to which of these changes are worth making---many are tiny refactors, but we'll want to rerun nightly and release qualification if we make them...
src/adapter/src/catalog.rs
Outdated
| if let Ok(eval_result_datum) = mir.eval(&[], &arena) { | ||
| if let Some(return_styp) = return_styp { | ||
| let mir_typ = mir.typ(&[]); | ||
| let mir_typ = mir.sql_typ(&[]); |
There was a problem hiding this comment.
We could probably be more efficient here (we later compute mir_repr_scalar_type via round trip), but it's a test---I'm not worried about it.
src/adapter/src/coord.rs
Outdated
| ) -> SqlRelationType { | ||
| let mut typ = hir_expr.top_level_typ(); | ||
| typ.backport_nullability_and_keys(&mir_expr.typ()); | ||
| typ.backport_nullability_and_keys(&mir_expr.sql_typ()); |
There was a problem hiding this comment.
We could alter this to work off mir_expr.typ() directly. Happy to save this for a later refactor.
There was a problem hiding this comment.
Done, it was not hard
| syn::Lit::Str(l) => { | ||
| packer.push(Datum::from(l.value().as_str())); | ||
| Ok(String::as_column_type()) | ||
| Ok(ReprColumnType::from(&String::as_column_type())) |
There was a problem hiding this comment.
Should as_column_type() have a SQL and repr version? Overkill?
There was a problem hiding this comment.
Ideally, yes, but these 4 calls in this file would be the only usage, so I'm postponing this.
| littyp.nullable(matches!(&litval[..], "null")), | ||
| Ok(Some(MirScalarExpr::literal_from_row( | ||
| test_spec_to_row(std::iter::once((&litval[..], &littyp)))?, | ||
| ReprScalarType::from(&littyp), |
There was a problem hiding this comment.
Confirming that we're not losing nullability information because literal_from_row can just look. (Caught me out!)
| } | ||
| }); | ||
| popped_expr.reduce(&column_types); | ||
| // Using Optimizer context is fine here: this is called from optimizer |
There was a problem hiding this comment.
I don't understand this comment.
There was a problem hiding this comment.
Deleted.
(This comment was inserted by Claude, because I was telling it to be extra careful with what code is called from the optimizer vs. from upper levels when switching over stuff to repr types. So, it wanted to show me that it did check this carefully.)
| // Even better than what we were hoping for: Found contradicting | ||
| // literal constraints, so the whole relation is empty. | ||
| relation.take_safely(Some(inp_typ)); | ||
| relation.take_safely(Some(inp_typ.clone())); |
| // NB the ReprScalarType::eq ignorres nullability (correctly!) | ||
| // since record field nullability can legitimately differ between the stored | ||
| // type and the analysis-recomputed type. | ||
| if !new_type |
There was a problem hiding this comment.
All this could now just be new_type.column_types == typ.column_types.
There was a problem hiding this comment.
Wait, ReprColumnType does not have a manual Eq impl, so this would not ignore nullability, while the current code ignores nullability.
And, indeed, if I change this to if new_type.column_types != typ.column_types, I get the following panic at startup:
internal error: failed to load bootstrap view:
mz_object_history
error:
Optimizer(TransformError(Internal("scalar types do not match: [ReprColumnType { scalar_type: UInt64, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Jsonb, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: TimestampTz, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: UInt32, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: Array(MzAclItem), nullable: true }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Bool, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: UInt64, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Jsonb, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: TimestampTz, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Bool, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }] v [ReprColumnType { scalar_type: UInt64, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Jsonb, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: TimestampTz, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: UInt32, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: Array(MzAclItem), nullable: true }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Bool, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: UInt64, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Jsonb, nullable: false }, ReprColumnType { scalar_type: String, nullable: true }, ReprColumnType { scalar_type: TimestampTz, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }, ReprColumnType { scalar_type: Bool, nullable: false }, ReprColumnType { scalar_type: String, nullable: false }]")))
Make sure that the schema name is specified in the builtin view's create sql statement.
5: __rustc::rust_begin_unwind
at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/std/src/panicking.rs:697:5
6: core::panicking::panic_fmt
at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/core/src/panicking.rs:75:14
7: mz_adapter::catalog::apply::<impl mz_adapter::catalog::state::CatalogState>::parse_builtin_views::{{closure}}::{{closure}}
at ./src/adapter/src/catalog/apply.rs:1659:21
8: mz_adapter::catalog::apply::<impl mz_adapter::catalog::state::CatalogState>::parse_builtin_views::{{closure}}
at ./src/adapter/src/catalog/apply.rs:1454:5
src/transform/src/reduce_elision.rs
Outdated
| .as_ref() | ||
| .expect("Expression not well-typed"); | ||
| .expect("Expression not well-typed") | ||
| .clone(); |
src/transform/src/union_cancel.rs
Outdated
| let relation_typ = base.typ(); | ||
| **base = MirRelationExpr::constant(vec![], relation_typ.clone()); | ||
| inputs[j] = MirRelationExpr::constant(vec![], relation_typ); | ||
| **base = MirRelationExpr::Constant { |
There was a problem hiding this comment.
Switch these back to smart constructors?
src/transform/src/union_cancel.rs
Outdated
| let relation_typ = inputs[i].typ(); | ||
| inputs[i] = MirRelationExpr::constant(vec![], relation_typ.clone()); | ||
| inputs[j] = MirRelationExpr::constant(vec![], relation_typ); | ||
| inputs[i] = MirRelationExpr::Constant { |
…f trusting `Get`s, which now have canonicalized types
These fire when repr-type canonicalization does not fully prevent type mismatches, providing CI visibility without crashing production. - scalar.rs: base_eq_or_repr_eq_for_assertion panic -> soft_panic_or_log - context.rs: arrangement key fallback panic -> soft_panic_or_log - relation.rs: try_union trace -> soft_panic_or_log
Do the following easy switches from reduce(&[SqlColumnType]) to reduce_repr(&[ReprColumnType]): - simplify_to_literal / simplify_to_literal_with_result: It's ok to do the constant folding in MIR, because the result is untyped. - canonicalize.rs: 3 calls in canonicalize_equivalences, canonicalize_predicates, and replace_subexpr_and_reduce. These already had ReprColumnType available and were converting to SqlColumnType just to call reduce. The conversion is now removed entirely. - plan_index_exprs in query.rs: Index key expressions should have repr types so optimizer code can find matching indexes. - WebhookValidation::reduce_expression in plan.rs: Webhook validation expressions are used at runtime where repr types are the native currency.
Several transform files were storing SqlRelationType obtained via typ() (which converts stored ReprRelationType to SqlRelationType), only to convert back to ReprRelationType later. Switch these to use repr_typ() and store ReprRelationType directly, eliminating unnecessary round-trips. Files changed: projection_lifting.rs, projection_pushdown.rs, cse/anf.rs, union_cancel.rs, demand.rs, dataflow.rs.
Switch Scope struct in expr-test-util to use ReprRelationType instead of SqlRelationType. Update all insert/set/get/iter signatures accordingly. build_get no longer needs conversion, build_let uses repr_typ(), and reverse_syntax_override stores ReprRelationType directly. Update doc comments in fusion/filter.rs and predicate_pushdown.rs to construct MirRelationExpr::Constant directly with ReprRelationType instead of using MirRelationExpr::constant() with SqlRelationType.
737d089 to
d1e1bee
Compare
|
Addressed all the comments. Waiting on a new Nightly. Discussing the Release Qualification failures here. |
579e0ca to
ba812f5
Compare
ba812f5 to
4931c50
Compare
This is pushing on the repr types work stream. It's on top of #35067, and stealing stuff from #35073.
The "Add soft_panic_or_log! canaries for repr type mismatches" commit turns some error logs into soft_panics, and the other commits attempt to make CI not run into any of these panics.
Nightlies:
Release qualifications:
#35219 does some naming/cleanup that we'll keep separate (since it involves touching the
*Funcmacros).