Skip to content

sql: MIR transform to canonicalize SQL types by round tripping through repr#35067

Open
mgree wants to merge 3 commits intoMaterializeInc:mainfrom
mgree:repr-types-canonicalize-xform
Open

sql: MIR transform to canonicalize SQL types by round tripping through repr#35067
mgree wants to merge 3 commits intoMaterializeInc:mainfrom
mgree:repr-types-canonicalize-xform

Conversation

@mgree
Copy link
Contributor

@mgree mgree commented Feb 18, 2026

Motivation

Part of the repr types work.
Hoping to reduce/eliminate the messages from #35050.

Description

Adds a transform ReprizeSqlTypes that converts all types inside of an MIR relation expr (and its subsidiary scalars, tablefuncs, aggregates, etc.).

Verification

A test run will suss out any breakage.
Will upgrade the the trace messages to soft_assert_or_log, to see if all of the errors are gone.

@github-actions
Copy link

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

@mgree mgree force-pushed the repr-types-canonicalize-xform branch from a2a2a04 to c6cb7f5 Compare February 18, 2026 19:24
…f trusting `Get`s, which now have canonicalized types
@mgree mgree force-pushed the repr-types-canonicalize-xform branch from a14400c to 751a693 Compare February 18, 2026 20:53
@mgree mgree marked this pull request as ready for review February 18, 2026 22:27
@mgree mgree requested review from a team as code owners February 18, 2026 22:27
@mgree mgree requested a review from ohbadiah February 18, 2026 22:27
@mgree
Copy link
Contributor Author

mgree commented Feb 18, 2026

Build failure is a CI artifact from a force push.

@mgree mgree requested a review from ggevay February 18, 2026 22:27
return res;
}

tracing::error!("fell back to repr types keys on {key:?}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I have to ask: How often is this expected to fire?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#35084 will be turning this into soft_panic!, so we'll see.

expr
})
.collect::<Vec<_>>();
self.arranged.get(&repr_keys).map(|x| x.clone())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit scared that this might happen in the other direction: the function being called with reprized keys, but self.arranged not being reprized yet. It's a bit scary, because there is a call to this function (in render_join_inner) that does not fail noisily if it can't find an arrangement, but just silently does a thing that has worse performance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        // This will be `None` in the degenerate single-input join case, which ensures
        // that we do not panic if we never go around the `stage_plans` loop.

So, maybe you could add a soft_assert, that it's really only the single-input case where it's None.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I don't understand why this is happening in the first place. The rendering should get keys from LIR, which should get keys from MIR, which has already been reprized. Why can reprizing again here help get a match?

Copy link
Contributor

@ggevay ggevay Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found some of the reasons: #35084

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyhow, #35084 is turning this fallback into a soft_panic, so mismatches here (in either direction) won't fly under the radar. So, I think this is fine for now.

dataflow_plan
.source_imports
.get(get_id)
.map(|s| s.desc.typ.clone())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be an expect, right? That is, source_imports should always have the id that the Get refers to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #35084 "expect that Get's ID is also imported" commit

Box::new(ReprTypecheck::new(ctx.repr_typecheck()).strict_join_equivalences()),
Box::new(CollectNotices),
// 0.5. Canonicalize types in MRE by converting them to repr types and back.
Box::new(ReprizeSqlTypes),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also add it at the beginning of constant_optimizer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #35084 "Add ReprizeSqlTypes to constant_optimizer" commit

ggevay added a commit that referenced this pull request Feb 26, 2026
… MIR structs (#35084)

This is pushing on the repr types work stream. It's on top of
#35067, and stealing
stuff from #35073.

The "Add soft_panic_or_log! canaries for repr type mismatches" commit
turns some error logs into soft_panics, and the other commits attempt to
make CI not run into any of these panics.

Nightlies:
- https://buildkite.com/materialize/nightly/builds/15255 (stale)
- https://buildkite.com/materialize/nightly/builds/15259 (stale)
- https://buildkite.com/materialize/nightly/builds/15260 (stale)
- https://buildkite.com/materialize/nightly/builds/15296 (stale)
- https://buildkite.com/materialize/nightly/builds/15337 (stale)
- https://buildkite.com/materialize/nightly/builds/15346 (stale)
- https://buildkite.com/materialize/nightly/builds/15350 (stale)
- https://buildkite.com/materialize/nightly/builds/15351 (stale)
- https://buildkite.com/materialize/nightly/builds/15355 (limited to
RQGs + SLTs)
- https://buildkite.com/materialize/nightly/builds/15358 (stale)
- https://buildkite.com/materialize/nightly/builds/15359 (slightly
stale)
- https://buildkite.com/materialize/nightly/builds/15366

Release qualifications:
- https://buildkite.com/materialize/release-qualification/builds/1115
(slightly stale)

#35219 does some
naming/cleanup that we'll keep separate (since it involves touching the
`*Func` macros).

---------

Co-authored-by: Michael Greenberg <michael.greenberg@materialize.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants