Skip to content

perf(merge_insert): probe every indexed key column for composite joins#6878

Open
ytyky wants to merge 4 commits into
lance-format:mainfrom
ytyky:fix-merge-insert-multi-columns
Open

perf(merge_insert): probe every indexed key column for composite joins#6878
ytyky wants to merge 4 commits into
lance-format:mainfrom
ytyky:fix-merge-insert-multi-columns

Conversation

@ytyky
Copy link
Copy Markdown

@ytyky ytyky commented May 20, 2026

Summary

  • Indexed merge_insert path was hard-gated to single-column joins; multi-column
    merges fell through to a full target scan even when every key column had a
    scalar index.
  • Generalize MapIndexExec to AND-fold per-column IsIn probes, and have
    merge_insert feed it every indexed join key.
  • Per-column indices stay independent (single-column path unchanged). Composite
    scalar indices (Composite scalar indices #3125) remain a complementary follow-up; the merge_insert
    plumbing this PR adds is what such an index would plug into.

Test plan

  • cargo test -p lance --lib merge_insert (145 tests, includes 6 new
    composite-key cases covering single-side / both-side / no index / NULL
    semantics / partial fragment coverage)
  • cargo fmt --all -- --check
  • cargo clippy -p lance --tests -- -D warnings
  • Benchmark numbers to follow in a comment.

Closes #3124

@github-actions
Copy link
Copy Markdown
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@ytyky ytyky changed the title Fix merge insert multi columns perf(merge_insert): probe every indexed key column for composite joins May 20, 2026
@ytyky
Copy link
Copy Markdown
Author

ytyky commented May 20, 2026

Benchmark numbers

Wall-clock for one merge_insert on a target with both join columns
indexed (a, b: Int64; BTREE on each). Source = 100 rows (50 updates
of existing keys, 50 inserts). Five iterations after one warmup run, on
this branch versus its parent commit. Bench source is committed as
rust/lance/tests/<[composite_merge_insert_bench.rs](http://composite_merge_insert_bench.rs/)>.

Target rows Pre-PR (full scan) This PR (indexed) Speedup
200k 16.5 ms 38.0 ms 0.43×
2M   179.3 ms 54.8 ms 3.27×

The pre-PR path scales linearly with target row count (full scan of
the join columns); this PR is essentially flat (dominated by loading
the scalar indices). The crossover is around 400-500k rows, so very
small tables see a ~20 ms regression that an opt-out heuristic could
recover later; the design point for merge_insert against indexed
tables is well above that.

@ytyky ytyky marked this pull request as ready for review May 20, 2026 22:06
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@yanghua
Copy link
Copy Markdown
Collaborator

yanghua commented May 21, 2026

@claude review

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No bugs found, and the test coverage (composite key with partial/full indexing, NULL semantics, and post-index unindexed fragments) looks solid. Deferring because this touches the merge_insert query plan and introduces a deliberate NULL-semantics asymmetry (single-column keeps NullEqualsNull, composite uses NullEqualsNothing), plus a documented small-table regression — worth a human design-level look.

Extended reasoning...

Overview

This PR generalizes the indexed merge_insert path to handle composite (multi-column) joins. Previously the optimization was hard-gated to single-column joins; any multi-column merge fell through to a full target scan even when every key column had a scalar index. The PR:

  • Adds an IndexLookup struct and a MapIndexExec::new_multi constructor that AND-folds per-column IsIn probes against each indexed column's scalar index.
  • Replaces join_key_as_scalar_index with indexed_join_keys (returns all indexed keys) and adds unindexed_fragments_for_keys which unions the per-index unindexed fragment sets — a row is only reachable via the composite probe if every chosen index covers its fragment.
  • Restricts MapIndexExec's fragment bitmap to the intersection of all per-lookup bitmaps so the deletion mask only filters rows the probe could actually see.
  • Reports a 3.27× speedup at 2M rows and a documented ~20ms regression at 200k rows (crossover around 400-500k rows).

Security risks

None apparent. This is an internal query-planning optimization with no changes to auth, serialization formats, on-disk layout, or external surface area. ScalarValue::try_from_array already handles arbitrary input types safely.

Level of scrutiny

Higher than mechanical. merge_insert is production-critical (data correctness boundary), and this PR changes the physical query plan that produces target rows. Two specific design choices warrant human review:

  1. NULL semantics asymmetry: the new code uses NullEqualsNothing for composite keys (matching the full-scan path) but preserves NullEqualsNull for the single-column path. The comment frames this as backward-compat preservation, but it means single-column and composite paths now have different NULL semantics — worth confirming this is the intended long-term contract rather than an inconsistency to be cleaned up later.
  2. Small-table regression: ~20ms slower on 200k-row tables. The author flags an opt-out heuristic as future work; reviewers may want to weigh whether shipping without the heuristic is acceptable now.

Other factors

  • The bug-hunting pass found no issues.
  • Test coverage is genuinely thoughtful — composite-key with index on each side / both / neither, NULL semantics on indexed composite key, and the unindexed-fragments-after-index case that would silently lose updates if the union logic were wrong.
  • The benchmark file is gated behind #[ignore] so it does not affect CI runtime.
  • The conventional-commits bot flagged the title, but the title (perf(merge_insert): …) appears valid; the bot may have been responding to an earlier revision.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 87.07224% with 34 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/io/exec/scalar_index.rs 68.91% 18 Missing and 5 partials ⚠️
rust/lance/src/dataset/write/merge_insert.rs 94.17% 0 Missing and 11 partials ⚠️

📢 Thoughts on this report? Let us know!

@ytyky ytyky force-pushed the fix-merge-insert-multi-columns branch from dcf1d2f to 1f834de Compare May 21, 2026 18:08
ytyky and others added 4 commits May 21, 2026 18:36
The indexed merge_insert path was hard-gated to single-column joins, so
composite-key merges fell through to a full target scan even when every
key column had a scalar index. That scan dominates wall-clock time
whenever the source batch is small relative to the target table.

Generalize `MapIndexExec` to accept a list of `IndexLookup` entries
instead of a single `(column, index_name)` pair. For each input batch
the operator builds one `IsIn` query per lookup and combines them with
`ScalarIndexExpr::And`, so the emitted candidate row addresses are the
intersection of every indexed column's matches. The deletion mask is
restricted to the intersection of the per-index fragment bitmaps for
the same reason. The existing single-column constructor is preserved
as a thin wrapper.

In `merge_insert`, replace `join_key_as_scalar_index` with
`indexed_join_keys`, which collects every join column that has a
scalar index supporting exact equality. `create_indexed_scan_joined_stream`
then builds a single `MapIndexExec` over all of them and joins on the
full composite key with standard SQL NULL semantics (the single-column
path keeps `NullEqualsNull` to avoid changing existing semantics).
Fragments missing from any chosen index are unioned in via the
existing unindexed-scan path so coverage stays complete.

Refs: lance-format#3124
One-shot wall-clock timer for indexed multi-column merge_insert against
a 200k-row target with scalar indices on both join columns. Gated
behind `#[ignore]`; run with

    cargo test --release -p lance --test composite_merge_insert_bench -- --ignored --nocapture

Used to validate the perf improvement in the parent commit.
200k rows was below the break-even where the indexed multi-column
probe overtakes the full-scan fallback. Bump to 2M so the bench
exercises the regime the parent perf commit is optimizing for.
… multi-lookup plan shape

The composite merge_insert bench prints timings to stdout, which the
workspace-wide `print_stdout` clippy lint denies. Allow the lint at the
bench file level and reformat to satisfy `cargo fmt`.

Add a unit test for `MapIndexExec` that exercises the multi-lookup
branches missed by the integration tests: the `IndexedLookup [...]`
Display rendering, `with_new_children` preserving every lookup, and the
single-column convenience constructor's unchanged Display.
@ytyky ytyky force-pushed the fix-merge-insert-multi-columns branch from 1f834de to 2314e1b Compare May 21, 2026 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support multiple columns (composite key) for merge_insert

3 participants