fix!: derive index type from details instead of opening the index #6903

Draft
wjones127 wants to merge 5 commits into lance-format:main from wjones127:fix-unknown-index-type

wjones127 (Contributor) commented May 22, 2026

BREAKING CHANGE: describe_indices() now reports nested and special-character field names as full field paths (e.g. meta.lang, `user-id`) instead of just the leaf name.
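
A rough sketch of the path format described above, not the code from this PR. The helper name `format_field_path` and the rule for when a segment gets backtick-quoted are assumptions made for illustration; the actual quoting rules live in the Rust implementation and may differ (for example, in how embedded backticks are handled).

```python
import re
from typing import List

# Assumption: a segment made only of letters, digits, and underscores
# (not starting with a digit) can be written without quoting.
_PLAIN_SEGMENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def format_field_path(segments: List[str]) -> str:
    """Join field path segments with dots, backtick-quoting unusual segments.

    Hypothetical helper: it only illustrates the shape of the strings
    reported in field_names after this change.
    """
    parts = []
    for segment in segments:
        if _PLAIN_SEGMENT.fullmatch(segment):
            parts.append(segment)
        else:
            # Segments with special characters are wrapped in backticks.
            parts.append(f"`{segment}`")
    return ".".join(parts)


nested = format_field_path(["meta", "lang"])
special = format_field_path(["user-id"])
```

Callers that previously matched on the leaf name alone (`lang`) would need to compare against the full path (`meta.lang`) instead.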

Previously, list_indices() called the load_indices() binding, which opened each index to derive its type and reported "Unknown" on any failure.

list_indices() is now a thin wrapper over describe_indices(), which derives the type from index details without opening the index:

  • describe_indices() no longer errors on indices without index details; it returns a best-effort degraded entry instead.
  • When index details exist but no plugin is registered for the type URL, the type is derived from the type URL rather than "Unknown".
  • field_names now uses the full field path, so nested fields are reported as dotted paths instead of just the leaf name.
  • IndexSegmentDescription gains a base_id field.
  • The unused load_indices() Python binding is removed.
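
The type resolution order in the first two bullets can be sketched roughly as follows. This is an illustration under assumptions, not the logic in rust/lance/src/index.rs: the function name `index_type_from_details`, the `registered` mapping, the example type URLs, and the convention of stripping an `IndexDetails` suffix are all hypothetical. It also leaves out the legacy vector case.

```python
from typing import Dict, Optional

# Assumption: message names end with this suffix, which is dropped
# when a type name is derived from the type URL.
_DETAILS_SUFFIX = "IndexDetails"


def index_type_from_details(
    type_url: Optional[str], registered: Dict[str, str]
) -> str:
    """Resolve an index type without opening the index (hypothetical sketch)."""
    if type_url is None:
        # No index details at all: return a degraded entry instead of erroring.
        return "Unknown"
    if type_url in registered:
        # A plugin is registered for this type URL; use its type name.
        return registered[type_url]
    # No plugin registered: fall back to a name derived from the type URL.
    message_name = type_url.rsplit("/", 1)[-1].rsplit(".", 1)[-1]
    if message_name.endswith(_DETAILS_SUFFIX) and len(message_name) > len(
        _DETAILS_SUFFIX
    ):
        message_name = message_name[: -len(_DETAILS_SUFFIX)]
    return message_name or "Unknown"


# Hypothetical registry and type URLs, for illustration only.
plugins = {"type.example/example.FooIndexDetails": "FooPlugin"}
known = index_type_from_details("type.example/example.FooIndexDetails", plugins)
fallback = index_type_from_details("type.example/example.BarIndexDetails", plugins)
degraded = index_type_from_details(None, plugins)
```

The point of the fallback is that an unregistered plugin still yields a meaningful type name, and none of the branches needs to open index files.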

The list_indices() return type hint was incorrect: it declared List[Index], but the method has always returned dicts. It is now annotated with a typed IndexInformation TypedDict, so callers get key and value types instead of an opaque dict.
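
To show why a TypedDict helps here, a minimal sketch follows. The class is named `IndexInformationSketch` on purpose: its keys (`name`, `type`, `fields`) and the `to_index_information` helper are assumptions for illustration and are not the actual definition of IndexInformation in this PR.

```python
from typing import Any, Dict, List, TypedDict


class IndexInformationSketch(TypedDict):
    # Hypothetical keys; the real IndexInformation may define others.
    name: str
    type: str
    fields: List[str]


def to_index_information(raw: Dict[str, Any]) -> IndexInformationSketch:
    """Convert a loosely typed dict into the typed shape (hypothetical helper)."""
    return IndexInformationSketch(
        name=str(raw["name"]),
        type=str(raw.get("type", "Unknown")),
        fields=[str(f) for f in raw.get("fields", [])],
    )


info = to_index_information(
    {"name": "lang_idx", "type": "BTree", "fields": ["meta.lang"]}
)
# At runtime this is still a plain dict, so existing callers keep working;
# the difference is that a type checker now knows info["fields"] is List[str].
index_type = info["type"]
```

Since a TypedDict is an ordinary dict at runtime, this kind of change affects only static type checking and should not alter what existing callers receive.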

Testing

  • Rust: cargo test -p lance --lib index::, lance-index registry tests — new tests cover the degraded entry and the type-URL fallback.
  • Python: test_scalar_index.py, test_column_names.py, test_vector_index.py, test_optimize.py — including new list_indices() characterization tests committed before the rework, plus index-without-details and legacy-vector cases.
  • Lint: cargo fmt, cargo clippy (lance, lance-index, pylance), ruff, pyright.

🤖 Generated with Claude Code

wjones127 and others added 2 commits May 21, 2026 16:26
The deprecated list_indices() returns plain dicts (not Index dataclasses),
but its dict shape was not covered by any test. Add characterization tests
locking down the dict keys/values and the nested-field path format, as a
backwards-compatibility guard before reworking the implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
list_indices() called the load_indices() binding, which opened each
index to derive its type and reported "Unknown" on any failure.

list_indices() is now a thin wrapper over describe_indices(), which
derives the type from index details without opening the index:

- describe_indices() no longer errors on indices without index details;
  it returns a best-effort degraded entry instead.
- When index details exist but no plugin is registered for the type
  URL, the type is derived from the type URL rather than "Unknown".
- field_names now uses the full field path, so nested fields are
  reported as dotted paths instead of just the leaf name.
- IndexSegmentDescription gains a base_id field.
- The unused load_indices() Python binding is removed.

The list_indices() return type hint is corrected from List[Index] to
List[Dict[str, Any]] to match what it actually returns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot added the bug, python, and breaking-change labels May 22, 2026

Add Python tests that commit raw index metadata without index details
via CreateIndex and assert list_indices()/describe_indices() report the
degraded entry ("Unknown") and the legacy monolithic vector index
("Vector") instead of erroring.

Also corrects the field_names doc comment to say "full paths" rather
than "names".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 92.10526% with 9 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/index.rs | 90.81% | 8 Missing and 1 partial ⚠️ |

wjones127 and others added 2 commits May 21, 2026 19:01
test_nested_field_vector_index indexes the nested `data.embedding`
column; field_names now reports the full path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the `List[Dict[str, Any]]` return type of `list_indices()` with
a typed `IndexInformation` TypedDict so callers get key/value type
information instead of an opaque dict.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>