fix!: derive index type from details instead of opening the index #6903
Draft
wjones127 wants to merge 5 commits into
The deprecated list_indices() returns plain dicts (not Index dataclasses), but its dict shape was not covered by any test. Add characterization tests locking down the dict keys/values and the nested-field path format, as a backwards-compatibility guard before reworking the implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
list_indices() called the load_indices() binding, which opened each index to derive its type and reported "Unknown" on any failure. list_indices() is now a thin wrapper over describe_indices(), which derives the type from index details without opening the index:

- describe_indices() no longer errors on indices without index details; it returns a best-effort degraded entry instead.
- When index details exist but no plugin is registered for the type URL, the type is derived from the type URL rather than "Unknown".
- field_names now uses the full field path, so nested fields are reported as dotted paths instead of just the leaf name.
- IndexSegmentDescription gains a base_id field.
- The unused load_indices() Python binding is removed.

The list_indices() return type hint is corrected from List[Index] to List[Dict[str, Any]] to match what it actually returns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
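The type-URL fallback described in this commit could look roughly like the sketch below. It is illustrative only and is not the code from this PR: the helper name `index_type_from_type_url`, the assumed `IndexDetails` message-name suffix, and the example URL in the usage note are all assumptions.

```python
def index_type_from_type_url(type_url: str) -> str:
    """Derive a display name from a protobuf-style type URL.

    Hypothetical helper: assumes the URL ends in a message name such as
    ``BTreeIndexDetails``. The real naming convention may differ.
    """
    # Keep only the message name: drop any URL prefix and package qualifier.
    message_name = type_url.rsplit("/", 1)[-1].rsplit(".", 1)[-1]

    # Strip the assumed "IndexDetails" suffix, if present.
    suffix = "IndexDetails"
    if message_name.endswith(suffix):
        message_name = message_name[: -len(suffix)]

    # Report "Unknown" only when nothing usable is left.
    return message_name or "Unknown"
```

Under these assumptions, a made-up URL such as `/example.pkg.BTreeIndexDetails` would map to `BTree`, and an empty URL would still fall back to `Unknown`. No index file needs to be opened for this, which is the point of the change.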
Add Python tests that commit raw index metadata without index details
via CreateIndex and assert list_indices()/describe_indices() report the
degraded entry ("Unknown") and the legacy monolithic vector index
("Vector") instead of erroring.
Also corrects the field_names doc comment to say "full paths" rather
than "names".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
test_nested_field_vector_index indexes the nested `data.embedding` column; field_names now reports the full path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
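As a rough illustration of what "full path" means here, the sketch below joins path segments with dots and wraps segments containing special characters in backticks, matching the `meta.lang` and `` `user-id` `` examples in the PR description. The helper name `format_field_path` and the rule for deciding when a segment needs quoting are assumptions, not the library's actual implementation.

```python
import re
from typing import Sequence

# Assumption: a segment needs no quoting if it looks like a simple identifier.
_PLAIN_SEGMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def format_field_path(segments: Sequence[str]) -> str:
    """Join field path segments into one dotted path (hypothetical helper)."""
    parts = []
    for segment in segments:
        if _PLAIN_SEGMENT.match(segment):
            parts.append(segment)
        else:
            # Wrap segments with special characters in backticks.
            parts.append(f"`{segment}`")
    return ".".join(parts)
```

With this sketch, `["data", "embedding"]` formats as `data.embedding`, where the previous behavior would have reported only the leaf name `embedding`.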
Replace the `List[Dict[str, Any]]` return type of `list_indices()` with a typed `IndexInformation` TypedDict so callers get key/value type information instead of an opaque dict.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
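To show why a TypedDict helps callers, here is a minimal sketch. The keys below (`name`, `type`, `fields`) and the `summarize` helper are assumptions chosen for illustration; the actual `IndexInformation` definition in this PR may have different or additional keys.

```python
from typing import List, TypedDict


class IndexInformation(TypedDict):
    # Hypothetical subset of keys; the real definition may differ.
    name: str
    type: str
    fields: List[str]


def summarize(index: IndexInformation) -> str:
    """Format one entry; a type checker can now verify the key names used."""
    return f"{index['name']} ({index['type']}) on {', '.join(index['fields'])}"


# At runtime a TypedDict value is still a plain dict.
example: IndexInformation = {
    "name": "vec_idx",
    "type": "Vector",
    "fields": ["data.embedding"],
}
```

Because the runtime value is still an ordinary dict, existing callers that index by key keep working, while tools such as `pyright` can flag a misspelled key like `index["typ"]`.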
BREAKING CHANGE: `describe_indices()` now reports nested and special-character field names as full field paths (e.g. `meta.lang`, `` `user-id` ``) instead of just the leaf name.

`list_indices()` called the `load_indices()` binding, which opened each index to derive its type and reported `"Unknown"` on any failure. `list_indices()` is now a thin wrapper over `describe_indices()`, which derives the type from index details without opening the index:

- `describe_indices()` no longer errors on indices without index details; it returns a best-effort degraded entry instead.
- When index details exist but no plugin is registered for the type URL, the type is derived from the type URL rather than `"Unknown"`.
- `field_names` now uses the full field path, so nested fields are reported as dotted paths instead of just the leaf name.
- `IndexSegmentDescription` gains a `base_id` field.
- The unused `load_indices()` Python binding is removed.

The `list_indices()` return type hint was incorrect (`List[Index]`, although the method has always returned dicts). It now returns a typed `IndexInformation` TypedDict, so callers get key and value types instead of an opaque dict.

Testing

- `cargo test -p lance --lib index::` and the `lance-index` registry tests: new tests cover the degraded entry and the type-URL fallback.
- `test_scalar_index.py`, `test_column_names.py`, `test_vector_index.py`, `test_optimize.py`: these include the new `list_indices()` characterization tests committed before the rework, plus the index-without-details and legacy-vector cases.
- `cargo fmt`, `cargo clippy` (lance, lance-index, pylance), `ruff`, `pyright`.

🤖 Generated with Claude Code