@hatyo hatyo commented Nov 27, 2025

This PR introduces the INDEX ON syntax, a new way to define indexes in the relational layer that is more concise about how index keys and values are organized. It is an alternative to the existing INDEX AS SELECT (materialized view) approach and produces semantically equivalent indexes and query plans. The PR also introduces a new type of index, the vector index, which can be defined only with the new syntax.

New Syntax

The INDEX ON syntax follows a more traditional SQL pattern:

CREATE [UNIQUE] INDEX index_name ON table_name (
      column1 [ASC|DESC] [NULLS FIRST|NULLS LAST],
      column2 [ASC|DESC] [NULLS FIRST|NULLS LAST],
      ...
)
[INCLUDE (value_column1, value_column2, ...)]
[OPTIONS (option_name, ...)]

Examples

Regular value indexes:

-- Simple ascending index
CREATE INDEX idx_name ON customers(name)

-- Descending index with explicit ordering
CREATE INDEX idx_email ON customers(email DESC)

-- Multi-column index
CREATE INDEX idx_age_city ON customers(age, city DESC)

-- Covering index with included columns
CREATE INDEX idx_name_covering ON customers(name) INCLUDE(email, country)

-- Index with null ordering
CREATE INDEX idx_country ON customers(country ASC NULLS FIRST)

Aggregate indexes:

-- Using INDEX ON syntax with views
CREATE VIEW v_sum_by_category AS
      SELECT category, SUM(amount) AS total FROM sales GROUP BY category;
CREATE INDEX idx_sum ON v_sum_by_category(category) INCLUDE(total);

-- Equivalent INDEX AS SELECT syntax
CREATE INDEX idx_sum AS
      SELECT SUM(amount) FROM sales GROUP BY category;
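
As a rough illustration of what either definition materializes, the sketch below computes the grouped sums in plain Python: one entry per category (the index key) carrying the total (the included value). The sample rows and the function name are made up for the example, and it says nothing about how the relational layer actually stores or maintains the index.

```python
from collections import defaultdict


def sum_by_category(sales):
    """Group rows by category and sum their amounts, like
    SELECT category, SUM(amount) AS total FROM sales GROUP BY category."""
    totals = defaultdict(int)
    for row in sales:
        totals[row["category"]] += row["amount"]
    # Sort by key to mimic reading an index ordered on category.
    return sorted(totals.items())


# Hypothetical sample data, not taken from the PR.
sales = [
    {"category": "books", "amount": 12},
    {"category": "games", "amount": 30},
    {"category": "books", "amount": 8},
]
index_entries = sum_by_category(sales)
```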

Newly introduced vector indexes:

CREATE INDEX idx_vector USING HNSW
      ON documents(embedding)
      PARTITION BY (doc_type, category)
      OPTIONS (
          connectivity = 16,
          ef_construction = 200,
          metric = 'EUCLIDEAN_METRIC'
      );
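
For orientation, the sketch below shows the exact search that an HNSW index approximates: rank the rows of one partition by Euclidean distance to a query vector and keep the closest k. It assumes that metric = 'EUCLIDEAN_METRIC' means ordinary L2 distance and that the PARTITION BY columns scope each search, neither of which this PR description spells out. The row layout and function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Plain L2 distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(rows, partition, query, k):
    """Brute-force k nearest neighbours within one (doc_type, category) partition.

    An HNSW index would answer the same question approximately, without
    scanning every row. The partition scoping here is an assumption.
    """
    candidates = [r for r in rows if (r["doc_type"], r["category"]) == partition]
    candidates.sort(key=lambda r: euclidean(r["embedding"], query))
    return [r["id"] for r in candidates[:k]]


# Hypothetical sample rows, not taken from the PR.
rows = [
    {"id": 1, "doc_type": "pdf", "category": "news", "embedding": [0.0, 0.0]},
    {"id": 2, "doc_type": "pdf", "category": "news", "embedding": [3.0, 4.0]},
    {"id": 3, "doc_type": "pdf", "category": "news", "embedding": [1.0, 1.0]},
    {"id": 4, "doc_type": "html", "category": "news", "embedding": [0.1, 0.1]},
]
closest = nearest(rows, ("pdf", "news"), [0.0, 0.0], k=2)
```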

Comparison with INDEX AS SELECT

Both syntaxes are fully supported and produce equivalent indexes:

| Feature | INDEX AS SELECT | INDEX ON |
| --- | --- | --- |
| Regular indexes | CREATE INDEX idx AS SELECT col FROM t ORDER BY col | CREATE INDEX idx ON t(col) |
| Multi-column | CREATE INDEX idx AS SELECT a, b FROM t ORDER BY a, b | CREATE INDEX idx ON t(a, b) |
| Covering index | CREATE INDEX idx AS SELECT a, b, c FROM t ORDER BY a | CREATE INDEX idx ON t(a) INCLUDE(b, c) |
| Descending | CREATE INDEX idx AS SELECT col FROM t ORDER BY col DESC | CREATE INDEX idx ON t(col DESC) |
| Null ordering | CREATE INDEX idx AS SELECT col FROM t ORDER BY col NULLS LAST | CREATE INDEX idx ON t(col NULLS LAST) |
| Aggregates | CREATE INDEX idx AS SELECT SUM(x) FROM t GROUP BY y | CREATE VIEW v AS SELECT SUM(x) AS s, y FROM t GROUP BY y; CREATE INDEX idx ON v(y) INCLUDE(s) |

Both syntaxes use the same underlying MaterializedViewIndexGenerator, which ensures identical index structures and storage layout. The INDEX ON syntax relies on the VIEW definition introduced in #3680 to determine the subset of data to be indexed.
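
The correspondence in the comparison table above can be sketched as a pair of string builders that render the same key and value columns in either syntax. This is only an illustration of the mapping for plain value indexes: the KeyColumn type and both functions are hypothetical helpers, not part of the relational layer, and they do not reflect how MaterializedViewIndexGenerator works internally.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class KeyColumn:
    """One key column of an index definition (hypothetical helper type)."""
    name: str
    direction: Optional[str] = None  # "ASC", "DESC", or None for the default
    nulls: Optional[str] = None      # "NULLS FIRST", "NULLS LAST", or None

    def order_term(self) -> str:
        # Join only the parts that were actually specified.
        return " ".join(p for p in (self.name, self.direction, self.nulls) if p)


def index_on(index: str, table: str, keys: List[KeyColumn],
             include: Sequence[str] = ()) -> str:
    """Render the INDEX ON form: keys in parentheses, values in INCLUDE."""
    stmt = f"CREATE INDEX {index} ON {table}({', '.join(k.order_term() for k in keys)})"
    if include:
        stmt += f" INCLUDE({', '.join(include)})"
    return stmt


def index_as_select(index: str, table: str, keys: List[KeyColumn],
                    include: Sequence[str] = ()) -> str:
    """Render the INDEX AS SELECT form: keys and values are projected,
    and the key ordering moves into ORDER BY."""
    projected = ", ".join([k.name for k in keys] + list(include))
    order_by = ", ".join(k.order_term() for k in keys)
    return f"CREATE INDEX {index} AS SELECT {projected} FROM {table} ORDER BY {order_by}"


keys = [KeyColumn("a")]
on_form = index_on("idx", "t", keys, include=["b", "c"])
select_form = index_as_select("idx", "t", keys, include=["b", "c"])
```

With these inputs, on_form and select_form reproduce the two cells of the "Covering index" row.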

For index types that are already supported, the choice of syntax is purely a matter of preference and familiarity. Newly introduced index types, such as vector indexes, will be available only through the new INDEX ON syntax.

This resolves #3786.

@hatyo hatyo force-pushed the new-index-on-source-syntax branch 2 times, most recently from 921fb5c to 88d84f0 Compare November 28, 2025 15:11
@hatyo hatyo force-pushed the new-index-on-source-syntax branch from 88d84f0 to 2d4fc70 Compare November 28, 2025 16:32
@hatyo hatyo added enhancement New feature or request relational issues related to relational FDB labels Nov 28, 2025
@hatyo hatyo marked this pull request as ready for review December 2, 2025 16:06
@hatyo hatyo force-pushed the new-index-on-source-syntax branch from 9c0f6df to 79f6255 Compare December 2, 2025 21:54

github-actions bot commented Dec 2, 2025

📊 Metrics Diff Analysis Report

Summary

  • New queries: 104
  • Dropped queries: 0
  • Plan changed + metrics changed: 0
  • Plan unchanged + metrics changed: 0
ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

  • New queries: Queries added in this PR
  • Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
  • Plan changed + metrics changed: The query plan has changed along with planner metrics.
  • Plan unchanged + metrics changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.
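
The categories above can be sketched as a small classifier. This is a guess at the logic implied by the category descriptions, not the bot's actual implementation: the function name, the (plan, metrics) tuple shape, and the "not reported" fallback are all assumptions.

```python
def classify(base, pr):
    """Classify one query from its (plan, metrics) pair on each branch.

    `base` and `pr` are None when the query is absent on that branch,
    otherwise a (plan, metrics) tuple. Hypothetical shape for illustration.
    """
    if base is None:
        return "new"
    if pr is None:
        return "dropped"
    plan_changed = base[0] != pr[0]
    metrics_changed = base[1] != pr[1]
    if plan_changed and metrics_changed:
        return "plan changed + metrics changed"
    if metrics_changed:
        return "plan unchanged + metrics changed"
    # Identical entries (and plan-only changes) fall outside the four
    # categories listed in the report.
    return "not reported"


# Hypothetical entries, not taken from the report.
old = ("SCAN(idx)", {"tasks": 10})
new = ("SCAN(idx)", {"tasks": 12})
category = classify(old, new)
```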

New Queries

Count of new queries by file:

  • yaml-tests/src/test/resources/index-ddl-aggregates-only.metrics.yaml: 28
  • yaml-tests/src/test/resources/index-ddl-values-only.metrics.yaml: 34
  • yaml-tests/src/test/resources/index-ddl.metrics.yaml: 42

Development

Successfully merging this pull request may close these issues.

Introduce INDEX ON syntax and use it to implement indexes of vector types
