Draft
34 commits
3cb15a2
support lumina
jerry-024 Mar 2, 2026
0a20c81
update
jerry-024 Mar 3, 2026
0f64b8c
update
jerry-024 Mar 3, 2026
a59205f
add e2e test
jerry-024 Mar 3, 2026
faffb66
fix build
jerry-024 Mar 3, 2026
8b96857
fix build
jerry-024 Mar 3, 2026
3eb43a8
fix build
jerry-024 Mar 3, 2026
30d58b4
fix build
jerry-024 Mar 3, 2026
562bb39
fix build
jerry-024 Mar 3, 2026
2b9e099
fix build
jerry-024 Mar 3, 2026
98e72f5
fix build
jerry-024 Mar 3, 2026
483f278
fix build
jerry-024 Mar 3, 2026
9d846c1
fix build
jerry-024 Mar 3, 2026
b69417c
fix build
jerry-024 Mar 3, 2026
643cd13
Merge branch 'master' into support_lumina
jerry-024 Mar 3, 2026
1a5cd6f
fix build and add jni test
jerry-024 Mar 3, 2026
973254f
add jni test
jerry-024 Mar 5, 2026
186ed45
add lumina test workflow
jerry-024 Mar 5, 2026
962ec98
add lumina test workflow
jerry-024 Mar 5, 2026
8b62ef1
add lumina test workflow
jerry-024 Mar 5, 2026
d6b1fd1
remove jni module
jerry-024 Mar 5, 2026
b57993e
add oss test
jerry-024 Mar 5, 2026
888fba2
refactor
jerry-024 Mar 6, 2026
80fec05
add README
jerry-024 Mar 6, 2026
39e16c1
fix
jerry-024 Mar 6, 2026
c3417b3
fix
jerry-024 Mar 9, 2026
a19e97e
fix
jerry-024 Mar 9, 2026
38281d2
fix
jerry-024 Mar 9, 2026
c4f0142
fix
jerry-024 Mar 9, 2026
f722cf1
fix
jerry-024 Mar 9, 2026
aae2cdb
Merge remote-tracking branch 'upstream/master' into support_lumina
jerry-024 Mar 10, 2026
e0b5240
fix
jerry-024 Mar 10, 2026
5ea9c6d
add benchmark
jerry-024 Mar 11, 2026
ea07fef
insert use dataset
jerry-024 Mar 11, 2026
35 changes: 35 additions & 0 deletions paimon-lumina/README.md
@@ -0,0 +1,35 @@
## Paimon Lumina

This module integrates [Lumina](https://github.com/alibaba/paimon-cpp/tree/main/third_party/lumina)
as a vector index for Apache Paimon's global index framework.

The Lumina vector search library is derived from an internal repository maintained by
the Alibaba Storage Service Team. It is accessed via JNI through the `lumina-jni` artifact.

### Supported Index Types

| Index Type | Description |
|------------|-------------|
| **DISKANN** | DiskANN graph-based index (default) |

### Supported Vector Metrics

| Metric | Description |
|--------|-------------|
| **L2** | Euclidean distance (default) |
| **COSINE** | Cosine distance |
| **INNER_PRODUCT** | Dot product |
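
As a rough illustration of what the three metrics compute, here is a standalone Java sketch. It is not Lumina's implementation: the class name `VectorMetricsSketch` is hypothetical, and the definition of cosine distance as `1 - cosine similarity` is an assumption (libraries differ, and some return squared L2 instead of the square root).

```java
// Standalone illustration only; Lumina's native code may define or scale these differently.
final class VectorMetricsSketch {

    /** Euclidean (L2) distance between two equal-length vectors. */
    static double l2(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Inner (dot) product; larger values mean more similar vectors. */
    static double innerProduct(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** Cosine distance, assumed here to be 1 - cosine similarity. */
    static double cosineDistance(float[] a, float[] b) {
        double normA = Math.sqrt(innerProduct(a, a));
        double normB = Math.sqrt(innerProduct(b, b));
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine distance is undefined for a zero vector");
        }
        return 1.0 - innerProduct(a, b) / (normA * normB);
    }

    private static void checkSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
    }

    public static void main(String[] args) {
        float[] a = {3f, 0f};
        float[] b = {0f, 4f};
        System.out.println(l2(a, b));             // 5.0
        System.out.println(innerProduct(a, b));   // 0.0
        System.out.println(cosineDistance(a, b)); // 1.0 (orthogonal vectors)
    }
}
```

Note that L2 and cosine are distances (smaller is closer), while the inner product is a similarity (larger is closer), so result ordering depends on the configured metric.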

### Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `vector.dim` | int | 128 | Vector dimension |
| `vector.metric` | enum | L2 | Distance metric |
| `vector.index-type` | enum | DISKANN | Index type |
| `vector.encoding-type` | string | rawf32 | Encoding type (rawf32, sq8, pq) |
| `vector.size-per-index` | int | 2,000,000 | Max vectors per index file |
| `vector.training-size` | int | 500,000 | Vectors used for pretraining |
| `vector.search-factor` | int | 10 | Multiplier for search limit when filtering |
| `vector.diskann.search-list-size` | int | 100 | DiskANN search list size |
| `vector.pretrain-sample-ratio` | double | 1.0 | Pretrain sample ratio |
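
The options above are plain key/value pairs. The Java sketch below collects the documented defaults in a map and works through two pieces of arithmetic they imply. The option keys and default values come from the table; the class `LuminaOptionsSketch`, its method names, and the reading of `vector.search-factor` as "fetch `topK * factor` candidates before filtering" are assumptions for illustration. How the options are actually supplied to a table is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: keys and defaults mirror the table, the rest is illustrative.
final class LuminaOptionsSketch {

    /** Returns a mutable map holding the documented default values. */
    static Map<String, String> defaults() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("vector.dim", "128");
        options.put("vector.metric", "L2");
        options.put("vector.index-type", "DISKANN");
        options.put("vector.encoding-type", "rawf32");
        options.put("vector.size-per-index", "2000000");
        options.put("vector.training-size", "500000");
        options.put("vector.search-factor", "10");
        options.put("vector.diskann.search-list-size", "100");
        options.put("vector.pretrain-sample-ratio", "1.0");
        return options;
    }

    /** Assumed meaning of the search factor: candidates fetched before a filter is applied. */
    static long candidateLimit(int topK, int searchFactor) {
        return Math.multiplyExact((long) topK, (long) searchFactor);
    }

    /** Minimum number of index files if each file holds at most sizePerIndex vectors. */
    static long minIndexFiles(long totalVectors, long sizePerIndex) {
        if (totalVectors < 0 || sizePerIndex <= 0) {
            throw new IllegalArgumentException("totalVectors must be >= 0 and sizePerIndex > 0");
        }
        return (totalVectors + sizePerIndex - 1) / sizePerIndex; // ceiling division
    }

    public static void main(String[] args) {
        Map<String, String> options = defaults();
        // Override only what differs from the defaults.
        options.put("vector.dim", "768");
        options.put("vector.metric", "COSINE");

        int factor = Integer.parseInt(options.get("vector.search-factor"));
        long sizePerIndex = Long.parseLong(options.get("vector.size-per-index"));

        System.out.println(candidateLimit(10, factor));              // 100
        System.out.println(minIndexFiles(5_000_000L, sizePerIndex)); // 3
    }
}
```

With the defaults, a top-10 query with a filter would consider up to 100 candidates under this reading, and 5,000,000 vectors need at least three index files.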
146 changes: 146 additions & 0 deletions paimon-lumina/pom.xml
@@ -0,0 +1,146 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<artifactId>paimon-parent</artifactId>
<groupId>org.apache.paimon</groupId>
<version>1.4-SNAPSHOT</version>
</parent>

<artifactId>paimon-lumina</artifactId>
Contributor

A single paimon-lumina module is enough; there is no need for separate index and e2e modules.

Contributor

Please add a README.md to this module that explains what Lumina is.

<name>Paimon : Lumina Index</name>

<repositories>
<repository>
<id>lumina</id>
<url>https://lumina-binary.oss-cn-shanghai.aliyuncs.com/mvn-repo/</url>
</repository>
<repository>
<id>jindodata</id>
<url>https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/mvn-repo/</url>
</repository>
</repositories>
Comment on lines +34 to +43
Copilot AI Mar 9, 2026

This module adds a custom Maven repository (https://lumina-binary.oss-cn-shanghai.aliyuncs.com/mvn-repo/). This has build reproducibility and supply-chain implications (and may violate ASF/release expectations if artifacts aren't in Maven Central / ASF repos). If possible, depend on artifacts published to Maven Central (or an ASF-managed repo), or gate this repository behind an explicit Maven profile so default builds don’t rely on an extra remote repository.

Suggested change
<repositories>
<repository>
<id>lumina</id>
<url>https://lumina-binary.oss-cn-shanghai.aliyuncs.com/mvn-repo/</url>
</repository>
</repositories>
<profiles>
<profile>
<id>lumina-repo</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<repositories>
<repository>
<id>lumina</id>
<url>https://lumina-binary.oss-cn-shanghai.aliyuncs.com/mvn-repo/</url>
</repository>
</repositories>
</profile>
</profiles>


<dependencies>
<dependency>
<groupId>org.apache.paimon</groupId>
<artifactId>paimon-common</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>org.aliyun.lumina</groupId>
<artifactId>lumina-jni</artifactId>
<version>0.1.0</version>
</dependency>

<dependency>
<groupId>org.apache.paimon</groupId>
<artifactId>paimon-shade-jackson-2</artifactId>
<version>${paimon.shade.jackson.version}-${paimon.shade.version}</version>
</dependency>

<!-- test dependencies -->
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<version>${junit5.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.paimon</groupId>
<artifactId>paimon-core</artifactId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.paimon</groupId>
<artifactId>paimon-format</artifactId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.paimon</groupId>
<artifactId>paimon-test-utils</artifactId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>

<dependency>
<groupId>org.apache.paimon</groupId>
<artifactId>paimon-jindo</artifactId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>com.aliyun.jindodata</groupId>
<artifactId>jindo-core</artifactId>
<version>6.9.1</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>com.aliyun.jindodata</groupId>
<artifactId>jindo-sdk</artifactId>
<version>6.9.1</version>
<scope>test</scope>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<excludes>
<exclude>**/*Benchmark*</exclude>
</excludes>
</configuration>
</plugin>
</plugins>
</build>
</project>