CAMEL-21540: Add PGVector component for PostgreSQL vector database by gnodet · Pull Request #22207 · apache/camel

gnodet · 2026-03-23T20:53:13Z

Summary

Implements CAMEL-21540: Vector Database capabilities for PostgreSQL.

New camel-pgvector component under components/camel-ai/ that provides vector similarity search capabilities using the PostgreSQL pgvector extension
Supports actions: CREATE_TABLE, DROP_TABLE, UPSERT, DELETE, SIMILARITY_SEARCH
Uses JDBC with the com.pgvector:pgvector library for pgvector type support
Configurable distance types: cosine (default), euclidean, inner product
LangChain4j data type transformers (pgvector:embeddings and pgvector:rag) for RAG pipeline integration
Integration tests using testcontainers pgvector image
LangChain4j embeddings integration test with AllMiniLmL6V2 embedding model
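
For readers unfamiliar with pgvector, the three distance types listed above correspond to SQL operators that pgvector defines: `<=>` (cosine distance), `<->` (Euclidean distance) and `<#>` (negative inner product). The following is a minimal sketch of how a similarity query could be assembled from a distance type. It is not the component's actual code: the class name, the enum, and the table and column names (`embeddings`, `id`, `text_content`, `embedding`) are assumptions made for illustration.

```java
import java.util.StringJoiner;

final class PgVectorSqlSketch {

    /** pgvector distance operators; this enum is illustrative, not the component's own type. */
    enum Distance {
        COSINE("<=>"),
        EUCLIDEAN("<->"),
        INNER_PRODUCT("<#>"); // pgvector returns the NEGATIVE inner product for <#>

        final String operator;

        Distance(String operator) {
            this.operator = operator;
        }
    }

    /** Formats a float array as a pgvector text literal such as [1.0,2.5]. */
    static String toVectorLiteral(float[] values) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (float v : values) {
            joiner.add(Float.toString(v));
        }
        return joiner.toString();
    }

    /** Builds a nearest-neighbour query; table and column names are hypothetical. */
    static String similaritySearchSql(String table, Distance distance) {
        return "SELECT id, text_content, embedding " + distance.operator
                + " ?::vector AS distance FROM " + table
                + " ORDER BY distance LIMIT ?";
    }

    public static void main(String[] args) {
        System.out.println(toVectorLiteral(new float[] {1f, 2.5f}));
        System.out.println(similaritySearchSql("embeddings", Distance.COSINE));
    }
}
```

Ordering by the operator result in ascending order returns the closest vectors first for all three operators, which is why the inner product variant is negated by pgvector.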

Test plan

PgVectorComponentIT (7 tests) - CRUD operations and similarity search
LangChain4jEmbeddingsComponentPgVectorTargetIT (4 tests) - end-to-end LangChain4j embedding + pgvector integration
Code formatted and imports sorted
Generated files committed

github-actions · 2026-03-23T20:53:46Z

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

First-time contributors require MANUAL approval for the GitHub Actions to run
You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.
Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

github-actions · 2026-03-24T00:02:16Z

🧪 CI tested the following changed modules:

bom/camel-bom
catalog/camel-allcomponents
catalog/camel-catalog
components/camel-ai
components/camel-ai/camel-langchain4j-embeddings
components/camel-ai/camel-pgvector
core/camel-main
docs
dsl/camel-componentdsl
dsl/camel-endpointdsl
dsl/camel-kamelet-main
parent
tooling/maven/camel-package-maven-plugin

ℹ️ Dependent modules were not tested because the total number of affected modules exceeded the threshold (50). Use the test-dependents label to force testing all dependents.

Build reactor — dependencies compiled but only changed modules were tested (26 modules)

Camel :: AI :: LangChain4j :: Embedding
Camel :: AI :: LangChain4j :: Embedding [jar]
Camel :: AI :: PGVector
Camel :: AI :: PGVector [jar]
Camel :: AI :: Parent
Camel :: AI :: Parent [pom]
Camel :: All Components Sync point
Camel :: All Components Sync point [pom]
Camel :: BOM
Camel :: BOM [pom]
Camel :: Catalog :: Camel Catalog
Camel :: Catalog :: Camel Catalog [jar]
Camel :: Component DSL
Camel :: Component DSL [jar]
Camel :: Docs
Camel :: Docs [pom]
Camel :: Endpoint DSL
Camel :: Endpoint DSL [jar]
Camel :: Kamelet Main
Camel :: Kamelet Main [jar]
Camel :: Main
Camel :: Main [jar]
Camel :: Maven Plugins :: Camel Maven Package
Camel :: Maven Plugins :: Camel Maven Package [maven-plugin]
Camel :: Parent
Camel :: Parent [pom]

apupier

What is the reason for a separate Postgres component specific to vectors? Why not include these capabilities directly in the existing Postgres component?

components/camel-ai/camel-pgvector/pom.xml

gnodet

Claude Code on behalf of Guillaume Nodet

Good question! There are several reasons for keeping pgvector as a separate component under camel-ai/ rather than adding it to the existing PostgreSQL components:

Consistency with other vector DB components. All vector database components live under camel-ai/: Milvus, Qdrant, Pinecone, Weaviate, Neo4j. pgvector serves the same purpose — vector similarity search for AI/RAG pipelines — it just happens to use PostgreSQL as its storage engine. Users looking for vector database components would naturally look in the AI group.
Different use case and audience. The existing PG components (camel-pgevent for LISTEN/NOTIFY, camel-pg-replication-slot for WAL replication) are PostgreSQL-specific integration utilities. pgvector targets a completely different use case: embeddings storage, similarity search, and RAG pipelines. These serve different personas.
Additional dependencies. pgvector brings in com.pgvector:pgvector (for vector type support) and LangChain4j data type transformers — AI-oriented dependencies that would be out of place in a general PostgreSQL component.
Separate lifecycle. Keeping it as its own component allows it to evolve independently without affecting the existing PostgreSQL components.

This is the same approach taken by other frameworks — for example, LangChain and LlamaIndex treat pgvector as a vector store alongside Pinecone/Qdrant/etc., not as a PostgreSQL utility.

- New camel-pgvector component under components/camel-ai/
- Supports CREATE_TABLE, DROP_TABLE, UPSERT, DELETE, SIMILARITY_SEARCH actions
- Uses PostgreSQL pgvector extension via JDBC with com.pgvector library
- Supports cosine, euclidean, and inner product distance types
- LangChain4j data type transformers: pgvector:embeddings and pgvector:rag
- Integration tests with testcontainers pgvector image
- LangChain4j embeddings integration test with AllMiniLmL6V2 model

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove pgvector from alphabetical nav listing (grouped under AI only)
- Fix trailing spaces in javadoc blank comment lines
- Update EventEndpointBuilderFactory to CamelEventEndpointBuilderFactory
- Update rest-openapi description

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add CREATE_INDEX action to create HNSW indexes for faster approximate nearest neighbor search, using the configured distance type
- Add CamelPgVectorFilter header to apply SQL WHERE clause filtering on similarity search results (e.g., filter by metadata or text content)
- Add integration tests for both features

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
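
As an illustration of the two features in this commit, the sketch below shows the kind of SQL they plausibly translate to. pgvector's HNSW indexes take an operator class that must match the distance used at query time (`vector_cosine_ops`, `vector_l2_ops`, `vector_ip_ops`). Everything else here is an assumption: the class and method names, the distance type strings, the index naming scheme, and the table and column names are hypothetical and not taken from the component.

```java
final class PgVectorIndexSketch {

    /** Maps a distance type name (hypothetical strings) to the pgvector operator class. */
    static String operatorClass(String distanceType) {
        switch (distanceType) {
            case "cosine":
                return "vector_cosine_ops";
            case "euclidean":
                return "vector_l2_ops";
            case "inner_product":
                return "vector_ip_ops";
            default:
                throw new IllegalArgumentException("Unknown distance type: " + distanceType);
        }
    }

    /** Builds the DDL for an HNSW index; the index name format is an assumption. */
    static String createHnswIndexSql(String table, String column, String distanceType) {
        return "CREATE INDEX IF NOT EXISTS " + table + "_" + column + "_hnsw_idx ON " + table
                + " USING hnsw (" + column + " " + operatorClass(distanceType) + ")";
    }

    /** Appends an optional WHERE clause (as a filter header might supply) to a cosine search. */
    static String filteredSearchSql(String table, String filter) {
        String where = (filter == null || filter.trim().isEmpty()) ? "" : " WHERE " + filter;
        return "SELECT id, embedding <=> ?::vector AS distance FROM " + table + where
                + " ORDER BY distance LIMIT ?";
    }

    public static void main(String[] args) {
        System.out.println(createHnswIndexSql("embeddings", "embedding", "cosine"));
        System.out.println(filteredSearchSql("embeddings", "text_content LIKE '%camel%'"));
    }
}
```

Because the filter is described as a raw SQL WHERE clause, a route would presumably want to build it only from trusted input rather than from user-supplied text.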

When a new dependency is added to parent/pom.xml, the diff contains structural XML elements like <groupId>, <artifactId>, <version> which were incorrectly extracted as "changed properties" by detectChangedProperties. This caused the script to search for modules using ${artifactId} or ${groupId} as property references, which either matched nothing useful or caused spurious failures.

Fix: filter out known structural POM element names (groupId, artifactId, version, scope, type, etc.) so only actual property names like "pgvector-version" or "openai-java-version" are detected.

Fixes the CI script bug seen in PR #22207 where adding a new component to parent/pom.xml caused the dependency detection to fail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
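
The filtering idea described in this commit can be sketched as follows. This is not the CI script itself, whose language and implementation are not shown on this page. The class name, the regular expression, the exact set of structural names, and the choice to look only at added (`+`) lines are assumptions made for the example; only the function name detectChangedProperties comes from the commit message.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChangedPropertiesSketch {

    // Structural POM element names that must never be treated as property names (assumed list).
    private static final Set<String> STRUCTURAL = Set.of(
            "groupId", "artifactId", "version", "scope", "type",
            "classifier", "optional");

    // An added diff line holding a simple element: "+   <name>value</name>".
    private static final Pattern ADDED_ELEMENT =
            Pattern.compile("\\+\\s*<([A-Za-z0-9_.-]+)>[^<]*</\\1>\\s*");

    /** Returns element names on added lines that look like real properties. */
    static Set<String> detectChangedProperties(String diff) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : diff.split("\\R")) {
            Matcher m = ADDED_ELEMENT.matcher(line);
            if (m.matches() && !STRUCTURAL.contains(m.group(1))) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Sample diff with a placeholder version value, not a real release number.
        String diff = "+++ b/parent/pom.xml\n"
                + "+        <pgvector-version>x.y.z</pgvector-version>\n"
                + "+            <groupId>com.pgvector</groupId>\n"
                + "+            <artifactId>pgvector</artifactId>\n"
                + "+            <version>${pgvector-version}</version>\n";
        System.out.println(detectChangedProperties(diff));
    }
}
```

Without the STRUCTURAL filter, the same sample diff would also yield groupId, artifactId and version, which is the failure mode the commit message describes.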

github-actions bot added components catalog docs tooling tooling-maven core-build-and-dependencies components-ai core dsl labels Mar 23, 2026

gnodet marked this pull request as draft March 23, 2026 21:21

gnodet marked this pull request as ready for review March 24, 2026 05:53

apupier reviewed Mar 24, 2026

gnodet commented Mar 24, 2026

oscerd approved these changes Mar 24, 2026

gnodet and others added 11 commits March 24, 2026 23:22

CAMEL-21540: Add generated files for PGVector component

bd22903

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CAMEL-21540: Add generated files and fix docs for PGVector component

bad9eb9

- Add BOM, catalog, DSL, and documentation generated files
- Remove Spring Boot starter reference (no starter yet)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CAMEL-21540: Fix docs symlinks and nav ordering for pgvector component

de0091d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CAMEL-21540: Fix generated files - restore missing components from main and add pgvector entries

3f57efd

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CAMEL-21540: Fix springEvent to use EventEndpointBuilderFactory

8fc4053

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CAMEL-21540: Update generated files for new filter header and create index action

76ceb9a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CAMEL-21540: Fix docs gulp race condition with dsl target directories

903db6d

Replace ** glob with {*,*/*} for dsl source pattern to prevent scandir of target/ directories created during parallel builds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CAMEL-21540: Include dsl/src/main/docs in gulpfile source pattern

d595392

The dsl.adoc lives at dsl/src/main/docs/ (depth 0), which is not matched by {*,*/*}. Add explicit pattern for it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
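
The depth argument in the two gulpfile commits above can be checked with a small experiment. The sketch uses Java's built-in glob matcher, which is not the glob library gulp uses but treats `*`, `**` and `{a,b}` the same way for these cases. The full patterns and file paths are hypothetical, since the actual gulpfile entries are not shown on this page.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class GlobDepthSketch {

    /** Returns true if the given relative path matches the glob pattern. */
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        String bounded = "dsl/{*,*/*}/src/main/docs/*.adoc";   // one or two directory levels
        String unbounded = "dsl/**/src/main/docs/*.adoc";       // any depth, including target/

        // Depth 1 and depth 2 are matched by the bounded pattern.
        System.out.println(matches(bounded, "dsl/camel-endpointdsl/src/main/docs/a.adoc"));
        System.out.println(matches(bounded, "dsl/a/b/src/main/docs/a.adoc"));

        // Depth 0 (dsl/src/main/docs) is NOT matched, hence the extra explicit pattern.
        System.out.println(matches(bounded, "dsl/src/main/docs/dsl.adoc"));
        System.out.println(matches("dsl/src/main/docs/*.adoc", "dsl/src/main/docs/dsl.adoc"));

        // Only the unbounded pattern reaches into deeper build output directories.
        System.out.println(matches(unbounded, "dsl/a/target/classes/src/main/docs/a.adoc"));
        System.out.println(matches(bounded, "dsl/a/target/classes/src/main/docs/a.adoc"));
    }
}
```

The bounded pattern limits how deep the matcher needs to look, which is the property the first commit relies on to avoid scanning target/ directories during parallel builds.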

gnodet force-pushed the hungry-quark branch from dc85b19 to d595392 Compare March 24, 2026 22:23

gnodet mentioned this pull request Mar 25, 2026

chore(ci): unify CI test infrastructure #22247

Open

5 tasks

gnodet wants to merge 11 commits into main from hungry-quark

3 participants
