feat: AI embeddings for feedback records by xernobyl · Pull Request #38 · formbricks/hub

xernobyl · 2026-02-23T16:33:38Z

What does this PR do?

Adds embeddings for feedback records: when a feedback record is created or updated and has non-empty value_text, the system enqueues a job to generate an embedding via a configurable provider (OpenAI or Google Gemini) and stores it in a dedicated embeddings table (pgvector). This keeps embedding data out of the main feedback-records read path and supports multiple models per record.
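The enqueue rule described in this PR can be sketched in Go roughly as follows. This is a minimal illustration, not the PR's code: the Event struct, its fields, and shouldEnqueue are hypothetical names, and treating whitespace-only text as empty is an assumption. Only the two event names and the value_text field come from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// Event names as listed in the PR description.
const (
	EventCreated = "feedback_record.created"
	EventUpdated = "feedback_record.updated"
)

// Event is a hypothetical, simplified event payload used only for this sketch.
type Event struct {
	Type          string
	RecordID      string
	ValueText     string
	ChangedFields []string // only meaningful for update events
}

// shouldEnqueue reports whether an embedding job should be enqueued for ev.
// Assumption: text that is empty after trimming whitespace counts as empty.
func shouldEnqueue(ev Event) bool {
	if strings.TrimSpace(ev.ValueText) == "" {
		return false
	}
	switch ev.Type {
	case EventCreated:
		return true
	case EventUpdated:
		// Re-embed only when value_text itself was changed.
		for _, f := range ev.ChangedFields {
			if f == "value_text" {
				return true
			}
		}
	}
	return false
}

func main() {
	events := []Event{
		{Type: EventCreated, RecordID: "r1", ValueText: "Love the new dashboard"},
		{Type: EventCreated, RecordID: "r2", ValueText: ""},
		{Type: EventUpdated, RecordID: "r3", ValueText: "Updated text", ChangedFields: []string{"value_text"}},
		{Type: EventUpdated, RecordID: "r4", ValueText: "Same text", ChangedFields: []string{"source_type"}},
	}
	for _, ev := range events {
		fmt.Printf("%s %s enqueue=%v\n", ev.Type, ev.RecordID, shouldEnqueue(ev))
	}
}
```

Keeping the decision in one pure function makes it easy to unit test without a queue or database.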

Highlights:

- Event-driven: EmbeddingProvider subscribes to feedback_record.created and feedback_record.updated (when value_text is in changed fields). Jobs are enqueued to a dedicated River queue (embeddings) so embedding work does not starve webhook delivery.
- Pluggable providers: Single event provider and worker; embedding API is behind an EmbeddingClient interface. Implementations: OpenAI (text-embedding-3-small, configurable dimensions) and Google Gemini (gemini-embedding-001, via google.golang.org/genai). API key is optional (e.g. for local AI).
- Schema: New embeddings table: id, feedback_record_id, embedding (vector), model, created_at, updated_at; unique on (feedback_record_id, model); ON DELETE CASCADE from feedback_records.
- Backfill: cmd/backfill-embeddings enqueues jobs for existing records that have value_text but no embedding for the configured model. Requires EMBEDDING_PROVIDER and EMBEDDING_MODEL (no defaults).
- Config: EMBEDDING_PROVIDER (openai | google), EMBEDDING_MODEL, EMBEDDING_PROVIDER_API_KEY (optional), EMBEDDING_DIMENSIONS (default 1536), EMBEDDING_MAX_CONCURRENT, EMBEDDING_MAX_ATTEMPTS. Supported providers are kept in a map for easy extension.
- Observability: Embedding metrics (enqueued, outcomes, duration, errors) and structured logging; worker retries on transient errors and skips retry on not-found.

Fixes #(issue)

No API contract changes: embeddings are not exposed on feedback-record list/get. They are stored for future use (e.g. semantic search).

How should this be tested?

Unit: go test ./internal/... ./cmd/... (includes embedding_provider_test.go, worker tests).
Integration: make tests (requires Postgres; tests/ use feedback records service with embedding model; DB must have embeddings table from migration).
Manual:
1. make init-db (and make river-migrate if using River UI) so embeddings exists.
2. Set in .env: EMBEDDING_PROVIDER=openai, EMBEDDING_MODEL=text-embedding-3-small, EMBEDDING_PROVIDER_API_KEY=sk-... (or use google and a Gemini API key; or leave key empty for a no-key provider).
3. make run; create a feedback record with value_text; confirm embedding job is enqueued and processed (logs: "embedding: job enqueued", "embedding: stored") and a row appears in embeddings.
4. Backfill: EMBEDDING_PROVIDER=openai EMBEDDING_MODEL=text-embedding-3-small DATABASE_URL=... go run ./cmd/backfill-embeddings (both env vars required).
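The environment requirements used in these steps could be validated along the following lines. This is a hedged sketch under stated assumptions: EmbeddingConfig, loadEmbeddingConfig, and the getenv parameter are hypothetical names, and only the variable names and the two provider values are taken from this PR. Dimension and concurrency settings are left out on purpose.

```go
package main

import (
	"errors"
	"fmt"
)

// supportedProviders lists the provider values named in the PR description.
var supportedProviders = map[string]bool{"openai": true, "google": true}

// EmbeddingConfig is a hypothetical config struct for this sketch.
type EmbeddingConfig struct {
	Provider string
	Model    string
	APIKey   string // optional, e.g. empty for a local no-key provider
}

// loadEmbeddingConfig reads settings through getenv (os.Getenv in real use)
// and requires both provider and model, as the backfill command does.
func loadEmbeddingConfig(getenv func(string) string) (EmbeddingConfig, error) {
	cfg := EmbeddingConfig{
		Provider: getenv("EMBEDDING_PROVIDER"),
		Model:    getenv("EMBEDDING_MODEL"),
		APIKey:   getenv("EMBEDDING_PROVIDER_API_KEY"),
	}
	if cfg.Provider == "" {
		return EmbeddingConfig{}, errors.New("EMBEDDING_PROVIDER is required")
	}
	if !supportedProviders[cfg.Provider] {
		return EmbeddingConfig{}, fmt.Errorf("unsupported EMBEDDING_PROVIDER %q", cfg.Provider)
	}
	if cfg.Model == "" {
		return EmbeddingConfig{}, errors.New("EMBEDDING_MODEL is required")
	}
	return cfg, nil
}

func main() {
	env := map[string]string{
		"EMBEDDING_PROVIDER": "openai",
		"EMBEDDING_MODEL":    "text-embedding-3-small",
	}
	cfg, err := loadEmbeddingConfig(func(k string) string { return env[k] })
	fmt.Println(cfg.Provider, cfg.Model, err)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the validation testable without touching the process environment.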

Checklist

Required

Appreciated

If API changed: added or updated OpenAPI spec and ran contract tests (make tests or API contract workflow)
If API behavior changed: added request/response examples or Swagger UI screenshots to this PR
Updated docs in docs/ if changes were necessary
Ran make tests-coverage for meaningful logic changes

github-actions · 2026-02-23T16:34:32Z

✱ Stainless preview builds

This PR will update the hub SDKs with the following commit message.

feat: Embeddings

Edit this comment to update it. It will appear in the SDK's changelogs.

✅ hub-openapi studio · code · diff

Your SDK built successfully.
generate ✅ (prev: generate ⚠️)

✅ hub-typescript studio · code · diff

Your SDK built successfully.
generate ✅ (prev: generate ⚠️) → build ✅ → lint ✅ → test ✅
npm install https://pkg.stainless.com/s/hub-typescript/d42485dcb0ddb449d9cd52a5e219faef023e9549/dist.tar.gz

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-02-23 17:06:50 UTC

mattinannt

@xernobyl Thank you for the PR :-)

Please always add the ticket link to the PR description so that ticket and PR get linked.

So far I have only done a quick review, starting with a first AI-based pass. We may also need to discuss how to proceed with supporting multiple AI/embedding providers, as this might be needed much sooner than we think.

internal/service/embedding_provider.go

internal/workers/feedback_embedding.go

internal/repository/feedback_records_repository.go

BhagyaAmarasinghe

Thanks for the PR. I have commented on some issues I noticed.
Could you also check the two points below:

- In feedback_records.go, CreateFeedbackRecordRequest.SubmissionID is a non-pointer string with validate:"required". Every existing API consumer that doesn't send submission_id will now get a 400 validation error. This is a breaking change that should be called out in a changelog or migration guide, or the field should be made optional with a server-generated default.
- Migration 003 adds a NOT NULL column with no default. If the table has any existing rows, this ALTER TABLE will fail, because PostgreSQL cannot add a NOT NULL column without a DEFAULT value to a table with existing data. The migration needs a strategy: add the column as nullable, backfill it (e.g. set submission_id = id::text), then add the NOT NULL constraint.
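The three-step migration strategy suggested above could look roughly like this, written as an ordered list of SQL statements in Go. It is only a sketch: the TEXT column type and the submissionIDMigration helper are assumptions, and the actual migration file in the repository may differ.

```go
package main

import "fmt"

// submissionIDMigration returns the statements in the order they must run.
// Assumption: submission_id is a TEXT column on feedback_records.
func submissionIDMigration() []string {
	return []string{
		// 1. Add the column as nullable so existing rows stay valid.
		`ALTER TABLE feedback_records ADD COLUMN submission_id TEXT`,
		// 2. Backfill existing rows, here using the record's own id.
		`UPDATE feedback_records SET submission_id = id::text WHERE submission_id IS NULL`,
		// 3. Only now enforce NOT NULL, once no NULL values remain.
		`ALTER TABLE feedback_records ALTER COLUMN submission_id SET NOT NULL`,
	}
}

func main() {
	for i, stmt := range submissionIDMigration() {
		fmt.Printf("%d. %s;\n", i+1, stmt)
	}
}
```

Running the three statements inside one transaction would keep the table from being left half-migrated if the backfill fails.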

cmd/api/app.go

internal/workers/feedback_embedding.go

internal/openai/client.go

cmd/backfill-embeddings/main.go

migrations/004_add_feedback_records_embedding.sql

internal/config/config.go

xernobyl · 2026-02-25T11:37:20Z

@BhagyaAmarasinghe the submission_id issues can be fixed in a separate PR.

mattinannt

@xernobyl Thank you for updating the PR. I have a few concerns regarding the embedding model provider and indexes. Let's discuss.

migrations/004_add_feedback_records_embedding.sql

cmd/api/app.go

feat: embeddings

80401aa

chore: additional tests and logging

166be90

xernobyl changed the title ~~feat: Embeddings~~ feat: OpenAI embeddings for feedback records Feb 23, 2026

xernobyl marked this pull request as ready for review February 23, 2026 17:13

xernobyl requested review from BhagyaAmarasinghe and mattinannt February 23, 2026 17:13

mattinannt requested changes Feb 24, 2026

internal/service/embedding_provider.go (resolved)

internal/workers/feedback_embedding.go (resolved)

internal/repository/feedback_records_repository.go (outdated, resolved)

chore: PR fix

ed362b6

xernobyl requested a review from mattinannt February 24, 2026 11:25

xernobyl added 9 commits February 24, 2026 15:27

chore: new embedding model & google ai support

f6f4d0c

chore: go mod tidy

03d39e4

chore: fix tests postgres string

a7d0e00

chore: fix test

8be0a96

chore: change datatype to dynamic vector size instead of fixes size

74a6fb1

chore: added indexes

e063560

Merge branch 'main' into feat/embeddings

38c35c9

chore: rename migrations

dbf10ed

chore: fix migration

c2603b3

BhagyaAmarasinghe requested changes Feb 25, 2026

chore: migrated to half vecs; assorted changes

0fd25b2

xernobyl requested a review from BhagyaAmarasinghe February 25, 2026 11:37

BhagyaAmarasinghe approved these changes Feb 25, 2026

mattinannt requested changes Feb 25, 2026

migrations/004_add_feedback_records_embedding.sql (outdated, resolved)

cmd/api/app.go (resolved)

chore: clean defaults, no provider = no embeddings

52fb78e

xernobyl changed the title ~~feat: OpenAI embeddings for feedback records~~ feat: AI embeddings for feedback records Feb 25, 2026

xernobyl added 3 commits February 25, 2026 17:47

chore: fmt

7801722

chore: switch to fixed vector dimension (768)

b38f92d

chore: fix migration

994449d

xernobyl requested a review from mattinannt February 26, 2026 15:26

xernobyl enabled auto-merge February 27, 2026 08:54

mattinannt approved these changes Feb 27, 2026

xernobyl added this pull request to the merge queue Feb 27, 2026

Merged via the queue into main with commit 4927834 Feb 27, 2026
8 checks passed

xernobyl deleted the feat/embeddings branch February 27, 2026 09:51

xernobyl merged 17 commits into main from feat/embeddings

3 participants
