Conversation
✱ Stainless preview buildsThis PR will update the Edit this comment to update it. It will appear in the SDK's changelogs. ✅ hub-typescript studio · code · diff
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push. |
mattinannt
left a comment
There was a problem hiding this comment.
@xernobyl Thank you for the PR :-)
Please always add the ticket link to the PR description so that ticket and PR get linked.
I currently only did a small review with a first AI-based review. Also maybe we need to discuss how we would proceed with supporting multiple AI/embeddings providers as this might happen much sooner than we might think.
BhagyaAmarasinghe
left a comment
There was a problem hiding this comment.
Thanks for the PR, I have commented on some issues I've noticed.
Could you also check on below 2 points as well:
-
in feedback_records.go
CreateFeedbackRecordRequest.SubmissionIDis a non-pointer string with validate:"required". Every existing API consumer that doesn't send submission_id will now get a 400 validation error. This is a breaking change that should be called out in a changelog or migration guide, or made optional with a server-generated default. -
Migration 003 adds NOT NULL column with no default. If the table has any existing rows, this ALTER TABLE will fail because PostgreSQL cannot add a NOT NULL column without a DEFAULT value to a table with existing data. The migration needs a strategy: add as nullable, backfill (e.g set submission_id = id::text), then add the NOT NULL constraint.
|
@BhagyaAmarasinghe submission_id topics can be fixed on another PR |
mattinannt
left a comment
There was a problem hiding this comment.
@xernobyl thank you for updating the PR. I have a few concerns regarding the embedding model provider and indexes. let's discuss.
What does this PR do?
Adds embeddings for feedback records: when a feedback record is created or updated and has non-empty
value_text, the system enqueues a job to generate an embedding via a configurable provider (OpenAI or Google Gemini) and stores it in a dedicatedembeddingstable (pgvector). This keeps embedding data out of the main feedback-records read path and supports multiple models per record.Highlights:
EmbeddingProvidersubscribes tofeedback_record.createdandfeedback_record.updated(whenvalue_textis in changed fields). Jobs are enqueued to a dedicated River queue (embeddings) so embedding work does not starve webhook delivery.EmbeddingClientinterface. Implementations: OpenAI (text-embedding-3-small, configurable dimensions) and Google Gemini (gemini-embedding-001, viagoogle.golang.org/genai). API key is optional (e.g. for local AI).embeddingstable:id,feedback_record_id,embedding(vector),model,created_at,updated_at; unique on(feedback_record_id, model);ON DELETE CASCADEfromfeedback_records.cmd/backfill-embeddingsenqueues jobs for existing records that havevalue_textbut no embedding for the configured model. RequiresEMBEDDING_PROVIDERandEMBEDDING_MODEL(no defaults).EMBEDDING_PROVIDER(openai | google),EMBEDDING_MODEL,EMBEDDING_PROVIDER_API_KEY(optional),EMBEDDING_DIMENSIONS(default 1536),EMBEDDING_MAX_CONCURRENT,EMBEDDING_MAX_ATTEMPTS. Supported providers are kept in a map for easy extension.Fixes #(issue)
No API contract changes: embeddings are not exposed on feedback-record list/get. They are stored for future use (e.g. semantic search).
How should this be tested?
go test ./internal/... ./cmd/...(includesembedding_provider_test.go, worker tests).make tests(requires Postgres;tests/use feedback records service with embedding model; DB must haveembeddingstable from migration).make init-db(andmake river-migrateif using River UI) soembeddingsexists..env:EMBEDDING_PROVIDER=openai,EMBEDDING_MODEL=text-embedding-3-small,EMBEDDING_PROVIDER_API_KEY=sk-...(or usegoogleand a Gemini API key; or leave key empty for a no-key provider).make run; create a feedback record withvalue_text; confirm embedding job is enqueued and processed (logs: "embedding: job enqueued", "embedding: stored") and a row appears inembeddings.EMBEDDING_PROVIDER=openai EMBEDDING_MODEL=text-embedding-3-small DATABASE_URL=... go run ./cmd/backfill-embeddings(both env vars required).Checklist
Required
make buildmake tests(integration tests intests/)make fmtandmake lint; no new warningsgit pull origin mainmigrations/with goose annotations and ranmake migrate-validateAppreciated
make testsor API contract workflow)docs/if changes were necessarymake tests-coveragefor meaningful logic changes