Add OpenAI-compatible embedding client to vmcp optimizer by gabrielcosi · Pull Request #5633 · stacklok/toolhive

gabrielcosi · 2026-06-25T10:31:48Z

Summary

The vMCP optimizer's semantic tool discovery only supported the HuggingFace Text Embeddings Inference (TEI) API, so it could not use OpenAI-compatible embedding services such as OpenAI, Azure OpenAI, or gateways like Bifrost and LiteLLM.

- Add an openai embedding provider alongside the existing TEI backend in the optimizer's similarity package.
- Add two optional optimizer config fields: embeddingProvider (enum tei/openai, default tei) and embeddingModel.
- Read the API key from the OPENAI_API_KEY environment variable so it never lands in a CRD spec or ConfigMap; an empty key allows keyless in-cluster gateways.
- Validate the provider and resolve the key in the single existing GetAndValidateConfig path; regenerate CRD manifests/docs and add an example.
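
For readers skimming the summary, here is a minimal sketch of the provider defaulting and key resolution described above. It is not the code from this PR: the `optimizerConfig` struct and the `resolveProvider` and `resolveAPIKey` helpers are hypothetical names, and only the field names, the tei/openai enum, the tei default, and the OPENAI_API_KEY variable come from the description.

```go
package main

import (
	"fmt"
	"os"
)

// optimizerConfig is a hypothetical stand-in for the optimizer config.
// Only the two field names are taken from the PR description.
type optimizerConfig struct {
	EmbeddingProvider string // "tei" or "openai"; empty means "use the default"
	EmbeddingModel    string
}

// resolveProvider applies the "tei" default and rejects unknown values.
func resolveProvider(cfg optimizerConfig) (string, error) {
	switch cfg.EmbeddingProvider {
	case "":
		return "tei", nil
	case "tei", "openai":
		return cfg.EmbeddingProvider, nil
	default:
		return "", fmt.Errorf("unsupported embeddingProvider %q", cfg.EmbeddingProvider)
	}
}

// resolveAPIKey reads the key from the environment. An empty result is
// allowed so that keyless in-cluster gateways keep working.
func resolveAPIKey() string {
	return os.Getenv("OPENAI_API_KEY")
}

func main() {
	cfg := optimizerConfig{EmbeddingProvider: "openai", EmbeddingModel: "example-model"}
	provider, err := resolveProvider(cfg)
	fmt.Println(provider, err, resolveAPIKey() != "")
}
```

Keeping the key out of the struct entirely is what keeps it out of any serialized spec; the environment lookup happens only at validation time.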

Closes #5305

Type of change

New feature

Test plan

Unit tests (task test)
Linting (task lint-fix)
Manual testing (describe below)

Validated end-to-end against a live cluster: configured the openai provider to point at a Bifrost OpenAI-compatible gateway that proxies a TEI backend, and confirmed a 2-input batch returns the OpenAI response shape (ordered data[].index, float data[].embedding arrays, 384-dim), which openAIClient.embedChunk decodes correctly. Also exercised keyless and Bearer-auth request paths via unit tests.
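
To illustrate the response shape mentioned above, the following is a rough sketch of decoding data[].index and data[].embedding and placing each vector by its index. It is an illustration under assumptions, not the body of openAIClient.embedChunk: the type names and the `decodeEmbeddings` helper are hypothetical, and the strictness checks (count mismatch, duplicate index) are my own choices.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingItem and embeddingResponse model only the fields named in the
// PR description (data[].index and data[].embedding).
type embeddingItem struct {
	Index     int       `json:"index"`
	Embedding []float32 `json:"embedding"`
}

type embeddingResponse struct {
	Data []embeddingItem `json:"data"`
}

// decodeEmbeddings parses body and places each vector at its reported
// index, so the output order matches the input order even if the server
// returns items out of order. want is the number of inputs sent.
func decodeEmbeddings(body []byte, want int) ([][]float32, error) {
	var resp embeddingResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode embeddings response: %w", err)
	}
	if len(resp.Data) != want {
		return nil, fmt.Errorf("expected %d embeddings, got %d", want, len(resp.Data))
	}
	out := make([][]float32, want)
	seen := make([]bool, want)
	for _, item := range resp.Data {
		if item.Index < 0 || item.Index >= want || seen[item.Index] {
			return nil, fmt.Errorf("unexpected or duplicate index %d", item.Index)
		}
		seen[item.Index] = true
		out[item.Index] = item.Embedding
	}
	return out, nil
}

func main() {
	body := []byte(`{"data":[{"index":1,"embedding":[0.5,0.75]},{"index":0,"embedding":[0.25,0.125]}]}`)
	vecs, err := decodeEmbeddings(body, 2)
	fmt.Println(len(vecs), err)
}
```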

API Compatibility

This PR does not break the v1beta1 API (adds two optional, defaulted fields to OptimizerConfig).

Does this introduce a user-facing change?

Yes. The vMCP optimizer can now point semantic tool discovery at an OpenAI-compatible /embeddings endpoint via optimizer.embeddingProvider: openai (plus embeddingModel, and OPENAI_API_KEY). The provider defaults to tei, so existing configurations are unchanged.
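
As a sketch of what a call to an OpenAI-compatible /embeddings endpoint could look like, the helper below builds the POST request and adds a Bearer header only when a key is present. `newEmbeddingRequest`, the example URL, and the model name are hypothetical; the base URL is assumed to carry no trailing slash, and real code would likely use http.NewRequestWithContext to carry a context.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// embeddingRequest is the JSON body of an OpenAI-style embeddings call:
// a model name plus a batch of input strings.
type embeddingRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

// newEmbeddingRequest is a hypothetical helper. baseURL is assumed to
// have no trailing slash. An empty apiKey produces a keyless request.
func newEmbeddingRequest(baseURL, model, apiKey string, inputs []string) (*http.Request, error) {
	payload, err := json.Marshal(embeddingRequest{Model: model, Input: inputs})
	if err != nil {
		return nil, fmt.Errorf("encode embeddings request: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/embeddings", bytes.NewReader(payload))
	if err != nil {
		return nil, fmt.Errorf("build embeddings request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		// Only authenticated endpoints get the Authorization header.
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	req, err := newEmbeddingRequest("http://gateway.example/v1", "example-model", "", []string{"a", "b"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("Authorization") == "")
}
```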

Special notes for reviewers

- The API key is intentionally not a config field; it is read from OPENAI_API_KEY so the secret stays out of the CRD spec / ConfigMap (mirrors the THV_SESSION_REDIS_PASSWORD pattern).
- The openai provider reads embeddingService directly and is not used with embeddingServerRef (which provisions a managed TEI server).
- Scope is intentionally self-contained to the optimizer + config; no operator controller or CLI changes.

Large PR Justification

Auto-generated code (CRDs) and tests.

Assisted by Claude Code. Design, code, and tests were directed, reviewed, and manually verified by the author.

The vmcp optimizer only spoke the TEI embedding API. Add an "openai" provider for OpenAI-compatible services (OpenAI, Azure, Bifrost, LiteLLM), selected via optimizer.embeddingProvider with the model in embeddingModel and the key read from OPENAI_API_KEY. Defaults to "tei", so existing configs are unaffected.

Closes stacklok#5305

Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>

codecov · 2026-06-25T10:38:42Z

Codecov Report

❌ Patch coverage is 88.46154% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.37%. Comparing base (e7934ea) to head (dd1ddf5).
⚠️ Report is 9 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...mcp/optimizer/internal/similarity/openai_client.go | 83.33% | 6 Missing and 6 partials ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #5633      +/-   ##
==========================================
+ Coverage   70.33%   70.37%   +0.03%     
==========================================
  Files         648      651       +3     
  Lines       66011    66210     +199     
==========================================
+ Hits        46432    46598     +166     
- Misses      16221    16258      +37     
+ Partials     3358     3354       -4

jerm-dro

Thanks for the contribution, @gabrielcosi — clean implementation and thorough tests. A few minor changes requested inline.

github-actions

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.

This review will be automatically dismissed once you add the justification section.

gabrielcosi · 2026-06-25T23:24:20Z

@jerm-dro the large-PR bot is giving me side-eye, although most of the lines are generated CRDs/docs and tests. Want me to add the justification?

jerm-dro · 2026-06-25T23:27:03Z

@gabrielcosi Adding the justification to the PR description is fine! It's common to say something along the lines of "most of these changes are auto-generated or tests."

Large PR justification has been provided. Thank you!

github-actions · 2026-06-25T23:32:15Z

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

jerm-dro

Thanks for the contribution! 🚀

gabrielcosi requested review from ChrisJBurns, JAORMX, amirejaz, blkt, jerm-dro, jhrozek, rdimitrov, reyortiz3 and tgrunnagle as code owners June 25, 2026 10:31

github-actions Bot added the size/L Large PR: 600-999 lines changed label Jun 25, 2026

gabrielcosi mentioned this pull request Jun 25, 2026

Add OpenAI-compatible embedding client to vmcp optimizer #5305

Closed

github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 25, 2026

jerm-dro reviewed Jun 25, 2026

Comment thread pkg/vmcp/config/config.go

Comment thread pkg/vmcp/optimizer/internal/similarity/openai_client.go Outdated

Comment thread pkg/vmcp/optimizer/internal/similarity/openai_client.go

gabrielcosi added 3 commits June 26, 2026 00:46

Reject embeddingServerRef with openai provider

787e08c

Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
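
A guess at the kind of check this commit adds, written as a standalone sketch: combining the openai provider with embeddingServerRef is refused because the latter provisions a managed TEI server. `optimizerSpec`, `validateSpec`, and the error value are hypothetical, and embeddingServerRef is simplified to a plain string here.

```go
package main

import (
	"errors"
	"fmt"
)

// optimizerSpec is a hypothetical, flattened view of the optimizer config.
type optimizerSpec struct {
	EmbeddingProvider  string
	EmbeddingService   string
	EmbeddingServerRef string // simplified: empty means "not set"
}

var errServerRefWithOpenAI = errors.New("embeddingServerRef cannot be combined with the openai provider")

// validateSpec rejects the one combination that cannot work: a managed
// TEI server reference together with the openai provider.
func validateSpec(s optimizerSpec) error {
	if s.EmbeddingProvider == "openai" && s.EmbeddingServerRef != "" {
		return errServerRefWithOpenAI
	}
	return nil
}

func main() {
	err := validateSpec(optimizerSpec{EmbeddingProvider: "openai", EmbeddingServerRef: "managed-tei"})
	fmt.Println(err)
}
```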

Drain OpenAI response body before close

f0a51df

Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
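
For context on why draining matters: Go's HTTP transport can generally reuse a keep-alive connection only if the response body has been read to the end before it is closed. A minimal sketch of that pattern follows; `drainAndClose` and the `trackedBody` test double are hypothetical names, and the actual commit may bound the drain or structure it differently.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// drainAndClose discards any unread bytes and then closes the body, so
// the underlying connection is eligible for reuse.
func drainAndClose(body io.ReadCloser) error {
	_, _ = io.Copy(io.Discard, body) // best effort; the Close error is what we report
	return body.Close()
}

// trackedBody is a small stand-in for an HTTP response body that records
// whether Close was called.
type trackedBody struct {
	r      io.Reader
	closed bool
}

func newTrackedBody(s string) *trackedBody {
	return &trackedBody{r: strings.NewReader(s)}
}

func (b *trackedBody) Read(p []byte) (int, error) { return b.r.Read(p) }

func (b *trackedBody) Close() error {
	b.closed = true
	return nil
}

func main() {
	body := newTrackedBody("unread bytes")
	if err := drainAndClose(body); err != nil {
		fmt.Println("close failed:", err)
		return
	}
	fmt.Println("closed:", body.closed)
}
```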

Trim trailing slash from OpenAI base URL

3da5092

Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
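
A small sketch of the normalization this commit title suggests, so that a base URL configured with or without a trailing slash yields the same endpoint. `embeddingsURL` is a hypothetical helper; whether the real code strips one slash or all of them is not stated here, and this version strips all trailing slashes.

```go
package main

import (
	"fmt"
	"strings"
)

// embeddingsURL joins a base URL and the /embeddings path without
// producing a double slash.
func embeddingsURL(base string) string {
	return strings.TrimRight(base, "/") + "/embeddings"
}

func main() {
	fmt.Println(embeddingsURL("http://gateway.example/v1/"))
	fmt.Println(embeddingsURL("http://gateway.example/v1"))
}
```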

github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 25, 2026

gabrielcosi requested a review from jerm-dro June 25, 2026 22:52

github-actions Bot previously requested changes Jun 25, 2026

jerm-dro reviewed Jun 25, 2026

Comment thread pkg/vmcp/optimizer/internal/similarity/openai_client_test.go

gabrielcosi added 2 commits June 26, 2026 01:12

Add live OpenAI embedding integration test

d37fdcb

Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>

Generalize embedding gateway examples

dd1ddf5

Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>

github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 25, 2026

gabrielcosi requested a review from jerm-dro June 26, 2026 20:04

jerm-dro approved these changes Jun 26, 2026

jerm-dro merged commit 81fc91f into stacklok:main Jun 26, 2026
43 checks passed

jerm-dro merged 6 commits into stacklok:main from gabrielcosi:add-openai-embedding-client
