Add OpenAI-compatible embedding client to vmcp optimizer#5633

Merged
jerm-dro merged 6 commits into
stacklok:mainfrom
gabrielcosi:add-openai-embedding-client
Jun 26, 2026

Conversation

@gabrielcosi gabrielcosi commented Jun 25, 2026

Contributor

Summary

The vMCP optimizer's semantic tool discovery only supported the HuggingFace Text Embeddings Inference (TEI) API, so it could not use OpenAI-compatible embedding services such as OpenAI, Azure OpenAI, or gateways like Bifrost and LiteLLM.

  • Add an openai embedding provider alongside the existing TEI backend in the optimizer's similarity package.
  • Add two optional optimizer config fields: embeddingProvider (enum tei/openai, default tei) and embeddingModel.
  • Read the API key from the OPENAI_API_KEY environment variable so it never lands in a CRD spec or ConfigMap; an empty key allows keyless in-cluster gateways.
  • Validate the provider and resolve the key in the single existing GetAndValidateConfig path; regenerate CRD manifests/docs and add an example.
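The validation and key-resolution step described in the bullets above can be sketched roughly as follows. This is an illustration, not the PR's code: the `embeddingSettings` type, the `resolveEmbeddingSettings` function, and the injected `getenv` parameter are hypothetical names, and the model name in `main` is only a placeholder. The real logic lives in the existing GetAndValidateConfig path.

```go
package main

import (
	"fmt"
	"os"
)

// Provider names mirror the enum values named in the PR description.
const (
	providerTEI    = "tei"
	providerOpenAI = "openai"
)

// embeddingSettings is a hypothetical stand-in for the resolved optimizer
// embedding configuration; it is not the project's OptimizerConfig type.
type embeddingSettings struct {
	Provider string // embeddingProvider after defaulting
	Model    string // embeddingModel, passed through unchanged
	APIKey   string // read from the environment, never from config
}

// resolveEmbeddingSettings defaults an empty provider to "tei", rejects
// unknown providers, and reads OPENAI_API_KEY only for the openai provider.
// getenv is injected so the function can be exercised without real env vars.
func resolveEmbeddingSettings(provider, model string, getenv func(string) string) (embeddingSettings, error) {
	switch provider {
	case "":
		provider = providerTEI
	case providerTEI, providerOpenAI:
		// supported as-is
	default:
		return embeddingSettings{}, fmt.Errorf(
			"unsupported embeddingProvider %q (want %q or %q)",
			provider, providerTEI, providerOpenAI)
	}

	s := embeddingSettings{Provider: provider, Model: model}
	if provider == providerOpenAI {
		// An empty key is allowed so keyless in-cluster gateways still work.
		s.APIKey = getenv("OPENAI_API_KEY")
	}
	return s, nil
}

func main() {
	s, err := resolveEmbeddingSettings(providerOpenAI, "text-embedding-3-small", os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Report only whether a key is present; never print the secret itself.
	fmt.Printf("provider=%s model=%s keySet=%t\n", s.Provider, s.Model, s.APIKey != "")
}
```

Passing `getenv` as a parameter is a design choice made for this sketch so the keyless and keyed paths can both be tested in isolation.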

Closes #5305

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Validated end-to-end against a live cluster: configured the openai provider to point at a Bifrost OpenAI-compatible gateway that proxies a TEI backend, and confirmed a 2-input batch returns the OpenAI response shape (ordered data[].index, float data[].embedding arrays, 384-dim), which openAIClient.embedChunk decodes correctly. Also exercised keyless and Bearer-auth request paths via unit tests.
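For readers unfamiliar with the response shape mentioned above, here is a minimal sketch of decoding an OpenAI-style embeddings response and restoring input order from `data[].index`. It is not the body of `openAIClient.embedChunk`: the `embeddingsResponse` type and `decodeEmbeddings` function are hypothetical, and the strictness checks (count, range, duplicates) are assumptions about what a careful decoder might do.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingsResponse models only the fields this sketch needs from an
// OpenAI-compatible /embeddings response.
type embeddingsResponse struct {
	Data []struct {
		Index     int       `json:"index"`
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

// decodeEmbeddings parses body and returns one vector per input, placed at
// the position given by data[].index rather than by array order.
func decodeEmbeddings(body []byte, want int) ([][]float32, error) {
	var resp embeddingsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode embeddings response: %w", err)
	}
	if len(resp.Data) != want {
		return nil, fmt.Errorf("expected %d embeddings, got %d", want, len(resp.Data))
	}

	out := make([][]float32, want)
	for _, d := range resp.Data {
		if d.Index < 0 || d.Index >= want {
			return nil, fmt.Errorf("embedding index %d out of range", d.Index)
		}
		if len(d.Embedding) == 0 {
			return nil, fmt.Errorf("embedding %d is empty", d.Index)
		}
		if out[d.Index] != nil {
			return nil, fmt.Errorf("duplicate embedding index %d", d.Index)
		}
		out[d.Index] = d.Embedding
	}
	return out, nil
}

func main() {
	// Entries deliberately arrive out of order to show the reordering.
	body := []byte(`{"data":[
		{"index":1,"embedding":[0.5,0.25]},
		{"index":0,"embedding":[1,2]}
	]}`)
	vecs, err := decodeEmbeddings(body, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, v := range vecs {
		fmt.Printf("input %d: %d dims\n", i, len(v))
	}
}
```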

API Compatibility

  • This PR does not break the v1beta1 API (adds two optional, defaulted fields to OptimizerConfig).

Does this introduce a user-facing change?

Yes. The vMCP optimizer can now point semantic tool discovery at an OpenAI-compatible /embeddings endpoint via optimizer.embeddingProvider: openai (together with embeddingModel and the OPENAI_API_KEY environment variable). The provider defaults to tei, so existing configurations are unchanged.

Special notes for reviewers

  • The API key is intentionally not a config field; it is read from OPENAI_API_KEY so the secret stays out of the CRD spec / ConfigMap (mirrors the THV_SESSION_REDIS_PASSWORD pattern).
  • The openai provider reads embeddingService directly and is not used with embeddingServerRef (which provisions a managed TEI server).
  • Scope is intentionally self-contained to the optimizer + config; no operator controller or CLI changes.

Large PR Justification

  • Auto-generated code (CRDs) and tests.

Assisted by Claude Code. Design, code, and tests were directed, reviewed, and manually verified by the author.

The vmcp optimizer only spoke the TEI embedding API. Add an "openai"
provider for OpenAI-compatible services (OpenAI, Azure, Bifrost,
LiteLLM), selected via optimizer.embeddingProvider with the model in
embeddingModel and the key read from OPENAI_API_KEY. Defaults to "tei",
so existing configs are unaffected.

Closes stacklok#5305

Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
codecov Bot commented Jun 25, 2026


Codecov Report

❌ Patch coverage is 88.46154% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.37%. Comparing base (e7934ea) to head (dd1ddf5).
⚠️ Report is 9 commits behind head on main.

Files with missing lines | Patch % | Lines
...mcp/optimizer/internal/similarity/openai_client.go | 83.33% | 6 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5633      +/-   ##
==========================================
+ Coverage   70.33%   70.37%   +0.03%     
==========================================
  Files         648      651       +3     
  Lines       66011    66210     +199     
==========================================
+ Hits        46432    46598     +166     
- Misses      16221    16258      +37     
+ Partials     3358     3354       -4     

@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 25, 2026

@jerm-dro jerm-dro left a comment

Contributor

Thanks for the contribution, @gabrielcosi — clean implementation and thorough tests. A few minor changes requested inline.

Comment thread pkg/vmcp/config/config.go
Comment thread pkg/vmcp/optimizer/internal/similarity/openai_client.go Outdated
Comment thread pkg/vmcp/optimizer/internal/similarity/openai_client.go
Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 25, 2026
@gabrielcosi gabrielcosi requested a review from jerm-dro June 25, 2026 22:52

@github-actions github-actions Bot left a comment

Contributor

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

Comment thread pkg/vmcp/optimizer/internal/similarity/openai_client_test.go
Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 25, 2026
@gabrielcosi

Contributor Author

@jerm-dro the large-PR bot is giving me side-eye, although most of the lines are generated CRDs/docs and tests. Want me to add the justification?

@jerm-dro

Contributor

@gabrielcosi Adding the justification to the PR description is fine! It's common to say something along the lines of "most of these changes are auto-generated or tests."

@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 25, 2026
@github-actions github-actions Bot dismissed their stale review June 25, 2026 23:32

Large PR justification has been provided. Thank you!

@github-actions

Contributor

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

@gabrielcosi gabrielcosi requested a review from jerm-dro June 26, 2026 20:04

@jerm-dro jerm-dro left a comment

Contributor

Thanks for the contribution! 🚀

@jerm-dro jerm-dro merged commit 81fc91f into stacklok:main Jun 26, 2026
43 checks passed
Labels

size/XL Extra large PR: 1000+ lines changed

Development

Successfully merging this pull request may close these issues.

Add OpenAI-compatible embedding client to vmcp optimizer

2 participants