Add OpenAI-compatible embedding client to vmcp optimizer #5633
The vmcp optimizer only spoke the TEI embedding API. Add an "openai" provider for OpenAI-compatible services (OpenAI, Azure, Bifrost, LiteLLM), selected via optimizer.embeddingProvider with the model in embeddingModel and the key read from OPENAI_API_KEY. Defaults to "tei", so existing configs are unaffected.
Closes stacklok#5305
Signed-off-by: Gabriel Cosi <contact@gabrielcosi.dev>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #5633 +/- ##
==========================================
+ Coverage 70.33% 70.37% +0.03%
==========================================
Files 648 651 +3
Lines 66011 66210 +199
==========================================
+ Hits 46432 46598 +166
- Misses 16221 16258 +37
+ Partials 3358 3354 -4
jerm-dro
left a comment
Thanks for the contribution, @gabrielcosi — clean implementation and thorough tests. A few minor changes requested inline.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation
Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
@jerm-dro the large-PR bot is giving me side-eye, although most of the lines are generated CRDs/docs and tests. Want me to add the justification?
@gabrielcosi Adding the justification to the PR description is fine! It's common to say something along the lines of "most of these changes are auto-generated or tests."
Large PR justification has been provided. Thank you!
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.
jerm-dro
left a comment
Thanks for the contribution! 🚀
Summary
The vMCP optimizer's semantic tool discovery only supported the HuggingFace Text Embeddings Inference (TEI) API, so it could not use OpenAI-compatible embedding services such as OpenAI, Azure OpenAI, or gateways like Bifrost and LiteLLM.
- Add an `openai` embedding provider alongside the existing TEI backend in the optimizer's `similarity` package.
- Select it via `embeddingProvider` (enum `tei`/`openai`, default `tei`), with the model in `embeddingModel`.
- Read the API key from the `OPENAI_API_KEY` environment variable so it never lands in a CRD spec or ConfigMap; an empty key allows keyless in-cluster gateways.
- [...] `GetAndValidateConfig` path; regenerate CRD manifests/docs and add an example.

Closes #5305
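The provider selection and keyless behavior described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `optimizerConfig` struct, `resolveProvider`, and `newEmbeddingRequest` are hypothetical names, and appending `/embeddings` to the service URL is an assumption about how the endpoint is formed.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"strings"
)

// optimizerConfig is a hypothetical stand-in for the optimizer settings
// named in the PR description; only the field meanings come from it.
type optimizerConfig struct {
	EmbeddingProvider string // "tei" or "openai"; empty means "tei"
	EmbeddingService  string // base URL of the embedding service
	EmbeddingModel    string // model name for OpenAI-compatible services
}

// resolveProvider applies the "tei" default and rejects unknown values.
func resolveProvider(p string) (string, error) {
	switch p {
	case "":
		return "tei", nil
	case "tei", "openai":
		return p, nil
	default:
		return "", fmt.Errorf("unsupported embedding provider %q", p)
	}
}

// newEmbeddingRequest builds a POST to <service>/embeddings (assumed URL
// layout). The Authorization header is set only when apiKey is non-empty,
// so a keyless in-cluster gateway receives no credentials.
func newEmbeddingRequest(cfg optimizerConfig, apiKey, body string) (*http.Request, error) {
	url := strings.TrimRight(cfg.EmbeddingService, "/") + "/embeddings"
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build embedding request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// Placeholder values for illustration only.
	cfg := optimizerConfig{
		EmbeddingProvider: "openai",
		EmbeddingService:  "http://gateway.example/v1",
		EmbeddingModel:    "example-model",
	}
	provider, err := resolveProvider(cfg.EmbeddingProvider)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	// The key comes from the environment, never from the config itself.
	req, err := newEmbeddingRequest(cfg, os.Getenv("OPENAI_API_KEY"),
		`{"model":"example-model","input":["hello"]}`)
	if err != nil {
		fmt.Println("request error:", err)
		return
	}
	fmt.Println(provider, req.URL.String(), req.Header.Get("Authorization") != "")
}
```

Keeping the key as a function argument rather than a config field mirrors the stated goal of keeping the secret out of the CRD spec and ConfigMap.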
Type of change
Test plan
- `task test`
- `task lint-fix`

Validated end-to-end against a live cluster: configured the `openai` provider to point at a Bifrost OpenAI-compatible gateway that proxies a TEI backend, and confirmed a 2-input batch returns the OpenAI response shape (ordered `data[].index`, float `data[].embedding` arrays, 384-dim), which `openAIClient.embedChunk` decodes correctly. Also exercised keyless and Bearer-auth request paths via unit tests.
API Compatibility
`v1beta1` API (adds two optional, defaulted fields to `OptimizerConfig`).
Does this introduce a user-facing change?
Yes. The vMCP optimizer can now point semantic tool discovery at an OpenAI-compatible `/embeddings` endpoint via `optimizer.embeddingProvider: openai` (plus `embeddingModel` and `OPENAI_API_KEY`). The provider defaults to `tei`, so existing configurations are unchanged.
Special notes for reviewers
- The API key is read from `OPENAI_API_KEY` so the secret stays out of the CRD spec / ConfigMap (mirrors the `THV_SESSION_REDIS_PASSWORD` pattern).
- The `openai` provider reads `embeddingService` directly and is not used with `embeddingServerRef` (which provisions a managed TEI server).
Large PR Justification
Assisted by Claude Code. Design, code, and tests were directed, reviewed, and manually verified by the author.
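For reviewers unfamiliar with the response shape mentioned in the test plan (`data[].index` plus a float `data[].embedding` array per input), a decoder might look roughly like the sketch below. It is an illustration under assumptions, not the PR's `openAIClient.embedChunk`: the `embeddingResponse` type and `decodeEmbeddings` function are hypothetical names, and placing each vector by its `index` is one defensive way to keep results aligned with the inputs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingResponse covers only the fields named in the test plan.
type embeddingResponse struct {
	Data []struct {
		Index     int       `json:"index"`
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

// decodeEmbeddings parses raw and returns one vector per input, placed by
// data[].index so the result order matches the input order even if the
// service returns entries out of order.
func decodeEmbeddings(raw []byte, want int) ([][]float32, error) {
	var resp embeddingResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, fmt.Errorf("decode embeddings response: %w", err)
	}
	if len(resp.Data) != want {
		return nil, fmt.Errorf("expected %d embeddings, got %d", want, len(resp.Data))
	}
	out := make([][]float32, want)
	seen := make([]bool, want)
	for _, d := range resp.Data {
		if d.Index < 0 || d.Index >= want {
			return nil, fmt.Errorf("index %d out of range", d.Index)
		}
		if seen[d.Index] {
			return nil, fmt.Errorf("duplicate index %d", d.Index)
		}
		seen[d.Index] = true
		out[d.Index] = d.Embedding
	}
	return out, nil
}

func main() {
	// A 2-input batch with entries deliberately out of order.
	raw := []byte(`{"data":[
		{"index":1,"embedding":[0.3,0.4]},
		{"index":0,"embedding":[0.1,0.2]}
	]}`)
	vecs, err := decodeEmbeddings(raw, 2)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(len(vecs), len(vecs[0]), len(vecs[1]))
}
```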