fix: handle embedding models without KV memory#2160

Merged
abetlen merged 2 commits into main from abetlen/fix-embedding-null-memory
Mar 25, 2026
Conversation

@abetlen abetlen commented Mar 25, 2026

  • make LlamaContext.kv_cache_clear() a no-op when upstream context has no memory object (encoder-only embedding models without a kv cache)
  • switch the embedding smoke test to an encoder embedding model: CompendiumLabs/bge-small-en-v1.5-gguf (bge-small-en-v1.5-q4_k_m.gguf) to catch these issues more quickly in the future

Closes #2159.
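A minimal sketch of the guard described above, assuming names modelled on llama.cpp's memory API (`llama_get_memory` / `llama_memory_clear`); this is an illustration with a stub standing in for the real bindings, not the exact llama-cpp-python source:

```python
class FakeLib:
    """Stand-in for the llama.cpp bindings, for illustration only."""
    def __init__(self, has_memory):
        # Encoder-only embedding models expose no memory object (None here,
        # NULL in the C API); decoder-style contexts do.
        self._mem = object() if has_memory else None
        self.cleared = False

    def llama_get_memory(self, ctx):
        return self._mem

    def llama_memory_clear(self, mem, data):
        self.cleared = True


class LlamaContext:
    def __init__(self, lib, ctx=None):
        self.lib, self.ctx = lib, ctx

    def kv_cache_clear(self):
        # The fix: if the upstream context has no memory object
        # (encoder-only embedding model, no KV cache), do nothing
        # instead of dereferencing a null handle.
        mem = self.lib.llama_get_memory(self.ctx)
        if mem is None:
            return
        self.lib.llama_memory_clear(mem, True)


# Decoder-style context: clearing proceeds as before.
lib = FakeLib(has_memory=True)
LlamaContext(lib).kv_cache_clear()
assert lib.cleared

# Encoder-only embedding model: safe no-op instead of a crash.
lib2 = FakeLib(has_memory=False)
LlamaContext(lib2).kv_cache_clear()
assert not lib2.cleared
```

The design choice is to degrade to a no-op rather than raise, since callers such as embedding pipelines invoke cache clearing unconditionally between batches.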

@abetlen abetlen merged commit ac59e5a into main Mar 25, 2026
@abetlen abetlen deleted the abetlen/fix-embedding-null-memory branch March 25, 2026 22:04


Linked issue: Error with latest version of llama-cpp-python 0.3.18 and vector embedding
