fix: handle embedding models without KV memory#2160

Merged
abetlen merged 2 commits into main from abetlen/fix-embedding-null-memory
Mar 25, 2026
Conversation

@abetlen abetlen commented Mar 25, 2026

  • make LlamaContext.kv_cache_clear() a no-op when upstream context has no memory object (encoder-only embedding models without a kv cache)
  • switch the embedding smoke test to an encoder embedding model: CompendiumLabs/bge-small-en-v1.5-gguf (bge-small-en-v1.5-q4_k_m.gguf) to catch these issues more quickly in the future

Closes #2159.
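A minimal sketch of the guard described above, assuming names modelled on llama.cpp's memory API (`llama_get_memory` / `llama_memory_clear`); this is an illustration with a stub standing in for the real bindings, not the exact llama-cpp-python source:

```python
class FakeLib:
    """Stand-in for the llama.cpp bindings, for illustration only."""
    def __init__(self, has_memory):
        # Encoder-only embedding models expose no memory object (None here,
        # NULL in the C API); decoder-style contexts do.
        self._mem = object() if has_memory else None
        self.cleared = False

    def llama_get_memory(self, ctx):
        return self._mem

    def llama_memory_clear(self, mem, data):
        self.cleared = True


class LlamaContext:
    def __init__(self, lib, ctx=None):
        self.lib, self.ctx = lib, ctx

    def kv_cache_clear(self):
        # The fix: if the upstream context has no memory object
        # (encoder-only embedding model, no KV cache), do nothing
        # instead of dereferencing a null handle.
        mem = self.lib.llama_get_memory(self.ctx)
        if mem is None:
            return
        self.lib.llama_memory_clear(mem, True)


# Decoder-style context: clearing proceeds as before.
lib = FakeLib(has_memory=True)
LlamaContext(lib).kv_cache_clear()
assert lib.cleared

# Encoder-only embedding model: safe no-op instead of a crash.
lib2 = FakeLib(has_memory=False)
LlamaContext(lib2).kv_cache_clear()
assert not lib2.cleared
```

The design choice is to degrade to a no-op rather than raise, since callers such as embedding pipelines invoke cache clearing unconditionally between batches.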

@abetlen abetlen merged commit ac59e5a into main Mar 25, 2026
@abetlen abetlen deleted the abetlen/fix-embedding-null-memory branch March 25, 2026 22:04


Linked issue: Error with latest version of llama-cpp-python 0.3.18 and vector embedding
