
fix(search): batch missing-embedding calls to avoid API batch-size limits#1542

Open
octo-patch wants to merge 1 commit into MemTensor:main from octo-patch:fix/issue-1482-embed-batch-missing-documents

Conversation

@octo-patch
Contributor

Fixes #1482

Problem

_extract_embeddings in search_handler.py collects every document that lacks a cached embedding and passes the whole list to embedder.embed(all_missing) in a single call. Providers such as Dashscope text-embedding-v4 reject or silently return None when the batch is too large (e.g. 25 documents); the code then either raises a TypeError when it tries to iterate over the None result, or silently drops all of the missing embeddings.

The warning logged just before the failure:

[SearchHandler] MMR embedding metadata missing; will compute missing embeddings: missing_total=25

followed by a crash or silent failure in the MMR deduplication path.

Solution

Split missing_documents into chunks of _EMBED_BATCH_SIZE (16) and call embed() for each chunk, extending a combined result list. Batches that return None/empty are skipped gracefully so remaining embeddings can still be used.

_EMBED_BATCH_SIZE = 16

computed: list[list[float]] = []
for i in range(0, len(missing_documents), _EMBED_BATCH_SIZE):
    batch = missing_documents[i : i + _EMBED_BATCH_SIZE]
    batch_result = self.searcher.embedder.embed(batch)
    if batch_result:  # skip None/empty batches so the rest still succeed
        computed.extend(batch_result)

Testing

  • Verified with 25 missing documents: previously crashed with TypeError; now completes successfully using two batches of 16 and 9.
  • Verified with <16 missing documents: behaviour unchanged (single call).
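The verifications above can be sketched as a standalone test with a fake embedder that records batch sizes (`FakeEmbedder` and `compute_missing` are hypothetical names; the loop body mirrors the fix in the diff):

```python
_EMBED_BATCH_SIZE = 16

class FakeEmbedder:
    """Records the size of every batch it is asked to embed."""
    def __init__(self) -> None:
        self.batch_sizes: list[int] = []

    def embed(self, texts: list[str]) -> list[list[float]]:
        self.batch_sizes.append(len(texts))
        return [[0.0] * 4 for _ in texts]  # dummy vectors

def compute_missing(embedder: FakeEmbedder, missing_documents: list[str]) -> list[list[float]]:
    # Same chunking logic as the fix: never hand the provider more
    # than _EMBED_BATCH_SIZE documents at once.
    computed: list[list[float]] = []
    for i in range(0, len(missing_documents), _EMBED_BATCH_SIZE):
        batch = missing_documents[i : i + _EMBED_BATCH_SIZE]
        batch_result = embedder.embed(batch)
        if batch_result:
            computed.extend(batch_result)
    return computed

embedder = FakeEmbedder()
out = compute_missing(embedder, ["doc"] * 25)
# 25 documents -> batches of [16, 9], 25 embeddings in total;
# fewer than 16 documents -> a single call, behaviour unchanged.
```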

…mits

When _extract_embeddings encounters many documents without cached
embeddings it previously called embedder.embed(all_missing) in one shot.
Providers like Dashscope text-embedding-v4 reject or silently return None
for large batches (e.g. 25 documents), causing a TypeError / empty result
downstream in the MMR deduplication path.

Fix: split missing_documents into chunks of _EMBED_BATCH_SIZE (16) and
accumulate results, skipping any batch that returns None/empty so the
rest of the embeddings can still be used.

Fixes MemTensor#1482

Co-Authored-By: Octopus <liyuan851277048@icloud.com>


Development

Successfully merging this pull request may close these issues.

fix: Extracting too many missing_documents embeddings cause embedder error

1 participant