fix(ollama): soft-fail on empty response in generate_from_raw by planetf1 · Pull Request #1161 · generative-computing/mellea

planetf1 · 2026-05-27T06:03:35Z

Description

Fixes #599

generate_from_raw was intermittently returning empty-string results. After investigation (see #16326 which we have now closed), the root cause was in the mellea client: asyncio.gather(..., return_exceptions=True) silently converted any exception — timeout, ResponseError, connection reset while the model was loading — into ModelOutputThunk(value=""). This was fixed separately in #1163 (dropped return_exceptions=True).
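The failure mode described above can be reproduced in isolation. The sketch below is illustrative only and is not the mellea code: `ok`, `boom`, `gather_lossy`, and `gather_strict` are hypothetical names, and the list comprehension stands in for whatever conversion turned exception objects into empty values.

```python
import asyncio


async def ok(text: str) -> str:
    return text


async def boom() -> str:
    # Stand-in for a timeout, ResponseError, or connection reset.
    raise TimeoutError("model still loading")


async def gather_lossy() -> list:
    # With return_exceptions=True, exception objects land in the result list.
    results = await asyncio.gather(ok("a"), boom(), return_exceptions=True)
    # A careless conversion like this one turns the failure into "".
    return [r if isinstance(r, str) else "" for r in results]


async def gather_strict() -> list:
    # Without return_exceptions, the first exception propagates to the caller.
    return await asyncio.gather(ok("a"), boom())


def run_demo():
    lossy = asyncio.run(gather_lossy())
    try:
        asyncio.run(gather_strict())
        strict_error = None
    except TimeoutError as exc:
        strict_error = exc
    return lossy, strict_error
```

In the lossy variant the caller sees an ordinary list of strings and has no way to tell that one request failed; in the strict variant the error surfaces where it can be handled or reported.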

This PR adds a second, belt-and-braces layer for the rare but real case where Ollama genuinely returns HTTP 200 with response: "" and done: True (e.g. EOS sampled as the first token — runner.go:546–552 in Ollama, where a TODO comment acknowledges the issue). In 4400 requests across warm and cold-load conditions we never observed this path, but the code path exists and is worth handling explicitly.

mellea/backends/ollama.py — adds an elif response.done and not response.response and not response.thinking: branch. Logs a warning and attaches a RuntimeError and the raw GenerateResponse to generate_log.extra for operator inspection. The and not response.thinking guard avoids false-positives on thinking models that return an empty .response alongside a non-empty .thinking field. Does not re-introduce return_exceptions=True — that removal from #1163 is preserved.
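A minimal sketch of the condition described above, under stated assumptions: `FakeGenerateResponse` and `FakeGenerateLog` are hypothetical stand-ins for the real response and log objects, and the `extra` keys (`"error"`, `"raw_response"`) are invented for illustration. The actual branch in mellea/backends/ollama.py may differ in structure and naming.

```python
import logging
from dataclasses import dataclass, field
from typing import Optional

logger = logging.getLogger("ollama_sketch")


@dataclass
class FakeGenerateResponse:
    """Hypothetical stand-in for Ollama's GenerateResponse."""
    response: str = ""
    thinking: Optional[str] = None
    done: bool = False


@dataclass
class FakeGenerateLog:
    """Hypothetical stand-in for the generate log; only `extra` is modelled."""
    extra: dict = field(default_factory=dict)


def check_empty_response(response, generate_log) -> bool:
    """Return True and record a soft failure when a finished response is empty."""
    if response.done and not response.response and not response.thinking:
        logger.warning("Ollama returned done=True with an empty response")
        # Attach the error and the raw response for operator inspection
        # instead of raising, so the caller still gets a result object.
        generate_log.extra["error"] = RuntimeError("empty response from Ollama")
        generate_log.extra["raw_response"] = response
        return True
    return False
```

The `not response.thinking` guard is what keeps a thinking model's output (empty `.response`, non-empty `.thinking`) from being flagged.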

test/backends/test_ollama.py — adds a module-scoped _ensure_model_warm autouse fixture so that running this file in isolation (outside the full conftest warm-up path) doesn't start cold. Removes the @pytest.mark.xfail from test_generate_from_raw — the original failure was the return_exceptions=True bug, now fixed; the test has passed consistently across 1000+ trials.
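A module-scoped autouse fixture of the kind described above could look roughly like the following. This is a sketch, not the fixture from the PR: `MODEL_NAME` and `warm_model` are hypothetical, and it assumes that one small generate request is enough to load the model.

```python
import pytest

# Hypothetical model name; the real test module defines its own.
MODEL_NAME = "example-model"


def warm_model(generate, model: str) -> None:
    """Issue one throwaway request so the model is loaded before tests run."""
    # Assumption: a single small generate call is enough to trigger loading.
    generate(model=model, prompt="hello")


@pytest.fixture(scope="module", autouse=True)
def _ensure_model_warm():
    # Imported lazily so that collecting this module does not require the
    # ollama package; ollama.generate is the client's module-level helper.
    import ollama

    warm_model(ollama.generate, MODEL_NAME)
    yield
```

Because the fixture is `autouse` and module-scoped, it runs once before the first test in the file, whether or not a session-level warm-up has already happened.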

test/backends/test_ollama_unit.py — unit tests for the new branch (no live Ollama required): empty done-response soft-fails with RuntimeError attached; thinking-model response with response="" is not flagged; one empty slot in a batch of three doesn't discard the other two results.
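The batch case can be illustrated with a self-contained test sketch. The types and the `collect_batch` helper below are hypothetical and only model the behaviour described above; they are not the tests added in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for one Ollama response in a batch."""
    response: str
    thinking: Optional[str] = None
    done: bool = True


@dataclass
class SlotResult:
    """Hypothetical per-slot result: a value plus an optional soft failure."""
    value: str
    error: Optional[Exception] = None


def collect_batch(responses):
    """Evaluate each slot independently so one empty slot cannot affect others."""
    results = []
    for r in responses:
        if r.done and not r.response and not r.thinking:
            results.append(SlotResult(value="", error=RuntimeError("empty response")))
        else:
            results.append(SlotResult(value=r.response))
    return results


def test_one_empty_slot_keeps_the_rest():
    batch = [FakeResponse("first"), FakeResponse(""), FakeResponse("third")]
    results = collect_batch(batch)
    assert [r.value for r in results] == ["first", "", "third"]
    assert results[0].error is None
    assert isinstance(results[1].error, RuntimeError)
    assert results[2].error is None
```

The point of the shape is that the empty slot carries its own error marker while the two successful slots come back untouched.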

Testing

# Unit tests — no Ollama server required
uv run pytest test/backends/test_ollama_unit.py -v
# 19 passed

# 1000-trial soak: 4000 requests, 0 empty responses
uv run --with httpx python /tmp/hammer1000.py 1000
# DONE: 1000 trials, 4000 requests, 0 empty responses, 398s elapsed

# Verify xfail removal — test now passes clean
uv run pytest test/backends/test_ollama.py::test_generate_from_raw -v
# 1 passed

…nerative-computing#599)

Ollama returns HTTP 200 with an empty `response` field when the first sampled token is EOS (runner.go:546-552). This is real but vanishingly rare: 4400 requests across 1100 trials showed zero occurrences once the primary cause (return_exceptions=True in asyncio.gather, PR generative-computing#1163) was removed.

This PR adds belt-and-braces handling for that genuine-but-rare path: warm the model when the test file runs in isolation, detect an HTTP-200-with-empty-body response, and log a warning rather than silently returning an empty string. The stale xfail on test_generate_from_raw (which was passing cleanly after generative-computing#1163) is removed. Three unit tests cover the soft-fail branch directly without needing a live Ollama server.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

…issues

Lesson from generative-computing#599 investigation: return_exceptions=True silently converts exceptions to empty values in batch backends.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

github-actions Bot added the bug Something isn't working label May 27, 2026

planetf1 changed the title ~~fix(ollama): raise on empty response from generate_from_raw~~ fix(ollama): soft-fail on empty response in generate_from_raw May 27, 2026

planetf1 force-pushed the worktree-issue-599 branch from 0a8a0ff to 6fbe1ce Compare May 27, 2026 06:48

planetf1 marked this pull request as ready for review May 27, 2026 06:49

planetf1 requested a review from a team as a code owner May 27, 2026 06:49

planetf1 requested review from akihikokuroda and markstur May 27, 2026 06:49

planetf1 force-pushed the worktree-issue-599 branch from 372a8ef to 42d840e Compare May 28, 2026 18:28

planetf1 wants to merge 2 commits into generative-computing:main from planetf1:worktree-issue-599
