
fix: prevent excessive background LLM requests causing rate limiting and sluggishness#33

Merged
BYK merged 3 commits into main from fix/excessive-background-requests
Mar 3, 2026


BYK commented Mar 3, 2026

Problem

Users reported multiple symptoms pointing to excessive background LLM requests:

  1. Heavy rate limiting from upstream LLM providers
  2. Slower LLM interactions
  3. Many "error" sounds in TUI with no visible indication
  4. "Prompt too long" errors briefly visible for very large numbers
  5. Overall sluggish OpenCode server behavior

Root Causes & Fixes

Bug 1: Auto-Recovery Infinite Loop (CRITICAL)

When a context overflow ("prompt too long") triggered auto-recovery via session.prompt() and the recovery prompt itself also overflowed, a new session.error event fired. With no re-entrancy guard in place, this created an infinite loop:

overflow → distill + recovery prompt → overflow → distill + recovery prompt → ...

Each cycle fired 2+ LLM calls, repeating until rate-limited.

Fix: Add a recoveringSessions Set. If a session is already recovering when a second overflow arrives, bail out immediately. forceMinLayer is still persisted, so it applies to the user's next manual message.
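
A minimal sketch of the re-entrancy guard, for illustration only. Apart from recoveringSessions, every name here (handleOverflow, RecoverFn) is hypothetical, and clearing the flag in a finally block is an assumption about how the guard is released, not something this PR states.

```typescript
// Hypothetical sketch: only the name `recoveringSessions` comes from the PR.
type RecoverFn = (sessionID: string) => Promise<void>;

const recoveringSessions = new Set<string>();

/**
 * Runs `recover` for a session unless a recovery is already in flight.
 * Resolves to true if recovery was attempted, false if the call bailed out.
 */
async function handleOverflow(
  sessionID: string,
  recover: RecoverFn,
): Promise<boolean> {
  if (recoveringSessions.has(sessionID)) {
    // A second overflow arrived while recovering: do not start another cycle.
    return false;
  }
  recoveringSessions.add(sessionID);
  try {
    await recover(sessionID);
    return true;
  } finally {
    // Assumption: release the guard so a later, unrelated overflow can recover.
    recoveringSessions.delete(sessionID);
  }
}
```

If recover itself triggers another overflow for the same session, the nested handleOverflow call returns false without invoking recover again, so at most one recovery prompt is in flight per session.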

Bug 2: Curator Fires on Every session.idle (HIGH)

The condition was onIdle || turnsSinceCuration >= afterTurns. Since onIdle defaults to true, the || short-circuited and afterTurns (default: 10) was never evaluated. As a result, the curator fired an LLM worker request after every single agent turn.

Fix: Change || to && — curate on idle only when enough turns have accumulated.
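
The difference between the two conditions can be sketched as follows. The CuratorConfig shape and the function names are hypothetical, chosen only to compare the two expressions side by side; the defaults (onIdle: true, afterTurns: 10) are the ones described above.

```typescript
// Hypothetical config shape; only the field names and defaults come from the PR.
interface CuratorConfig {
  onIdle: boolean;
  afterTurns: number;
}

// Pre-fix condition: with onIdle = true this is always true.
function shouldCurateBefore(cfg: CuratorConfig, turnsSinceCuration: number): boolean {
  return cfg.onIdle || turnsSinceCuration >= cfg.afterTurns;
}

// Post-fix condition: curate on idle only once enough turns have accumulated.
function shouldCurateAfter(cfg: CuratorConfig, turnsSinceCuration: number): boolean {
  return cfg.onIdle && turnsSinceCuration >= cfg.afterTurns;
}

const defaults: CuratorConfig = { onIdle: true, afterTurns: 10 };

for (const turns of [1, 9, 10]) {
  console.log(turns, shouldCurateBefore(defaults, turns), shouldCurateAfter(defaults, turns));
}
// Prints:
// 1 true false
// 9 true false
// 10 true true
```

With the default config, the old expression curates on every idle event, while the new one waits until the tenth turn since the last curation.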

Bug 3: shouldSkip Lists All Sessions on Every Unknown Message (MEDIUM)

When session.get() failed (common with short IDs from message events), the fallback called session.list() fetching ALL sessions on every unknown message event.

Fix: Remove session.list() fallback, cache sessions as known-good after first check. Worker sessions are already caught by isWorkerSession(). Accept the tradeoff: short-ID child sessions won't be skipped, but a few extra temporal messages from eval are harmless.
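
One way the cached check could look, as a rough sketch under stated assumptions. makeShouldSkip, SessionInfo, the parentID test for child sessions, and the injected getSession / isWorkerSession callbacks are all hypothetical stand-ins; the real plugin's signatures may differ.

```typescript
// Hypothetical shapes: the real session object and client API may differ.
interface SessionInfo {
  id: string;
  parentID?: string;
}
type GetSession = (sessionID: string) => Promise<SessionInfo>;

function makeShouldSkip(
  getSession: GetSession,
  isWorkerSession: (sessionID: string) => boolean,
) {
  // Sessions already checked and found safe to process.
  const knownGood = new Set<string>();

  return async function shouldSkip(sessionID: string): Promise<boolean> {
    if (isWorkerSession(sessionID)) return true; // worker sessions are always skipped
    if (knownGood.has(sessionID)) return false; // cached: no lookup needed

    try {
      const session = await getSession(sessionID);
      // Assumption: a child session is identified by having a parentID.
      if (session.parentID) return true;
    } catch {
      // Lookup failed (e.g. a short ID). No list-all fallback:
      // accept that this session will not be skipped.
    }
    knownGood.add(sessionID);
    return false;
  };
}
```

In this sketch a failed lookup costs one request per session ID rather than one full listing per message event, which is the tradeoff the fix accepts.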

Testing

  • 14 new tests in test/index.test.ts covering:
    • isContextOverflow detection for all error message formats
    • buildRecoveryMessage with and without summaries
    • Re-entrancy guard: concurrent overflow events for the same session (only 1 recovery prompt fired)
    • Curator gating: no curation when turnsSinceCuration < afterTurns
    • shouldSkip: no session.list() fallback, sessions cached after first check
  • All 187 tests pass (173 existing + 14 new)

BYK added 2 commits March 3, 2026 14:51
Update zod dependency to ^4.3.6 and fix config.ts to use explicit
fully-populated default objects for nested schemas, required by Zod v4's
changed .default() semantics (short-circuits instead of parsing defaults).
…and sluggishness

Three bugs identified and fixed:

1. Auto-recovery infinite loop (CRITICAL): When a context overflow error
   triggered auto-recovery via session.prompt(), if the recovery itself
   also overflowed, a new session.error fired with no re-entrancy guard,
   creating an infinite loop of distill+prompt calls (2+ LLM calls per
   cycle). Fix: add recoveringSessions Set — second overflow for the same
   session bails out immediately.

2. Curator fires on every session.idle (HIGH): The condition used
   'onIdle || turnsSinceCuration >= afterTurns'. Since onIdle defaults to
   true, the || short-circuits and afterTurns (default: 10) is never
   checked. The curator fired an LLM worker request after every single
   agent turn. Fix: change || to && — curate on idle only when enough
   turns have accumulated.

3. shouldSkip lists all sessions on every unknown message (MEDIUM): When
   session.get() failed (common with short IDs from message events), the
   fallback called session.list() fetching ALL sessions on every unknown
   message event. Fix: remove session.list() fallback, cache sessions as
   known-good after first check. Worker sessions are already caught by
   isWorkerSession().

Symptoms these fixes address:
- Upstream rate limiting from excessive LLM calls
- Slower LLM interactions (curator competing for rate limit budget)
- Many 'error' sounds in TUI (each failed recovery wrote to stderr)
- 'Prompt too long' errors visible in TUI (recovery loop)
- Overall sluggish OpenCode server behavior
BYK enabled auto-merge (squash) March 3, 2026 23:32
BYK merged commit 460060d into main Mar 3, 2026
1 check passed
BYK deleted the fix/excessive-background-requests branch March 3, 2026 23:39
craft-deployer bot mentioned this pull request Mar 3, 2026