fix: add compound indexes for common multi-column query patterns #40
Open
…handling

Bug 1: Worker sessions were reused across multiple LLM calls in distillation and curation. Each call's assistant response with reasoning/thinking parts accumulated in the session history. When the next call sent the full history back, providers rejected it with 'Multiple reasoning_opaque values received in a single response'.

Fix: Delete the parent→worker mapping from the workerSessions Map immediately after reading the response in all 4 functions (distillSegment, metaDistill, curator.run, curator.consolidate). This ensures each LLM call gets a fresh session. Worker session IDs are kept in the workerSessionIDs Set so shouldSkip() still recognizes them.

Bug 2: The recall tool's execute() had no try/catch. If any of the three search calls (temporal.search, searchDistillations, ltm.search) threw, the entire tool execution failed and returned nothing.

Fix: Wrap each search call in its own try/catch block so partial results are returned even if one source fails. Errors are logged via log.error() (always visible).
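The two fixes described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `WorkerSessions` class, the `acquire`/`release` method names, the `searchAll` helper, and the synchronous `string[]` search signature are all assumptions made for the example. The real search calls are probably asynchronous, and the real session objects are richer than a string ID.

```typescript
// Hypothetical stand-in for the workerSessions Map and workerSessionIDs Set.
class WorkerSessions {
  private readonly byParent = new Map<string, string>();
  private readonly workerIDs = new Set<string>();
  private nextID = 0;

  // Return the worker session mapped to a parent, creating one if none exists.
  acquire(parentID: string): string {
    let id = this.byParent.get(parentID);
    if (id === undefined) {
      id = `worker-${this.nextID++}`;
      this.byParent.set(parentID, id);
      this.workerIDs.add(id);
    }
    return id;
  }

  // Drop the parent→worker mapping so the next call starts a fresh session.
  // The ID stays in workerIDs, so shouldSkip() still recognizes it.
  release(parentID: string): void {
    this.byParent.delete(parentID);
  }

  shouldSkip(sessionID: string): boolean {
    return this.workerIDs.has(sessionID);
  }
}

// Hypothetical shape of one search source (temporal, distillations, ltm).
interface SearchSource {
  name: string;
  search: () => string[];
}

interface PartialResult {
  results: string[];
  failed: string[];
}

// Run every source inside its own try/catch so one failure
// does not discard the results of the others.
function searchAll(
  sources: SearchSource[],
  logError: (msg: string) => void,
): PartialResult {
  const results: string[] = [];
  const failed: string[] = [];
  for (const source of sources) {
    try {
      results.push(...source.search());
    } catch (err) {
      failed.push(source.name);
      const reason = err instanceof Error ? err.message : String(err);
      logError(`${source.name} failed: ${reason}`);
    }
  }
  return { results, failed };
}
```

In this sketch, calling `release()` right after the response is read corresponds to deleting the Map entry in the four functions named above, and `searchAll()` returns whatever the healthy sources produced along with the names of the ones that threw.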
Almost every query in the codebase filters on (project_id, session_id) but only single-column indexes existed, forcing SQLite to pick one and scan the remaining rows for other conditions.

New compound indexes (schema version 6):
- temporal_messages(project_id, session_id): 10+ queries
- temporal_messages(project_id, session_id, distilled): undistilled()
- temporal_messages(project_id, distilled, created_at): pruning
- distillations(project_id, session_id): 8+ queries
- distillations(project_id, session_id, generation, archived): gen0 ops

Drop redundant single-column indexes:
- idx_temporal_project (prefix of idx_temporal_project_session)
- idx_temporal_distilled (low-selectivity, covered by compounds)
- idx_distillation_project (prefix of idx_distillation_project_session)
Problem
The recall tool and many internal queries were slow because virtually every query filters on 2-4 columns simultaneously (project_id + session_id, project_id + session_id + distilled, etc.), but only single-column indexes existed. SQLite generally uses at most one index per table in a query, so it picked the single-column index it judged most selective and then checked the remaining conditions row by row against every match.

Analysis

| Table | Columns filtered together |
| --- | --- |
| temporal_messages | project_id + session_id |
| temporal_messages | project_id + session_id + distilled |
| temporal_messages | project_id + distilled + created_at |
| distillations | project_id + session_id |
| distillations | project_id + session_id + generation + archived |

Additionally, searchDistillations() in reflect.ts uses LIKE-based search (no FTS for distillations), so it is even more important that the row filtering ahead of the LIKE is fast.

Solution
Schema version 6 migration adds 5 compound indexes and drops 3 single-column indexes that the new compounds make redundant (two are left-prefixes of a new compound, one is a low-selectivity boolean).
New indexes:
- idx_temporal_project_session: covers bySession(), search LIKE fallback, count(), undistilledCount()
- idx_temporal_project_session_distilled: covers undistilled(), undistilledCount()
- idx_temporal_project_distilled_created: covers pruning TTL and size-cap passes
- idx_distillation_project_session: covers loadForSession(), latestObservations(), searchDistillations(), resetOrphans()
- idx_distillation_project_session_gen_archived: covers gen0Count(), loadGen0(), gradient prefix loading

Dropped (redundant):
- idx_temporal_project (prefix of idx_temporal_project_session)
- idx_temporal_distilled (low-selectivity boolean, all queries also filter on project_id)
- idx_distillation_project (prefix of idx_distillation_project_session)
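As a rough illustration of what the version 6 migration amounts to, the sketch below builds the SQL statements from the index and column names listed in this PR. The `IndexSpec` type, the `migrationV6Statements()` and `isLeftPrefix()` helpers, and the use of `IF NOT EXISTS` / `IF EXISTS` are assumptions for the example; the actual migration code in the PR may be structured differently.

```typescript
// Hypothetical description of one compound index; names come from the PR text.
interface IndexSpec {
  name: string;
  table: string;
  columns: string[];
}

const NEW_INDEXES: IndexSpec[] = [
  { name: "idx_temporal_project_session", table: "temporal_messages",
    columns: ["project_id", "session_id"] },
  { name: "idx_temporal_project_session_distilled", table: "temporal_messages",
    columns: ["project_id", "session_id", "distilled"] },
  { name: "idx_temporal_project_distilled_created", table: "temporal_messages",
    columns: ["project_id", "distilled", "created_at"] },
  { name: "idx_distillation_project_session", table: "distillations",
    columns: ["project_id", "session_id"] },
  { name: "idx_distillation_project_session_gen_archived", table: "distillations",
    columns: ["project_id", "session_id", "generation", "archived"] },
];

const DROPPED_INDEXES: string[] = [
  "idx_temporal_project",
  "idx_temporal_distilled",
  "idx_distillation_project",
];

// True when `prefix` matches the leading columns of `columns`, in order.
// An index on such a prefix is made redundant by the wider compound index.
function isLeftPrefix(prefix: string[], columns: string[]): boolean {
  return (
    prefix.length <= columns.length &&
    prefix.every((col, i) => col === columns[i])
  );
}

// Build the statements: create the compound indexes first, then drop the old ones.
function migrationV6Statements(): string[] {
  const creates = NEW_INDEXES.map(
    (ix) =>
      `CREATE INDEX IF NOT EXISTS ${ix.name} ON ${ix.table}(${ix.columns.join(", ")})`,
  );
  const drops = DROPPED_INDEXES.map((name) => `DROP INDEX IF EXISTS ${name}`);
  return [...creates, ...drops];
}
```

Assuming the old idx_temporal_project index covered only project_id, `isLeftPrefix(["project_id"], ["project_id", "session_id"])` returns true, which is why that index can be dropped. A single distilled column is not a left-prefix of any compound here, so idx_temporal_distilled is dropped on the low-selectivity argument instead.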