From 998273adf10cce7647c9f50e2e4c55cf51191176 Mon Sep 17 00:00:00 2001
From: Burak Yigit Kaya <byk@sentry.io>
Date: Wed, 11 Mar 2026 22:40:35 +0000
Subject: [PATCH] fix: rotate worker sessions after each LLM call and add
 recall error handling
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bug 1: Worker sessions were reused across multiple LLM calls in
distillation and curation. Each call's assistant response with
reasoning/thinking parts accumulated in the session history. When the
next call sent the full history back, providers rejected it with
'Multiple reasoning_opaque values received in a single response'.

Fix: Delete the parent→worker mapping from the workerSessions Map
immediately after reading the response in all 4 functions
(distillSegment, metaDistill, curator.run, curator.consolidate). This
ensures each LLM call gets a fresh session. Worker session IDs are
kept in workerSessionIDs Set so shouldSkip() still recognizes them.
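
The rotation can be illustrated with a minimal sketch. It is not the
plugin's code: ensureWorkerSession() and callWorker() are hypothetical
helpers, and the shapes of workerSessions and workerSessionIDs are
assumed from the description above.

```typescript
// Assumed shapes: parent session ID -> current worker session ID,
// plus a set of every worker session ID ever created.
const workerSessions = new Map<string, string>();
const workerSessionIDs = new Set<string>();

let workerCounter = 0;

// Hypothetical helper: reuse the mapped worker session or create one.
function ensureWorkerSession(parentID: string): string {
  const existing = workerSessions.get(parentID);
  if (existing !== undefined) return existing;
  workerCounter += 1;
  const workerID = `worker-${workerCounter}`;
  workerSessions.set(parentID, workerID);
  workerSessionIDs.add(workerID);
  return workerID;
}

// Hypothetical stand-in for one LLM round trip in a worker session.
function callWorker(parentID: string): string {
  const workerID = ensureWorkerSession(parentID);
  // ... prompt the worker and read its response here ...
  // Rotate: drop the mapping so the next call creates a fresh session.
  workerSessions.delete(parentID);
  return workerID;
}

// Worker sessions stay recognizable after rotation.
function shouldSkip(sessionID: string): boolean {
  return workerSessionIDs.has(sessionID);
}

const firstWorker = callWorker("parent-1");
const secondWorker = callWorker("parent-1");
console.log(firstWorker, secondWorker, shouldSkip(firstWorker));
// prints "worker-1 worker-2 true"
```

Deleting only the Map entry, and not the Set entry, is what lets
rotated sessions remain skippable.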

Bug 2: The recall tool's execute() had no try/catch. If any of the
three search calls (temporal.search, searchDistillations, ltm.search)
threw, the entire tool execution failed and returned nothing.

Fix: Wrap each search call in its own try/catch block so partial
results are returned even if one source fails. Errors are logged via
log.error(), which always emits regardless of LORE_DEBUG.
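
As a rough illustration of the partial-results behavior, the sketch
below factors the pattern into a hypothetical safeSearch() helper. The
patch itself inlines three separate try/catch blocks; the helper, the
recordError callback, and the stand-in search functions are assumptions
made only for this example.

```typescript
// Hypothetical helper, not part of the patch: run one search and fall
// back to an empty list if it throws, reporting the failure.
function safeSearch<T>(
  label: string,
  search: () => T[],
  onError: (message: string, err: unknown) => void,
): T[] {
  try {
    return search();
  } catch (err) {
    onError(`recall: ${label} search failed:`, err);
    return [];
  }
}

// Stand-in for log.error(): collect messages so they can be inspected.
const failures: string[] = [];
const recordError = (message: string, _err: unknown): void => {
  failures.push(message);
};

// Stand-ins for the three sources; the middle one fails.
const temporalHits = safeSearch("temporal", () => ["message A"], recordError);
const distillationHits = safeSearch<string>(
  "distillation",
  () => {
    throw new Error("database is locked");
  },
  recordError,
);
const knowledgeHits = safeSearch("knowledge", () => ["entry B"], recordError);

console.log(temporalHits.length, distillationHits.length, knowledgeHits.length);
// prints "1 0 1"
console.log(failures);
// prints [ 'recall: distillation search failed:' ]
```

One failing source yields an empty list and a single logged error,
while the other two sources still contribute their results.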
---
 AGENTS.md           |  7 ++---
 src/curator.ts      |  8 ++++++
 src/distillation.ts |  8 ++++++
 src/reflect.ts      | 65 +++++++++++++++++++++++++++------------------
 4 files changed, 57 insertions(+), 31 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index e7366d6..c89d4bb 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -40,9 +40,6 @@
 <!-- lore:019cb3e6-da66-7534-a573-30d2ecadfd53 -->
 * **Returning bare promises loses async function from error stack traces**: When an \`async\` function returns another promise without \`await\`, the calling function disappears from error stack traces if the inner promise rejects. A function that drops \`async\` and does \`return someAsyncCall()\` loses its frame entirely. Fix: keep the function \`async\` and use \`return await someAsyncCall()\`. This matters for debugging — the intermediate function name in the stack trace helps locate which code path triggered the failure. ESLint rule \`no-return-await\` is outdated; modern engines optimize \`return await\` in async functions.
 
-<!-- lore:019cd20d-f42c-71bf-9da5-b2dd52c5014d -->
-* **sgdisk reserves 33 sectors for backup GPT, shrinking partition vs original layout**: When recreating a GPT partition with \`sgdisk\`, it sets LastUsableLBA 33 sectors short of disk end for backup GPT. If the original partition extended to the last sector (common for factory-formatted exFAT SD cards), the recreated partition is too small. Windows validates exFAT VolumeLength matches GPT partition size — mismatch causes 'drive not formatted' error. Fix: patch the exFAT VBR's VolumeLength to match GPT partition size (LastLBA - FirstLBA + 1), then recalculate boot region checksum (sector 11). Do NOT extend LastUsableLBA past backup GPT header location.
-
 <!-- lore:019c8f4f-67ca-7212-a8c4-8a75b230ceea -->
 * **Test DB isolation via LORE\_DB\_PATH and Bun test preload**: Lore test suite uses isolated temp DB via test/setup.ts preload (bunfig.toml). Preload sets LORE\_DB\_PATH to mkdtempSync path before any imports of src/db.ts; afterAll cleans up. src/db.ts checks LORE\_DB\_PATH first. agents-file.test.ts needs beforeEach cleanup for intra-file isolation and TEST\_UUIDS cleanup in afterAll (shared with ltm.test.ts). Individual test files don't need close() calls — preload handles DB lifecycle.
 
@@ -55,7 +52,7 @@
 * **Lore logging: LORE\_DEBUG gating for info/warn, always-on for errors**: src/log.ts provides three levels: log.info() and log.warn() are suppressed unless LORE\_DEBUG=1 or LORE\_DEBUG=true; log.error() always emits. All write to stderr with \[lore] prefix. This exists because OpenCode TUI renders all stderr as red error text — routine status messages (distillation counts, pruning stats, consolidation) were alarming users. Rule: use log.info() for successful operations and status, log.warn() for non-actionable oddities (e.g. dropping trailing messages), log.error() only in catch blocks for real failures. Never use console.error directly in plugin source files.
 
 <!-- lore:019cb12a-c957-7e24-b3f5-6869f3429d13 -->
-* **Lore release process: craft + issue-label publish**: Lore/Craft release pipeline and gotchas: (1) Trigger release.yml via workflow\_dispatch with version='auto' — craft determines version and creates GitHub issue. Label 'accepted' → publish.yml runs craft publish with npm OIDC. Don't create release branches or bump package.json manually. (2) GitHub App must be installed per-repo ('Only select repositories' → add at Settings → Installations). APP\_ID/APP\_PRIVATE\_KEY in \`production\` environment. Symptom: 404 on GET /repos/.../installation. (3) npm OIDC only works for publish — \`npm info\` needs NPM\_TOKEN for private packages (public works without auth).
+* **Lore release process: craft + issue-label publish**: Lore/Craft release pipeline: (1) Trigger release.yml via workflow\_dispatch with version='auto' — craft determines version and creates GitHub issue. Label 'accepted' → publish.yml runs craft publish with npm OIDC. Don't create release branches or bump package.json manually. (2) GitHub App must be installed per-repo. APP\_ID/APP\_PRIVATE\_KEY in \`production\` environment. Symptom: 404 on GET /repos/.../installation. (3) npm OIDC only works for publish — \`npm info\` needs NPM\_TOKEN for private packages.
 
 <!-- lore:019cb200-0001-7000-8000-000000000001 -->
 * **PR workflow for opencode-lore: branch → PR → auto-merge**: All changes (including minor fixes and test-only changes) must go through a branch + PR + auto-merge, never pushed directly to main. Workflow: (1) git checkout -b \<type>/\<slug>, (2) commit, (3) git push -u origin HEAD, (4) gh pr create --title "..." --body "..." --base main, (5) gh pr merge --auto --squash \<PR#>. Branch name conventions follow merged PR history: fix/\<slug>, feat/\<slug>, chore/\<slug>. Auto-merge with squash is required (merge commits disallowed). Never push directly to main even for trivial changes.
@@ -63,5 +60,5 @@
 ### Preference
 
 <!-- lore:019ca19d-fc02-7657-b2e9-7764658c01a5 -->
-* **Code style**: User prefers no backwards-compat shims — fix callers directly. Prefer explicit error handling over silent failures. Derive thresholds from existing constants rather than hardcoding magic numbers (e.g., use \`raw.length <= COL\_COUNT\` instead of \`n < 10\_000\`). In CI, define shared env vars at workflow level, not per-job. Always dry-run before bulk destructive operations (SELECT before DELETE to verify row count).
+* **Code style**: No backwards-compat shims — fix callers directly. Prefer explicit error handling over silent failures. Derive thresholds from existing constants rather than hardcoding magic numbers. In CI, define shared env vars at workflow level, not per-job. Dry-run before bulk destructive operations (SELECT before DELETE). Prefer \`jq\`/\`sed\`/\`awk\` over \`node -e\` for JSON manipulation in CI scripts.
 <!-- End lore-managed section -->
diff --git a/src/curator.ts b/src/curator.ts
index 9ca4603..c92d14e 100644
--- a/src/curator.ts
+++ b/src/curator.ts
@@ -113,6 +113,11 @@ export async function run(input: {
     path: { id: workerID },
     query: { limit: 2 },
   });
+  // Rotate worker session so the next call starts fresh — prevents
+  // accumulating multiple assistant messages with reasoning/thinking parts,
+  // which providers reject ("Multiple reasoning_opaque values").
+  workerSessions.delete(input.sessionID);
+
   const last = msgs.data?.at(-1);
   if (!last || last.info.role !== "assistant")
     return { created: 0, updated: 0, deleted: 0 };
@@ -222,6 +227,9 @@ export async function consolidate(input: {
     path: { id: workerID },
     query: { limit: 2 },
   });
+  // Rotate worker session — see run() comment.
+  workerSessions.delete(input.sessionID);
+
   const last = msgs.data?.at(-1);
   if (!last || last.info.role !== "assistant") return { updated: 0, deleted: 0 };
 
diff --git a/src/distillation.ts b/src/distillation.ts
index 453e5bf..ad4d924 100644
--- a/src/distillation.ts
+++ b/src/distillation.ts
@@ -388,6 +388,11 @@ async function distillSegment(input: {
     path: { id: workerID },
     query: { limit: 2 },
   });
+  // Rotate worker session so the next call starts fresh — prevents
+  // accumulating multiple assistant messages with reasoning/thinking parts,
+  // which providers reject ("Multiple reasoning_opaque values").
+  workerSessions.delete(input.sessionID);
+
   const last = msgs.data?.at(-1);
   if (!last || last.info.role !== "assistant") return null;
 
@@ -438,6 +443,9 @@ async function metaDistill(input: {
     path: { id: workerID },
     query: { limit: 2 },
   });
+  // Rotate worker session — see distillSegment() comment.
+  workerSessions.delete(input.sessionID);
+
   const last = msgs.data?.at(-1);
   if (!last || last.info.role !== "assistant") return null;
 
diff --git a/src/reflect.ts b/src/reflect.ts
index c37fd2c..5addfac 100644
--- a/src/reflect.ts
+++ b/src/reflect.ts
@@ -1,6 +1,7 @@
 import { tool } from "@opencode-ai/plugin/tool";
 import * as temporal from "./temporal";
 import * as ltm from "./ltm";
+import * as log from "./log";
 import { db, ensureProject } from "./db";
 import { serialize, inline, h, p, ul, lip, liph, t, root } from "./markdown";
 
@@ -114,34 +115,46 @@ export function createRecallTool(projectPath: string, knowledgeEnabled = true):
       const scope = args.scope ?? "all";
       const sid = context.sessionID;
 
-      const temporalResults =
-        scope === "knowledge"
-          ? []
-          : temporal.search({
-              projectPath,
-              query: args.query,
-              sessionID: scope === "session" ? sid : undefined,
-              limit: 10,
-            });
+      let temporalResults: temporal.TemporalMessage[] = [];
+      if (scope !== "knowledge") {
+        try {
+          temporalResults = temporal.search({
+            projectPath,
+            query: args.query,
+            sessionID: scope === "session" ? sid : undefined,
+            limit: 10,
+          });
+        } catch (err) {
+          log.error("recall: temporal search failed:", err);
+        }
+      }
 
-      const distillationResults =
-        scope === "knowledge"
-          ? []
-          : searchDistillations({
-              projectPath,
-              query: args.query,
-              sessionID: scope === "session" ? sid : undefined,
-              limit: 5,
-            });
+      let distillationResults: Distillation[] = [];
+      if (scope !== "knowledge") {
+        try {
+          distillationResults = searchDistillations({
+            projectPath,
+            query: args.query,
+            sessionID: scope === "session" ? sid : undefined,
+            limit: 5,
+          });
+        } catch (err) {
+          log.error("recall: distillation search failed:", err);
+        }
+      }
 
-      const knowledgeResults =
-        !knowledgeEnabled || scope === "session"
-          ? []
-          : ltm.search({
-              query: args.query,
-              projectPath,
-              limit: 10,
-            });
+      let knowledgeResults: ltm.KnowledgeEntry[] = [];
+      if (knowledgeEnabled && scope !== "session") {
+        try {
+          knowledgeResults = ltm.search({
+            query: args.query,
+            projectPath,
+            limit: 10,
+          });
+        } catch (err) {
+          log.error("recall: knowledge search failed:", err);
+        }
+      }
 
       return formatResults({
         temporalResults,