perf(hooks): queue session writes, split session-start, cache version check by davidbuniat · Pull Request #61 · activeloopai/hivemind

davidbuniat · 2026-04-18T06:12:59Z

Summary

Disk-backed session write queue — src/hooks/session-queue.ts replaces the per-event INSERT from capture.ts with an append-only JSONL queue per session. Inflight renames, stale recovery, batched flushes, and an auth-failure disable state keep the hot path local-only; flushes run on Stop/SubagentStop and SessionEnd.
Session-start split — session-start.ts is now local-only (credentials + context injection). All network work (table setup, placeholder row, queue drain, version refresh, auto-update) moved to session-start-setup.ts so the sync hook no longer blocks on Deeplake.
Cached version check — new src/hooks/version-check.ts caches the latest-version lookup with a TTL so we stop hitting GitHub on every session start.
Grep prefilter — grep-core gains extractRegexLiteralPrefilter so content scans with safe literal substrings (foo.*bar → foo) still benefit from a LIKE anchor.
Single-query memory+sessions reads — virtual-table-query.ts and grep-core.ts now emit one UNION ALL query across the memory and sessions tables (with a parallel-query fallback on failure), collapsing two round-trips into one.
buildPathFilter narrows file-like paths — paths whose basename contains a . now use exact path = instead of the LIKE '.../%' prefix, avoiding spurious matches when a file path is also a directory prefix.
Mirror changes on the codex plugin side (capture, pre-tool-use, session-start, session-start-setup, stop) and regenerated bundles for both plugins.
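The queue described in the first item can be pictured with a small sketch. This is not the code in src/hooks/session-queue.ts: the `QueuedRow` shape, the `enqueue` / `flush` names, the `.jsonl` file name, and the synchronous `send` callback are assumptions made for illustration, and stale recovery and the auth-failure disable state are left out. Only the append-only JSONL file and the `.inflight` rename come from the PR text.

```typescript
import { appendFileSync, existsSync, mkdtempSync, readFileSync, renameSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical row shape; the real queue rows are not shown in the PR text.
type QueuedRow = { path: string; message: string };

// Hot path: one local append per event, no network call.
export function enqueue(queuePath: string, row: QueuedRow): void {
  appendFileSync(queuePath, JSON.stringify(row) + "\n");
}

// Turn boundary: claim the queue by renaming it, then hand rows to `send` in batches.
// If `send` throws, the .inflight file stays on disk for a later recovery pass.
export function flush(
  queuePath: string,
  batchSize: number,
  send: (batch: QueuedRow[]) => void,
): number {
  if (!existsSync(queuePath)) return 0;
  const inflightPath = `${queuePath}.inflight`;
  renameSync(queuePath, inflightPath);
  const rows = readFileSync(inflightPath, "utf-8")
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as QueuedRow);
  for (let i = 0; i < rows.length; i += batchSize) {
    send(rows.slice(i, i + batchSize));
  }
  rmSync(inflightPath, { force: true });
  return rows.length;
}

// Demo: three events queued, flushed in batches of two.
export function demoQueue(): number[] {
  const dir = mkdtempSync(join(tmpdir(), "session-queue-"));
  const queuePath = join(dir, "session-1.jsonl");
  for (const message of ["a", "b", "c"]) {
    enqueue(queuePath, { path: "/sessions/session-1.jsonl", message });
  }
  const batchSizes: number[] = [];
  flush(queuePath, 2, (batch) => batchSizes.push(batch.length));
  rmSync(dir, { recursive: true, force: true });
  return batchSizes;
}
```

Renaming before reading means that events captured during a flush land in a fresh queue file instead of being mixed into the batch that is already in flight.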

Review fixes (`7698cb8`)

SQL injection guard (session-queue.ts) — escape \ → \\ before ' → '' in the JSON message column so payloads survive SQL backends with standard_conforming_strings=off.
Pre-release version compare (version-check.ts) — strip the -<tag> suffix before Number() in isNewer, so 1.2.3-beta no longer collapses to NaN and silently suppresses update prompts.
requeueInflight TOCTOU (session-queue.ts) — replaced the existsSync/renameSync branch with an unconditional appendFileSync(queuePath, inflight). Removes the race where a concurrent capture append could be atomically overwritten by the rename.
extractScopedPath regex — resolved: the dual-table scope branching (and src/virtual-path-scope.ts) was removed in 31c573a; both tables are queried via the new UNION ALL path.

Follow-up (post-review)

Hook modules are now import-safe — each hook entrypoint (capture, pre-tool-use, session-start, session-start-setup, session-end, stop, and the codex mirrors) splits its side-effecting main() from pure helpers and guards execution behind a new isDirectRun(import.meta.url) helper (src/utils/direct-run.ts). Lets the test suite import hook logic without triggering stdin reads or network calls.
Source-level hook tests — new claude-code/tests/hooks-source.test.ts (~590 lines) and codex/tests/codex-source-hooks.test.ts (~616 lines) cover the extracted entry points end-to-end with mocked config/API deps. New version-check.test.ts locks in TTL + pre-release handling. virtual-table-query.test.ts covers the batched readVirtualPathContents / listVirtualPathRowsForDirs and the UNION ALL → parallel fallback path.
Bundle regeneration — both claude-code/bundle/* and codex/bundle/* rebuilt to match the refactored source.
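A minimal sketch of what an `isDirectRun` guard could look like, assuming it compares the module URL with the script path Node was started with. The real src/utils/direct-run.ts is not shown in this PR text, so the second parameter and the comparison strategy are assumptions; symlinked entry points are not handled here.

```typescript
import { resolve } from "node:path";
import { fileURLToPath, pathToFileURL } from "node:url";

// Returns true when the module identified by `metaUrl` is the script Node was launched with.
// `entryPath` is a hypothetical parameter added so the check can be tested without spawning Node.
export function isDirectRun(
  metaUrl: string,
  entryPath: string | undefined = process.argv[1],
): boolean {
  if (!entryPath) return false;
  try {
    return fileURLToPath(metaUrl) === resolve(entryPath);
  } catch {
    // fileURLToPath throws for non-file URLs; treat those as "imported".
    return false;
  }
}

// Illustrative values only; a real hook would pass import.meta.url:
//   if (isDirectRun(import.meta.url)) { void main(); }
export const exampleEntry = resolve("capture.js");
export const exampleUrl = pathToFileURL(exampleEntry).href;
```

With a guard of this kind, importing a hook module from a test only defines its helpers, while running the bundle as a script still calls `main()`.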

Test plan

tsc --noEmit --skipLibCheck (via pre-commit)
npm test — source-level hook suites pass (hooks-source, codex-source-hooks, version-check, virtual-table-query, session-queue)
Smoke: start a session with no creds → device flow still works; start with creds → context injects instantly, setup runs in background
Smoke: run a few turns, confirm rows land in the sessions table in batches (look for flush flushed: rows=N batches=M in ~/.deeplake/hook-debug.log)
Smoke: kill a session mid-flush, reopen → stale inflight recovered, queue drains on next SessionStart
Smoke: ls /sessions, grep under /sessions/... and under memory paths → single UNION ALL query issued (fallback only on DB error)

🤖 Generated with Claude Code

… check

Refactor the hot-path session-start and capture hooks to do less synchronous network work, and introduce a disk-backed write queue so per-event inserts no longer block the user.

Highlights:
- New src/hooks/session-queue.ts: append-only JSONL queue per session with inflight rename, stale recovery, batched INSERT flush, and auth-failure disable state. Flushed on Stop/SubagentStop and SessionEnd.
- src/hooks/capture.ts now enqueues rows locally instead of issuing one INSERT per event; flush happens at turn boundaries.
- src/hooks/session-start.ts slimmed to local-only work (credentials + context injection). All network work (table setup, placeholder row, queue drain, version check, auto-update) moved to session-start-setup.ts.
- New src/hooks/version-check.ts with cached latest-version lookup (TTL) so we don't hit GitHub on every session start.
- New src/virtual-path-scope.ts centralizes /sessions/ vs memory routing; pre-tool-use and grep-core consult it for ls/find/grep scoping so sessions and memory are queried in parallel only when the path covers both.
- grep-core gains a regex literal prefilter helper so content scans can still leverage a LIKE anchor when a safe substring exists.
- Matching changes on the codex side (capture/pre-tool-use/session-start/session-start-setup/stop) and regenerated bundles for both plugins.
- Tests: new session-queue.test.ts and version-check.test.ts; updates to session-start, grep-core, grep-interceptor, deeplake-api, and codex integration tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-04-18T06:13:15Z

Claude finished @davidbuniat's task in 2m 16s

Review

session-queue.ts:125 — SQL injection via backslash: manual replace(/'/g, "''") on the JSON message field skips backslash escaping; on a backend with standard_conforming_strings=off, \' in JSON terminates the string literal.
version-check.ts:36 — isNewer silently fails for pre-release version strings (e.g. 1.2.3-beta): Number("3-beta") is NaN, all comparisons return false, update prompt is suppressed.
session-queue.ts:357 — TOCTOU data loss in requeueInflight: a concurrent capture process can create the queue file between existsSync and renameSync, which then atomically overwrites it.
grep-core.ts:273 — extractScopedPath regex [^']+ breaks for SQL-escaped paths (single quote → ''), causing wrong table routing.

Inline suggestions attached to each line.

claude · 2026-04-18T06:15:20Z

+  if (rows.length === 0) throw new Error("buildSessionInsertSql: rows must not be empty");
+  const table = sqlIdent(sessionsTable);
+  const values = rows.map((row) => {
+    const jsonForSql = row.message.replace(/'/g, "''");


Manual single-quote escaping is insufficient: if the SQL backend runs with standard_conforming_strings=off, a \' sequence inside JSON would be parsed as an escaped quote by the SQL engine, terminating the string literal prematurely and allowing SQL injection. The existing comment on the old capture.ts noted the trade-off, but centralising this into a shared builder makes the risk surface larger. Either document that the Deeplake SQL endpoint always uses standard-conforming strings, or strip backslashes before content reaches the DB (accepting that embedded \ in JSON strings become literal).

Suggested change

const jsonForSql = row.message.replace(/'/g, "''");

const jsonForSql = row.message.replace(/\\/g, "\\\\").replace(/'/g, "''");
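The effect of the suggested line can be checked in isolation. The helper name `escapeSqlStringLiteral` is hypothetical (the PR inlines the two `replace` calls in the insert builder); the sketch only shows what the two replacements do to a payload that contains a backslash followed by a quote.

```typescript
// Hypothetical standalone helper wrapping the two replacements from the suggestion.
export function escapeSqlStringLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // each backslash becomes two backslashes
    .replace(/'/g, "''"); // each single quote becomes two single quotes
}

// The three characters \'x become \\''x, so a backend that treats backslash as an
// escape character reads one literal backslash followed by one literal quote.
export const escapedExample = escapeSqlStringLiteral("\\'x");
```

On a backend with standard-conforming strings the doubled backslash is stored as two characters, which is the trade-off the review comment mentions.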

claude · 2026-04-18T06:15:22Z

+}
+
+export function isNewer(latest: string, current: string): boolean {
+  const parse = (v: string) => v.split(".").map(Number);


Number("3-beta") → NaN; all > / === comparisons involving NaN return false, so a pre-release tag in either version silently treats the version as equal and suppresses the update prompt.

Suggested change

const parse = (v: string) => v.split(".").map(Number);

const parse = (v: string) => v.replace(/-.*$/, "").split(".").map(Number);
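A sketch of how the comparison behaves with the suggested `parse`. Only the `parse` line comes from the review; the loop around it is an assumed implementation of `isNewer`, not the code in version-check.ts.

```typescript
export function isNewer(latest: string, current: string): boolean {
  // Strip any "-<tag>" suffix so "1.2.3-beta" parses as [1, 2, 3] instead of [1, 2, NaN].
  const parse = (v: string): number[] =>
    v.replace(/-.*$/, "").split(".").map(Number);
  const a = parse(latest);
  const b = parse(current);
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const x = a[i] ?? 0; // missing components count as 0
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return false; // equal versions are not "newer"
}
```

One consequence of stripping the tag is that `1.2.3` and `1.2.3-beta` compare as equal, so a user on a pre-release is not prompted when the matching release ships.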

claude · 2026-04-18T06:15:25Z

+      if (isStale(path, staleInflightMs)) {
+        sessionIds.add(name.slice(0, -".inflight".length));
+      }
+    }


TOCTOU: another capture process can append to queuePath between the existsSync check and renameSync, then renameSync atomically overwrites that file and those rows are lost; prefer appending the inflight content to queuePath unconditionally instead of branching on existence.

Suggested change

}

if (!existsSync(inflightPath)) return;
if (existsSync(queuePath)) {
  const inflight = readFileSync(inflightPath, "utf-8");
  const queued = readFileSync(queuePath, "utf-8");
  writeFileSync(queuePath, `${inflight}${queued}`);
  rmSync(inflightPath, { force: true });
} else {
  renameSync(inflightPath, queuePath);
}
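For comparison, the unconditional-append approach that the review comment recommends in prose (and that the PR summary says was adopted) can be sketched as follows. The function body is an assumption, not the code in session-queue.ts, and the file names in the demo are illustrative.

```typescript
import { appendFileSync, existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Sketch: put inflight rows back on the queue without ever replacing the queue file.
export function requeueInflight(inflightPath: string, queuePath: string): void {
  if (!existsSync(inflightPath)) return;
  const inflight = readFileSync(inflightPath, "utf-8");
  // appendFileSync creates the file when it is missing and never truncates an existing one,
  // so rows appended by a concurrent capture process are preserved.
  appendFileSync(queuePath, inflight);
  rmSync(inflightPath, { force: true });
}

// Demo: the queue already holds a row written by a concurrent capture.
export function demoRequeue(): string {
  const dir = mkdtempSync(join(tmpdir(), "requeue-"));
  const inflightPath = join(dir, "s1.inflight");
  const queuePath = join(dir, "s1.jsonl");
  writeFileSync(inflightPath, '{"n":1}\n');
  writeFileSync(queuePath, '{"n":2}\n');
  requeueInflight(inflightPath, queuePath);
  const result = readFileSync(queuePath, "utf-8");
  rmSync(dir, { recursive: true, force: true });
  return result;
}
```

Appending places the recovered rows after any rows already queued, so this variant gives up strict ordering between the two groups in exchange for removing the overwrite window.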

claude · 2026-04-18T06:15:28Z

 }

+function extractScopedPath(pathFilter: string): string {
+  const match = pathFilter.match(/path = '([^']+)'/);


[^']+ stops at the first ', so a path containing a single quote (SQL-escaped to '' by buildPathFilter) yields a truncated or wrong path, routing the query to both tables instead of just one.

Suggested change

const match = pathFilter.match(/path = '([^']+)'/);

const match = pathFilter.match(/path = '((?:[^']|'')+)'/);

return match?.[1].replace(/''/g, "'") ?? "/";
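The difference between the two patterns shows up on any path that contains a single quote. The sketch wraps the suggested regex in a function with the same name as in the diff; the second function reproduces the original pattern for comparison. Per the PR summary, `extractScopedPath` was later removed, so this is illustrative only.

```typescript
// Suggested pattern: a doubled quote ('') is accepted inside the literal and unescaped afterwards.
export function extractScopedPath(pathFilter: string): string {
  const match = pathFilter.match(/path = '((?:[^']|'')+)'/);
  const raw = match?.[1];
  return raw === undefined ? "/" : raw.replace(/''/g, "'");
}

// Original pattern, kept here only to show the truncation.
export function extractScopedPathOld(pathFilter: string): string {
  const match = pathFilter.match(/path = '([^']+)'/);
  return match?.[1] ?? "/";
}

// A filter as it would look after SQL-escaping the path /notes/it's.md
export const exampleFilter = "path = '/notes/it''s.md'";
```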

github-actions · 2026-04-18T06:15:41Z

Coverage Report

Scope: files changed in this PR. Enforced threshold: 90% per metric (per file via vitest.config.ts).

Status	Category	Percentage	Covered / Total
🔴	Lines	88.93% (🎯 90%)	2146 / 2413
🔴	Statements	87.03% (🎯 90%)	2455 / 2821
🔴	Functions	86.52% (🎯 90%)	308 / 356
🔴	Branches	84.02% (🎯 90%)	1925 / 2291

File Coverage — 30 files changed

File	Stmts	Branches	Functions	Lines
`src/deeplake-api.ts`	🟢 96.9%	🔴 89.7%	🟢 97.3%	🟢 98.2%
`src/hooks/bash-command-compiler.ts`	🟢 94.3%	🔴 87.7%	🟢 96.2%	🟢 99.0%
`src/hooks/capture.ts`	🟢 98.2%	🟢 98.4%	🟢 100.0%	🟢 100.0%
`src/hooks/codex/capture.ts`	🟢 95.7%	🟢 90.6%	🟢 100.0%	🟢 95.2%
`src/hooks/codex/pre-tool-use.ts`	🟢 95.6%	🔴 86.0%	🟢 90.9%	🟢 95.9%
`src/hooks/codex/session-start-setup.ts`	🟢 91.5%	🟢 91.8%	🔴 80.0%	🟢 92.4%
`src/hooks/codex/session-start.ts`	🟢 100.0%	🟢 96.2%	🟢 100.0%	🟢 100.0%
`src/hooks/codex/spawn-wiki-worker.ts`	🔴 20.0%	🔴 0.0%	🔴 25.0%	🔴 20.0%
`src/hooks/codex/stop.ts`	🟢 98.0%	🟢 94.7%	🔴 88.9%	🟢 97.8%
`src/hooks/codex/wiki-worker.ts`	🔴 0.0%	🔴 0.0%	🔴 0.0%	🔴 0.0%
`src/hooks/grep-direct.ts`	🟢 96.9%	🟢 92.9%	🟢 100.0%	🟢 98.4%
`src/hooks/memory-path-utils.ts`	🟢 100.0%	🟢 90.9%	🟢 100.0%	🟢 100.0%
`src/hooks/pre-tool-use.ts`	🔴 86.1%	🔴 75.0%	🟢 92.8%	🟢 90.5%
`src/hooks/query-cache.ts`	🟢 95.7%	🟢 100.0%	🔴 80.0%	🟢 100.0%
`src/hooks/session-end.ts`	🟢 100.0%	🟢 95.7%	🟢 100.0%	🟢 100.0%
`src/hooks/session-queue.ts`	🟢 96.7%	🔴 88.3%	🟢 100.0%	🟢 98.3%
`src/hooks/session-start-setup.ts`	🟢 96.3%	🟢 94.9%	🔴 83.3%	🟢 96.2%
`src/hooks/session-start.ts`	🟢 100.0%	🟢 94.7%	🟢 100.0%	🟢 100.0%
`src/hooks/spawn-wiki-worker.ts`	🔴 20.0%	🔴 0.0%	🔴 25.0%	🔴 20.0%
`src/hooks/summary-state.ts`	🔴 52.6%	🔴 52.4%	🔴 61.5%	🔴 60.3%
`src/hooks/version-check.ts`	🟢 100.0%	🟢 92.5%	🟢 100.0%	🟢 100.0%
`src/hooks/virtual-table-query.ts`	🟢 97.8%	🟢 93.9%	🟢 95.7%	🟢 98.8%
`src/hooks/wiki-worker.ts`	🔴 0.0%	🔴 0.0%	🔴 0.0%	🔴 0.0%
`src/shell/deeplake-fs.ts`	🔴 89.4%	🔴 79.3%	🔴 85.5%	🟢 92.8%
`src/shell/grep-core.ts`	🟢 96.2%	🟢 90.3%	🟢 97.0%	🟢 100.0%
`src/shell/grep-interceptor.ts`	🟢 97.8%	🟢 91.1%	🟢 91.7%	🟢 100.0%
`src/utils/direct-run.ts`	🟢 100.0%	🟢 100.0%	🟢 100.0%	🟢 100.0%
`src/utils/session-path.ts`	—	—	—	—
`src/utils/version-check.ts`	—	—	—	—
`src/utils/wiki-log.ts`	—	—	—	—

Generated for commit 8894d6b.

…s, requeue race

- session-queue: escape backslashes before single quotes so JSON payloads survive SQL backends with standard_conforming_strings=off.
- version-check: strip pre-release tags before Number() so 1.2.3-beta compares deterministically instead of collapsing to NaN.
- session-queue: requeueInflight now appends inflight content via appendFileSync unconditionally, removing the existsSync→renameSync TOCTOU window where a concurrent capture append could be overwritten.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The virtual /index.md served from the Deeplake-backed memory path was only listing rows from the `memory` table (summaries), so in workspaces where the memory table is empty or has been dropped (e.g. locomo_benchmark/baseline) the index falsely reported "0 sessions" / "1 sessions" even when the `sessions` table held hundreds of rows. Agents reading the index would conclude memory was empty and give up on retrieval. Extend `buildVirtualIndexContent` to accept both summary and session rows and render them under `## Summaries` and `## Sessions` sections, with a combined header like `273 entries (1 summaries, 272 sessions):`. Update the fallback branch in `readVirtualPathContents` to query both tables in parallel and pass the results to the new builder. Verified against the locomo baseline benchmark: the same three QAs that previously saw a 1-entry index (conv 0 / qa 6, 25, 46) now receive the full listing on the fast-path cat index.md call, and the generated index matches the 272 sessions ingested into the baseline workspace.
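A sketch of a dual-table index builder in the spirit of this commit. The header format and the `## Summaries` / `## Sessions` section names come from the commit message; the `IndexRow` shape and the bullet layout are assumptions, and the real `buildVirtualIndexContent` may render more detail per row.

```typescript
// Hypothetical row shape; the real builder's input type is not shown in the PR text.
type IndexRow = { path: string };

export function buildVirtualIndexContent(
  summaries: IndexRow[],
  sessions: IndexRow[] = [], // optional so callers that pass only summaries keep working
): string {
  const total = summaries.length + sessions.length;
  const lines = [
    `${total} entries (${summaries.length} summaries, ${sessions.length} sessions):`,
  ];
  if (summaries.length > 0) {
    lines.push("", "## Summaries", ...summaries.map((row) => `- ${row.path}`));
  }
  if (sessions.length > 0) {
    lines.push("", "## Sessions", ...sessions.map((row) => `- ${row.path}`));
  }
  return lines.join("\n") + "\n";
}
```

With an empty memory table and 272 session rows, the header reads `272 entries (0 summaries, 272 sessions):`, so an agent reading the index no longer sees a bare "0 sessions".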

Lock in the fix that made `buildVirtualIndexContent` aware of session rows and made the fallback path in `readVirtualPathContents` query both tables when /index.md has no physical row.

New unit tests for `buildVirtualIndexContent`:
- renders both sections with a combined "N entries (X summaries, Y sessions):" header when both tables have rows, with Summaries listed before Sessions
- renders only sessions when the memory table is empty (guards the baseline_cloud regression where the old output reported "0 sessions:" despite 272 rows in the sessions table)
- stays backwards-compatible for callers that pass only summary rows
- produces a well-formed empty index when both inputs are empty

New integration tests for `readVirtualPathContents`:
- when /index.md has no physical row, the fallback issues three queries (union for exact paths + two parallel fallback queries) and each fallback targets the correct table and LIKE filter
- the synthesized index still renders summaries if the sessions-table fallback query rejects

One existing test (`reads multiple exact paths in a single query and synthesizes /index.md when needed`) was updated to expect three calls instead of two, matching the new dual-table fallback behavior.

…al QAs

Adds integration coverage for the three LoCoMo QAs that cloud baseline got wrong before the /index.md fix landed (conv_0 questions 6, 25, 46):
- qa_6: "When is Melanie planning on going camping?" (gold: June 2023)
- qa_25: "When did Caroline go to the LGBTQ conference?" (10 July 2023)
- qa_46: "Would Melanie be considered an ally..." (Yes, she is supportive)

Each QA is driven through `processPreToolUse` twice, once via the Read-tool intercept (`Read /home/.deeplake/memory/index.md`) and once via the Bash intercept (`cat /home/.deeplake/memory/index.md`), against a DeeplakeApi mock that mirrors the real sessions-only baseline workspace at the time of the regression (memory table empty, 272 rows across conv_0..9 in the sessions table).

The assertions verify the synthesized index reports "272 entries (0 summaries, 272 sessions):", contains the specific session file each QA needed (conv_0_session_2 for the camping date, conv_0_session_7 for the conference, conv_0_session_10 for the ally question), and does not regress to "0 sessions:" or "1 sessions:" headers.

The suite also exercises the pure builder and the `readVirtualPathContents` fallback against the same 272-row fixture so the regression is caught at the unit, integration, and entry-point boundaries. Tests run hermetically by stubbing the disk-backed session cache so they do not read or write ~/.deeplake/query-cache/.

Verified by temporarily reverting the fix on virtual-table-query.ts: all eight assertions fail without the fix (0 sessions: header, missing session paths), then pass cleanly once the fix is restored.

Claude Code hooks replace the tool input with whatever `updatedInput` they emit. The pre-tool-use hook was always emitting `{command, description}` (the Bash-tool shape) even when the incoming tool was Read. The Read implementation then read `updatedInput.file_path`, found `undefined`, and crashed with:

"The 'path' property must be of type string, got undefined"

Claude wasted a turn (or more) recovering by re-issuing the read as a Bash `cat`. In the plugin-v8-optimizations-100 run (memory table populated, 272 summaries), 60 / 100 transcripts contained this error. In the sessions-only baseline_cloud run it was even worse because the recovery path hit fix #1's `/index.md` bug on top.

The fix teaches the hook to materialize Read intercepts into a real file on disk and return the path:
- Add an optional `file_path` field to ClaudePreToolDecision. When present, main() emits `updatedInput: {file_path}` instead of the Bash-shaped `{command, description}`.
- Add `writeReadCacheFile(sessionId, virtualPath, content)` which writes into `~/.deeplake/query-cache/<sessionId>/read/<virtualPath>`, mirroring the per-session cache the index already uses. Cleanup reuses the existing session-end path.
- Add `buildReadDecision(file_path, description)` so the call site is explicit about the Read-tool shape.
- Branch in the direct-read code path: when `input.tool_name === "Read"`, write the fetched content via `writeReadCacheFile` and return `buildReadDecision(...)`. Bash cat / head / tail / wc keep their existing `echo <content>` shape.
- Thread `writeReadCacheFileFn` through the existing deps so tests can stub it and stay hermetic.

Test updates:
- `hooks-source.test.ts > reuses cached /index.md content ...` now asserts `directDecision?.file_path` instead of `.command` for the Read variant, with a stubbed cache writer that captures the written content.
- `hooks-source.test.ts > uses direct grep, direct reads, listings ...` updated the Read assertion the same way.
- `pre-tool-use-baseline-cloud-3qa.test.ts` Read cases now assert that the decision carries `file_path` (bug #2 guard) while the Bash cases confirm `command` still exists (bash shape preserved).

Verified: stashing the fix causes all three Read-tool per-QA tests to fail; restoring the fix makes them pass.

End-to-end verified against locomo_benchmark/baseline (272 sessions, memory dropped) on a 5-QA subset spanning conv 0 questions 6 / 25 / 29 / 46 / 62, five QAs that baseline-local answered correctly and the original baseline_cloud run got wrong. Post-fix run: 5 / 5 correct, 0 occurrences of "property must be of type string" across the five transcripts. (Haiku happened to pick Bash over Read for each QA in this run, so the Read intercept didn't fire in-flight; the unit tests and the earlier fix1b transcript where Read was attempted cover that path.)
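The shape switch described above can be sketched in a few lines. `ClaudePreToolDecision` and `buildReadDecision` are named in the commit message, but their exact fields are assumptions here, and `toUpdatedInput` is a hypothetical stand-in for the branch inside `main()`.

```typescript
// Assumed decision shape: Bash intercepts carry `command`, Read intercepts carry `file_path`.
type ClaudePreToolDecision = {
  command?: string;
  description: string;
  file_path?: string;
};

export function buildReadDecision(file_path: string, description: string): ClaudePreToolDecision {
  return { file_path, description };
}

// Hypothetical helper: pick the updatedInput shape that matches the intercepted tool.
export function toUpdatedInput(decision: ClaudePreToolDecision): Record<string, string> {
  if (decision.file_path !== undefined) {
    return { file_path: decision.file_path }; // Read-tool shape
  }
  return { command: decision.command ?? "", description: decision.description }; // Bash-tool shape
}
```

Keeping the two shapes behind one function makes it harder for a Read intercept to fall through to the Bash-shaped payload again.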

…ons/* Read

Extends the integration test suite for fix #1 and fix #2 with two more QAs, qa_3 (Caroline's research) and qa_29 (Melanie's pottery workshop), bringing the REAL_QAS pool to five. qa_3 specifically maps to the Read calls that fired in the `baseline_cloud_9qa_read_candidates_fix2` benchmark run (three Read calls, all against memory paths), so its inclusion anchors the test suite against live behavior observed on the sessions-only `locomo_benchmark/baseline` workspace.

Adds a dedicated test for the other Read-tool regression surface: a Read against a /sessions/<file>.json path (not only /index.md). The same benchmark run showed haiku calling `Read /home/.deeplake/memory/sessions/conv_0_session_{1,2}.json` directly; the new test feeds that exact shape through `processPreToolUse`, asserts the decision carries `file_path` (not `command`), and verifies the session JSON body is materialized to the read cache at the expected virtual path.

Renames the test file from `pre-tool-use-baseline-cloud-3qa.test.ts` to `pre-tool-use-baseline-cloud.test.ts` now that it covers more than three QAs.

Verification: 13 / 13 tests pass; temporarily stashing the fix #2 source change makes the new per-QA Read assertions and the /sessions Read assertion all fail (decision.file_path is undefined), and restoring the source brings them back to green.

Claude Code's Bash tool merges the child process's stderr into the tool_result string the model sees. When a user or CI had HIVEMIND_TRACE_SQL=1 or HIVEMIND_DEBUG=1 exported, every SQL query issued by the shell bundle during `node shell-bundle -c "..."` wrote a `[deeplake-sql] query start:` line to stderr, and all of it landed in Claude's view of the command output, drowning out the real data.

Confirmed on the original baseline_cloud-100 run: 35+ trace lines across the transcripts, interleaved with the bash command results Claude was trying to parse. In several QAs the SQL noise replaced the useful output entirely (exit code 1 + trace lines → Claude concluded "no matches").

Two-part fix:
1. Move the TRACE_SQL / DEBUG_FILE_LOG env checks out of the top-level module constants in `src/deeplake-api.ts` and into the `traceSql` function body. The check now evaluates per-call, so callers that import the SDK can still flip the env vars at runtime. (Previously the constants were frozen at module load, so any downstream delete had no effect.)
2. In `src/shell/deeplake-shell.ts`, detect one-shot mode (`-c` in argv) up front and `delete process.env[...]` the four trace variables before doing anything else. Interactive REPL mode keeps the env untouched so developers still get `[deeplake-sql]` lines when they set the vars intentionally.

Test coverage in `claude-code/tests/shell-bundle-sql-trace-silence.test.ts`:
- Spawns the built `claude-code/bundle/shell/deeplake-shell.js` with fake creds and HIVEMIND_TRACE_SQL / DEEPLAKE_TRACE_SQL / HIVEMIND_DEBUG / DEEPLAKE_DEBUG all set to "1", pointed at an unreachable API URL with a 200ms query timeout. After the SQL query fails (expected), asserts stderr is free of `[deeplake-sql]` lines.
- A source-level check confirms `traceSql` reads the env vars inside the function body (runtime) rather than via a frozen top-level `const TRACE_SQL`.

Regression verified: stashing both source changes causes the bundle test to fail with the expected `[deeplake-sql] query fail:` line in stderr and the source-level test to report the reintroduced top-level const; restoring the source brings both green.

End-to-end verified against `locomo_benchmark/baseline` on a 6-QA subset (conv 0 QAs 3 / 11 / 27 / 32 / 59 / 65). Before fix: 2–4 SQL trace lines leaked into each QA's tool_result stream. After fix: zero leaks across all six transcripts. qa_3 and qa_11 (already correct with fix #1 + fix #2) stay correct; the hard QAs (27, 32, 59, 65) continue to show judge-score variance under Haiku non-determinism but are no longer looking at SQL noise as their "retrieval result".
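The two parts of the fix can be sketched together. The variable names are the four listed in the test description; treating the value "1" as "enabled", the `write` parameter, and the `silenceTraceForOneShot` name are assumptions for illustration, not the code in deeplake-api.ts or deeplake-shell.ts.

```typescript
export const TRACE_ENV_VARS = [
  "HIVEMIND_TRACE_SQL",
  "DEEPLAKE_TRACE_SQL",
  "HIVEMIND_DEBUG",
  "DEEPLAKE_DEBUG",
] as const;

// Part 1: read the environment on every call instead of freezing it in a module-level const.
export function isSqlTraceEnabled(env = process.env): boolean {
  return TRACE_ENV_VARS.some((name) => env[name] === "1");
}

export function traceSql(
  msg: string,
  write: (line: string) => void = (line) => {
    process.stderr.write(line + "\n");
  },
): void {
  if (!isSqlTraceEnabled()) return;
  write(`[deeplake-sql] ${msg}`);
}

// Part 2: in one-shot mode (-c in argv), drop the trace variables before any query runs.
export function silenceTraceForOneShot(argv: string[], env = process.env): void {
  if (!argv.includes("-c")) return; // interactive REPL keeps tracing available
  for (const name of TRACE_ENV_VARS) delete env[name];
}
```

Part 2 only works because of part 1: a flag captured at module load would keep tracing on even after the variables are deleted.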

`sqlLike(value)` escapes `_` and `%` in the value by prefixing them with backslashes so callers can interpolate user-controlled strings inside `LIKE 'pattern'` literals. But the Deeplake SQL backend does not treat backslash as the LIKE escape character by default: without an explicit `ESCAPE '\'` clause, `\_` becomes two literal characters in the pattern instead of a literal `_`, so queries whose paths contain underscores silently return nothing.

Empirically reproduced on the `locomo_benchmark/baseline` workspace:

grep -l Caroline /home/.deeplake/memory/sessions/*.json
→ returns 20+ session paths (works: path has no underscores past the final slash, sqlLike produces '/sessions/%.json')

grep -i hike /home/.deeplake/memory/sessions/conv_0_session_*.json
→ returns (no matches) before this fix, because the SQL becomes path LIKE '/sessions/conv\_0\_session\_%.json' and Deeplake matches `\_` literally against `_` → zero rows
→ returns real matches after this fix (ESCAPE '\' added, `\_` is now interpreted as literal `_`, matches the underscored paths)

Same symptom in the 100-QA post-fix baseline_cloud run: 15 / 100 QA that local baseline answered correctly came back wrong/partial in the cloud, and the tool-call transcripts show repeated `(no matches)` on grep commands whose glob mentions `conv_<c>_session_*.json`.

The fix appends ` ESCAPE '\'` to every `LIKE '...'` clause that is fed from `sqlLike()`:
- src/shell/grep-core.ts:buildPathCondition: both the wildcard path branch and the directory-prefix branch.
- src/hooks/virtual-table-query.ts:buildDirFilter: per-dir `path LIKE '<dir>/%'` clauses used by listVirtualPathRowsForDirs.
- src/hooks/virtual-table-query.ts:findVirtualPaths: both the memoryTable and sessionsTable branches, on both the path and the filename LIKE clauses.

Codex/Claude Code find fallbacks and `bash-command-compiler`'s `find_grep` path ultimately call `findVirtualPaths`, so they inherit the fix without a local change.

Rebuild updates the 8 Claude Code and 8 Codex bundles. Verified via a targeted reproducer that drives `processPreToolUse` with the same glob commands against the real baseline workspace: all three underscored-glob greps return real matches after the fix, where previously they returned `(no matches)`.
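The SQL shape before and after the fix can be reproduced with a small sketch. This `sqlLike` is a hypothetical re-implementation: the commit message only states that `_` and `%` are backslash-prefixed, so the extra handling of backslashes and single quotes is an assumption, as is the `pathLikeClause` helper that turns a shell glob into a LIKE clause.

```typescript
// Hypothetical re-implementation of sqlLike: escape LIKE metacharacters with a backslash.
export function sqlLike(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // assumption: escape existing backslashes first
    .replace(/[%_]/g, "\\$&") // prefix % and _ with a backslash
    .replace(/'/g, "''"); // assumption: double single quotes for the SQL literal
}

// Hypothetical helper: '*' in the glob becomes the SQL wildcard '%', everything else is literal.
export function pathLikeClause(globPath: string): string {
  const pattern = globPath.split("*").map(sqlLike).join("%");
  // Without the ESCAPE clause the backend would look for a literal backslash before each underscore.
  return `path LIKE '${pattern}' ESCAPE '\\'`;
}

export const exampleClause = pathLikeClause("/sessions/conv_0_session_*.json");
```

Here `exampleClause` carries the same pattern quoted in the commit message, followed by the `ESCAPE '\'` clause that makes `\_` match a plain underscore.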

…review truncation

Claude Code's Bash tool silently persists any tool_result larger than ~16 KB to disk and replaces it with a 2 KB preview plus a path to the persisted file. The model almost never recovers from that replacement: in the locomo `baseline_cloud_100qa_fix123` run (100 QA, all fixes #1 / #2 / #3 applied), 11 / 14 losing QAs that hit the persist path never read the persisted file even once, and finished on the truncated 2 KB preview, which was rarely enough to carry the answer.

Typical triggers from that run:
- `grep -r Caroline /home/.deeplake/memory/` → 66 KB of dialogue lines because the name appears in nearly every session.
- `for f in /.../sessions/conv_0_session_*.json; do grep ...; done` → 926 KB of concatenated grep output (slow-path shell bundle).
- `cat /.../sessions/conv_0_session_*.json` (glob over many files) → tens of KB of JSON.

This fix introduces `src/utils/output-cap.ts` with `capOutputForClaude(output, {kind})` and applies it on the plugin's exit paths before Claude Code sees the result:
- `grep-direct.ts:handleGrepDirect`: caps grep's combined output.
- `bash-command-compiler.ts:executeCompiledBashCommand`: caps the final concatenation of compiled segments (cat / ls / find / grep / find_grep, incl. `&&` and `;` pipelines).
- `pre-tool-use.ts` direct read path: caps `cat` / `head` / `tail` Bash intercepts. Read-tool intercepts are unaffected: they write content to disk and return a `file_path`, so no size pressure from Claude Code's preview truncation applies.
- `pre-tool-use.ts` direct `ls` and `find` fallbacks: capped too.

Cap is 8 KB (CLAUDE_OUTPUT_CAP_BYTES), comfortably under Claude Code's ~16 KB persist threshold and 4× the 2 KB preview the model used to get. When the cap fires, the output is truncated at a line boundary and the tail gets a short footer:

...
[grep truncated: 313 more lines (58.4 KB) elided — refine with '| head -N' or a tighter pattern]

The footer names the operation (grep / cat / ls / find / bash) and gives the model an actionable next step.

Unit tests in `claude-code/tests/output-cap.test.ts` (8 tests):
- No-op for inputs that fit the cap, including empty strings.
- Byte size after cap is ≤ CLAUDE_OUTPUT_CAP_BYTES.
- Truncation aligns to line boundaries; footer line counts add up to the original total.
- Single oversized line (no newline) is byte-sliced with a footer.
- Custom `maxBytes` is honoured (no silent 1 KB floor).
- Default footer kind is "output" when no kind is passed.
- A realistic 400-line grep fixture that exceeds 16 KB gets capped above 4 KB and under the cap, strictly more useful than the 2 KB preview.

Bundle rebuild propagates the change to the 8 Claude Code and 8 Codex bundles.

Verified empirically via `processPreToolUse` against the real `locomo_benchmark/baseline` workspace:

grep -r Caroline /home/.deeplake/memory/
  before fix #5: ~66 KB of output, Claude Code truncated to 2 KB.
  after fix #5: ~7.9 KB (313 lines kept, 313 more elided, footer).

grep -r 'Caroline|Melanie' /home/.deeplake/memory/
  before: ~70 KB.
  after: ~7.9 KB with footer reporting 391 lines elided.

cat /home/.deeplake/memory/sessions/conv_0_session_1.json
  ~2 KB: unchanged, well under the cap.

Expected impact on the 100-QA baseline_cloud benchmark: 11 QAs that lost points purely because of the 2 KB preview now see up to 8 KB of the same grep output. Combined with fix #4 (19 QAs with (no matches) from SQL LIKE under-escaping), the plugin should close the remaining ~7.5 pt gap to the local-files baseline (75.0 %) and likely match or exceed it.
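A simplified sketch of a line-aligned output cap along the lines described above. It is not the code in src/utils/output-cap.ts: the `FOOTER_RESERVE_BYTES` budget is a hypothetical constant, the footer wording is shortened, and the single-oversized-line case (byte slicing) is left out, so the size guarantee only holds when `maxBytes` is comfortably larger than the footer.

```typescript
export const CLAUDE_OUTPUT_CAP_BYTES = 8 * 1024;
// Hypothetical budget kept free for the footer so the capped result stays under the limit.
const FOOTER_RESERVE_BYTES = 160;

export function capOutputForClaude(
  output: string,
  opts: { kind?: string; maxBytes?: number } = {},
): string {
  const maxBytes = opts.maxBytes ?? CLAUDE_OUTPUT_CAP_BYTES;
  const kind = opts.kind ?? "output";
  const totalBytes = Buffer.byteLength(output, "utf8");
  if (totalBytes <= maxBytes) return output; // fits: no-op

  const lines = output.split("\n");
  // A trailing newline yields an empty last entry; it is a terminator, not a content line.
  if (lines[lines.length - 1] === "") lines.pop();

  const budget = Math.max(0, maxBytes - FOOTER_RESERVE_BYTES);
  const kept: string[] = [];
  let used = 0;
  for (const line of lines) {
    const lineBytes = Buffer.byteLength(line, "utf8") + 1; // +1 for the newline
    if (used + lineBytes > budget) break;
    kept.push(line);
    used += lineBytes;
  }

  const elidedLines = lines.length - kept.length;
  const elidedKb = ((totalBytes - used) / 1024).toFixed(1);
  const footer =
    `[${kind} truncated: ${elidedLines} more lines (${elidedKb} KB) elided; ` +
    `refine with '| head -N' or a tighter pattern]`;
  return kept.join("\n") + (kept.length > 0 ? "\n" : "") + footer;
}
```

Dropping the empty trailing entry before counting keeps the kept and elided line counts in the footer adding up to the real number of content lines.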

Append per-file thresholds in vitest.config.ts for the two source files that materially changed in this PR, holding them at the same 90 / 90 / 90 / 90 bar already applied to the grep-dual-table files from PR #60:
- src/utils/output-cap.ts: new file, fix #5. Currently at 100 / 100 / 100 / 100 under the tests in claude-code/tests/output-cap.test.ts.
- src/hooks/virtual-table-query.ts: rewritten for fix #1 (dual-table index generation) and fix #4 (ESCAPE '\' on LIKE clauses). Currently at 98.9 / 93.2 / 95.8 / 98.9 under claude-code/tests/virtual-table-query.test.ts and claude-code/tests/pre-tool-use-baseline-cloud.test.ts.

Files left without new thresholds because their changes in this PR are small and localized:
- src/hooks/pre-tool-use.ts: added a Read-intercept branch and a writeReadCacheFile helper; the broader file is covered by hooks-source.test.ts which is pre-failing on this branch (unrelated to the fixes in this PR).
- src/deeplake-api.ts: moved TRACE_SQL from a module-level const into the traceSql function body (fix #3).
- src/shell/deeplake-shell.ts: three env-var deletes in the one-shot entry (fix #3).

…sessions

# Conflicts:
# claude-code/bundle/capture.js
# claude-code/bundle/session-end.js
# claude-code/bundle/session-start-setup.js
# claude-code/bundle/session-start.js
# codex/bundle/capture.js
# codex/bundle/session-start-setup.js
# codex/bundle/session-start.js
# codex/bundle/stop.js
# src/hooks/capture.ts
# src/hooks/codex/capture.ts
# src/hooks/codex/session-start-setup.ts
# src/hooks/codex/session-start.ts
# src/hooks/codex/stop.ts
# src/hooks/session-end.ts
# src/hooks/session-start-setup.ts
# src/hooks/session-start.ts

…ix #4

Fix #4 (`3d15454`) appended `ESCAPE '\'` to every LIKE clause fed by `sqlLike()` so backslash-escaped `_` / `%` match their literal characters on the Deeplake backend. The existing buildPathFilter glob test still asserted the pre-fix SQL. Update the literal string and the regex so the assertion matches the new SQL shape, and annotate the case with a comment explaining why the ESCAPE clause is required.

The `pull_request.branches:` filter matches on the base branch of a PR. With `[main, dev]` the CI workflow (typecheck + jscpd duplication check + coverage report) silently skipped any PR targeting a long-lived feature branch like `optimizations`. Only "PR Checks" and "Claude PR Review" ran on those PRs, so the coverage and dup report comments never showed up. Dropping the filter runs CI on every PR; the push side stays limited to main/dev so we don't double-run on personal branch pushes.

The merge of `origin/main` pulled in the canonical source refactors for the Codex hooks (session-start / session-start-setup / stop) but the corresponding tests on Davit's `optimizations` branch were written against an intermediate refactor state where helpers like `runCodexSessionStartSetup`, `extractLastAssistantMessage`, `buildCodexStopEntry`, `runCodexStopHook`, and the matching `claude-code/tests/hooks-source.test.ts` imports never made it into the exported surface. CI was failing with 39 `TypeError: X is not a function` errors.

Two broken test files are deleted (they never existed on `origin/main` and their coverage is already provided by the canonical suites added by PR #62, which landed on `main` and came in with this merge):
- `claude-code/tests/hooks-source.test.ts` (894 LOC, 19 / 30 failing)
- `codex/tests/codex-source-hooks.test.ts` (1126 LOC, 20 / 28 failing)

The canonical replacements from `main` cover the same ground:
- `claude-code/tests/capture-hook.test.ts`
- `claude-code/tests/session-start-hook.test.ts`
- `claude-code/tests/session-start-setup-hook.test.ts`
- `claude-code/tests/session-end-hook.test.ts`
- `claude-code/tests/codex-capture-hook.test.ts`
- `claude-code/tests/codex-session-start-hook.test.ts`
- `claude-code/tests/codex-session-start-setup-hook.test.ts`
- `claude-code/tests/codex-stop-hook.test.ts`
- `claude-code/tests/codex-wiki-worker.test.ts`

Two test files also merged in with Davit-branch test blocks that asserted stale session-start prompt wording. Restored to main's version:
- `claude-code/tests/session-start.test.ts`: dropped the "steers recall tasks toward index-first exact file reads" block; main's session-start prompt uses different phrasing.
- `codex/tests/codex-integration.test.ts`: restored main's assertions ("Do NOT jump straight to JSONL" instead of "Do NOT jump straight to raw session files").

Verified: `npx vitest run` reports 837 / 837 tests passing across 39 files. Per-file coverage thresholds unaffected (output-cap.ts 100%, virtual-table-query.ts 98.9% lines, grep-core.ts / grep-direct.ts / grep-interceptor.ts / session-queue.ts all above their bars).

…ine count

Three issues flagged by the automated review on PR #63:

1. `writeReadCacheFile` (src/hooks/pre-tool-use.ts) had no containment guard: `path.join(cacheRoot, session, "read", rel)` resolves `..` segments in `rel`, so a DB-controlled `virtualPath` could escape the per-session cache dir. Added a check that `absPath` stays under `expectedRoot = join(cacheRoot, session, "read")` and throws `"writeReadCacheFile: path escapes cache root: <abs>"` otherwise. Uses `path.sep` so the boundary check is correct on any platform.
2. The inline `/index.md` fallback in `processPreToolUse` (pre-tool-use.ts:334-347) was unreachable after fix #1 landed, and if somehow reached would regenerate the old broken single-table index (queries only `memory`, uses the header "${n} sessions:", omits `## Sessions`). Removed; the dual-table builder in `virtual-table-query.ts` now owns index generation exclusively.
3. `src/utils/output-cap.ts` had a dead `cut += lineBytes` accumulator (would trigger `noUnusedLocals` under strict TS config) and a trailing-newline off-by-one: `output.split("\n")` on `"a\nb\n"` returns `["a", "b", ""]`, so `totalLines` over-counted by 1 whenever the input ended with a newline, which grep and cat both do. The footer reported one extra "elided line" that was the empty terminator, not a real content line. Dropped the dead accumulator and adjusted totalLines to subtract the trailing empty entry.

Test coverage:

- `claude-code/tests/pre-tool-use-baseline-cloud.test.ts`: 4 new cases on `writeReadCacheFile`: happy path, `../../../etc/passwd` traversal refused (and no file lands anywhere under cacheRoot), absolute-root escape refused, and a path that normalizes back inside the cache (`/sessions/foo/../bar.json`) is still accepted. Plus one integration test that pins the removal of the inline /index.md fallback: `processPreToolUse` must materialize the dual-table builder's content and must NOT issue its own `FROM "memory" WHERE path LIKE '/summaries/%'` SELECT.
- `claude-code/tests/output-cap.test.ts`: 2 new cases on the line counting: with a trailing newline the kept-lines + elided-lines sum matches the original line count exactly (no off-by-one), and without a trailing newline the count is still exact.

Full suite: 844 / 844 tests passing.
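For reference, a minimal sketch of the containment check from item 1 and the line count from item 3. The helper names `resolveReadCachePath` and `countLines` are hypothetical, and the real `writeReadCacheFile` and `output-cap.ts` may be structured differently; only the `join(cacheRoot, session, "read", ...)` layout, the `path.sep` boundary, and the error message come from the commit message.

```typescript
import { join, sep } from "node:path";

// Hypothetical helper: resolve a DB-supplied virtual path inside the
// per-session read cache and refuse anything that lands outside it.
function resolveReadCachePath(cacheRoot: string, session: string, virtualPath: string): string {
  const expectedRoot = join(cacheRoot, session, "read");
  // join() normalizes ".." segments, so the result can escape expectedRoot.
  const absPath = join(expectedRoot, virtualPath);
  // Compare against root + separator so a sibling such as "read-evil"
  // is not accepted just because it shares a string prefix.
  if (!absPath.startsWith(expectedRoot + sep)) {
    throw new Error(`writeReadCacheFile: path escapes cache root: ${absPath}`);
  }
  return absPath;
}

// Hypothetical helper: count content lines without counting the empty
// entry that split("\n") produces after a trailing newline.
function countLines(output: string): number {
  if (output === "") return 0;
  const parts = output.split("\n");
  return output.endsWith("\n") ? parts.length - 1 : parts.length;
}
```

A path such as `/sessions/foo/../bar.json` normalizes back inside the cache and is accepted, while `../../../etc/passwd` throws.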

…ed row

The jscpd duplication check used to run as a step inside the "Typecheck and Test" job, so the PR checks table only showed a single aggregate row for both. Reviewers couldn't tell at a glance whether duplication passed without opening the combined log.

Move jscpd into its own `duplication` job named "Duplication check". Small installation cost (extra `npm install`, runs in parallel with the test job) in exchange for clear attribution on the PR checks table. Artifact upload and the jscpd config stay the same.

PR #63 bot review flagged several source files as under-covered. Added a dedicated branch-coverage suite for the pre-tool-use hook and registered the two now-sufficient files in `vitest.config.ts` so their thresholds are enforced on every run.

`claude-code/tests/pre-tool-use-branches.test.ts`, 46 test cases:

- Pure helpers: buildAllowDecision, buildReadDecision, rewritePaths, touchesMemory, isSafe (positive + negative paths).
- getShellCommand: Grep hit + miss, Read on file + directory, Bash safe + unsafe + non-memory, Glob hit + miss, unknown tool → null.
- extractGrepParams: Grep output_mode=count, empty path → "/", Bash delegating to parseBashGrep, non-grep Bash → null, unknown tool → null.
- processPreToolUse end-to-end:
  - returns null for non-memory Bash
  - returns `[RETRY REQUIRED]` guidance for unsupported commands
  - falls back to the shell bundle when no config is loaded
  - Glob + Bash `ls` + Bash `ls -la` long format
  - ls with both file-level (-rw-) and directory (drwx) entries; also empty-name rows skipped by the `if (!name) continue` guard
  - cat / head / tail / wc -l / cat | head pipeline
  - find / find | wc -l
  - Grep tool delegates to handleGrepDirect; null result falls through to the read/ls branch instead of short-circuiting
  - direct query throws → shell bundle fallback
- Index cache short-circuit: three cases covering the inline readVirtualPathContentsWithCache callback that the bash compiler passes into executeCompiledBashCommand: cache hit, cache miss (writes fresh index), empty cachePaths edge case.

Coverage after this suite (measured on pre-tool-use-branches + pre-tool-use-baseline-cloud):

    src/hooks/pre-tool-use.ts       lines 98.9  branches 90.0  funcs 93.8  stmts 98.6
    src/hooks/memory-path-utils.ts  lines 100   branches 90.9  funcs 100   stmts 100

Both now registered under `coverage.thresholds` at 90 / 90 / 90 / 90 in `vitest.config.ts`, alongside the five existing PR-tracked files.

Full suite: 890 / 890 passing (was 844 before this commit).

… paths

CI (HOME=/home/runner) reported two failures on the just-added branch coverage suite:

    AssertionError: expected '/home/emanuele/.deeplake/memory/...' to be '/sessions/a.json'

The `rewritePaths` and `touchesMemory` assertions hardcoded my local home path. The real MEMORY_PATH in production is join(homedir(), ".deeplake", "memory"), so hardcoded absolute paths in tests don't survive anywhere except my workstation: not CI, not another developer's machine.

Import `homedir` + `join` from node:os / node:path and build MEM_ABS once at the top of the file. The two affected cases now use template strings so the values match whatever home the test runner is using. The other tests in the suite already use ~-prefixed literals, matched by the TILDE_PATH branch independently of homedir.

Verified: `env -i HOME=/home/runner PATH=$PATH npx vitest run`: 46 / 46 pass.
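A rough sketch of the home-independent setup described above. `MEM_ABS` and the `join(homedir(), ".deeplake", "memory")` derivation come from the commit message; `toVirtualPath` is a hypothetical stand-in for the hook's real `rewritePaths`, whose exact behavior is not shown here.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Same derivation the commit message gives for the production MEMORY_PATH.
const MEM_ABS = join(homedir(), ".deeplake", "memory");

// Hypothetical stand-in for the hook's path rewriting: strip the memory
// root so an absolute path becomes a virtual one.
function toVirtualPath(p: string): string {
  return p.startsWith(MEM_ABS) ? p.slice(MEM_ABS.length) || "/" : p;
}

// Build the test input from MEM_ABS instead of a literal "/home/<user>/..." string,
// so the assertion holds for whatever HOME the runner uses.
const input = `${MEM_ABS}/sessions/a.json`;
console.log(toVirtualPath(input)); // "/sessions/a.json"
```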

fix: close 48-pt LoCoMo accuracy gap with 5 plugin hook fixes

…-sessions" This reverts commit b590669, reversing changes made to 21aff84.

can you push the fix

da4ac52

claude Bot reviewed Apr 18, 2026


davidbuniat and others added 22 commits April 17, 2026 23:17

further fixes

31c573a

fixes

2123908

tests fixes

2633403

test fixes

fdb671d

update push

ecb5b5e

added tests

89ffa6a

intermediate push

45fea18

furhter optimizations

d8ed840

test improvements

304895e

improvements

1d49a56

query cache

67124fe

some improvements

e7c2bc7

fix

658dfc7

fixes

2a32494

coverage++

c8cf5c9

less code

d93bb14

75% on 10q

bf0d05a

improve results

2236ce4

fixes

cf98f70

fixes 403 errors

de7c953

last minor improvements

89c38c3

davidbuniat and others added 9 commits April 18, 2026 11:12

test improvements

21aff84

efenocchi mentioned this pull request Apr 20, 2026

fix: close 48-pt LoCoMo accuracy gap with 5 plugin hook fixes #63

Merged

7 tasks

efenocchi added 11 commits April 20, 2026 23:36

Merge pull request #63 from activeloopai/fix/index-md-include-sessions

b590669

fix: close 48-pt LoCoMo accuracy gap with 5 plugin hook fixes

Revert "Merge pull request #63 from activeloopai/fix/index-md-include…

033b6ed

…-sessions" This reverts commit b590669, reversing changes made to 21aff84.

efenocchi mentioned this pull request Apr 21, 2026

fix: close 48-pt LoCoMo accuracy gap with 5 plugin hook fixes (re-targeted to main) #64

Merged

7 tasks

kaghni mentioned this pull request Apr 22, 2026

Wire shared getLatestVersionCached into CC and Codex session-start hooks #70

Closed

3 tasks


perf(hooks): queue session writes, split session-start, cache version check #61
davidbuniat wants to merge 44 commits into main from optimizations

2 participants

-  const jsonForSql = row.message.replace(/'/g, "''");
+  const jsonForSql = row.message.replace(/\\/g, "\\\\").replace(/'/g, "''");
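A small sketch of what the change above does to a payload. The two `replace()` calls are the ones from the diff; the helper name `escapeForSqlLiteral` and the `INSERT` statement (table and column names) are assumptions for illustration only. On backends where backslash is an escape character inside string literals, an unescaped `\` directly before a quote could otherwise change where the literal ends.

```typescript
// Hypothetical helper name; the body mirrors the replace() chain in the diff.
function escapeForSqlLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // double every backslash
    .replace(/'/g, "''");   // double every single quote
}

// JSON.stringify already emits backslashes for escaped characters,
// so a JSON message column is a likely place for both to appear.
const message = JSON.stringify({ text: "it's a \"quoted\" value" });

// Illustrative statement only; the real table and column names may differ.
const sql = `INSERT INTO sessions (message) VALUES ('${escapeForSqlLiteral(message)}')`;
console.log(sql);
```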

-  const parse = (v: string) => v.split(".").map(Number);
+  const parse = (v: string) => v.replace(/-.*$/, "").split(".").map(Number);
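To see why the suffix matters: `Number("3-beta")` is `NaN`, and any comparison against `NaN` is false. Below is a sketch of how `isNewer` might use the patched `parse`; the comparison loop is an assumption, since only `parse` appears in the diff.

```typescript
// parse() is taken from the diff; it drops everything from the first "-".
const parse = (v: string): number[] => v.replace(/-.*$/, "").split(".").map(Number);

// Assumed shape of isNewer: compare numeric segments left to right.
function isNewer(latest: string, current: string): boolean {
  const a = parse(latest);
  const b = parse(current);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0; // treat missing segments as 0
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return false; // equal versions are not newer
}

console.log(isNewer("1.2.4-beta", "1.2.3")); // true
console.log(isNewer("1.2.3-beta", "1.2.3")); // false
```

Note that under this sketch a pre-release compares equal to its release (`1.2.3-beta` vs `1.2.3`), which is coarser than full semver ordering.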

-    }
+  if (!existsSync(inflightPath)) return;
+  if (existsSync(queuePath)) {
+    const inflight = readFileSync(inflightPath, "utf-8");
+    const queued = readFileSync(queuePath, "utf-8");
+    writeFileSync(queuePath, `${inflight}${queued}`);
+    rmSync(inflightPath, { force: true });
+  } else {
+    renameSync(inflightPath, queuePath);
+  }
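The recovery branch above can be exercised in isolation. In this sketch the function body is the code from the diff; the wrapper name `restoreInflight` and the file names are hypothetical. When a queue file already exists, the inflight lines are written ahead of the queued ones, so older events keep their position.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical wrapper; the body is the added code from the diff.
function restoreInflight(inflightPath: string, queuePath: string): void {
  if (!existsSync(inflightPath)) return;
  if (existsSync(queuePath)) {
    // New events arrived during the failed flush: prepend the inflight batch.
    const inflight = readFileSync(inflightPath, "utf-8");
    const queued = readFileSync(queuePath, "utf-8");
    writeFileSync(queuePath, `${inflight}${queued}`);
    rmSync(inflightPath, { force: true });
  } else {
    // Nothing new was queued: the inflight file simply becomes the queue again.
    renameSync(inflightPath, queuePath);
  }
}

// Demo in a throwaway directory (file names are assumptions).
const dir = mkdtempSync(join(tmpdir(), "queue-demo-"));
const queuePath = join(dir, "s1.jsonl");
const inflightPath = join(dir, "s1.inflight");
writeFileSync(inflightPath, '{"n":1}\n');
writeFileSync(queuePath, '{"n":2}\n');
restoreInflight(inflightPath, queuePath);
const merged = readFileSync(queuePath, "utf-8");
const inflightGone = !existsSync(inflightPath);
rmSync(dir, { recursive: true, force: true });
console.log(merged);
```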

-  const match = pathFilter.match(/path = '([^']+)'/);
+  const match = pathFilter.match(/path = '((?:[^']|'')+)'/);
	return match?.[1].replace(/''/g, "'") ?? "/";
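A quick check of the new pattern against a path containing an escaped quote. The regex and the unescape step are from the diff; the function name `exactPathFromFilter` and the sample filter strings are assumptions.

```typescript
// Hypothetical function name; the two statements are the ones in the diff.
function exactPathFromFilter(pathFilter: string): string {
  // Accept either a non-quote character or a doubled quote ('') inside the literal.
  const match = pathFilter.match(/path = '((?:[^']|'')+)'/);
  // Undo the SQL quote doubling; fall back to the root when no exact path is present.
  return match?.[1].replace(/''/g, "'") ?? "/";
}

console.log(exactPathFromFilter("path = '/notes/it''s.md'")); // "/notes/it's.md"
console.log(exactPathFromFilter("path LIKE '/notes/%'"));     // "/"
```

With the previous `[^']+` pattern the first call would have stopped at the doubled quote and returned `/notes/it`.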
