poc: WASM/wazero tree-sitter backend (speed + stability vs cgo PR #80) by dvcdsys · Pull Request #81 · dvcdsys/code-index

dvcdsys · 2026-06-07T22:50:05Z

Draft / PoC for comparison — not for merge. Alternative to the cgo backend in #80, to decide direction.

Official tree-sitter C runtime + TypeScript grammar → standalone wasm32-wasi module (zig cc), driven from Go via wazero. No cgo, no JS, no third-party parser — only the wazero host (poc/wasm-treesitter/wasmts.go) is ours.

Speed — same 852-file vscode TS corpus, full-tree walk

| backend | wall | files/s | ERROR trees | `editorOptions.ts` |
| --- | --- | --- | --- | --- |
| gotreesitter (pure-Go) | 13.83s | 62 | 13 | 8.77s → ERROR |
| WASM (wazero) | ~2.5s | ~330 | 0 | 49ms |
| cgo (native, #80) | 1.26s | 675 | 0 | 17ms |

WASM is ~2× slower than cgo and ~5× faster than gotreesitter, with correct output (0 ERROR trees). The overhead is the per-node host↔guest call boundary, which a batched subtree export could mitigate.
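The batched export is not part of this PoC. As a rough sketch of the idea, assume a hypothetical guest export that writes a pre-order walk of a subtree into linear memory as fixed-size little-endian records. The host would then copy that buffer out once and decode it locally, instead of crossing the boundary several times per node. The record layout and all names below are assumptions for illustration, not the format used in `wasmts.go`.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// nodeRecord is a hypothetical 16-byte wire format that a batched guest
// export could emit for each node, in pre-order.
type nodeRecord struct {
	Symbol     uint16 // grammar symbol id
	Flags      uint16 // assumed: bit 0 = named, bit 1 = error
	StartByte  uint32
	EndByte    uint32
	ChildCount uint32
}

const recordSize = 16

// decodeSubtree decodes one contiguous buffer (read from guest memory in a
// single copy) into records without any further host<->guest calls.
func decodeSubtree(buf []byte) ([]nodeRecord, error) {
	if len(buf)%recordSize != 0 {
		return nil, errors.New("truncated subtree buffer")
	}
	out := make([]nodeRecord, 0, len(buf)/recordSize)
	for off := 0; off < len(buf); off += recordSize {
		r := buf[off : off+recordSize]
		out = append(out, nodeRecord{
			Symbol:     binary.LittleEndian.Uint16(r[0:2]),
			Flags:      binary.LittleEndian.Uint16(r[2:4]),
			StartByte:  binary.LittleEndian.Uint32(r[4:8]),
			EndByte:    binary.LittleEndian.Uint32(r[8:12]),
			ChildCount: binary.LittleEndian.Uint32(r[12:16]),
		})
	}
	return out, nil
}
```

With a layout like this, the number of boundary crossings per file would depend on how many subtrees are requested rather than on the node count.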

Stability

tree-sitter is robust on adversarial input under both backends. WASM additionally contains guest faults: a resource exhaustion or trap surfaces as a recoverable Go error and the host stays alive, where under cgo the same fault would SIGSEGV the whole process. This is insurance against unknown C bugs, not a fix for an observed crash.
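To make the containment property concrete, here is a minimal sketch of what it looks like from the host side. wazero reports a guest trap as an error returned from the exported function call; in the sketch that call is replaced by a plain function value so the example stays stdlib-only. `parseFn`, `errGuestTrap` and `parseAll` are hypothetical names, not identifiers from the PoC.

```go
package main

import (
	"errors"
	"fmt"
)

// errGuestTrap is a hypothetical stand-in for the error a wasm runtime
// returns when the guest traps (out-of-bounds access, unreachable, ...).
var errGuestTrap = errors.New("guest trap")

// parseFn stands in for one call into the sandboxed parser.
type parseFn func(src []byte) (nodeCount int, err error)

// parseAll parses every input and keeps going after a guest fault: the fault
// is recorded as an ordinary error and the host process stays alive.
func parseAll(inputs [][]byte, parse parseFn) (ok int, failures []error) {
	for i, src := range inputs {
		if _, err := parse(src); err != nil {
			failures = append(failures, fmt.Errorf("input %d: %w", i, err))
			continue
		}
		ok++
	}
	return ok, failures
}
```

Under cgo there is no equivalent error path for a memory fault in the C code: the signal takes down the process before any Go error handling runs.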

Decision framing

~2× parse cost (largely invisible end-to-end — embeddings dominate) in exchange for CGO_ENABLED=0 builds, crash-isolation, and a likely smaller binary. Cost: engineering effort to build/bundle all 31 grammars + flesh out the node API. Full write-up in poc/wasm-treesitter/README.md.

🤖 Generated with Claude Code

Alternative to feat/chunker-cgo-treesitter: the official tree-sitter C runtime + TypeScript grammar compiled to a standalone wasm32-wasi reactor module (build.sh, via zig cc) and driven from Go through wazero. No cgo, no JS, no third-party parser. Only the wazero host (wasmts.go) is bespoke; the parser is unmodified upstream C. wasm_store.c is gated by TREE_SITTER_FEATURE_WASM (we don't define it), so the stock amalgamation compiles to wasi with no stubs.

Measured on the same 852-file vscode TypeScript corpus (full-tree walk):

backend                   wall    files/s  ERROR trees  editorOptions.ts
gotreesitter (pure-Go)    13.83s  62       13           8.77s -> ERROR
WASM (wazero, pure-Go)    ~2.5s   ~330     0            49ms
cgo (native)              1.26s   675      0            17ms

- WASM ~2x slower than cgo, ~5x faster than gotreesitter, correct (0 errors).
- Overhead is the per-node host<->guest call boundary (~3 calls/node x 2.68M nodes), not memory: slot-pooling barely moved it. A batched "serialize subtree" export would close most of the gap (future work).
- Stability: tree-sitter is robust on adversarial input under both backends; WASM additionally CONTAINS faults (resource/guest trap -> recoverable Go error, host alive) where cgo would SIGSEGV the whole process. Insurance vs unknown C bugs, not a fix for an observed crash.

Trade-off vs cgo: ~2x parse cost (largely invisible end-to-end since embeddings dominate) in exchange for CGO_ENABLED=0 builds, crash-isolation, and a likely smaller binary; cost is the engineering effort to build/bundle all 31 grammars and flesh out the node API. README.md has the full comparison.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
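The figures in this commit message allow a back-of-the-envelope estimate of the boundary cost. The sketch below assumes, as a simplification, that the entire wall-time gap between the WASM and cgo runs is call overhead, so the result is an upper bound and not a measurement. The function name is hypothetical.

```go
package main

// perCallOverheadNs estimates the average cost of one host<->guest call in
// nanoseconds, assuming the whole wall-time gap to cgo is boundary overhead.
func perCallOverheadNs(wasmSec, cgoSec, nodes, callsPerNode float64) float64 {
	calls := nodes * callsPerNode
	return (wasmSec - cgoSec) / calls * 1e9
}
```

Calling `perCallOverheadNs(2.5, 1.26, 2.68e6, 3)` spreads a gap of about 1.24s over about 8 million calls, which works out to roughly 150ns per call. That is consistent with the claim that the call count, not memory traffic, dominates.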

…d skip, doc-comment attachment

Replace gotreesitter with the official tree-sitter C runtime + 31 grammars compiled to one wasm32-wasi module (ts-core.wasm.br, brotli ~3MB) driven via wazero. No cgo: traps are contained (parse falls back to sliding window, the process survives), and the binary stays CGO_ENABLED=0.

Memory design (measured on the prod-shaped churn workload):
- linear memory is mmap-backed (experimental.WithMemoryAllocator) instead of wazero's default Go-heap append-grow: no realloc-copy garbage on growth and munmap-on-close returns recycled instances' memory to the OS immediately. Churn heapSys 1135→391MB, peak RSS 1070→535MB; full-repo chunking peak RSS 1516→787MB.
- engine pool: hard concurrency cap (dashboard-tunable), 256MiB per-instance linear-memory ceiling (2× headroom over the worst measured instance at the indexer's 512KiB file cap), high-water-mark recycling, 1 idle instance.

Chunker quality fixes:
- minified/bundled js/ts/css (.min., .bundle.js, >2KiB lines) skip the parser and go straight to sliding window. This is the pathological input class that ballooned instances for near-zero semantic value.
- a declaration's doc comment now attaches to its chunk (language-agnostic via tree-sitter's extra flag + same-row wrapper climb; verified for Go, TS, C, Python, Rust, Java). Generated files stop spraying comment-only micro chunks: openapi.gen.go 893→517 chunks, median 114→256B, symbols/refs byte-identical.

Memory-stress harnesses are committed but gated behind CIX_MEMSTRESS=1.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
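The minified/bundled skip described in this commit can be sketched as a small pre-parse check. The commit only names the signals (`.min.`, `.bundle.js`, lines over 2KiB); the exact extension list, the case folding and the function name below are assumptions, not the rule as implemented in the chunker.

```go
package main

import (
	"bytes"
	"path/filepath"
	"strings"
)

// maxLineBytes mirrors the ">2KiB lines" signal from the commit message.
const maxLineBytes = 2 * 1024

// looksMinified reports whether a js/ts/css file should bypass the parser
// and go straight to sliding-window chunking. The extension set is assumed.
func looksMinified(path string, src []byte) bool {
	name := strings.ToLower(filepath.Base(path))
	switch filepath.Ext(name) {
	case ".js", ".ts", ".css":
	default:
		return false
	}
	if strings.Contains(name, ".min.") || strings.HasSuffix(name, ".bundle.js") {
		return true
	}
	// Scan line by line; a single over-long line is enough to skip.
	for len(src) > 0 {
		line := src
		if i := bytes.IndexByte(src, '\n'); i >= 0 {
			line, src = src[:i], src[i+1:]
		} else {
			src = nil
		}
		if len(line) > maxLineBytes {
			return true
		}
	}
	return false
}
```

A check like this runs before an engine instance is acquired, so pathological inputs never touch the pool.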

…am (OOM fix)

Two new runtime-config fields, end to end (DB migrations 16/17 → runtimecfg → admin API → openapi → dashboard):
- chunk_max_concurrent: the wasm chunker's instance-concurrency cap, decoupled from embedding concurrency; resizes the live limiter without a restart. Env: CIX_CHUNK_MAX_CONCURRENT; per-instance memory knobs stay env-only (CIX_CHUNK_MEM_LIMIT_PAGES, CIX_CHUNK_RECYCLE_GROWTH_MB, CIX_CHUNK_MAX_IDLE).
- llama_cache_ram_mib: llama-server's HOST prompt cache cap (--cache-ram). Upstream defaults this to 8 GiB (ggml-org/llama.cpp#16391), which is pure waste for an embeddings-only sidecar: prompts are never reused, but the cache fills anyway. Observed on prod: llama-server RSS 365MB→11.3GB within minutes of indexing vscode@main, then cgroup OOM kill, twice at the 10G limit and again at 16G. With --cache-ram 0 (our default; -1 = unlimited) it plateaus at ~900MB under the same load. Env: CIX_LLAMA_CACHE_RAM; shown in the dashboard's Runtime parameters card, applied via Save & Restart.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
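A limiter that can be resized while in use, as described for chunk_max_concurrent, can be built from a mutex and a condition variable. The sketch below is one possible shape under that assumption; the type and method names are hypothetical and the commit does not say how the real limiter is implemented.

```go
package main

import "sync"

// Limiter caps concurrent holders and can be resized while in use.
type Limiter struct {
	mu    sync.Mutex
	cond  *sync.Cond
	limit int
	inUse int
}

func NewLimiter(limit int) *Limiter {
	l := &Limiter{limit: limit}
	l.cond = sync.NewCond(&l.mu)
	return l
}

// Acquire blocks until a slot is free under the current limit.
func (l *Limiter) Acquire() {
	l.mu.Lock()
	defer l.mu.Unlock()
	for l.inUse >= l.limit {
		l.cond.Wait()
	}
	l.inUse++
}

// TryAcquire takes a slot only if one is free right now.
func (l *Limiter) TryAcquire() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inUse >= l.limit {
		return false
	}
	l.inUse++
	return true
}

// Release frees a slot and wakes waiters.
func (l *Limiter) Release() {
	l.mu.Lock()
	l.inUse--
	l.mu.Unlock()
	l.cond.Broadcast()
}

// Resize changes the cap. Lowering it does not evict current holders; new
// acquisitions simply wait until usage drops below the new limit.
func (l *Limiter) Resize(limit int) {
	l.mu.Lock()
	l.limit = limit
	l.mu.Unlock()
	l.cond.Broadcast()
}
```

A buffered channel is the more common Go semaphore, but its capacity is fixed at creation, which is why a counter plus `sync.Cond` is used here.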

…vation

A full-reindex wipe ran as ONE transaction: DELETE of all refs/symbols/file_hashes plus the trigram-FTS rows. On a vscode-sized project (~445k refs, tens of thousands of FTS rows, and each FTS delete re-tokenizes its content) that held SQLite's single writer for minutes, starving every concurrent writer past busy_timeout. Prod symptom: the jobs worker logged `claim failed: SQLITE_BUSY` on every 5s poll tick for the whole wipe.

- BeginIndexing full wipe: file_hashes first (its own statement: once gone, every file looks dirty, so a crash mid-wipe just resumes on the next run), then symbols/refs in 20k-row batches, then chunks_fts/chunks_meta via the batched chunksfts.DeleteByProject (500 rows per tx, since FTS deletes are the expensive ones). The writer is released between batches.
- projects.Delete: same batched FTS wipe, project row deleted last so a failed wipe is resumable.
- jobs worker: SQLITE_BUSY on claim is expected contention, not a fault. Log the streak start as WARN with a once-a-minute heartbeat instead of an ERROR per tick, and log when it clears.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
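The batched wipe follows a common pattern: delete a bounded number of rows per statement and repeat until a batch comes back short, so the writer lock is released between statements. The sketch below shows that loop with `database/sql`; the table name, the `project_id` column and both function names are assumptions for illustration and are not taken from the repository.

```go
package main

import (
	"database/sql"
	"errors"
)

// deleteInBatches calls deleteBatch until it removes fewer rows than the
// batch size. Each call is expected to commit on its own, which releases the
// database writer between batches.
func deleteInBatches(batch int, deleteBatch func(limit int) (int64, error)) (int64, error) {
	if batch <= 0 {
		return 0, errors.New("batch size must be positive")
	}
	var total int64
	for {
		n, err := deleteBatch(batch)
		total += n
		if err != nil {
			return total, err
		}
		if n < int64(batch) {
			return total, nil
		}
	}
}

// refsBatch returns a deleteBatch function for a hypothetical refs table.
// Each Exec runs as its own implicit transaction.
func refsBatch(db *sql.DB, projectID string) func(limit int) (int64, error) {
	return func(limit int) (int64, error) {
		res, err := db.Exec(
			`DELETE FROM refs WHERE rowid IN
			   (SELECT rowid FROM refs WHERE project_id = ? LIMIT ?)`,
			projectID, limit)
		if err != nil {
			return 0, err
		}
		return res.RowsAffected()
	}
}
```

Usage would look like `deleteInBatches(20000, refsBatch(db, id))`. Because every batch commits independently, an interrupted wipe leaves a smaller table rather than a rolled-back one, which fits the resumability described above.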

dvcdsys and others added 4 commits June 7, 2026 23:49

dvcdsys wants to merge 4 commits into develop from feat/chunker-wasm-treesitter
