feat: research-informed compaction improvements #35
Merged
References:

- Fast KV Compaction via Attention Matching (Zweiger et al., 2025): https://arxiv.org/abs/2602.16284
- Cartridges: Compact Representations of Context for LLMs (Eyuboglu et al., 2025): https://arxiv.org/abs/2501.17390
Three improvements to the distillation and compression pipeline, informed by insights from the KV cache compression literature (Zweiger et al. 2025, Eyuboglu et al. 2025).
Changes
1. Loss-annotated tool stripping (`gradient.ts`)

When tool outputs are stripped at higher gradient layers, the replacement now includes metadata (tool name, line count, error presence, and referenced file paths) instead of a static `[output omitted]` placeholder. This helps the model decide whether to invoke recall for full details.

Before:

```
[output omitted — use recall for details]
```

After:

```
[output omitted — read: 142 lines, contained errors, paths: src/db.ts, src/gradient.ts — use recall for details]
```

Inspired by the per-token scalar bias β from "Fast KV Compaction via Attention Matching" (Zweiger et al., 2025): when tokens are removed, preserving metadata about what was lost helps compensate for the information loss.
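As a sketch of how such an annotated placeholder could be assembled (the type and function names here are hypothetical; the PR's actual `gradient.ts` code is not shown):

```typescript
// Hypothetical shape of the metadata captured before a tool output is stripped.
interface StrippedToolMeta {
  toolName: string;
  lineCount: number;
  hadErrors: boolean;
  filePaths: string[];
}

// Build a metadata-bearing placeholder so the model can judge whether a
// follow-up recall call is worth making, instead of seeing a static notice.
function buildPlaceholder(meta: StrippedToolMeta): string {
  const parts = [
    `${meta.toolName}: ${meta.lineCount} lines`,
    meta.hadErrors ? "contained errors" : "no errors",
  ];
  if (meta.filePaths.length > 0) {
    parts.push(`paths: ${meta.filePaths.join(", ")}`);
  }
  return `[output omitted — ${parts.join(", ")} — use recall for details]`;
}
```

With the values from the example above, `buildPlaceholder({ toolName: "read", lineCount: 142, hadErrors: true, filePaths: ["src/db.ts", "src/gradient.ts"] })` reproduces the "After" string.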
2. Context-distillation meta-distillation objective (`prompt.ts`)

The `RECURSIVE_SYSTEM` reflector prompt now produces a structured working context with sections for Current State, Key Decisions, Technical Changes, and Session Timeline, rather than a flat re-organized event log. This context-distillation objective generalizes better to diverse downstream queries than the prior summarization approach.

Inspired by the Self-Study method from "Cartridges" (Eyuboglu et al., 2025): the key finding is that memorization objectives ("produce an event log") don't generalize, while context-distillation objectives (produce context optimized for varied query types) do.
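A minimal sketch of the structured-context instruction (the actual `RECURSIVE_SYSTEM` text in `prompt.ts` is not reproduced here; the builder function is hypothetical):

```typescript
// The four fixed sections the reflector is asked to fill in, per the PR.
const SECTIONS = [
  "Current State",
  "Key Decisions",
  "Technical Changes",
  "Session Timeline",
] as const;

// Hypothetical prompt builder: request a working context organized under the
// fixed headings rather than a flat re-organized event log.
function buildReflectorPrompt(): string {
  return [
    "Distill the conversation into a working context, not an event log.",
    "Organize it under exactly these headings:",
    ...SECTIONS.map((s) => `## ${s}`),
    "Optimize for answering diverse follow-up queries, not for replaying every event.",
  ].join("\n");
}
```

The design point is that the output schema is fixed up front, so every meta-distillation pass produces the same section structure regardless of session content.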
3. Multi-resolution composable distillations (`db.ts`, `distillation.ts`, `gradient.ts`, `temporal.ts`)

Gen-0 distillations are now archived instead of deleted during meta-distillation:

- Archived entries are excluded from the in-context prefix.
- They remain searchable via the `recall` tool (detail layer for specific queries).
- They follow the same temporal pruning retention policy.

This preserves a multi-resolution view: the gen-1 "cartridge" for in-context overview, gen-0 detail for targeted recall.

Inspired by Cartridges composability: independently compressed representations can be concatenated and queried without retraining.
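The archive-instead-of-delete flow can be sketched as follows, using an in-memory stand-in for the SQLite table (the row shape and function names are hypothetical, not the PR's actual `db.ts`/`distillation.ts` API):

```typescript
// Hypothetical row shape; `archived` corresponds to the migration-v5 column.
interface Distillation {
  id: number;
  generation: number; // 0 = raw distillation, 1 = meta-distillation
  text: string;
  archived: boolean;
}

// After producing a gen-1 summary, mark its gen-0 sources archived instead of
// deleting them, so recall can still zoom in on the detail layer.
function metaDistill(rows: Distillation[], summaryText: string): Distillation[] {
  const archivedSources = rows.map((r) =>
    r.generation === 0 ? { ...r, archived: true } : r
  );
  const nextId = Math.max(0, ...rows.map((r) => r.id)) + 1;
  return [
    ...archivedSources,
    { id: nextId, generation: 1, text: summaryText, archived: false },
  ];
}

// The in-context prefix uses only unarchived rows (the compact overview);
// recall searches everything, including the archived gen-0 layer.
const forPrefix = (rows: Distillation[]) => rows.filter((r) => !r.archived);
const forRecall = (rows: Distillation[], query: string) =>
  rows.filter((r) => r.text.includes(query));
```

This is the multi-resolution property in miniature: the prefix shrinks to the gen-1 summary, while recall still reaches the gen-0 detail beneath it.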
Schema migration

Version 5 adds an `archived INTEGER NOT NULL DEFAULT 0` column to the `distillations` table, with an index. Backward compatible: existing rows default to `archived = 0`.
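A sketch of what the v5 migration step could look like (the migration-runner shape is hypothetical; only the column definition comes from the PR):

```typescript
// Migration v5: add the `archived` flag and an index for filtering.
// SQLite allows ADD COLUMN with NOT NULL when a default is supplied,
// so existing rows transparently get archived = 0.
const MIGRATION_V5 = `
  ALTER TABLE distillations ADD COLUMN archived INTEGER NOT NULL DEFAULT 0;
  CREATE INDEX IF NOT EXISTS idx_distillations_archived ON distillations (archived);
`;

// Hypothetical runner: apply the migration once and report the new version.
function migrateToV5(exec: (sql: string) => void, currentVersion: number): number {
  if (currentVersion >= 5) return currentVersion; // idempotent guard
  const statements = MIGRATION_V5.split(";")
    .map((s) => s.trim())
    .filter(Boolean);
  for (const stmt of statements) {
    exec(stmt);
  }
  return 5;
}
```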
Test results

All 187 tests pass (the only change is the schema version assertion, updated from 4 to 5).