feat: research-informed compaction improvements#35

Merged
BYK merged 1 commit into main from feat/compaction-improvements
Mar 9, 2026
Conversation


BYK commented Mar 9, 2026

Three improvements to the distillation and compression pipeline, informed by insights from the KV cache compression literature (Zweiger et al. 2025, Eyuboglu et al. 2025).

Changes

1. Loss-annotated tool stripping (gradient.ts)

When tool outputs are stripped at higher gradient layers, the replacement now includes metadata — tool name, line count, error presence, and referenced file paths — instead of a static [output omitted] placeholder. This helps the model decide whether to invoke recall for full details.

Before: [output omitted — use recall for details]
After: [output omitted — read: 142 lines, contained errors, paths: src/db.ts, src/gradient.ts — use recall for details]

Inspired by the per-token scalar bias β from "Fast KV Compaction via Attention Matching" (Zweiger et al., 2025) — when tokens are removed, preserving metadata about what was lost helps compensate for information loss.
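The placeholder above can be sketched as a small formatter. This is a hypothetical reconstruction, not the actual gradient.ts code: the `StrippedToolMeta` shape and `buildStrippedPlaceholder` name are assumptions, but the output format matches the before/after example in this PR.

```typescript
// Hypothetical sketch of the loss-annotated placeholder builder.
// Field and function names are assumed; only the output format is from the PR.
interface StrippedToolMeta {
  toolName: string;
  lineCount: number;
  hadErrors: boolean;
  paths: string[];
}

function buildStrippedPlaceholder(meta: StrippedToolMeta): string {
  // Always report which tool ran and how much output was dropped.
  const parts = [`${meta.toolName}: ${meta.lineCount} lines`];
  // Error presence is a strong signal that recall may be worth invoking.
  if (meta.hadErrors) parts.push("contained errors");
  // Referenced file paths let the model connect the lost output to later edits.
  if (meta.paths.length > 0) parts.push(`paths: ${meta.paths.join(", ")}`);
  return `[output omitted — ${parts.join(", ")} — use recall for details]`;
}
```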

2. Context-distillation meta-distillation objective (prompt.ts)

The RECURSIVE_SYSTEM reflector prompt now produces a structured working context with sections for Current State, Key Decisions, Technical Changes, and Session Timeline, rather than a flat re-organized event log. This context-distillation objective generalizes better to diverse downstream queries than the prior summarization approach.

Inspired by the Self-Study method from "Cartridges" (Eyuboglu et al., 2025) — the key finding that memorization objectives (produce an event log) don't generalize, while context-distillation objectives (produce context optimized for varied query types) do.
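The structured output the reflector is asked for might look like the following sketch. The four section names come from this PR; the renderer, data shape, and markdown heading style are illustrative assumptions, not the actual prompt.ts contents.

```typescript
// Section names from the PR description; everything else here is assumed.
const SECTIONS = [
  "Current State",
  "Key Decisions",
  "Technical Changes",
  "Session Timeline",
] as const;

type Section = (typeof SECTIONS)[number];

// Render the working context with one bulleted block per section, in a
// fixed order, so downstream queries always find the same structure.
function renderWorkingContext(content: Partial<Record<Section, string[]>>): string {
  return SECTIONS.map((section) => {
    const bullets = (content[section] ?? []).map((line) => `- ${line}`).join("\n");
    return `## ${section}\n${bullets}`;
  }).join("\n\n");
}
```

The fixed section order is the point: a flat event log answers "what happened" queries well but little else, while a stable structure serves state, decision, and timeline queries alike.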

3. Multi-resolution composable distillations (db.ts, distillation.ts, gradient.ts, temporal.ts)

Gen-0 distillations are now archived instead of deleted during meta-distillation:

  • Archived entries are excluded from the in-context prefix (no context budget impact)
  • Archived entries remain searchable via the recall tool (detail layer for specific queries)
  • Archived entries follow the same temporal pruning retention policy (120-day default)

This preserves a multi-resolution view: gen-1 "cartridge" for in-context overview, gen-0 detail for targeted recall.

Inspired by Cartridges composability — independently compressed representations can be concatenated and queried without retraining.
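The archive-instead-of-delete flow can be sketched with a plain in-memory store standing in for the real db.ts schema. Field names, `metaDistill`, `prefixRows`, and `recallRows` are all assumptions for illustration; the behavior (archived rows leave the prefix but stay searchable) is from the PR.

```typescript
// Minimal in-memory sketch; names and shapes are assumed, not from db.ts.
interface Distillation {
  id: number;
  generation: number; // 0 = raw distillation, 1 = meta-distillation ("cartridge")
  archived: boolean;
  text: string;
}

// Meta-distillation: flag gen-0 inputs as archived (not deleted) and
// append the gen-1 result.
function metaDistill(store: Distillation[], gen1Text: string): void {
  for (const d of store) {
    if (d.generation === 0 && !d.archived) d.archived = true;
  }
  const nextId = store.reduce((m, d) => Math.max(m, d.id), 0) + 1;
  store.push({ id: nextId, generation: 1, archived: false, text: gen1Text });
}

// The in-context prefix sees only non-archived rows (no budget impact)...
const prefixRows = (store: Distillation[]) => store.filter((d) => !d.archived);

// ...while recall searches every row, archived or not (the detail layer).
const recallRows = (store: Distillation[], query: string) =>
  store.filter((d) => d.text.includes(query));
```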

Schema migration

Version 5: adds archived INTEGER NOT NULL DEFAULT 0 column to the distillations table with an index. Backward compatible — existing rows default to archived = 0.
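The migration reduces to two SQL statements. The column definition matches the PR description; the index name and the statement-array shape are assumptions, since the actual db.ts migration runner isn't shown here.

```typescript
// Sketch of the v5 migration; index name and array shape are assumed.
const SCHEMA_VERSION = 5;

const MIGRATION_V5: string[] = [
  // Existing rows pick up the default, which is what keeps this backward
  // compatible: everything already stored behaves as archived = 0.
  "ALTER TABLE distillations ADD COLUMN archived INTEGER NOT NULL DEFAULT 0",
  // Index the flag so prefix assembly can filter archived rows cheaply.
  "CREATE INDEX IF NOT EXISTS idx_distillations_archived ON distillations(archived)",
];
```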

Test results

All 187 tests pass (only change: schema version assertion updated from 4 to 5).

References:
- Fast KV Compaction via Attention Matching (Zweiger et al., 2025)
  https://arxiv.org/abs/2602.16284
- Cartridges: Compact Representations of Context for LLMs (Eyuboglu et al., 2025)
  https://arxiv.org/abs/2501.17390
BYK enabled auto-merge (squash) March 9, 2026 10:40
BYK merged commit bdd6d59 into main Mar 9, 2026
1 check passed
BYK deleted the feat/compaction-improvements branch March 9, 2026 10:40
craft-deployer bot mentioned this pull request Mar 9, 2026