feat(plugins): introduce middleware token proxy plugin suite (reorder, dedup, lookup, skill index) by SNM-SNM · Pull Request #50 · EfficientContext/ContextPilot

SNM-SNM · 2026-06-11T19:00:13Z

Overview

This PR introduces a suite of middleware token proxy plugins that sit between agent frameworks (e.g., OpenClaw) and LLM engines (e.g., SGLang). They aim to improve KV cache sharing, reduce context-budget usage, and perform cache-aware routing.

These modules implement the core optimizations detailed in the MSc cache optimization project (Direction A).

Key Features

1. Static Context Optimization Plugins (WP1)

ContextReorderPlugin: Processes batches of OpenAI-formatted requests and groups requests with similar contexts so that RadixAttention on SGLang can share more of their prefixes.
ContextDedupPlugin: Conversation-aware history compressor. It uses the ConversationTracker to track session turns and automatically replaces redundant historical messages with lightweight reference hints (e.g., [Reference to Turn 1]).
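The two WP1 ideas can be illustrated with a minimal sketch. This is not the plugin code from this PR: the function names `reorder_by_shared_prefix` and `dedup_history`, the `min_len` threshold, and the use of exact-match detection and lexicographic sorting are all assumptions made for illustration. The actual plugins may use a different similarity measure and the ConversationTracker for turn bookkeeping.

```python
from typing import Dict, List

Message = Dict[str, str]


def reorder_by_shared_prefix(batch: List[List[Message]]) -> List[int]:
    """Return request indices ordered so that requests whose serialized
    messages share a leading prefix end up next to each other.

    Assumption: a plain lexicographic sort stands in for the plugin's
    real similarity grouping.
    """
    def key(i: int) -> str:
        return "\x00".join(m["role"] + ":" + m["content"] for m in batch[i])

    return sorted(range(len(batch)), key=key)


def dedup_history(messages: List[Message], min_len: int = 40) -> List[Message]:
    """Replace repeated long message bodies with a short reference hint.

    Assumption: "turn" here is simply the 1-based position of the message,
    and only exact duplicates of at least `min_len` characters are replaced.
    """
    first_seen: Dict[str, int] = {}
    out: List[Message] = []
    for turn, msg in enumerate(messages, start=1):
        content = msg["content"]
        if len(content) >= min_len and content in first_seen:
            hint = f"[Reference to Turn {first_seen[content]}]"
            out.append({**msg, "content": hint})
        else:
            first_seen.setdefault(content, turn)
            out.append(dict(msg))  # copy so the caller's history is untouched
    return out
```

Sorting places strings with a common prefix in one contiguous run, which is the property a prefix cache benefits from. The dedup function returns a new list, so the original history stays available if a reference later needs to be expanded.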

2. Dynamic Routing & Tool Filtering Plugins (WP2)

KVCacheLookupPlugin: Subscribes to SGLang worker event streams over ZeroMQ (ZMQ) to build a real-time, in-memory Shadow Radix Tree that mirrors the workers' GPU KV cache, then routes each incoming request to the worker with the longest prefix match.
SkillAwareContextPlugin: Dynamically filters and injects tool schemas into the request's tools array based on the _required_skills list, trimming unused tool definitions to save context budget.
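The routing and filtering logic described above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `ShadowPrefixTree` is a hypothetical uncompressed token trie (a real radix tree would compress edges), the ZMQ event subscription is omitted and replaced by direct `insert` calls, and `filter_tools` assumes that `_required_skills` is a top-level request field holding tool names in the OpenAI `tools` format.

```python
from typing import Dict, Optional, Sequence, Set, Tuple


class ShadowPrefixTree:
    """Token trie recording which workers are believed to cache each prefix."""

    def __init__(self) -> None:
        self.children: Dict[int, "ShadowPrefixTree"] = {}
        self.workers: Set[str] = set()

    def insert(self, tokens: Sequence[int], worker: str) -> None:
        """Record that `worker` holds KV cache entries for `tokens`."""
        node = self
        for tok in tokens:
            node = node.children.setdefault(tok, ShadowPrefixTree())
            node.workers.add(worker)

    def route(self, tokens: Sequence[int]) -> Tuple[Optional[str], int]:
        """Return (worker, matched_length) for the longest cached prefix."""
        node, best, depth = self, None, 0
        for i, tok in enumerate(tokens, start=1):
            node = node.children.get(tok)
            if node is None or not node.workers:
                break
            # min() gives a deterministic tie-break between equal matches.
            best, depth = min(node.workers), i
        return best, depth


def filter_tools(request: dict) -> dict:
    """Keep only the tool schemas whose function name is a required skill."""
    required = set(request.get("_required_skills", []))
    kept = [
        tool
        for tool in request.get("tools", [])
        if tool.get("function", {}).get("name") in required
    ]
    # Assumption: the private field is dropped before forwarding the request.
    out = {k: v for k, v in request.items() if k != "_required_skills"}
    out["tools"] = kept
    return out
```

Because `insert` tags every node along the path, the worker set at a deeper node is always a subset of the set at its parent, so the last non-empty node reached during `route` identifies the longest match. A request with no cached prefix returns `(None, 0)`, leaving the fallback policy to the caller.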

3. Core Engine Optimizations & Fixes

Multiprocessing Bypass: compute_distance_cpu.py now bypasses multiprocessing.Pool when num_workers == 1, eliminating pool initialization and IPC serialization overhead.
Windows Terminal Compatibility: Replaced the Unicode checkmark character ✓ with an ASCII + in all logging/printing calls to prevent UnicodeEncodeError crashes on non-UTF-8 consoles.
Dependencies: Added pytest-asyncio (for the async tests) and pyzmq and msgspec (for the KV lookup plugin) to the dependencies.
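The multiprocessing bypass follows a common pattern, sketched below. The helper name `run_tasks` and its signature are hypothetical and are not taken from compute_distance_cpu.py; only the `num_workers == 1` condition comes from the description above.

```python
from multiprocessing import Pool
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_tasks(fn: Callable[[T], R], items: Sequence[T], num_workers: int = 1) -> List[R]:
    """Apply `fn` to every item, using a process pool only when it can help."""
    if num_workers == 1:
        # Serial path: no worker processes are started and nothing is pickled.
        return [fn(item) for item in items]
    with Pool(processes=num_workers) as pool:
        return pool.map(fn, items)
```

On the serial path the function and its arguments never cross a process boundary, so even unpicklable callables such as lambdas work, and the per-call cost of starting and tearing down a pool disappears.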

Verification & Testing

Mock Proxy Test: Added an end-to-end pipeline mock in evaluation/core_merge/mock_proxy.py that exercises all four plugins together and collects their telemetry.
Unit Tests: Added tests/test_kv_lookup.py and tests/test_skill_index.py.
CI status: All 184 CPU tests passed.

SNM-SNM added 11 commits March 25, 2026 16:54

feat(DirectionA): implement ContextReorder and ContextDedup plugins with telemetry

62a6d2d

feat(evaluation): integrate evaluation suite including profiling, sandbox, and ZMQ prototypes

fcc06d0

ci: run tests using python -m pytest to resolve import issues

7cfefc2

ci: add pytest-asyncio to dev dependencies for async tests

7f21140

Add ShadowRadixTree KVCacheLookup

84ed725

ci: add pyzmq and msgspec to requirements.txt for kv_lookup plugin

25d743d

docs: update INTEGRATION_GUIDE.md to document KVCacheLookupPlugin

07fb05f

feat(plugins): add SkillAwareContextPlugin for dynamic tool schema injection and update docs

4027fe0

perf(reorder): bypass multiprocessing pool when num_workers is 1 for lower overhead

e86fe59

feat(evaluation): add core merge mock proxy pipeline and fix Windows console encoding crashes

fbc1422

style: apply black formatting to new plugins and tests

3beeceb

SNM-SNM requested review from Chivier and SecretSettler June 11, 2026 19:00

feat(evaluation): add run_elm_eval.py script for UoE ELM gateway tests

f2c1090

SNM-SNM wants to merge 12 commits into main from feature/msc-cache-optimization
