CFAdv

CFAdv is a context compiler for LLMs. It ingests files in a wide range of formats, scores and selects the most relevant content under a token budget, and assembles provider-ready packets for OpenAI, Anthropic, Ollama, and compatible APIs.

CFAdv builds on context-fusion and adds an attention fusion layer (AttnRes-inspired) that reorders the selected context by query relevance, so the most useful content appears first in the prompt.

Features

  • Multiformat ingestion: text, PDF, DOCX, CSV, JSON, images (OCR), code, Markdown
  • Normalization: uniform ContextBlock objects with token counts, trust and freshness scores
  • Task-specific compact representations: QA, code, agent, and universal variants
  • Utility and risk scoring: relevance, trust, freshness, structure, diversity, hallucination proxy
  • Multi-objective planner: value density + token + latency + cacheability ranking
  • Attention-based fusion: query-dependent softmax weighting inspired by AttnRes (arXiv:2603.15031)
  • Two-level block attention: intra-block ranking + cross-block mean-pooled ordering
  • Canonical IR and delta fusion: ContextPacket, CacheSegment, incremental ContextDelta
  • Dedup and fingerprinting: exact + near-duplicate collapse with provenance retention
  • Multi-provider adapters: OpenAI, Anthropic, Ollama, and OpenAI-compatible APIs
  • Provider-aware compilation: chat, qa, code, agent packers with mode-aware system prompts
  • Cache-aware assembly: stable/dynamic segment split for reuse across repeated turns
  • MCP server: expose CFAdv tools and resources over MCP
  • Framework integrations: retriever wrappers for LangChain and LlamaIndex
  • Precompute pipeline: fingerprints, summaries, token stats, compact variants, features
  • Compression pipeline: JSON minify, schema prune, citation compaction
  • Ablation studies: identify which context blocks contribute most to outcomes
  • Memory management: persistent storage with compaction and retention policies
  • Web UI: local browser app to run and inspect pipeline outputs

Quick Start

Installation

pip install context-portfolio-optimizer

For development:

git clone https://github.com/rotsl/CFAdv.git
cd CFAdv
make install-dev

Copy .env.example to .env and fill in your API keys:

cp .env.example .env

Basic Usage

from context_portfolio_optimizer import PipelineRunner

runner = PipelineRunner()
result = runner.run(["document.pdf", "code.py"], budget=3000)

print(result["context"])  # Optimized, attention-ranked context string
print(result["stats"])    # Processing statistics

CLI Usage

# Ingest and display content
cpo ingest ./data

# Run full optimization pipeline
cpo run ./data --budget 3000 --query "Summarize architecture" --output context.txt \
  --provider openai --model gpt-5-mini --mode chat --profile openai_chat

# Plan context for a task
cpo plan "Summarize these documents" --budget 5000

# Compile provider-ready packet (qa/code/agent/chat modes)
cpo compile ./data \
  --task "Answer with citations" \
  --provider openai \
  --model gpt-5-mini \
  --mode qa \
  --budget 4000 \
  --compression light \
  --delta

# Precompute artifacts for latency reduction
cpo precompute ./data --store-dir .cpo_cache/precompute --semantic-dedup

# Run MCP-style server
cpo serve-mcp --host <host> --port 8765

# Inspect cache + precompute store
cpo inspect-cache

# Run ablation study
cpo ablate ./data --budget 3000

# Launch local visualization UI
cpo ui --host <host> --port 8080

Provider/client mapping:

  • ChatGPT / OpenAI API: --provider openai with OPENAI_API_KEY
  • Claude AI / Claude API: --provider anthropic with ANTHROPIC_API_KEY
  • Local models with Ollama: --provider ollama (no cloud key required)
  • OpenAI-compatible APIs (Grok, DeepSeek, etc.): --provider openai_compatible with OPENAI_COMPAT_BASE_URL
  • MCP clients: cpo serve-mcp --host <host> --port <port>
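
A minimal .env sketch covering the providers above (key names are taken from this mapping; values are placeholders):

# .env — set only the keys for the providers you use
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# For OpenAI-compatible endpoints (Grok, DeepSeek, etc.)
OPENAI_COMPAT_BASE_URL=https://api.example.com/v1
# Ollama runs locally; no key required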

What Most Users Need

  1. README.md for setup and commands
  2. .env for provider API keys
  3. configs/ for provider and budget config overrides (optional)
  4. examples/gui_input/ for quick GUI test inputs
  5. CLI commands: run, compile, ui, serve-mcp

Usage Flowcharts

Normal User Path (Chat + Agent)

flowchart TD
    A[Install CFAdv] --> B[Add API keys in .env]
    B --> C{Pick workflow}
    C --> D[Chat workflow]
    C --> E[Agent workflow]
    D --> D1[Choose model: gpt-5-mini or claude-sonnet-4-6 or local ollama]
    E --> E1[Choose agentic model: claude-sonnet-4-6 or gpt-5-mini or tool-using model]
    D1 --> F[Run cpo compile or cpo run]
    E1 --> F
    F --> G[Provider adapter builds request]
    G --> H[Model response + citations + context stats]

Developer Path (Build + Evaluate)

flowchart TD
    A[Prepare corpus] --> B[Run cpo precompute]
    B --> C[Run benchmarks and tests]
    C --> D{Serve path}
    D --> E[CLI and app integration]
    D --> F[Web UI]
    E --> G{Runtime mode}
    F --> G
    G --> H[Chat or QA packer]
    G --> I[Agent packer + delta fusion]
    H --> J[Provider adapter]
    I --> J
    J --> K[OpenAI or Anthropic or Ollama or compatible]
    K --> L[Track token, latency, and cache metrics]

Architecture

CFAdv uses a middleware pipeline:

Ingest → Normalize → Canonical IR → Precompute → Dedup/Fingerprint
→ Query Classify → Candidate Retrieval → Fast Rerank → Budget Planner
→ Context Compression → Attention Fusion → Delta Fusion → Provider Adapter → Cache-Aware Assemble

  1. Ingest: Extract content from multiple file formats
  2. Normalize: Convert to uniform ContextBlock objects
  3. Represent: Generate alternative compact representations per block
  4. Precompute: Persist compact variants, token stats, retrieval features, and fingerprints
  5. Retrieve: Query classify → top-100 lexical retrieval → top-20/25 rerank
  6. Plan: Multi-objective latency-aware representation selection under token budget
  7. Fuse: Query-dependent attention ranking (AttnRes-inspired) + ContextDelta for agent turns
  8. Assemble: Build cache segments and canonical ContextPacket
  9. Compile: Build provider-specific request-ready payloads
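
The stages map directly onto the package layout under src/context_portfolio_optimizer/. A toy sketch of the data flow, with stub stages standing in for the real components (everything here, including notes.txt, is illustrative; the shipped pipeline is driven by PipelineRunner):

def ingest(paths):
    return [open(p, encoding="utf-8").read() for p in paths]

def normalize(texts):
    # Stand-in for ContextBlock construction: id, text, rough token count
    return [{"id": i, "text": t, "tokens": len(t.split())} for i, t in enumerate(texts)]

def plan(blocks, budget):
    # Stand-in for the budget planner: pack cheapest blocks first
    selected, used = [], 0
    for b in sorted(blocks, key=lambda b: b["tokens"]):
        if used + b["tokens"] <= budget:
            selected.append(b)
            used += b["tokens"]
    return selected

def assemble(blocks):
    return "\n\n".join(b["text"] for b in blocks)

context = assemble(plan(normalize(ingest(["notes.txt"])), budget=3000))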

See docs/architecture.md for full component detail and docs/attention_fusion.md for the attention fusion design.

Supported Formats

Format       Extensions                       Dependencies
Text         .txt, .log                       —
Documents    .pdf                             pdfminer.six
             .docx                            python-docx
Structured   .csv, .tsv                       pandas
             .json, .jsonl                    —
Images       .png, .jpg, .tiff                Pillow, pytesseract
Code         .py, .js, .ts, .go, .rs, etc.    tree-sitter (optional)
Markdown     .md                              —

Configuration

Copy configs/default.yaml or create your own config.yaml:

budget:
  instructions: 1000
  retrieval: 3000
  memory: 2000
  examples: 1500
  tool_trace: 1000
  output_reserve: 1000

scoring:
  utility_weights:
    retrieval: 0.25
    trust: 0.20
    freshness: 0.15
    structure: 0.15
    diversity: 0.15
    token_cost: -0.10

provider:
  name: anthropic
  model: claude-sonnet-4-6

features:
  use_attention_fusion: true
  attention_temperature: 1.0

Available providers: openai, anthropic, ollama, openai_compatible.
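
The utility_weights combine linearly: one plausible reading (an assumption, not the exact scoring code) is a dot product between a block's per-signal scores and the configured weights, with token_cost weighted negatively so heavier blocks are penalized:

weights = {"retrieval": 0.25, "trust": 0.20, "freshness": 0.15,
           "structure": 0.15, "diversity": 0.15, "token_cost": -0.10}

# Hypothetical per-signal scores for one block, each in [0, 1]
scores = {"retrieval": 0.9, "trust": 0.8, "freshness": 0.5,
          "structure": 0.6, "diversity": 0.4, "token_cost": 0.7}

utility = sum(weights[k] * scores[k] for k in weights)
print(round(utility, 2))  # 0.54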

Algorithm

CFAdv formulates context selection as a multi-objective knapsack problem:

maximize Σ(
    w_u * utility_i
  - w_r * risk_i
  - w_t * token_cost_i
  - w_l * latency_cost_i
  + w_c * cacheability_i
  + w_d * diversity_i
) * z_i

subject to:
    Σ(token_i * z_i) <= token_budget
    z_i ∈ {0, 1}

After selection, contexts are reordered by query-dependent attention weights (softmax over cosine similarity between the query embedding and each context embedding), so the most relevant content appears first. See docs/algorithm.md.
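
A minimal sketch of the selection step, using a greedy value-density heuristic as a stand-in for the full multi-objective solver (blocks and scores are made up; "value" folds together the weighted objective terms above):

def select(blocks, token_budget):
    # Greedy knapsack stand-in: rank by value per token, then pack in order
    ranked = sorted(blocks, key=lambda b: b["value"] / max(b["tokens"], 1), reverse=True)
    chosen, used = [], 0
    for b in ranked:
        if used + b["tokens"] <= token_budget:
            chosen.append(b)
            used += b["tokens"]
    return chosen

blocks = [
    {"id": "a", "value": 0.9, "tokens": 400},
    {"id": "b", "value": 0.5, "tokens": 120},
    {"id": "c", "value": 0.4, "tokens": 800},
]
print([b["id"] for b in select(blocks, token_budget=600)])  # ['b', 'a']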

Attention Fusion

CFAdv adds AttentionContextFusion and BlockAttentionFusion on top of the base planner, inspired by Block Attention Residuals (AttnRes, arXiv:2603.15031):

  • Each context is embedded with bow_embedding (64-dim, L2-normalized, vocabulary-aware)
  • Query-to-context cosine similarity scores are computed and passed through temperature-scaled softmax
  • Contexts are reordered by descending weight, so highest relevance appears first
  • BlockAttentionFusion applies the same hierarchy to named blocks (system / history / retrieval / tools), using mean-pooled embeddings as block representatives for cross-block ranking
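
A minimal sketch of the reordering described above, with plain numpy vectors standing in for the 64-dim bow_embedding output:

import numpy as np

def attention_order(query_vec, context_vecs, temperature=1.0):
    # Cosine similarity -> temperature-scaled softmax -> descending order
    q = query_vec / np.linalg.norm(query_vec)
    C = context_vecs / np.linalg.norm(context_vecs, axis=1, keepdims=True)
    logits = (C @ q) / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                 # softmax attention weights
    return np.argsort(-weights)              # context indices, most relevant first

rng = np.random.default_rng(0)
order = attention_order(rng.normal(size=64), rng.normal(size=(5, 64)))

# BlockAttentionFusion applies the same ranking across blocks, with each block
# represented by the mean of its member embeddings, e.g. members.mean(axis=0).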

See docs/attention_fusion.md for the full design and formulas.

Precompute Workflow

cpo precompute ./data --store-dir .cpo_cache/precompute --semantic-dedup

Stores fingerprints, summaries, compact variants, and retrieval features in .cpo_cache/precompute. Use --precomputed-only in run/compile to avoid regeneration on cache hits.
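
Exact-duplicate fingerprints of this kind can be as simple as a hash over normalized text. A minimal sketch (the hash choice and normalization here are assumptions; the shipped scheme lives in the dedup/ package):

import hashlib

def fingerprint(text: str) -> str:
    # Whitespace- and case-normalized content maps to a single key
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

assert fingerprint("Hello  world") == fingerprint("hello world\n")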

Chat Mode vs Agent Mode

Mode    Packing strategy
chat    Concise context for standard conversation prompts
qa      Extractive evidence + citation-first packing
code    Signatures, changed regions, dependency-focused packing
agent   Working-memory and constraint deltas with optional incremental fusion

Compression Pipeline

Compression levels (none, light, medium, aggressive) apply:

  • Citation map compaction (Source URI[id])
  • JSON minification
  • Schema field pruning for structured payloads
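
The JSON minification pass is the simplest of the three; a minimal sketch of that step alone:

import json

payload = {"title": "Report", "body": "...", "meta": {"pages": 12}}
minified = json.dumps(payload, separators=(",", ":"))
print(minified)  # {"title":"Report","body":"...","meta":{"pages":12}}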

Delta Fusion

Use --delta with run or compile to compute incremental packet changes across turns:

  • added blocks, updated blocks, removed blocks, unchanged block IDs
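
A minimal sketch of the delta computation, assuming packets expose blocks as {block_id: fingerprint} maps (the shipped ContextDelta lives in the fusion/ package):

def context_delta(prev, curr):
    # Diff two {block_id: fingerprint} maps into the four change sets
    added     = [b for b in curr if b not in prev]
    updated   = [b for b in curr if b in prev and curr[b] != prev[b]]
    removed   = [b for b in prev if b not in curr]
    unchanged = [b for b in curr if b in prev and curr[b] == prev[b]]
    return {"added": added, "updated": updated,
            "removed": removed, "unchanged": unchanged}

prev = {"sys": "f1", "doc1": "f2", "doc2": "f3"}
curr = {"sys": "f1", "doc1": "f9", "doc3": "f4"}
print(context_delta(prev, curr))
# {'added': ['doc3'], 'updated': ['doc1'], 'removed': ['doc2'], 'unchanged': ['sys']}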

Cache-Aware Assembly

Each packet splits into stable and dynamic segments:

  • stable: task/system instructions, citation maps, cacheable blocks
  • dynamic: non-cacheable or volatile blocks

Enables reuse across repeated chat/agent turns and lowers effective prompt churn.
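
A minimal sketch of the split, assuming each block carries a cacheable flag (the real segmentation lives in the caching/ and assembly/ packages):

def split_segments(blocks):
    # Stable segment is reusable across turns; dynamic is rebuilt per turn
    stable  = [b for b in blocks if b.get("cacheable")]
    dynamic = [b for b in blocks if not b.get("cacheable")]
    return stable, dynamic

blocks = [
    {"id": "system", "cacheable": True},
    {"id": "citations", "cacheable": True},
    {"id": "user_turn", "cacheable": False},
]
stable, dynamic = split_segments(blocks)
# stable -> system, citations; dynamic -> user_turn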

Examples

python examples/multiformat_ingestion_demo.py  # multi-format ingestion
python examples/rag_context_optimizer.py       # RAG-optimized context selection
python examples/memory_compaction_demo.py      # memory management
python examples/ablation_demo.py               # ablation studies

make examples  # run all four

See examples/EXAMPLE_RESULTS.md for latest run outputs.

Web UI

cpo ui --host <host> --port 8080
# or
make ui

Open http://<host>:8080 to:

  1. Choose Input Mode (Directory or File list) and enter a path (e.g. ./examples/gui_input)
  2. Set Task Mode (chat, qa, code, agent) and enter a query
  3. Set Budget (token budget)
  4. Pick Provider and Model (default: anthropic / claude-sonnet-4-6)
  5. Click Run Pipeline

The results panel shows run stats, representation usage, selected blocks, context preview, and model answer.

Improvements over context-fusion

CFAdv is built on context-fusion and adds:

Capability                                         context-fusion   CFAdv
Multiformat ingestion, normalization, scoring      ✓                ✓
Knapsack budget planner + BM25 retrieval           ✓                ✓
Compact representations, delta fusion, providers   ✓                ✓
Query-dependent context ordering                   —                ✓
Two-level block attention hierarchy                —                ✓
Vocabulary-aware 64-dim embeddings (L2-norm)       —                ✓
docs/attention_fusion.md                           —                ✓
Test count                                         ~49              72

For a detailed side-by-side, see docs/comparison.md.


Benchmarks

make benchmark          # tiny eval (baseline vs cf_uniform vs cf_attention)
make benchmark-weights  # same with attention weight detail
make benchmark-api      # live Anthropic API benchmark (requires .env)
make benchmark-all      # all local benchmarks

Latest tiny benchmark (2026-03-21, local deterministic):

Mode          Avg tokens   Success   vs baseline
baseline      99.0         100%      —
cf_uniform    3.7          100%      −96.3%
cf_attention  3.7          100%      −96.3%

Latest Claude API benchmark (2026-03-21, claude-sonnet-4-6):

Mode            Avg context tokens   Success
with_cfadv      10.3                 100%
without_cfadv   947.0                100%

Context-token reduction with CFAdv: 98.9%

Tiny benchmark — context tokens (lower is better)
With CFAdv    3.7   | █
Without CFAdv 99.0  | ████████████████████████

Claude API — context tokens (lower is better)
With CFAdv    10.3  | █
Without CFAdv 947.0 | ████████████████████████████████████████

See benchmarks/BENCHMARK_RESULTS.md, benchmarks/BENCHMARK_API_RESULTS.md, and benchmarks/BENCHMARK_SUPPLEMENTAL_RESULTS.md for full per-task detail.

Testing

make test           # run full suite
make test-cov       # with coverage report
make test-integration

Latest run (2026-03-21): 72 passed, 0 failed. See tests/TEST_RESULTS.md.

Coverage highlights: attention_fusion.py 83%, planner.py 95%, bm25.py 97%, registry.py 98%.

Validation Snapshot

Latest local smoke checks (2026-03-21):

  • pipeline: cpo run ./docs --budget 600 --query "Summarize key architecture points" — passed
  • GUI: cpo ui --host <host> --port 8081 — HTML served, /api/run responded with JSON

Development

make bootstrap      # first-time setup
make install-dev    # install package + dev tools + pre-commit hooks
make lint           # ruff check
make format         # ruff format
make type-check     # mypy
make all-checks     # format + lint + type-check + test
make build          # build sdist + wheel
make docs           # build MkDocs site

Project Structure

CFAdv/
├── README.md
├── CITATION.cff
├── CONTRIBUTING.md
├── SECURITY.md
├── pyproject.toml
├── Makefile
├── requirements.txt
├── requirements-dev.txt
├── .env.example
├── src/context_portfolio_optimizer/
│   ├── ingestion/          # File loaders (text, PDF, DOCX, CSV, JSON, image, code)
│   ├── normalization/      # ContextBlock building
│   ├── representations/    # Compact representation variants
│   ├── retrieval/          # BM25 + reranker + query classifier
│   ├── scoring/            # Utility and risk models
│   ├── allocation/         # Budget + knapsack + multi-objective planner
│   ├── dedup/              # Fingerprinting + duplicate collapse
│   ├── compression/        # JSON/citation/schema compression
│   ├── caching/            # Cache segment and packet cache
│   ├── fusion/             # Attention fusion + delta computation
│   ├── assembly/           # Provider-aware packet compiler
│   ├── ir/                 # Canonical ContextPacket IR
│   ├── providers/          # Provider adapters + registry
│   ├── precompute/         # Offline precompute pipeline + bow_embedding
│   ├── orchestration/      # Pipeline runner
│   ├── memory/             # Memory storage + compaction
│   ├── agents/             # Agent loop support
│   ├── integrations/       # LangChain / LlamaIndex wrappers
│   ├── mcp_server/         # MCP-style server
│   ├── web_ui.py           # Local visualization server
│   └── cli.py              # Command-line interface (`cpo`)
├── configs/                # Provider and runtime YAML configs
├── docs/                   # Architecture, algorithm, attention_fusion, comparison, CLI
├── benchmarks/             # Benchmark runners + result reports
├── examples/               # Demo scripts + GUI input samples
└── tests/                  # Test suite (72 tests)

Local-only artifacts excluded by .gitignore: .env, virtualenvs, caches, coverage outputs.

Citation

@software{r2026cfadv,
  author       = {Rohan R},
  title        = {CFAdv},
  year         = {2026},
  url          = {https://github.com/rotsl/CFAdv},
  version      = {0.1.0},
  orcid        = {0009-0005-9225-1775}
}

License

Apache-2.0. See LICENSE for details.

Contributing

See CONTRIBUTING.md for guidelines.

Roadmap

  • Additional file format support (EPUB, HTML)
  • Learned utility models from feedback
  • Distributed processing for large datasets
  • Tighter integration with popular RAG frameworks

Acknowledgments

CFAdv builds on ideas from information retrieval and operations research. The attention fusion module is inspired by Block Attention Residuals (AttnRes, arXiv:2603.15031).


CFAdv: less context, more signal.
