Compress structured logs by 7600x with semantic merging. CPU-only, streaming, LLM-ready. Preserve meaning, not just bits.
Distill massive datasets into their semantic essence.
A CPU-optimized streaming compressor that reduces structured logs by 99.98%+ while preserving queryable meaning. Built for LLM context optimization, RAG pipelines, and observability.
| Metric | Input | Output | Reduction |
|---|---|---|---|
| Log Lines | 100,000 | 14 | 7601x |
| Semantic Concepts | Unknown | 14 Unique Events | 99.98% |
| Processing Speed | - | 63 lines/sec | CPU Only |
| LLM Token Cost | ~$2.00 | ~$0.0003 | ~6600x savings |
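The token-cost row is back-of-envelope arithmetic. A minimal sketch, assuming roughly 20 tokens per log line and $1.00 per million input tokens (both are illustrative assumptions, not measured values):

```python
# Illustrative token-cost estimate. ASSUMPTIONS (not measured):
# ~20 tokens per log line, $1.00 per 1M input tokens.
TOKENS_PER_LINE = 20
PRICE_PER_TOKEN = 1.00 / 1_000_000

def llm_cost(lines: int) -> float:
    """Approximate LLM input cost for `lines` log lines."""
    return lines * TOKENS_PER_LINE * PRICE_PER_TOKEN

raw_cost = llm_cost(100_000)    # the raw HDFS log
compressed_cost = llm_cost(14)  # the 14 semantic entries
print(f"${raw_cost:.2f} vs ${compressed_cost:.4f}")  # → $2.00 vs $0.0003
```

Under these assumptions the raw log costs about $2.00 per LLM call while the compressed form costs a fraction of a cent, which is where the ~6600x savings figure comes from.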
From 100k lines of HDFS logs → 14 semantic entries.
Errors stay separated from info logs. Temporal context is preserved. Meaning survives.
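To make "14 semantic entries" concrete, here is a hypothetical shape for one such entry. The field names and values below are illustrative assumptions, not the tool's actual output schema:

```python
from dataclasses import dataclass

# HYPOTHETICAL entry shape (field names are assumptions,
# not this tool's actual schema).
@dataclass
class SemanticEntry:
    template: str   # merged message pattern with variables masked out
    level: str      # "INFO" / "ERROR" kept separate, never merged across levels
    count: int      # how many raw lines collapsed into this entry
    first_seen: str # temporal context: earliest timestamp observed
    last_seen: str  # temporal context: latest timestamp observed

# Example entry (values invented for illustration).
entry = SemanticEntry(
    template="Received block <BLK_ID> of size <N> from <IP>",
    level="INFO",
    count=28_431,
    first_seen="2008-11-09 20:35:18",
    last_seen="2008-11-10 03:12:44",
)
```

Keeping `level` in the merge key is what guarantees errors never disappear into a pile of info logs, and the `first_seen`/`last_seen` pair is one simple way to retain temporal context after merging.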
LLMs are expensive. Context windows are growing, but so is the data. Traditional compression (ZIP, GZIP) saves bits, but not meaning.
Semantic compression solves this by:
- ✅ Merging semantically similar events (not just identical strings)
- ✅ Preserving temporal relationships (what happened when)
- ✅ Reducing token costs by orders of magnitude
- ✅ Running on CPU, no GPU required
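One way the merging step can work, sketched below, is template extraction: mask the variable fields (IDs, IPs, numbers) so near-identical lines collapse into one entry per log level and template. This is a minimal illustration of the idea, not necessarily this tool's exact algorithm:

```python
import re
from collections import defaultdict

# Sketch of template-based semantic merging (an assumption about the
# approach, not this tool's actual implementation). Variable fields are
# masked so "similar but not identical" lines share one template.
MASKS = [
    (re.compile(r"\d+\.\d+\.\d+\.\d+(:\d+)?"), "<IP>"),  # IPs, optional port
    (re.compile(r"blk_-?\d+"), "<BLK_ID>"),              # HDFS block ids
    (re.compile(r"\b\d+\b"), "<N>"),                     # remaining numbers
]

def to_template(message: str) -> str:
    """Replace variable fields with placeholder tokens."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return message

def merge_stream(lines):
    """Single streaming pass; memory scales with unique templates, not input size."""
    merged = defaultdict(int)
    for line in lines:
        level, _, message = line.partition(" ")
        merged[(level, to_template(message))] += 1  # level in the key: errors stay separate
    return merged

logs = [
    "INFO Received blk_123 of size 67108864 from 10.0.0.1",
    "INFO Received blk_456 of size 67108864 from 10.0.0.2",
    "ERROR Exception writing blk_789 to 10.0.0.3:50010",
]
merged = merge_stream(logs)
# The two INFO lines collapse into one template; the ERROR stays its own entry.
```

Because the merge keys on `(level, template)`, error events can never be absorbed into info-level entries, matching the separation described above.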
Ideal for:
- RAG systems drowning in retrieved chunks
- Log analytics at scale (DevOps, SRE)
- LLM training data preprocessing
- Real-time stream processing for agents
```shell
pip install -r requirements.txt
```