OpenReader

Privacy-first PDF tools for humans and AI agents — entirely local.
Read, search, compare, merge, split, extract, and compress PDFs without uploading documents anywhere.

Overview

OpenReader is a local-first PDF utility that works with AI agents.

Every operation — reading, searching, annotating, compressing, comparing, merging, splitting — runs on your machine. No accounts. No subscriptions. No telemetry. No cloud uploads.

Use OpenReader directly, from scripts, or through AI agents.

Download

Microsoft Store (Recommended)

OpenReader is live in the Microsoft Store.

Recommended: install "OpenReader by Sparsh" from the Microsoft Store.

GitHub Releases (Advanced Users)

For advanced users, the MSIX package and the other builds listed below remain available on the GitHub Releases page.

Platform	Package	Notes
Windows 10/11	`OpenReader.msix`	MSIX package. May be unsigned — requires Developer Mode for sideloading.
Windows 10/11	`OpenReader-Setup.exe`	Legacy installer for manual recovery. Requires administrator rights.
Windows 10/11	`OpenReader-Windows.zip`	Portable ZIP.
macOS	`OpenReader-macOS-*.zip`	Experimental. See macOS notes.
Linux	—	Unsupported.

Platform Support

Platform	Status
Windows 10/11	Supported
Microsoft Store	Live in Microsoft Store
GitHub MSIX	Advanced users
macOS Apple Silicon	Experimental
macOS Intel	Experimental
Linux	Unsupported

Update Policy

OpenReader does not install updates itself.

Microsoft Store installations update automatically through the Store.
GitHub MSIX installations: Help → Check for Updates opens the releases page. Download and install manually.
Source builds: git pull and rebuild.

AI Agent Integration (MCP Server)

OpenReader ships with a built-in MCP (Model Context Protocol) server. Any MCP-compatible agent — Claude Code, Claude Desktop, Hermes, or others — can interact with PDFs directly on your machine.

No cloud, no API keys, no document uploads.

What you can do with AI agents

Workflow	What happens
Ask questions about a PDF	Agent extracts text from any page and answers.
Search entire PDF libraries	Agent indexes a folder and searches across all documents (keyword or semantic).
Compare document versions	Agent runs a side-by-side diff and gives you a structured summary.
Summarize research collections	Agent reads multiple PDFs and synthesizes findings.
Build automated PDF pipelines	Write scripts that merge, split, compress, and extract — all local.
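
The "Compare document versions" workflow above returns a structured diff summary. As a rough illustration of what such a summary could contain, the sketch below counts inserted, deleted, and unchanged lines with Python's standard `difflib`. It is not OpenReader's comparison code: `diff_summary` and the sample lines are hypothetical, and the real tool may diff at a different granularity.

```python
import difflib


def diff_summary(old_lines, new_lines):
    """Count inserted, deleted, and unchanged lines between two versions."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    summary = {"inserted": 0, "deleted": 0, "unchanged": 0}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            summary["unchanged"] += i2 - i1
        if tag in ("delete", "replace"):
            summary["deleted"] += i2 - i1   # lines present only in the old version
        if tag in ("insert", "replace"):
            summary["inserted"] += j2 - j1  # lines present only in the new version
    return summary


# Hypothetical text extracted from two versions of the same page.
old = ["Title", "Payment due in 30 days.", "Signed: A"]
new = ["Title", "Payment due in 45 days.", "Signed: A", "Appendix"]
print(diff_summary(old, new))
```

A changed line counts once as a deletion and once as an insertion, which matches the red/green colouring described under Features.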

Architecture

┌─────────────────────────────────────────────────────┐
│  Claude / Hermes / any MCP-compatible agent          │
│  (asks questions, runs searches, compares docs)       │
└──────────┬──────────────────────────────────────────┘
           │ MCP protocol (stdio or SSE)
           ▼
┌─────────────────────────────────────────────────────┐
│  OpenReader MCP Server                               │
│  pdfreader_lib/mcp_server.py                         │
│  14 tools: extract, search, compare, merge, split…   │
└──────────┬──────────────────────────────────────────┘
           │ local file access only
           ▼
┌─────────────────────────────────────────────────────┐
│  Your PDFs (stored on your machine)                   │
└─────────────────────────────────────────────────────┘

Quick setup

# Install MCP dependencies
pip install -r requirements-mcp.txt

Add to your MCP-compatible agent's configuration:

{
  "mcpServers": {
    "openreader": {
      "command": "python",
      "args": ["-m", "pdfreader_lib.mcp_server"]
    }
  }
}
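
If the agent already has other MCP servers configured, the entry above has to be merged rather than pasted over the file. The sketch below shows one way to do that in Python. It assumes the client stores its configuration as a JSON object with an `mcpServers` key, as in the snippet above; `add_openreader_server` is a hypothetical helper, not part of OpenReader.

```python
import json
import sys


def add_openreader_server(config, python_path=sys.executable):
    """Return a copy of an MCP client config with an "openreader" entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["openreader"] = {
        "command": python_path,
        "args": ["-m", "pdfreader_lib.mcp_server"],
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_openreader_server(existing), indent=2))
```

Defaulting `command` to `sys.executable` is a design choice for this sketch: when the MCP dependencies live in a virtual environment, a bare `python` may resolve to a different interpreter.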

The server runs over stdio by default. For HTTP/SSE transport:

python -m pdfreader_lib.mcp_server --transport sse --port 8312

All operations are local. No data is uploaded anywhere.

Features

Category	Capabilities
Reading	Open PDFs, one-page view, previous/next navigation, page jump, fit-width, zoom in/out
Multi-tab	Open several documents in a single window with movable, closeable tabs. Ctrl+T new tab, Ctrl+W close tab, Ctrl+Shift+W close all
Session restore	Remembers open PDFs and page positions across restarts. Auto or manual restore (File menu toggle)
Search (keyword)	Full-document text search, match count, next/previous result navigation (PageUp/PageDown). Ctrl+F to focus
Search (semantic)	TF-IDF cosine similarity search across indexed library. Toggle "Semantic" in search bar
Library search	SQLite FTS5 full-text index over entire folders. Cross-document search ranked by BM25. Ctrl+Shift+F shortcut
PDF comparison	Side-by-side diff with color-coded changes (red delete, green insert) and diff summary
Copying	Drag-select text from the visible page and copy with `Ctrl+C` or the Copy button
OCR fallback	Attempts OCR-assisted selection on scanned/image-based pages when Tesseract OCR data is available
Annotations	Highlight, underline, and strikethrough selected text; sticky notes on any page. Saved as native PDF annotations
Annotation management	Show/hide annotations toggle (View menu). Delete all annotations on current page or entire document (Tools menu)
Save PDF	Explicit File → Save (Ctrl+S) to persist annotation edits immediately
PDF tools	Merge PDFs, split every page, extract page ranges like `1-3,5`, save compressed copies
Dark mode	System-aware dark theme (Catppuccin Mocha) with Auto/Light/Dark toggle via View → Theme
Recent files	Quick access to the last 10 opened PDFs via File → Open Recent
Update detection	Help → Check for Updates queries the GitHub API and opens the releases page.
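
The "Library search" row describes an SQLite FTS5 index ranked by BM25. The following is a minimal sketch of that technique using Python's built-in `sqlite3` module. The table layout and the `build_index` and `search` helpers are assumptions made for illustration, not OpenReader's actual schema, and the sketch assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_index(pages):
    """Create an in-memory FTS5 index from (path, page, text) tuples."""
    conn = sqlite3.connect(":memory:")
    # path and page are stored but not tokenized; only body is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, page UNINDEXED, body)"
    )
    conn.executemany("INSERT INTO pages (path, page, body) VALUES (?, ?, ?)", pages)
    return conn


def search(conn, query, limit=10):
    """Return (path, page) hits, best BM25 match first."""
    # bm25() returns lower values for better matches, so ascending order ranks best first.
    rows = conn.execute(
        "SELECT path, page FROM pages WHERE pages MATCH ? ORDER BY bm25(pages) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


# Hypothetical page texts, as if extracted from two PDFs in a folder.
conn = build_index([
    ("a.pdf", 1, "quarterly invoice for consulting services"),
    ("b.pdf", 1, "meeting notes about hiring"),
    ("b.pdf", 2, "travel policy and expense limits"),
])
print(search(conn, "invoice"))
```

The query string is passed to FTS5 as-is, so user input containing FTS5 operators or punctuation would need quoting before it reaches `MATCH`.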

Screenshots

[Screenshots: Reader, Sample PDF, Dark Mode, PDF Tools, Sample PDF 2]

Privacy and Security

OpenReader processes PDFs locally and never uploads them. The only network request described in this README is the optional Help → Check for Updates action, which queries the GitHub API.

The app applies lightweight safety checks before opening and rendering documents, and the project adds a few supporting measures:

Accepts .pdf files only.
Checks for a PDF header before parsing.
Rejects empty files and files over 500 MB.
Rejects pages outside the supported page-size limit.
Caps render pixel allocation to reduce PDF-bomb/OOM risk.
Limits all-pages search result storage.
Keeps only a small OCR page cache in memory.
Runs pip-audit and Bandit in CI.

These checks reduce risk from malformed or oversized PDFs, but PDF parsing still depends on PyMuPDF/MuPDF. Avoid opening PDFs from untrusted sources unless you use OS-level sandboxing, a VM, or another isolation layer.
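
The first three checks in the list above (extension, header, size) can be sketched in a few lines of standard-library Python. This illustrates the idea and is not OpenReader's implementation: `precheck_pdf` is a hypothetical helper, "500 MB" is interpreted as 500 × 1024 × 1024 bytes, and requiring `%PDF-` at the very start of the file is a stricter reading of "checks for a PDF header" than the app may use.

```python
import tempfile
from pathlib import Path

MAX_BYTES = 500 * 1024 * 1024  # assumption: "500 MB" read as 500 MiB


def precheck_pdf(path):
    """Return None if the file passes the basic checks, else a reason string."""
    path = Path(path)
    if path.suffix.lower() != ".pdf":
        return "not a .pdf file"
    size = path.stat().st_size
    if size == 0:
        return "empty file"
    if size > MAX_BYTES:
        return "file larger than 500 MB"
    with path.open("rb") as handle:
        header = handle.read(5)
    if header != b"%PDF-":
        return "missing PDF header"
    return None


# Try the checks on two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "ok.pdf"
    good.write_bytes(b"%PDF-1.7\n")
    fake = Path(tmp) / "fake.pdf"
    fake.write_bytes(b"plain text, renamed")
    print(precheck_pdf(good))
    print(precheck_pdf(fake))
```

Checks like these only filter obvious problems before the parser runs, so the isolation advice above still applies.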

License

OpenReader is free software under the GNU AGPLv3.

Build From Source

Windows

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -r requirements.txt
python main.py

Build the executable:

.\scripts\build_windows.ps1

Output:

dist\OpenReader\
├── OpenReader.exe
└── _internal\
    ├── python311.dll
    ├── PySide6\
    └── ...

macOS

macOS packaged builds are experimental. To run from source:

git clone https://github.com/sparshsam/openreader.git
cd openreader
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python main.py

See docs/macos.md for macOS setup and OCR notes.

OCR Setup

Text selection works natively on PDFs with embedded text. For scanned/image-only PDFs, the app falls back to OCR via PyMuPDF's Tesseract integration.

Windows: Download Tesseract from UB-Mannheim/tesseract, run the installer, check "Add to PATH", restart the app.

macOS: brew install tesseract

Linux (unsupported, source builds only): sudo apt install tesseract-ocr tesseract-ocr-eng
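
After installing Tesseract, a quick way to confirm that the binary is visible on PATH is the standard-library check below. This is a diagnostic sketch only: `tesseract_on_path` is a hypothetical helper, and how OpenReader itself locates Tesseract and its language data may differ.

```python
import shutil


def tesseract_on_path():
    """Return the resolved path of the tesseract binary, or None if not found."""
    return shutil.which("tesseract")


location = tesseract_on_path()
if location:
    print(f"tesseract found at {location}")
else:
    print("tesseract not found on PATH; OCR fallback may be unavailable")
```

Run it from the same shell or environment that launches the app, since PATH changes made by an installer usually only apply to newly started processes.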

Project Structure

.
├── .github/                 # CI, security checks, Dependabot
├── assets/                  # App icon and README screenshots
├── docs/                    # Platform notes and known limitations
├── installer/               # Inno Setup installer script (legacy)
├── packaging/               # MSIX packaging
├── scripts/                 # Build scripts
├── tests/                   # Regression test suite
├── tools/                   # Developer utilities and CI test helpers
├── main.py                  # Main PySide6 application
├── pdfreader_lib/           # Core library (search, comparison, MCP server)
├── requirements.txt         # Pinned runtime/build dependencies
├── requirements-mcp.txt     # MCP server dependencies (optional)
└── CHANGELOG.md

Contributing

Contributions are welcome. Please read CONTRIBUTING.md and SECURITY.md before opening issues or pull requests.

Tech Stack

Layer	Choice
Language	Python 3.11+
UI Framework	PySide6 (Qt 6)
PDF Rendering	PyMuPDF (MuPDF)
Search	SQLite FTS5 (keyword), TF-IDF / scikit-learn (semantic)
OCR	PyMuPDF / Tesseract integration
Packaging	PyInstaller (onedir), MSIX
CI/CD	GitHub Actions (Windows + macOS)
Security scanning	Bandit, pip-audit
Platform	Windows (primary), macOS (experimental)
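
For readers unfamiliar with the semantic search technique named in the table, here is a small sketch of TF-IDF ranking with cosine similarity. The table lists scikit-learn for this; the sketch deliberately uses only the standard library so it runs without extra installs, and every function name in it is hypothetical rather than taken from `pdfreader_lib`.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(doc) for doc in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    # Smoothed IDF: terms that appear in every document still get a weight of 1.
    idf = {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return indices of documents with non-zero similarity, most similar first."""
    vectors, idf = tfidf_vectors(docs)
    counts = Counter(tokenize(query))
    query_vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    return [i for score, i in sorted(scored, key=lambda s: (-s[0], s[1])) if score > 0]


docs = [
    "invoice payment terms",
    "hiking trail map",
    "payment schedule for the quarterly report",
]
print(rank("invoice payment", docs))
```

The first document matches both query terms and is short, so it scores highest. The third matches only "payment" and is longer, so it ranks second. The second document shares no terms with the query and is dropped.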

Last updated: June 2026
