pdfRest Python SDK

Build production-grade PDF automation with the official Python SDK for pdfRest, a PDF API platform for conversion, OCR, extraction, redaction, security, forms, and AI-ready document workflows.

Why pdfRest

  • Enterprise PDF quality powered by Adobe PDF Library technology.
  • Fast onboarding with API Lab, code samples, and straightforward REST patterns.
  • Chainable API workflows that let you pass outputs between calls.
  • Deployment flexibility: Cloud, self-hosted on AWS, or self-hosted container.
  • Security and compliance resources published in the trust center and product documentation.

Why this SDK

  • Official typed Python interface to pdfRest (PdfRestClient and AsyncPdfRestClient).
  • Pydantic-backed request/response models for safer integrations.
  • High-level helpers for the endpoints teams use most in production.
  • Consistent error handling, request customization, and file management helpers.

What you can build

Use this PDF API for workflows like:

  • Convert and transform: PDF to Word/Excel/PowerPoint/images/Markdown, and convert files to PDF, PDF/A, or PDF/X.
  • Extract and analyze: OCR, text extraction, image extraction, PDF metadata.
  • Secure and govern: redaction, encryption, permissions, signing, watermarking.
  • Compose and optimize: merge/split, compress, flatten, rasterize, color conversion.
  • Form operations: import/export form data, flatten forms, convert XFA forms to AcroForm.

Built for AI and LLM pipelines

pdfRest is especially useful for document AI systems:

  • Convert PDFs to structured Markdown for downstream retrieval and training data prep.
  • Extract clean text and metadata for indexing and chunking pipelines.
  • Summarize and translate document content with API-native operations.
  • Keep multi-step pipelines efficient by chaining outputs between operations.
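
As a rough illustration of the chunking step mentioned above, here is a minimal sketch that splits extracted text into overlapping character windows. It is not part of the SDK: `chunk_text` and its `size`/`overlap` parameters are hypothetical names, and a real pipeline may prefer token-based or heading-aware splitting.

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into windows of at most `size` characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the two chunks.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text.
        if start + size >= len(text):
            break
    return chunks
```

The input would typically be the string returned by a text-extraction call, such as the `document_text` value used in the Quick start examples.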

Installation

pdfrest supports Python 3.10+.

Recommended (uv):

uv add pdfrest

Fallback (pip):

pip install pdfrest

Quick start

Set your API key in PDFREST_API_KEY:

export PDFREST_API_KEY="your-api-key"

Run your script:

uv run python your_script.py

Example (upload + extract text):

from pathlib import Path

from pdfrest import PdfRestClient

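# The client is created without arguments and used as a context manager.
with PdfRestClient() as client:
    # Upload the local file, then extract its text as one document-level string.
    uploaded = client.files.create_from_paths([Path("input.pdf")])[0]
    result = client.extract_pdf_text(uploaded, full_text="document")

# Print the first 500 characters, or an empty string if no text came back.
preview = ""
if result.full_text is not None and result.full_text.document_text is not None:
    preview = result.full_text.document_text[:500]
print(preview)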

Async example:

import asyncio
from pathlib import Path

from pdfrest import AsyncPdfRestClient


async def main() -> None:
    async with AsyncPdfRestClient() as client:
        uploaded = (await client.files.create_from_paths([Path("input.pdf")]))[0]
        result = await client.extract_pdf_text(uploaded, full_text="document")
        preview = ""
        if result.full_text is not None and result.full_text.document_text is not None:
            preview = result.full_text.document_text[:500]
        print(preview)


asyncio.run(main())
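
When many files need processing, one common pattern with an async client is to run requests concurrently while capping how many are in flight. The sketch below shows that pattern with `asyncio.Semaphore` and `asyncio.gather`. It is deliberately SDK-agnostic: `extract_many`, `extract_one`, and `max_concurrency` are hypothetical names, and `extract_one` stands in for any coroutine you write, for example one that wraps the upload and extract calls from the async example above.

```python
import asyncio
from collections.abc import Awaitable, Callable, Sequence
from pathlib import Path


async def extract_many(
    paths: Sequence[Path],
    extract_one: Callable[[Path], Awaitable[str]],
    max_concurrency: int = 4,
) -> list[str]:
    """Run extract_one for every path, at most max_concurrency at a time.

    Results are returned in the same order as `paths`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(path: Path) -> str:
        # Each task waits here until one of the slots is free.
        async with semaphore:
            return await extract_one(path)

    return await asyncio.gather(*(guarded(path) for path in paths))
```

A suitable limit depends on your plan and deployment, so treat the default of 4 as a placeholder rather than a recommendation.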

Deployment options

  • Cloud (default): use PdfRestClient() with PDFREST_API_KEY.
  • Self-hosted: set base_url="https://your-api-host" and keep the same Python SDK surface.

Learn more

For contributors

Contributor workflows live in CONTRIBUTING.md.