80 changes: 80 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,80 @@
# Contributing

Thanks for contributing to `pdfrest`.

## Development setup

1. Install project tooling:

```bash
uv sync --group dev
```

2. (Recommended) install git hooks:

```bash
uv run pre-commit install
```

3. Verify package import/version:

```bash
uv run python -c "import pdfrest; print(pdfrest.__version__)"
```

## Code quality checks

Run these before opening a PR:

```bash
uv run ruff format .
uv run ruff check .
uv run basedpyright
```

## Tests

Quick local run:

```bash
uv run pytest -n auto --maxschedchunk 2
```

Full interpreter matrix with coverage artifacts (`coverage/py<version>/`):

```bash
uvx nox -s tests
```

Class/function coverage gate for client classes:

```bash
uvx nox -s class-coverage
```

To reuse existing coverage JSON without rerunning tests:

```bash
uvx nox -s class-coverage -- --no-tests
```
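
When reusing a report, it can help to see which files fall below a target before rerunning the gate. The following is a minimal sketch, assuming the reused file is in coverage.py's JSON report format (a top-level `files` mapping whose entries carry `summary.percent_covered`). `files_below` and `load_report` are hypothetical helpers for illustration and are not part of the nox session.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Tuple


def load_report(path: Path) -> Dict[str, Any]:
    """Read a coverage JSON report from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


def files_below(report: Dict[str, Any], threshold: float) -> List[Tuple[str, float]]:
    """Return (path, percent) pairs for files under the threshold, lowest first.

    Assumes coverage.py's JSON layout: report["files"][path]["summary"]["percent_covered"].
    """
    low = []
    for path, data in report.get("files", {}).items():
        percent = float(data["summary"]["percent_covered"])
        if percent < threshold:
            low.append((path, percent))
    return sorted(low, key=lambda item: item[1])
```

To use it, pass the report under `coverage/py<version>/` for the interpreter you tested with, for example `files_below(load_report(path), 90.0)`; the 90.0 threshold is an arbitrary value chosen for this sketch.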

## Examples

Run all examples:

```bash
uvx nox -s examples
```

Run one example:

```bash
uv run nox -s run-example -- examples/delete/delete_example.py
```

## Docs preview (optional)

```bash
# Serve the docs site locally
uv run mkdocs serve
# Build the static site; --strict aborts the build on warnings
uv run mkdocs build --strict
```
142 changes: 111 additions & 31 deletions README.md
@@ -1,61 +1,141 @@
# pdfrest
# pdfRest Python SDK

Python client library for the PDFRest service. The project is managed with
[uv](https://docs.astral.sh/uv/) and targets Python 3.9 and newer.
[![Tests](https://img.shields.io/github/actions/workflow/status/pdfrest/pdfrest-python/test-and-publish.yml?branch=main&label=tests)](https://github.com/pdfrest/pdfrest-python/actions/workflows/test-and-publish.yml)
[![PyPI Version](https://img.shields.io/pypi/v/pdfrest)](https://pypi.org/project/pdfrest/)
[![Python Versions](https://img.shields.io/pypi/pyversions/pdfrest)](https://pypi.org/project/pdfrest/)
[![llms.txt](https://img.shields.io/badge/llms.txt-available-2ea44f)](https://python.pdfrest.com/llms.txt)

## Running examples
Build production-grade PDF automation with the official Python SDK for
[pdfRest](https://pdfrest.com/): a powerful PDF API platform for conversion,
OCR, extraction, redaction, security, forms, and AI-ready document workflows.

```bash
uvx nox -s examples
uv run nox -s run-example -- examples/delete/delete_example.py
```
- Homepage: [pdfrest.com](https://pdfrest.com/)
- API docs: [pdfrest.com/apidocs](https://pdfrest.com/apidocs/)
- Python SDK docs: [python.pdfrest.com](https://python.pdfrest.com/)
- API Lab: [pdfrest.com/apilab](https://pdfrest.com/apilab/)

## Getting started
## Why pdfRest

```bash
uv sync
uv run python -c "import pdfrest; print(pdfrest.__version__)"
```
- Enterprise PDF quality powered by Adobe PDF Library technology.
- Fast onboarding with API Lab, code samples, and straightforward REST patterns.
- Chainable API workflows that let you pass outputs between calls.
- Deployment flexibility: Cloud, self-hosted on AWS, or self-hosted container.
- Security and compliance resources published in the trust center and product
documentation.

## Why this SDK

- Official typed Python interface to pdfRest (`PdfRestClient` and
`AsyncPdfRestClient`).
- Pydantic-backed request/response models for safer integrations.
- High-level helpers for the endpoints teams use most in production.
- Consistent error handling, request customization, and file management helpers.

## What you can build

## Development
Use this PDF API for workflows like:

To install the tooling used by CI locally, include the `--group dev` flag:
- Convert and transform: PDF to Word/Excel/PowerPoint/images/Markdown, and
convert files to PDF/PDF-A/PDF-X.
- Extract and analyze: OCR, text extraction, image extraction, PDF metadata.
- Secure and govern: redaction, encryption, permissions, signing, watermarking.
- Compose and optimize: merge/split, compress, flatten, rasterize, color
conversion.
- Form operations: import/export form data, flatten forms, XFA to Acroforms.

## Built for AI and LLM pipelines

pdfRest is especially useful for document AI systems:

- Convert PDFs to structured Markdown for downstream retrieval and training data
prep.
- Extract clean text and metadata for indexing and chunking pipelines.
- Summarize and translate document content with API-native operations.
- Keep multi-step pipelines efficient by chaining outputs between operations.

## Installation

`pdfrest` supports Python `3.10+`.

Recommended (`uv`):

```bash
uv sync --group dev
uv add pdfrest
```

It is recommended to enable the pre-commit hooks after installation:
Fallback (`pip`):

```bash
uv run pre-commit install
pip install pdfrest
```

Run the test suite with:
## Quick start

Set your API key in `PDFREST_API_KEY`:

```bash
uv run pytest
export PDFREST_API_KEY="your-api-key"
```
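
A script can fail fast with a clear message when the key is missing. This is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper, not part of the SDK, and it assumes only that the key lives in `PDFREST_API_KEY` as described above.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the pdfrest SDK.
    """
    # Allow a mapping to be injected so the check is easy to test.
    source = os.environ if env is None else env
    key = source.get("PDFREST_API_KEY", "").strip()
    if not key:
        raise RuntimeError("PDFREST_API_KEY is not set; export it before running.")
    return key
```

Calling `require_api_key()` at the top of a script surfaces a missing key before any upload is attempted.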

Check per-function coverage for the client classes:
Run your script:

```bash
uvx nox -s class-coverage
uv run python your_script.py
```

To reuse an existing `coverage/py<version>/coverage.json` without rerunning
tests, add `-- --no-tests` (and optional `--coverage-json path`).
Example (upload + extract text):

## Documentation
```python
from pathlib import Path

Run the docs site locally:
from pdfrest import PdfRestClient

```bash
uv run mkdocs serve
with PdfRestClient() as client:
    uploaded = client.files.create_from_paths([Path("input.pdf")])[0]
    result = client.extract_pdf_text(uploaded, full_text="document")

    preview = ""
    if result.full_text is not None and result.full_text.document_text is not None:
        preview = result.full_text.document_text[:500]
    print(preview)
```

Build the static documentation site:
Async example:

```bash
uv run mkdocs build --strict
```python
import asyncio
from pathlib import Path

from pdfrest import AsyncPdfRestClient


async def main() -> None:
    async with AsyncPdfRestClient() as client:
        uploaded = (await client.files.create_from_paths([Path("input.pdf")]))[0]
        result = await client.extract_pdf_text(uploaded, full_text="document")
        preview = ""
        if result.full_text is not None and result.full_text.document_text is not None:
            preview = result.full_text.document_text[:500]
        print(preview)


asyncio.run(main())
```

## Deployment options

- Cloud (default): use `PdfRestClient()` with `PDFREST_API_KEY`.
- Self-hosted: set `base_url="https://your-api-host"` and keep the same Python
SDK surface.
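
One way to switch between the two modes without code changes is to read the host from the environment. This is a sketch under stated assumptions: `PDFREST_BASE_URL` is a hypothetical variable name chosen for this example, `client_kwargs` is a hypothetical helper, and the client constructor is assumed to accept `base_url` as a keyword argument, as the bullet above suggests.

```python
import os
from typing import Dict, Mapping, Optional


def client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for the client constructor.

    PDFREST_BASE_URL is a hypothetical variable name used only in this sketch.
    """
    source = os.environ if env is None else env
    base_url = source.get("PDFREST_BASE_URL", "").strip().rstrip("/")
    # With no override, return no arguments so the client keeps its cloud default.
    return {"base_url": base_url} if base_url else {}


# Usage (assumes the constructor accepts base_url):
#   from pdfrest import PdfRestClient
#   with PdfRestClient(**client_kwargs()) as client:
#       ...
```

Returning an empty dict for the cloud case means the same call site works for both deployment options.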

## Learn more

- API toolkit overview: [pdfrest.com](https://pdfrest.com/)
- Resources and insights:
[pdfrest.com/resources](https://pdfrest.com/resources/)
- Example scripts: `examples/README.md`
- Python SDK docs: [python.pdfrest.com](https://python.pdfrest.com/)

## For contributors

Contributor workflows live in `CONTRIBUTING.md`.