Agentic workflows for spatial pathology.
spatho is the Python package and CLI for Agentic Spatial Pathologist. It wraps the lower-level histoseg engine with workflow configuration, organ packs, artifact manifests, H&E overlays, structure review, and report generation.
The platform framing is:
stGPT learns reusable contour/region morpho-molecular representations; spatho plans, validates, and turns them into auditable spatial pathology evidence.
Legacy standalone surface: the canonical product-layer implementation is being integrated into ASTRO under app/src/xenium_ai_discovery/pathology_app/. This repository remains the compatibility, packaging, and deployment-oriented shell for spatho.
spatho now supports two parallel AI review paths. The paid OpenAI API route remains available and unchanged, while the local pathology-ai route adds a self-hosted option for private or cluster deployments.
| Backend | How it runs | Best for | Key settings |
|---|---|---|---|
| openai | Calls the paid OpenAI API with your OPENAI_API_KEY. | Fast setup, managed models, lightweight local machine. | pathology_review_backend="openai" |
| pathology_ai_api | Calls a local HTTP service backed by vLLM, embeddings, reranking, and Qdrant. | PDC/HPC, private data, cost control, local model operations. | pathology_review_backend="pathology_ai_api" and pathology_ai_api_base_url |
The two paths are intentionally independent: enabling the local service does not remove or disable the OpenAI backend.
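As a rough illustration of how the two backends differ at the config level, the sketch below checks a workflow dict before a run. The helper name `check_review_backend` and its messages are hypothetical and are not part of spatho; only the key names, the two backend values, and the OPENAI_API_KEY variable come from the table above.

```python
import os

# Backend values listed in the table above.
KNOWN_BACKENDS = {"openai", "pathology_ai_api"}


def check_review_backend(config, env=None):
    """Return a list of problems; an empty list means the backend settings look usable.

    Hypothetical helper for illustration only, not a spatho API.
    """
    env = os.environ if env is None else env
    problems = []
    backend = config.get("pathology_review_backend")
    if backend not in KNOWN_BACKENDS:
        problems.append(f"unknown pathology_review_backend: {backend!r}")
    elif backend == "openai" and not env.get("OPENAI_API_KEY"):
        # The paid route needs an API key in the environment.
        problems.append("OPENAI_API_KEY is not set")
    elif backend == "pathology_ai_api" and not config.get("pathology_ai_api_base_url"):
        # The local route needs the URL of the self-hosted service.
        problems.append("pathology_ai_api_base_url is required for the local backend")
    return problems


if __name__ == "__main__":
    local_config = {
        "pathology_review_backend": "pathology_ai_api",
        "pathology_ai_api_base_url": "http://localhost:8000",
    }
    print(check_review_backend(local_config, env={}))
```

Because each branch looks only at its own settings, a config for the local service passes this check whether or not an OpenAI key is present, which mirrors the independence of the two paths.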
spatho is the agentic spatial pathology workbench layer. It can consume precomputed stGPT artifacts without importing stgpt, or call a local stGPT install when stgpt_backend="local_stgpt" is configured. stGPT evidence is guarded before biological review: missing required artifacts cause spatho doctor to report a not-ready state, fatal QC blocks a run, and warning-only QC is shown as cautionary model-derived evidence.
The stGPT workflow fields are disabled by default. A config that enables them with precomputed artifacts looks like:
{
"stgpt_enabled": true,
"stgpt_backend": "precomputed_artifacts",
"stgpt_artifact_dir": "/path/to/stgpt/spatho_export",
"stgpt_min_cell_coverage": 0.95,
"stgpt_require_qc_pass": true
}

Expected stGPT artifacts are cell_embeddings.parquet, structure_embedding_summary.csv or structure_summary.parquet, and qc_report.json. The workbench writes stgpt_evidence_summary.csv/json, updates the artifact manifest, and inserts a report section that labels the evidence as model-derived rather than measured expression.
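The artifact requirement above can be pre-checked before a run. The following is a minimal sketch that treats the two structure summary files as alternatives, which is one reading of the sentence above. The function name `missing_stgpt_artifacts` is hypothetical and is not how `spatho doctor` is implemented.

```python
from pathlib import Path

# File names taken from the text above. Each tuple is a group of acceptable
# alternatives; treating the two structure summaries as alternatives is an assumption.
REQUIRED_ARTIFACTS = [
    ("cell_embeddings.parquet",),
    ("structure_embedding_summary.csv", "structure_summary.parquet"),
    ("qc_report.json",),
]


def missing_stgpt_artifacts(artifact_dir):
    """Return one description per artifact group that has no file in artifact_dir."""
    root = Path(artifact_dir)
    missing = []
    for alternatives in REQUIRED_ARTIFACTS:
        if not any((root / name).is_file() for name in alternatives):
            missing.append(" or ".join(alternatives))
    return missing


if __name__ == "__main__":
    # Same placeholder path as the stgpt_artifact_dir example above.
    for item in missing_stgpt_artifacts("/path/to/stgpt/spatho_export"):
        print(f"missing: {item}")
```

An empty result only says the files exist. It says nothing about cell coverage or QC status, which the workflow fields stgpt_min_cell_coverage and stgpt_require_qc_pass govern.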
python -m pip install -U spatho

For local development:
git clone https://github.com/hutaobo/Agentic-Spatial-Pathologist.git
cd Agentic-Spatial-Pathologist
python -m pip install -U pip
python -m pip install -e ".[dev]"

If you are actively developing against a local histoseg checkout, install that editable copy first:
python -m pip install -e ../HistoSeg

Use this path when you want the simplest managed-model setup.
export OPENAI_API_KEY=sk-...
spatho init-workflow \
--organ breast \
--case-name breast_case_01 \
--dataset-root /path/to/Xenium_outs \
--base-pipeline-config /path/to/project/configs/breast_case_01.json \
--output /path/to/workflows/breast_case_01_openai.json
spatho run --config /path/to/workflows/breast_case_01_openai.json

The generated workflow can keep:
{
"pathology_review_backend": "openai"
}

Disable OpenAI and force heuristic mode when needed:
spatho run --config /path/to/workflow.json --heuristic-only

Use this path when you want pathology review to call a self-hosted service instead of the paid OpenAI API.
{
"pathology_review_backend": "pathology_ai_api",
"pathology_ai_api_base_url": "http://localhost:8000"
}

For PDC/Dardel deployment, see docs/PDC_LOCAL_PATHOLOGY_AI.md. The local stack is:
- pathology-ai: lightweight HTTP orchestration from this repo
- vllm: OpenAI-compatible local LLM endpoint
- embedder: TEI-compatible Python embedding service for BAAI/bge-m3
- reranker: TEI-compatible Python reranking service for BAAI/bge-reranker-v2-m3
- qdrant: local vector storage
Default local model configuration:
LLM_BASE_URL=http://127.0.0.1:8001/v1
LLM_MODEL=openai/gpt-oss-120b
EMBED_MODEL=BAAI/bge-m3
RERANK_MODEL=BAAI/bge-reranker-v2-m3
VECTOR_DB=qdrant
DEFAULT_TOP_K=6
STRICT_JSON=true
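For readers scripting against these variables, the sketch below reads them from the environment and falls back to the defaults listed above. The `LocalModelSettings` dataclass and `load_settings` function are hypothetical names, and the set of strings accepted as true for STRICT_JSON is an assumption. The service in src/pathology_ai_service may load its configuration differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalModelSettings:
    """Defaults copied from the configuration listing above."""

    llm_base_url: str = "http://127.0.0.1:8001/v1"
    llm_model: str = "openai/gpt-oss-120b"
    embed_model: str = "BAAI/bge-m3"
    rerank_model: str = "BAAI/bge-reranker-v2-m3"
    vector_db: str = "qdrant"
    default_top_k: int = 6
    strict_json: bool = True


def load_settings(env=None):
    """Build settings from environment variables, using the defaults when unset."""
    env = os.environ if env is None else env
    d = LocalModelSettings()
    # Assumption: these spellings count as "true" for STRICT_JSON.
    truthy = {"1", "true", "yes"}
    return LocalModelSettings(
        llm_base_url=env.get("LLM_BASE_URL", d.llm_base_url),
        llm_model=env.get("LLM_MODEL", d.llm_model),
        embed_model=env.get("EMBED_MODEL", d.embed_model),
        rerank_model=env.get("RERANK_MODEL", d.rerank_model),
        vector_db=env.get("VECTOR_DB", d.vector_db),
        default_top_k=int(env.get("DEFAULT_TOP_K", d.default_top_k)),
        strict_json=str(env.get("STRICT_JSON", d.strict_json)).strip().lower() in truthy,
    )


if __name__ == "__main__":
    print(load_settings())
```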
Check an environment and workflow config:
spatho doctor --config /path/to/workflow.json

List built-in organ packs:
spatho list-organ-packs

Export the workflow JSON schema:
spatho config-schema --output /path/to/workflow.schema.json

Build or refresh an artifact manifest:
spatho build-manifest --config /path/to/workflow.json

Write Xenium RNA+protein + H&E alignment fixtures:
spatho write-xenium-alignment-fixtures \
--output-dir /path/to/output/pipeline/validation \
--segmentation-source ranger_protein_assisted

This writes a Xenium RNA+protein alignment note, a fixture manifest, and transform cases covering identity, um -> pixel, translation, axis order, and composed polygon export.
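To make the transform cases concrete, here is a small pure-Python sketch of 2D affine transforms on polygon vertices. All function names are hypothetical and the pixel size of 0.25 um is an illustrative value, not a Xenium constant. The fixtures spatho writes may represent these cases differently.

```python
def identity():
    """3x3 homogeneous identity transform."""
    return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def scale(factor):
    """Uniform scaling, e.g. um -> pixel with factor = 1 / pixel_size_um."""
    return [[factor, 0.0, 0.0], [0.0, factor, 0.0], [0.0, 0.0, 1.0]]


def translate(tx, ty):
    """Shift points by (tx, ty) in the current coordinate units."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def swap_axes():
    """Exchange x and y, covering (x, y) versus (row, col) axis order."""
    return [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def compose(*matrices):
    """Combine transforms so that the first argument is applied to points first."""
    result = identity()
    for m in matrices:
        result = matmul(m, result)
    return result


def apply(matrix, points):
    """Apply a 3x3 affine transform to a list of (x, y) vertices."""
    return [
        (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
         matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
        for x, y in points
    ]


if __name__ == "__main__":
    pixel_size_um = 0.25  # illustrative only; a real run would use xenium_pixel_size_um
    polygon_um = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)]
    # Translate in micrometres first, then convert to pixels.
    um_to_pixel = compose(translate(2.5, 1.0), scale(1.0 / pixel_size_um))
    print(apply(um_to_pixel, polygon_um))
```

Composition order matters here: translating in micrometres and then scaling gives a different result from scaling first and translating in pixels, which is why a composed polygon export is worth testing separately from the individual cases.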
from spatho import run_evidence_workbench, run_workflow
result = run_workflow("/path/to/workflows/breast_case_01_openai.json")
print(result["pathology_report_html"])
workbench_result = run_evidence_workbench("/path/to/workflows/breast_case_01_openai.json")

Generate a starter config from Python:
from spatho import init_workflow
result = init_workflow(
"/path/to/workflows/breast_case_01_openai.json",
organ="breast",
case_name="breast_case_01",
dataset_root="/path/to/Xenium_outs",
base_pipeline_config="/path/to/project/configs/breast_case_01.json",
)
print(result["workflow_config"])

A typical full run produces:
- cluster evidence bundles
- OpenAI, local pathology-ai, or heuristic cluster annotations
- dendrogram-guided structure assignments
- clustermap and H&E overlay artifacts
- structure-level pathology reviews
- case-level HTML report
- machine-readable artifact manifest
spatho ships with built-in organ packs that define the annotation taxonomy, default study context, workflow parameter defaults, and expected artifact contract.
Built-in packs:
- lung
- breast
These packs live in src/spatho/organ_packs.
Workflow JSON files are backed by a formal schema exported from the package. For Xenium RNA+protein workflows, the config template records:
- dataset_modality = xenium_rna_protein
- canonical_space = physical_um
- export_space = xenium_explorer_pixel
- xenium_pixel_size_um
- segmentation_source
See docs/XENIUM_RNA_PROTEIN_ALIGNMENT.md for the rationale and polygon-level analysis model.
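The recorded fields can be sanity-checked on a loaded workflow dict. This is a minimal sketch with a hypothetical helper name, `check_alignment_fields`. The authoritative validation is the JSON schema exported by `spatho config-schema`, which this sketch does not reproduce.

```python
# Field names and fixed values taken from the list above.
ALIGNMENT_FIELDS = (
    "dataset_modality",
    "canonical_space",
    "export_space",
    "xenium_pixel_size_um",
    "segmentation_source",
)

EXPECTED_VALUES = {
    "dataset_modality": "xenium_rna_protein",
    "canonical_space": "physical_um",
    "export_space": "xenium_explorer_pixel",
}


def check_alignment_fields(config):
    """Return a list of problems with the Xenium RNA+protein alignment fields."""
    problems = [f"missing field: {key}" for key in ALIGNMENT_FIELDS if key not in config]
    for key, expected in EXPECTED_VALUES.items():
        if key in config and config[key] != expected:
            problems.append(f"{key} should be {expected!r}, got {config[key]!r}")
    return problems


if __name__ == "__main__":
    example = {
        "dataset_modality": "xenium_rna_protein",
        "canonical_space": "physical_um",
        "export_space": "xenium_explorer_pixel",
        "segmentation_source": "ranger_protein_assisted",
    }
    print(check_alignment_fields(example))
```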
- src/spatho: public-facing Python package and CLI
- src/pathology_ai_service: local pathology AI HTTP service
- deploy/pathology_ai: Docker Compose and PDC Slurm/Apptainer deployment assets
- docs/PDC_LOCAL_PATHOLOGY_AI.md: local/PDC pathology-ai deployment guide
- docs/PYPI_RELEASE.md: PyPI publishing checklist
- examples/workflows: public-safe starter workflow templates
- main.py: older Gradio/Serve deployment surface kept for compatibility
Current implementation model:
- histoseg executes the geometry, segmentation, and workflow internals
- spatho wraps and presents the workflow as a product-facing package
Target implementation model:
- histoseg remains the geometry/segmentation engine
- spatho owns workflow UX, organ packs, public docs, reports, and deployment surfaces
- the canonical integrated product implementation continues to move into ASTRO
This repo includes a PyPI publishing workflow based on GitHub Actions Trusted Publishing. See docs/PYPI_RELEASE.md for setup and release steps.
This project is intended for noncommercial research use unless separately licensed. Before public release or commercial use, review the license text and commercial boundary together with the underlying histoseg dependency.