Problem
When a dependency updates (e.g., numpy 1.26 to 1.27), all downstream packages are rebuilt. Most updates don't change the ABI — downstream wheels would be identical. For large dependency trees across multiple variants and architectures, this wastes significant compute.
Proposed Solution
Three-layer system to determine which packages actually need rebuilding:
Layer 1 — ABI Fingerprinting: Fingerprint each wheel's binary interface after build. Store this as a compact JSON digest alongside the wheel. Pure Python wheels marked as is_purelib: true.
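As a rough illustration of the fingerprint idea, the sketch below (hypothetical schema and function names, not the actual implementation) hashes each bundled library's sorted export list into a compact JSON-serializable digest, with the is_purelib short-circuit for pure Python wheels:

```python
import hashlib

def fingerprint_wheel(symbols_by_lib, is_purelib=False):
    """Build a compact ABI digest for one wheel (hypothetical schema).

    symbols_by_lib maps each bundled shared-library name to the
    symbols it exports, e.g. {"libfoo.so.1": ["foo_init", ...]}.
    """
    if is_purelib:
        return {"is_purelib": True}
    digest = {
        # Sort before hashing so symbol order never affects the digest.
        lib: hashlib.sha256("\n".join(sorted(symbols)).encode()).hexdigest()
        for lib, symbols in symbols_by_lib.items()
    }
    return {"is_purelib": False, "abi": digest}

# Two builds exporting the same symbols yield the same fingerprint,
# regardless of the order the symbols were collected in.
a = fingerprint_wheel({"libfoo.so.1": ["foo_init", "foo_run"]})
b = fingerprint_wheel({"libfoo.so.1": ["foo_run", "foo_init"]})
assert a == b
```

The actual symbol extraction would come from whichever ABI tool is chosen (see "ABI tooling" below); only the stable-digest shape matters here.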
Layer 2 — Build Environment Diff: Compare build inputs (version, build tag, build-dep versions, env vars, patches) against previous build. If nothing changed, skip rebuild.
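A minimal sketch of the diff, assuming the build inputs are recorded as a flat mapping (key names here are illustrative, taken from the list above):

```python
def build_inputs_changed(prev, curr):
    """Return True if any recorded build input differs between builds."""
    keys = ("version", "build_tag", "build_deps", "env_vars", "patches")
    return any(prev.get(k) != curr.get(k) for k in keys)

prev = {"version": "1.26", "build_deps": {"setuptools": "69.0"}}
assert not build_inputs_changed(prev, dict(prev))          # identical: skip rebuild
assert build_inputs_changed(prev, {**prev, "version": "1.27"})  # version bump: rebuild
```

A missing previous record makes every key compare unequal, so the comparison falls back to rebuilding, consistent with the conservative defaults below.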
Layer 3 — Graph Pruning: Walk the dependency graph in topological order. For each package whose dependency updated:
- Pure Python downstream — always skip
- Native, no linkage to changed dependency — skip
- Native, links to changed dependency, ABI unchanged — skip
- Native, links to changed dependency, ABI changed — rebuild
Design decisions
- Build-time dependency version changes always trigger rebuild (catches header-level dependencies missed by ELF analysis)
- Conservative defaults: missing fingerprint, tool errors, unknown linkage all trigger rebuild
- Opt-in via --incremental-rebuild, with --dry-run-incremental for validation
- --force overrides everything (existing behavior preserved)
ABI tooling (open question)
elfdeps (already integrated, soname/symbol-version level) vs. libabigail [1] (type-level ABI comparison via abidiff/abidw, new dependency). The architecture is tool-agnostic.
[1] https://sourceware.org/libabigail/
Rollout
- Layer 1 only — fingerprinting, no behavioral change
- Layers 2+3 behind --incremental-rebuild flag
- Production enablement after --dry-run-incremental validation
Future opportunity
PyTorch stable C ABI for extensions would make this even more impactful — torch version bumps could skip rebuilding most downstream extensions.