Performance (4/5): reuse rule-app cost across re-expansion (+ age as a cost term) #3837

Draft
unp1 wants to merge 4 commits into main from pr-cost-age

@unp1 unp1 commented Jun 17, 2026

Member

This PR is 4/5 of a performance series that prepares a clearer match → evaluate → apply pipeline before larger optimisations; it builds on the compiled matcher (1/5, #3831).

📖 Developer docs: Performance Optimizations (3.1)

Intended Change

The strategy re-costs rule-app containers far more often than necessary. On every peek round, TacletAppContainer.createFurtherApps calls the full Strategy.computeCost again for a base app that has not changed — and computeCost (the feature-tree evaluation) is the dominant CPU cost of automode. This PR carries a container's cost forward across re-expansion instead of recomputing it, for the taclets where that is provably sound. It contains two coupled pieces:

1. Age as a first-class container-level cost term. Previously the goal-age contribution was baked into the stored cost (via AgeFeature), which meant the stored cost could not be reused across rounds (age changes every round). Age is lifted out: a container stores its age-free cost, and RuleAppContainer#withAge re-adds the current goal age when the container is (re-)inserted. This is a clean refactor (the AgeFeature leaf is removed) and is the enabler for reuse.

2. Cost reuse across re-expansion. When a taclet's cost is a pure function of the app + its find-subterm (plus the always-refreshed age term and the NonDuplicateApp vetoes), the stored age-free cost is carried forward verbatim instead of recomputed. Eligibility is decided by a sound-by-construction, annotation-driven classification (@CostLocal / @CostNonLocal): a taclet is eligible only if every feature reachable in its cost bindings is explicitly marked local; everything unannotated is treated as non-local (the safe default — forgetting an annotation costs performance, never soundness). The classification is generator-aware: a composite summing over a sequent-scanning generator stays non-local. The NonDuplicateApp-family vetoes that can still fire are re-checked on every reuse, so an app that became a duplicate is still dropped.
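The default-non-local classification described in item 2 can be sketched as follows. This is a minimal illustration, not the PR's code: the annotation names `@CostLocal` / `@CostNonLocal` come from the description, while `Feature`, `FindDepthFeature`, `SequentScanFeature`, `ForgottenFeature`, `isLocal`, and `isEligible` are hypothetical names, and the generator-aware part and the NonDuplicateApp re-check are left out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.List;

final class CostLocalitySketch {
    /** Marks a feature whose cost depends only on the app and its find subterm. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface CostLocal {}

    /** Marks a feature that inspects wider sequent or goal state. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface CostNonLocal {}

    /** Hypothetical stand-in for a strategy feature. */
    interface Feature {}

    @CostLocal
    static final class FindDepthFeature implements Feature {}

    @CostNonLocal
    static final class SequentScanFeature implements Feature {}

    /** Deliberately unannotated: must be treated as non-local. */
    static final class ForgottenFeature implements Feature {}

    /** Local only on explicit opt-in; an explicit non-local mark always wins. */
    static boolean isLocal(Feature f) {
        Class<?> c = f.getClass();
        return c.isAnnotationPresent(CostLocal.class)
                && !c.isAnnotationPresent(CostNonLocal.class);
    }

    /** A taclet is eligible only if every reachable feature is local. */
    static boolean isEligible(List<Feature> reachableFeatures) {
        return reachableFeatures.stream().allMatch(CostLocalitySketch::isLocal);
    }

    public static void main(String[] args) {
        System.out.println(isEligible(List.of(new FindDepthFeature())));
        System.out.println(isEligible(List.of(new FindDepthFeature(), new ForgottenFeature())));
        System.out.println(isEligible(List.of(new FindDepthFeature(), new SequentScanFeature())));
    }
}
```

In this sketch a missing annotation can only turn an eligible taclet into an ineligible one, which is the "costs performance, never soundness" property the description relies on.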

A small related determinism fix is included: introductionTime no longer caches the "not introduced yet" sentinel (-1), which could otherwise freeze a value depending on query order and make term ordering subtly non-deterministic.
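The caching rule behind this fix can be illustrated with a small sketch. It assumes a string-keyed cache and a pluggable lookup function; `IntroductionTimeCacheSketch`, `NOT_INTRODUCED`, and `lookup` are hypothetical names, and the real `introductionTime` works on KeY's own symbol types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

final class IntroductionTimeCacheSketch {
    /** Sentinel meaning "the symbol has not been introduced yet". */
    static final int NOT_INTRODUCED = -1;

    private final Map<String, Integer> cache = new HashMap<>();
    private final ToIntFunction<String> lookup;

    /** The lookup is assumed to return NOT_INTRODUCED for unknown symbols. */
    IntroductionTimeCacheSketch(ToIntFunction<String> lookup) {
        this.lookup = lookup;
    }

    int introductionTime(String symbol) {
        Integer cached = cache.get(symbol);
        if (cached != null) {
            return cached;
        }
        int time = lookup.applyAsInt(symbol);
        if (time != NOT_INTRODUCED) {
            // A real introduction time is stable once found, so it is safe to cache.
            cache.put(symbol, time);
        }
        // The sentinel is returned but never stored, so a later query can
        // still observe the real time after the symbol has been introduced.
        return time;
    }
}
```

With this rule the answer for a symbol no longer depends on whether it was first queried before or after its introduction.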

flowchart LR
  A["re-expand base"] --> B{"cost-local?"}
  B -- no --> C["full computeCost"]
  B -- yes --> D["reuse cost, re-add age"]
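The two branches of the flowchart can be sketched as below. This is an assumption-laden illustration rather than the PR's implementation: costs are modelled as plain `long` values, `Container`, `queueCost`, and `reExpand` are hypothetical names, and the NonDuplicateApp veto re-check that the real fast path performs is omitted.

```java
import java.util.function.LongSupplier;

final class CostReuseSketch {
    /** Hypothetical stand-in for a rule-app container. */
    static final class Container {
        final long ageFreeCost; // stored once, reusable across rounds
        final long queueCost;   // ageFreeCost plus the goal age at insertion time

        private Container(long ageFreeCost, long queueCost) {
            this.ageFreeCost = ageFreeCost;
            this.queueCost = queueCost;
        }

        /** Age is added exactly once, when the container is built. */
        static Container withAge(long ageFreeCost, long goalAge) {
            return new Container(ageFreeCost, ageFreeCost + goalAge);
        }
    }

    /**
     * Re-expands a base container. For a cost-local taclet the stored
     * age-free cost is carried forward; otherwise the full cost function
     * runs again. In both cases the current goal age is re-added.
     */
    static Container reExpand(Container base, boolean costLocal,
            long currentGoalAge, LongSupplier fullComputeCost) {
        long ageFree = costLocal ? base.ageFreeCost : fullComputeCost.getAsLong();
        return Container.withAge(ageFree, currentGoalAge);
    }
}
```

Because the stored value never contains the age, the fast path needs no subtraction of an old age before adding the new one.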

Type of pull request

  • Refactoring (age-as-cost-term) + performance feature (cost reuse)
  • There are changes to the (Java) code

Performance

Measured on the 6-problem perfTest set (median of 3 runs). With cost reuse enabled, proofs are byte-identical to main on the perfTest/perfValidation corpus, so node counts are unchanged. A -Dkey.strategy.costReuse.verify development flag (default off) recomputes the cost and warns on any mismatch.

| Problem | main time (ms) | this PR (ms) | speedup |
|---|---:|---:|---:|
| symmArray | 21396 | 19301 | 1.11× |
| gemplusDecimal/add | 11474 | 11759 | 0.98× |
| ArrayList.remove.1 | 3629 | 3460 | 1.05× |
| SimplifiedLinkedList.remove | 28903 | 26330 | 1.10× |
| Saddleback_search | 23510 | 21964 | 1.07× |
| coincidence_count/project | 4935 | 4678 | 1.05× |
| Total | 93847 | 87492 | 1.07× |

Because the proofs are byte-identical to main, the speedup comes entirely from reduced computeCost time on cost-bound proofs. The single sub-1.0 entry (gemplusDecimal/add, 0.98×) is attributed to run-to-run noise.
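For reference, the speedup column is main time divided by this PR's time, and the Total row is the ratio of the summed times rather than the mean of the per-problem ratios. The following sketch recomputes the Total row from the figures in the table; `SpeedupSketch` and its members are illustrative names only.

```java
import java.util.Arrays;
import java.util.Locale;

final class SpeedupSketch {
    // Times in ms, in table order (symmArray ... coincidence_count/project).
    static final long[] MAIN_MS = {21396, 11474, 3629, 28903, 23510, 4935};
    static final long[] PR_MS = {19301, 11759, 3460, 26330, 21964, 4678};

    static long sum(long[] values) {
        return Arrays.stream(values).sum();
    }

    /** Speedup as reported in the table: baseline time over new time. */
    static double speedup(long mainMs, long prMs) {
        return (double) mainMs / prMs;
    }

    public static void main(String[] args) {
        long mainTotal = sum(MAIN_MS);
        long prTotal = sum(PR_MS);
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        System.out.println(String.format(Locale.ROOT, "%d %d %.2f",
                mainTotal, prTotal, speedup(mainTotal, prTotal)));
        // prints "93847 87492 1.07"
    }
}
```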

Ensuring quality

  • Sound-by-construction (default non-local); costReuse.verify cross-checks against full recompute.
  • Byte-identical on perfTest/perfValidation; full RAP closes the same goals.
  • compile + spotless + nullness clean.
  • Active by default (the previous -Dkey.strategy.costReuse gate is removed).
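The costReuse.verify cross-check mentioned in the list above could look roughly like this. The property name is taken from the PR description; everything else is an assumption: the sketch reads the flag with `Boolean.getBoolean`, which is only true when the property is set to `true` (for example `-Dkey.strategy.costReuse.verify=true`), it logs through `java.util.logging` to stay self-contained, and `CostReuseVerifySketch` and `reuseChecked` are hypothetical names. How KeY actually parses the flag and reports mismatches is not stated here.

```java
import java.util.function.LongSupplier;
import java.util.logging.Logger;

final class CostReuseVerifySketch {
    private static final Logger LOG =
            Logger.getLogger(CostReuseVerifySketch.class.getName());

    /** Development aid, off unless the system property equals "true". */
    static final boolean VERIFY = Boolean.getBoolean("key.strategy.costReuse.verify");

    /**
     * Returns the reused age-free cost. In verify mode the full cost is
     * recomputed as well and any difference is logged as a warning; the
     * reused value is still the one returned, so behaviour is unchanged.
     */
    static long reuseChecked(long reusedAgeFreeCost, boolean verify,
            LongSupplier fullComputeCost) {
        if (verify) {
            long recomputed = fullComputeCost.getAsLong();
            if (recomputed != reusedAgeFreeCost) {
                LOG.warning("cost reuse mismatch: reused=" + reusedAgeFreeCost
                        + ", recomputed=" + recomputed);
            }
        }
        return reusedAgeFreeCost;
    }
}
```

A caller would pass `CostReuseVerifySketch.VERIFY` as the `verify` argument; taking it as a parameter keeps the check easy to exercise in a unit test.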

📖 Conceptual overview in the developer docs: Performance Optimizations (3.1)

Additional information and contact(s)

This PR has been done with AI tooling support.

The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.

unp1 added 4 commits June 17, 2026 00:32
Carry a rule-app container's strategy cost forward across the per-round
re-expansion instead of recomputing it, when the taclet's cost is a pure
function of the app + find subterm (plus the always-refreshed age term and
NonDuplicateApp vetoes). Sound-by-construction, annotation-driven
classification (CostLocal/CostNonLocal, default non-local); generator-aware
so a composite summing over a sequent-scanning generator stays non-local.
A development aid -Dkey.strategy.costReuse.verify recomputes the cost and warns on any
mismatch. Byte-identical on the perfTest/perfValidation
corpus; ~7% automode speedup on cost-bound proofs.

introductionTime cached the not-introduced-yet answer (-1); the symbol may be
introduced by a later rule, after which the real time would be found, so the
frozen -1 made the value depend on whether the symbol was first compared before
or after its introduction -- an access-pattern dependence that makes term
ordering, and hence OneStepSimplifier rewriting, subtly non-deterministic. Only
cache a real introduction time (stable once found).

Replace the per-call new Feature[0] / r.length==0 idiom with a shared
INELIGIBLE constant and identity check. An eligible taclet always carries
at least the top-level NonDuplicateApp veto, so identity is the clearer
'not eligible' test. Pure refactor: byte-identical (symmArray 14601 nodes,
0 verify-mode mismatches).

Age (goal time) was contributed inside each top-level strategy's computeCost
(AgeFeature in ModularJavaDLStrategy's cost/inst sums; getTime() in FIFOStrategy
and SimpleFilteredStrategy). Move it out into a single container-level term,
RuleAppContainer.withAge, added exactly once when a container is built -- so
strategies (and their components) compute only their age-free cost and age is
added once regardless of how strategies are composed. AgeFeature is removed.

This lets cost reuse carry the age-free base forward verbatim: TacletAppContainer
stores the age-free cost and the reuse fast path is just 'base + current age'
with no getTime()-getAge() reconstruction and no age>=0 guard (initial containers
reuse soundly too). As a side effect the container's age field is decoupled from
the cost and is now purely the AssumesInstantiator freshness key.

Behaviour-preserving: byte-identical to the parent on SLL, saddleback, symmArray,
median (verified by A/B against the legacy age-in-features path before it was
removed; isolated timing showed the relocation is performance-neutral, so its
value is code quality plus enabling the simpler, broader cost-reuse path).
@unp1 unp1 changed the title perf: reuse rule-app cost across re-expansion (+ age as first-class cost term) Performance (3/4): reuse rule-app cost across re-expansion (+ age as a cost term) Jun 17, 2026
@unp1 unp1 changed the title Performance (3/4): reuse rule-app cost across re-expansion (+ age as a cost term) Performance (4/5): reuse rule-app cost across re-expansion (+ age as a cost term) Jun 17, 2026
@unp1 unp1 self-assigned this Jun 17, 2026