Performance (4/5): reuse rule-app cost across re-expansion (+ age as a cost term) by unp1 · Pull Request #3837 · KeYProject/key

unp1 · 2026-06-17T04:02:56Z

This PR is 4/5 of a performance series that prepares a clearer match → evaluate → apply pipeline before larger optimisations; it builds on the compiled matcher (1/5, #3831).

📖 Developer docs: Performance Optimizations (3.1)

Intended Change

The strategy re-costs rule-app containers far more often than necessary. On every peek round, TacletAppContainer.createFurtherApps calls the full Strategy.computeCost again for a base app that has not changed — and computeCost (the feature-tree evaluation) is the dominant CPU cost of automode. This PR carries a container's cost forward across re-expansion instead of recomputing it, for the taclets where that is provably sound. It contains two coupled pieces:

1. Age as a first-class container-level cost term. Previously the goal-age contribution was baked into the stored cost (via AgeFeature), which meant the stored cost could not be reused across rounds (age changes every round). Age is lifted out: a container stores its age-free cost, and RuleAppContainer#withAge re-adds the current goal age when the container is (re-)inserted. This is a clean refactor (the AgeFeature leaf is removed) and is the enabler for reuse.

2. Cost reuse across re-expansion. When a taclet's cost is a pure function of the app + its find-subterm (plus the always-refreshed age term and the NonDuplicateApp vetoes), the stored age-free cost is carried forward verbatim instead of recomputed. Eligibility is decided by a sound-by-construction, annotation-driven classification (@CostLocal / @CostNonLocal): a taclet is eligible only if every feature reachable in its cost bindings is explicitly marked local; everything unannotated is treated as non-local (the safe default — forgetting an annotation costs performance, never soundness). The classification is generator-aware: a composite summing over a sequent-scanning generator stays non-local. The NonDuplicateApp-family vetoes that can still fire are re-checked on every reuse, so an app that became a duplicate is still dropped.
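The two pieces above can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the KeY implementation: `ReuseContainer`, `ReExpansion`, the string-typed "app", the `alreadyApplied` set standing in for the NonDuplicateApp vetoes, and the toy `computeCost` are all hypothetical stand-ins. Only the idea (store the age-free cost, re-add the current age, re-check the vetoes on every reuse) comes from the description.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for a rule-app container; not the KeY class. */
final class ReuseContainer {
    final String app;        // stand-in for a rule application
    final long ageFreeCost;  // strategy cost without the age term
    final long age;          // goal age at (re-)insertion

    ReuseContainer(String app, long ageFreeCost, long age) {
        this.app = app;
        this.ageFreeCost = ageFreeCost;
        this.age = age;
    }

    /** Cost used for ordering: age-free base plus the age term. */
    long totalCost() {
        return ageFreeCost + age;
    }

    /** Keeps the age-free base verbatim and refreshes only the age. */
    ReuseContainer withAge(long currentAge) {
        return new ReuseContainer(app, ageFreeCost, currentAge);
    }
}

final class ReExpansion {
    /** Counts full evaluations so the saving is observable. */
    static int fullEvaluations = 0;

    /** Toy stand-in for the expensive feature-tree evaluation. */
    static long computeCost(String app) {
        fullEvaluations++;
        return 10L * app.length();
    }

    /** Re-expands a container; empty means a duplicate veto dropped the app. */
    static Optional<ReuseContainer> reExpand(ReuseContainer old, boolean costLocal,
            Set<String> alreadyApplied, long currentAge) {
        if (alreadyApplied.contains(old.app)) {
            return Optional.empty(); // veto is checked on the reuse path too
        }
        if (costLocal) {
            return Optional.of(old.withAge(currentAge)); // reuse: no recompute
        }
        return Optional.of(
            new ReuseContainer(old.app, computeCost(old.app), currentAge));
    }
}
```

In this sketch the reuse path never touches `computeCost`, which is where the saving would come from; the non-local path behaves as before.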

A small related determinism fix is included: introductionTime no longer caches the "not introduced yet" sentinel (-1), which could otherwise freeze a value depending on query order and make term ordering subtly non-deterministic.
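The caching rule behind this fix can be illustrated with a small sketch. The class `IntroductionTimeCache`, its string-keyed maps, and the `introduce` helper are hypothetical; only the rule itself (never cache the -1 sentinel, cache a real time once found) is taken from the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the KeY implementation of introductionTime. */
final class IntroductionTimeCache {
    static final int NOT_INTRODUCED = -1;

    private final Map<String, Integer> cache = new HashMap<>();
    // Stand-in for the (slow) lookup in the proof history.
    private final Map<String, Integer> history = new HashMap<>();

    /** Records that a rule introduced the symbol at the given time. */
    void introduce(String symbol, int time) {
        history.putIfAbsent(symbol, time);
    }

    int introductionTime(String symbol) {
        Integer cached = cache.get(symbol);
        if (cached != null) {
            return cached;
        }
        int time = history.getOrDefault(symbol, NOT_INTRODUCED);
        if (time != NOT_INTRODUCED) {
            cache.put(symbol, time); // a real time is stable, so caching is safe
        }
        return time; // the sentinel is returned but never stored
    }
}
```

With this rule, a query made before the symbol is introduced returns -1 without freezing it, so a later query sees the real time regardless of query order.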

flowchart LR
  A["re-expand base"] --> B{"cost-local?"}
  B -- no --> C["full computeCost"]
  B -- yes --> D["reuse cost, re-add age"]

Type of pull request

Refactoring (age-as-cost-term) + performance feature (cost reuse)
There are changes to the (Java) code

Performance

Measured on the 6-problem perfTest set (median of 3 runs). With cost reuse enabled, proofs are byte-identical to main on the perfTest/perfValidation corpus, so node counts are unchanged; the win is reduced computeCost time. A development flag, -Dkey.strategy.costReuse.verify (default off), recomputes the cost and warns on any mismatch.

Problem	main time (ms)	this PR (ms)	speedup
symmArray	21396	19301	1.11×
gemplusDecimal/add	11474	11759	0.98×
ArrayList.remove.1	3629	3460	1.05×
SimplifiedLinkedList.remove	28903	26330	1.10×
Saddleback_search	23510	21964	1.07×
coincidence_count/project	4935	4678	1.05×
Total	93847	87492	1.07×

Node counts are identical to main (byte-identical proofs). The single sub-1.0 entry (gemplusDecimal/add, 0.98×) is attributed to run-to-run noise.
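A verify mode of the kind described for -Dkey.strategy.costReuse.verify might be structured as follows. The property name is taken from this PR; the class `CostReuseCheck`, the mismatch counter, reading the property on every call, and the warning text are assumptions made for the sketch.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of a verify-on-reuse check; not the KeY code. */
final class CostReuseCheck {
    static final String VERIFY_PROPERTY = "key.strategy.costReuse.verify";

    /** Number of mismatches seen so far (only counted in verify mode). */
    static int mismatches = 0;

    /**
     * Returns the stored cost. In verify mode the full recompute is also run
     * and any difference is reported; the stored cost is still what is used.
     */
    static long reusedCost(long storedAgeFreeCost, LongSupplier fullRecompute) {
        if (Boolean.getBoolean(VERIFY_PROPERTY)) {
            long fresh = fullRecompute.getAsLong();
            if (fresh != storedAgeFreeCost) {
                mismatches++;
                System.err.println("cost reuse mismatch: stored="
                    + storedAgeFreeCost + " recomputed=" + fresh);
            }
        }
        return storedAgeFreeCost;
    }
}
```

With the property unset, the supplier is never invoked, so the check adds no recomputation cost in normal runs.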

Ensuring quality

Sound-by-construction (default non-local); costReuse.verify cross-checks against full recompute.
Byte-identical on perfTest/perfValidation; full RAP closes the same goals.
compile + spotless + nullness clean.
Active by default (the previous -Dkey.strategy.costReuse gate is removed).

📖 Conceptual overview in the developer docs: Performance Optimizations (3.1)

Additional information and contact(s)

This PR has been done with AI tooling support.

The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.

Carry a rule-app container's strategy cost forward across the per-round re-expansion instead of recomputing it, when the taclet's cost is a pure function of the app + find subterm (plus the always-refreshed age term and NonDuplicateApp vetoes). Sound-by-construction, annotation-driven classification (CostLocal/CostNonLocal, default non-local); generator-aware so a composite summing over a sequent-scanning generator stays non-local. A development aid -Dkey.strategy.costReuse.verify recomputes the cost and warns on any mismatch. Byte-identical on the perfTest/perfValidation corpus; ~7% automode speedup on cost-bound proofs.

introductionTime cached the not-introduced-yet answer (-1). The symbol may be introduced by a later rule, after which the real time would be found, so the frozen -1 made the value depend on whether the symbol was first compared before or after its introduction. That access-pattern dependence makes term ordering, and hence OneStepSimplifier rewriting, subtly non-deterministic. Now only a real introduction time (stable once found) is cached.

Replace the per-call new Feature[0] / r.length==0 idiom with a shared INELIGIBLE constant and identity check. An eligible taclet always carries at least the top-level NonDuplicateApp veto, so identity is the clearer 'not eligible' test. Pure refactor: byte-identical (symmArray 14601 nodes, 0 verify-mode mismatches).
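The shared-constant idiom from this commit can be sketched like this. `INELIGIBLE` is the name used in the commit message; the `Veto` interface, `ReuseVetoes`, and `vetoesFor` are hypothetical names chosen for the example.

```java
/** Hypothetical sketch of the sentinel-plus-identity idiom; not the KeY code. */
final class ReuseVetoes {
    /** Stand-in for a NonDuplicateApp-style veto feature. */
    interface Veto {
        boolean rejects(String app);
    }

    /** One shared marker instead of a fresh empty array per call. */
    static final Veto[] INELIGIBLE = new Veto[0];

    /** Eligible taclets always carry at least the top-level veto. */
    static Veto[] vetoesFor(boolean eligible, Veto topLevelNonDuplicate) {
        return eligible ? new Veto[] { topLevelNonDuplicate } : INELIGIBLE;
    }

    /** Identity check: clearer than testing for length == 0. */
    static boolean isEligible(Veto[] vetoes) {
        return vetoes != INELIGIBLE;
    }
}
```

The identity test is safe here only because an eligible result is never empty, which is the invariant the commit message states.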

Age (goal time) was contributed inside each top-level strategy's computeCost (AgeFeature in ModularJavaDLStrategy's cost/inst sums; getTime() in FIFOStrategy and SimpleFilteredStrategy). Move it out into a single container-level term, RuleAppContainer.withAge, added exactly once when a container is built -- so strategies (and their components) compute only their age-free cost and age is added once regardless of how strategies are composed. AgeFeature is removed. This lets cost reuse carry the age-free base forward verbatim: TacletAppContainer stores the age-free cost and the reuse fast path is just 'base + current age' with no getTime()-getAge() reconstruction and no age>=0 guard (initial containers reuse soundly too). As a side effect the container's age field is decoupled from the cost and is now purely the AssumesInstantiator freshness key. Behaviour-preserving: byte-identical to the parent on SLL, saddleback, symmArray, median (verified by A/B against the legacy age-in-features path before it was removed; isolated timing showed the relocation is performance-neutral, so its value is code quality plus enabling the simpler, broader cost-reuse path).
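The "added exactly once" property can be shown with a toy composition. This is a sketch under assumptions: `AgeOnce`, the use of `ToLongFunction<String>` as a strategy, and the summing composition are hypothetical and much simpler than KeY's strategy classes.

```java
import java.util.List;
import java.util.function.ToLongFunction;

/** Hypothetical sketch; strategies here are plain age-free cost functions. */
final class AgeOnce {
    /** Sums age-free strategy costs; no strategy looks at the goal time. */
    static long ageFreeCost(String app, List<ToLongFunction<String>> strategies) {
        long sum = 0;
        for (ToLongFunction<String> strategy : strategies) {
            sum += strategy.applyAsLong(app);
        }
        return sum;
    }

    /** Container-level step: the age term is added once, after composition. */
    static long withAge(long ageFreeCost, long goalAge) {
        return ageFreeCost + goalAge;
    }
}
```

However many strategies are combined, the age term enters the total a single time, and the age-free sum is the value a reuse path can carry forward.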

unp1 added 4 commits June 17, 2026 00:32

unp1 mentioned this pull request Jun 17, 2026

Performance (overview): round-3 combined — 1.82× faster automode (all five PRs) #3839

Draft

unp1 changed the title ~~perf: reuse rule-app cost across re-expansion (+ age as first-class cost term)~~ Performance (3/4): reuse rule-app cost across re-expansion (+ age as a cost term) Jun 17, 2026

unp1 changed the title ~~Performance (3/4): reuse rule-app cost across re-expansion (+ age as a cost term)~~ Performance (4/5): reuse rule-app cost across re-expansion (+ age as a cost term) Jun 17, 2026

unp1 self-assigned this Jun 17, 2026

unp1 added the 🚀 Performance label Jun 17, 2026

unp1 wants to merge 4 commits into main from pr-cost-age
