Performance (4/5): reuse rule-app cost across re-expansion (+ age as a cost term)#3837
Draft
unp1 wants to merge 4 commits into
Draft
Performance (4/5): reuse rule-app cost across re-expansion (+ age as a cost term)#3837unp1 wants to merge 4 commits into
unp1 wants to merge 4 commits into
Conversation
Carry a rule-app container's strategy cost forward across the per-round re-expansion instead of recomputing it, when the taclet's cost is a pure function of the app + find subterm (plus the always-refreshed age term and NonDuplicateApp vetoes). Sound-by-construction, annotation-driven classification (CostLocal/CostNonLocal, default non-local); generator-aware so a composite summing over a sequent-scanning generator stays non-local. A development aid -Dkey.strategy.costReuse.verify recomputes the cost and warns on any mismatch. Byte-identical on the perfTest/perfValidation corpus; ~7% automode speedup on cost-bound proofs.
introductionTime cached the not-introduced-yet answer (-1); the symbol may be introduced by a later rule, after which the real time would be found, so the frozen -1 made the value depend on whether the symbol was first compared before or after its introduction -- an access-pattern dependence that makes term ordering, and hence OneStepSimplifier rewriting, subtly non-deterministic. Only cache a real introduction time (stable once found).
Replace the per-call new Feature[0] / r.length==0 idiom with a shared INELIGIBLE constant and identity check. An eligible taclet always carries at least the top-level NonDuplicateApp veto, so identity is the clearer 'not eligible' test. Pure refactor: byte-identical (symmArray 14601 nodes, 0 verify-mode mismatches).
Age (goal time) was contributed inside each top-level strategy's computeCost (AgeFeature in ModularJavaDLStrategy's cost/inst sums; getTime() in FIFOStrategy and SimpleFilteredStrategy). Move it out into a single container-level term, RuleAppContainer.withAge, added exactly once when a container is built -- so strategies (and their components) compute only their age-free cost and age is added once regardless of how strategies are composed. AgeFeature is removed. This lets cost reuse carry the age-free base forward verbatim: TacletAppContainer stores the age-free cost and the reuse fast path is just 'base + current age' with no getTime()-getAge() reconstruction and no age>=0 guard (initial containers reuse soundly too). As a side effect the container's age field is decoupled from the cost and is now purely the AssumesInstantiator freshness key. Behaviour-preserving: byte-identical to the parent on SLL, saddleback, symmArray, median (verified by A/B against the legacy age-in-features path before it was removed; isolated timing showed the relocation is performance-neutral, so its value is code quality plus enabling the simpler, broader cost-reuse path).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR is 4/5 of a performance series that prepares a clearer match → evaluate → apply pipeline before larger optimisations; it builds on the compiled matcher (1/5, #3831).
📖 Developer docs: Performance Optimizations (3.1)
Intended Change
The strategy re-costs rule-app containers far more often than necessary. On every peek round,
TacletAppContainer.createFurtherAppscalls the fullStrategy.computeCostagain for a base app that has not changed — andcomputeCost(the feature-tree evaluation) is the dominant CPU cost of automode. This PR carries a container's cost forward across re-expansion instead of recomputing it, for the taclets where that is provably sound. It contains two coupled pieces:1. Age as a first-class container-level cost term. Previously the goal-age contribution was baked into the stored cost (via
AgeFeature), which meant the stored cost could not be reused across rounds (age changes every round). Age is lifted out: a container stores its age-free cost, andRuleAppContainer#withAgere-adds the current goal age when the container is (re-)inserted. This is a clean refactor (theAgeFeatureleaf is removed) and is the enabler for reuse.2. Cost reuse across re-expansion. When a taclet's cost is a pure function of the app + its find-subterm (plus the always-refreshed age term and the
NonDuplicateAppvetoes), the stored age-free cost is carried forward verbatim instead of recomputed. Eligibility is decided by a sound-by-construction, annotation-driven classification (@CostLocal/@CostNonLocal): a taclet is eligible only if every feature reachable in its cost bindings is explicitly marked local; everything unannotated is treated as non-local (the safe default — forgetting an annotation costs performance, never soundness). The classification is generator-aware: a composite summing over a sequent-scanning generator stays non-local. TheNonDuplicateApp-family vetoes that can still fire are re-checked on every reuse, so an app that became a duplicate is still dropped.A small related determinism fix is included:
introductionTimeno longer caches the "not introduced yet" sentinel (-1), which could otherwise freeze a value depending on query order and make term ordering subtly non-deterministic.flowchart LR A["re-expand base"] --> B{"cost-local?"} B -- no --> C["full computeCost"] B -- yes --> D["reuse cost, re-add age"]Type of pull request
Performance
Measured on the 6-problem
perfTestset (median of 3 runs). Cost reuse is byte-identical on the perfTest/perfValidation corpus, so node counts are unchanged; the win is reducedcomputeCosttime. A-Dkey.strategy.costReuse.verifydevelopment flag (default off) recomputes the cost and warns on any mismatch.Node counts identical to
main(byte-identical proofs); the single sub-1.0 entry (gemplusDecimal/add, 0.98×) is run-to-run noise. The win is reducedcomputeCosttime on cost-bound proofs.Ensuring quality
costReuse.verifycross-checks against full recompute.-Dkey.strategy.costReusegate is removed).📖 Conceptual overview in the developer docs: Performance Optimizations (3.1)
Additional information and contact(s)
This PR has been done with AI tooling support.
The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.