Conversation
…Labels removeIrrelevantLabels rebuilt the whole term tree on every call (stream().map()/filter() .collect() per node), the single biggest allocator during proof search (~20%), even though most subterms have no irrelevant label. Replace with an identity-preserving rebuild (plain loops, lazy sub-array, return the original term when its subtree has no irrelevant label). Behaviour-preserving (terms are immutable; result is structurally identical).
Objects.hash(first, second) allocates an Object[] on every call; Pair is heavily used as a hash-map key during proof search. Inline the same hash value without the array.
applyReplacewithHelper allocated a PiTIterator (posInTerm().iterator()) per rewrite-taclet application and consumed it in the recursive replace(). Thread the PosInTerm + a depth index instead (same indices/order), avoiding the per-application iterator object.
The KeY and JML ANTLR parsers build a prediction (DFA) cache lazily while parsing, held on the generated parsers' static fields, so it stays resident for the whole JVM -- including the (long) proof search, where it is unused (~17 MB retained on a large proof). It is a pure cache that ANTLR rebuilds transparently on the next parse, so dropping it after a problem/proof has finished loading is correctness-safe. Add ParsingFacade.clearParserCaches() (KeY/JavaDL) and JmlFacade.clearCaches() (JML) and call them from AbstractProblemLoader.load().
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR is 2/5 of a performance series that prepares a clearer match → evaluate → apply pipeline before larger optimisations; it builds on the compiled matcher (1/5, #3831).
📖 Developer docs: Performance Optimizations (3.1)
Intended Change
Proof search allocates a large volume of short-lived objects on its hottest paths; this churn drives GC and bounds throughput on long proofs. This PR removes four such allocation hot-spots without changing any proof. Every change is behaviour-preserving (byte-identical proofs); the win is purely fewer allocations / less GC.
The four independent changes:
removeIrrelevantLabels— skip rebuilding unchanged term trees. This was the single biggest allocator during proof search (~20% of allocations in a JFR profile): it rebuilt the whole term tree on every call viastream().map()/filter().collect()per node, even when no subterm carried an irrelevant label. Replaced with an identity-preserving rebuild — plain loops, a lazily-allocated sub-array, and returning the original term whenever its subtree has no irrelevant label. Terms are immutable, so the result is structurally identical.Pair.hashCode— no varargs array.Objects.hash(first, second)allocates anObject[]on every call, andPairis heavily used as a hash-map key during proof search. Inlined the identical hash value without the array.RewriteTacletExecutor— walk the find-position by index.applyReplacewithallocated aPiTIterator(posInTerm().iterator()) per rewrite-taclet application. Replaced with aPosInTerm+ depth-index walked recursively — same indices/order, no per-application iterator object.Release the ANTLR parser DFA cache after loading. The KeY/JML ANTLR parsers build a prediction (DFA) cache lazily while parsing, held on the generated parsers'
staticfields, so it stays resident for the whole JVM — including the long proof search, where it is unused (~17 MB retained on a large proof). It is a pure cache that ANTLR rebuilds transparently on the next parse, so it is dropped once a problem/proof has finished loading. (This one reduces retained footprint rather than allocation rate.)Type of pull request
Performance
Measured on the 6-problem
perfTestreal-world set (median of 3 runs; proofs byte-identical, so node counts are unchanged). The table reports automode time and committed runtime heap.Node counts are identical to
mainfor all six (proofs unchanged). The heap columns are thecommitted runtime heap (
Runtime.totalMemory) sampled after each problem of the shared serial run,so they are cumulative; the consistent ~14–20 MB main→PR reduction at every checkpoint reflects the
released ANTLR parser DFA cache (~17 MB) plus reduced allocation pressure. A JFR allocation profile
additionally attributes ~20% of proof-search allocations to the old
removeIrrelevantLabelsalone.Ensuring quality
📖 Conceptual overview in the developer docs: Performance Optimizations (3.1)
Additional information and contact(s)
This PR has been done with AI tooling support.
The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.