
Performance (2/5): cut proof-search allocations + release the parser DFA cache#3835

Draft
unp1 wants to merge 4 commits into main from pr-memory

unp1 (Member) commented Jun 17, 2026

This PR is 2/5 of a performance series that prepares a clearer match → evaluate → apply pipeline before larger optimisations; it builds on the compiled matcher (1/5, #3831).

📖 Developer docs: Performance Optimizations (3.1)

Intended Change

Proof search allocates a large volume of short-lived objects on its hottest paths; this churn drives GC and bounds throughput on long proofs. This PR removes three such allocation hot-spots and additionally releases one retained parser cache, without changing any proof. Every change is behaviour-preserving (byte-identical proofs); the win is purely fewer allocations, less GC, and a smaller retained heap.

The four independent changes:

  1. removeIrrelevantLabels — skip rebuilding unchanged term trees. This was the single biggest allocator during proof search (~20% of allocations in a JFR profile): it rebuilt the whole term tree on every call via stream().map()/filter().collect() per node, even when no subterm carried an irrelevant label. Replaced with an identity-preserving rebuild — plain loops, a lazily-allocated sub-array, and returning the original term whenever its subtree has no irrelevant label. Terms are immutable, so the result is structurally identical.

  2. Pair.hashCode — no varargs array. Objects.hash(first, second) allocates an Object[] on every call, and Pair is heavily used as a hash-map key during proof search. Inlined the identical hash value without the array.

  3. RewriteTacletExecutor — walk the find-position by index. applyReplacewith allocated a PiTIterator (posInTerm().iterator()) per rewrite-taclet application. Replaced with a PosInTerm + depth-index walked recursively — same indices/order, no per-application iterator object.

  4. Release the ANTLR parser DFA cache after loading. The KeY/JML ANTLR parsers build a prediction (DFA) cache lazily while parsing, held on the generated parsers' static fields, so it stays resident for the whole JVM — including the long proof search, where it is unused (~17 MB retained on a large proof). It is a pure cache that ANTLR rebuilds transparently on the next parse, so it is dropped once a problem/proof has finished loading. (This one reduces retained footprint rather than allocation rate.)
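
The identity-preserving rebuild from item 1 can be sketched as follows. This is a minimal illustration, not the code in this PR: `SketchTerm`, `LabelStripper`, and the rule that labels starting with `#` count as irrelevant are all hypothetical stand-ins for KeY's term and label classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical immutable term: operator name, sub-terms, labels. */
final class SketchTerm {
    final String op;
    final SketchTerm[] subs;
    final String[] labels;

    SketchTerm(String op, SketchTerm[] subs, String[] labels) {
        this.op = op;
        this.subs = subs;
        this.labels = labels;
    }
}

final class LabelStripper {
    /** Assumption for this sketch only: a label starting with '#' is irrelevant. */
    static boolean isIrrelevant(String label) {
        return label.startsWith("#");
    }

    /** Returns t itself when nothing in its subtree changes, else a minimal copy. */
    static SketchTerm strip(SketchTerm t) {
        SketchTerm[] newSubs = null; // allocated lazily, on the first changed child
        for (int i = 0; i < t.subs.length; i++) {
            SketchTerm stripped = strip(t.subs[i]);
            if (stripped != t.subs[i]) {
                if (newSubs == null) {
                    newSubs = t.subs.clone();
                }
                newSubs[i] = stripped;
            }
        }
        boolean dropsLabel = false;
        for (int i = 0; i < t.labels.length; i++) {
            if (isIrrelevant(t.labels[i])) {
                dropsLabel = true;
                break;
            }
        }
        if (newSubs == null && !dropsLabel) {
            return t; // common case: no allocation at all
        }
        String[] kept = t.labels;
        if (dropsLabel) {
            List<String> tmp = new ArrayList<>();
            for (String l : t.labels) {
                if (!isIrrelevant(l)) {
                    tmp.add(l);
                }
            }
            kept = tmp.toArray(new String[0]);
        }
        return new SketchTerm(t.op, newSubs != null ? newSubs : t.subs, kept);
    }
}

final class LabelStripperDemo {
    public static void main(String[] args) {
        SketchTerm a = new SketchTerm("a", new SketchTerm[0], new String[0]);
        SketchTerm b = new SketchTerm("b", new SketchTerm[0], new String[] {"#origin"});
        SketchTerm clean = new SketchTerm("f", new SketchTerm[] {a, a}, new String[] {"keep"});
        SketchTerm dirty = new SketchTerm("g", new SketchTerm[] {clean, b}, new String[0]);

        System.out.println(LabelStripper.strip(clean) == clean); // true: same object returned
        SketchTerm r = LabelStripper.strip(dirty);
        System.out.println(r != dirty && r.subs[0] == clean);    // true: untouched child is shared
    }
}
```

Because the terms are immutable, sharing the unchanged subtrees is safe, and callers that compare by structure see the same result as a full rebuild would give.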

Type of pull request

  • Refactoring (behaviour should not change — proofs are byte-identical)
  • There are changes to the (Java) code

Performance

Measured on the 6-problem perfTest real-world set (median of 3 runs; proofs byte-identical, so node counts are unchanged). The table reports automode time and committed runtime heap.

| Problem | main time (ms) | this PR time (ms) | speedup | main heap (MB) | this PR heap (MB) |
|---|---|---|---|---|---|
| symmArray | 21396 | 17552 | 1.22× | 130 | 116 |
| gemplusDecimal/add | 11474 | 9490 | 1.21× | 205 | 190 |
| ArrayList.remove.1 | 3629 | 3425 | 1.06× | 241 | 223 |
| SimplifiedLinkedList.remove | 28903 | 28243 | 1.02× | 346 | 327 |
| Saddleback_search | 23510 | 21449 | 1.10× | 436 | 417 |
| coincidence_count/project | 4935 | 3842 | 1.28× | 480 | 460 |
| Total / peak | 93847 | 84001 | 1.12× | 480 | 460 |

Node counts are identical to main for all six (proofs unchanged). The heap columns are the
committed runtime heap (Runtime.totalMemory) sampled after each problem of the shared serial run,
so they are cumulative; the consistent ~14–20 MB main→PR reduction at every checkpoint reflects the
released ANTLR parser DFA cache (~17 MB) plus reduced allocation pressure. A JFR allocation profile
additionally attributes ~20% of proof-search allocations to the old removeIrrelevantLabels alone.
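
For readers who want to reproduce the heap columns, a minimal sketch of sampling the committed heap after each problem of a serial run is shown below. `HeapSampler` and the placeholder workloads are hypothetical and are not the perfTest harness; only `Runtime.totalMemory()` and `Runtime.freeMemory()` are real JDK calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeapSampler {
    /** Committed heap: what the JVM has currently reserved from the OS. */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Used heap: committed minus the part that is currently free. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }
}

final class HeapSamplerDemo {
    public static void main(String[] args) {
        // Placeholder workloads standing in for the six perfTest problems.
        Map<String, Runnable> problems = new LinkedHashMap<>();
        problems.put("small", () -> allocate(1_000));
        problems.put("large", () -> allocate(100_000));

        for (Map.Entry<String, Runnable> e : problems.entrySet()) {
            e.getValue().run();
            // Sampled after each problem of one shared run, so the values accumulate.
            System.out.println(e.getKey() + ": committed "
                    + HeapSampler.toMb(HeapSampler.committedHeapBytes()) + " MB, used "
                    + HeapSampler.toMb(HeapSampler.usedHeapBytes()) + " MB");
        }
    }

    private static void allocate(int n) {
        StringBuilder[] junk = new StringBuilder[n];
        for (int i = 0; i < n; i++) {
            junk[i] = new StringBuilder("x" + i);
        }
    }
}
```

The committed heap typically grows over a run and is given back to the OS only reluctantly, which is why the per-problem values read as a running high-water mark rather than per-problem usage.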

Ensuring quality

  • Behaviour-preserving; full RAP closes the same goals with identical proofs.
  • compile + spotless + nullness clean.
  • Active by default (no configuration needed).

📖 Conceptual overview in the developer docs: Performance Optimizations (3.1)

Additional information and contact(s)

This PR has been done with AI tooling support.

The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.

unp1 added 4 commits June 17, 2026 00:18
…Labels

removeIrrelevantLabels rebuilt the whole term tree on every call (stream().map()/filter()
.collect() per node), the single biggest allocator during proof search (~20%), even though
most subterms have no irrelevant label. Replace with an identity-preserving rebuild (plain
loops, lazy sub-array, return the original term when its subtree has no irrelevant label).
Behaviour-preserving (terms are immutable; result is structurally identical).

Objects.hash(first, second) allocates an Object[] on every call; Pair is heavily used as a
hash-map key during proof search. Inline the same hash value without the array.
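
A sketch of the array-free hash, assuming a simple two-field pair (`SketchPair` is a stand-in, not KeY's `Pair`). `Objects.hash(a, b)` is specified as `Arrays.hashCode(new Object[] {a, b})`, which starts at 1 and folds each element in with a factor of 31, so the inlined form below yields the same value.

```java
import java.util.Objects;

/** Stand-in for a two-element pair used as a hash-map key. */
final class SketchPair<A, B> {
    final A first;
    final B second;

    SketchPair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SketchPair)) {
            return false;
        }
        SketchPair<?, ?> p = (SketchPair<?, ?>) o;
        return Objects.equals(first, p.first) && Objects.equals(second, p.second);
    }

    @Override
    public int hashCode() {
        // Same value as Objects.hash(first, second), but no varargs Object[]:
        // result = 31 * (31 * 1 + h(first)) + h(second), with h(null) == 0.
        int result = 31 + Objects.hashCode(first);
        return 31 * result + Objects.hashCode(second);
    }
}

final class SketchPairDemo {
    public static void main(String[] args) {
        SketchPair<String, Integer> p = new SketchPair<>("key", 42);
        System.out.println(p.hashCode() == Objects.hash("key", 42)); // true
    }
}
```

Keeping the value identical matters here: hash-map iteration order can depend on hash codes, so a different hash function could change proof-search order even if it were a perfectly good hash.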
applyReplacewithHelper allocated a PiTIterator (posInTerm().iterator()) per rewrite-taclet
application and consumed it in the recursive replace(). Thread the PosInTerm + a depth index
instead (same indices/order), avoiding the per-application iterator object.
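
The idea of threading a position plus a depth index through the recursion, instead of handing down an iterator object, can be sketched like this. The `int[] path` stands in for the find-position and `SketchNode` for a term; both are hypothetical simplifications, not the `RewriteTacletExecutor` code.

```java
/** Hypothetical immutable tree node standing in for a term. */
final class SketchNode {
    final String op;
    final SketchNode[] subs;

    SketchNode(String op, SketchNode... subs) {
        this.op = op;
        this.subs = subs;
    }

    @Override
    public String toString() {
        if (subs.length == 0) {
            return op;
        }
        StringBuilder sb = new StringBuilder(op).append('(');
        for (int i = 0; i < subs.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(subs[i]);
        }
        return sb.append(')').toString();
    }
}

final class IndexWalk {
    /**
     * Replaces the sub-term reached by following path[depth], path[depth + 1], ...
     * below t. The (path, depth) pair plays the role an iterator would otherwise
     * play, so no iterator object is created per call.
     */
    static SketchNode replaceAt(SketchNode t, int[] path, int depth, SketchNode replacement) {
        if (depth == path.length) {
            return replacement; // reached the find-position
        }
        int idx = path[depth];
        SketchNode[] newSubs = t.subs.clone();
        newSubs[idx] = replaceAt(t.subs[idx], path, depth + 1, replacement);
        return new SketchNode(t.op, newSubs);
    }
}

final class IndexWalkDemo {
    public static void main(String[] args) {
        SketchNode t = new SketchNode("f",
                new SketchNode("a"),
                new SketchNode("g", new SketchNode("b"), new SketchNode("c")));
        SketchNode r = IndexWalk.replaceAt(t, new int[] {1, 0}, 0, new SketchNode("x"));
        System.out.println(r); // f(a,g(x,c))
    }
}
```

The indices are visited in the same order an iterator over the position would produce them, which is what keeps the change behaviour-preserving.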
The KeY and JML ANTLR parsers build a prediction (DFA) cache lazily while parsing, held on the generated parsers' static fields, so it stays resident for the whole JVM -- including the (long) proof search, where it is unused (~17 MB retained on a large proof). It is a pure cache that ANTLR rebuilds transparently on the next parse, so dropping it after a problem/proof has finished loading is correctness-safe. Add ParsingFacade.clearParserCaches() (KeY/JavaDL) and JmlFacade.clearCaches() (JML) and call them from AbstractProblemLoader.load().
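
The lifecycle described in this commit (a static cache filled lazily while parsing, dropped once loading is done, rebuilt transparently on the next parse) can be modelled with the JDK alone. The sketch below is only a model: `CachingParser`, `SketchLoader`, and the string-length "analysis" are hypothetical, and the map is a stand-in for ANTLR's prediction DFA. In the ANTLR 4 runtime the corresponding call is, as far as the public API goes, `clearDFA()` on a recognizer's interpreter; how the facades added here invoke it is not shown on this page.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CachingParser {
    // Static, so it is shared by all parser instances and would otherwise stay
    // reachable for the whole JVM lifetime.
    private static final Map<String, Integer> PREDICTION_CACHE = new ConcurrentHashMap<>();

    /** Parses the input, filling the cache lazily on a miss. */
    static int parse(String input) {
        return PREDICTION_CACHE.computeIfAbsent(input, CachingParser::expensiveAnalysis);
    }

    /** Placeholder for the costly prediction work the cache avoids repeating. */
    private static int expensiveAnalysis(String input) {
        return input.length();
    }

    /** Drops the cache; the next parse simply rebuilds what it needs. */
    static void clearCaches() {
        PREDICTION_CACHE.clear();
    }

    static int cacheSize() {
        return PREDICTION_CACHE.size();
    }
}

final class SketchLoader {
    /** Hypothetical load step: parse, then release the cache before proof search. */
    static int load(String problem) {
        try {
            return CachingParser.parse(problem);
        } finally {
            // Choice made for this sketch: clear even if parsing throws.
            CachingParser.clearCaches();
        }
    }
}

final class SketchLoaderDemo {
    public static void main(String[] args) {
        int first = SketchLoader.load("problem.key");
        System.out.println(CachingParser.cacheSize());   // 0: nothing retained after loading
        int second = CachingParser.parse("problem.key"); // cache is rebuilt on demand
        System.out.println(first == second);             // true: results do not depend on the cache
    }
}
```

Since the cache only memoises work that can be recomputed, clearing it trades a slower next parse for a smaller resident heap during proof search.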
unp1 changed the title from "perf: cut proof-search allocations (labels, Pair, find-walk) + release parser DFA cache" to "Performance (1/4): cut proof-search allocations + release the parser DFA cache" Jun 17, 2026
unp1 changed the title from "Performance (1/4): cut proof-search allocations + release the parser DFA cache" to "Performance (2/5): cut proof-search allocations + release the parser DFA cache" Jun 17, 2026
unp1 self-assigned this Jun 17, 2026