From 24f6f37afb12389a321b2e6bc3ae750f73b11305 Mon Sep 17 00:00:00 2001 From: Stevengre Date: Wed, 8 Apr 2026 09:21:56 +0000 Subject: [PATCH 1/2] Update KMIR tool application: address reviewer feedback, update to v0.4.202 - Update Docker image from 0.3.152 to 0.4.202 - Add Known Limitations section addressing soundness questions - Add Artifacts & Audit Mechanisms, Versioning, CI Integration sections - Update prove-rs to prove (canonical command, prove-rs kept as alias) - Add --terminate-on-thunk flag for proof soundness - Fix --no-full-printer (removed in 0.4.x) to --statistics --leaves - Fix typos (Solitidy -> Solidity, absense -> absence, extra parenthesis) Co-Authored-By: Claude Opus 4.6 (1M context) --- .github/workflows/kmir.yml | 2 +- doc/src/tools/kmir.md | 125 +++++++++++++++--- kmir-proofs/README.md | 26 ++-- .../unchecked_arithmetic/run-proofs.sh | 6 +- .../unchecked_arithmetic.rs | 2 +- 5 files changed, 131 insertions(+), 30 deletions(-) diff --git a/.github/workflows/kmir.yml b/.github/workflows/kmir.yml index ab70092167c78..d6d2838906a13 100644 --- a/.github/workflows/kmir.yml +++ b/.github/workflows/kmir.yml @@ -29,5 +29,5 @@ jobs: -w /home/kmir/workspace \ -u $(id -u):$(id -g) \ -v $PWD:/home/kmir/workspace \ - runtimeverificationinc/kmir:ubuntu-jammy-0.3.152 \ + runtimeverificationinc/kmir:ubuntu-jammy-0.4.202 \ kmir-proofs/unchecked_arithmetic/run-proofs.sh diff --git a/doc/src/tools/kmir.md b/doc/src/tools/kmir.md index 9840b651fb4b2..9a9229e130d9e 100644 --- a/doc/src/tools/kmir.md +++ b/doc/src/tools/kmir.md @@ -21,7 +21,7 @@ This diagram describes the extraction and verification workflow for KMIR: ![kmir_env_diagram_march_2025](https://github.com/user-attachments/assets/bf426c8d-f241-4ad6-8cb2-86ca06d8d15b) -The K Framework ([kframework.org](https://kframework.org/) is the basis of how +The [K Framework](https://kframework.org/) is the basis of how KMIR operates to guarantee properties of Rust programs. K is a rewrite-based semantic framework based on [matching logic](http://www.matching-logic.org/) in which programming languages, their operational semantics and type systems, and @@ -55,8 +55,12 @@ This state cannot be unified with a valid target state in the proof, so the proo fails. KMIR systematically explores all feasible paths under the user-supplied preconditions. Only when **every** path terminates without hitting UB *and* satisfies the target property does KMIR declare the program UB-free (and correct -for the property). This ensures that any “no UB” claim holds under the sole assumption -that KMIR’s implementation is correct. +for the property). Additionally, the `--terminate-on-thunk` flag (recommended for all proofs) +ensures that if the symbolic execution reaches a MIR construct whose semantics +are not yet implemented, the proof halts cleanly rather than silently passing +through unsupported operations. This ensures that any “no UB” claim holds under +the sole assumption that KMIR’s implementation of the supported MIR fragment is +correct. Programs modelled in K Framework can be executed _symbolically_, i.e., operating on abstract input which is not fully specified but characterized by _path conditions_ @@ -83,7 +87,7 @@ one of several possible techniques can be used: need to be written in K's native language. 2) As potential future work, it would be possible to implement an annotation language to provide the required context for loop invariant directly in - source code (as done in the past using natspec for Solitidy code). + source code (as done in the past using natspec for Solidity code). 3) In general, K also supports bounded loop unrolling, by way of identifying loop heads and counting how many times the same loop head has been observed. This technique is managed by the all-path reachability prover library for @@ -91,9 +95,52 @@ one of several possible techniques can be used: special support from the back-end. Our front-end, at the time of writing, does not have either of these -options turned on. As soon as larger programs will require it, we will decide -whether the preference is to implement support for user-supplied K-level -loop invariants, or bounded loop unrolling support, or (probably) offer both. +options turned on. When a proof involves an unbounded loop or recursion +without an invariant, the symbolic execution continues exploring paths +until a configurable bound is reached (via `--max-depth` or +`--max-iterations`). If the bound is exceeded before all paths are +fully explored, the proof does **not** pass — it results in stuck or +pending nodes in the control flow graph, which KMIR reports as a +non-passing proof. This means KMIR will never silently claim +correctness for programs with unbounded behavior that it cannot fully +explore. As soon as larger programs will require it, we will offer +user-supplied K-level loop invariants and/or bounded loop unrolling +support. + +### Known Limitations + +KMIR is under active development. The following limitations apply to +the current version: + +- **MIR construct coverage:** KMIR supports a substantial subset of + Stable MIR including integer and float arithmetic, structs, enums + (including niche-encoded variants), arrays, closures, unions, + transmute casts, pointer offsets, global and heap allocations, + volatile load/store, and function pointers. Constructs **not yet + supported** include: async/await, trait objects (`dyn`), iterators, + `Vec`/`String`/`HashMap`, `Box`/`Rc`/`Arc` (smart pointers), + multi-threading, panic unwinding, and `Drop` destructors. +- **Thunk mechanism for unsupported constructs:** When KMIR encounters + an operation whose semantics are not yet implemented, it produces a + _thunk_ — a placeholder that marks the unsupported operation. With + `--terminate-on-thunk` (recommended for all proofs), the proof halts + at this point, ensuring unsupported operations are never silently + skipped. Without this flag, the thunk propagates through the proof + state and the proof will typically not pass because the thunk cannot + unify with a valid target state. +- **Loop invariants:** User-supplied loop invariants are supported in + principle via K's native language but are not yet exposed through a + Rust-level annotation language. Programs with unbounded loops require + manual K-level invariant specification or bounded exploration. +- **Invalid loop invariants:** If a user provides a K-level invariant + that is not actually inductive, the prover will detect this: the + proof step that attempts to apply the invariant will fail to + discharge its side conditions, resulting in a stuck proof node. The + proof will not pass. + +For the most up-to-date list of supported features and known issues, +see the [mir-semantics issue +tracker](https://github.com/runtimeverification/mir-semantics/issues). KMIR also prioritizes UI with interactive proof exploration available out-of-the-box through the terminal KCFG (K Control Flow Graph) viewer, allowing @@ -182,17 +229,19 @@ Rust programs that `stable-mir-json` extracts. * Run Stable MIR code extracted from Rust programs (`kmir run my-program.smir.json`); * Prove that a given Rust program or function will terminate without panics or - undefined behaviour (`kmir prove-rs my-program.rs [--start-symbol my_function]`). + undefined behaviour (`kmir prove my-program.rs [--start-symbol my_function]`; + also available as `kmir prove-rs` for backward compatibility). This command invokes `stable-mir-json` internally, and then performs an all-path - reachability proof that the program reaches normal termination under all possible inputs. - Any statements that would panic or cause undefined behaviour will terminate execution - so this proves the successful execution. Pre-and post-conditions can be modelled - using assertions and conditional execution. -* (Advanced use for K experts:) Prove a property about a Rust program, which is given - as a K "claim" and proven using an all-path reachability proof in K - (`kmir prove my-program-spec.k`); + reachability proof that the program reaches normal termination under all possible + inputs. Any statements that would panic or cause undefined behaviour will terminate + execution so this proves the successful execution. Pre- and post-conditions can be + modelled using assertions and conditional execution. The `--terminate-on-thunk` + flag is recommended to ensure soundness (see Known Limitations above). +* (Advanced use for K experts:) Prove a property about a Rust program, which is + given as a K "claim" and proven using an all-path reachability proof in K + (`kmir prove my-program-spec.k --smir`); * Print or inspect the control flow graph of a program's proof constructed by the - `kmir prove` or `kmir prove-rs` commands (`kmir show module.proof_identifier` + `kmir prove` command (`kmir show module.proof_identifier` and `kmir view module.proof_identifier`); An example of proofs using KMIR, and how to construct them in Rust code, @@ -204,6 +253,50 @@ Future development will include using source code annotations instead of explici test functions to better integrate with Rust code bases, using a suitable annotation language to express pre- and post-conditions. +## Artifacts & Audit Mechanisms + +- The [mir-semantics repository](https://github.com/runtimeverification/mir-semantics) + includes a comprehensive regression test suite that is run on every PR via CI. + This test suite covers MIR construct parsing, concrete execution, and symbolic + proof scenarios, ensuring the correctness of the KMIR semantics implementation. +- Example proofs demonstrating KMIR's verification capabilities on Rust standard + library functions are provided in the + [`kmir-proofs`](https://github.com/model-checking/verify-rust-std/tree/main/kmir-proofs) + directory of this repository. +- The K Framework itself has been used to formally verify production systems + including the [KEVM Ethereum virtual machine + semantics](https://github.com/runtimeverification/evm-semantics) and various + smart contract languages, providing a track record of formal verification at + scale. + +## Versioning + +KMIR follows a sequential version numbering scheme. Releases are published as +Docker images on [Docker +Hub](https://hub.docker.com/r/runtimeverificationinc/kmir/tags) with tags in the +format `ubuntu-jammy-{version}` (e.g., `ubuntu-jammy-0.4.202`). The version +number is incremented with each release and corresponds to the state of the +[mir-semantics repository](https://github.com/runtimeverification/mir-semantics) +at the time of release. Version updates to the CI workflow in this repository are +managed by the Runtime Verification team, coordinated with updates to the proof +examples to ensure continued compatibility. + +## CI Integration + +The KMIR CI workflow (`.github/workflows/kmir.yml`) runs KMIR proofs on every +pull request and push to the `main` branch. The workflow: + +1. Checks out the repository. +2. Runs a Docker container using the pinned KMIR image from Docker Hub. +3. Executes the proof runner script (`kmir-proofs/unchecked_arithmetic/run-proofs.sh`), + which iterates over annotated proof targets in the Rust source files and + invokes `kmir prove` with `--terminate-on-thunk` for each target. +4. Each proof must pass (all paths reach normal termination without UB) for the + CI check to succeed. + +The Docker-based approach ensures reproducibility across environments and avoids +the need to install the K Framework toolchain in CI. + ## Background Reading - **[Matching Logic](http://www.matching-logic.org/)** diff --git a/kmir-proofs/README.md b/kmir-proofs/README.md index 76f899d7ce20f..dbf874053a9ad 100644 --- a/kmir-proofs/README.md +++ b/kmir-proofs/README.md @@ -20,9 +20,10 @@ installed and available on the path. ## Program Property Proofs in KMIR -The most user-friendly way to create and run a proof in KMIR is the `prove-rs` -functionality, which allows a user to prove that a given program will -run to completion without an error. +The most user-friendly way to create and run a proof in KMIR is the `kmir prove` +command (also available as `kmir prove-rs` for backward compatibility), which +allows a user to prove that a given program will run to completion without an +error. Desired post-conditions of the program, such as properties of the computed result, can be formulated as simple `assert` statements. Preconditions can be modelled @@ -42,11 +43,17 @@ the `$block` is executed, we can assume that the boolean expression `$pre` holds true. KMIR will stop executing the program as soon as any undefined behaviour arises -from the executed statements. Therefore, running to completion proves the absense +from the executed statements. Therefore, running to completion proves the absence of undefined behaviour, as well as the post-conditions expressed as assertions (possibly under the assumption of preconditions modelled using the above macro). -## Example: Proving Absense of Undefined Behaviour in `unchecked_*` arithmetic +The `--terminate-on-thunk` flag is recommended for all proofs. It ensures that +if the prover encounters a MIR construct with incomplete semantics support, the +proof halts cleanly rather than silently propagating through unsupported +operations. This guarantees that a passing proof is sound with respect to the +supported MIR fragment. + +## Example: Proving Absence of Undefined Behaviour in `unchecked_*` arithmetic The proofs in subdirectory `unchecked_arithmetic` concern a section of the challenge of securing [Safety of Methods for Numeric Primitive @@ -85,13 +92,14 @@ If the sum of the two arguments `a` and `b` does not exceed the bounds of type ` not trigger undefined behaviour and produce the correct result, expressed by the `precondition` macro and the assertion at the end of the unsafe block. -To run the proof, we execute `kmir prove-rs` and provide the function name as +To run the proof, we execute `kmir prove` and provide the function name as the `--start-symbol`. The `--verbose` option allows for watching the proof being executed, the `--proof-dir` will contain data about the proof's intermediate states -that can be inspected afterwards. +that can be inspected afterwards. The `--terminate-on-thunk` flag ensures +soundness by halting the proof if any unsupported construct is encountered. ```shell -kmir prove-rs unchecked_arithmetic.rs --proof-dir proof --start-symbol unchecked_add_i32 --verbose +kmir prove unchecked_arithmetic.rs --proof-dir proof --start-symbol unchecked_add_i32 --verbose --terminate-on-thunk ``` After the proof finishes, the prover reports whether it passed or failed, and some @@ -100,7 +108,7 @@ The graph can be shown or interactively inspected using commands `kmir show` and ```shell kmir view --proof-dir proof unchecked_arithmetic.unchecked_add_i32 -kmir show --proof-dir proof unchecked_arithmetic.unchecked_add_i32 [--no-full-printer] +kmir show --proof-dir proof unchecked_arithmetic.unchecked_add_i32 --statistics --leaves ``` While `kmir show` only prints the control flow graph, `kmir view` opens an interactive diff --git a/kmir-proofs/unchecked_arithmetic/run-proofs.sh b/kmir-proofs/unchecked_arithmetic/run-proofs.sh index bb9642a1aaffe..66296a26ce4ec 100755 --- a/kmir-proofs/unchecked_arithmetic/run-proofs.sh +++ b/kmir-proofs/unchecked_arithmetic/run-proofs.sh @@ -6,7 +6,7 @@ DIR=$(realpath $(dirname $0)) cd $DIR -start_symbols=$(sed -n -e 's,// @kmir prove-rs: ,,p' unchecked_arithmetic.rs) +start_symbols=$(sed -n -e 's,// @kmir prove: ,,p' unchecked_arithmetic.rs) # By default: unchecked_add_i32 unchecked_sub_usize unchecked_mul_isize # See `main` in program. start symbols need to be reachable from `main`. @@ -17,12 +17,12 @@ for sym in ${start_symbols}; do echo "Running proof for $sym" echo "#########################################" - kmir prove-rs --start-symbol $sym --verbose --proof-dir proofs --reload unchecked_arithmetic.rs + kmir prove --start-symbol $sym --verbose --proof-dir proofs --reload --terminate-on-thunk unchecked_arithmetic.rs echo "#########################################" echo "Proof finished, with the following graph:" echo "#########################################" - kmir show --proof-dir proofs unchecked_arithmetic.${sym} --no-full-printer + kmir show --proof-dir proofs unchecked_arithmetic.${sym} --statistics --leaves echo "#########################################" done diff --git a/kmir-proofs/unchecked_arithmetic/unchecked_arithmetic.rs b/kmir-proofs/unchecked_arithmetic/unchecked_arithmetic.rs index 333ab6169bcca..fbbcb5823e8de 100644 --- a/kmir-proofs/unchecked_arithmetic/unchecked_arithmetic.rs +++ b/kmir-proofs/unchecked_arithmetic/unchecked_arithmetic.rs @@ -1,4 +1,4 @@ -// @kmir prove-rs: unchecked_add_i32 unchecked_sub_usize unchecked_mul_isize +// @kmir prove: unchecked_add_i32 unchecked_sub_usize unchecked_mul_isize fn main() { unchecked_add_i32(1,2); From dbd421dccd04bc5f0a9d8b9a47c32f3963c54ba1 Mon Sep 17 00:00:00 2001 From: Stevengre Date: Wed, 8 Apr 2026 09:34:53 +0000 Subject: [PATCH 2/2] Add solana-token equivalence proofs as KMIR artifact example Reference the runtimeverification/solana-token proofs branch (50+ specs) and verification result tracking issues as a larger-scale case study of KMIR applied to real Rust programs. Co-Authored-By: Claude Opus 4.6 (1M context) --- doc/src/tools/kmir.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/doc/src/tools/kmir.md b/doc/src/tools/kmir.md index 9a9229e130d9e..ab0e0445ee854 100644 --- a/doc/src/tools/kmir.md +++ b/doc/src/tools/kmir.md @@ -263,6 +263,17 @@ language to express pre- and post-conditions. library functions are provided in the [`kmir-proofs`](https://github.com/model-checking/verify-rust-std/tree/main/kmir-proofs) directory of this repository. +- As a larger-scale case study, KMIR has been applied to formally verify the + equivalence of the [Solana SPL Token + program](https://github.com/runtimeverification/solana-token/) and its + pinocchio-based reimplementation (P-Token). The verification specs are on the + [`proofs` branch](https://github.com/runtimeverification/solana-token/tree/proofs) + with over 50 proof targets covering token operations (transfer, mint, burn, + freeze, approve, etc.), and results are tracked in + [issue #24](https://github.com/runtimeverification/solana-token/issues/24) + (non-multisig) and + [issue #97](https://github.com/runtimeverification/solana-token/issues/97) + (multisig). - The K Framework itself has been used to formally verify production systems including the [KEVM Ethereum virtual machine semantics](https://github.com/runtimeverification/evm-semantics) and various