Skip to content

feat(core): Add CoreSMT verification pipeline with incremental solver and diagnosis#475

Draft
MikaelMayer wants to merge 165 commits intomainfrom
migrate-b3-smt-pipeline-core-to-core
Draft

feat(core): Add CoreSMT verification pipeline with incremental solver and diagnosis#475
MikaelMayer wants to merge 165 commits intomainfrom
migrate-b3-smt-pipeline-core-to-core

Conversation

@MikaelMayer
Copy link
Contributor

@MikaelMayer MikaelMayer commented Feb 23, 2026

Problem

Strata Core's existing verifier generates all VCs upfront via symbolic execution, encoding each to a separate SMT file. This prevents incremental verification where solver state is maintained across statements.

Additionally, Core.ExpressionMetadata was Unit, discarding source location information and making it impossible to report accurate positions in verification failures.

Solution

Core Expression Metadata: Unit → SourceRange

Core.ExpressionMetadata is changed from Unit to Strata.SourceRange so that:

  • Source locations are preserved through the B3→Core→CoreSMT pipeline
  • Diagnosed failures can be converted back to B3 with accurate source positions

All code constructing Core expressions with () now uses SourceRange.none or a proper range. The SyntaxMono macro (eb[...]) gains a defaultMetadata field on MkLExprParams (defaults to Unit.unit for generic params, overridden to SourceRange.none for Core). TermType.toSMTString is extracted as a pure function to avoid duplication.

CoreSMT Verification Pipeline

An SMTSolverInterface with push/pop support enables incremental verification. A diagnosis engine splits failing conjunctions and checks each conjunct individually to identify which assertions cannot be proved or which covers are refuted.

B3 Verifier Migration

The B3 verifier now uses the CoreSMT pipeline: B3 expressions are converted to Core (preserving source locations), verified via CoreSMT, and diagnosed failures are converted back to B3 for display.

Changes

  • Core.ExpressionMetadata: UnitStrata.SourceRange
  • SyntaxMono macro: defaultMetadata field for type-appropriate metadata
  • TermType.toSMTString: extracted pure function to TermType.lean
  • isCoreSMT predicate, SMTSolverInterface, diagnosis engine, VCResult.diagnosis
  • B3 verifier migrated to CoreSMT pipeline
  • C_Simp/Verify.lean: csimpMetaToCore conversion (Unit → SourceRange)

Testing

All existing tests pass with the new pipeline.

MikaelMayer and others added 30 commits January 27, 2026 11:14
…or Func

- Added ToFormat for generic Func with proper constraints
- Added [ToFormat T.IDMeta] to Factory.lean section variables
- Removed unnecessary ToFormat instances from test files and Program.lean
- Removed custom Env.format function (now uses default ToFormat)
- Function bodies now display properly instead of showing <body>
- Resolved conflict in Factory.lean (Factory_wf moved to FactoryWF.lean in main)
- Applied rotate_left fix to FactoryWF.lean
- Modified testFuncDeclSymbolic to show functions capture variables by reference
- Function declared with n=10, then n mutated to 20, function uses n=20 at call time
- Proof obligation correctly shows result should be 25 (5+20), not 15 (5+10)
- Reverted Env.lean to main (custom scope formatting not needed)
…claration time

- Functions now capture variable values at declaration time, not by reference
- Free variables in function body are substituted with their current values from environment
- Test demonstrates: n=10 at declaration, n=20 after mutation, function uses n=10
- Proof obligation correctly shows result is 15 (5+10), not 25 (5+20)
…duplication

- Move generic Func structure to Strata/DL/Util/Func.lean
- Add PureFunc to Imperative, removing Lambda->Imperative dependency
- Fix funcDecl type checking to add function to type context
- Remove duplicate renameLhs/substFvar from ProcedureInlining
- Extract captureFreevars helper in StatementEval
- Refine getVars to exclude formal parameters for funcDecl
- Add type checking test for funcDecl
- Add WFfuncDeclProp structure in WF.lean checking input parameter uniqueness
- Add LFunc.type_inputs_nodup theorem in Factory.lean
- Add Function.typeCheck_inputs_nodup theorem in FunctionType.lean
- Add listMap_keys_map_snd helper lemma in StatementWF.lean
- Replace sorry in funcDecl case with complete proof
…flags

Restore the pattern from the old verifier: createInteractiveSolver
selects appropriate flags based on the solver name (cvc5 vs z3)
rather than hardcoding cvc5-specific flags.
The lean-action cache key does not include .st grammar files.
When LaurelGrammar.st changes, the cached .olean files are stale
causing spurious test failures. Remove them before building to
force a rebuild.
The lean-action cache key only includes lean-toolchain and lake-manifest.json.
When .st grammar files change (e.g. LaurelGrammar.st), the cached .olean files
become stale causing spurious test failures.

Fix: manage the lake cache ourselves with use-github-cache: false, adding
hashFiles('**/*.st') to the cache key so grammar changes invalidate the cache.
…t processFuncDecl

- Unify proveCheck and coverCheck into a single runCheck function
  parameterized by PropertyType, eliminating ~40 lines of duplication
- Extract processFuncDecl helper to reduce processStatement length
- File reduced from 280 to 238 lines
TranslationTests:
- Filter declare-datatype, push/pop, check-sat from output
- Use check-sat-assuming instead of push/assert/check-sat/pop
- Buffer solver returns unsat to prevent diagnosis from running
- Update all test expectations

VerifierTests:
- Show full obligation as first diagnosis line
- Show path conditions for full obligation
- Show sub-expressions when there are multiple failures
- Update all test expectations
Keep push/pop and check-sat in translation test output - they are
meaningful CoreSMT output. Only filter set-logic, set-option, and
declare-datatype (prelude). Update test expectations accordingly.
The declare-datatype Option was added to the solver prelude but
CoreSMT is new and has no existing users that depend on it.
Remove it from initializeSolver, reset, and programToSMT.
Also remove the declare-datatype filter from TranslationTests
since it is no longer emitted.
info: test: ✗ counterexample found
(0,61): check 8 == 8 && f(5) == 7
└─ (0,67): could not prove 8 == 8 && f(5) == 7
└─ (0,67): could not prove 8 == 8
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 == 8 is a path condition, it's not something we can't prove. What happened?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. The diagnosis now correctly identifies proved vs refuted sub-expressions:

  • Sub-expressions that are proved (not(expr) is unsat) are filtered out
  • Sub-expressions that are refuted (expr is unsat) show as "it is impossible"
  • Sub-expressions that are unknown show as "could not prove"
  • Sub-expressions identical to the full obligation are not duplicated

So 8 == 8 is now correctly filtered (proved), and 1 + 2 == 4 correctly shows as "it is impossible".

Previously, diagnosis would report 'could not prove 8 == 8' even though
8 == 8 is trivially true. The fix:
- For assert checks: skip sub-expressions that are proved (not(expr) is unsat)
- For assert checks: mark sub-expressions as refuted if expr is unsat
- For reach checks: mark sub-expressions as refuted if expr is unsat
- Skip sub-expressions identical to the full obligation (no duplication)

This matches the behavior of the old B3 verifier's diagnosis.
Each sub-expression failure now shows its path condition, which
includes both the conjunction-split context (proved left conjuncts)
and the state path condition (assume statements).

For example, 'f(5) == 7' in 'check 8 == 8 && f(5) == 7' now shows
'8 == 8' as a path condition, since 8 == 8 was proved and used as
context when diagnosing f(5) == 7.
info: test_assert_helps: ✗ unknown
(0,103): assert f(5) > 1
└─ (0,103): could not prove f(5) > 1
(0,103): check f(5) > 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should still be assert here. It's the assert that fails, not the check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. The B3 statement kind (check/assert/reach) is now stored in the Core metadata during ToCore conversion, and the verifier output uses it to correctly label failures. assert f(5) > 1 now shows as "assert" not "check".

Store the B3 statement kind ('check'/'assert') in Core metadata so
the verifier output correctly labels assert failures as 'assert'
rather than 'check'.
let xsArray ← translateDeclList bindings xsa
-- Note: the indices in the following are placeholders
let newBoundVars := List.toArray (xsArray.mapIdx (fun i _ => LExpr.bvar () i))
let newBoundVars := List.toArray (xsArray.mapIdx (fun i _ => LExpr.bvar Strata.SourceRange.none i))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this whole file we should actually have no more Strata.SourceRange.none as we ought to get the sourceRange from the metadata of the DDM. We don't need to do it for python for now but for core it'll be essential.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All Strata.SourceRange.none in Translate.lean are now replaced with the actual source range from the DDM Arg/Op annotation at each call site: arg.ann for expression nodes, op.args[N]!.ann for binding helpers, xsa.ann/bodya.ann for quantifiers, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All Strata.SourceRange.none in Translate.lean are replaced with the actual DDM source range from the Arg/Op annotation at each call site (arg.ann, op.args[N]!.ann, xsa.ann, bodya.ann, etc.). The Arg.ann field is already Strata.SourceRange — the same type as CoreExprMetadata — so no conversion is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All Strata.SourceRange.none in Translate.lean replaced with actual DDM source ranges: arg.ann for expression nodes, op.args[N]!.ann for binding helpers, xsa.ann/bodya.ann for quantifiers. Arg.ann is already Strata.SourceRange so no conversion needed.

…nslate.lean

- Restore the three #guard_msgs tests in Identifiers.lean that were
  removed when CoreExprMetadata changed from Unit to SourceRange.
  Update expectations to show Strata.SourceRange.none instead of ().
- Replace all Strata.SourceRange.none in Translate.lean with the
  actual source range from the DDM Arg/Op annotation at each call site.
  This ensures Core expressions carry proper source location info.
- SourceRange.lean: Add #guard_msgs tests for the Repr instance showing
  that none displays as () and non-none shows struct fields
- ASTtoCST.lean: The fvar initializer case is relevant to this PR since
  CoreExprMetadata changed from Unit to SourceRange, requiring the fvar
  constructor to take a SourceRange argument
- CoreToCBMC.lean: Comment already present explaining SourceRange.none
Fix Core.defaultSolver reference in SolverInterface.lean after
the SimpleAPI refactor moved defaultSolver to Core.Options.
…Types

- SourceRange Repr always shows () to keep debug output readable
  (source location info is available via SourceRange.format)
- LExpr.eraseTypes now also erases metadata to default, so alpha
  equivalence checks in ProcedureInlining tests are not affected
  by source range differences between programs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants