Add ERC-4337 EntryPoint invariant benchmark case by Th0rgal · Pull Request #32 · lfglabs-dev/verity-benchmark

Th0rgal · 2026-04-23T21:03:20Z

Summary

Adds the ERC-4337 EntryPoint core invariant as a new benchmark case, proving the property that Certora spent over a year attempting to verify with SMT solvers
Models the two-loop handleOps structure (validation phase + execution phase) as pure Lean functions with universally quantified validation results
All 8 theorems fully proven with zero sorry — builds clean with lake build

What is proven

The core ERC-4337 security property from the EIP spec:

"The EntryPoint only calls the sender with userOp.callData if and only if validateUserOp on that specific sender has passed."

This decomposes into:

Theorem	Property	Difficulty
`execution_implies_validation`	Safety: execution → validation	medium
`validation_implies_execution`	Liveness: validation → execution	medium
`execution_iff_validation`	Combined biconditional (the full invariant)	hard
`all_validated_on_success`	All-or-nothing validation loop	easy
`all_executed_on_success`	All executions attempted on success	easy
`no_execution_on_revert`	No execution on validation failure	easy
`single_op_execution_on_validation`	Verity contract: batchExecuted set	easy
`single_op_fee_collected`	Verity contract: fees incremented	easy

Why this matters

Certora's SMT-based approach cannot universally quantify over arbitrary Turing-complete programs (the account/paymaster contracts). Lean's interactive theorem proving handles this naturally — the proofs hold for ALL possible validation outcomes via standard universal quantification over List Bool.

Files

Benchmark/Cases/ERC4337/EntryPointInvariant/Contract.lean — Abstract model of handleOps two-loop structure
Benchmark/Cases/ERC4337/EntryPointInvariant/Specs.lean — Invariant specifications
Benchmark/Cases/ERC4337/EntryPointInvariant/Proofs.lean — Complete proofs (no sorry)
Benchmark/Cases/ERC4337/EntryPointInvariant/Compile.lean — Build gate
cases/erc4337/entry_point_invariant/ — YAML metadata (case + 8 tasks)

Test plan

lake build passes with no errors
Zero sorry in all proof files
All existing benchmark cases still build

🤖 Generated with Claude Code

Note

Medium Risk
Adds a new benchmark case with substantial new Lean/Verity model+proof code and updates the benchmark registry/report, so breakage risk is mostly around build/integration and task generation rather than runtime behavior.

Overview
Adds a new ERC-4337 EntryPointInvariant benchmark case, including a pure-Lean control-flow model of handleOps (validation phase + execution phase), corresponding Verity contract model (EntryPointModel), and completed reference proofs/specs for the execution-attempt iff validation-passed invariant.

Wires the new case into the benchmark entrypoints (Benchmark.lean, Benchmark/Cases.lean) and updates REPORT.md to include the new family/case and its associated proof tasks.

Includes minor API renaming in the Zama ERC-7984 model (mint parameter to → recipient) and updates the related specs/proofs and generated task stubs accordingly.

^{Reviewed by Cursor Bugbot for commit a8bf0bc. Bugbot is set up for automated code reviews on this repo. Configure here.}

Prove the core ERC-4337 security property that Certora could not verify with SMT solvers: execution happens if and only if validation passed. The model abstracts the two-loop handleOps structure as pure Lean functions with universally quantified validation results. All 8 theorems are fully proven (zero sorry), including: - Safety (execution implies validation) - Liveness (validation implies execution) - Combined biconditional invariant - All-validated/all-executed on success - No-execution on revert - Verity contract proofs for single-op model Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 81bd0dfabd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 81bd0df. Configure here.}

…4337 Creates the erc4337 family manifest, erc4337_v09 implementation manifest, and 8 generated task template files to fix CI manifest validation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Th0rgal · 2026-04-24T13:54:49Z

Migrated this benchmark PR to the faithful Verity model from lfglabs-dev/verity#1746.

Changes pushed in 7891277:

pins verity to 57d47acd44f273df9848bc2df612d2daa8aedf18
adds a native ERC-4337 handleOpsNative model using executable forEach, oracle-backed externalCall, and oracle-backed call/tryCatch
updates the Zama mint binder from to to recipient for compatibility with the new Verity syntax

Local verification: lake build passes.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 789127790c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8bf0bcd7f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-24T16:54:37Z


 require verity from git
-  "https://github.com/Th0rgal/verity.git"@"4ebe4931d25e5a1594fcd3f43ff040ecc3c4225a"
+  "https://github.com/lfglabs-dev/verity.git"@"de9377951c00bbd5fbf72e85d56fe50b8ff40d0c"


Keep case verity_version in sync with pin bump

Updating the global Verity dependency pin here to de937... without updating existing cases/*/*/case.yaml manifests leaves most cases still declaring 4ebe..., and scripts/run_case.sh records provenance from each case manifest’s verity_version rather than from Lake. In practice, non-ERC4337 runs after this commit build against one Verity commit but publish a different verity_commit in results metadata, which breaks reproducibility/auditability; either update all case manifests in the same commit or source the recorded commit from lakefile.lean.

Useful? React with 👍 / 👎.

chatgpt-codex-connector Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread cases/erc4337/entry_point_invariant/case.yaml

Comment thread cases/erc4337/entry_point_invariant/tasks/execution_implies_validation.yaml

cursor Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread cases/erc4337/entry_point_invariant/tasks/all_executed_on_success.yaml

Comment thread cases/erc4337/entry_point_invariant/case.yaml

Add family/implementation manifests and generated task files for ERC-…

1775caf

…4337 Creates the erc4337 family manifest, erc4337_v09 implementation manifest, and 8 generated task template files to fix CI manifest validation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Th0rgal mentioned this pull request Apr 24, 2026

Add ERC-4337 EntryPoint proof case study lfglabs-dev/lfglabs.dev#35

Open

Migrate ERC-4337 benchmark to faithful Verity model

7891277

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread cases/erc4337/entry_point_invariant/case.yaml Outdated

Th0rgal added 2 commits April 24, 2026 18:33

Clarify ERC4337 benchmark model scope

1a7a010

Pin ERC4337 benchmark to latest Verity features

a8bf0bc

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ERC-4337 EntryPoint invariant benchmark case#32

Add ERC-4337 EntryPoint invariant benchmark case#32
Th0rgal wants to merge 5 commits intomainfrom
feat/erc4337-entrypoint-invariant

Th0rgal commented Apr 23, 2026 •

edited by cursor Bot

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Th0rgal commented Apr 24, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Th0rgal commented Apr 23, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What is proven

Why this matters

Files

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Th0rgal commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Th0rgal commented Apr 23, 2026 •

edited by cursor Bot

Loading

Th0rgal commented Apr 24, 2026 •

edited

Loading