test(evm): add opcode microbenchmarks generation #382

Draft
starwarfan wants to merge 2 commits into DTVMStack:main from starwarfan:feature/opcode-microbenchmarks

Conversation

@starwarfan
Contributor

Summary

  • Adds a Python script, tools/generate_opcode_benchmarks.py, that programmatically generates EVM bytecode for measuring the performance of individual opcodes.
  • Updates tools/check_performance_regression.py to allow specifying multiple benchmark directories and filters.
  • Modifies .ci/run_test_suite.sh to invoke the generator and include the new test/opcode-benchmarks directory in the evmone-bench suite.

This lets us track per-opcode performance regressions against libevmone. The generated bytecode chains each operation's result into the next one (data-dependency chaining), so that the DTVM multipass JIT cannot remove the benchmarked opcodes through dead-code elimination.
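The chaining idea can be sketched as follows. This is a minimal illustration and not the code in tools/generate_opcode_benchmarks.py: the function name build_chained_binary_op and the operand values 3 and 1 are assumptions made for the example, while the opcode bytes are the standard EVM encodings.

```python
# Standard EVM opcode encodings, written as hex strings.
OP_ADD = "01"
OP_MSTORE = "52"
OP_PUSH1 = "60"
OP_DUP2 = "81"
OP_RETURN = "f3"


def build_chained_binary_op(opcode: str, iterations: int) -> str:
    """Return hex bytecode that applies `opcode` `iterations` times in a chain.

    Hypothetical helper. Stack layout (bottom -> top): [operand, accumulator].
    Each iteration runs DUP2 <OP>: the opcode pops the operand copy and the
    accumulator, and its result becomes the new accumulator, so every
    iteration depends on the previous one.
    """
    setup = OP_PUSH1 + "03" + OP_PUSH1 + "01"  # operand = 3, accumulator = 1
    loop_body = OP_DUP2 + opcode               # copy operand, combine with accumulator
    # Store the final accumulator at memory offset 0 and return 32 bytes, so
    # the end of the chain is observable by the caller.
    end = OP_PUSH1 + "00" + OP_MSTORE + OP_PUSH1 + "20" + OP_PUSH1 + "00" + OP_RETURN
    return setup + loop_body * iterations + end


if __name__ == "__main__":
    print(build_chained_binary_op(OP_ADD, 2))
    # prints 600360018101810160005260206000f3
```

Returning the final accumulator is what keeps the chain live. Chaining alone does not rule out constant folding of a fully constant chain, so whether a given JIT still simplifies it would need to be checked against the measured timings.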

Made with Cursor

@starwarfan starwarfan marked this pull request as draft March 4, 2026 07:35
@starwarfan starwarfan force-pushed the feature/opcode-microbenchmarks branch from b04216e to acf8fe8 Compare March 4, 2026 08:32
@github-actions

github-actions bot commented Mar 4, 2026

⚡ Performance Regression Check Results

✅ Performance Check Passed (interpreter)

Performance Benchmark Results (threshold: 20%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 1.61 1.57 -2.4% PASS
total/main/blake2b_huff/empty 0.06 0.06 -0.9% PASS
total/main/blake2b_shifts/8415nulls 14.53 14.56 +0.2% PASS
total/main/sha1_divs/5311 6.12 6.15 +0.6% PASS
total/main/sha1_divs/empty 0.08 0.08 +0.5% PASS
total/main/sha1_shifts/5311 4.24 4.23 -0.2% PASS
total/main/sha1_shifts/empty 0.06 0.06 -0.6% PASS
total/main/snailtracer/benchmark 60.21 60.07 -0.2% PASS
total/main/structarray_alloc/nfts_rank 0.99 0.98 -0.7% PASS
total/main/swap_math/insufficient_liquidity 0.01 0.01 +0.1% PASS
total/main/swap_math/received 0.01 0.01 +1.3% PASS
total/main/swap_math/spent 0.01 0.01 +1.4% PASS
total/main/weierstrudel/1 0.26 0.26 -1.3% PASS
total/main/weierstrudel/15 2.40 2.40 -0.1% PASS
total/micro/JUMPDEST_n0/empty 1.28 1.32 +3.5% PASS
total/micro/jump_around/empty 0.09 0.10 +2.5% PASS
total/micro/loop_with_many_jumpdests/empty 19.41 20.11 +3.6% PASS
total/micro/memory_grow_mload/by1 0.14 0.14 -0.4% PASS
total/micro/memory_grow_mload/by16 0.15 0.15 -0.2% PASS
total/micro/memory_grow_mload/by32 0.16 0.16 -0.6% PASS
total/micro/memory_grow_mload/nogrow 0.14 0.14 -0.5% PASS
total/micro/memory_grow_mstore/by1 0.15 0.15 -1.3% PASS
total/micro/memory_grow_mstore/by16 0.16 0.16 -0.5% PASS
total/micro/memory_grow_mstore/by32 0.17 0.17 -0.5% PASS
total/micro/memory_grow_mstore/nogrow 0.15 0.15 -1.8% PASS
total/micro/signextend/one 0.28 0.32 +13.1% PASS
total/micro/signextend/zero 0.28 0.32 +13.5% PASS
total/synth/ADD/b0 2.16 2.16 -0.0% PASS
total/synth/ADD/b1 2.21 2.21 -0.1% PASS
total/synth/ADDRESS/a0 5.44 5.48 +0.8% PASS
total/synth/ADDRESS/a1 5.77 5.73 -0.6% PASS
total/synth/AND/b0 2.04 2.03 -0.3% PASS
total/synth/AND/b1 2.01 2.10 +4.7% PASS
total/synth/BYTE/b0 4.49 4.46 -0.7% PASS
total/synth/BYTE/b1 3.93 3.94 +0.2% PASS
total/synth/CALLDATASIZE/a0 2.38 2.39 +0.6% PASS
total/synth/CALLDATASIZE/a1 2.66 2.67 +0.4% PASS
total/synth/CALLER/a0 5.47 5.48 +0.1% PASS
total/synth/CALLER/a1 5.80 5.73 -1.2% PASS
total/synth/CALLVALUE/a0 1.89 2.01 +6.2% PASS
total/synth/CALLVALUE/a1 2.11 2.16 +2.6% PASS
total/synth/CODESIZE/a0 2.59 2.59 +0.2% PASS
total/synth/CODESIZE/a1 3.01 2.92 -2.8% PASS
total/synth/DUP1/d0 1.11 1.15 +3.7% PASS
total/synth/DUP1/d1 1.24 1.21 -2.4% PASS
total/synth/DUP10/d0 1.13 1.15 +1.8% PASS
total/synth/DUP10/d1 1.25 1.22 -1.8% PASS
total/synth/DUP11/d0 1.15 1.15 -0.0% PASS
total/synth/DUP11/d1 1.26 1.22 -3.2% PASS
total/synth/DUP12/d0 1.12 1.15 +3.1% PASS
total/synth/DUP12/d1 1.25 1.23 -1.4% PASS
total/synth/DUP13/d0 1.12 1.14 +2.3% PASS
total/synth/DUP13/d1 1.24 1.23 -1.1% PASS
total/synth/DUP14/d0 1.12 1.15 +2.3% PASS
total/synth/DUP14/d1 1.24 1.22 -1.6% PASS
total/synth/DUP15/d0 1.34 1.39 +4.1% PASS
total/synth/DUP15/d1 1.26 1.22 -3.1% PASS
total/synth/DUP16/d0 1.29 1.33 +3.2% PASS
total/synth/DUP16/d1 1.25 1.22 -2.3% PASS
total/synth/DUP2/d0 1.11 1.14 +2.2% PASS
total/synth/DUP2/d1 1.25 1.22 -2.4% PASS
total/synth/DUP3/d0 1.11 1.14 +2.6% PASS
total/synth/DUP3/d1 1.25 1.22 -2.0% PASS
total/synth/DUP4/d0 1.11 1.14 +2.7% PASS
total/synth/DUP4/d1 1.26 1.22 -3.1% PASS
total/synth/DUP5/d0 1.12 1.15 +2.3% PASS
total/synth/DUP5/d1 1.24 1.22 -2.1% PASS
total/synth/DUP6/d0 1.12 1.15 +2.4% PASS
total/synth/DUP6/d1 1.25 1.22 -2.6% PASS
total/synth/DUP7/d0 1.12 1.15 +2.1% PASS
total/synth/DUP7/d1 1.26 1.22 -3.3% PASS
total/synth/DUP8/d0 1.12 1.15 +2.1% PASS
total/synth/DUP8/d1 1.29 1.21 -6.7% PASS
total/synth/DUP9/d0 1.12 1.14 +1.4% PASS
total/synth/DUP9/d1 1.24 1.21 -2.4% PASS
total/synth/EQ/b0 3.98 3.94 -1.2% PASS
total/synth/EQ/b1 4.31 4.41 +2.4% PASS
total/synth/GAS/a0 2.58 2.58 -0.1% PASS
total/synth/GAS/a1 2.94 2.93 -0.4% PASS
total/synth/GT/b0 4.25 4.25 +0.1% PASS
total/synth/GT/b1 4.43 4.51 +1.7% PASS
total/synth/ISZERO/u0 7.04 7.05 +0.1% PASS
total/synth/JUMPDEST/n0 1.28 1.32 +3.3% PASS
total/synth/LT/b0 4.21 4.25 +1.1% PASS
total/synth/LT/b1 4.46 4.48 +0.5% PASS
total/synth/MSIZE/a0 3.63 3.65 +0.5% PASS
total/synth/MSIZE/a1 3.95 3.90 -1.2% PASS
total/synth/MUL/b0 4.05 4.12 +1.8% PASS
total/synth/MUL/b1 3.98 3.99 +0.2% PASS
total/synth/NOT/u0 1.78 1.81 +1.9% PASS
total/synth/OR/b0 2.06 2.04 -1.0% PASS
total/synth/OR/b1 2.08 2.12 +1.8% PASS
total/synth/PC/a0 2.37 2.38 +0.2% PASS
total/synth/PC/a1 2.72 2.67 -1.6% PASS
total/synth/PUSH1/p0 1.78 1.77 -0.4% PASS
total/synth/PUSH1/p1 1.77 1.72 -3.1% PASS
total/synth/PUSH10/p0 1.83 1.82 -0.6% PASS
total/synth/PUSH10/p1 1.82 1.76 -3.3% PASS
total/synth/PUSH11/p0 1.85 1.84 -0.3% PASS
total/synth/PUSH11/p1 1.82 1.76 -3.3% PASS
total/synth/PUSH12/p0 1.85 1.85 -0.3% PASS
total/synth/PUSH12/p1 1.82 1.77 -3.0% PASS
total/synth/PUSH13/p0 1.85 1.85 -0.2% PASS
total/synth/PUSH13/p1 1.82 1.77 -2.9% PASS
total/synth/PUSH14/p0 1.81 1.81 -0.0% PASS
total/synth/PUSH14/p1 1.83 1.77 -3.1% PASS
total/synth/PUSH15/p0 1.87 1.86 -0.2% PASS
total/synth/PUSH15/p1 1.83 1.77 -2.9% PASS
total/synth/PUSH16/p0 1.88 1.87 -0.4% PASS
total/synth/PUSH16/p1 1.84 1.79 -2.7% PASS
total/synth/PUSH17/p0 1.88 1.87 -0.4% PASS
total/synth/PUSH17/p1 1.84 1.78 -3.2% PASS
total/synth/PUSH18/p0 1.88 1.86 -0.9% PASS
total/synth/PUSH18/p1 1.84 1.78 -3.3% PASS
total/synth/PUSH19/p0 1.90 1.89 -0.3% PASS
total/synth/PUSH19/p1 1.84 1.79 -3.0% PASS
total/synth/PUSH2/p0 1.78 1.77 -0.2% PASS
total/synth/PUSH2/p1 1.78 1.73 -2.9% PASS
total/synth/PUSH20/p0 1.91 1.90 -0.5% PASS
total/synth/PUSH20/p1 1.85 1.79 -2.8% PASS
total/synth/PUSH21/p0 1.91 1.90 -0.4% PASS
total/synth/PUSH21/p1 1.85 1.79 -3.1% PASS
total/synth/PUSH22/p0 1.86 1.86 -0.1% PASS
total/synth/PUSH22/p1 1.86 1.79 -3.4% PASS
total/synth/PUSH23/p0 1.92 1.91 -0.5% PASS
total/synth/PUSH23/p1 1.86 1.80 -3.2% PASS
total/synth/PUSH24/p0 1.93 1.93 -0.2% PASS
total/synth/PUSH24/p1 1.86 1.80 -3.1% PASS
total/synth/PUSH25/p0 1.93 1.93 -0.2% PASS
total/synth/PUSH25/p1 1.86 1.81 -2.8% PASS
total/synth/PUSH26/p0 1.93 1.92 -0.5% PASS
total/synth/PUSH26/p1 1.87 1.81 -3.3% PASS
total/synth/PUSH27/p0 1.95 1.94 -0.3% PASS
total/synth/PUSH27/p1 1.88 1.82 -3.3% PASS
total/synth/PUSH28/p0 1.96 1.95 -0.3% PASS
total/synth/PUSH28/p1 1.87 1.82 -2.8% PASS
total/synth/PUSH29/p0 1.97 1.95 -0.6% PASS
total/synth/PUSH29/p1 1.89 1.82 -3.4% PASS
total/synth/PUSH3/p0 1.79 1.78 -0.4% PASS
total/synth/PUSH3/p1 1.79 1.74 -2.5% PASS
total/synth/PUSH30/p0 1.89 1.95 +3.4% PASS
total/synth/PUSH30/p1 1.90 1.83 -3.3% PASS
total/synth/PUSH31/p0 1.98 1.96 -0.8% PASS
total/synth/PUSH31/p1 1.89 1.84 -2.9% PASS
total/synth/PUSH32/p0 1.98 1.98 -0.3% PASS
total/synth/PUSH32/p1 1.90 1.85 -2.6% PASS
total/synth/PUSH4/p0 1.80 1.79 -0.3% PASS
total/synth/PUSH4/p1 1.79 1.74 -3.0% PASS
total/synth/PUSH5/p0 1.80 1.79 -0.7% PASS
total/synth/PUSH5/p1 1.79 1.74 -3.0% PASS
total/synth/PUSH6/p0 1.78 1.78 -0.4% PASS
total/synth/PUSH6/p1 1.80 1.74 -2.9% PASS
total/synth/PUSH7/p0 1.81 1.81 -0.4% PASS
total/synth/PUSH7/p1 1.80 1.74 -3.2% PASS
total/synth/PUSH8/p0 1.82 1.82 -0.2% PASS
total/synth/PUSH8/p1 1.81 1.75 -3.4% PASS
total/synth/PUSH9/p0 1.83 1.82 -0.5% PASS
total/synth/PUSH9/p1 1.81 1.76 -2.5% PASS
total/synth/RETURNDATASIZE/a0 2.58 2.59 +0.4% PASS
total/synth/RETURNDATASIZE/a1 3.01 2.96 -1.4% PASS
total/synth/SAR/b0 3.37 3.36 -0.4% PASS
total/synth/SAR/b1 3.64 3.70 +1.5% PASS
total/synth/SGT/b0 4.24 4.25 +0.4% PASS
total/synth/SGT/b1 4.36 4.46 +2.3% PASS
total/synth/SHL/b0 3.70 3.69 -0.3% PASS
total/synth/SHL/b1 2.59 2.60 +0.4% PASS
total/synth/SHR/b0 3.57 3.44 -3.6% PASS
total/synth/SHR/b1 2.79 2.54 -8.9% PASS
total/synth/SIGNEXTEND/b0 2.48 2.55 +2.6% PASS
total/synth/SIGNEXTEND/b1 2.40 2.43 +1.3% PASS
total/synth/SLT/b0 4.24 4.28 +0.8% PASS
total/synth/SLT/b1 4.39 4.45 +1.3% PASS
total/synth/SUB/b0 2.16 2.17 +0.5% PASS
total/synth/SUB/b1 2.19 2.24 +2.6% PASS
total/synth/SWAP1/s0 1.54 1.54 -0.1% PASS
total/synth/SWAP10/s0 1.54 1.54 -0.0% PASS
total/synth/SWAP11/s0 1.54 1.62 +5.2% PASS
total/synth/SWAP12/s0 1.54 1.55 +0.7% PASS
total/synth/SWAP13/s0 1.55 1.54 -0.3% PASS
total/synth/SWAP14/s0 1.56 1.55 -0.2% PASS
total/synth/SWAP15/s0 1.62 1.59 -1.6% PASS
total/synth/SWAP16/s0 1.74 1.72 -1.3% PASS
total/synth/SWAP2/s0 1.53 1.54 +0.6% PASS
total/synth/SWAP3/s0 1.54 1.54 -0.6% PASS
total/synth/SWAP4/s0 1.55 1.53 -1.1% PASS
total/synth/SWAP5/s0 1.55 1.54 -0.8% PASS
total/synth/SWAP6/s0 1.55 1.55 +0.0% PASS
total/synth/SWAP7/s0 1.55 1.55 +0.0% PASS
total/synth/SWAP8/s0 1.55 1.54 -0.8% PASS
total/synth/SWAP9/s0 1.54 1.55 +0.6% PASS
total/synth/XOR/b0 2.02 2.03 +0.4% PASS
total/synth/XOR/b1 2.10 2.09 -0.2% PASS
total/synth/loop_v1 6.20 6.67 +7.5% PASS
total/synth/loop_v2 6.19 6.68 +7.8% PASS

Summary: 194 benchmarks, 0 regressions


✅ Performance Check Passed (multipass)

Performance Benchmark Results (threshold: 20%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 1.98 2.02 +1.9% PASS
total/main/blake2b_huff/empty 0.11 0.11 +0.3% PASS
total/main/blake2b_shifts/8415nulls 6.55 6.58 +0.4% PASS
total/main/sha1_divs/5311 3.44 3.54 +2.9% PASS
total/main/sha1_divs/empty 0.05 0.05 +2.4% PASS
total/main/sha1_shifts/5311 3.75 3.67 -2.2% PASS
total/main/sha1_shifts/empty 0.06 0.06 -1.1% PASS
total/main/snailtracer/benchmark 66.95 67.99 +1.5% PASS
total/main/structarray_alloc/nfts_rank 0.30 0.31 +2.6% PASS
total/main/swap_math/insufficient_liquidity 0.03 0.03 +0.3% PASS
total/main/swap_math/received 0.03 0.03 +1.8% PASS
total/main/swap_math/spent 0.03 0.03 -0.1% PASS
total/main/weierstrudel/1 0.38 0.38 +0.1% PASS
total/main/weierstrudel/15 2.94 2.90 -1.4% PASS
total/micro/JUMPDEST_n0/empty 0.18 0.18 +0.7% PASS
total/micro/jump_around/empty 0.73 0.69 -5.7% PASS
total/micro/loop_with_many_jumpdests/empty 2.47 2.47 -0.1% PASS
total/micro/memory_grow_mload/by1 0.23 0.23 +1.2% PASS
total/micro/memory_grow_mload/by16 0.25 0.25 +1.1% PASS
total/micro/memory_grow_mload/by32 0.27 0.28 +4.0% PASS
total/micro/memory_grow_mload/nogrow 0.22 0.23 +2.4% PASS
total/micro/memory_grow_mstore/by1 0.27 0.27 +1.4% PASS
total/micro/memory_grow_mstore/by16 0.28 0.28 +2.1% PASS
total/micro/memory_grow_mstore/by32 0.29 0.30 +3.0% PASS
total/micro/memory_grow_mstore/nogrow 0.27 0.27 +1.4% PASS
total/micro/signextend/one 0.42 0.41 -1.3% PASS
total/micro/signextend/zero 0.42 0.41 -1.2% PASS
total/synth/ADD/b0 0.02 0.02 +2.2% PASS
total/synth/ADD/b1 0.02 0.02 +6.2% PASS
total/synth/ADDRESS/a0 1.06 0.99 -6.9% PASS
total/synth/ADDRESS/a1 1.17 1.02 -13.3% PASS
total/synth/AND/b0 0.02 0.02 +1.9% PASS
total/synth/AND/b1 0.02 0.02 +6.2% PASS
total/synth/BYTE/b0 1.96 1.97 +0.5% PASS
total/synth/BYTE/b1 2.32 2.32 +0.0% PASS
total/synth/CALLDATASIZE/a0 0.73 0.50 -32.2% PASS
total/synth/CALLDATASIZE/a1 0.76 0.52 -31.3% PASS
total/synth/CALLER/a0 1.06 1.18 +11.6% PASS
total/synth/CALLER/a1 1.14 1.02 -11.1% PASS
total/synth/CALLVALUE/a0 0.65 0.58 -11.2% PASS
total/synth/CALLVALUE/a1 0.68 0.61 -10.8% PASS
total/synth/CODESIZE/a0 0.73 0.49 -32.6% PASS
total/synth/CODESIZE/a1 0.76 0.52 -31.4% PASS
total/synth/DUP1/d0 0.02 0.02 +2.2% PASS
total/synth/DUP1/d1 0.02 0.02 +7.9% PASS
total/synth/DUP10/d0 0.02 0.02 +2.5% PASS
total/synth/DUP10/d1 0.02 0.02 +8.4% PASS
total/synth/DUP11/d0 0.02 0.02 +2.4% PASS
total/synth/DUP11/d1 0.02 0.02 +7.9% PASS
total/synth/DUP12/d0 0.02 0.02 +1.2% PASS
total/synth/DUP12/d1 0.02 0.02 +8.4% PASS
total/synth/DUP13/d0 0.02 0.02 +2.6% PASS
total/synth/DUP13/d1 0.02 0.02 +8.3% PASS
total/synth/DUP14/d0 0.02 0.02 +2.5% PASS
total/synth/DUP14/d1 0.02 0.02 +8.3% PASS
total/synth/DUP15/d0 0.02 0.02 +2.4% PASS
total/synth/DUP15/d1 0.02 0.02 +8.3% PASS
total/synth/DUP16/d0 0.02 0.02 +2.4% PASS
total/synth/DUP16/d1 0.02 0.02 +8.7% PASS
total/synth/DUP2/d0 0.02 0.02 +2.0% PASS
total/synth/DUP2/d1 0.02 0.02 +8.1% PASS
total/synth/DUP3/d0 0.02 0.02 +2.1% PASS
total/synth/DUP3/d1 0.02 0.02 +8.8% PASS
total/synth/DUP4/d0 0.02 0.02 +2.2% PASS
total/synth/DUP4/d1 0.02 0.02 +8.2% PASS
total/synth/DUP5/d0 0.02 0.02 +2.0% PASS
total/synth/DUP5/d1 0.02 0.02 +8.2% PASS
total/synth/DUP6/d0 0.02 0.02 +2.1% PASS
total/synth/DUP6/d1 0.02 0.02 +7.6% PASS
total/synth/DUP7/d0 0.02 0.02 +3.2% PASS
total/synth/DUP7/d1 0.02 0.02 +8.4% PASS
total/synth/DUP8/d0 0.02 0.02 +2.1% PASS
total/synth/DUP8/d1 0.02 0.02 +8.1% PASS
total/synth/DUP9/d0 0.02 0.02 +2.4% PASS
total/synth/DUP9/d1 0.02 0.02 +8.1% PASS
total/synth/EQ/b0 0.02 0.02 +1.6% PASS
total/synth/EQ/b1 0.02 0.02 +6.1% PASS
total/synth/GAS/a0 1.01 1.01 -0.2% PASS
total/synth/GAS/a1 1.05 1.05 -0.1% PASS
total/synth/GT/b0 0.02 0.02 +2.1% PASS
total/synth/GT/b1 0.02 0.02 +6.2% PASS
total/synth/ISZERO/u0 0.02 0.02 +1.9% PASS
total/synth/JUMPDEST/n0 0.18 0.18 -1.1% PASS
total/synth/LT/b0 0.02 0.02 +2.0% PASS
total/synth/LT/b1 0.02 0.02 +6.3% PASS
total/synth/MSIZE/a0 0.02 0.02 +1.9% PASS
total/synth/MSIZE/a1 0.02 0.02 +1.8% PASS
total/synth/MUL/b0 4.25 4.25 +0.1% PASS
total/synth/MUL/b1 4.19 4.76 +13.5% PASS
total/synth/NOT/u0 0.02 0.02 +2.2% PASS
total/synth/OR/b0 0.02 0.02 +1.8% PASS
total/synth/OR/b1 0.02 0.02 +6.2% PASS
total/synth/PC/a0 0.02 0.02 +2.4% PASS
total/synth/PC/a1 0.02 0.02 +1.4% PASS
total/synth/PUSH1/p0 0.02 0.02 +2.3% PASS
total/synth/PUSH1/p1 0.02 0.02 +2.1% PASS
total/synth/PUSH10/p0 0.04 0.04 +0.6% PASS
total/synth/PUSH10/p1 0.04 0.04 +0.8% PASS
total/synth/PUSH11/p0 0.04 0.04 +0.8% PASS
total/synth/PUSH11/p1 0.04 0.04 +1.3% PASS
total/synth/PUSH12/p0 0.05 0.05 +0.3% PASS
total/synth/PUSH12/p1 0.05 0.05 +0.9% PASS
total/synth/PUSH13/p0 0.05 0.05 +0.1% PASS
total/synth/PUSH13/p1 0.05 0.05 +0.5% PASS
total/synth/PUSH14/p0 0.05 0.05 +0.7% PASS
total/synth/PUSH14/p1 0.05 0.05 +1.1% PASS
total/synth/PUSH15/p0 0.05 0.05 +0.6% PASS
total/synth/PUSH15/p1 0.05 0.05 +0.6% PASS
total/synth/PUSH16/p0 0.06 0.06 +0.5% PASS
total/synth/PUSH16/p1 0.06 0.06 +0.5% PASS
total/synth/PUSH17/p0 0.06 0.06 +0.2% PASS
total/synth/PUSH17/p1 0.06 0.06 +0.6% PASS
total/synth/PUSH18/p0 0.06 0.06 +0.5% PASS
total/synth/PUSH18/p1 0.06 0.06 +0.3% PASS
total/synth/PUSH19/p0 0.06 0.06 -0.6% PASS
total/synth/PUSH19/p1 0.06 0.06 +0.4% PASS
total/synth/PUSH2/p0 0.02 0.02 +1.5% PASS
total/synth/PUSH2/p1 0.02 0.02 +1.8% PASS
total/synth/PUSH20/p0 0.07 0.07 +0.5% PASS
total/synth/PUSH20/p1 0.07 0.07 +1.1% PASS
total/synth/PUSH21/p0 0.07 0.07 +0.4% PASS
total/synth/PUSH21/p1 0.07 0.07 +0.5% PASS
total/synth/PUSH22/p0 1.82 1.95 +7.1% PASS
total/synth/PUSH22/p1 1.32 1.45 +9.9% PASS
total/synth/PUSH23/p0 1.83 1.95 +6.4% PASS
total/synth/PUSH23/p1 1.33 1.46 +9.5% PASS
total/synth/PUSH24/p0 1.83 1.95 +6.4% PASS
total/synth/PUSH24/p1 1.31 1.45 +10.5% PASS
total/synth/PUSH25/p0 1.83 1.95 +6.8% PASS
total/synth/PUSH25/p1 1.32 1.44 +8.7% PASS
total/synth/PUSH26/p0 1.83 1.97 +7.3% PASS
total/synth/PUSH26/p1 1.33 1.45 +9.1% PASS
total/synth/PUSH27/p0 1.83 1.96 +7.0% PASS
total/synth/PUSH27/p1 1.33 1.46 +9.6% PASS
total/synth/PUSH28/p0 1.84 1.97 +7.0% PASS
total/synth/PUSH28/p1 1.34 1.47 +9.9% PASS
total/synth/PUSH29/p0 1.84 1.96 +6.6% PASS
total/synth/PUSH29/p1 1.35 1.47 +9.1% PASS
total/synth/PUSH3/p0 0.02 0.02 +0.0% PASS
total/synth/PUSH3/p1 0.02 0.02 +1.6% PASS
total/synth/PUSH30/p0 1.85 2.01 +8.4% PASS
total/synth/PUSH30/p1 1.35 1.48 +9.9% PASS
total/synth/PUSH31/p0 1.85 1.98 +7.3% PASS
total/synth/PUSH31/p1 1.45 1.56 +7.6% PASS
total/synth/PUSH32/p0 1.84 1.97 +7.0% PASS
total/synth/PUSH32/p1 1.35 1.49 +9.9% PASS
total/synth/PUSH4/p0 0.03 0.03 +1.4% PASS
total/synth/PUSH4/p1 0.03 0.03 +1.3% PASS
total/synth/PUSH5/p0 0.03 0.03 +1.3% PASS
total/synth/PUSH5/p1 0.03 0.03 -1.2% PASS
total/synth/PUSH6/p0 0.03 0.03 +1.2% PASS
total/synth/PUSH6/p1 0.03 0.03 +1.3% PASS
total/synth/PUSH7/p0 0.03 0.03 -0.5% PASS
total/synth/PUSH7/p1 0.03 0.03 +1.1% PASS
total/synth/PUSH8/p0 0.04 0.04 +1.4% PASS
total/synth/PUSH8/p1 0.04 0.04 +0.9% PASS
total/synth/PUSH9/p0 0.04 0.04 +0.7% PASS
total/synth/PUSH9/p1 0.04 0.04 +0.8% PASS
total/synth/RETURNDATASIZE/a0 0.65 0.73 +12.6% PASS
total/synth/RETURNDATASIZE/a1 0.68 0.76 +12.2% PASS
total/synth/SAR/b0 3.55 3.65 +2.7% PASS
total/synth/SAR/b1 3.99 3.96 -0.7% PASS
total/synth/SGT/b0 0.02 0.02 +2.0% PASS
total/synth/SGT/b1 0.02 0.02 +6.0% PASS
total/synth/SHL/b0 3.94 3.93 -0.3% PASS
total/synth/SHL/b1 2.62 2.68 +2.4% PASS
total/synth/SHR/b0 3.13 3.12 -0.5% PASS
total/synth/SHR/b1 2.71 2.65 -2.3% PASS
total/synth/SIGNEXTEND/b0 2.42 2.49 +2.7% PASS
total/synth/SIGNEXTEND/b1 2.60 2.80 +7.4% PASS
total/synth/SLT/b0 0.02 0.02 +1.9% PASS
total/synth/SLT/b1 0.02 0.02 +6.5% PASS
total/synth/SUB/b0 0.02 0.02 +1.9% PASS
total/synth/SUB/b1 0.02 0.02 +5.8% PASS
total/synth/SWAP1/s0 0.01 0.02 +2.7% PASS
total/synth/SWAP10/s0 0.01 0.02 +3.8% PASS
total/synth/SWAP11/s0 0.02 0.02 +2.7% PASS
total/synth/SWAP12/s0 0.02 0.02 +2.6% PASS
total/synth/SWAP13/s0 0.02 0.02 +3.0% PASS
total/synth/SWAP14/s0 0.02 0.02 +2.3% PASS
total/synth/SWAP15/s0 0.02 0.02 +2.8% PASS
total/synth/SWAP16/s0 0.02 0.02 +2.8% PASS
total/synth/SWAP2/s0 0.02 0.02 +2.3% PASS
total/synth/SWAP3/s0 0.02 0.02 +2.5% PASS
total/synth/SWAP4/s0 0.02 0.02 +2.7% PASS
total/synth/SWAP5/s0 0.01 0.02 +2.6% PASS
total/synth/SWAP6/s0 0.01 0.02 +2.8% PASS
total/synth/SWAP7/s0 0.02 0.02 +2.4% PASS
total/synth/SWAP8/s0 0.02 0.02 +3.1% PASS
total/synth/SWAP9/s0 0.02 0.02 +2.3% PASS
total/synth/XOR/b0 0.02 0.02 +1.8% PASS
total/synth/XOR/b1 0.02 0.02 +7.1% PASS
total/synth/loop_v1 1.68 2.01 +20.1% PASS
total/synth/loop_v2 1.60 1.95 +22.2% PASS

Summary: 194 benchmarks, 0 regressions


Contributor

Copilot AI left a comment

Pull request overview

Adds infrastructure to generate and run EVM opcode microbenchmarks (and expands Solidity-based benchmark/test fixtures) to better track opcode-level performance regressions against libevmone, with CI wiring to execute the expanded benchmark suite.

Changes:

  • Add opcode microbenchmark generator (tools/generate_opcode_benchmarks.py) and broaden performance benchmark filtering defaults.
  • Extend Solidity test case support to include typed ABI arguments (args) and add multiple new Solidity benchmark categories (DeFi, ERC20, NFT, DAO, Layer2).
  • Update benchmark/CI execution paths (new contract benchmark harness, CI script updates, interpreter/JIT behavior tweaks).

Reviewed changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 8 comments.

File Description
tools/solc_batch_compile.sh Compile all .sol files in each Solidity test directory into a single combined JSON artifact.
tools/generate_opcode_benchmarks.py New Python generator producing state-test JSONs for opcode microbenchmarks.
tools/check_performance_regression.py Adjust default Google Benchmark filter pattern for evmone-bench runs.
tests/evm_solidity/nft/test_cases.json New NFT Solidity test/benchmark cases configuration.
tests/evm_solidity/nft/OnChainMetadataNFT.sol New on-chain metadata NFT contract fixture.
tests/evm_solidity/nft/NFTWrapper.sol New wrapper contract to exercise NFT flows.
tests/evm_solidity/nft/ERC721Enumerable.sol New enumerable ERC721-like fixture for benchmarks/tests.
tests/evm_solidity/layer2/test_cases.json New Layer2 Solidity test/benchmark cases configuration.
tests/evm_solidity/layer2/RollupState.sol New rollup-state contract fixture.
tests/evm_solidity/layer2/MerkleProofVerifier.sol New Merkle proof verifier fixture.
tests/evm_solidity/layer2/Layer2Wrapper.sol New wrapper contract to exercise Layer2 flows.
tests/evm_solidity/erc20_bench/test_cases.json New ERC20 benchmark cases, including typed args setup calls.
tests/evm_solidity/erc20_bench/PausableBurnableERC20.sol New pausable/burnable ERC20-like fixture.
tests/evm_solidity/erc20_bench/FeeOnTransferERC20.sol New fee-on-transfer ERC20-like fixture.
tests/evm_solidity/erc20_bench/ERC20BenchWrapper.sol New wrapper contract to exercise ERC20 flows.
tests/evm_solidity/defi/test_cases.json New DeFi Solidity test/benchmark cases configuration.
tests/evm_solidity/defi/SimpleDEX.sol New simple DEX contract fixture.
tests/evm_solidity/defi/LendingPool.sol New lending pool fixture.
tests/evm_solidity/defi/DeFiWrapper.sol New wrapper contract to exercise DeFi flows.
tests/evm_solidity/dao/test_cases.json New DAO Solidity test/benchmark cases configuration.
tests/evm_solidity/dao/SimpleGovernor.sol New governor fixture contract.
tests/evm_solidity/dao/MultiSigWallet.sol New multisig wallet fixture contract.
tests/evm_solidity/dao/DAOWrapper.sol New wrapper contract to exercise DAO flows.
src/tests/solidity_test_helpers.h Extend SolidityTestCase with typed ABI Args.
src/tests/solidity_test_helpers.cpp Parse args from JSON and adjust calldata derivation behavior.
src/tests/solidity_contract_tests.cpp Generate calldata from function + args and pre-resolve constructor addresses.
src/tests/evm_fallback_execution_tests.cpp Update expected status code for a fallback-related test.
src/evm/interpreter.cpp Remove undefined-opcode pre-check in one dispatch path (status behavior changes).
src/evm/evm_cache.cpp Adjust gas chunk cost source used during cache build.
src/compiler/evm_frontend/evm_mir_compiler.cpp Modify MLOAD lowering (remove pinning) and adjust gas reload for REVERT path.
src/compiler/evm_frontend/evm_imported.cpp Only preserve return data on REVERT for CREATE; clear otherwise.
src/action/evm_bytecode_visitor.h Adjust metering behavior for consecutive JUMPDEST runs.
benchmarks/evm_contract_benchmark.cpp New Google Benchmark harness to run Solidity contract benchmarks via EVMC.
.github/workflows/dtvm_evm_test_x86.yml Minor workflow formatting change.
.ci/run_test_suite.sh Add JIT logging CMake opts in some modes; update benchmark invocation to include opcode benchmark dir.

Comment on lines +143 to +153
def build_ternary_op(opcode: str, iterations: int) -> str:
    """
    Template for opcodes that take 3 inputs and push 1 output (e.g. ADDMOD, MULMOD).
    Setup: PUSH1 0x07 (Modulus), PUSH1 0x01 (Operand), PUSH1 0x01 (Accumulator).
    Loop: DUP3 DUP3 <OP>. Duplicates the modulus and operand, applying <OP> on
    (modulus, operand, accumulator). The result becomes the new accumulator.
    """
    setup = OP_PUSH1 + "07" + OP_PUSH1 + "01" + OP_PUSH1 + "01"
    loop_body = OP_DUP3 + OP_DUP3 + opcode
    end = OP_PUSH1 + "00" + OP_MSTORE + OP_PUSH1 + "20" + OP_PUSH1 + "00" + OP_RETURN
    return setup + (loop_body * iterations) + end

Copilot AI Mar 5, 2026

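build_ternary_op() does not compute what its docstring describes. The loop body is stack-neutral (DUP3 DUP3 pushes two items, the opcode pops three and pushes one), so the stack stays three items deep, but the operands land in the wrong positions: after DUP3 DUP3 the stack is [modulus, operand, accumulator, modulus, operand] (bottom to top), so the opcode receives (operand, modulus, accumulator) and the accumulator is used as the modulus. With the initial accumulator of 1 the first result is 0, and every later iteration runs with a modulus of 0, for which ADDMOD and MULMOD are defined to return 0. The loop body needs to rearrange the stack so that the constant operand and modulus stay in place and the accumulator is passed as the first or second argument. As written, the generated ADDMOD/MULMOD benchmarks exercise only the zero-modulus special case for --iterations > 1 (including the default 10000).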
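Stack-effect claims about a loop body like this one can be checked mechanically. The sketch below is a symbolic trace under standard EVM semantics (DUPn copies the n-th item from the top; ADDMOD and MULMOD pop a, b, N with a on top). The helper name run_symbolic and the string labels are assumptions for illustration and are not part of the PR.

```python
def run_symbolic(stack, program):
    """Apply a tiny symbolic subset of EVM opcodes to `stack` (bottom -> top)."""
    stack = list(stack)
    for op in program:
        if op.startswith("DUP"):
            n = int(op[3:])
            if len(stack) < n:
                raise IndexError(f"stack underflow at {op}")
            stack.append(stack[-n])  # DUPn copies the n-th item from the top
        elif op in ("ADDMOD", "MULMOD"):
            if len(stack) < 3:
                raise IndexError(f"stack underflow at {op}")
            a, b, modulus = stack.pop(), stack.pop(), stack.pop()
            # Record which symbolic value ended up in which argument slot.
            stack.append(f"{op}({a}, {b}, {modulus})")
        else:
            raise ValueError(f"unsupported opcode {op}")
    return stack


if __name__ == "__main__":
    start = ["modulus", "operand", "acc"]
    after = run_symbolic(start, ["DUP3", "DUP3", "ADDMOD"])
    print(len(after), after[-1])
    # prints: 3 ADDMOD(operand, modulus, acc)
```

Under these assumptions the trace ends with three items on the stack, and the value labelled acc sits in the third (modulus) argument position.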

Copilot uses AI. Check for mistakes.
Comment on lines +170 to +181
def build_memory_op_mstore(iterations: int) -> str:
    """
    Template for MSTORE.
    Setup: PUSH1 0x00 (Address), PUSH1 0x01 (Value).
    Loop: DUP2 DUP2 MSTORE (duplicate addr and value, then store).
    Note: MSTORE doesn't produce an output on stack, so we just keep DUPing.
    Actually, DUP2 DUP2 MSTORE consumes 2 items and pushes 0, so DUP2 DUP2 perfectly offsets it.
    To create a data dependency, we can increment the value: DUP2 DUP2 MSTORE PUSH1 0x01 ADD.
    """
    setup = OP_PUSH1 + "00" + OP_PUSH1 + "00"  # Addr, Value
    loop_body = OP_DUP2 + OP_DUP2 + OP_MSTORE + OP_PUSH1 + "01" + OP_ADD
    end = OP_PUSH1 + "00" + OP_MSTORE + OP_PUSH1 + "20" + OP_PUSH1 + "00" + OP_RETURN

Copilot AI Mar 5, 2026

build_memory_op_mstore()'s setup/comment are inconsistent and the initial stack order looks wrong for MSTORE. The docstring says value starts as 0x01, but the code pushes 00 twice, and for MSTORE the stack convention is [..., value, offset] (e.g. PUSH1 0x42; PUSH1 0x00; MSTORE). Please align the pushed constants with the intended (value, address) order and update the comment accordingly so the generated microbenchmark is doing the intended store pattern.
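The operand order can be made concrete with a small symbolic trace. This is a sketch under standard EVM semantics (MSTORE pops the offset from the top of the stack, then the value); the helper trace_writes and the string labels are hypothetical and only model PUSH1, DUPn and MSTORE.

```python
def trace_writes(program):
    """Symbolically run PUSH1/DUPn/MSTORE and return the memory writes performed."""
    stack, writes = [], []
    for instr in program:
        op, _, arg = instr.partition(" ")
        if op == "PUSH1":
            stack.append(arg)                  # push a symbolic label
        elif op.startswith("DUP"):
            stack.append(stack[-int(op[3:])])  # DUPn copies the n-th item from the top
        elif op == "MSTORE":
            offset = stack.pop()               # top of stack: memory offset
            value = stack.pop()                # next item: 32-byte word to store
            writes.append({"offset": offset, "value": value})
        else:
            raise ValueError(f"unsupported instruction {instr!r}")
    return writes


if __name__ == "__main__":
    # Conventional order: push the value first, then the offset.
    print(trace_writes(["PUSH1 value", "PUSH1 offset", "MSTORE"]))
    # Order used by the setup and loop body quoted above.
    print(trace_writes(["PUSH1 addr", "PUSH1 value", "DUP2", "DUP2", "MSTORE"]))
```

In the second trace the item labelled value ends up as the MSTORE offset and the item labelled addr as the stored word, which is the swap the comment describes.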

Comment on lines +147 to +152
// Currently, SolidityTestCase does not expose typed arguments; rely on raw calldata.
(void)Addrs; // unused until dynamic argument encoding is wired through SolidityTestCase
if (!Tc.Calldata.empty()) {
auto Opt = fromHex(Tc.Calldata);
if (Opt) return *Opt;
}

Copilot AI Mar 5, 2026

This benchmark harness currently ignores SolidityTestCase.Args / Function and only uses raw Calldata. With the new args support in test_cases.json (e.g., the setup_* ERC20 funding calls), Tc.Calldata will be empty and these setup calls will be silently skipped, changing benchmark behavior. BuildCalldata() should mirror the logic in src/tests/solidity_contract_tests.cpp by generating selector + ABI-encoded args when Tc.Args is non-empty.

Suggested change
-   // Currently, SolidityTestCase does not expose typed arguments; rely on raw calldata.
-   (void)Addrs; // unused until dynamic argument encoding is wired through SolidityTestCase
-   if (!Tc.Calldata.empty()) {
-     auto Opt = fromHex(Tc.Calldata);
-     if (Opt) return *Opt;
-   }
+   // Prefer raw calldata when provided.
+   (void)Addrs; // reserved for future use (e.g., resolving address arguments)
+   if (!Tc.Calldata.empty()) {
+     auto Opt = fromHex(Tc.Calldata);
+     if (Opt) return *Opt;
+   }
+   // If no raw calldata is provided but a function selector and ABI-encoded args are,
+   // build calldata as: selector || encoded_args.
+   if (!Tc.Function.empty() && !Tc.Args.empty()) {
+     std::string CalldataHex;
+     CalldataHex.reserve(Tc.Function.size() +
+                         std::accumulate(Tc.Args.begin(), Tc.Args.end(), std::size_t{0},
+                                         [](std::size_t Acc, const auto& Arg) {
+                                           return Acc + Arg.size();
+                                         }));
+     CalldataHex += Tc.Function;
+     for (const auto& Arg : Tc.Args) {
+       CalldataHex += Arg;
+     }
+     auto Opt = fromHex(CalldataHex);
+     if (Opt) return *Opt;
+   }
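For reference, the "selector || encoded_args" layout can be sketched in a few lines. This is an illustration of the ABI layout for static types only, not the logic in src/tests/solidity_contract_tests.cpp: the helper names are hypothetical, only uint256 and address are handled, and the 4-byte selector is passed in precomputed because Python's hashlib provides SHA3-256 rather than the Keccak-256 variant used for selectors.

```python
def encode_static_arg(arg_type: str, value) -> bytes:
    """ABI-encode one static argument as a 32-byte big-endian word."""
    if arg_type == "uint256":
        return int(value).to_bytes(32, "big")
    if arg_type == "address":
        hex_part = value[2:] if value.startswith("0x") else value
        raw = bytes.fromhex(hex_part)
        if len(raw) != 20:
            raise ValueError("address must be 20 bytes")
        return raw.rjust(32, b"\x00")  # left-pad the 20-byte address to 32 bytes
    raise ValueError(f"unsupported type {arg_type!r}")


def build_calldata(selector_hex: str, args) -> bytes:
    """Concatenate a 4-byte selector with the encoded (type, value) arguments."""
    selector = bytes.fromhex(selector_hex)
    if len(selector) != 4:
        raise ValueError("selector must be 4 bytes")
    return selector + b"".join(encode_static_arg(t, v) for t, v in args)


if __name__ == "__main__":
    # a9059cbb is the selector of transfer(address,uint256).
    data = build_calldata(
        "a9059cbb",
        [("address", "0x" + "11" * 20), ("uint256", 1000)],
    )
    print(len(data), data[:4].hex())
    # prints: 68 a9059cbb
```

Dynamic types (strings, bytes, arrays) need offset words and a tail section and are deliberately left out of this sketch.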

Comment on lines +147 to +148
// Currently, SolidityTestCase does not expose typed arguments; rely on raw calldata.
(void)Addrs; // unused until dynamic argument encoding is wired through SolidityTestCase

Copilot AI Mar 5, 2026

The comment in BuildCalldata() says SolidityTestCase does not expose typed arguments, but this PR adds SolidityTestCase::Args and the rest of the codebase now uses it to build calldata. Please update/remove this comment to avoid misleading future changes (and once args are supported here, the (void)Addrs workaround should no longer be needed).

Suggested change
-   // Currently, SolidityTestCase does not expose typed arguments; rely on raw calldata.
-   (void)Addrs; // unused until dynamic argument encoding is wired through SolidityTestCase
+   // Currently this helper relies on raw calldata when provided.
+   if (Addrs.size() == static_cast<size_t>(-1)) {
+     // This block is never executed; Addrs is intentionally unused here for now.
+   }

Comment on lines 766 to 769
  Byte OpcodeByte = Code[Frame->Pc];
  evmc_opcode Op = static_cast<evmc_opcode>(OpcodeByte);
  const uint8_t OpcodeU8 = static_cast<uint8_t>(OpcodeByte);

- if (NamesTable[OpcodeU8] == NULL) {
-   Context.setStatus(EVMC_UNDEFINED_INSTRUCTION);
-   if (handleExecutionStatus(Frame, Context)) {
-     return;
-   }
-   break;
- }
-
  switch (Op) {

Copilot AI Mar 5, 2026

After removing the NamesTable check, undefined opcodes now fall through the default case and set EVMC_INVALID_INSTRUCTION (unless they match PUSH/DUP/SWAP/CALL/CREATE ranges). This changes EVMC semantics: EVMC_INVALID_INSTRUCTION is meant for the INVALID (0xFE) opcode, while unknown opcodes like 0xEE should typically return EVMC_UNDEFINED_INSTRUCTION. Consider restoring an undefined-opcode check in this dispatch path (similar to the gas-chunk loop) so status codes are consistent and spec-aligned.
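The distinction between the two status codes can be illustrated with a small classifier. This is a sketch of the intended mapping only: the NAMES_TABLE below is a hypothetical, deliberately tiny stand-in for the interpreter's opcode table, and the enum values are just the EVMC constant names as strings.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "EVMC_SUCCESS"
    INVALID_INSTRUCTION = "EVMC_INVALID_INSTRUCTION"
    UNDEFINED_INSTRUCTION = "EVMC_UNDEFINED_INSTRUCTION"


# Hypothetical, incomplete table of defined opcodes (byte -> mnemonic).
NAMES_TABLE = {0x00: "STOP", 0x01: "ADD", 0x60: "PUSH1", 0xFE: "INVALID"}


def classify(opcode_byte: int) -> Status:
    """Return the status an interpreter would report for a single opcode byte."""
    if opcode_byte not in NAMES_TABLE:
        # Bytes with no assigned opcode are undefined instructions.
        return Status.UNDEFINED_INSTRUCTION
    if opcode_byte == 0xFE:
        # The designated INVALID opcode is defined, and always aborts.
        return Status.INVALID_INSTRUCTION
    return Status.SUCCESS  # a real interpreter would dispatch the opcode here


if __name__ == "__main__":
    print(classify(0xEE).value)  # prints EVMC_UNDEFINED_INSTRUCTION
    print(classify(0xFE).value)  # prints EVMC_INVALID_INSTRUCTION
```

The undefined-opcode test has to run before the dispatch, otherwise both cases collapse into whatever the default branch reports.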

@@ -2713,18 +2713,6 @@ EVMMirBuilder::handleMLoad(Operand AddrComponents) {

Operand Bytes32Op(MemPtr, EVMType::BYTES32);
Operand Result = convertBytes32ToU256Operand(Bytes32Op);

Copilot AI Mar 5, 2026

handleMLoad() removed the protectUnsafeValue() pinning of the loaded 64-bit parts. Without this, backend optimization can potentially reorder or CSE the underlying loads across later memory-writing opcodes/calls, changing EVM semantics (observing newer memory contents than the MLOAD point-in-time). If that reordering risk still exists (as the removed comment described), the loads should remain protected (or be made explicitly non-movable/volatile) to preserve correctness.

Suggested change
-   Operand Result = convertBytes32ToU256Operand(Bytes32Op);
+   Operand Result = convertBytes32ToU256Operand(Bytes32Op);
+   // Protect the loaded value from being reordered or CSE'd across later
+   // memory-writing opcodes/calls to preserve EVM MLOAD semantics.
+   protectUnsafeValue(Result);

Comment on lines 231 to +237
  echo "Saving performance baseline..."
  python3 check_performance_regression.py \
-     --save-baseline "$BENCHMARK_SAVE_BASELINE" \
+     --save-baseline "$ ENCHMARK_SAVE_BASELINE" \
      --output-summary "$BENCHMARK_SUMMARY_FILE" \
      --lib ./libdtvmapi.so \
      --mode "$BENCHMARK_MODE" \
-     --benchmark-dir test/evm-benchmarks/benchmarks
+     --benchmark-dir test/evm-benchmarks/benchmarks $OPCODE_BENCH_DIR

Copilot AI Mar 5, 2026

The --save-baseline argument uses a garbled variable reference ("$ ENCHMARK_SAVE_BASELINE"). In bash, a $ followed by a space is not a parameter expansion, so the script receives the literal string "$ ENCHMARK_SAVE_BASELINE" and the baseline-save path writes to a file with that name instead of the intended location. Restore the reference to "$BENCHMARK_SAVE_BASELINE", with no embedded space, so the argument value is correct.

Comment on lines 202 to 209
  python3 check_performance_regression.py \
      --baseline "$BASELINE_CACHE" \
      --threshold "$BENCHMARK_THRESHOLD" \
      --output-summary "$BENCHMARK_SUMMARY_FILE" \
      --lib ./libdtvmapi.so \
      --mode "$BENCHMARK_MODE" \
-     --benchmark-dir test/evm-benchmarks/benchmarks
+     --benchmark-dir test/evm-benchmarks/benchmarks $OPCODE_BENCH_DIR
  elif [ -n "$BENCHMARK_BASELINE_LIB" ]; then

Copilot AI Mar 5, 2026

check_performance_regression.py currently defines --benchmark-dir as a single string argument. This invocation appends $OPCODE_BENCH_DIR as an extra positional argument, which will be treated as an unknown CLI argument if it expands to a path (and will make the script exit with a usage error). Either update check_performance_regression.py to accept multiple benchmark directories (e.g., action='append' / nargs='+') and loop over them in run_benchmark(), or change $OPCODE_BENCH_DIR to include its own --benchmark-dir ... flag when set.
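The nargs='+' option mentioned above could look like the following sketch. It models only the --benchmark-dir flag; the function name parse_args, the dest name benchmark_dirs and the help text are assumptions, and the script's other options are omitted.

```python
import argparse


def parse_args(argv=None):
    """Parse only the --benchmark-dir option; all other flags are omitted here."""
    parser = argparse.ArgumentParser(description="benchmark directory parsing sketch")
    parser.add_argument(
        "--benchmark-dir",
        dest="benchmark_dirs",
        nargs="+",  # one flag followed by one or more directory values
        default=["test/evm-benchmarks/benchmarks"],
        help="one or more directories containing benchmark cases",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(
        ["--benchmark-dir", "test/evm-benchmarks/benchmarks", "test/opcode-benchmarks"]
    )
    for directory in args.benchmark_dirs:
        print(f"would run benchmarks in {directory}")
```

With this shape the shell invocation can keep passing $OPCODE_BENCH_DIR as a second value after the flag, and an empty variable simply leaves a single directory in the list.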
