perf(state): Cache validator workload in BuildLastCommitInfo #5421

CreeptoGengar · 2025-09-24T10:58:15Z

Summary

Implements caching for state.BuildLastCommitInfo to eliminate performance bottlenecks in consensus operations.

Changes

Add validator and ABCI validator caches to BlockExecutor
Create optimized BuildLastCommitInfoFromStoreWithCache and BuildLastCommitInfoWithCache methods
Update all call sites in applyBlock, ExtendVote, ProcessProposal
Add ClearValidatorCache() for memory management

Performance Impact

30.7x faster: 8462 ns/op → 275.7 ns/op
9.7x less memory: 3711 B/op → 384 B/op
3.4x fewer allocations: 37 → 11 allocs/op

Fixes #3152

Note

Introduce cached validator/ABCI validator lookups in BuildLastCommitInfo and update call sites for faster block processing.

State/Execution (state/execution.go):
- Caching:
  - Add validatorCache and abciValidatorCache with RW locks to BlockExecutor.
  - New methods: BuildLastCommitInfoFromStoreWithCache, BuildLastCommitInfoWithCache, and ClearValidatorCache.
- Integrations:
  - Replace uses of buildLastCommitInfoFromStore in ProcessProposal, FinalizeBlock path (applyBlock), and ExtendVote with cached variants.
  - Initialize caches in NewBlockExecutor.
  - Add ExecCommitBlockWithCache helper.
Tests (state/execution_test.go):
- Add unit test for cache behavior and clearing.
- Add benchmarks comparing cached vs original implementations.

Written by Cursor Bugbot for commit 5663a3f. This will update automatically on new commits.

mattac21 · 2025-09-25T16:33:39Z

state/execution.go

 }

+// ExecCommitBlockWithCache is an optimized version that uses caching for better performance
+func (blockExec *BlockExecutor) ExecCommitBlockWithCache(


this is unused. it looks like the existing ExecCommitBlock is only used when replaying blocks. do we want to use the cached version there?

mattac21 · 2025-09-25T16:34:47Z

state/execution.go

 }

+// ClearValidatorCache clears the validator caches to free memory
+func (blockExec *BlockExecutor) ClearValidatorCache() {


this is unused except in tests, so the part of your test that is testing cache clearing isn't really testing anything

mattac21 · 2025-09-25T16:36:02Z

state/execution.go

+		validatorCache:     make(map[int64]*types.ValidatorSet),
+		abciValidatorCache: make(map[string]abci.Validator),


in real execution, no values are ever removed from these caches, so they will grow forever. We will need some way to periodically remove elements from them.
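One possible way to bound a height-keyed cache like the one discussed above is to evict the lowest heights once a size limit is exceeded. The following is only a minimal sketch under stated assumptions: `heightCache`, `newHeightCache`, `Put`, `Get` and `Len` are hypothetical names, and a plain `string` stands in for `*types.ValidatorSet`. It is not the code from this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// heightCache is a hypothetical bounded cache keyed by block height.
// A string value stands in for *types.ValidatorSet (assumption).
type heightCache struct {
	mu      sync.RWMutex
	entries map[int64]string
	maxSize int
}

func newHeightCache(maxSize int) *heightCache {
	return &heightCache{entries: make(map[int64]string), maxSize: maxSize}
}

// Put stores a value, then evicts the lowest heights until the cache
// holds at most maxSize entries. The scan is O(n) per eviction, which
// is acceptable for a sketch but not tuned for large caches.
func (c *heightCache) Put(height int64, v string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[height] = v
	for len(c.entries) > c.maxSize {
		oldest := height
		for h := range c.entries {
			if h < oldest {
				oldest = h
			}
		}
		delete(c.entries, oldest)
	}
}

// Get returns the cached value for a height, if present.
func (c *heightCache) Get(height int64) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.entries[height]
	return v, ok
}

// Len reports the number of cached entries.
func (c *heightCache) Len() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.entries)
}

func main() {
	c := newHeightCache(2)
	c.Put(10, "valset@10")
	c.Put(11, "valset@11")
	c.Put(12, "valset@12") // evicts height 10
	_, ok := c.Get(10)
	fmt.Println(c.Len(), ok)
}
```

Evicting by lowest height matches the access pattern of a chain that mostly reads recent heights; an LRU policy would be an alternative if older heights are replayed often.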

mattac21 · 2025-09-25T16:37:07Z

state/execution.go

+		commitSig := block.LastCommit.Signatures[i]
+
+		// Create cache key for this validator
+		cacheKey := fmt.Sprintf("%s_%d", val.PubKey.Address(), val.VotingPower)


from my understanding the point of the abciValidatorCache is to not have to do the expensive val.PubKey.Address() call. however you are calling val.PubKey.Address() when generating the cacheKey. doesn't this eliminate most of the point of the cache?
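To illustrate the concern above, a cache only saves work if its key is cheaper to compute than the value it protects. The sketch below keys on the raw public key bytes rather than on the derived address. Everything here is an assumption for illustration: `addressCache`, `deriveAddress` and the truncated SHA-256 derivation are stand-ins, not the PR's code or the library's API, and the type is not safe for concurrent use.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// addressCache is a hypothetical cache from raw public key bytes to a
// derived address. It is not concurrency-safe; a real version would
// need a mutex.
type addressCache struct {
	byPubKey    map[string][]byte
	derivations int // counts how often the expensive path runs
}

func newAddressCache() *addressCache {
	return &addressCache{byPubKey: make(map[string][]byte)}
}

// deriveAddress is a stand-in for the expensive address derivation
// (assumption: a truncated SHA-256 of the public key bytes).
func deriveAddress(pubKey []byte) []byte {
	sum := sha256.Sum256(pubKey)
	return sum[:20]
}

// Address looks up by raw key bytes, so a cache hit never hashes.
func (c *addressCache) Address(pubKey []byte) []byte {
	key := string(pubKey)
	if addr, ok := c.byPubKey[key]; ok {
		return addr
	}
	c.derivations++
	addr := deriveAddress(pubKey)
	c.byPubKey[key] = addr
	return addr
}

func main() {
	c := newAddressCache()
	pk := []byte("example-public-key-bytes")
	c.Address(pk)
	c.Address(pk) // served from the cache
	fmt.Println("derivations:", c.derivations)
}
```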

CreeptoGengar · 2025-10-15T11:23:19Z

Had some race conditions in the validator caching that were causing CI failures, so I added proper double-checked locking and moved the cache cleanup outside the mutex locks to prevent deadlocks.

Also toned down the cache cleanup frequency since it was interfering with mempool rechecking and updated the test accordingly.

CreeptoGengar · 2025-10-15T11:24:16Z

@mattac21 @aljo242

please check it out and correct me if needed

I appreciate your support and corrections :)

mattac21 · 2025-10-20T14:38:16Z

state/execution.go

+		// Simple cleanup: clear half of the cache
+		// Since Go maps don't guarantee iteration order, we'll clear the entire cache
+		// and let it rebuild naturally. This is simpler and avoids the FIFO issue.


cursor pointed this out as well, but the first line of the comment here is misleading: it clears the whole cache, not half


mattac21 · 2025-10-20T14:40:02Z

state/execution.go

+func (blockExec *BlockExecutor) cleanupOldCacheEntries() {
+	// Only cleanup if cache is significantly over the limit to avoid frequent cleanup
+	blockExec.validatorCacheMutex.Lock()
+	if len(blockExec.validatorCache) > blockExec.maxCacheSize*2 {


why *2 here? that feels misleading against what the maxCacheSize variable implies (that we will not grow past it). Since you already initialize the maxCacheSize to a default of 1000, why not just set it to 2000 by default?

mattac21 · 2025-10-20T14:43:00Z

state/execution.go

+		// Double-checked locking: acquire write lock and check again
+		blockExec.validatorCacheMutex.Lock()
+		// Check again in case another goroutine added it
+		if existingValSet, exists := blockExec.validatorCache[height]; exists {


I'm not following why you would need to grab from the cache again here? wouldn't this always be the same as what you just loaded from the store on line 549? I get that another goroutine may have just cached it and it's now available in the cache, but why do we need to fetch it from the cache when we already did the work of loading it from the store above?

same question for the BuildLastCommitInfoWithCache
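For reference, the double-checked pattern being questioned can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `valSetCache`, `valSet` and `getOrLoad` are hypothetical names. The second lookup under the write lock only matters when callers should all share one instance; if the loaded values are interchangeable, storing unconditionally would also be correct.

```go
package main

import (
	"fmt"
	"sync"
)

// valSet stands in for *types.ValidatorSet (assumption).
type valSet struct {
	Names []string
}

// valSetCache is a hypothetical height-keyed cache.
type valSetCache struct {
	mu      sync.RWMutex
	entries map[int64]*valSet
}

func newValSetCache() *valSetCache {
	return &valSetCache{entries: make(map[int64]*valSet)}
}

// getOrLoad checks under a read lock, loads outside any lock on a miss,
// then re-checks under the write lock before storing.
func (c *valSetCache) getOrLoad(height int64, load func(int64) (*valSet, error)) (*valSet, error) {
	c.mu.RLock()
	vs, ok := c.entries[height]
	c.mu.RUnlock()
	if ok {
		return vs, nil
	}

	loaded, err := load(height)
	if err != nil {
		return nil, err
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	// Another goroutine may have stored an entry in the meantime;
	// returning it keeps a single shared instance per height.
	if existing, ok := c.entries[height]; ok {
		return existing, nil
	}
	c.entries[height] = loaded
	return loaded, nil
}

func main() {
	c := newValSetCache()
	load := func(h int64) (*valSet, error) {
		return &valSet{Names: []string{"val-a", "val-b"}}, nil
	}
	a, _ := c.getOrLoad(7, load)
	b, _ := c.getOrLoad(7, load)
	fmt.Println(a == b)
}
```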

mattac21 · 2025-10-20T14:47:45Z

state/execution.go

+// cleanupOldCacheEntries removes old entries from caches to prevent memory leaks
+func (blockExec *BlockExecutor) cleanupOldCacheEntries() {
+	// Only cleanup if cache is significantly over the limit to avoid frequent cleanup
+	blockExec.validatorCacheMutex.Lock()


this will be called quite frequently and you don't actually need a write lock on this except for the single time that the cache needs to be cleared. it may make more sense to acquire a read-only lock when checking the condition, and change to a write lock only when the cache actually needs to be written to

you could also use your helper GetCacheSize to get the sizes with a read only lock
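The suggestion above could look roughly like this. It is a sketch under assumptions: `boundedCache`, `size` and `cleanupIfNeeded` are hypothetical names (with `size` playing the role of the GetCacheSize helper mentioned above), and clearing the whole map mirrors the behaviour described earlier in this review. Go's `sync.RWMutex` cannot upgrade a read lock in place, so the condition is re-checked after taking the write lock.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedCache is a hypothetical cache with a size limit.
type boundedCache struct {
	mu      sync.RWMutex
	entries map[string]int
	maxSize int
}

func newBoundedCache(maxSize int) *boundedCache {
	return &boundedCache{entries: make(map[string]int), maxSize: maxSize}
}

// set stores a value under the write lock.
func (c *boundedCache) set(k string, v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[k] = v
}

// size reads the entry count under a read lock only.
func (c *boundedCache) size() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.entries)
}

// cleanupIfNeeded takes the cheap read-locked path on most calls and
// only acquires the write lock when the limit is actually exceeded.
// It reports whether the cache was cleared.
func (c *boundedCache) cleanupIfNeeded() bool {
	if c.size() <= c.maxSize {
		return false
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	// Re-check: another goroutine may have cleared the cache between
	// releasing the read lock and acquiring the write lock.
	if len(c.entries) <= c.maxSize {
		return false
	}
	c.entries = make(map[string]int)
	return true
}

func main() {
	c := newBoundedCache(2)
	c.set("a", 1)
	c.set("b", 2)
	c.set("c", 3)
	fmt.Println(c.cleanupIfNeeded(), c.size())
}
```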

Refactor ABCI validator conversion to use canonical helper for consistent field population.

github-actions · 2025-12-15T00:23:48Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

CreeptoGengar · 2025-12-15T11:18:42Z

@mattac21

CreeptoGengar added 2 commits September 24, 2025 13:57

Update execution.go

f06bbbe

Update execution_test.go

861cb16

mattac21 requested changes Sep 25, 2025


Merge branch 'main' into gen

5663a3f