Conversation

@CreeptoGengar (Contributor) commented Sep 24, 2025

Summary

Implements caching for state.BuildLastCommitInfo to eliminate performance bottlenecks in consensus operations.

Changes

  • Add validator and ABCI validator caches to BlockExecutor
  • Create optimized BuildLastCommitInfoFromStoreWithCache and BuildLastCommitInfoWithCache methods
  • Update all call sites in applyBlock, ExtendVote, ProcessProposal
  • Add ClearValidatorCache() for memory management

Performance Impact

  • 30.7x faster: 8462 ns/op → 275.7 ns/op
  • 9.7x less memory: 3711 B/op → 384 B/op
  • 3.4x fewer allocations: 37 → 11 allocs/op
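For reference, figures in this form come from Go's built-in benchmark harness (go test -bench -benchmem reports the ns/op, B/op, and allocs/op columns). A minimal sketch of how the comparison might be driven; setupBenchFixtures is a hypothetical helper, and the exact signatures of the two functions are assumptions rather than the PR's actual test code:

package state

import "testing"

func BenchmarkBuildLastCommitInfoOriginal(b *testing.B) {
	// setupBenchFixtures (hypothetical) returns a wired-up BlockExecutor
	// plus a block and store populated with a validator set.
	_, block, store := setupBenchFixtures(b)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = buildLastCommitInfoFromStore(block, store, 1) // uncached path
	}
}

func BenchmarkBuildLastCommitInfoCached(b *testing.B) {
	blockExec, block, store := setupBenchFixtures(b)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = blockExec.BuildLastCommitInfoFromStoreWithCache(block, store, 1) // cached path
	}
}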

Fixes #3152


Note

Introduce cached validator/ABCI validator lookups in BuildLastCommitInfo and update call sites for faster block processing.

  • State/Execution (state/execution.go):
    • Caching:
      • Add validatorCache and abciValidatorCache with RW locks to BlockExecutor.
      • New methods: BuildLastCommitInfoFromStoreWithCache, BuildLastCommitInfoWithCache, and ClearValidatorCache.
    • Integrations:
      • Replace uses of buildLastCommitInfoFromStore in ProcessProposal, FinalizeBlock path (applyBlock), and ExtendVote with cached variants.
      • Initialize caches in NewBlockExecutor.
      • Add ExecCommitBlockWithCache helper.
  • Tests (state/execution_test.go):
    • Add unit test for cache behavior and clearing.
    • Add benchmarks comparing cached vs original implementations.

Written by Cursor Bugbot for commit 5663a3f.
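To make the note above concrete, here is a sketch of the cache shape it describes. Field names follow the descriptions in the note; the import paths assume CometBFT, and the exact types and locking granularity in the PR may differ:

package state

import (
	"sync"

	abci "github.com/cometbft/cometbft/abci/types"
	"github.com/cometbft/cometbft/types"
)

type BlockExecutor struct {
	// ... existing fields elided ...

	// validatorCache memoizes validator sets by height so repeated
	// BuildLastCommitInfo calls can skip the store lookup.
	validatorCache      map[int64]*types.ValidatorSet
	validatorCacheMutex sync.RWMutex

	// abciValidatorCache memoizes the validator -> abci.Validator
	// conversion, whose cost is dominated by PubKey.Address().
	abciValidatorCache      map[string]abci.Validator
	abciValidatorCacheMutex sync.RWMutex

	// maxCacheSize bounds the caches; see the eviction discussion below.
	maxCacheSize int
}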

}

// ExecCommitBlockWithCache is an optimized version that uses caching for better performance
func (blockExec *BlockExecutor) ExecCommitBlockWithCache(
Collaborator:

This is unused. It looks like the previous ExecCommitBlock is only used when replaying blocks; do we want to use the cached version there?

}

// ClearValidatorCache clears the validator caches to free memory
func (blockExec *BlockExecutor) ClearValidatorCache() {
Collaborator:

This is unused except in tests, so the part of your test that exercises cache clearing isn't really testing anything.

Comment on lines +88 to +89
validatorCache: make(map[int64]*types.ValidatorSet),
abciValidatorCache: make(map[string]abci.Validator),
Collaborator:

In real execution, no values are ever removed from these caches, so they will grow forever. We will need some way to periodically evict entries.
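(For illustration, one common fix, sketched against the height-keyed validatorCache; putValidatorSet is a hypothetical helper, not code from the PR:)

// putValidatorSet caches a validator set and evicts stale entries once the
// map exceeds maxCacheSize. Height makes a natural eviction key because old
// heights stop being queried as the chain advances.
func (blockExec *BlockExecutor) putValidatorSet(height int64, vals *types.ValidatorSet) {
	blockExec.validatorCacheMutex.Lock()
	defer blockExec.validatorCacheMutex.Unlock()

	blockExec.validatorCache[height] = vals
	if len(blockExec.validatorCache) <= blockExec.maxCacheSize {
		return
	}
	// Drop everything more than maxCacheSize heights behind the newest entry.
	cutoff := height - int64(blockExec.maxCacheSize)
	for h := range blockExec.validatorCache {
		if h < cutoff {
			delete(blockExec.validatorCache, h)
		}
	}
}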

commitSig := block.LastCommit.Signatures[i]

// Create cache key for this validator
cacheKey := fmt.Sprintf("%s_%d", val.PubKey.Address(), val.VotingPower)
Collaborator:

From my understanding, the point of the abciValidatorCache is to avoid the expensive val.PubKey.Address() call. However, you are calling val.PubKey.Address() to generate the cacheKey, so doesn't this eliminate most of the benefit of the cache?
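(For illustration, one way to keep the cache worthwhile is to key on the raw public key bytes and only derive the address on a miss; this sketch assumes PubKey exposes its bytes, as crypto.PubKey does in CometBFT:)

// Key on the pubkey bytes rather than the derived address, so the expensive
// Address() hash runs only when the entry is first built.
cacheKey := string(val.PubKey.Bytes())

blockExec.abciValidatorCacheMutex.RLock()
abciVal, ok := blockExec.abciValidatorCache[cacheKey]
blockExec.abciValidatorCacheMutex.RUnlock()
if !ok {
	abciVal = abci.Validator{
		Address: val.PubKey.Address(), // computed only on a miss
		Power:   val.VotingPower,
	}
	blockExec.abciValidatorCacheMutex.Lock()
	blockExec.abciValidatorCache[cacheKey] = abciVal
	blockExec.abciValidatorCacheMutex.Unlock()
}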

@CreeptoGengar requested a review from mattac21 on October 2, 2025.

@CreeptoGengar (Contributor Author) commented:

Had some race conditions in the validator caching that were causing CI failures, so I added proper double-checked locking and moved the cache cleanup outside the mutex locks to prevent deadlocks.

Also toned down the cache cleanup frequency since it was interfering with mempool rechecking and updated the test accordingly.

@CreeptoGengar (Contributor Author) commented Oct 15, 2025

@mattac21 @aljo242

Please check it out and correct me if needed.

I appreciate your support and corrections :)

Comment on lines 129 to 131
// Simple cleanup: clear half of the cache
// Since Go maps don't guarantee iteration order, we'll clear the entire cache
// and let it rebuild naturally. This is simpler and avoids the FIFO issue.
Collaborator:

Cursor pointed this out as well, but the first line of this comment is misleading: it clears the whole cache, not half.

Collaborator:

Same below.

func (blockExec *BlockExecutor) cleanupOldCacheEntries() {
	// Only cleanup if cache is significantly over the limit to avoid frequent cleanup
	blockExec.validatorCacheMutex.Lock()
	if len(blockExec.validatorCache) > blockExec.maxCacheSize*2 {
Collaborator:

Why *2 here? That feels misleading given what the maxCacheSize variable implies (that we will not grow past it). Since you already initialize maxCacheSize to a default of 1000, why not just set it to 2000 by default?

// Double-checked locking: acquire write lock and check again
blockExec.validatorCacheMutex.Lock()
// Check again in case another goroutine added it
if existingValSet, exists := blockExec.validatorCache[height]; exists {
Collaborator:

I'm not following why you would need to read from the cache again here. Wouldn't this always be the same as what you just loaded from the store on line 549? I get that another goroutine may have just cached it and it's now available in the cache, but why fetch it from the cache when we already did the work of loading it from the store above?
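(For illustration, the miss path can simply use the value it just loaded; a sketch, with store.LoadValidators and the surrounding error handling assumed:)

blockExec.validatorCacheMutex.RLock()
valSet, ok := blockExec.validatorCache[height]
blockExec.validatorCacheMutex.RUnlock()
if !ok {
	// Cache miss: load once from the store, then publish it. No need to
	// re-read the cache under the write lock; if another goroutine raced
	// us here, overwriting with an identical value is harmless.
	loaded, err := store.LoadValidators(height) // assumed store API
	if err != nil {
		return err
	}
	valSet = loaded
	blockExec.validatorCacheMutex.Lock()
	blockExec.validatorCache[height] = valSet
	blockExec.validatorCacheMutex.Unlock()
}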

Collaborator:

Same question for BuildLastCommitInfoWithCache.

// cleanupOldCacheEntries removes old entries from caches to prevent memory leaks
func (blockExec *BlockExecutor) cleanupOldCacheEntries() {
	// Only cleanup if cache is significantly over the limit to avoid frequent cleanup
	blockExec.validatorCacheMutex.Lock()
Collaborator:

This will be called quite frequently, and you don't actually need a write lock here except for the single time the cache needs to be cleared. It may make more sense to acquire a read-only lock when checking the condition and switch to a write lock only when the cache actually needs to be written to.
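(Something like this sketch; it re-checks the size under the write lock because the condition can change between the two lock acquisitions:)

func (blockExec *BlockExecutor) cleanupOldCacheEntries() {
	// Fast path: most calls only observe the size, so a read lock lets
	// them proceed concurrently.
	blockExec.validatorCacheMutex.RLock()
	needsCleanup := len(blockExec.validatorCache) > blockExec.maxCacheSize
	blockExec.validatorCacheMutex.RUnlock()
	if !needsCleanup {
		return
	}

	// Slow path: take the write lock and re-check, since another goroutine
	// may have already cleaned up in the window between the two locks.
	blockExec.validatorCacheMutex.Lock()
	defer blockExec.validatorCacheMutex.Unlock()
	if len(blockExec.validatorCache) > blockExec.maxCacheSize {
		blockExec.validatorCache = make(map[int64]*types.ValidatorSet)
	}
}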

Collaborator:

You could also use your helper GetCacheSize to get the sizes with a read-only lock.

@mattac21 changed the title from "feat: cache validator workload in BuildLastCommitInfo for 28x performance improvement" to "perf(state): Cache validator workload in BuildLastCommitInfo" on Oct 20, 2025.
@github-actions:

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

@CreeptoGengar (Contributor Author):

@mattac21
