Contributor

weiihann commented Nov 11, 2025

This PR optimizes the handling of contract codes by deduplicating them in state updates. The codes field in stateUpdate has been changed from map[common.Address]contractCode to map[common.Hash][]byte.

Previously, contract codes were tracked by address in the state update. During the commit phase when writing bytecodes to the db, it was possible for duplicate bytecodes to be written to the PebbleDB batch when multiple contracts deployed the same code in the same block. This is a simple optimization that reduces resource consumption by deduplicating bytecodes before committing.
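To illustrate why keying by code hash removes the duplicate writes, here is a minimal, self-contained sketch. It is not the code from this PR: `Address`, `Hash`, `contractCode`, `hashCode` and `dedupCodes` are simplified stand-ins, and `sha256` replaces Keccak-256 only to keep the example stdlib-only.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Address and Hash are simplified stand-ins for common.Address and common.Hash.
type Address [20]byte
type Hash [32]byte

// contractCode is a hypothetical shape for the old per-address entry.
type contractCode struct {
	hash Hash
	blob []byte
}

// hashCode is an assumption for this sketch: go-ethereum hashes code with
// Keccak-256, sha256 is used here only to avoid external dependencies.
func hashCode(code []byte) Hash {
	return sha256.Sum256(code)
}

// dedupCodes collapses per-address entries into a per-hash map, so that
// identical bytecode deployed at several addresses appears only once.
func dedupCodes(byAddr map[Address]contractCode) map[Hash][]byte {
	byHash := make(map[Hash][]byte, len(byAddr))
	for _, c := range byAddr {
		byHash[c.hash] = c.blob
	}
	return byHash
}

func main() {
	shared := []byte{0x60, 0x00, 0x60, 0x00}
	other := []byte{0x60, 0x01}

	// Three contracts created in one block, two of them with the same code.
	byAddr := map[Address]contractCode{
		Address{1}: {hash: hashCode(shared), blob: shared},
		Address{2}: {hash: hashCode(shared), blob: shared},
		Address{3}: {hash: hashCode(other), blob: other},
	}
	byHash := dedupCodes(byAddr)
	fmt.Printf("%d entries by address, %d unique blobs to write\n", len(byAddr), len(byHash))
}
```

With the hash as the map key, the number of batch writes is bounded by the number of distinct bytecodes rather than by the number of contracts touched.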

Ran a full sync from genesis to block 1,700,000 with no issues found.

Note: we may also want to skip committing bytecodes entirely if they already exist in the db. This would require extra reads from the code cache or the db, and more changes, and I'm unsure if it's worth the effort. That said, our data analysis shows that 97% of the contracts reuse existing bytecodes.

EDIT: did an initial investigation. The total bytes saved from not persisting duplicated bytecode values is around 23GB from genesis to block 23M. Probably not worth it.
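For reference, the "skip if it already exists" idea from the note above could look roughly like the sketch below. It is not part of this PR and uses no go-ethereum APIs: `codeStore`, `memStore` and `commitNewCodes` are hypothetical names, and the read counter is there only to make the extra lookup cost visible.

```go
package main

import "fmt"

// Hash is a simplified stand-in for common.Hash.
type Hash [32]byte

// codeStore is a hypothetical minimal interface, not go-ethereum's API.
type codeStore interface {
	Has(h Hash) bool
	Put(h Hash, code []byte)
}

// memStore is an in-memory store that counts reads and writes.
type memStore struct {
	data   map[Hash][]byte
	reads  int
	writes int
}

func (s *memStore) Has(h Hash) bool {
	s.reads++
	_, ok := s.data[h]
	return ok
}

func (s *memStore) Put(h Hash, code []byte) {
	s.writes++
	s.data[h] = code
}

// commitNewCodes writes only the codes the store does not hold yet and
// returns how many were written. Every code costs one existence check.
func commitNewCodes(db codeStore, codes map[Hash][]byte) int {
	written := 0
	for h, code := range codes {
		if db.Has(h) {
			continue // already persisted, skip the write
		}
		db.Put(h, code)
		written++
	}
	return written
}

func main() {
	known, fresh := Hash{1}, Hash{2}
	db := &memStore{data: map[Hash][]byte{known: {0x60, 0x00}}}

	codes := map[Hash][]byte{known: {0x60, 0x00}, fresh: {0x60, 0x01}}
	n := commitNewCodes(db, codes)
	fmt.Printf("written=%d reads=%d writes=%d\n", n, db.reads, db.writes)
}
```

The trade-off is visible in the counters: each skipped write is paid for with a lookup, which is the extra read cost mentioned above.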

Member

rjl493456442 commented Nov 11, 2025

I'm not sure whether the address of dirty contract code will be needed in the future.

It's reasonable to remove it, as I'm confident it won't affect the current Ethereum network.
I just need to think a bit more about the long-term compatibility implications.
