Pex during handshake #2992

Open: pompon0 wants to merge 7 commits into main from gprusak-pex

pompon0 (Contributor) commented Feb 27, 2026

This PR lets nodes learn more node addresses even when the peer they dial has no capacity for new connections. The listener node now sends a pex batch as part of the handshake; it may still drop the connection right after the handshake if it decides it has no capacity for it.

This new "pex in handshake" is enabled only if the pex reactor is enabled on the node. Disabling the pex reactor prevents a node from learning new addresses, and is currently used mainly to prevent a node from connecting to random peers (a misleading, indirect use of the pex flag). This PR keeps these semantics to avoid disruptions. I think we should change them and require people to set MaxConnected to 0 instead, which is a direct way to say: connect only to persistent peers.

Additionally, a SelfAddress is added to the handshake message. Nodes advertise the addresses of the nodes they are connected to. Until now they could only advertise the addresses of outbound connections (i.e. verified addresses); with this PR the SelfAddress of inbound connections is included as well (each node declares only its own up-to-date address, so it is fine to gossip it).

Note that SelfAddress could also have been extracted from the pex response (it is always included), but I wanted to make it more explicit that it is special.
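
For readers skimming the description, here is a minimal Go sketch of the flow it describes. It is not the code in this PR: `handshakeMsg` here is a simplified stand-in, and `listenerHandshake`, `hasCapacity`, the string-typed addresses and the `pexEnabled` flag are all assumptions made for illustration.

```go
package main

import "fmt"

// handshakeMsg is a simplified, hypothetical stand-in for the real handshake
// message. Field names follow the PR description; the types are assumptions.
type handshakeMsg struct {
	SelfAddr *string  // nil when the node has no external address configured
	PexAddrs []string // batch of peer addresses sent by the listener
}

// listenerHandshake builds the message a listener would send. The pex batch is
// attached only when pex is enabled, mirroring the rule in the description.
func listenerHandshake(selfAddr *string, known []string, pexEnabled bool, maxAddrs int) handshakeMsg {
	msg := handshakeMsg{SelfAddr: selfAddr}
	if !pexEnabled || maxAddrs <= 0 {
		return msg
	}
	n := len(known)
	if maxAddrs < n {
		n = maxAddrs
	}
	// Copy so the message does not alias the caller's slice.
	msg.PexAddrs = append([]string(nil), known[:n]...)
	return msg
}

// hasCapacity is the (hypothetical) check made after the handshake: the dialer
// has already received the pex batch even if the connection is dropped here.
func hasCapacity(connected, maxConnected int) bool {
	return connected < maxConnected
}

func main() {
	self := "node-a@203.0.113.7:26656" // placeholder address string
	known := []string{"node-b@198.51.100.1:26656", "node-c@198.51.100.2:26656"}

	msg := listenerHandshake(&self, known, true, 1)
	fmt.Println("pex addrs sent:", msg.PexAddrs)
	fmt.Println("keep connection:", hasCapacity(10, 10))
}
```

The point of the sketch is the ordering: the address batch travels in the handshake itself, so the capacity decision that follows cannot prevent the dialer from learning new addresses.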

github-actions bot commented Feb 27, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
|---|---|---|---|---|
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Feb 27, 2026, 6:27 PM |

codecov bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 83.75000% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.13%. Comparing base (89daf14) to head (3b5cb73).
⚠️ Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| sei-tendermint/internal/p2p/conv.go | 84.61% | 2 Missing and 2 partials ⚠️ |
| sei-tendermint/internal/p2p/handshake.go | 50.00% | 2 Missing and 2 partials ⚠️ |
| sei-tendermint/internal/p2p/peermanager.go | 84.61% | 1 Missing and 1 partial ⚠️ |
| sei-tendermint/internal/p2p/router.go | 89.47% | 1 Missing and 1 partial ⚠️ |
| sei-tendermint/internal/p2p/giga_router.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2992      +/-   ##
==========================================
+ Coverage   58.08%   58.13%   +0.04%     
==========================================
  Files        2109     2110       +1     
  Lines      173234   173473     +239     
==========================================
+ Hits       100626   100844     +218     
- Misses      63664    63679      +15     
- Partials     8944     8950       +6     

| Flag | Coverage Δ |
|---|---|
| sei-chain-pr | 69.42% <83.75%> (?) |
| sei-db | 69.50% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| sei-tendermint/internal/p2p/peermanager_pool.go | 98.31% <ø> (ø) |
| sei-tendermint/internal/p2p/pex/reactor.go | 89.13% <100.00%> (-0.12%) ⬇️ |
| sei-tendermint/internal/p2p/routeroptions.go | 86.36% <ø> (+1.74%) ⬆️ |
| sei-tendermint/internal/p2p/testonly.go | 79.91% <100.00%> (+0.27%) ⬆️ |
| sei-tendermint/internal/p2p/transport.go | 84.09% <100.00%> (+0.36%) ⬆️ |
| sei-tendermint/node/setup.go | 67.50% <100.00%> (+0.84%) ⬆️ |
| sei-tendermint/internal/p2p/giga_router.go | 0.00% <0.00%> (ø) |
| sei-tendermint/internal/p2p/peermanager.go | 85.26% <84.61%> (-3.11%) ⬇️ |
| sei-tendermint/internal/p2p/router.go | 84.80% <89.47%> (+0.90%) ⬆️ |
| sei-tendermint/internal/p2p/conv.go | 82.75% <84.61%> (+0.40%) ⬆️ |

... and 1 more

// NOTE: amplification factor!
// small request results in up to maxMsgSize response
- maxMsgSize = maxAddressSize * maxGetSelection
+ maxMsgSize = 1000 + maxAddressSize*p2p.MaxPexAddrs

Contributor:
How big is the average msg now? Where does the constant 1000 come from?

Encode: func(m *handshakeMsg) *pb.Handshake {
	var selfAddr *string
	if addr, ok := m.SelfAddr.Get(); ok {
		selfAddr = utils.Alloc(addr.String())

Contributor:
When would it happen that you can't get selfAddr here?

Contributor Author:
A node doesn't have to configure an external (public) address if it doesn't have one.

if p.SelfAddr != nil {
	addr, err := ParseNodeAddress(*p.SelfAddr)
	if err != nil {
		return nil, fmt.Errorf("SelfAddr: %w", err)

Contributor:
Would this ever happen during normal operations? DNS failures?

Contributor Author:
An adversarial node can send broken data. This is just a proto converter, though: it doesn't assume the proto is valid in any sense beyond what the proto message definition specifies.

for i, addrString := range p.PexAddrs {
	addr, err := ParseNodeAddress(addrString)
	if err != nil {
		return nil, fmt.Errorf("PexAddrs[%v]: %w", i, err)

Contributor:
If this may happen on a valid address, should we just ignore that one address and keep the others?

Contributor Author:
What do you mean? A valid address is parseable. This is not a dynamic property.

if err != nil {
	return nil, fmt.Errorf("NodeAuthKey: %w", err)
}
nodeAuthSig, err := ed25519.SignatureFromBytes(p.NodeAuthSig)

Contributor:
Can anyone with zero stake send us address updates?

Contributor Author:
Yes, node keys do not have stake assigned. They are not validator keys.

 func (r *Router) Advertise(maxAddrs int) []NodeAddress {
-	return r.peerManager.Advertise(maxAddrs)
+	addrs := r.peerManager.Advertise()
+	return addrs[:min(len(addrs), maxAddrs)]

Contributor:
nit: would we ever want to pick randomly instead of always taking the front?

// case listener does not have capacity for new connections.
// Dialer also could potentially send pex data, but there is no benefit from doing so:
// - if listener is full, then it won't use the new data. Listener also will not broadcast unverified data to anyone.
// - if it is not full, then the connection will be established and pex data will be sent the regular way using PEX protocol.

Contributor:
I'm confused, do you mean, if it is not full, Listener may broadcast unverified data to others?

Contributor Author:
Pex data is exchanged periodically both ways over every connection. If the connection dies immediately after the handshake, there is obviously no exchange. A connection may die immediately after the handshake if the listener rejects it (MaxConnected capacity reached), in which case the listener won't try to establish new connections any time soon anyway, so it doesn't need new peer addresses. No one is sending unverified data to anyone.
