Skip to content

feat(firewall): open XNet port to all nodes in the network#10263

Open
pierugo-dfinity wants to merge 6 commits into
masterfrom
pierugo/firewall/open-xnet-to-all-nodes
Open

feat(firewall): open XNet port to all nodes in the network#10263
pierugo-dfinity wants to merge 6 commits into
masterfrom
pierugo/firewall/open-xnet-to-all-nodes

Conversation

@pierugo-dfinity
Copy link
Copy Markdown
Contributor

@pierugo-dfinity pierugo-dfinity commented May 20, 2026

In an effort to allow XNet connections between cloud engines and other subnets, this PR opens port 2497 (the XNet port) to all nodes in the network.

This is done by splitting the current [tcp/udp]_ports_for_node_whitelist config field into two trusted_[tcp/udp]_ports_for_node_whitelist and untrusted_[tcp/udp]_ports_for_node_whitelist. The trusted ports are open only to "whitelisted" nodes (i.e., non-type4 nodes for non-type4 nodes, and all nodes for type4 nodes, as dictated by the is_whitelisted function in firewall.rs), while the untrusted ports are always open to all nodes.

@pierugo-dfinity pierugo-dfinity added the CI_ALL_BAZEL_TARGETS Runs all bazel targets label May 20, 2026
@github-actions github-actions Bot added the feat label May 20, 2026
@pierugo-dfinity pierugo-dfinity marked this pull request as ready for review May 20, 2026 12:01
@pierugo-dfinity pierugo-dfinity requested review from a team as code owners May 20, 2026 12:01
Comment on lines +324 to +327
trusted_tcp_ports_for_node_whitelist: [22, 4100, 8080],
trusted_udp_ports_for_node_whitelist: [4100],
untrusted_tcp_ports_for_node_whitelist: [2497],
untrusted_udp_ports_for_node_whitelist: [],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what trusted/untrusted means in this context. Maybe something like this?

Suggested change
trusted_tcp_ports_for_node_whitelist: [22, 4100, 8080],
trusted_udp_ports_for_node_whitelist: [4100],
untrusted_tcp_ports_for_node_whitelist: [2497],
untrusted_udp_ports_for_node_whitelist: [],
whitelisted_node_tcp_ports: [22, 4100, 8080],
whitelisted_node_udp_ports: [4100],
all_node_tcp_ports: [2497],
all_node_udp_ports: [],

I think that would also be closer to the existing naming in the orchestrator

Copy link
Copy Markdown
Contributor Author

@pierugo-dfinity pierugo-dfinity May 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, renamed here. I kept the "whitelist" at the end in contrast to the next field ports_for_http_adapter_blacklist

@pierugo-dfinity pierugo-dfinity force-pushed the pierugo/firewall/open-xnet-to-all-nodes branch from d9cd7d0 to 712017d Compare May 20, 2026 15:11
Copy link
Copy Markdown
Contributor

@Bownairo Bownairo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving node changes

// Insert the whitelisting rules at the top of the list (highest priority)
tcp_rules.insert(0, tcp_node_whitelisting_rule);
udp_rules.insert(0, udp_node_whitelisting_rule);
// Insert the ic-http-adapter rule at the top of the list (highest priority)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here suggests that the order matters. Should we preserve it, and also add one above in get_node_whitelisting_rules?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point 679f012

Comment on lines +324 to +327
whitelisted_nodes_tcp_ports_whitelist: [22, 4100, 8080],
whitelisted_nodes_udp_ports_whitelist: [4100],
all_nodes_tcp_ports_whitelist: [2497],
all_nodes_udp_ports_whitelist: [],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any system test we could adapt to test this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the firewall_correctness_test system test and moved firewall-related system tests into their own folder (with common functionality in its own crate ic_firewall_system_test_utils. LMK what you think of it: cbd870a.

Btw, by writing it, I realized that port 4100 now uses UDP and we can thus remove TCP from the list of allowed rules: 521da89.

@pierugo-dfinity pierugo-dfinity requested a review from a team as a code owner May 22, 2026 13:46
@github-actions github-actions Bot added the @idx label May 22, 2026
Comment on lines +76 to +78
static TCP_PORTS_CLOSED: &[u32] = &[23, 24, 25];

static CLOUD_ENGINE_NODE_REWARD_TYPES: &[NodeRewardType] = &[
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static TCP_PORTS_CLOSED: &[u32] = &[23, 24, 25];
static CLOUD_ENGINE_NODE_REWARD_TYPES: &[NodeRewardType] = &[
const TCP_PORTS_CLOSED: &[u32] = &[23, 24, 25];
const CLOUD_ENGINE_NODE_REWARD_TYPES: &[NodeRewardType] = &[

NodeRewardType::Type4dot1,
NodeRewardType::Type4dot2,
NodeRewardType::Type4dot3,
NodeRewardType::Type4dot3,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NodeRewardType::Type4dot3,
NodeRewardType::Type4dot4,

More generally, I think by now we have multiple of these definitions (along with is_cloud_engine(), and similar) across the repository. Eventually I think we should have a single definition somewhere (i.e. in rs/protobuf/src/registry/node.rs), to reduce duplication and to make sure they don't diverge.

Comment on lines +17 to +19
* Ports in `trusted_tcp_ports_for_node_whitelist` should only be reachable from
whitelisted nodes.
* Ports in `untrusted_tcp_ports_for_node_whitelist` should be reachable from all IC
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these still the old names?

Comment on lines +157 to +162
async fn add_dummy_registry_rule(nns_node: &IcNodeSnapshot, log: &Logger) {
let ipv6_prefixes = get_config().firewall.unwrap().default_rules[0]
.ipv6_prefixes
.clone();
// This is actually not a dummy rule. It still allows inbound SSH such that the test driver can
// connect to the nodes and perform the test.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this?

Suggested change
async fn add_dummy_registry_rule(nns_node: &IcNodeSnapshot, log: &Logger) {
let ipv6_prefixes = get_config().firewall.unwrap().default_rules[0]
.ipv6_prefixes
.clone();
// This is actually not a dummy rule. It still allows inbound SSH such that the test driver can
// connect to the nodes and perform the test.
async fn add_ssh_only_registry_rule(nns_node: &IcNodeSnapshot, log: &Logger) {
let ipv6_prefixes = get_config().firewall.unwrap().default_rules[0]
.ipv6_prefixes
.clone();

Comment on lines +121 to +123
// Wait for a while to make sure that all nodes have updated their firewall rules according
// to the new registry configuration.
tokio::time::sleep(std::time::Duration::from_secs(20)).await;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it could be flaky. Alternatively maybe we could retry one of the tcp connections until it succeeds, or check the firewall_registry_version metric of all nodes

Comment on lines +26 to +28
tags = [
"colocate",
],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These used to be "long_test"s. They aren't anymore, is that intentional?

execute_add_firewall_rules_proposal(
log,
nns_node,
FirewallRulesScope::Global,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the scope change intentionally?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants