feat: multi-box deployment (wg-mesh + frigate-edge) #1
Merged
A thin wrapper around `networking.wireguard.interfaces` that takes a mesh-wide `peers` definition (same on every host) and builds the interface from it, dropping the entry that matches `thisHost`. Adding a new node is a one-place edit to `peers` plus `thisHost` on the new box.

The mesh is point-to-point with /32 peer allowedIPs; no subnets are routed through it. Exposure of services on the mesh interface is the consumer's concern (scope via `networking.firewall.interfaces.<iface>`).

Includes a two-VM nixosTest (`checks.<system>.wireguard-mesh`) that brings up the mesh on a shared subnet, asserts cross-mesh reachability, and confirms the firewall opens the WG UDP port.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
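The "same `peers` everywhere, drop the entry matching `thisHost`" rule can be sketched in Python as follows. This is only an illustration of the filtering logic, not the module's Nix code: the `Peer` fields, the `peers_for` helper, and the placeholder keys and underlay endpoints are assumptions made for the example. Only the output keys `publicKey`, `allowedIPs`, and `endpoint` mirror the upstream per-peer option names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """One mesh member (hypothetical schema, for illustration only)."""
    name: str        # host name, compared against this_host
    public_key: str  # WireGuard public key (placeholder strings below)
    mesh_ip: str     # address on the mesh interface
    endpoint: str    # underlay "host:port" the peer listens on


def peers_for(this_host: str, mesh: list[Peer]) -> list[dict]:
    """Build the peer entries for one host from the mesh-wide definition."""
    if this_host not in {p.name for p in mesh}:
        raise ValueError(f"{this_host!r} is not in the mesh definition")
    return [
        {
            "publicKey": p.public_key,
            # /32 keeps the mesh point-to-point: no subnet is routed via a peer.
            "allowedIPs": [f"{p.mesh_ip}/32"],
            "endpoint": p.endpoint,
        }
        for p in mesh
        if p.name != this_host  # a host never lists itself as a peer
    ]


# The same definition is shared by every host; only this_host differs.
MESH = [
    Peer("nodeA", "PUBKEY_A", "10.42.0.1", "192.168.1.1:51820"),
    Peer("nodeB", "PUBKEY_B", "10.42.0.2", "192.168.1.2:51820"),
]

if __name__ == "__main__":
    print(peers_for("nodeA", MESH))
```

Adding a third node under this sketch means appending one `Peer` to `MESH`; every existing host picks it up without any per-host edit.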
Adds a frigate-edge preset for the multi-box deployment shape: TLS + ACME + frigate, with bitcoind/fulcrum/ZMQ on another host. Edge nodes authenticate to bitcoind via USERPASS (cookie auth doesn't cross host boundaries); the credentials file is consumed via systemd's LoadCredential and templated into config.toml at start.

Refactor: the ACME/nginx/TLS wiring shared between public-frigate and frigate-edge moves into a private `_internal/frigate-tls-acme.nix` helper, set up by the parent preset via the `services._roost.*` internal namespace. No behavior change for existing public-frigate consumers; the regtest-preset test still passes the same assertions.

New `exposeBackends` option block on public-frigate lets a backend host bind its bitcoind RPC, ZMQ sequence publisher, and fulcrum on a mesh interface in addition to loopback, with firewall rules scoped to that interface only. Backed by the existing nix-bitcoin typed options where they support it (rpc.users, rpc.allowip), and by `extraConfig` where they're single-bind (rpcbind, fulcrum tcp).

Test: `checks.<system>.regtest-edge` boots two VMs (backend running the full local stack with exposeBackends on; edge running frigate-edge against it) and runs the same scan-end-to-end checks regtest-preset runs, driven against the edge's Electrum listener.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The upstream wireguard module produces two kinds of unit per interface: a `wireguard-wg0.service` that brings up the interface itself, and one `wireguard-wg0-peer-<X>.service` per peer that installs the peer's config. A target, `wireguard-wg0.target`, aggregates both.

The test was waiting on the bare interface service, which returns as soon as the interface is up, before the per-peer services have installed any peers in the kernel. Pings fired immediately afterwards hit "ping: sendmsg: Required key not available" because there was no peer matching 10.42.0.2 yet, and the 30s timeout expired before the peer service finished its setup.

Wait on the target instead. It is `wantedBy = [ "multi-user.target" ]` and `wants` both the interface service and every peer service, so a target-reached state is the right "everything's installed" signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The exposeBackends firewall rule was hardcoding port 8332. That's mainnet's bitcoind RPC port; nix-bitcoin's `rpc.port` default tracks the chain (regtest → 18443, testnet → 18332, signet → 38332). On any non-mainnet network the firewall opens 8332 while bitcoind listens elsewhere, and edge consumers see connection refused.

Surfaced by regtest-edge: bitcoind on the backend bound 18443 (regtest), the firewall opened 8332, the edge's frigate hit "Cannot connect to Bitcoin Core at http://192.168.1.2:18443" and the service exited.

Pull the port from `config.services.bitcoind.rpc.port` so the firewall follows whatever bitcoind is actually doing. Safe inside this mkIf-block because exposeBackends already asserts bitcoind.manage = true, which guarantees the option is defined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
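The mismatch is easy to see in a small Python model. This is a sketch of the reasoning only; the port table restates the chain defaults listed in the message above, and the helper names `listening_port` and `firewall_matches` are made up for the illustration.

```python
# Default bitcoind RPC port per chain, as listed in the commit message.
DEFAULT_RPC_PORT = {
    "mainnet": 8332,
    "testnet": 18332,
    "signet": 38332,
    "regtest": 18443,
}

HARDCODED_FIREWALL_PORT = 8332  # what the old rule opened on every chain


def listening_port(chain, override=None):
    """Port bitcoind listens on: an explicit override, else the chain default."""
    return override if override is not None else DEFAULT_RPC_PORT[chain]


def firewall_matches(chain, firewall_port, override=None):
    """True when the opened firewall port is the one bitcoind is listening on."""
    return firewall_port == listening_port(chain, override)


if __name__ == "__main__":
    for chain in DEFAULT_RPC_PORT:
        old = firewall_matches(chain, HARDCODED_FIREWALL_PORT)
        new = firewall_matches(chain, listening_port(chain))
        print(f"{chain:8} hardcoded ok: {old!s:5} derived ok: {new}")
```

Deriving the firewall port from the same value the listener uses removes the second source of truth, which is the effect of reading `config.services.bitcoind.rpc.port` in the fix.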
The NixOS test framework assigns each node's primary interface address as 192.168.<vlan>.<nodeNumber>, starting at nodeNumber 1 in declaration order (range 1 254 in lib/testing/network.nix:23). I had hardcoded .2 / .3 throughout, which is off by one: the first declared node is .1, not .2.

Consequence in mesh.nix: nodeA's wireguard peer pointed at 192.168.1.3:51820 (a non-existent IP), nodeB pointed at .2 (which was *its own* address). Handshake never completed; ping timed out with no peer alive.

Consequence in regtest-edge.nix: the edge tried to reach the backend at 192.168.1.2:18443, but the backend was actually at .1. Even with the firewall fix in the preceding commit, the edge couldn't find the backend at all.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
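The numbering rule described above can be written down as a short Python helper. It is a sketch of the addressing scheme as this commit message states it, not code from the test framework; `primary_addresses` is a hypothetical name, and the node list is assumed to be given in the order the framework numbers the nodes.

```python
def primary_addresses(nodes, vlan=1):
    """Map node names to 192.168.<vlan>.<nodeNumber>, numbering from 1."""
    if len(nodes) > 254:
        raise ValueError("node numbers are limited to the range 1..254")
    return {
        name: f"192.168.{vlan}.{number}"
        for number, name in enumerate(nodes, start=1)  # first node is .1
    }


if __name__ == "__main__":
    print(primary_addresses(["nodeA", "nodeB"]))
    print(primary_addresses(["backend", "edge"]))
```

Under this rule nodeB is at 192.168.1.2, so a peer entry on nodeB that targets .2 points at nodeB itself, which matches the failure described for mesh.nix.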
The previous HMAC was computed with `openssl ... -macopt hexkey:$salt`,
which decodes the salt from hex into raw bytes and uses those bytes as
the HMAC key. bitcoind's rpcauth.py uses the salt's literal UTF-8
string bytes as the key:

    hmac.new(salt.encode("utf-8"), password.encode("utf-8"), "SHA256")

Two different keys → two different HMACs. Every auth attempt the edge
sent to bitcoind was rejected with "incorrect password attempt".

Recomputed with the correct algorithm. Comment now states the exact
key derivation so the next person who hits this doesn't trip over the
same mistake.

Verifiable via:

    python3 -c "import hmac; print(hmac.new('2316d0a5e8ee6339ffb4d86c983bb421'.encode(), 'testpassword'.encode(), 'SHA256').hexdigest())"
    # → 34cc4776187170b359d40928b25deb28ea2bfc436c96fdd0db7150ec5211de85

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
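The two key derivations can be compared side by side in Python. This is a minimal sketch using the salt and password from the message above. `hmac_utf8_key` follows the rpcauth.py expression quoted there, while `hmac_hex_decoded_key` models what the message says `-macopt hexkey:` does (hex-decode the salt, then use the raw bytes as the key). Both function names are invented for the illustration.

```python
import hmac

SALT = "2316d0a5e8ee6339ffb4d86c983bb421"
PASSWORD = "testpassword"


def hmac_utf8_key(salt: str, password: str) -> str:
    """Key is the salt's literal text bytes (32 bytes for a 32-char salt)."""
    return hmac.new(salt.encode("utf-8"), password.encode("utf-8"), "SHA256").hexdigest()


def hmac_hex_decoded_key(salt: str, password: str) -> str:
    """Key is the salt decoded from hex (16 raw bytes for a 32-char salt)."""
    return hmac.new(bytes.fromhex(salt), password.encode("utf-8"), "SHA256").hexdigest()


if __name__ == "__main__":
    print("utf-8 key:      ", hmac_utf8_key(SALT, PASSWORD))
    print("hex-decoded key:", hmac_hex_decoded_key(SALT, PASSWORD))
```

The keys differ in both length and content, so the digests differ, and a hash computed one way will not validate against a check performed the other way.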
`wait_for_open_port` defaults to 900s. When frigate fails its first connect to bitcoind (auth error, DNS, ZMQ), systemd's Restart=on-failure keeps trying every ~13s, and with DefaultStartLimitBurst=5 / DefaultStartLimitIntervalSec=10s the burst limit never trips because the restarts are spaced just far enough apart. The unit stays in eternal auto-restart while the port never opens; the test waits 15 minutes and then fails with a useless "port never opened" message.

Replace the bare `wait_for_open_port(50001)` with a 60s polling loop that, on timeout, dumps the last 50 lines of frigate's journal. The 60s bound covers ~4–5 restart cycles: plenty for a legitimately slow backend boot, and short enough to surface a real configuration bug quickly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
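The shape of such a loop, a bounded poll that attaches diagnostics when it gives up, can be sketched in plain Python. This is not the test's actual code: `wait_until` and its parameters are hypothetical, and the `clock` and `sleep` arguments exist only so the sketch can be exercised without real waiting.

```python
import time


def wait_until(check, timeout=60.0, interval=1.0, on_timeout=None,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or the timeout elapses.

    On timeout, call on_timeout() (if given) and include its text in the
    raised TimeoutError so the failure message carries diagnostics.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return
        if clock() >= deadline:
            break
        sleep(interval)
    details = on_timeout() if on_timeout is not None else ""
    raise TimeoutError(f"condition not met within {timeout}s\n{details}")


if __name__ == "__main__":
    attempts = {"n": 0}

    def port_open():
        # Stand-in for a real port probe: succeeds on the third attempt.
        attempts["n"] += 1
        return attempts["n"] >= 3

    wait_until(port_open, timeout=5, interval=0.1)
    print(f"ready after {attempts['n']} attempts")
```

In a NixOS test script, `check` would presumably wrap a port probe run on the edge node and `on_timeout` would return the tail of the service's journal; the exact commands and the unit name depend on the test and are not shown here.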
The probe pipes Electrum JSON-RPC into `nc -q 3 127.0.0.1 50001`. The `-q` flag (wait N seconds after stdin EOF before closing) is a netcat-openbsd extension; NixOS's default nc supports `-z` but not `-q`, so the probe silently emitted nothing and the 120s loop timed out with empty responses on every iteration.

regtest-preset.nix avoids this by adding `pkgs.netcat-openbsd` to `environment.systemPackages`. Mirror that here on the edge node.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary

- `wireguard-mesh` module (`modules/wireguard-mesh.nix`): thin n-peer mesh wrapper around `networking.wireguard.interfaces`. Identical `peers` block on every member; only `thisHost` and `privateKeyFile` differ per node.
- `frigate-edge` preset (`modules/presets/frigate-edge.nix`): TLS + ACME + frigate, with bitcoind/fulcrum/ZMQ on another host. USERPASS auth (cookie can't cross host boundaries). Does not import nix-bitcoin.
- `modules/_internal/frigate-tls-acme.nix`. Both `public-frigate` and `frigate-edge` import it. No behavior change for existing `public-frigate` consumers.
- `exposeBackends` option on `public-frigate`: binds bitcoind RPC + ZMQ + fulcrum on a configurable mesh address in addition to loopback, with firewall rules scoped to a single interface.

Tests added

- `checks.<system>.wireguard-mesh`: two-VM mesh ping + firewall scope.
- `checks.<system>.regtest-edge`: two-VM end-to-end: backend runs full nix-bitcoin + public-frigate with `exposeBackends` on; edge runs frigate-edge against it; verifies the edge serves an Electrum response with chain tip via the remote fulcrum proxy.

Test plan
🤖 Generated with Claude Code