Feature Proposal: Signed time sync from explicitly trusted advert sources (aka "clock towers") #2416

@mangelajo

This is a refinement on #1329 with a security/authenticity angle, and also relates to #1332, #1426, #2150 and #440. Filing it as a separate thread because the trust-model piece is what's new. Happy to fold it into one of the existing issues if maintainers prefer.

Why

We run a Spanish mesh on EU868 with a mix of nRF52 / ESP32 nodes. Most of them have no battery-backed RTC, and only a few have GPS (and even those don't always feed the system clock; see #1426). The result is what you'd expect: repeaters drift, some come back from a reset stuck somewhere in 2024, and the spread gets worse over weeks. Multiple existing issues converge on the same root cause.

I added a small !clocks query to our companion bot that lists every repeater whose advert timestamp is more than 30 s off from our local clock. The list is uncomfortably long, which got me thinking about how a network like this could correct itself without centralised infrastructure or a dedicated time-sync packet type.
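A minimal sketch of the kind of check such a `!clocks` query could perform, assuming the bot stores, per repeater, the latest advert timestamp together with the local time at which it was received. The function name, the `observations` layout and the sort order are hypothetical and not the bot's actual code; only the 30 s threshold comes from the description above.

```python
DRIFT_THRESHOLD_S = 30  # "more than 30 s off", as described above


def drifting_repeaters(observations, threshold=DRIFT_THRESHOLD_S):
    """List repeaters whose advert timestamp disagrees with the local clock.

    `observations` (hypothetical layout) maps a repeater name to a tuple
    (advert_timestamp, local_time_at_reception), both in epoch seconds.
    Returns (name, offset_seconds) pairs, worst offender first. A positive
    offset means the repeater's clock is ahead of ours.
    """
    flagged = []
    for name, (advert_ts, received_at) in observations.items():
        offset = advert_ts - received_at
        if abs(offset) > threshold:
            flagged.append((name, offset))
    # Largest absolute drift first, so the worst clocks top the list.
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)
```

Comparing against the reception time rather than the time of the query keeps old log entries from looking more drifted than they were.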

The idea

Adverts are already signed with the source's ed25519 key, so authenticity comes for free: if a receiver verifies the signature against a key it recognises, the timestamp inside really came from that node. Authority (i.e. "do I believe this node has good time?") is the missing piece, and it should be a per-receiver decision, not a network-wide one.

Concretely:

  1. Add a bit of metadata in the advert that says "I have an authoritative clock source" and what kind: RTC / GPS / DCF77-style / host-supplied via serial / none. Receivers use this both for filtering and for weighting.
  2. Each receiver keeps a local trust list of pubkeys it accepts as time references. Trust is explicit: the operator adds the keys out of band. No TOFU, no learned trust.
  3. When N adverts from trusted sources arrive within a short window, the receiver takes the median timestamp and steps the local clock. The median (rather than the mean, as @tonurics noted in #1329) is robust against a single bad source: a broken RTC, GPS spoofing, whatever.
  4. Replay is handled by enforcing strictly monotonic timestamps per pubkey: a new advert from a trusted source must be newer than the last one accepted from that key. The 4-byte epoch field already in the advert is enough; a separate counter is only needed if sub-second guarantees matter.
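The receive-side checks in the list above could be sketched roughly as follows. This is an illustration under assumptions, not firmware code: the `Advert` fields, the `authoritative` flag and the `TrustedTimeFilter` class are hypothetical names, and signature verification is assumed to have happened already in the existing advert path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advert:
    pubkey: bytes        # sender's ed25519 public key
    timestamp: int       # epoch seconds from the advert's 4-byte field
    authoritative: bool  # hypothetical "I have an authoritative clock" bit
    signature_ok: bool   # result of the existing signature check (done elsewhere)


class TrustedTimeFilter:
    """Decides whether an advert may be used as a time reference."""

    def __init__(self, trusted_pubkeys):
        self.trusted = set(trusted_pubkeys)  # added out of band by the operator
        self.last_accepted = {}              # pubkey -> last accepted timestamp

    def accept(self, advert):
        if not advert.signature_ok:
            return False                     # authenticity failed
        if not advert.authoritative:
            return False                     # sender does not claim a good clock
        if advert.pubkey not in self.trusted:
            return False                     # explicit trust only, no TOFU
        last = self.last_accepted.get(advert.pubkey)
        if last is not None and advert.timestamp <= last:
            return False                     # replayed or stale advert
        self.last_accepted[advert.pubkey] = advert.timestamp
        return True
```

Keeping `last_accepted` per pubkey means a replayed advert from one source cannot be used to block fresh adverts from another.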

The signature work is already there. The new pieces are: the source-type bits in the advert, the local trust list, and the median + monotonic check on the receive side.
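To make the "source-type bits" concrete, here is one hypothetical way to encode them in a 2-byte field. Everything about the layout is an assumption made for illustration (enum values, 4 bits for the source type, 4 bits for an accuracy class, little-endian byte order); none of it is an agreed wire format.

```python
import struct
from enum import IntEnum


class ClockSource(IntEnum):
    # Hypothetical numbering of the source kinds listed in the proposal.
    NONE = 0
    RTC = 1
    GPS = 2
    RADIO = 3  # DCF77-style time signal
    HOST = 4   # host-supplied via serial


def pack_clock_info(source, accuracy_class):
    """Pack (source, accuracy_class) into 2 bytes.

    Assumed layout: bits 0-3 = source type, bits 4-7 = accuracy class,
    bits 8-15 reserved as zero.
    """
    if not 0 <= accuracy_class <= 0x0F:
        raise ValueError("accuracy_class must fit in 4 bits")
    value = (accuracy_class << 4) | int(source)
    return struct.pack("<H", value)


def unpack_clock_info(data):
    """Inverse of pack_clock_info; unknown source types decode as NONE."""
    (value,) = struct.unpack("<H", data)
    try:
        source = ClockSource(value & 0x0F)
    except ValueError:
        source = ClockSource.NONE  # treat unknown kinds as "no trusted clock"
    return source, (value >> 4) & 0x0F
```

Decoding unknown source types as `NONE` lets older receivers ignore kinds added later instead of trusting them by accident.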

How it would look in practice

A repeater with an RTC marks "trusted clock" + source=RTC in its outbound adverts. A handheld ESP32 without an RTC gets a few neighbour repeaters' pubkeys added to its trust list via the companion app on first deploy; a repeater would instead get those "clocktower" keys added through the serial configurator. When an advert arrives, the signature is verified, the source bit is checked, and the pubkey is looked up. Adverts that pass go into a small ring buffer. On a configurable cadence, or whenever the local clock is clearly wrong (pre-build-date, or off by >1 h from the median of trusted sources), the node steps to the median.

Big jumps (>1 h) only get applied if at least two trusted sources agree, so a single broken source can't drag the network with it.
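A rough sketch of the median step with the big-jump rule. Two things here are assumptions rather than part of the proposal: the code works on offsets (advert timestamp minus local clock at reception) instead of raw timestamps, so that samples received at different moments stay comparable, and "agree" is interpreted as "within `AGREE_TOLERANCE_S` of the median", a hypothetical tolerance. Function and parameter names are illustrative.

```python
from statistics import median

BIG_JUMP_S = 3600       # the ">1 h" limit from the text
AGREE_TOLERANCE_S = 60  # hypothetical definition of "sources agree"


def correction_to_apply(offsets, min_sources=1):
    """Return the clock step in seconds, or None if no step should happen.

    `offsets` maps a trusted pubkey to its latest offset in seconds
    (advert timestamp minus local clock when the advert was received).
    """
    if not offsets or len(offsets) < min_sources:
        return None
    med = median(offsets.values())
    if abs(med) > BIG_JUMP_S:
        # Large steps need at least two sources close to the median.
        agreeing = sum(
            1 for off in offsets.values() if abs(off - med) <= AGREE_TOLERANCE_S
        )
        if agreeing < 2:
            return None
    return round(med)
```

For example, offsets of 7200 s, 7203 s and 10 s produce a step of 7200 s: the median ignores the outlier and two sources back the large jump, whereas a lone source reporting 7200 s is not enough.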

Things I'd like feedback on

  • Wire format. The flags byte in the advert is already fully allocated. The cleanest path is probably to repurpose one of the feat1/feat2 2-byte fields to carry (source_type, accuracy_class). A small sub-spec would help.
  • Cold-boot / wildly-wrong-clock case. Strict monotonicity has to be relaxed when the local clock is obviously absurd; otherwise a node that comes back from a reset stuck in the future (the failure described in #1332, where sync clock and time commands fail on a repeater whose clock is in the future) will reject every correct sync attempt. Heuristic: if the local clock predates the firmware build date, or the spread of trusted sources says we're off by more than X, the next sync ignores monotonicity.
  • Hop latency. Floods can take a couple of seconds end-to-end on a busy mesh, so this design isn't aiming at sub-second accuracy. For mesh app use cases (message ordering, display, advert dedup) it's fine, but worth flagging explicitly.
  • Coexistence with #1329 (clock synchronization based on the average time from the last N adverts). If no trust list is configured, the receiver could fall back to the averaging approach proposed there. The two seem complementary rather than competing.
  • Coexistence with #2150 (support for alternative time adjustment sources, such as DCF77). The TimeProvider abstraction proposed there fits cleanly: a "trusted advert source" is just another implementation next to GPS/DCF77.
  • GPS as a time source. This proposal only pays off if at least some nodes actually have authoritative time to share. Today most boards with a GPS don't feed it into the system clock (the situation #1426, "Update RTC from GPS", describes), and that's worth tackling as its own RFE/PR. The two can land in either order, but together they make a GPS-equipped repeater a natural anchor for its neighbours.
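The cold-boot heuristic from the list above could look roughly like this. The build timestamp and the value chosen for "X" are placeholders, and the function name is hypothetical; real firmware would presumably take the build date from a compile-time constant.

```python
FIRMWARE_BUILD_EPOCH = 1_700_000_000  # placeholder build date (epoch seconds)
MAX_PLAUSIBLE_OFFSET_S = 3600         # placeholder for "X" in the heuristic


def monotonicity_required(local_now, median_offset=None,
                          build_epoch=FIRMWARE_BUILD_EPOCH,
                          max_offset=MAX_PLAUSIBLE_OFFSET_S):
    """Return False when the next sync should ignore the monotonic check.

    `median_offset` is the median of (trusted timestamp - local clock) in
    seconds, or None when no trusted samples are available yet.
    """
    if local_now < build_epoch:
        return False  # clock predates the firmware itself: obviously wrong
    if median_offset is not None and abs(median_offset) > max_offset:
        return False  # trusted sources say we are far off, in either direction
    return True
```

Using the absolute offset covers both a clock stuck in the past and the stuck-in-the-future case.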

Companion-side note

Slightly orthogonal but related: in our bot we already log every received advert and compute drift vs the local clock, which is how we found which repeaters are out of sync. If this ships, exposing "last applied correction (delta, source key, when)" via the existing companion API would make it easy to show in apps.
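As an illustration only, the "last applied correction" record could be as small as the structure below. The field names and the dictionary form are hypothetical; how it would actually be exposed through the companion API is left open here.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AppliedCorrection:
    delta_s: int        # seconds added to the local clock (negative = stepped back)
    source_pubkey: str  # hex pubkey of a trusted source behind the correction
    applied_at: int     # epoch seconds at which the step was applied


def correction_report(correction):
    """Plain-dict form of the record, e.g. for serialising to an app."""
    return asdict(correction)
```

With a median over several sources there is no single "source key", so an implementation would need to decide whether to report the source closest to the median or all contributing keys.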
