feat: room server — reconnect sync fix, pubkey alias system, name resolution endpoint #220

Closed
tjdownes wants to merge 7 commits into rightup:dev from tjdownes:feat/room-server-names

Conversation

@tjdownes
Contributor

@tjdownes tjdownes commented May 1, 2026

Summary

Four improvements to the room server, each independently useful:

1. Fix: queued messages not delivered after client eviction (room_server.py)

After the inactivity timer evicts a client, its last_activity is set to 0. When such a client reconnected, the sync loop silently skipped it: its push_failures counter had already reached MAX_PUSH_FAILURES, and the loop checked that guard before checking for a reconnect. Messages queued during the client's absence were never delivered.

Fix: detect the reconnect first (active client with last_activity == 0), reset push_failures and restore last_activity, then continue to the normal push path.
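
The ordering change can be sketched as follows. This is a minimal illustration rather than the code in room_server.py: the `ClientSync` dataclass, the `should_push` helper, and the value of `MAX_PUSH_FAILURES` are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_PUSH_FAILURES = 3  # assumed value; the PR does not state it


@dataclass
class ClientSync:
    """Hypothetical stand-in for one room_client_sync row."""
    last_activity: float
    push_failures: int = 0
    pending_ack_crc: int = 0
    sync_since: float = 0.0


def should_push(state: ClientSync, is_active: bool, now: float) -> bool:
    """Return True if the sync loop should try to push to this client."""
    # Reconnect check comes first: an active client whose last_activity
    # is 0 was evicted earlier and has just come back.
    if is_active and state.last_activity == 0:
        state.push_failures = 0
        state.pending_ack_crc = 0
        state.last_activity = now
        # sync_since is left untouched so already-seen messages
        # are not replayed.

    # Only now apply the failure guard that used to run first.
    if state.push_failures >= MAX_PUSH_FAILURES:
        return False
    return is_active
```

With the guard evaluated second, a returning client whose counter was already at the limit is reset and falls through to the normal push path.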

2. Fix: deduplicate client retransmissions (room_server.py)

Clients retry a message when they don't receive an ack fast enough — common for distant nodes with weak signal. Without deduplication every retry lands as a separate message in the room.

Add an in-memory _recent_posts cache keyed on (pubkey_hex, message_text), with the text normalised by stripping trailing \x00 and U+FFFD characters so that retransmissions compare equal. Any identical (author, text) pair arriving within DEDUP_WINDOW_SECS (30 s) is silently dropped. The cache is evicted once it grows past 500 entries to bound memory usage. Server-authored messages (web UI posts) bypass the check via the existing allow_server_author flag.
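
A rough sketch of a time-window dedup cache of this kind is shown below. The `PostDeduper` class, the injectable clock, and the eviction policy (dropping entries older than the window once the size limit is exceeded) are assumptions for illustration; the PR does not say exactly how eviction is performed.

```python
import time

DEDUP_WINDOW_SECS = 30
MAX_CACHE_ENTRIES = 500


class PostDeduper:
    """Hypothetical helper; in the PR the cache lives on the room server."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._recent_posts = {}  # (pubkey_hex, text) -> time first accepted

    def is_duplicate(self, pubkey_hex, text, allow_server_author=False):
        # Server-authored posts (web UI) bypass the check entirely.
        if allow_server_author:
            return False

        now = self._clock()
        key = (pubkey_hex, text)
        seen = self._recent_posts.get(key)
        if seen is not None and now - seen < DEDUP_WINDOW_SECS:
            return True  # same author and text inside the window

        self._recent_posts[key] = now
        if len(self._recent_posts) > MAX_CACHE_ENTRIES:
            # Assumed policy: keep only entries still inside the window.
            cutoff = now - DEDUP_WINDOW_SECS
            self._recent_posts = {
                k: t for k, t in self._recent_posts.items() if t >= cutoff
            }
        return False
```

Passing a fake clock, as the constructor allows, makes the window-expiry case testable without sleeping.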

3. Feat: pubkey alias system (sqlite_handler.py, storage_collector.py, api_endpoints.py)

Phone-app users connecting via the room server protocol never broadcast LoRa adverts, so their messages always appear as an unresolved pubkey. This adds a pubkey_aliases table so admins can manually assign display names.

  • pubkey_aliases table created on DB init; migration 12 for existing databases
  • get_node_name_by_pubkey() on StorageCollector checks aliases first (highest priority), then falls back to the adverts table
  • New endpoints:
    • GET /api/pubkey_aliases — list all aliases
    • POST /api/pubkey_alias — set alias {pubkey, alias}
    • DELETE /api/pubkey_alias?pubkey=... — remove alias
    • GET /api/resolve_pubkey_names?pubkeys=a,b,c — lightweight batch lookup for the UI poll cycle, returns {pubkey: name|null} map
  • GET /api/acl_clients response now includes node_name (alias → advert fallback) for each connected client
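
The alias-first lookup and the batch map returned by /api/resolve_pubkey_names can be sketched against an in-memory SQLite database. The column names, the shape of the adverts table, and the free functions below are assumptions for the example; the real schema and method signatures in sqlite_handler.py and storage_collector.py may differ.

```python
import sqlite3


def init_db(conn):
    # Assumed schemas, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pubkey_aliases "
        "(pubkey TEXT PRIMARY KEY, alias TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS adverts "
        "(pubkey TEXT PRIMARY KEY, node_name TEXT)"
    )


def set_alias(conn, pubkey, alias):
    # Insert a new alias or overwrite the existing one for this pubkey.
    conn.execute(
        "INSERT OR REPLACE INTO pubkey_aliases (pubkey, alias) VALUES (?, ?)",
        (pubkey, alias),
    )


def get_node_name_by_pubkey(conn, pubkey):
    # Aliases have the highest priority.
    row = conn.execute(
        "SELECT alias FROM pubkey_aliases WHERE pubkey = ?", (pubkey,)
    ).fetchone()
    if row:
        return row[0]
    # Fall back to the name learned from adverts, if any.
    row = conn.execute(
        "SELECT node_name FROM adverts WHERE pubkey = ?", (pubkey,)
    ).fetchone()
    return row[0] if row else None


def resolve_pubkey_names(conn, pubkeys_param):
    # Parse "a,b,c" and build the {pubkey: name|None} map.
    keys = [k.strip() for k in pubkeys_param.split(",") if k.strip()]
    return {k: get_node_name_by_pubkey(conn, k) for k in keys}
```

Serialised as JSON, a `None` value becomes `null`, which matches the {pubkey: name|null} shape described for the endpoint.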

4. Test coverage (tests/)

58 new tests across 4 files:

| File | Tests | Coverage |
|---|---|---|
| test_sqlite_pubkey_aliases.py | 22 | CRUD against real in-memory SQLite |
| test_storage_collector_name_resolution.py | 11 | alias-first priority chain, fallback, None cases |
| test_room_server_reconnect.py | 14 | eviction/reconnect sync loop, max-failure skip, normal push, dedup (5 cases) |
| test_resolve_pubkey_names_endpoint.py | 12 | batch resolution, companion fallback, alias priority, error handling |

Tests use importlib.util.spec_from_file_location to load modules directly, bypassing __init__.py chains that pull in unavailable hardware dependencies.
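
For readers unfamiliar with the technique, a minimal loader along these lines might look like the following. The helper name `load_module_from_path` is made up for the example; the importlib calls are the standard-library ones.

```python
import importlib.util
import sys


def load_module_from_path(name, path):
    """Load a single .py file as a module without importing its package."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code that looks itself up in
    # sys.modules during import can find the module.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module
```

Because the file is loaded by path, the package's `__init__.py` never runs. Imports made inside the loaded file itself still execute, so any hardware-only dependencies it references directly would still need to be stubbed.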


UI changes

Companion UI PR: tjdownes/pyMC-RepeaterUI#1 (draft, targeting pyMC-dev/pyMC-RepeaterUI)

Per discussion with @rightup, built UI assets are not included here — the UI source PR should be reviewed and built separately.

🤖 Generated with Claude Code

tjdownes and others added 7 commits April 30, 2026 18:08
When a client (e.g. a phone-app user) disconnects and is inactive for
more than INACTIVE_CLIENT_TIMEOUT (1 hour), the eviction routine marks
their room_client_sync row with last_activity=0 and removes them from
the in-memory ACL.  Previously, when that client reconnected and was
re-added to the ACL, the sync loop would see last_activity=0 and
permanently skip them — so any messages that arrived while they were
offline were never delivered.

Fix: treat last_activity==0 as a reconnect signal rather than a
permanent skip.  The sync loop now restores the client's sync state
(preserving sync_since so already-seen messages are not replayed,
resetting push_failures and pending_ack_crc) and falls through to the
normal unsynced-message check so the backlog is delivered immediately.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds a new `pubkey_aliases` table so admins can assign human-readable
display names to client public keys that never appear in the adverts
table (e.g. MeshCore phone-app users whose device does not broadcast
LoRa adverts).

- sqlite_handler.py: `pubkey_aliases` table created on DB init plus
  migration 12 for existing databases; CRUD helpers set/get/delete/list
- storage_collector.py: `get_node_name_by_pubkey()` now checks aliases
  first (highest priority), then falls back to the adverts table as
  before; new wrapper methods set/delete/list aliases forwarded to the
  DB handler
- api_endpoints.py: new GET /api/pubkey_aliases and
  POST|DELETE /api/pubkey_alias endpoints; GET /api/resolve_pubkey_names
  batch lookup for the UI poll cycle; `acl_clients` response now
  includes `node_name` resolved via the same lookup chain

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…nd API endpoint

53 tests across four files:
- test_sqlite_pubkey_aliases: CRUD against real in-memory SQLite
- test_storage_collector_name_resolution: alias-first priority chain
- test_room_server_reconnect: eviction/reconnect sync loop behaviour
- test_resolve_pubkey_names_endpoint: batch name resolution endpoint

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Clients retry a message when they don't receive an ack fast enough
(common for distant nodes with weak signal). Without deduplication every
retry lands as a separate message in the room.

Add an in-memory `_recent_posts` cache keyed on (pubkey_hex, message_text).
Any identical (author, text) pair arriving within DEDUP_WINDOW_SECS (30s)
is silently dropped. Cache is evicted when it grows past 500 entries to
bound memory usage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

5 tests covering TestAddPostDedup:
- First post is stored normally
- Identical (author, text) within 30s window is dropped
- Same text accepted again after the window expires
- Different text from the same author always accepted
- Same text from different authors both stored independently

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The MeshCore client retransmits unacknowledged messages up to 3 times.
Each retransmission carries the SAME sender_timestamp (the original send
time encoded in the packet by the firmware), making it a reliable identity
for deduplication regardless of how the text bytes decode.

Previously the dedup key was (author, message_text) + a 30 s time window.
This failed in two ways:
  1. The client's retry interval can exceed 30 s, so the third attempt
     often slipped through.
  2. pymc_core decodes invalid trailing bytes as U+FFFD (□) via
     errors="replace".  If one copy decodes as "Test" and a retry as
     "Test□" the text keys differ and the duplicate is stored anyway.

Changes:
- pymc_core text handler now exposes sender_timestamp, attempt, and
  txt_type in packet.decrypted so callers can use them.
- repeater text.py extracts the client timestamp and prefers it over
  int(time.time()) when calling add_post(); also strips U+FFFD from
  message text alongside the existing null-byte strip.
- room_server add_post() uses (author_hex, sender_timestamp) as the
  primary dedup key.  Falls back to (author_hex, normalised_text) +
  DEDUP_WINDOW_SECS for the web-API path where sender_timestamp is 0.
- Tests updated: 5 → 7 cases covering both the timestamp path and the
  text-window fallback, including the mixed-text retry scenario.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
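
The key selection described in this commit can be sketched roughly as below. The `dedup_key` function and the tuple tags are hypothetical and only illustrate the "timestamp first, normalised text as fallback" idea; they are not the code in room_server.py.

```python
def dedup_key(author_hex, text, sender_timestamp):
    """Pick the dedup key for a post (illustrative only)."""
    if sender_timestamp:
        # Retransmissions carry the same sender_timestamp, so the key
        # is stable even if the text decodes differently.
        return ("ts", author_hex, sender_timestamp)
    # Web-API path: sender_timestamp is 0, so fall back to the text
    # with trailing NULs and U+FFFD replacement characters stripped.
    return ("text", author_hex, text.rstrip("\x00\ufffd"))
```

A caller would still apply DEDUP_WINDOW_SECS to keys produced by the fallback branch.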
…hanges

The previous commit introduced a timestamp-based dedup path that required
pymc_core to expose sender_timestamp in packet.decrypted.  That dependency
is unnecessary — the simpler fix is to normalise the message text before
comparing, which collapses all retransmission variants to the same string.

pymc_core decodes invalid trailing bytes as U+FFFD (□) via errors="replace".
The firmware null-terminates C strings.  Either way, all retransmissions of
the same message should produce identical normalised text.

Changes:
- text.py: extend rstrip to also strip U+FFFD (□) alongside existing \x00 strip.
  Comment explains why: retransmissions can have trailing replacement chars.
- room_server.py: revert to simple (author, normalised_text) + time-window dedup.
  No sender_timestamp involvement; works with any pymc_core version.
- tests: restore 5+2 test layout covering dedup, window expiry, per-author
  isolation, and web-API (sender_timestamp=0) path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
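
The normalisation step can be illustrated with a short sketch. The `normalise_text` helper is a made-up name, and decoding as UTF-8 is an assumption; the commit only states that decoding uses errors="replace" and that trailing \x00 and U+FFFD are stripped.

```python
def normalise_text(raw: bytes) -> str:
    """Decode message bytes and strip trailing padding artefacts."""
    # Invalid bytes become U+FFFD instead of raising an error.
    text = raw.decode("utf-8", errors="replace")
    # Strip any run of trailing NULs and replacement characters.
    # Characters in the middle of the message are left alone.
    return text.rstrip("\x00\ufffd")
```

With this in place, `b"Test"`, `b"Test\x00"`, and `b"Test\xff"` all normalise to the same string, so the (author, normalised_text) key matches across retransmissions.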
@tjdownes tjdownes closed this May 1, 2026
@tjdownes tjdownes deleted the feat/room-server-names branch May 1, 2026 14:03