Add dora_monitor: Slack alerting tool for ethrex devnet #1
Open
edg-l wants to merge 6 commits into
- guard the slot-set trim against last_known_head=0 (previously the cutoff could go negative and silently never trim)
- pick canonical fork by client majority instead of highest head_slot (a minority fork can briefly be ahead during a split)
- offline alert only on status=offline; synchronizing/optimistic are normal transient states and were over-paging
- split Slack messages on line boundaries when they exceed 3800 chars instead of letting Slack silently truncate
- distinguish Slack 429 in the error log
- cap /clients/execution HTML read at 512KB to bound regex work
- clearer error on unknown YAML keys (top-level and under checks:)
- minor: docstring noting heartbeat snapshots aren't atomic, simpler dry-run prefix closure, cleaner status check in DoraClient._get
- post heartbeats via Block Kit (header / section / divider / context) instead of one mrkdwn blob; action alerts stay as plain text posts
- new send_blocks() on SlackNotifier with text fallback for notifications
- collapse online + canonical + distance=0 clients into one bucket; surface outliers (offline, synchronizing, non-canonical, lagging) above the healthy bucket with status emoji per row
- status emojis: green/yellow/orange/red circles for online/sync/opt/off
- dry-run patches both send and send_blocks; --debug dumps blocks JSON so it can be previewed in Slack's Block Kit Builder
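As a rough illustration of the heartbeat layout described above, here is a hedged sketch of how the blocks could be assembled. The helper names, the client dictionary keys (`name`, `status`, `canonical`, `distance`), and the exact emoji shortcodes are assumptions; only the block types (header, section, divider, context) and the bucketing rule come from the commit message.

```python
# Assumed emoji shortcodes for the four statuses named in the commit message.
STATUS_EMOJI = {
    "online": ":large_green_circle:",
    "synchronizing": ":large_yellow_circle:",
    "optimistic": ":large_orange_circle:",
    "offline": ":red_circle:",
}


def is_healthy(client: dict) -> bool:
    """Healthy bucket: online, on the canonical fork, and at distance 0."""
    return (
        client["status"] == "online"
        and client["canonical"]
        and client["distance"] == 0
    )


def build_heartbeat_blocks(title: str, clients: list[dict], footer: str) -> list[dict]:
    """Build Block Kit blocks with outliers listed above the healthy bucket."""
    healthy = [c for c in clients if is_healthy(c)]
    outliers = [c for c in clients if not is_healthy(c)]

    blocks: list[dict] = [
        {"type": "header", "text": {"type": "plain_text", "text": title}}
    ]
    if outliers:
        rows = []
        for c in outliers:
            emoji = STATUS_EMOJI.get(c["status"], ":white_circle:")
            fork = "canonical" if c["canonical"] else "non-canonical"
            rows.append(
                f"{emoji} *{c['name']}*: {c['status']}, {fork}, distance {c['distance']}"
            )
        blocks.append(
            {"type": "section", "text": {"type": "mrkdwn", "text": "\n".join(rows)}}
        )
        blocks.append({"type": "divider"})

    names = ", ".join(c["name"] for c in healthy) or "none"
    summary = f"{STATUS_EMOJI['online']} {len(healthy)} healthy: {names}"
    blocks.append({"type": "section", "text": {"type": "mrkdwn", "text": summary}})
    blocks.append({"type": "context", "elements": [{"type": "mrkdwn", "text": footer}]})
    return blocks


def build_payload(blocks: list[dict], fallback_text: str) -> dict:
    """Webhook payload: `text` is the notification fallback shown alongside `blocks`."""
    return {"text": fallback_text, "blocks": blocks}
```

Dumping `build_payload(...)` with `json.dumps` gives JSON whose `blocks` array can be pasted into Slack's Block Kit Builder for a preview, which is what the `--debug` option is described as enabling.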
Propagation timing routinely produces transient 1-2 slot leads or lags that the previous code surfaced as fork alerts (followed by an immediate resolved alert a tick later). The number of ticks needed to confirm a fork is configurable via fork_confirm_ticks (default 3, roughly 90s at the default 30s poll) and is tracked per-client in the dedup state so it survives restarts.
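One way to read the confirmation rule is as a per-client streak counter that only alerts once the streak reaches the threshold. The sketch below assumes that interpretation; the function name and the shape of the `streaks` dictionary are hypothetical, and only `fork_confirm_ticks` with its default of 3 comes from the commit message.

```python
def update_fork_streaks(
    streaks: dict[str, int],
    off_canonical: set[str],
    confirm_ticks: int = 3,
) -> list[str]:
    """Advance per-client streaks and return clients confirmed on this tick.

    `streaks` maps client name -> consecutive ticks seen off the canonical
    fork. It is mutated in place so the caller can persist it (for example
    inside a JSON state file) and reload it after a restart.
    """
    confirmed = []
    for client in off_canonical:
        streaks[client] = streaks.get(client, 0) + 1
        # Alert exactly once, on the tick where the threshold is reached.
        if streaks[client] == confirm_ticks:
            confirmed.append(client)
    # A client back on the canonical fork resets its streak.
    for client in list(streaks):
        if client not in off_canonical:
            del streaks[client]
    return sorted(confirmed)
```

A client that is off the canonical fork for one or two ticks and then returns never reaches the threshold, so neither a fork alert nor the matching resolved alert is produced.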
Adds dora_monitor/, a Python 3.10+ tool that polls a Dora explorer API and posts Slack alerts when the tracked client (default: ethrex) has issues.

Summary
- […] status != online, and EL version drift (deploy/rollback detection).
- […] /clients/execution HTML page because Dora's /v1/clients/execution JSON endpoint reflects devp2p-crawler connectivity (connected/disconnected), not the UI's Ready/Synchronizing/Offline status. This is documented in the README.
- […] /api/v1/network/client_head_forks (CL view); an EL-only crash is detected indirectly via the paired beacon's head_slot stalling.

Test plan
- […] config.example.yaml, fill in dora_url and slack_webhook_url, run make dry-run and verify alerts print to stdout without hitting Slack.
- […] make dry-run-once against a live Dora instance and check parsed client data looks correct.
- […] --force-heartbeat.
- […] sync_lag_slots threshold, confirm a sync-lag alert fires and a recovery alert fires when the node catches up.
- […] make run for a full poll cycle; verify state JSON is written and dedup prevents duplicate alerts on subsequent runs.
- […] --reset-state and confirm alerts re-fire on next tick.
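The dedup behaviour exercised by the last two test-plan items can be pictured with a small sketch. This is an assumption about how such a state file might work, not the tool's actual format: the `active` key, the alert-key strings, and the function names are all hypothetical.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Load the dedup state, or start empty if the file does not exist."""
    if path.exists():
        return json.loads(path.read_text())
    return {"active": []}


def save_state(path: Path, state: dict) -> None:
    """Write the dedup state as JSON so it survives restarts."""
    path.write_text(json.dumps(state, indent=2))


def diff_alerts(state: dict, firing: set[str]) -> tuple[list[str], list[str]]:
    """Compare currently firing alert keys with the persisted active set.

    Returns (new, resolved): keys to announce now and keys that recovered.
    Keys already active are not returned again, which is the dedup.
    """
    active = set(state.get("active", []))
    new = sorted(firing - active)
    resolved = sorted(active - firing)
    state["active"] = sorted(firing)
    return new, resolved
```

Under this model, resetting the state amounts to deleting the file: the next tick sees an empty active set, so every condition that is still firing is reported again.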