refactor: update existing grafana alerts#173

Merged
amdove merged 6 commits into main from update-current-alerts
Mar 11, 2026

@amdove commented Mar 10, 2026

Description

No functional changes. This PR only updates Grafana alert configuration and formatting, as follows.

Severity adjustments:

  • Loki WAL alert: CRITICAL → WARNING
  • CrashLoopBackoff: CRITICAL → WARNING
  • Node Not Ready: CRITICAL → WARNING

FSx alerts:

  • Split single alert into warning (80%) and critical (90%) thresholds
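
The two-tier split can be sketched as a small classifier. This is only an illustration, not the alert rule itself: it assumes the 80% and 90% figures refer to used storage capacity and that the comparisons are inclusive, and the function and parameter names are hypothetical.

```python
def fsx_severity(used_bytes: float, capacity_bytes: float,
                 warning_pct: float = 80.0, critical_pct: float = 90.0) -> str:
    """Classify FSx storage utilization into ok / warning / critical.

    Assumption: thresholds apply to the percentage of capacity in use,
    and a value exactly at a threshold triggers that severity.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    used_pct = 100.0 * used_bytes / capacity_bytes
    # Check the stricter threshold first so critical takes precedence.
    if used_pct >= critical_pct:
        return "critical"
    if used_pct >= warning_pct:
        return "warning"
    return "ok"
```

Checking the critical threshold first means a file system at 95% raises only the critical tier in this sketch; whether the real warning alert also fires at that level depends on how the two rules are defined.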

CloudWatch cleanup:

  • Removed EC2 NetworkOut and NetworkPacketsOut alerts (low signal, high noise)

Pod alerts cleanup:

  • Removed PodError, PodNotHealthy, and PodRestarts

RDS alerts:

  • Updated CPU utilization alert (80% threshold)
  • Split storage into warning (10 GiB) and critical (5 GiB); removed old single-threshold alert
  • Updated freeable memory threshold (512 MiB → 256 MiB) and unpaused the alert
  • Removed read latency alert

Load balancer alerts:

  • Standardized annotation format to match other alerts (WHERE/DETAILS sections)

Node alerts:

  • Added comment header template for consistency

RDS storage thresholds:

  • These alerts use static thresholds rather than percentage-based ones because CloudWatch exposes only FreeStorageSpace, not total allocated storage.
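
A minimal sketch of the static-threshold logic, assuming FreeStorageSpace is reported in bytes and that an alert fires when free space drops strictly below the threshold. The function name, the `GIB` constant, and the comparison direction are illustrative assumptions, not taken from the alert definitions in this PR.

```python
GIB = 1024 ** 3  # bytes per GiB

def rds_storage_severity(free_storage_bytes: float,
                         warning_gib: float = 10,
                         critical_gib: float = 5) -> str:
    """Classify RDS free storage using absolute thresholds.

    Only free space is known (no total allocated storage), so the
    thresholds are absolute sizes rather than percentages.
    Assumption: a tier fires when free space is strictly below its threshold.
    """
    if free_storage_bytes < critical_gib * GIB:
        return "critical"
    if free_storage_bytes < warning_gib * GIB:
        return "warning"
    return "ok"
```

With these defaults, an instance with 8 GiB free would be classified as warning and one with 4 GiB free as critical. The trade-off of absolute thresholds is that 10 GiB means something different on a small instance than on a large one.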

Category of change

  • Bug fix (non-breaking change which fixes an issue)
  • Version upgrade (upgrading the version of a service or product)
  • New feature (non-breaking change which adds functionality)
  • Build: a code change that affects the build system or external dependencies
  • Performance: a code change that improves performance
  • Refactor: a code change that neither fixes a bug nor adds a feature
  • Documentation: documentation changes
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

  • I have reviewed my own diff and added inline comments on lines I want reviewers to focus on or that I am uncertain about

@amdove marked this pull request as ready for review March 11, 2026 19:16
@amdove requested a review from a team as a code owner March 11, 2026 19:16
@amdove changed the title from "Update existing alerts" to "refactor: update existing grafana alerts" Mar 11, 2026
@amdove requested a review from timtalbot March 11, 2026 19:22
@amdove enabled auto-merge March 11, 2026 19:23

@stevenolen left a comment

🎉

@amdove added this pull request to the merge queue Mar 11, 2026
Merged via the queue into main with commit 52bad5b Mar 11, 2026
4 checks passed
@amdove deleted the update-current-alerts branch March 11, 2026 21:03