docs: add health check, readiness, and liveness endpoints guide #281
sriramveeraghanta wants to merge 1 commit
Document the liveness, readiness, and detailed health probes exposed by self-hosted Plane (Commercial Edition) services, and how to consume them.
- New page: `self-hosting/manage/health-checks.md` covering the primary API probes (`/api/live/`, `/api/ready/`, `/api/health/`), per-service endpoints (`pi`, `live`, `silo`, `flux`, `node-runner`) and the Go `monitor` prober, with `curl` examples, exact JSON responses, status codes, and the 5s result cache.
- Add "use in your infrastructure" examples: Kubernetes liveness/readiness probes, Docker Compose healthchecks, and external uptime/LB guidance, based on Plane's actual Helm and Compose probe configs.
- Note the edition split: Community Edition exposes only the root `/` check; the dedicated probes are Commercial Edition.
- Wire the page into the Self-hosting > Manage sidebar.
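For readers unfamiliar with a short-lived result cache like the 5s one mentioned above, here is a minimal sketch of the general idea. It is not Plane's implementation: the class name `CachedCheck`, the injectable `clock`, and the boolean result are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional, Tuple


class CachedCheck:
    """Hypothetical TTL cache around an expensive health check."""

    def __init__(
        self,
        check: Callable[[], bool],
        ttl: float = 5.0,  # 5s, matching the cache window named in the description
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check
        self._ttl = ttl
        self._clock = clock
        # (timestamp, value) of the last real check, or None before the first call.
        self._cached: Optional[Tuple[float, bool]] = None

    def result(self) -> bool:
        now = self._clock()
        # Serve the cached value while it is younger than the TTL.
        if self._cached is not None and now - self._cached[0] < self._ttl:
            return self._cached[1]
        # Otherwise run the real check and remember when it ran.
        value = self._check()
        self._cached = (now, value)
        return value
```

A cache of this kind means a probe that fires more often than the TTL can see a result up to one TTL old, which is worth keeping in mind when choosing a probe interval.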
Walkthrough
This PR introduces comprehensive documentation for health check probes in self-hosted Plane installations. It adds a new documentation page that covers probe endpoint semantics, service-specific endpoint details across the Plane architecture, and operational guidance for Kubernetes, Docker Compose, and external monitoring with troubleshooting.
Changes
Health checks documentation for self-hosted Plane
Estimated code review effort: 2 (Simple), ~12 minutes
Pre-merge checks: 5 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/self-hosting/manage/health-checks.md`:
- Line 452: The text uses non-Kubernetes term "start period" which is
misleading; update the sentence to use Kubernetes probe terminology only by
replacing "start period" with either an explicit reference to startupProbe
behavior or by stating the combined effect of initialDelaySeconds plus
startupProbe settings, and clarify that the ~90s comes from initialDelaySeconds:
30 plus any configured startupProbe delays; also change "live`/`silo` readiness
probe" to reference the actual probe type (readinessProbe or startupProbe) and
replace "period" with Kubernetes field name periodSeconds, e.g., note that a
readinessProbe with failureThreshold: 30 and periodSeconds: 10 tolerates ~300s
before marking unhealthy; ensure the paragraph mentions startupProbe if that was
intended.
Files selected for processing (2)
- docs/.vitepress/config.mts
- docs/self-hosting/manage/health-checks.md
::: info Timing implications
With `initialDelaySeconds: 30` plus the start period, the API and pi-api pods take roughly 90 seconds before they are considered ready — this is intentional, giving migrations and warm-up time to complete. The `live`/`silo` readiness probe with `failureThreshold: 30` at a 10s period tolerates up to ~305 seconds of startup before marking the pod unhealthy.
Kubernetes timing note uses Docker Compose terminology.
“start period” is not a Kubernetes probe field, so this guidance can mislead probe tuning. Rephrase using Kubernetes terms only (or explicitly reference startupProbe if that’s what you mean).
Suggested doc fix
-With `initialDelaySeconds: 30` plus the start period, the API and pi-api pods take roughly 90 seconds before they are considered ready — this is intentional, giving migrations and warm-up time to complete. The `live`/`silo` readiness probe with `failureThreshold: 30` at a 10s period tolerates up to ~305 seconds of startup before marking the pod unhealthy.
+With `initialDelaySeconds: 30`, readiness checks begin after 30 seconds for API and pi-api. Combined with `periodSeconds` and `failureThreshold`, this gives enough warm-up time for migrations and startup. The `live`/`silo` readiness probe with `failureThreshold: 30` at a 10s period tolerates up to ~305 seconds of startup before marking the pod unhealthy.
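As a rough check on the numbers in this thread, the tolerance window can be estimated as the initial delay plus `failureThreshold` times `periodSeconds`. The sketch below is only an approximation (it ignores `timeoutSeconds` and scheduling jitter), and the function name is hypothetical. The 5-second initial delay in the second call is an assumption chosen to show one way a figure of ~305 could arise; the thread does not state the configured value for `live`/`silo`.

```python
def probe_tolerance_seconds(
    failure_threshold: int,
    period_seconds: int,
    initial_delay_seconds: int = 0,
) -> int:
    """Rough time before `failure_threshold` consecutive failures accumulate.

    Approximation only: ignores timeoutSeconds and scheduling jitter.
    """
    return initial_delay_seconds + failure_threshold * period_seconds


# Values quoted in the thread: failureThreshold: 30, periodSeconds: 10.
print(probe_tolerance_seconds(30, 10))  # 300

# With an assumed 5s initial delay (not stated in the thread).
print(probe_tolerance_seconds(30, 10, initial_delay_seconds=5))  # 305
```

Stating the inputs alongside the result in the docs would let operators recompute the window after changing any one field.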
Summary
Adds a self-hosting guide documenting the liveness, readiness, and health check endpoints exposed by Plane services, sourced from the `plane-ee` repo and verified against the actual code, Helm charts, and Docker Compose files.

Operators currently have no documentation for wiring up uptime monitors, load-balancer health checks, or Kubernetes/Docker probes against a self-hosted instance. This page closes that gap.

What's included
- Primary API probes: `/api/live/`, `/api/ready/` (DB + cache), and `/api/health/` (detailed), with `curl` examples, exact JSON responses, status codes (`200`/`503`), and the 5s per-process result cache.
- Per-service endpoints: `pi`, `live` (incl. secret-key memory endpoint), `silo` (incl. its non-standard `201` codes), `flux`, `node-runner`, and the Go `monitor` prober.
- Infrastructure usage: `livenessProbe`/`readinessProbe` YAML, Docker Compose `healthcheck`, and external uptime/LB guidance, based on Plane's actual Helm and Compose probe configs.
- Troubleshooting: `503`/`401`/`500` responses.

Edition note
Community Edition exposes only the basic root `/` health check; the dedicated probes are Commercial Edition features. The page is badged and the distinction is called out.

Other changes
- Sidebar wiring in `docs/.vitepress/config.mts`.

Validation
- `pnpm check:format` (Prettier) passes
- `pnpm build` (VitePress) succeeds with no dead-link errors
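To illustrate how an external monitor might consume the readiness probe described above, here is a minimal sketch using only the Python standard library. The `200`/`503` mapping follows this description; the function names, the result labels, and the base URL passed in by the caller are assumptions, and the response body is not inspected.

```python
import urllib.error
import urllib.request


def classify_probe(status_code: int) -> str:
    """Map a probe HTTP status to a coarse label (labels are illustrative)."""
    if status_code == 200:
        return "healthy"
    if status_code == 503:
        return "unhealthy"
    return "unexpected"


def probe(base_url: str, path: str = "/api/ready/", timeout: float = 5.0) -> str:
    """Request a probe endpoint and classify the outcome."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_probe(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for non-2xx responses such as 503.
        return classify_probe(err.code)
    except OSError:
        # Covers URLError, connection failures, and timeouts.
        return "unreachable"
```

A monitor could call `probe("https://plane.example.com")` on a schedule, where the hostname is a placeholder for the operator's own instance. Per the description, `silo` uses non-standard `201` codes, so a check written this way would need a different mapping for that service.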