A full observability stack for a Gunicorn/FastAPI application, running entirely in Docker Compose. Covers all four pillars — metrics · logs · traces · profiles — with alerting out of the box.
Grafana dashboard: FastAPI Full Observability
- Dashboard Preview
- Architecture
- How the Backend Works
- Grafana Dashboard
- Generate Load
- Quick Start
- Stack
- Alerting
- Adding Your Own Service
- Multiple Environments
- Ports
Open http://localhost:3000 after starting the stack. Use the Project dropdown to filter by Docker Compose project name.
Metrics
- Backend exposes `/metrics` in Prometheus format (multiprocess-safe via `prometheus_multiproc`)
- Alloy pulls it every 15s — opt-in via the `metrics.scrape: "true"` Docker label on any service
- Alloy remote-writes to VictoriaMetrics
- vmalert queries VictoriaMetrics every 15s → fires to Alertmanager → Telegram / email
Logs
- Backend writes structured logfmt to stdout (includes `trace_id` on every line)
- Alloy reads container stdout via the Docker API — no log driver config needed
- Docker Compose labels (`project`, `service`, `container`) are attached as Loki stream labels
- Logs are queryable in Grafana with LogQL
Traces
- Backend's OpenTelemetry SDK sends spans via OTLP gRPC → Alloy (`:4317`) → Tempo
- Tempo generates span metrics (RPS, latency, errors per operation) and pushes them to VictoriaMetrics
- `trace_id` in log lines creates a live link from any log entry to its full trace
Profiles
- Pyroscope SDK pushes CPU flame graphs via HTTP → Alloy (`:4040`) → Pyroscope storage
- Grafana links profiles to traces via Tempo's `tracesToProfiles` integration
The backend/ directory is a minimal FastAPI + Gunicorn application wired with all four observability signals. It is intentionally kept simple — the goal is to show the instrumentation, not the business logic.
Every request passes through three middlewares in order:
RequestAccessMiddleware → MetricsMiddleware → FastAPI router
| Middleware | What it does |
|---|---|
| `RequestAccessMiddleware` | Generates `request_id`; writes a structured logfmt access log line with method, path, status, duration |
| `MetricsMiddleware` | Records `requests_total`, `responses_total`, `request_duration_seconds`, `requests_in_progress`, `exceptions_total` |
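The middleware pattern is simple to sketch. The following is a simplified illustration of the idea, not the project's actual code — plain Python attributes stand in for the real `prometheus_client` counters, gauges, and histograms:

```python
import time

class MetricsMiddleware:
    """Wrap the downstream handler; count requests, track in-progress
    requests, and record per-request durations."""

    def __init__(self, app):
        self.app = app
        self.requests_total = 0
        self.exceptions_total = 0
        self.in_progress = 0
        self.durations = []

    def __call__(self, request):
        self.requests_total += 1
        self.in_progress += 1
        start = time.perf_counter()
        try:
            return self.app(request)
        except Exception:
            self.exceptions_total += 1
            raise
        finally:
            # Runs whether the handler succeeded or raised
            self.durations.append(time.perf_counter() - start)
            self.in_progress -= 1

# Wrap a trivial "router" and send two requests through it
handler = MetricsMiddleware(lambda request: {"status": 200, "path": request})
handler("/items")
handler("/items")
```

In the real app the same shape is expressed as Starlette/FastAPI middleware, and the counters are Prometheus metric objects so they survive scraping and aggregation.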
Gunicorn forks multiple worker processes. The standard Prometheus client is not process-safe by default.
The backend uses prometheus_multiproc mode — each worker writes its metrics to a shared directory (/tmp/prometheus_multiproc). The /metrics endpoint aggregates all files before responding.
Worker lifecycle hooks in gunicorn.conf.py ensure per-worker gauges are cleaned up on exit.
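The multiprocess aggregation idea can be illustrated with a toy stand-in — JSON files instead of the mmap-backed files `prometheus_multiproc` actually uses, but the same write-per-worker, merge-on-read shape:

```python
import json
import os
import tempfile

# Shared directory, standing in for /tmp/prometheus_multiproc
MPROC_DIR = tempfile.mkdtemp()

def worker_record(pid, name, value=1):
    # Each worker process writes only to its own file, so no
    # cross-process locking is needed on the write path.
    path = os.path.join(MPROC_DIR, f"counters_{pid}.json")
    data = {}
    if os.path.exists(path):
        with open(path) as f:
            data = json.load(f)
    data[name] = data.get(name, 0) + value
    with open(path, "w") as f:
        json.dump(data, f)

def aggregate():
    # What the /metrics endpoint does: merge every worker's file
    # into one consistent snapshot before responding.
    totals = {}
    for fname in os.listdir(MPROC_DIR):
        with open(os.path.join(MPROC_DIR, fname)) as f:
            for name, value in json.load(f).items():
                totals[name] = totals.get(name, 0) + value
    return totals

# Two "workers" each served some requests
worker_record(101, "requests_total", 3)
worker_record(102, "requests_total", 2)
```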
Traces are sent via OTLP gRPC to Alloy (:4317) using BatchSpanProcessor.
FastAPIInstrumentor automatically creates a root span for every request.
PyroscopeSpanProcessor links each root span to a Pyroscope profile — enabling the Profiles button in Tempo.
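The tracing side amounts to a few lines of OpenTelemetry SDK wiring. A minimal sketch — the endpoint host `alloy` is assumed from the Compose network name, and the real app additionally registers its `PyroscopeSpanProcessor`:

```python
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

app = FastAPI()

provider = TracerProvider()
# Batch finished spans in memory and ship them to Alloy's OTLP gRPC receiver
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://alloy:4317", insecure=True))
)
trace.set_tracer_provider(provider)

# Creates a root span for every incoming request
FastAPIInstrumentor.instrument_app(app)
```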
Every log line is emitted in logfmt format and includes:
- `level`, `timestamp`, `message`
- `request_id` — unique per request, also returned as the `X-Request-Id` response header
- `trace_id`, `span_id` — injected from the active OpenTelemetry span, enabling Logs → Traces navigation in Grafana
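Rendering logfmt is straightforward. A minimal sketch of the format (the field values here are made up for illustration):

```python
def logfmt(**fields):
    """Render key=value pairs; quote values containing spaces,
    per the logfmt convention."""
    parts = []
    for key, value in fields.items():
        value = str(value)
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{key}={value}")
    return " ".join(parts)

line = logfmt(
    level="info",
    message="GET /items 200",
    request_id="req-42",
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
    span_id="00f067aa0ba902b7",
)
```

Because `trace_id` and `span_id` are plain key=value pairs, Loki can parse them with `| logfmt` and Grafana can turn `trace_id` into a Tempo link.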
- Apdex — user satisfaction score based on latency thresholds
- Error Rate — percentage of 5xx responses
- Total Requests — cumulative request count
- Workers — live Gunicorn worker count (step graph, drops to 0 on crash)
- RPS total, broken down by path and by status code
- P50 latency — total and per path
- P95 and P99 latency — total and broken down by path
- Top 10 Slowest Endpoints (P95 bar gauge, color-coded green → red)
- Average Duration by endpoint
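Latency panels like these are typically built with `histogram_quantile` over the duration histogram. A hedged sketch — it assumes `request_duration_seconds` is exported as a Prometheus histogram with a `path` label, per the metrics listed in the middleware table, not a query copied from the shipped dashboard:

```promql
# P95 latency per path over a 5-minute window
histogram_quantile(0.95,
  sum by (le, path) (rate(request_duration_seconds_bucket[5m]))
)
```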
Requests currently being processed — total and by path. Useful for detecting request pile-ups.
- 4xx Rate by path — client errors
- 5xx Rate by path — server errors
- Exceptions by Type — unhandled Python exceptions with rate
- Service Map — visual request flow graph with avg latency and RPS per node
- Recent Traces — clickable list, opens full trace in Tempo
- Span RPS / P99 Latency / Error Rate — broken down per operation
- Log Volume by Level — histogram showing INFO / ERROR / WARNING over time
- Log Stream — live log view; each line includes `request_id`, `trace_id`, `span_id`
- CPU Time Consumed — total CPU usage over time across all workers
- Flame Graph — aggregated call stack for the selected time range; shows hottest functions by self/total CPU time
Every log line contains a trace_id linking it to a distributed trace.
Logs → Traces:
- Click any line in the Log Stream to expand it
- In the Links section, click "Open in Tempo"
From a Trace span you can jump to:
- Logs for this span — correlated log lines in Loki
- Span metrics — RPS / latency / error rate for that operation
- Profile — CPU flame graph for that request
To populate the dashboards with real traffic, run the included load script:
```shell
./load.sh                           # ~20 req/s against http://localhost:8000
./load.sh 50                        # custom rate
./load.sh 10 http://localhost:8001  # custom rate + URL
```
The script cycles through all endpoints — normal requests, 4xx, 5xx, slow — so every dashboard panel gets data within seconds.
Edit observability/alertmanager/config.yaml and fill in the placeholders:
```yaml
global:
  smtp_smarthost: '<SMTP_HOST>:<SMTP_PORT>'  # e.g. smtp.gmail.com:587
  smtp_from: '<SMTP_FROM>'
  smtp_auth_username: '<SMTP_USERNAME>'
  smtp_auth_password: '<SMTP_PASSWORD>'

receivers:
  - name: 'default-receiver'
    email_configs:
      - to: '<ALERT_EMAIL>'
    telegram_configs:
      - bot_token: '<TELEGRAM_BOT_TOKEN>'  # from @BotFather → /newbot
        chat_id: <TELEGRAM_CHAT_ID>        # from @userinfobot
```
If you only need Telegram, remove `email_configs`. If you only need email, remove `telegram_configs`.
```shell
docker compose up -d
```
Then open http://localhost:3000 → admin / admin
| Component | Role | Version |
|---|---|---|
| Grafana Alloy | Collector — scrapes metrics, collects logs, receives traces & profiles | v1.12.0 |
| VictoriaMetrics | Metrics storage (Prometheus-compatible) | v1.131.0 |
| vmalert | Evaluates PromQL alert rules against VictoriaMetrics | v1.131.0 |
| Alertmanager | Alert routing, deduplication & notifications | v0.29.0 |
| Loki | Log storage & querying | v3.5.9 |
| Tempo | Distributed trace storage + span metrics generation | v2.8.0 |
| Pyroscope | Continuous profiling storage | v1.17.0 |
| Grafana | Dashboards, Explore, cross-signal navigation | v12.4.0 |
| Backend | Example FastAPI app (Gunicorn + Uvicorn workers) | — |
Rules live in observability/vmalert/rules/fastapi.yaml, evaluated every 15s.
| Alert | Fires when | Severity | Delay |
|---|---|---|---|
| `BackendDown` | No metrics received from backend | critical | 1m |
| `HighErrorRate` | 5xx responses > 5% of total | warning | 2m |
| `HighLatencyP99` | p99 latency > 1s | warning | 2m |
- Critical alerts suppress warnings with the same `alertname` via inhibit rules
- All alerts route to `default-receiver` → Telegram + email
- To adjust thresholds, edit `expr` in `observability/vmalert/rules/fastapi.yaml`
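A rule in that file might look like the following sketch. The `expr` is an assumption built from the `responses_total` counter the MetricsMiddleware records, not a copy of the shipped rule:

```yaml
groups:
  - name: fastapi
    interval: 15s
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(responses_total{status=~"5.."}[5m]))
            / sum(rate(responses_total[5m])) > 0.05
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "5xx responses exceed 5% of total"
```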
To opt a service into metrics scraping, add these docker labels:
```yaml
services:
  my-service:
    labels:
      metrics.scrape: "true"           # required
      metrics.path: "/custom/metrics"  # optional, defaults to /metrics
```
Alloy auto-discovers the container and attaches `project`, `service`, `container` labels. No Alloy config changes are needed.
Logs are collected from all running containers automatically — no labels required.
Run dev and staging side by side:
```shell
docker compose -p dev up -d
docker compose -p staging up -d
```
Each project gets its own `project` label on all metrics and logs.
Switch between them in Grafana using the Project dropdown at the top of the dashboard.
| Service | Port | Purpose |
|---|---|---|
| Grafana | 3000 | Dashboards — http://localhost:3000 |
| Backend API | 8000 | FastAPI docs — http://localhost:8000/docs |
| Alloy | 12345 | Pipeline debug UI — http://localhost:12345 |