feat: migrate observability to Axiom #906
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace GCP Cloud Logging JSON output with Axiom's zerolog writer. Logs are shipped directly to Axiom's ingest API in production. Local dev still uses console logger. jsonLogger() kept for rollback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace GCP-specific TraceContext format with standard OTel field names so Axiom can correlate logs with traces. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace Uptrace with standard OTLP HTTP exporters pointed at api.axiom.co for traces and metrics. Old providers kept as dead functions for rollback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add AXIOM_TOKEN and AXIOM_DATASET configuration. Mark UPTRACE_DSN as deprecated but keep for rollback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR Summary
Not up to standards ⛔

🔴 Issues

| Category | Results |
|---|---|
| CodeStyle | 50 minor |

🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 19 |
| Duplication | 4 |
Greptile Summary
This PR migrates the production observability stack from GCP Cloud Logging + Uptrace to Axiom, covering all three signals (logs via the axiom-go zerolog adapter, traces and metrics via OTLP HTTP exporters pointed at
Confidence Score: 3/5
The observability wiring is structurally correct, but the error handling around missing credentials has two gaps that could leave production running with no traces or metrics and no logs in Axiom, with no startup-time indication of the problem.
When AXIOM_TOKEN or AXIOM_DATASET is absent, the logger silently falls back to stderr without printing why, and the OTLP exporters are constructed successfully even though every export batch will then be rejected with a 401. Both gaps mean a misconfigured deployment looks healthy at startup while losing all observability data. The fixes are small but should land before this reaches production.
api/pkg/di/container.go: both axiomLogger and initializeAxiomTraceProvider need attention; the other files are straightforward.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant App as API Service
participant Logger as axiomLogger
participant OTLP as initializeAxiomTraceProvider
participant AxiomLogs as Axiom Logs Ingest
participant AxiomOTLP as Axiom OTLP (api.axiom.co)
App->>Logger: logDriver() [non-local]
Logger->>AxiomLogs: axiomzerolog.New(SetDataset)
alt Axiom not configured
AxiomLogs-->>Logger: error
Logger-->>App: stderr fallback (silent)
else configured
AxiomLogs-->>Logger: writer
Logger-->>App: zerolog to Axiom writer
end
App->>OTLP: InitializeTraceProvider()
OTLP->>AxiomOTLP: otlptracehttp.New(endpoint, auth headers)
OTLP->>AxiomOTLP: otlpmetrichttp.New(endpoint, auth headers)
AxiomOTLP-->>OTLP: exporters created (lazy connect)
OTLP-->>App: shutdown func
App->>AxiomLogs: log entries via zerolog JSON
App->>AxiomOTLP: spans via BatchSpanProcessor to /v1/traces
App->>AxiomOTLP: metrics via PeriodicReader to /v1/metrics
Reviews (1): Last reviewed commit: "chore: add Axiom env vars to .env.docker"
axiomWriter, err := axiomzerolog.New(
    axiomzerolog.SetDataset(os.Getenv("AXIOM_DATASET")),
)
if err != nil {
    // Fall back to stderr JSON if Axiom is not configured
    zl := zerolog.New(os.Stderr).With().Timestamp().CallerWithSkipFrameCount(skipFrameCount).Logger()
    return &zerodriver.Logger{Logger: &zl}
}
Silent fallback without any output makes misconfiguration invisible in production. When axiomzerolog.New() fails (e.g. AXIOM_TOKEN or AXIOM_DATASET is unset), the function returns a stderr logger but writes nothing to explain the failure. An operator checking why logs aren't appearing in Axiom has no signal to act on: the app simply logs to stderr as if that were intentional. Writing the error to stderr before falling back gives operators a clear indication that Axiom ingestion is not active.
Suggested change:
  axiomWriter, err := axiomzerolog.New(
      axiomzerolog.SetDataset(os.Getenv("AXIOM_DATASET")),
  )
  if err != nil {
      // Fall back to stderr JSON if Axiom is not configured
+     _, _ = fmt.Fprintf(os.Stderr, "axiom logger init failed, falling back to stderr: %v\n", err)
      zl := zerolog.New(os.Stderr).With().Timestamp().CallerWithSkipFrameCount(skipFrameCount).Logger()
      return &zerodriver.Logger{Logger: &zl}
  }
func (container *Container) initializeAxiomTraceProvider(version string, namespace string) func() {
    container.logger.Debug("initializing axiom trace provider")

    headers := map[string]string{
        "Authorization":   "Bearer " + os.Getenv("AXIOM_TOKEN"),
        "X-Axiom-Dataset": os.Getenv("AXIOM_DATASET"),
    }
Missing guard for empty AXIOM_TOKEN before constructing the OTLP exporters. otlptracehttp.New() and otlpmetrichttp.New() initialize lazily: they succeed even when AXIOM_TOKEN is empty, producing an Authorization header of "Bearer " with no token after it. All subsequent export calls will receive HTTP 401 responses from Axiom and be silently dropped (the OTel batch processor discards failed batches). Unlike the logger, which at least still writes its output to stderr, there is no equivalent guard here, so traces and metrics will be lost with no startup-time indication.
Suggested change:
  func (container *Container) initializeAxiomTraceProvider(version string, namespace string) func() {
      container.logger.Debug("initializing axiom trace provider")
+     if os.Getenv("AXIOM_TOKEN") == "" {
+         container.logger.Fatal(stacktrace.NewError("AXIOM_TOKEN is required for the Axiom trace/metric provider"))
+     }
+     if os.Getenv("AXIOM_DATASET") == "" {
+         container.logger.Fatal(stacktrace.NewError("AXIOM_DATASET is required for the Axiom trace/metric provider"))
+     }
      headers := map[string]string{
          "Authorization":   "Bearer " + os.Getenv("AXIOM_TOKEN"),
          "X-Axiom-Dataset": os.Getenv("AXIOM_DATASET"),
      }
Use regional edge endpoint for improved data locality on both OTLP exporters (traces/metrics) and the zerolog log adapter. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Logs use the default Axiom API endpoint. Edge endpoint is only for OTLP trace and metric exporters. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Traces and logs go to AXIOM_TRACES_DATASET (events), metrics go to AXIOM_METRICS_DATASET (metrics). Replaces single AXIOM_DATASET. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ATASET_METRICS Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Migrate production observability from GCP Cloud Logging + Uptrace to Axiom for all three signals: logs, traces, and metrics.
Changes
Environment Variables
Rollback
All old provider code is kept as dead functions:
Testing