feat: Extend outbox processor with operational metrics (pulse.outbox.*) #116

@samtrion

User Story

As a platform engineer operating the outbox processor, I want detailed metrics for the outbox pipeline exposed via System.Diagnostics.Metrics, so that I can monitor queue depth, failure rates, and processing latency in production using OpenTelemetry-compatible tooling.


Background

The current ActivityAndMetricsRequestInterceptor covers mediator-level metrics (pulse.requests.total, pulse.request.duration, etc.), but the outbox processor background service emits no metrics. This creates a blind spot for operations teams monitoring message delivery health.


Requirements

  • Extend the existing Meter ("NetEvolve.Pulse") in Defaults.cs with the following new instruments:
    Metric Name                       Instrument Type        Unit      Description
    --------------------------------  ---------------------  --------  ------------------------------------------------
    pulse.outbox.pending              ObservableGauge<long>  messages  Current number of pending outbox messages
    pulse.outbox.processed.total      Counter<long>          messages  Cumulative number of successfully processed messages
    pulse.outbox.failed.total         Counter<long>          messages  Cumulative number of failed processing attempts
    pulse.outbox.deadletter.total     Counter<long>          messages  Cumulative number of messages moved to dead-letter
    pulse.outbox.processing.duration  Histogram<double>      ms        Duration of each outbox processing batch
  • OutboxProcessorHostedService must record all metrics during its polling loop.
  • The pulse.outbox.pending gauge must query the IOutboxRepository for the current count.
  • Metric recording must not throw; exceptions must be caught and logged at Warning level.
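
As a rough illustration of the requirements above, the instruments and the "must not throw" rule could be sketched as follows. This is only a sketch under assumptions: `OutboxMetrics`, `SetPending`, `RecordBatch`, and the `Action<Exception>` warning callback (a stand-in for a Warning-level logger call) are hypothetical names, not part of the existing code base, and the real instruments would be added to the existing Meter in Defaults.cs. Because an observable gauge callback is synchronous, the sketch assumes the polling loop pushes the count it obtained from IOutboxRepository into a cached field instead of querying the repository inside the callback.

```csharp
using System;
using System.Diagnostics.Metrics;
using System.Threading;

// Hypothetical helper type for illustration only.
public sealed class OutboxMetrics
{
    private readonly Counter<long> _processed;
    private readonly Counter<long> _failed;
    private readonly Counter<long> _deadLetter;
    private readonly Histogram<double> _duration;
    private readonly Action<Exception> _logWarning; // stand-in for a Warning-level log call
    private long _pending;

    public OutboxMetrics(Meter meter, Action<Exception> logWarning)
    {
        _logWarning = logWarning;
        _processed = meter.CreateCounter<long>(
            "pulse.outbox.processed.total", "messages",
            "Cumulative number of successfully processed messages");
        _failed = meter.CreateCounter<long>(
            "pulse.outbox.failed.total", "messages",
            "Cumulative number of failed processing attempts");
        _deadLetter = meter.CreateCounter<long>(
            "pulse.outbox.deadletter.total", "messages",
            "Cumulative number of messages moved to dead-letter");
        _duration = meter.CreateHistogram<double>(
            "pulse.outbox.processing.duration", "ms",
            "Duration of each outbox processing batch");

        // Assumption: the gauge reports the last count cached by the polling loop.
        meter.CreateObservableGauge<long>(
            "pulse.outbox.pending",
            () => Interlocked.Read(ref _pending),
            "messages",
            "Current number of pending outbox messages");
    }

    // Called by the polling loop after it has asked the repository for the pending count.
    public void SetPending(long count) => Interlocked.Exchange(ref _pending, count);

    // Records the outcome of one processing batch; never lets an exception escape.
    public void RecordBatch(long processed, long failed, long deadLettered, TimeSpan elapsed) =>
        Safe(() =>
        {
            _processed.Add(processed);
            _failed.Add(failed);
            _deadLetter.Add(deadLettered);
            _duration.Record(elapsed.TotalMilliseconds);
        });

    private void Safe(Action record)
    {
        try
        {
            record();
        }
        catch (Exception ex)
        {
            _logWarning(ex); // swallow and report, as the requirement asks
        }
    }
}
```

Caching the pending count keeps the gauge callback cheap and non-blocking, at the cost of the value being only as fresh as the last polling cycle. Whether that trade-off satisfies the "must query the IOutboxRepository" requirement is a design decision for the implementation.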

Acceptance Criteria

  • All five metrics are registered and visible when the outbox processor is configured.
  • pulse.outbox.pending reflects the actual value returned by IOutboxRepository.
  • pulse.outbox.processed.total increments by the number of messages successfully sent in a batch.
  • pulse.outbox.failed.total increments when a message transitions to Failed.
  • pulse.outbox.deadletter.total increments when a message transitions to DeadLetter.
  • pulse.outbox.processing.duration records the elapsed time per polling cycle as a histogram.
  • Unit tests verify metric instrument registration and correct increment/observation behavior.
  • Integration tests verify that metrics are non-zero after a processing cycle with messages.
  • XML documentation is updated for any modified public or internal members.
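
One possible way to check the registration and increment criteria above in a unit test is to attach a `MeterListener` to the meter under test. The sketch below is an assumption about how such a test helper might look: `OutboxMetricsProbe` and `Capture` are hypothetical names. It filters by `Meter` instance so that measurements from other meters in the same process are not picked up.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical test helper for illustration only.
public static class OutboxMetricsProbe
{
    // Runs `exercise` while listening to long-valued pulse.outbox.* instruments
    // on the given meter. Counters are summed; gauges keep their last observed value.
    public static Dictionary<string, long> Capture(Meter meter, Action exercise)
    {
        var seen = new Dictionary<string, long>();
        using var listener = new MeterListener();

        listener.InstrumentPublished = (instrument, l) =>
        {
            if (ReferenceEquals(instrument.Meter, meter)
                && instrument.Name.StartsWith("pulse.outbox.", StringComparison.Ordinal))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };

        listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
        {
            if (instrument is ObservableGauge<long>)
            {
                seen[instrument.Name] = value; // last observation wins
            }
            else
            {
                seen.TryGetValue(instrument.Name, out var current);
                seen[instrument.Name] = current + value; // accumulate counter deltas
            }
        });

        listener.Start();                       // also publishes already-created instruments
        exercise();                             // code under test records measurements here
        listener.RecordObservableInstruments(); // polls observable gauges once
        return seen;
    }
}
```

A test would create the meter and its instruments, pass the processing step as `exercise`, and then assert on the returned dictionary. The histogram records `double` values and would need a second `SetMeasurementEventCallback<double>` registration, which is left out here for brevity.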

Out of Scope

  • Custom OpenTelemetry exporters.
  • Dashboard or alerting configuration.
  • Per-event-type metric dimensions/tags (may be addressed in a follow-up).

Metadata

Labels

type:feature: Indicates a new feature or enhancement to be added.
