Guardrail span parity: emit OTel spans + attributes alongside the audit events

## Context

PR #160 (closes #159) wires all five library gates and unifies the
`guardrail_check` audit event on the `gate` vocabulary. The audit
stream now carries everything a SIEM consumer needs to filter by
gate / decision / violation.

**The OTel trace tree, however, sees nothing.** Verified:

- `forge-core/observability/attrs.go` defines no `forge.guardrail.*` keys.
- `forge-cli/runtime/guardrails_engine.go` never imports a tracer.
- `forge-core/runtime/loop.go` opens `agent.execute` / `llm.completion`
  / `tool.<name>` spans (FWS-3 Phase 3 / issue #104) but nests no span
  for guardrail checks.

So a trace from a request that fires a guardrail mask shows:

```
a2a.tasks/send
  └─ agent.execute
       ├─ llm.completion
       │   └─ (no guardrail span: InputGate fired BEFORE, invisible)
       └─ tool.<name>
           └─ (no guardrail span: ToolCallGate fired BEFORE, invisible)
```

Operators looking at a trace can't see "PII was masked here" without
pivoting to the audit stream and joining on `correlation_id`. The
guardrail decision is invisible to anyone who only has access to the
trace backend (Honeycomb / Datadog / Tempo / Grafana).

## Proposal

Symmetric to the audit work — every gate call opens a child span and
stamps the same fields the audit event carries.

### Span names

| Span | When |
|------|------|
| `guardrail.input` | `CheckInbound` |
| `guardrail.context` | `CheckContext` (one span per system message scanned) |
| `guardrail.tool_call` | `CheckToolCall` |
| `guardrail.output` | `CheckOutbound` and `CheckToolOutput`, distinguished by the presence of `forge.tool.name` |
| `guardrail.stream` | `CheckStream` (when wired) |

### Span attributes

New constants in `forge-core/observability/attrs.go` (under the
existing `forge.*` namespace):

| Key | Value | Source |
|-----|-------|--------|
| `forge.guardrail.gate` | `input` / `context` / `tool_call` / `output` / `stream` | `res.Gate`: single source of truth, matches `fields.gate` on the audit event |
| `forge.guardrail.decision` | `allow` / `mask` / `block` / `warn` | `res.Decision` |
| `forge.guardrail.type` | `pii` / `moderation` / `security` / … | First violation `Type` |
| `forge.guardrail.category` | `ssn` / `email` / … | First violation `Category` |
| `forge.guardrail.violation_count` | int | `len(res.Violations)` |
| `forge.tool.name` | string | Already a constant in `attrs.go`; reused for tool_call + tool_output gate spans |
| `forge.guardrail.evidence` | string | Evidence content (post-mask for `mask` decisions, original triggering content for `block` / `warn`); gated by `TracingConfig.CaptureContent` + `Redact` (issue #130 posture) |
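
As a rough illustration of how the span name and the non-evidence attributes could be derived from a gate result, here is a minimal sketch. The Go identifiers (`AttrGuardrailGate`, `Result`, `Violation`, `guardrailSpanName`, `guardrailAttrs`) are hypothetical stand-ins, not the library's actual types, and a plain map is used in place of OTel attribute values to keep the example dependency-free.

```go
package main

// Attribute keys from the table above. The Go constant names are assumptions;
// only the string values come from the proposal.
const (
	AttrGuardrailGate           = "forge.guardrail.gate"
	AttrGuardrailDecision       = "forge.guardrail.decision"
	AttrGuardrailType           = "forge.guardrail.type"
	AttrGuardrailCategory       = "forge.guardrail.category"
	AttrGuardrailViolationCount = "forge.guardrail.violation_count"
	AttrToolName                = "forge.tool.name"
)

// Violation and Result are simplified, hypothetical stand-ins for the
// guardrail library's result types.
type Violation struct {
	Type     string
	Category string
}

type Result struct {
	Gate       string
	Decision   string
	Violations []Violation
}

// guardrailSpanName derives the span name from the gate vocabulary,
// e.g. "tool_call" becomes "guardrail.tool_call".
func guardrailSpanName(gate string) string {
	return "guardrail." + gate
}

// guardrailAttrs builds the attribute set for one gate call. Type and
// category come from the first violation only; the tool name is stamped
// only when the gate ran on behalf of a tool.
func guardrailAttrs(res Result, tool string) map[string]any {
	attrs := map[string]any{
		AttrGuardrailGate:           res.Gate,
		AttrGuardrailDecision:       res.Decision,
		AttrGuardrailViolationCount: len(res.Violations),
	}
	if len(res.Violations) > 0 {
		attrs[AttrGuardrailType] = res.Violations[0].Type
		attrs[AttrGuardrailCategory] = res.Violations[0].Category
	}
	if tool != "" {
		attrs[AttrToolName] = tool
	}
	return attrs
}
```

With this shape, a `guardrail.output` span for a tool result differs from the final-response one only by carrying `forge.tool.name`, which matches the distinction drawn in the span-name table.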

### Span parent

Spans nest under whatever's active when the engine method is called:

| Gate | Parent span |
|------|-------------|
| `input` | `a2a.tasks/send` (CheckInbound runs in the A2A handler, before the loop starts) |
| `context` | `agent.execute` (BeforeLLMCall hook is inside the loop) |
| `tool_call` | `agent.execute` (BeforeToolExec hook is inside the loop) |
| `output` (final) | `agent.execute` (CheckOutbound runs at the A2A handler exit, AFTER `agent.execute` closes, so this requires either threading the parent ctx explicitly or moving CheckOutbound inside `agent.execute`'s defer) |
| `output` (tool result) | `agent.execute` (AfterToolExec hook) |

### Span status

| Decision | OTel status |
|----------|-------------|
| `allow` / `warn` / `mask` | OK |
| `block` | Error (with the violation summary as the status description) |

The Error status surfaces blocked invocations as red bars in the trace
UI without needing custom attribute queries.
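
The decision-to-status mapping above could be sketched as follows. `StatusCode`, `Violation`, and `spanStatus` are hypothetical names, and the `type/category` summary format is an assumption since the proposal does not define what the violation summary looks like. In real code the returned pair would presumably be handed to the OTel span's `SetStatus`.

```go
package main

import "strings"

// StatusCode is a local stand-in for the OTel status code so the sketch
// stays dependency-free.
type StatusCode int

const (
	StatusOK StatusCode = iota
	StatusError
)

// Violation is a simplified, hypothetical stand-in for the library type.
type Violation struct {
	Type     string
	Category string
}

// spanStatus maps a guardrail decision to a span status. Only "block"
// yields Error; allow, warn, and mask stay OK with an empty description.
func spanStatus(decision string, violations []Violation) (StatusCode, string) {
	if decision != "block" {
		return StatusOK, ""
	}
	// Assumed summary format: "type/category" pairs joined by commas.
	parts := make([]string, 0, len(violations))
	for _, v := range violations {
		parts = append(parts, v.Type+"/"+v.Category)
	}
	return StatusError, "guardrail blocked: " + strings.Join(parts, ", ")
}
```

Keeping the description empty for non-error statuses matches the OTel convention that a status description is only meaningful alongside an Error status.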

### Evidence capture parity with #130

`forge.guardrail.evidence` follows the **exact same posture** as
`gen_ai.input.messages` / `gen_ai.output.messages` / `forge.tool.args`
that #130 established:

- Default off: `TracingConfig.CaptureContent=false` means the attribute is absent.
- `CaptureContent=true` + `Redact=true` (default) → `PrepareSpanContent(s, true, MaxBytes)` scrubs vendor secret patterns before stamping.
- `MaxBytes` (default 4 KiB) trims via the existing `…[truncated:N]` marker.

The same env knobs that already control the OTel content-capture pipeline cover guardrail evidence, so there is no new operator-facing surface.

For mask decisions, evidence on the SPAN follows the same rule as
evidence on the AUDIT event: post-mask content (the payload the LLM
actually saw). Block / warn decisions carry the original triggering
content because the library never produces a masked variant in those
paths. See `docs/security/guardrails.md#what-evidence-actually-contains`.
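
To make the gating order concrete, the sketch below models the capture, redact, and truncate steps. It is not the real `PrepareSpanContent`: `evidenceAttr`, the local `TracingConfig` struct, and the single `sk-…` regexp are illustrative assumptions, and `N` in the marker is assumed to mean the number of bytes dropped.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// TracingConfig mirrors the three knobs named above; the struct layout
// itself is an assumption.
type TracingConfig struct {
	CaptureContent bool
	Redact         bool
	MaxBytes       int
}

// secretPattern is a toy stand-in for the vendor secret patterns.
var secretPattern = regexp.MustCompile(`sk-[A-Za-z0-9]{8,}`)

// evidenceAttr returns the value to stamp on forge.guardrail.evidence and
// whether the attribute should be set at all.
func evidenceAttr(content string, cfg TracingConfig) (string, bool) {
	if !cfg.CaptureContent {
		return "", false // default posture: attribute absent
	}
	if cfg.Redact {
		// Redact before truncating so a secret cannot be cut in half
		// and slip past the pattern.
		content = secretPattern.ReplaceAllString(content, "[REDACTED]")
	}
	if cfg.MaxBytes > 0 && len(content) > cfg.MaxBytes {
		cut := cfg.MaxBytes
		// Back off to a rune boundary so the output stays valid UTF-8.
		for cut > 0 && !utf8.RuneStart(content[cut]) {
			cut--
		}
		dropped := len(content) - cut
		content = fmt.Sprintf("%s…[truncated:%d]", content[:cut], dropped)
	}
	return content, true
}
```

The caller would pass the post-mask payload for `mask` decisions and the original triggering content for `block` / `warn`, so this helper never needs to know which decision was made.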

## Implementation sketch

- `forge-core/observability/attrs.go`: add the six `forge.guardrail.*` constants.
- `forge-cli/runtime/guardrails_tracing.go` (new): `startGuardrailSpan(ctx, gate, tool)` helper + `finishGuardrailSpan(span, res, decision, content, captureCfg)`.
- `forge-cli/runtime/guardrails_engine.go`: each Check* method opens a span at the top, stamps attributes + status at the bottom. ~5-10 lines per gate.
- `forge-cli/runtime/guardrails_engine.go`: wire `observability.TracingConfig` into the engine so the evidence attribute respects `CaptureContent` + `Redact`. Mirrors how PR #154 wired the tracing config onto `LLMExecutor`.
- Tests with `sdktrace.InMemoryExporter` (same pattern as `loop_spans_content_test.go`) asserting the span name + attributes for each gate.
- Docs: `docs/core-concepts/observability-tracing.md` gains a "Guardrail spans" section linking to `docs/security/guardrails.md`.

## Out of scope

- Cardinality limit on `guardrail.context` spans. A deployment with many system messages produces N spans per LLM iteration. That's fine for now (small N in practice); revisit if cardinality complaints arise.
- StreamGate spans. `CheckStream` exists but isn't auto-wired (Forge's `ExecuteStream` is a buffered wrapper). The span helper is exposed so that when real streaming lands in the loop, the wiring is one line.
- Trace context propagation to downstream guardrail-library calls. The library doesn't consume OTel context today; if it starts to (e.g. when calling external moderation endpoints) we can revisit.

## Verification

End-to-end:

1. Run an agent with `FORGE_OTEL_ENABLED=true` and `FORGE_OTEL_EXPORTER=otlp` plus `FORGE_GUARDRAIL_CAPTURE_EVIDENCE` env unset (default).
2. Send a PII-bearing message. Confirm the trace backend shows a `guardrail.input` child of `a2a.tasks/send` with `forge.guardrail.gate=input`, `forge.guardrail.decision=mask`, `forge.guardrail.type=pii`, `forge.guardrail.category=ssn`, and **no** `forge.guardrail.evidence` attribute.
3. Set `FORGE_OTEL_CAPTURE_CONTENT=true` and re-run. Confirm `forge.guardrail.evidence` now carries the redacted + truncated content.
4. Send an A2A request that triggers an outbound block (in enforce mode). Confirm `guardrail.output` has OTel status `Error` with the violation summary as the description.

## Related

- #155: guardrail_check emission (merged)
- #156: PR that closed #155
- #159: wire all five gates + drop direction in favor of gate
- #160: PR that closes #159
- #130: OTel content capture; defines the `CaptureContent` + `Redact` posture this issue reuses
