
Discord adapter: latency UX improvements (perception + proactive context management) #795

@RayKuo-Mantis

Description


Context

Running openab-claude:0.8.3-beta.7 + PR #791 swap on GB10 ARM64 host
(meta + mentor instances, ~28h uptime, normal daily-driver usage with Discord
adapter).

Observed behavior

Two distinct latency signals from openab::dispatch logs over a full day of
real usage:

  Metric                                       Range            Comment
  wait_ms (OpenAB queue wait)                  300–500 ms       Healthy across all turns
  agent_dispatch_ms (inner agent processing)   2 sec – 395 sec  Wide variance

Breakdown by usage pattern:

  • Idle chat: 2–5 sec
  • Active sessions with tool chains: 30–100 sec common
  • Multi-tool reasoning-heavy turns: 100–400 sec
  • Peak observed today: 395 sec (~6.6 min) on one turn

Root cause investigation

After issuing /compact inside the Claude session, agent_dispatch_ms
immediately dropped back to the 2–5 sec normal range. So:

  • ~80% of "Discord feels slow" is Claude CLI session jsonl context bloat:
    the full history is sent on every Anthropic API call, so per-turn cost
    grows linearly with session size
  • ~10% is ACP / Discord chunked-send overhead (json-rpc wrap + Discord API
    message chunking for long replies)
  • ~10% is perception (no streaming visibility — user sees nothing for
    30+ sec, which feels indistinguishable from "stuck")

OpenAB itself is not the latency root cause — wait_ms is consistently
healthy. The dominant factor is internal Claude/Anthropic processing, which
OpenAB only measures.

That said, OpenAB sits between the user and Claude CLI — it is the only layer
that can mitigate the user-facing experience of these long turns.

Suggested improvements

1. Typing indicator / partial output during long dispatch

Currently the Discord adapter waits for the agent to fully complete its turn
before pushing the reply. For 30+ sec turns this looks like the bot died.

  • Maintain Discord typing indicator while dispatch is in flight, OR
  • Stream partial output as Claude produces it (if ACP supports incremental
    chunks)
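A minimal sketch of the typing-indicator option, assuming an async Discord channel object with a `trigger_typing()` coroutine (discord.py naming; Discord's typing state expires after roughly 10 seconds, so it must be re-triggered on an interval). `dispatch` here is a stand-in for the actual OpenAB agent call, not the real adapter API:

```python
import asyncio

# Sketch only: keep the "Bot is typing..." indicator alive while a long
# dispatch runs. `channel.trigger_typing()` and `dispatch` are assumptions
# standing in for the real adapter internals.

async def dispatch_with_typing(channel, dispatch, interval: float = 8.0):
    async def keepalive():
        while True:
            await channel.trigger_typing()  # refresh the typing indicator
            await asyncio.sleep(interval)   # re-trigger before it expires

    task = asyncio.create_task(keepalive())
    try:
        return await dispatch()  # the slow turn completes as before
    finally:
        task.cancel()  # stop the indicator once the reply is ready
```

The keepalive runs concurrently with the dispatch and is cancelled the moment the reply is available, so a 2 sec turn and a 300 sec turn get the same treatment with no extra state.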

2. Auto progress hint when agent_dispatch_ms is unusually long

When a turn exceeds a threshold (e.g. > 60 sec):

  • Auto-send a short ⏳ still processing... message to keep channel alive
  • Optionally include a hint like session context is X% full — consider /compact
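The threshold watchdog can be sketched the same way, assuming an async `send` callable for posting to the channel (hypothetical, not the real adapter API). The hint task only ever fires if the dispatch is still running when the threshold passes:

```python
import asyncio

# Sketch: post a progress hint if the turn exceeds `threshold` seconds.
# `send` and `dispatch` are illustrative stand-ins.

async def dispatch_with_hint(send, dispatch, threshold: float = 60.0):
    async def hint():
        await asyncio.sleep(threshold)
        await send("⏳ still processing...")  # keep the channel alive

    task = asyncio.create_task(hint())
    try:
        return await dispatch()
    finally:
        task.cancel()  # fast turns never see the hint
```

The same task would be the natural place to format the optional "session context is X% full" hint before sending.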

3. Expose session size to user

  • Slash command or /status showing current jsonl size + last agent_dispatch_ms
  • Lets the user see context-bloat accumulating and proactively /compact
    before turns slow down
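A `/status` reply could be as simple as the following sketch, assuming the adapter knows the session's jsonl path and keeps the last `agent_dispatch_ms` around (both names illustrative):

```python
from pathlib import Path

# Sketch of a /status reply line. The jsonl path and last-turn metric are
# assumed to be available to the adapter already.

def status_line(jsonl_path: str, last_dispatch_ms: float) -> str:
    size_mb = Path(jsonl_path).stat().st_size / 1_000_000
    return (f"session log: {size_mb:.1f} MB · "
            f"last agent_dispatch_ms: {last_dispatch_ms:.0f} ms")
```

Seeing the MB figure creep up alongside slowing turns is exactly the signal a user needs to decide to /compact early.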

4. Auto-compact (most impactful, but biggest change)

OpenAB is an agent layer with full ownership — it can do proactive context
management that the upstream Claude Code CLI itself does not (Anthropic's
auto-compact only triggers near hard context limit, not proactively).

Proposed mechanism:

  • Monitor signals: any of —
    • jsonl file size exceeds threshold (e.g. 10 MB)
    • rolling average agent_dispatch_ms over last N turns exceeds threshold (e.g. 30 sec)
    • time since last /compact exceeds T hours
  • Trigger: before delivering the next user message to the agent, OpenAB
    injects a synthetic /compact dispatch
  • Transparent to user: just a maintenance turn, not visible in Discord
  • Result: user never has to think about compacting, session stays in
    fresh-context regime indefinitely

This is structurally the kind of thing only an agent wrapper can do —
the upstream commercial Claude API can't modify Claude Code's own behavior,
but a layer that invokes Claude CLI can drive it proactively.
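The trigger check above reduces to a small any-signal-fires predicate. This is a sketch only; the thresholds mirror the suggested defaults and none of the names are existing OpenAB config keys:

```python
import time
from pathlib import Path
from statistics import fmean

# Sketch: decide whether to inject a synthetic /compact before delivering
# the next user message. Any single signal firing is enough.

def should_compact(jsonl_path: str,
                   recent_dispatch_ms: list[float],
                   last_compact_ts: float,
                   size_limit_mb: float = 10.0,
                   avg_limit_ms: float = 30_000,
                   max_age_s: float = 6 * 3600) -> bool:
    size_mb = Path(jsonl_path).stat().st_size / 1_000_000
    if size_mb > size_limit_mb:
        return True  # session file has grown past the size threshold
    if recent_dispatch_ms and fmean(recent_dispatch_ms) > avg_limit_ms:
        return True  # rolling average of recent turns is too slow
    return time.time() - last_compact_ts > max_age_s  # compact is stale
```

Because the check runs on OpenAB's side of the boundary, the maintenance turn stays invisible to Discord and the user only ever sees the fresh-context latency profile.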

Priority

Items (1)–(3) are perception-layer fixes: they keep the user informed that the
system is alive during slow turns. Cheap wins.

Item (4) is structural: it eliminates the dominant ~80% latency contributor
entirely for daily-driver use cases.

Logging suggestion (separate)

Consider splitting agent_dispatch_ms into:

  • claude_cli_ms (inner CLI + API time)
  • acp_overhead_ms (json-rpc serialization + Discord chunked-send)

Currently they're conflated. Splitting helps users (and you) distinguish
"Claude is slow today" from "OpenAB has overhead" when triaging issues.
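The split only needs two bracketing points in the dispatch path: after the Claude CLI returns and after the Discord send completes. A sketch, with `run_claude` and `send_to_discord` as stand-ins for the real internals:

```python
import time

# Sketch: split the existing agent_dispatch_ms into the two proposed
# components by timestamping the CLI return and the Discord send.

def timed_dispatch(run_claude, send_to_discord):
    t0 = time.monotonic()
    reply = run_claude()            # inner CLI + Anthropic API time
    t1 = time.monotonic()
    send_to_discord(reply)          # json-rpc wrap + chunked Discord send
    t2 = time.monotonic()
    return reply, {
        "claude_cli_ms": (t1 - t0) * 1000,
        "acp_overhead_ms": (t2 - t1) * 1000,
    }
```

With both numbers in the log, a 395 sec turn with a 200 ms overhead component immediately rules OpenAB out of the triage.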

Environment

  • Host: GB10 ARM64, Ubuntu 24.04
  • OpenAB image: ghcr.io/openabdev/openab-claude:0.8.3-beta.7 (also tested
    with the PR #791 fix, "reconnect Discord gateway on silent WS disconnect",
    swapped onto the meta instance)
  • Two compose instances side by side (meta + mentor)
  • Use case: continuous daily-driver, Discord-only interface

Use Case

Discord adapter: latency UX improvements

Proposed Solution

No response
