Running `openab-claude:0.8.3-beta.7` + PR #791 swap on a GB10 ARM64 host
(meta + mentor instances, ~28h uptime, normal daily-driver usage with the
Discord adapter).
Observed behavior
Two distinct latency signals from `openab::dispatch` logs over a full day of
real usage:

| Metric | Range | Comment |
| --- | --- | --- |
| `wait_ms` (OpenAB queue wait) | 300–500 ms | Healthy across all turns |
| `agent_dispatch_ms` (inner agent processing) | 2–395 sec | Wide variance |
Breakdown by usage pattern:

- Idle chat: 2–5 sec
- Active sessions with tool chains: 30–100 sec common
- Multi-tool reasoning-heavy turns: 100–400 sec
- Peak observed today: 395 sec (~6.6 min) on one turn
Root cause investigation
After issuing `/compact` inside the Claude session, `agent_dispatch_ms`
immediately dropped back to the 2–5 sec normal range. So:

- ~80% of "Discord feels slow" is Claude CLI session jsonl context bloat
  (the full history is sent on every Anthropic API call and grows linearly
  with session size)
- ~10% is ACP / Discord chunked-send overhead (json-rpc wrap + Discord API
  message chunking for long replies)
- ~10% is perception (no streaming visibility — the user sees nothing for
  30+ sec, which feels indistinguishable from "stuck")
OpenAB itself is not the latency root cause — `wait_ms` is consistently
healthy. The dominant factor is internal Claude/Anthropic processing, which
OpenAB only measures.
That said, OpenAB sits between the user and Claude CLI — it is the only layer
that can mitigate the user-facing experience of these long turns.
Suggested improvements
1. Typing indicator / partial output during long dispatch
Currently the Discord adapter waits for the agent to fully complete its turn
before pushing the reply. For 30+ sec turns this looks like the bot died.
- Maintain the Discord typing indicator while dispatch is in flight, or
- Stream partial output as Claude produces it (if ACP supports incremental
  chunks)
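A minimal asyncio sketch of the typing-indicator option. The `send_typing` and `dispatch` callables are hypothetical stand-ins for the adapter and agent calls; Discord's typing state expires after roughly 10 seconds, so it has to be re-triggered while the turn is still running:

```python
import asyncio

async def dispatch_with_typing(send_typing, dispatch, interval=8.0):
    """Run dispatch() to completion, re-arming the typing indicator
    every `interval` seconds so the channel never looks dead."""
    task = asyncio.ensure_future(dispatch())
    while not task.done():
        await send_typing()                       # re-arm the ~10 s typing state
        await asyncio.wait({task}, timeout=interval)
    return task.result()
```

The same loop also gives a natural hook for the progress-hint idea in (2): the body of the `while` already knows how long the turn has been in flight.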
2. Auto progress hint when agent_dispatch_ms is unusually long
When a turn exceeds a threshold (e.g. > 60 sec):
- Auto-send a short "⏳ still processing..." message to keep the channel alive
- Optionally include a hint like "session context is X% full — consider /compact"
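A sketch of the threshold check, using the 60 sec example value from above (the function and parameter names are illustrative, not an existing OpenAB API):

```python
def progress_hint(elapsed_s, hint_sent, threshold_s=60.0):
    """Return the hint message to send once per turn, or None.

    elapsed_s:   seconds since the turn was dispatched
    hint_sent:   whether this turn already got a hint
    threshold_s: example threshold from the issue (> 60 sec)
    """
    if elapsed_s > threshold_s and not hint_sent:
        return "⏳ still processing..."
    return None
```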
3. Expose session size to user
- A slash command or `/status` showing current jsonl size + last `agent_dispatch_ms`
- Lets the user see context bloat accumulating and proactively `/compact`
  before turns slow down
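One way such a `/status` payload could be assembled; the path handling and field names here are assumptions for illustration, not OpenAB's actual API:

```python
import os

def status_summary(jsonl_path, last_dispatch_ms):
    """Format a one-line /status reply: session jsonl size plus the
    most recent agent_dispatch_ms measurement."""
    size_mb = os.path.getsize(jsonl_path) / 1e6 if os.path.exists(jsonl_path) else 0.0
    return f"session: {size_mb:.1f} MB jsonl, last turn: {last_dispatch_ms} ms"
```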
4. Auto-compact (most impactful, but biggest change)
OpenAB is an agent layer with full ownership — it can do proactive context
management that the upstream Claude Code CLI itself does not (Anthropic's
auto-compact only triggers near hard context limit, not proactively).
Proposed mechanism:

- Monitor signals (any of):
  - jsonl file size exceeds a threshold (e.g. 10 MB)
  - rolling average `agent_dispatch_ms` over the last N turns exceeds a
    threshold (e.g. 30 sec)
  - time since the last `/compact` exceeds T hours
- Trigger: before delivering the next user message to the agent, OpenAB
  injects a synthetic `/compact` dispatch
- Transparent to the user: just a maintenance turn, not visible in Discord
- Result: the user never has to think about compacting; the session stays in
  the fresh-context regime indefinitely
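The monitor could be sketched as a single predicate checked before each dispatch. The threshold defaults are the example numbers from this issue, and every name here is hypothetical:

```python
import time

def should_compact(jsonl_bytes, recent_dispatch_ms, last_compact_ts,
                   size_limit=10_000_000,   # e.g. 10 MB jsonl
                   avg_limit_ms=30_000,     # e.g. 30 sec rolling average
                   max_age_s=6 * 3600,      # e.g. T = 6 hours since last /compact
                   now=None):
    """True if any of the three signals says the session needs a /compact."""
    now = time.time() if now is None else now
    avg = (sum(recent_dispatch_ms) / len(recent_dispatch_ms)
           if recent_dispatch_ms else 0.0)
    return (jsonl_bytes > size_limit
            or avg > avg_limit_ms
            or (now - last_compact_ts) > max_age_s)
```

The dispatcher would call this before handing the next user message to the agent and, if it returns True, run the synthetic `/compact` turn first.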
This is structurally the kind of thing only an agent wrapper can do —
the upstream commercial Claude API can't modify Claude Code's own behavior,
but a layer that invokes Claude CLI can drive it proactively.
Priority
(1), (2), and (3) are perception-layer fixes: they keep the user informed that
the system is alive during slow turns. Cheap wins.
(4) is structural — eliminates the dominant 80% latency contributor entirely
for daily-driver use cases.
Description
Discord adapter: latency UX improvements (perception + proactive context management)
Logging suggestion (separate)
Consider splitting `agent_dispatch_ms` into:

- `claude_cli_ms` (inner CLI + API time)
- `acp_overhead_ms` (json-rpc serialization + Discord chunked-send)

Currently they're conflated. Splitting helps users (and you) distinguish
"Claude is slow today" from "OpenAB has overhead" when triaging issues.
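A minimal sketch of that split, with stand-in callables for the inner CLI invocation and the ACP/Discord delivery step:

```python
import time

def timed_dispatch(run_claude_cli, deliver_via_acp):
    """Time the inner CLI call and the ACP/Discord delivery separately,
    instead of one conflated agent_dispatch_ms number."""
    t0 = time.monotonic()
    reply = run_claude_cli()          # inner CLI + Anthropic API time
    t1 = time.monotonic()
    deliver_via_acp(reply)            # json-rpc wrap + Discord chunked send
    t2 = time.monotonic()
    return {"claude_cli_ms": (t1 - t0) * 1000.0,
            "acp_overhead_ms": (t2 - t1) * 1000.0}
```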
Environment
`ghcr.io/openabdev/openab-claude:0.8.3-beta.7` (also tested with PR "fix:
reconnect Discord gateway on silent WS disconnect" #791 reconnect fix swapped
on the meta instance)
Use Case
Discord adapter: latency UX improvements
Proposed Solution
No response