Run ten of them. Fan out a batch of Claude calls concurrently, bounded by a cap and measured against the serial baseline.
serial |--c1--||--c2--||--c3--| ... |--c10--| wall-clock = sum of latencies
parallel |--c1--|
|--c2--|
|--c3--| all in flight, up to the cap wall-clock = about the slowest
...
|--c10-|
pip install -r requirements.txt
ANTHROPIC_API_KEY=... python run.pyThis is a real tool. Every run calls the Anthropic API, so ANTHROPIC_API_KEY is
required. Without a key it fails fast with a clear error and a non-zero exit. There is no
offline mode and no fallback. The run takes a batch of ten items, triages each one through
Claude two ways (serial, then parallel with a concurrency cap), and writes the measured
wall-clock of both to data/last_run.md. The only difference between the two runs is the
cap, so the speedup is the parallelize win and nothing else.
Flags: --n to run fewer items, --concurrency to change the cap (default 10), --model
to change the model (default claude-haiku-4-5, the cost-aware choice for a batch of short
triages).
| Feature | Where it shows up |
|---|---|
| Async fan-out | asyncio.gather runs one Claude call per item, all in flight at once (fanout/core.py) |
| Bounded concurrency | an asyncio.Semaphore caps how many run at the same time, so a big batch does not slam the rate limit |
| Same code, one knob | the serial baseline is the same function with the cap set to 1, so the comparison is honest |
| Structured outputs | each call returns typed JSON through a forced tool call, reliable on every SDK version |
| Route by consequence | a batch of short triages runs on Haiku, the fast model, not the expensive one |
| A measured receipt | the run writes the real serial and parallel wall-clock to data/last_run.md |
The batch measures itself. It runs the same ten calls serially, then in parallel, and
reports both wall-clocks and the ratio. A recent run measured 4.3x (15.3s serial against
3.5s parallel), and across runs the win ranged from under 2x to 15x. The spread is the
lesson: the parallel win is bounded by your account's rate limit, not by the code. When the
limit does not bite, ten calls finish in about the time of the slowest one. When it does,
the concurrency cap is what keeps the batch slowing down gracefully instead of failing.
Reproduce before quoting. The numbers in any writeup come from data/last_run.md, produced
by a real run.
A batch like this is the kind of recurring job you put on a schedule, a triage sweep every weekday morning. On the Claude platform that is a routine, a cloud Claude Code agent that runs on a cadence you set, not a job in someone else's CI. Create one from any session:
/schedule run this batch every weekday at 8am
The routine runs in the cloud on that cadence, so the sweep happens whether or not your machine is on. A routine lives in your Claude account, so there is no schedule file to commit here. The schedule is the part that belongs on the platform.
Two more ways to drive a run when you are away from your desk, both in the Claude Code docs:
- Remote Control: start a local session
with
claude remote-control, scan the QR code, and steer or approve the run from your phone. The session keeps running on your machine, the phone is a window into it. - Dispatch: message the task from the Claude mobile app and Desktop spawns a session to handle it.
Neither is a file in this repo. Both are operator surfaces on the platform, the reason this README documents them instead of reimplementing them.
The repo verifies itself. It ships a verify skill (.claude/skills/verify/SKILL.md) that
runs the offline tests, the deslop gate, and a real batch whenever you touch the fan-out,
and a Stop hook (.claude/hooks/verify_stop.py) that asks the agent to run it before
stopping if the receipt is stale. The skill names the tools and the checks, and it ends by
telling the agent to fix any blocker and update the skill, so the verification improves
itself.
fanout/
core.py # the fan-out primitive: bounded asyncio.gather, plus the Claude call
tasks.py # the sample batch task (triage) and its structured-output tool
run.py # one-command entry point: serial then parallel, measured
data/
items.json # the ten generic items the batch runs on
last_run.md # the receipt from the last real run
tests/ # offline tests that prove the concurrency cap actually holds
scripts/ # the self-contained deslop gate for CI
.claude/ # the verify skill and the Stop hook that runs it
MIT.