Skip to content

cfregly/claude-parallel

Repository files navigation

claude-parallel

Run ten of them. Fan out a batch of Claude calls concurrently, bounded by a cap and measured against the serial baseline.

  serial    |--c1--||--c2--||--c3--| ... |--c10--|     wall-clock = sum of latencies
  parallel  |--c1--|
            |--c2--|
            |--c3--|    all in flight, up to the cap    wall-clock = about the slowest
            ...
            |--c10-|

Run it

pip install -r requirements.txt
ANTHROPIC_API_KEY=... python run.py

This is a real tool. Every run calls the Anthropic API, so ANTHROPIC_API_KEY is required. Without a key it fails fast with a clear error and a non-zero exit. There is no offline mode and no fallback. The run takes a batch of ten items, triages each one through Claude two ways (serial, then parallel with a concurrency cap), and writes the measured wall-clock of both to data/last_run.md. The only difference between the two runs is the cap, so the speedup is the parallelize win and nothing else.

Flags: --n to run fewer items, --concurrency to change the cap (default 10), --model to change the model (default claude-haiku-4-5, the cost-aware choice for a batch of short triages).

What it demonstrates

Feature Where it shows up
Async fan-out asyncio.gather runs one Claude call per item, all in flight at once (fanout/core.py)
Bounded concurrency an asyncio.Semaphore caps how many run at the same time, so a big batch does not slam the rate limit
Same code, one knob the serial baseline is the same function with the cap set to 1, so the comparison is honest
Structured outputs each call returns typed JSON through a forced tool call, reliable on every SDK version
Route by consequence a batch of short triages runs on Haiku, the fast model, not the expensive one
A measured receipt the run writes the real serial and parallel wall-clock to data/last_run.md

The receipt

The batch measures itself. It runs the same ten calls serially, then in parallel, and reports both wall-clocks and the ratio. A recent run measured 4.3x (15.3s serial against 3.5s parallel), and across runs the win ranged from under 2x to 15x. The spread is the lesson: the parallel win is bounded by your account's rate limit, not by the code. When the limit does not bite, ten calls finish in about the time of the slowest one. When it does, the concurrency cap is what keeps the batch slowing down gracefully instead of failing. Reproduce before quoting. The numbers in any writeup come from data/last_run.md, produced by a real run.

Stack them: schedule it as a routine

A batch like this is the kind of recurring job you put on a schedule, a triage sweep every weekday morning. On the Claude platform that is a routine, a cloud Claude Code agent that runs on a cadence you set, not a job in someone else's CI. Create one from any session:

/schedule run this batch every weekday at 8am

The routine runs in the cloud on that cadence, so the sweep happens whether or not your machine is on. A routine lives in your Claude account, so there is no schedule file to commit here. The schedule is the part that belongs on the platform.

Operate it from your phone

Two more ways to drive a run when you are away from your desk, both in the Claude Code docs:

  • Remote Control: start a local session with claude remote-control, scan the QR code, and steer or approve the run from your phone. The session keeps running on your machine, the phone is a window into it.
  • Dispatch: message the task from the Claude mobile app and Desktop spawns a session to handle it.

Neither is a file in this repo. Both are operator surfaces on the platform, the reason this README documents them instead of reimplementing them.

Self-verification

The repo verifies itself. It ships a verify skill (.claude/skills/verify/SKILL.md) that runs the offline tests, the deslop gate, and a real batch whenever you touch the fan-out, and a Stop hook (.claude/hooks/verify_stop.py) that asks the agent to run it before stopping if the receipt is stale. The skill names the tools and the checks, and it ends by telling the agent to fix any blocker and update the skill, so the verification improves itself.

Layout

fanout/
  core.py      # the fan-out primitive: bounded asyncio.gather, plus the Claude call
  tasks.py     # the sample batch task (triage) and its structured-output tool
run.py         # one-command entry point: serial then parallel, measured
data/
  items.json   # the ten generic items the batch runs on
  last_run.md  # the receipt from the last real run
tests/         # offline tests that prove the concurrency cap actually holds
scripts/       # the self-contained deslop gate for CI
.claude/       # the verify skill and the Stop hook that runs it

License

MIT.

About

A bounded, measured fan-out of concurrent Claude calls: the parallelize pattern, with a verify skill, Stop hook, and CI.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages