Split out from #254's scale discussion (target: 10,000+ agents per org, ~1,000-engineer customer).
The ceiling
Caching fixes the read path; this is the write path. Onboarding 10,000 agents appends ~20–30k events to ONE org KEL, and KERI ordering serializes that chain: one git commit chain, one writer, strictly sequential seq numbers. At ~50ms/event that is roughly 17–25 minutes of pure sequential writes for day-one onboarding. That may be tolerable as a one-time batched import, but nobody has measured it.
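The estimate above is plain arithmetic. Here is a minimal sketch, assuming the ~50ms/event figure (itself unmeasured) and fully sequential appends with no overlap. The function name is illustrative only.

```python
def sequential_write_minutes(events: int, ms_per_event: float) -> float:
    """Wall-clock minutes to append `events` strictly one after another."""
    return events * ms_per_event / 1000 / 60


# Bounds of the ~20-30k event range quoted above, at the assumed 50 ms/event.
for events in (20_000, 30_000):
    print(f"{events} events: {sequential_write_minutes(events, 50):.1f} min")
# prints:
# 20000 events: 16.7 min
# 30000 events: 25.0 min
```

Swapping in a measured per-event latency turns this into the real day-one budget.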
What to do
- Measure end-to-end append throughput for a synthetic 10k-agent onboarding against the Git-backed registry, with and without commit_batch (the batch path exists on RegistryBackend). Get a real events/sec number on launch-class hardware.
- Decide whether batched import needs a dedicated bulk-onboarding flow (e.g. one anchoring ixn carrying many seals per batch, if the event algebra permits) or whether measured throughput is acceptable as-is.
- Note the risk interaction: 1,000 engineers under a single kt=1 org KEL raises the operational stakes of the documented duplicity / concurrent-rotation accepted risk (docs/architecture/multi_device_accepted_risks.md). It has the same root cause (scale) but a different failure mode. The threshold upgrade path in essays/design/multi_device.md becomes more load-bearing at this org size.
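A possible shape for the benchmark in the first item, as a sketch only. The `Backend` protocol, `InMemoryBackend`, and the `append` method are hypothetical stand-ins; the real RegistryBackend signatures (including that of commit_batch) may differ. The two events per agent and the batch size of 500 are assumptions as well. Pointing `backend_factory` at the Git-backed registry would produce the number that matters.

```python
import time
from typing import Callable, Dict, List, Protocol, Sequence


class Backend(Protocol):
    # Hypothetical interface, NOT the real RegistryBackend API.
    def append(self, event: bytes) -> None: ...
    def commit_batch(self, events: Sequence[bytes]) -> None: ...


class InMemoryBackend:
    """Stand-in so the harness runs anywhere; replace with the Git-backed registry."""

    def __init__(self) -> None:
        self.log: List[bytes] = []

    def append(self, event: bytes) -> None:
        self.log.append(event)

    def commit_batch(self, events: Sequence[bytes]) -> None:
        self.log.extend(events)


def synthetic_events(agents: int, events_per_agent: int) -> List[bytes]:
    """Placeholder payloads; real KEL events would be built and signed here."""
    return [
        f"agent-{a}/event-{e}".encode()
        for a in range(agents)
        for e in range(events_per_agent)
    ]


def events_per_sec(write: Callable[[], None], count: int) -> float:
    """Time one full write pass and convert it to events/sec."""
    start = time.perf_counter()
    write()
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def run(
    backend_factory: Callable[[], Backend],
    agents: int = 10_000,
    events_per_agent: int = 2,  # assumption: low end of the ~20-30k estimate
    batch_size: int = 500,      # assumption: tune against the real backend
) -> Dict[str, float]:
    events = synthetic_events(agents, events_per_agent)
    single, batched = backend_factory(), backend_factory()

    def one_by_one() -> None:
        for event in events:
            single.append(event)

    def in_batches() -> None:
        for i in range(0, len(events), batch_size):
            batched.commit_batch(events[i : i + batch_size])

    return {
        "single": events_per_sec(one_by_one, len(events)),
        "batched": events_per_sec(in_batches, len(events)),
    }


if __name__ == "__main__":
    for mode, rate in run(InMemoryBackend).items():
        print(f"{mode}: {rate:,.0f} events/sec")
```

The in-memory numbers say nothing about Git; the stand-in only checks that the harness works. Each mode gets a fresh backend so the two runs do not share state.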
Acceptance
- A written events/sec number from a reproducible benchmark.
- A go/no-go on a dedicated bulk-import flow, recorded here.
- The duplicity-risk interaction acknowledged in the accepted-risks doc if the 10k-agent target is confirmed.