Tech debt: type-driven hardening of the KEL verify path
Architectural follow-ups surfaced while remediating RT-002 (the systemic "verify path
replays a KEL by structure only, never checks event signatures" finding; see
docs/prompts/red_team_2026_06_10.md and #262). RT-002's functional gap is closed for
every signature-carrying transport. This issue captures the structural weaknesses that
let the class exist and that still make it easy to reintroduce. None of these are
security-blocking today; they are correctness-by-construction improvements.
Pre-launch repo: no backwards-compatibility constraints — wire formats and public
signatures may change freely.
Zero-context primer (read first)
- KERI / KEL. An identity is a Key Event Log (KEL): an ordered chain of events —
icp (inception), rot (rotation), ixn (interaction), plus delegated variants dip
(delegated inception) and drt (delegated rotation). Each event is content-addressed by
a SAID (self-addressing identifier, a hash over its own bytes). An identity's prefix
(its DID, did:keri:<prefix>) is the SAID of its inception event — so the inception is
self-certifying, but later appended events are not constrained by the prefix.
- Two kinds of "replay". The engine lives in
crates/auths-keri/src/validate.rs:
  - Structural replay: validate_kel(&[Event]) (+ _with_lookup, _with_receipts, and the
    alias replay_kel) checks SAID + sequence + chain-linkage + pre-rotation commitment. It
    does not verify that each event is signed by the controlling key.
  - Authenticated replay: validate_signed_kel(&[SignedEvent], Option<&dyn DelegatorKelLookup>)
    at validate.rs:588 folds per-event signature verification into the replay.
    SignedEvent { event, signatures } pairs an event with its CESR signature attachment.
- RT-002 in one line: untrusted-input verifiers were calling the structural form, so
a forged/unsigned KEL replayed to attacker-chosen keys and verified.
- Delegation constraint (important for any refactor here). A delegated event (dip/drt)
cannot be authenticated standalone: validate_delegated_inception (validate.rs:1024)
requires a DelegatorKelLookup to resolve the delegator's anchoring seal. So authentication
of a delegated device KEL must happen where the root KEL + the device KEL + the lookup all
coexist (the commit-verify layer, or the org-bundle pattern in offline_verify.rs).
- Trust tiers. The local ~/.auths registry is trusted (self-owned). Bundles
(--identity-bundle, org air-gapped bundle), WASM/FFI inputs, and --remote/--oobi
fetches are untrusted and must be authenticated.
- Ports/adapters. RegistryBackend (trait at
crates/auths-id/src/storage/registry/backend.rs:502) is the storage port; adapters are
GitRegistryBackend (auths-storage, real), PostgresAdapter (stub), FakeRegistryBackend
(testing), plus a blanket impl RegistryBackend for Arc<T> (backend.rs:1019).
- Build/check (per-crate; avoid --all-features on auths-crypto/-core, where a deliberate
FIPS/CNSA compile_error! guard makes that fail):
cargo build -p auths-<crate> 2>&1 | grep "^error\[E". The verifier's WASM path compiles
natively with cargo build -p auths-verifier --features wasm. A standing lint,
cargo run -p xtask -- check-verify-path-completeness, guards the verify surfaces (below).
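The delegation constraint in the primer can be illustrated with a minimal sketch. Everything here is a simplified, hypothetical stand-in: Prefix, Said, SourceSeal, Event, AuthError, authenticate_inception and RootSeals are illustrative shapes, not the real auths-keri definitions. Only the find_seal method shape is taken from this issue. The point is that a delegated inception cannot be accepted without a lookup, while a plain inception can.

```rust
/// Hypothetical, simplified stand-ins for the auths-keri types named above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Prefix(pub String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Said(pub String);

/// Placeholder for the seal data returned by a lookup; the real fields may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SourceSeal {
    pub seq: u64,
}

/// Only the two inception shapes matter for this sketch.
pub enum Event {
    Inception { said: Said },
    DelegatedInception { said: Said, delegator: Prefix },
}

pub trait DelegatorKelLookup {
    fn find_seal(&self, delegator_aid: &Prefix, seal_said: &Said) -> Option<SourceSeal>;
}

#[derive(Debug, PartialEq)]
pub enum AuthError {
    MissingDelegatorLookup,
    SealNotAnchored,
}

/// A delegated inception is only accepted if a lookup is supplied AND the
/// delegator's KEL anchors the event's SAID.
pub fn authenticate_inception(
    event: &Event,
    lookup: Option<&dyn DelegatorKelLookup>,
) -> Result<(), AuthError> {
    match event {
        Event::Inception { .. } => Ok(()),
        Event::DelegatedInception { said, delegator } => {
            let lookup = lookup.ok_or(AuthError::MissingDelegatorLookup)?;
            lookup
                .find_seal(delegator, said)
                .map(|_| ())
                .ok_or(AuthError::SealNotAnchored)
        }
    }
}

/// Toy lookup: a linear scan over (delegator, anchored SAID) pairs.
pub struct RootSeals(pub Vec<(Prefix, Said)>);

impl DelegatorKelLookup for RootSeals {
    fn find_seal(&self, delegator_aid: &Prefix, seal_said: &Said) -> Option<SourceSeal> {
        self.0
            .iter()
            .position(|(aid, said)| aid == delegator_aid && said == seal_said)
            .map(|i| SourceSeal { seq: i as u64 })
    }
}
```

In this sketch, passing None for a delegated event fails closed with MissingDelegatorLookup, which is why the signed public API has to keep its Option<&dyn DelegatorKelLookup> parameter.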
Verdict
Trending right, but mixed. The ports/adapters seam is genuine and the leaf domain types
are good. The weaknesses cluster in three places: (1) a trait-default footgun that caused
RT-002, (2) parse-don't-validate is applied at the leaves but not at the load-bearing
boundary (the public replay API still accepts unsigned &[Event]), and (3) genuine
logic duplication in delegator-seal lookups.
1. Ports/adapters — the lossy trait-default footgun (ALREADY FIXED; keep as a guardrail)
What happened: RegistryBackend::append_signed_event / get_attachment were default
methods whose defaults silently did the wrong thing — append_signed_event delegated to
append_event and dropped the signature attachment; get_attachment returned
Ok(None). The Arc<T> blanket impl did not override them, so every
Arc<dyn RegistryBackend> (the common handle) inherited the lossy default. Nothing failed
loudly — get_attachment just returned None. This is the literal root cause of RT-002:
producers couldn't ship signatures because storage silently discarded them.
Fix already landed (this session): the two methods are now required (no default) at
backend.rs:537/:549, so the compiler enumerates every adapter that must implement them;
Git/Fake/Postgres/Arc all implement them (Arc forwards at backend.rs:1029/:1038).
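A reduced reproduction of the pre-fix shape may help make the failure mode concrete. This is a sketch under assumptions: Backend, MemBackend, store_signed and read_attachment are hypothetical names, and the method signatures are simplified rather than copied from RegistryBackend. It shows a blanket Arc<T> impl that forwards only the required method, so the lossy defaults apply to every Arc handle.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical, simplified port with the pre-fix shape: two lossy defaults.
pub trait Backend {
    fn append_event(&self, event: &str);

    /// Lossy default: silently drops the attachment.
    fn append_signed_event(&self, event: &str, _attachment: &str) {
        self.append_event(event);
    }

    /// Lossy default: reports "absent" instead of failing.
    fn get_attachment(&self, _index: usize) -> Option<String> {
        None
    }
}

#[derive(Default)]
pub struct MemBackend {
    pub events: Mutex<Vec<String>>,
    pub attachments: Mutex<Vec<String>>,
}

/// The concrete adapter overrides both defaults correctly.
impl Backend for MemBackend {
    fn append_event(&self, event: &str) {
        self.events.lock().unwrap().push(event.to_owned());
    }

    fn append_signed_event(&self, event: &str, attachment: &str) {
        self.append_event(event);
        self.attachments.lock().unwrap().push(attachment.to_owned());
    }

    fn get_attachment(&self, index: usize) -> Option<String> {
        self.attachments.lock().unwrap().get(index).cloned()
    }
}

/// The blanket impl forwards only the required method, so both defaulted
/// methods fall back to the lossy trait defaults for every Arc handle.
impl<T: Backend + ?Sized> Backend for Arc<T> {
    fn append_event(&self, event: &str) {
        (**self).append_event(event);
    }
}

/// Generic callers, as a producer or verifier would be written against the port.
pub fn store_signed<B: Backend>(backend: &B, event: &str, attachment: &str) {
    backend.append_signed_event(event, attachment);
}

pub fn read_attachment<B: Backend>(backend: &B, index: usize) -> Option<String> {
    backend.get_attachment(index)
}
```

Calling store_signed with a bare MemBackend keeps the attachment; calling it with an Arc<MemBackend> stores the event but loses the attachment, and nothing fails. Removing the two default bodies turns the incomplete Arc impl into a compile error, which is the shape of the fix described above.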
Lesson to encode (the actionable part):
- A trait default method is safe only if its default is fail-closed. A default that
loses data or returns "absent" is a landmine. Audit the rest of RegistryBackend (and
peer port traits) for other defaults that silently no-op.
- A blanket impl ... for Arc<T> must forward every method, and there is no
compile-time guarantee it stays complete as methods are added (a required method does
force it; a defaulted one does not). Consider whether the Arc blanket impl is worth its
maintenance hazard, or whether a #[forward]-style macro / explicit newtype is clearer.
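For cases where a default really has to stay, one possible fail-closed shape is sketched below. StoreError, Backend and EventOnlyBackend are hypothetical names for illustration only; the real port's error type and signatures are not shown in this issue.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum StoreError {
    Unsupported(&'static str),
}

pub trait Backend {
    fn append_event(&self, event: &str) -> Result<(), StoreError>;

    /// Fail-closed default: an adapter that forgets to override this errors
    /// loudly instead of silently discarding the attachment.
    fn append_signed_event(&self, _event: &str, _attachment: &str) -> Result<(), StoreError> {
        Err(StoreError::Unsupported("append_signed_event"))
    }
}

/// An adapter that implements only the required method.
pub struct EventOnlyBackend;

impl Backend for EventOnlyBackend {
    fn append_event(&self, _event: &str) -> Result<(), StoreError> {
        Ok(())
    }
}
```

With this shape a missing override surfaces as an error at the first call rather than as absent data at verification time. A required method is still stronger, since it moves the failure to compile time.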
2. Parse-don't-validate — solid at the leaves, weak exactly where it's load-bearing
Good (do not "fix"): Prefix, Said, CesrKey, IndexedSignature are real newtypes
that validate on deserialize. SignedEvent { event, signatures } is the right domain type
and validate_signed_kel consumes it.
Gaps:
2a. The dangerous boundary is not typed — this is the big one. All four structural
replays are still public and take bare &[Event]:
- validate.rs:390 pub fn validate_kel(&[Event])
- validate.rs:398 pub fn validate_kel_with_lookup(...)
- validate.rs:472 pub fn validate_kel_with_receipts(...)
- validate.rs:1131 pub fn replay_kel(&[Event]) (a literal alias of validate_kel)
The type system does nothing to stop an unsigned replay; we rely on the CI lint
xtask check-verify-path-completeness (which bans these on verify surfaces unless a call
site carries an // rt-002-allow: <reason> comment). The lint is a guardrail; the real
fix the audit specified (task "A.0", docs/prompts/red_team_2026_06_10.md) is type-level:
make bare-Event replay pub(crate)/private and have the only public replay take
&[SignedEvent], so "verify an unsigned KEL" becomes a compile error. Then the lint
becomes a backstop instead of the primary defense.
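The type-level version of A.0 could look roughly like the sketch below. It is not the real engine: Event, SignedEvent, ReplayError, validate_structure and signature_ok are simplified, hypothetical stand-ins, and signature_ok is a toy string comparison in place of real CESR signature verification. Module privacy plays the role that pub(crate) would play across crates.

```rust
pub mod engine {
    /// Simplified stand-in for a key event.
    #[derive(Debug, Clone, PartialEq)]
    pub struct Event {
        pub seq: u64,
        pub signer: String,
    }

    #[derive(Debug, Clone, PartialEq)]
    pub struct SignedEvent {
        pub event: Event,
        pub signatures: Vec<String>,
    }

    #[derive(Debug, PartialEq)]
    pub enum ReplayError {
        Empty,
        OutOfOrder { expected: u64, found: u64 },
        BadSignature { seq: u64 },
    }

    /// Structural replay: private, so code outside this module cannot call it.
    fn validate_structure(events: &[&Event]) -> Result<u64, ReplayError> {
        let mut last = None;
        for (i, event) in events.iter().enumerate() {
            let expected = i as u64;
            if event.seq != expected {
                return Err(ReplayError::OutOfOrder { expected, found: event.seq });
            }
            last = Some(expected);
        }
        last.ok_or(ReplayError::Empty)
    }

    /// Toy placeholder for signature verification (NOT real cryptography).
    fn signature_ok(signed: &SignedEvent) -> bool {
        let expected = format!("sig({}@{})", signed.event.signer, signed.event.seq);
        signed.signatures.iter().any(|s| *s == expected)
    }

    /// The only public replay: it cannot be reached with bare events.
    /// Returns the last sequence number on success.
    pub fn validate_signed_kel(kel: &[SignedEvent]) -> Result<u64, ReplayError> {
        for signed in kel {
            if !signature_ok(signed) {
                return Err(ReplayError::BadSignature { seq: signed.event.seq });
            }
        }
        let events: Vec<&Event> = kel.iter().map(|s| &s.event).collect();
        validate_structure(&events)
    }
}

// Outside the module, `engine::validate_structure(..)` does not compile:
// the function is private, so "verify an unsigned KEL" is unrepresentable here.
```

The design choice worth noting is that callers never get a function that accepts unsigned events, so the lint no longer has to catch a mistake the compiler already rejects.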
2b. Parallel index-correlated arrays instead of one type. The wire structs carry events
and their attachments as two arrays zipped + length-checked at runtime:
- crates/auths-verifier/src/core.rs:935 pub kel: Vec<serde_json::Value> + :943
pub kel_attachments: Vec<String> (in IdentityBundle, core.rs:918).
- crates/auths-sdk/src/domains/org/bundle.rs:49
BundledKel { events: Vec<Event>, attachments: Vec<String> }.
A Vec<SignedEvent> makes the length mismatch unrepresentable. Worse, IdentityBundle.kel
is Vec<serde_json::Value> — unparsed JSON, the weakest possible typing, deferring the
parse past the type boundary entirely. The parallel-array wire shape has a defensible
forward/backward-read reason, but internally we should collapse to Vec<SignedEvent>
immediately at deserialize and never pass the loose arrays around. (Today the verify gate
does reconstruct SignedEvent then call validate_signed_kel, but the struct still leaks
the loose arrays to every other reader.)
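One way to collapse the parallel arrays at the boundary is sketched here. WireKel, BundleError and into_signed are hypothetical names, Event is reduced to a single field, and the attachment is kept as an opaque String per event; the real wire structs and SignedEvent layout may differ.

```rust
/// Simplified stand-in for a key event.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub seq: u64,
}

/// One event paired with its attachment; a length mismatch cannot be expressed.
#[derive(Debug, Clone, PartialEq)]
pub struct SignedEvent {
    pub event: Event,
    pub signatures: String,
}

#[derive(Debug, PartialEq)]
pub enum BundleError {
    LengthMismatch { events: usize, attachments: usize },
}

/// Hypothetical wire shape with index-correlated arrays.
pub struct WireKel {
    pub events: Vec<Event>,
    pub attachments: Vec<String>,
}

impl WireKel {
    /// Consume the loose arrays once, right after deserialization, so nothing
    /// downstream can observe events without their attachments.
    pub fn into_signed(self) -> Result<Vec<SignedEvent>, BundleError> {
        if self.events.len() != self.attachments.len() {
            return Err(BundleError::LengthMismatch {
                events: self.events.len(),
                attachments: self.attachments.len(),
            });
        }
        Ok(self
            .events
            .into_iter()
            .zip(self.attachments)
            .map(|(event, signatures)| SignedEvent { event, signatures })
            .collect())
    }
}
```

Because into_signed takes self by value, the loose arrays are gone after the conversion and later readers only ever see Vec<SignedEvent>.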
2c. Primitive obsession at the resolver boundary. In
crates/auths-cli/src/commands/verify_commit.rs:
- :173/:394/:439 thread bundle_kel: Option<&(String, Vec<Event>)>, an anonymous
tuple whose String is really a DID, through three signatures. A named
BundleKel { did: Prefix /* or IdentityDid */, events: Vec<Event> } (or Vec<SignedEvent>)
is clearer and self-documenting.
- resolve_signer_kel(did: &str) re-runs parse_did_keri(did) on every branch
(:403, :408, …) instead of parsing once at the boundary into a Prefix and passing
the typed value down (parse-don't-validate: parse at the edge, pass the proof inward).
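A rough sketch of the parse-once shape follows. The bodies are assumptions: this parse_did_keri only strips the did:keri: scheme and rejects an empty prefix (the real function presumably validates more), DidError is hypothetical, and resolve_signer_kel is reduced to the bundle branch.

```rust
/// Simplified stand-in for the Prefix newtype.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Prefix(pub String);

#[derive(Debug, PartialEq)]
pub enum DidError {
    NotDidKeri,
    EmptyPrefix,
}

/// Parse at the edge. Simplified: real prefix validation is stricter.
pub fn parse_did_keri(did: &str) -> Result<Prefix, DidError> {
    let rest = did.strip_prefix("did:keri:").ok_or(DidError::NotDidKeri)?;
    if rest.is_empty() {
        return Err(DidError::EmptyPrefix);
    }
    Ok(Prefix(rest.to_owned()))
}

#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub seq: u64,
}

/// Named replacement for the anonymous (String, Vec<Event>) tuple.
pub struct BundleKel {
    pub did: Prefix,
    pub events: Vec<Event>,
}

/// Takes the already-parsed Prefix, so no branch needs to re-parse the DID.
pub fn resolve_signer_kel<'a>(
    signer: &Prefix,
    bundle: Option<&'a BundleKel>,
) -> Option<&'a [Event]> {
    bundle
        .filter(|b| &b.did == signer)
        .map(|b| b.events.as_slice())
}
```

The caller parses the DID string once at the CLI boundary and hands the resulting Prefix to every resolver; a malformed DID is rejected before any lookup runs.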
3. Shared logic vs. duplication — one clear offender (and I added to it)
There are three near-identical DelegatorKelLookup impls (trait at validate.rs:373,
single method find_seal(&self, delegator_aid: &Prefix, seal_said: &Said) -> Option<SourceSeal>), each indexing "a KEL's anchoring seals":
- crates/auths-verifier/src/commit_kel.rs:165 RootKelLookup: linear scan
- crates/auths-verifier/src/presentation.rs:67 DelegatorSeals: linear scan
- crates/auths-sdk/src/domains/org/offline_verify.rs:88 OrgKelLookup: precomputed
HashMap, O(1) (added during the RT-002 work because the other two weren't reusable)
Three structs for one concept, and inconsistent in quality (HashMap vs linear scan). This
should be one reusable index — e.g. KelSealIndex::from_events(&[Event]) living in
auths-keri next to the trait — that everyone constructs. Note the lookups span crates
(auths-verifier and auths-sdk), so the shared type belongs in auths-keri.
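A possible shape for the shared index is sketched below. Only the find_seal signature and the KelSealIndex::from_events name come from this issue. The Event fields, the SourceSeal fields, the Option return type of from_events, and the rule that the first anchoring event wins are all assumptions made for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Prefix(pub String);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Said(pub String);

/// Hypothetical seal payload: where in the delegator's KEL the anchor lives.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceSeal {
    pub seq: u64,
    pub event_said: Said,
}

/// Simplified event carrying only what the index needs.
pub struct Event {
    pub prefix: Prefix,
    pub seq: u64,
    pub said: Said,
    pub anchored_saids: Vec<Said>,
}

pub trait DelegatorKelLookup {
    fn find_seal(&self, delegator_aid: &Prefix, seal_said: &Said) -> Option<SourceSeal>;
}

/// One reusable index: built once per delegator KEL, O(1) per lookup.
pub struct KelSealIndex {
    delegator: Prefix,
    seals: HashMap<Said, SourceSeal>,
}

impl KelSealIndex {
    /// Returns None for an empty KEL (assumed behaviour for this sketch).
    pub fn from_events(events: &[Event]) -> Option<Self> {
        let delegator = events.first()?.prefix.clone();
        let mut seals = HashMap::new();
        for event in events {
            for anchored in &event.anchored_saids {
                // Keep the first event that anchors a given SAID.
                seals.entry(anchored.clone()).or_insert(SourceSeal {
                    seq: event.seq,
                    event_said: event.said.clone(),
                });
            }
        }
        Some(Self { delegator, seals })
    }
}

impl DelegatorKelLookup for KelSealIndex {
    fn find_seal(&self, delegator_aid: &Prefix, seal_said: &Said) -> Option<SourceSeal> {
        if *delegator_aid != self.delegator {
            return None;
        }
        self.seals.get(seal_said).cloned()
    }
}
```

Each of the three call sites would then build the index from the delegator KEL it already holds and pass it as the lookup, instead of defining its own struct.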
Minor: replay_kel (validate.rs:1131) is a literal alias of validate_kel — two public
names for one function; collapse to one (the alias forced the lint to ban both names).
Working as intended (keep): replay logic is not duplicated — bundle, org bundle, and
WASM validateKelJson all call the single validate_signed_kel engine fn. That reuse is
the pattern done right; preserve it.
Proposed work (priority order)
P1 — Type-level A.0: make unsigned replay unrepresentable.
Make validate_kel / validate_kel_with_lookup / validate_kel_with_receipts /
replay_kel pub(crate) (or move behind a private module); expose only signed entrypoints
taking &[SignedEvent]. Signer-side trusted re-derivations that legitimately replay an
already-trusted local KEL stay inside the crate boundary or get a typed "trusted" wrapper.
Acceptance: calling structural replay with bare &[Event] from outside the engine crate
is a compile error; check-verify-path-completeness stays green as a backstop; the
existing rt-002-allow: annotations are revisited (several become unnecessary once the type
enforces it).
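The typed "trusted" wrapper mentioned in P1 might look something like this sketch. TrustedLocalKel, from_local_registry, replay_trusted and validate_structure are hypothetical names, and the structural check is reduced to sequence ordering. The idea is that unsigned replay stays reachable only through a constructor that names the trust source.

```rust
pub mod engine {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Event {
        pub seq: u64,
    }

    #[derive(Debug, PartialEq)]
    pub enum ReplayError {
        Empty,
        OutOfOrder { expected: u64, found: u64 },
    }

    /// Marker wrapper. The field is private, so the named constructor below
    /// is the only way to build one from outside this module.
    pub struct TrustedLocalKel<'a>(&'a [Event]);

    impl<'a> TrustedLocalKel<'a> {
        /// The caller asserts these events were read from the self-owned
        /// local registry, not from a bundle, WASM/FFI input, or remote fetch.
        pub fn from_local_registry(events: &'a [Event]) -> Self {
            TrustedLocalKel(events)
        }
    }

    /// Structural replay stays private to the engine.
    fn validate_structure(events: &[Event]) -> Result<u64, ReplayError> {
        let mut last = None;
        for (i, event) in events.iter().enumerate() {
            let expected = i as u64;
            if event.seq != expected {
                return Err(ReplayError::OutOfOrder { expected, found: event.seq });
            }
            last = Some(expected);
        }
        last.ok_or(ReplayError::Empty)
    }

    /// Signer-side re-derivation: structural replay over trusted input only.
    pub fn replay_trusted<'a>(kel: &TrustedLocalKel<'a>) -> Result<u64, ReplayError> {
        validate_structure(kel.0)
    }
}
```

Every trusted replay then carries a from_local_registry call at its call site, which is easy to grep for and review, and could replace some of the rt-002-allow: comments.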
P2 — Consolidate the three DelegatorKelLookup impls into one KelSealIndex in
auths-keri; delete RootKelLookup, DelegatorSeals, OrgKelLookup in favor of it.
Acceptance: one impl, O(1) lookup, used by commit-verify + presentation + offline org
verify; behavior unchanged (existing org_delegation + commit_kel + presentation tests
pass).
P3 — Tighten boundary types. IdentityBundle.kel: Vec<serde_json::Value> →
Vec<Event> (ideally fold kel + kel_attachments into Vec<SignedEvent>); same for
BundledKel. Replace (String, Vec<Event>) in verify_commit.rs with a named struct;
have resolve_signer_kel take a parsed Prefix instead of &str + repeated
parse_did_keri.
Acceptance: no serde_json::Value in the bundle wire type; no anonymous DID-bearing
tuples on the resolver path; DID parsed once at the boundary.
Also fold in: audit remaining RegistryBackend (and peer port) trait methods for
lossy/no-op defaults (§1); collapse the replay_kel alias (§3).
Constraints & gotchas for whoever picks this up
- […] pattern (§primer). P1 must not break this: the signed public API still needs the
Option<&dyn DelegatorKelLookup> parameter.
- […] per-crate builds + the specific test cases named above.
- […] docs/prompts/red_team_2026_06_10.md (audit, tasks A.0/A.1), #98 (existing "custom
lint rules to enforce domain types over raw strings"; P1's lint half partially overlaps).