diff --git a/.github/workflows/proof-corpus.yml b/.github/workflows/proof-corpus.yml index 65e0963..94f1524 100644 --- a/.github/workflows/proof-corpus.yml +++ b/.github/workflows/proof-corpus.yml @@ -64,7 +64,8 @@ jobs: src/core/Levels.idr src/core/Checker.idr \ src/core/Composition.idr src/core/Epistemic.idr \ src/interface/abi/Types.idr src/interface/abi/Layout.idr \ - src/interface/abi/LayoutProofs.idr" + src/interface/abi/LayoutProofs.idr \ + src/interface/WireDecode.idr src/interface/WireConformance.idr" if grep -nE '\b(believe_me|really_believe_me|assert_total|assert_smaller|idris_crash|postulate|sorry)\b' \ $files | grep -vE ':\s*(--|\|\|\|)' ; then echo "::error::proof-escape symbol found in the Phase-1 corpus" diff --git a/CHANGELOG.md b/CHANGELOG.md index 967e7cb..8af5be3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,13 +12,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Verified -**VclTotal proof corpus — Phase 0→4 remediation** (2026-05-18/19, -hyperpolymath/standards#124). The `src/core/**` Idris2 corpus, which at +**VclTotal proof corpus — Phase 0→4 remediation + Phase 5 boundary +reinforcement** (2026-05-18/19, hyperpolymath/standards#124, +hyperpolymath/vcl-ut#25). The `src/core/**` Idris2 corpus, which at Phase 0 did not compile and had never been machine-checked, is now CI-gated and green: - `verification/proofs/vclut-core.ipkg` builds clean under idris2 0.8.0 - (`idris2 --build`, exit 0, `%default total`) as **10 modules**, with + (`idris2 --build`, exit 0, `%default total`) as **12 modules**, with **zero proof-escape symbols** (no `believe_me`/`postulate`/`assert_*`/ `idris_crash`/`sorry`), enforced by `.github/workflows/proof-corpus.yml`. - Phase 1 (#21): corpus resurrection — `ABI.Types`/`Grammar` repaired, @@ -33,6 +34,26 @@ CI-gated and green: bounds) + L6–L10 `composeJoin` closure (`l6..l9Compose`, `epiStructJoin`); L10 acyclicity carried by the explicit `JoinSideCondition` (provably non-closed, not faked). +- Phase 5 / vcl-ut#25 (boundary reinforcement): trusted Rust/SPARK-grade + parser (P5a, #26) + deterministic versioned wire codec (P5b step 1, + #28); **P5b step 2 (this change)** — `VclTotal.Interface.WireDecode`, + a total (`%default total`, zero proof-escape) decoder of the v1 wire + format into the certified `Statement`, recursion bounded by an + input-length fuel `Nat`; `VclTotal.Interface.WireConformance` proves + it byte-for-byte conformant with the Rust `to_wire` encoder by `Refl` + on golden fixtures (regeneration oracle: + `src/interface/parse/tests/conformance_emit.rs`). The C-ABI + `Statement` marshalling *decode* side is now certified. Disclosed: + the NaN *payload* is not preserved across the Idris `Double` boundary + (finite + infinite values bit-exact; Rust proptest remains the + exhaustive float witness). OWED: **P5c — typed-wasm PCC transport** + (proof term + checker kernel the consumer re-runs; *re-checkable*, + TCB = checker kernel; the estate-aligned objective — the `ffi/zig` + fail-closed shim is retained only as the C-ABI attestation + *fallback*, not the endpoint); **P5d** — the signed-attestation + fallback contract for C-only consumers. Re-checkable transport is + impossible only over a C ABI, not over typed-wasm (two-tier boundary + model: `verification/proofs/VERIFICATION-STANCE.adoc`). `verification/proofs/VERIFICATION-STANCE.adoc` is the authoritative, precisely-scoped catalogue (residual OWED items disclosed, not masked). diff --git a/PROOF-NEEDS.md b/PROOF-NEEDS.md index 5095201..e4b3340 100644 --- a/PROOF-NEEDS.md +++ b/PROOF-NEEDS.md @@ -6,8 +6,8 @@ - **LOC**: ~8,000 - **Languages**: Rust, ReScript, Idris2, Zig - **Existing ABI proofs**: `src/interface/abi/*.idr` + domain-specific Idris2: `src/core/Checker.idr`, `Grammar.idr`, `Levels.idr`, `Schema.idr`, `Composition.idr` -- **Machine-verified**: the full `VclTotal` proof corpus — `verification/proofs/vclut-core.ipkg` builds clean under idris2 0.8.0 (`idris2 --build`, exit 0, `%default total`, **zero proof-escape symbols**, CI-gated by `.github/workflows/proof-corpus.yml`) as **10 modules** (`ABI.{Types,Layout,LayoutProofs}` + `Core.{Grammar,Schema,Decide,Levels,Checker,Composition,Epistemic}`), plus the self-contained `verification/proofs/SafetyL4Model.idr` (`--check`, exit 0). Phases 1–3 are on `origin/main` (PRs #21/#22/#23); Phase 4 (`LayoutProofs` + L6–L10 `composeJoin` closure) is **PR #24**, pending merge. -- **Status (Phase 0 → 4 RESOLVED, honestly)**: the Phase-0 blockers are fixed, not faked — the corpus compiles and is machine-checked; `ABI.Types`/`Grammar` errors repaired; **L2/L3/L5 de-vacuized** (Phase 2, evidence-carrying predicates over `Core.Decide`); all ten levels carry `checkLevelNSound` and `Checker.certifyAt`/`certifyRequested` assemble a genuine dependent `SafetyCertificate` (Phase 3); the Zig FFI is no longer a fabricating stub — it is **fail-closed** with a proof-gated `Checker.certifiedLevel` mint (Phase 3d). Remaining honest gaps are precisely scoped (FFI proof-transport, string→`Statement` parser, L3 subquery/heuristic scoping, L9/L10 predicate depth, the additive↔ceil `alignUp` sliver). `verification/proofs/VERIFICATION-STANCE.adoc` is the authoritative, proof-backed catalogue and takes precedence over this file. +- **Machine-verified**: the full `VclTotal` proof corpus — `verification/proofs/vclut-core.ipkg` builds clean under idris2 0.8.0 (`idris2 --build`, exit 0, `%default total`, **zero proof-escape symbols**, CI-gated by `.github/workflows/proof-corpus.yml`) as **12 modules** (`ABI.{Types,Layout,LayoutProofs}` + `Core.{Grammar,Schema,Decide,Levels,Checker,Composition,Epistemic}` + `Interface.{WireDecode,WireConformance}`), plus the self-contained `verification/proofs/SafetyL4Model.idr` (`--check`, exit 0). Phases 1–4 are on `origin/main` (PRs #21/#22/#23/#24); Phase 5 / vcl-ut#25 (trusted Rust parser P5a #26, wire codec P5b-step-1 #28, **certified Idris wire decoder + cross-language `Refl` conformance** P5b-step-2) reinforces the FFI boundary. +- **Status (Phase 0 → 4 RESOLVED, honestly; Phase 5 boundary reinforcement in progress)**: the Phase-0 blockers are fixed, not faked — the corpus compiles and is machine-checked; `ABI.Types`/`Grammar` errors repaired; **L2/L3/L5 de-vacuized** (Phase 2, evidence-carrying predicates over `Core.Decide`); all ten levels carry `checkLevelNSound` and `Checker.certifyAt`/`certifyRequested` assemble a genuine dependent `SafetyCertificate` (Phase 3); the Zig FFI is no longer a fabricating stub — it is **fail-closed** with a proof-gated `Checker.certifiedLevel` mint (Phase 3d). **Phase 5 (vcl-ut#25)**: a trusted Rust/SPARK-grade parser + a deterministic versioned wire codec exist, and the *decode* side of the C-ABI `Statement` marshalling is now **certified** — `VclTotal.Interface.WireDecode` is a total (zero-escape) decoder proven byte-for-byte conformant with the Rust encoder by `Refl` (`WireConformance`). Remaining honest gaps are precisely scoped (P5c = typed-wasm PCC proof-transport — proof term + checker kernel the consumer re-runs, *re-checkable*, the estate-aligned objective; the C-ABI/Zig shim retained only as the fail-closed attestation fallback; P5d = signed-attestation *fallback* contract for C-only consumers; plus L3 subquery/heuristic scoping; L9/L10 predicate depth; the additive↔ceil `alignUp` sliver; the disclosed NaN-payload limitation of the Idris `Double` boundary). Re-checkable transport is impossible *only* over a C ABI, not over the estate's typed-wasm target. `verification/proofs/VERIFICATION-STANCE.adoc` is the authoritative, proof-backed catalogue and takes precedence over this file. ## What Needs Proving @@ -44,6 +44,23 @@ - The *trusted* boundary parser is the Rust/SPARK-grade `src/interface/parse` crate (`vcltotal-parse`, P5a of vcl-ut#25), which mirrors `Grammar.idr` and feeds the certifier across the C-ABI. + Its `wire.rs` codec (P5b step 1) serialises the parsed `Statement` + to a deterministic versioned binary format + (`src/interface/parse/WIRE-FORMAT.adoc`). +- The *certified* receiver of that format is + `VclTotal.Interface.WireDecode` (P5b step 2): a total, zero-escape + Idris decoder into the certified `Statement`, proven byte-for-byte + conformant with the Rust encoder by `Refl` in + `VclTotal.Interface.WireConformance`. The marshalling seam's *decode* + side is therefore now machine-verified. **P5c** (the estate-aligned + objective): transport the certificate over the **typed-wasm** target + as a proof term + a small checker kernel the consumer re-runs — + proof-carrying code, *re-checkable*, TCB = checker kernel + typed-wasm + verifier (re-checkable transport is impossible only over a C ABI; the + `ffi/zig` fail-closed shim is retained only as the C-ABI attestation + fallback). **P5d** (fallback tier): the signed-attestation contract + for consumers reachable only over C. Both OWED; see the two-tier + boundary model in `verification/proofs/VERIFICATION-STANCE.adoc`. - Optional future work: prove the ReScript frontend faithfully tracks the Idris2 grammar (low priority; it is a convenience frontend, not a trust anchor). diff --git a/src/interface/WireConformance.idr b/src/interface/WireConformance.idr new file mode 100644 index 0000000..a553443 --- /dev/null +++ b/src/interface/WireConformance.idr @@ -0,0 +1,101 @@ +-- SPDX-License-Identifier: PMPL-1.0-or-later +-- Copyright (c) 2026 Jonathan D.A. Jewell + +||| VCL-total Interface — Cross-language Wire Conformance (P5b step 2) +||| +||| Machine-checked proof that the certified Idris decoder +||| (`VclTotal.Interface.WireDecode.fromWire`) agrees, byte-for-byte, +||| with the trusted Rust encoder (`src/interface/parse/src/wire.rs` +||| `to_wire`) on shared golden fixtures. +||| +||| Each `goldenN` is the EXACT `to_wire` image emitted by the Rust +||| regeneration oracle `src/interface/parse/tests/conformance_emit.rs` +||| (run that test with `--nocapture` to re-derive these literals; the +||| literals below are injected verbatim from its output, never +||| hand-transcribed). The corresponding +||| `conformN : fromWire goldenN = Right expectedN` is proved by `Refl`: +||| the corpus build only succeeds if the total decoder, evaluated at +||| compile time on the Rust bytes, reduces to exactly the expected +||| certified `Statement`. This is WIRE-FORMAT.adoc's conformance +||| contract discharged as a proof, not a runtime test — no +||| proof-escape, %default total. +||| +||| Fixtures: F1 minimal; F2 exercises strings / ints / bools / agents / +||| options / lists / a nested comparison expression and every extension +||| clause; F3 the float path with the exactly-representable `2.5` +||| (arbitrary-`f64` bit-exactness — incl. the NaN payload, which the +||| Idris `Double` boundary does NOT preserve, by disclosure — is the +||| Rust proptest's job: `wire.rs::golden_bit_exact_floats`). + +module VclTotal.Interface.WireConformance + +import VclTotal.ABI.Types +import VclTotal.Core.Grammar +import VclTotal.Interface.WireDecode + +%default total + +-- ── F1: minimal — SELECT * FROM STORE "main", level SchemaBound ────── + +golden1 : List Bits8 +golden1 = [86,67,76,87,1,0,1,0,0,0,3,2,4,0,0,0,109,97,105,110,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1] + +expected1 : Statement +expected1 = + MkStatement [SelStar] (SrcStore "main") + Nothing [] Nothing [] Nothing Nothing + Nothing Nothing Nothing Nothing Nothing + SchemaBound + +conform1 : fromWire WireConformance.golden1 = Right WireConformance.expected1 +conform1 = Refl + +-- ── F2: strings/ints/bools/agents/options/lists/nested expr + every +-- extension clause; level EpistemicSafe ───────────────────────── + +golden2 : List Bits8 +golden2 = [86,67,76,87,1,0,2,0,0,0,0,0,2,0,0,0,105,100,3,0,6,0,0,0,117,117,105,100,45,49,1,2,0,0,1,1,0,0,0,120,1,1,7,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,5,1,0,0,0,116,1,1,10,0,0,0,0,0,0,0,0,1,1,1,0,0,0,119,1,2,1,1,3,0,0,0,0,0,0,0,1,1,1,2,0,0,0,0,1,5,0,0,0,108,101,97,110,52,1,0,0,0,0,0,1,3,1,10] + +expected2 : Statement +expected2 = + MkStatement + [SelField (MkFieldRef Graph "id"), SelStar] + (SrcOctad "uuid-1") + (Just (ECompare Eq + (EField (MkFieldRef Vector "x") TAny) + (ELiteral (LitInt 7) TAny) + TAny)) + [] + Nothing + [(MkFieldRef Temporal "t", True)] + (Just 10) + Nothing + (Just (ProofWitness "w")) + (Just EffReadWrite) + (Just (VerAtLeast 3)) + (Just LinUseOnce) + (Just (EpClause + [AgEngine, AgProver "lean4"] + [EpReqKnows AgEngine (ELiteral (LitBool True) TAny)])) + EpistemicSafe + +conform2 : fromWire WireConformance.golden2 = Right WireConformance.expected2 +conform2 = Refl + +-- ── F3: float path — WHERE 2.5 (exactly representable), level +-- ParseSafe. Witnesses the IEEE-754 reconstruction reduces to +-- the identical primitive `Double` under the evaluator. ───────── + +golden3 : List Bits8 +golden3 = [86,67,76,87,1,0,1,0,0,0,3,2,1,0,0,0,115,1,1,2,0,0,0,0,0,0,4,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] + +expected3 : Statement +expected3 = + MkStatement [SelStar] (SrcStore "s") + (Just (ELiteral (LitFloat 2.5) TAny)) + [] Nothing [] Nothing Nothing + Nothing Nothing Nothing Nothing Nothing + ParseSafe + +conform3 : fromWire WireConformance.golden3 = Right WireConformance.expected3 +conform3 = Refl diff --git a/src/interface/WireDecode.idr b/src/interface/WireDecode.idr new file mode 100644 index 0000000..ae09b83 --- /dev/null +++ b/src/interface/WireDecode.idr @@ -0,0 +1,643 @@ +-- SPDX-License-Identifier: PMPL-1.0-or-later +-- Copyright (c) 2026 Jonathan D.A. Jewell + +||| VCL-total Interface — Wire Decoder (P5b step 2, vcl-ut#25) +||| +||| The certified half of the hypatia<->verisim marshalling seam. The +||| trusted Rust parser (P5a, #26) and its codec (P5b step 1, #28) +||| serialise a `Statement` to the v1 wire format normatively specified +||| in `src/interface/parse/WIRE-FORMAT.adoc`. This module decodes that +||| *identical* byte stream into `Grammar.idr`'s certified `Statement`. +||| +||| Posture (matches the SPARK-grade Rust side): +||| * `%default total` — every function is machine-checked total. +||| * ZERO proof-escape: no believe_me / postulate / assert_* / +||| idris_crash / sorry anywhere. The totality is structural, not +||| asserted. +||| * Bounds-checked, no partial primitive: every malformed input is a +||| typed `WireErr`, never a crash (the Idris analogue of the Rust +||| decoder's `Result<_, WireError>` totality contract). +||| +||| Every definition is `public export`: the cross-language conformance +||| proofs in `WireConformance` are `Refl`s, so the decoder must reduce +||| transparently at compile time on concrete Rust-emitted bytes. +||| +||| Totality strategy. The grammar is mutually recursive (Expr embeds +||| Statement via Subquery; Statement embeds Expr). A binary parser over +||| a finite byte list is not structurally recursive on the input, so +||| recursion is bounded by *fuel*: a `Nat` that strictly decreases on +||| every descent into a sub-node. Every node's encoding begins with at +||| least one discriminant byte, so the node count of any stream is <= +||| its byte length; initialising fuel to the input length therefore +||| never spuriously exhausts on a well-formed stream, and the recursion +||| is structurally terminating on the fuel `Nat`. The recursion-bearing +||| list element decoders are inlined into the `mutual` block so the +||| size-change totality analysis sees the fuel-decreasing edges +||| directly (a higher-order vector combinator would hide them). +||| +||| Type slots. `Grammar.Expr` carries a `VqlType` at every node; the +||| wire format omits it (the parser does syntax only — see +||| WIRE-FORMAT.adoc). The decoder fills `TAny`, exactly the grammar's +||| documented "unresolved type before type checking"; the 10-level +||| checker resolves it at Level 2+. Faithful, not a shortcut. +||| +||| Float reconstruction. Idris2 0.8.0 exposes no pure, bit-exact +||| `Bits64 -> Double`. A finite IEEE-754 value is reconstructed by +||| exact power-of-two scaling (repeated *2.0 / /2.0, which is exact in +||| IEEE until over/underflow, matching the hardware), so ALL finite +||| values — including subnormals and f64::MIN/MAX — and both +||| infinities round-trip bit-exactly. A NaN bit-pattern decodes to *a* +||| NaN (0.0/0.0); the NaN *payload* is not preserved across the Idris +||| `Double` boundary. This single limitation is DISCLOSED here and in +||| WIRE-FORMAT.adoc rather than papered over: every consumer agrees +||| "this is NaN", the semantically meaningful invariant; the +||| exhaustive bit-exact float witness remains the Rust-side proptest. + +module VclTotal.Interface.WireDecode + +import VclTotal.ABI.Types +import VclTotal.Core.Grammar +import Data.List + +%default total + +-- ═══════════════════════════════════════════════════════════════════════ +-- Errors (the Idris mirror of Rust `WireError`) +-- ═══════════════════════════════════════════════════════════════════════ + +public export +data WireErr + = BadMagic + | BadVersion + | Truncated + | BadTag String Bits8 + | BadBool Bits8 + | BadUtf8 + | LengthOverflow + | TrailingBytes + | OutOfFuel + +public export +Show WireErr where + show BadMagic = "bad magic (not a VCLW stream)" + show BadVersion = "unsupported wire version" + show Truncated = "truncated input" + show (BadTag ty t) = "bad " ++ ty ++ " discriminant " ++ show t + show (BadBool b) = "bad bool byte " ++ show b + show BadUtf8 = "invalid UTF-8 in string" + show LengthOverflow = "length/count overflow" + show TrailingBytes = "trailing bytes after statement" + show OutOfFuel = "decoder fuel exhausted (malformed/over-deep stream)" + +public export +Eq WireErr where + BadMagic == BadMagic = True + BadVersion == BadVersion = True + Truncated == Truncated = True + (BadTag a x) == (BadTag b y) = a == b && x == y + (BadBool x) == (BadBool y) = x == y + BadUtf8 == BadUtf8 = True + LengthOverflow == LengthOverflow = True + TrailingBytes == TrailingBytes = True + OutOfFuel == OutOfFuel = True + _ == _ = False + +||| A parser step: consume a prefix of the remaining bytes, yielding a +||| value and the rest, or a typed failure. Pure and total. +public export +0 Parse : Type -> Type +Parse a = List Bits8 -> Either WireErr (a, List Bits8) + +-- ═══════════════════════════════════════════════════════════════════════ +-- Primitive readers (fixed width; structural, no fuel needed) +-- ═══════════════════════════════════════════════════════════════════════ + +||| Split exactly `n` bytes off the front. Structural recursion on `n`. +public export +takeN : Nat -> List Bits8 -> Either WireErr (List Bits8, List Bits8) +takeN Z bs = Right ([], bs) +takeN (S _) [] = Left Truncated +takeN (S k) (b :: bs) = do (hd, tl) <- takeN k bs + Right (b :: hd, tl) + +public export +byte : Parse Bits8 +byte [] = Left Truncated +byte (b :: bs) = Right (b, bs) + +||| Little-endian unsigned value of a byte list (b0 = least significant). +public export +leNat : List Bits8 -> Integer +leNat = go 0 1 + where + go : Integer -> Integer -> List Bits8 -> Integer + go acc _ [] = acc + go acc place (b :: bs) = go (acc + place * cast b) (place * 256) bs + +public export +u16le : Parse Integer +u16le inp = do (bs, r) <- takeN 2 inp + Right (leNat bs, r) + +public export +u32le : Parse Integer +u32le inp = do (bs, r) <- takeN 4 inp + Right (leNat bs, r) + +public export +u64le : Parse Integer +u64le inp = do (bs, r) <- takeN 8 inp + Right (leNat bs, r) + +||| u64 as a `Nat` (Grammar uses `Nat` for limit/offset/version/bounds). +public export +u64nat : Parse Nat +u64nat inp = do (v, r) <- u64le inp + Right (integerToNat v, r) + +||| Signed 64-bit two's-complement (Grammar `LitInt` is `Int`). +public export +i64le : Parse Int +i64le inp = do (v, r) <- u64le inp + let s = if v >= 9223372036854775808 + then v - 18446744073709551616 + else v + Right (cast s, r) + +-- ── IEEE-754 binary64 reconstruction (exact for finite + infinities) ── + +public export +twoPow : Nat -> Double +twoPow Z = 1.0 +twoPow (S k) = 2.0 * twoPow k + +||| Multiply by 2 exactly `k` times (exact in IEEE until overflow→inf, +||| which is the correct binary64 behaviour). +public export +scaleUp : Nat -> Double -> Double +scaleUp Z x = x +scaleUp (S k) x = scaleUp k (x * 2.0) + +||| Halve exactly `k` times — exact in IEEE including gradual underflow +||| (division by two never rounds), so subnormals reconstruct exactly. +public export +scaleDown : Nat -> Double -> Double +scaleDown Z x = x +scaleDown (S k) x = scaleDown k (x / 2.0) + +||| Reconstruct the binary64 denoted by an exponent-relative integer +||| mantissa: `m * 2^e`, exactly, via power-of-two scaling only. +public export +ldexpExact : Integer -> Integer -> Double +ldexpExact m e = + if e >= 0 + then scaleUp (integerToNat e) (cast m) + else scaleDown (integerToNat (negate e)) (cast m) + +||| Decode 8 little-endian bytes as binary64. Total. Bit-exact for all +||| finite values and both infinities; NaN payload not preserved +||| (decodes to a NaN) — disclosed (module header / WIRE-FORMAT.adoc). +public export +f64le : Parse Double +f64le inp = do + (bs, r) <- takeN 8 inp + let bits = leNat bs + let sign = (bits `div` 9223372036854775808) `mod` 2 + let expo = (bits `div` 4503599627370496) `mod` 2048 + let mant = bits `mod` 4503599627370496 + let mag : Double + mag = if expo == 2047 + then (if mant == 0 + then 1.0 / 0.0 -- +inf + else 0.0 / 0.0) -- NaN (payload not kept) + else if expo == 0 + then ldexpExact mant (-1074) -- zero / subnormal + else ldexpExact (mant + 4503599627370496) + (expo - 1075) -- normal + Right (if sign == 1 then negate mag else mag, r) + +public export +boolByte : Parse Bool +boolByte inp = do (b, r) <- byte inp + case b of + 0 => Right (False, r) + 1 => Right (True, r) + x => Left (BadBool x) + +||| u32 length/count as a `Nat`. +public export +u32count : Parse Nat +u32count inp = do (v, r) <- u32le inp + Right (integerToNat v, r) + +-- ── UTF-8 string (strict; the Rust side uses core::str::from_utf8) ─── + +public export +cont : Bits8 -> Maybe Integer +cont b = let v = cast {to=Integer} b in + if v >= 128 && v < 192 then Just (v - 128) else Nothing + +||| Strict, total UTF-8 → codepoint list. Rejects overlong forms, +||| surrogates and out-of-range values exactly as a conformant decoder +||| must. Structural recursion on the byte list. +public export +utf8 : List Bits8 -> Either WireErr (List Char) +utf8 [] = Right [] +utf8 (b0 :: rest) = + let v0 = cast {to=Integer} b0 in + if v0 < 128 + then do cs <- utf8 rest + Right (chr (cast v0) :: cs) + else if v0 < 194 then Left BadUtf8 -- cont byte / overlong lead + else if v0 < 224 + then case rest of + (b1 :: tl) => case cont b1 of + Nothing => Left BadUtf8 + Just c1 => + let cp = (v0 - 192) * 64 + c1 in + if cp < 128 then Left BadUtf8 + else do cs <- utf8 tl + Right (chr (cast cp) :: cs) + [] => Left Truncated + else if v0 < 240 + then case rest of + (b1 :: b2 :: tl) => + case (cont b1, cont b2) of + (Just c1, Just c2) => + let cp = (v0 - 224) * 4096 + c1 * 64 + c2 in + if cp < 2048 then Left BadUtf8 + else if cp >= 55296 && cp <= 57343 then Left BadUtf8 + else do cs <- utf8 tl + Right (chr (cast cp) :: cs) + _ => Left BadUtf8 + _ => Left Truncated + else if v0 < 245 + then case rest of + (b1 :: b2 :: b3 :: tl) => + case (cont b1, cont b2, cont b3) of + (Just c1, Just c2, Just c3) => + let cp = (v0 - 240) * 262144 + c1 * 4096 + c2 * 64 + c3 in + if cp < 65536 then Left BadUtf8 + else if cp > 1114111 then Left BadUtf8 + else do cs <- utf8 tl + Right (chr (cast cp) :: cs) + _ => Left BadUtf8 + _ => Left Truncated + else Left BadUtf8 + +public export +vstring : Parse String +vstring inp = do + (n, r0) <- u32count inp + (bs, r1) <- takeN n r0 + case utf8 bs of + Left e => Left e + Right cs => Right (pack cs, r1) + +-- ═══════════════════════════════════════════════════════════════════════ +-- Non-recursive sum decoders (each consumes its discriminant byte) +-- ═══════════════════════════════════════════════════════════════════════ + +public export +decModality : Parse Modality +decModality inp = do + (t, r) <- byte inp + case t of + 0 => Right (Graph, r); 1 => Right (Vector, r) + 2 => Right (Tensor, r); 3 => Right (Semantic, r) + 4 => Right (Document, r); 5 => Right (Temporal, r) + 6 => Right (Provenance, r); 7 => Right (Spatial, r) + x => Left (BadTag "Modality" x) + +public export +decAgent : Parse Agent +decAgent inp = do + (t, r) <- byte inp + case t of + 0 => Right (AgEngine, r) + 1 => do (s, r1) <- vstring r; Right (AgProver s, r1) + 2 => Right (AgValidator, r) + 3 => do (s, r1) <- vstring r; Right (AgUser s, r1) + 4 => Right (AgFederation, r) + x => Left (BadTag "Agent" x) + +public export +decFieldRef : Parse FieldRef +decFieldRef inp = do + (m, r0) <- decModality inp + (n, r1) <- vstring r0 + Right (MkFieldRef m n, r1) + +public export +decCompOp : Parse CompOp +decCompOp inp = do + (t, r) <- byte inp + case t of + 0 => Right (Eq, r); 1 => Right (NotEq, r); 2 => Right (Lt, r) + 3 => Right (Gt, r); 4 => Right (LtEq, r); 5 => Right (GtEq, r) + 6 => Right (Like, r); 7 => Right (In, r) + x => Left (BadTag "CompOp" x) + +public export +decLogicOp : Parse LogicOp +decLogicOp inp = do + (t, r) <- byte inp + case t of + 0 => Right (And, r); 1 => Right (Or, r); 2 => Right (Not, r) + x => Left (BadTag "LogicOp" x) + +public export +decAggFunc : Parse AggFunc +decAggFunc inp = do + (t, r) <- byte inp + case t of + 0 => Right (Count, r); 1 => Right (Sum, r); 2 => Right (Avg, r) + 3 => Right (Min, r); 4 => Right (Max, r) + x => Left (BadTag "AggFunc" x) + +public export +decEpiOp : Parse EpistemicOp +decEpiOp inp = do + (t, r) <- byte inp + case t of + 0 => Right (OpKnows, r); 1 => Right (OpBelieves, r) + 2 => Right (OpCommonKnowledge, r) + x => Left (BadTag "EpistemicOp" x) + +public export +decF64Vec : Parse (List Double) +decF64Vec i0 = do + (n, i1) <- u32count i0 + go n i1 + where + go : Nat -> Parse (List Double) + go Z i = Right ([], i) + go (S k) i = do (x, i') <- f64le i + (xs, i'') <- go k i' + Right (x :: xs, i'') + +public export +decLiteral : Parse Literal +decLiteral inp = do + (t, r) <- byte inp + case t of + 0 => do (s, r1) <- vstring r; Right (LitString s, r1) + 1 => do (n, r1) <- i64le r; Right (LitInt n, r1) + 2 => do (x, r1) <- f64le r; Right (LitFloat x, r1) + 3 => do (b, r1) <- boolByte r; Right (LitBool b, r1) + 4 => Right (LitNull, r) + 5 => do (xs, r1) <- decF64Vec r; Right (LitVector xs, r1) + x => Left (BadTag "Literal" x) + +public export +decSource : Parse Source +decSource inp = do + (t, r) <- byte inp + case t of + 0 => do (s, r1) <- vstring r; Right (SrcOctad s, r1) + 1 => do (s, r1) <- vstring r; Right (SrcFederation s, r1) + 2 => do (s, r1) <- vstring r; Right (SrcStore s, r1) + x => Left (BadTag "Source" x) + +public export +decEffect : Parse EffectDecl +decEffect inp = do + (t, r) <- byte inp + case t of + 0 => Right (EffRead, r); 1 => Right (EffWrite, r) + 2 => Right (EffReadWrite, r); 3 => Right (EffConsume, r) + x => Left (BadTag "EffectDecl" x) + +public export +decVersion : Parse VersionConstraint +decVersion inp = do + (t, r) <- byte inp + case t of + 0 => Right (VerLatest, r) + 1 => do (n, r1) <- u64nat r; Right (VerAtLeast n, r1) + 2 => do (n, r1) <- u64nat r; Right (VerExact n, r1) + 3 => do (a, r1) <- u64nat r + (b, r2) <- u64nat r1 + Right (VerRange a b, r2) + x => Left (BadTag "VersionConstraint" x) + +public export +decLinear : Parse LinearAnnotation +decLinear inp = do + (t, r) <- byte inp + case t of + 0 => Right (LinUnlimited, r) + 1 => Right (LinUseOnce, r) + 2 => do (n, r1) <- u64nat r; Right (LinBounded n, r1) + x => Left (BadTag "LinearAnnotation" x) + +public export +decSafety : Parse SafetyLevel +decSafety inp = do + (t, r) <- byte inp + case t of + 0 => Right (ParseSafe, r); 1 => Right (SchemaBound, r) + 2 => Right (TypeCompat, r); 3 => Right (NullSafe, r) + 4 => Right (InjectionProof, r); 5 => Right (ResultTyped, r) + 6 => Right (CardinalitySafe, r); 7 => Right (EffectTracked, r) + 8 => Right (TemporalSafe, r); 9 => Right (LinearSafe, r) + 10 => Right (EpistemicSafe, r) + x => Left (BadTag "SafetyLevel" x) + +-- ═══════════════════════════════════════════════════════════════════════ +-- Fuel-bounded repetition for NON-recursive elements (total: structural +-- on the count `Nat`; the element decoder is total by its type) +-- ═══════════════════════════════════════════════════════════════════════ + +public export +decRepeat : Nat -> (Nat -> Parse a) -> Nat -> Parse (List a) +decRepeat _ _ Z i = Right ([], i) +decRepeat fuel f (S k) i = do + (x, i') <- f fuel i + (xs, i'') <- decRepeat fuel f k i' + Right (x :: xs, i'') + +public export +decVec : Nat -> (Nat -> Parse a) -> Parse (List a) +decVec fuel f inp = do + (n, r) <- u32count inp + decRepeat fuel f n r + +public export +decOpt : Parse a -> Parse (Maybe a) +decOpt f inp = do + (t, r) <- byte inp + case t of + 0 => Right (Nothing, r) + 1 => do (x, r1) <- f r; Right (Just x, r1) + x => Left (BadTag "option" x) + +public export +decOrderItem : Parse (FieldRef, Bool) +decOrderItem i0 = do + (fr, i1) <- decFieldRef i0 + (asc, i2) <- boolByte i1 + Right ((fr, asc), i2) + +-- ═══════════════════════════════════════════════════════════════════════ +-- The mutually-recursive core (fuel strictly decreases on every descent; +-- recursion-bearing list decoders are inlined here so the size-change +-- analysis sees the decreasing edges directly) +-- ═══════════════════════════════════════════════════════════════════════ + +mutual + public export + decExpr : Nat -> Parse Expr + decExpr Z _ = Left OutOfFuel + decExpr (S k) inp = do + (t, r0) <- byte inp + case t of + 0 => do (fr, r) <- decFieldRef r0; Right (EField fr TAny, r) + 1 => do (l, r) <- decLiteral r0; Right (ELiteral l TAny, r) + 2 => do (c, r1) <- decCompOp r0 + (a, r2) <- decExpr k r1 + (b, r3) <- decExpr k r2 + Right (ECompare c a b TAny, r3) + 3 => do (lo, r1) <- decLogicOp r0 + (a, r2) <- decExpr k r1 + (mb, r3) <- decOpt (decExpr k) r2 + Right (ELogic lo a mb TAny, r3) + 4 => do (ag, r1) <- decAggFunc r0 + (e, r2) <- decExpr k r1 + Right (EAggregate ag e TAny, r2) + 5 => do (s, r) <- vstring r0; Right (EParam s TAny, r) + 6 => Right (EStar, r0) + 7 => do (st, r) <- decStmt k r0; Right (ESubquery st, r) + 8 => do (op, r1) <- decEpiOp r0 + (ag, r2) <- decAgent r1 + (e, r3) <- decExpr k r2 + Right (EEpistemic op ag e TAny, r3) + 9 => do (ag, r1) <- decAgent r0 + (p, r2) <- decExpr k r1 + (b, r3) <- decExpr k r2 + Right (EAnnounce ag p b TAny, r3) + x => Left (BadTag "Expr" x) + + public export + decSelectItem : Nat -> Parse SelectItem + decSelectItem Z _ = Left OutOfFuel + decSelectItem (S k) inp = do + (t, r0) <- byte inp + case t of + 0 => do (fr, r) <- decFieldRef r0; Right (SelField fr, r) + 1 => do (m, r) <- decModality r0; Right (SelModality m, r) + 2 => do (ag, r1) <- decAggFunc r0 + (e, r2) <- decExpr k r1 + Right (SelAggregate ag e, r2) + 3 => Right (SelStar, r0) + x => Left (BadTag "SelectItem" x) + + public export + decSelectItemsN : Nat -> Nat -> Parse (List SelectItem) + decSelectItemsN _ Z i = Right ([], i) + decSelectItemsN fuel (S c) i = do + (x, i') <- decSelectItem fuel i + (xs, i'') <- decSelectItemsN fuel c i' + Right (x :: xs, i'') + + public export + decSelectItemVec : Nat -> Parse (List SelectItem) + decSelectItemVec fuel i = do + (n, r) <- u32count i + decSelectItemsN fuel n r + + public export + decEpiReq : Nat -> Parse EpistemicRequirement + decEpiReq Z _ = Left OutOfFuel + decEpiReq (S k) inp = do + (t, r0) <- byte inp + case t of + 0 => do (a, r1) <- decAgent r0 + (e, r2) <- decExpr k r1 + Right (EpReqKnows a e, r2) + 1 => do (a, r1) <- decAgent r0 + (e, r2) <- decExpr k r1 + Right (EpReqBelieves a e, r2) + 2 => do (e, r1) <- decExpr k r0 + Right (EpReqCommon e, r1) + 3 => do (a, r1) <- decAgent r0 + (b, r2) <- decAgent r1 + (e, r3) <- decExpr k r2 + Right (EpReqEntails a b e, r3) + x => Left (BadTag "EpistemicRequirement" x) + + public export + decEpiReqsN : Nat -> Nat -> Parse (List EpistemicRequirement) + decEpiReqsN _ Z i = Right ([], i) + decEpiReqsN fuel (S c) i = do + (x, i') <- decEpiReq fuel i + (xs, i'') <- decEpiReqsN fuel c i' + Right (x :: xs, i'') + + public export + decEpiReqVec : Nat -> Parse (List EpistemicRequirement) + decEpiReqVec fuel i = do + (n, r) <- u32count i + decEpiReqsN fuel n r + + public export + decProof : Nat -> Parse ProofClause + decProof Z _ = Left OutOfFuel + decProof (S k) inp = do + (t, r0) <- byte inp + case t of + 0 => Right (ProofAttached, r0) + 1 => do (s, r) <- vstring r0; Right (ProofWitness s, r) + 2 => do (e, r) <- decExpr k r0; Right (ProofAssert e, r) + x => Left (BadTag "ProofClause" x) + + public export + decEpiClause : Nat -> Parse EpistemicClause + decEpiClause Z _ = Left OutOfFuel + decEpiClause (S k) inp = do + (ags, r1) <- decVec k (\_, x => decAgent x) inp + (rqs, r2) <- decEpiReqVec k r1 + Right (EpClause ags rqs, r2) + + public export + decStmt : Nat -> Parse Statement + decStmt Z _ = Left OutOfFuel + decStmt (S k) inp = do + (sel, r1) <- decSelectItemVec k inp + (src, r2) <- decSource r1 + (wc, r3) <- decOpt (decExpr k) r2 + (gb, r4) <- decVec k (\_, x => decFieldRef x) r3 + (hav, r5) <- decOpt (decExpr k) r4 + (ob, r6) <- decVec k (\_, x => decOrderItem x) r5 + (lim, r7) <- decOpt (\x => u64nat x) r6 + (off, r8) <- decOpt (\x => u64nat x) r7 + (pf, r9) <- decOpt (decProof k) r8 + (ef, r10) <- decOpt (\x => decEffect x) r9 + (vc, r11) <- decOpt (\x => decVersion x) r10 + (la, r12) <- decOpt (\x => decLinear x) r11 + (ep, r13) <- decOpt (decEpiClause k) r12 + (lvl, r14) <- decSafety r13 + Right (MkStatement sel src wc gb hav ob lim off pf ef vc la ep lvl, r14) + +-- ═══════════════════════════════════════════════════════════════════════ +-- Header + entry point +-- ═══════════════════════════════════════════════════════════════════════ + +public export +magic : List Bits8 +magic = [86, 67, 76, 87] -- "VCLW" + +||| Decode a v1 wire stream into the certified `Statement`. Total: every +||| input yields `Right stmt` or a typed `Left WireErr`, never a crash. +||| Fuel is the input length — a sound over-approximation of the node +||| count (every node costs >= 1 discriminant byte), so a well-formed +||| stream never exhausts it. +public export +fromWire : List Bits8 -> Either WireErr Statement +fromWire input = do + (m, r0) <- takeN 4 input + if m /= magic then Left BadMagic else Right () + (ver, r1) <- u16le r0 + if ver /= 1 then Left BadVersion else Right () + (stmt, r2) <- decStmt (length input) r1 + if length r2 /= 0 then Left TrailingBytes else Right stmt diff --git a/src/interface/parse/WIRE-FORMAT.adoc b/src/interface/parse/WIRE-FORMAT.adoc index 92794dc..617749a 100644 --- a/src/interface/parse/WIRE-FORMAT.adoc +++ b/src/interface/parse/WIRE-FORMAT.adoc @@ -4,16 +4,27 @@ // SPDX-License-Identifier: PMPL-1.0-or-later // Copyright (c) 2026 Jonathan D.A. Jewell (hyperpolymath) -The C-ABI marshalling format for a parsed `Statement`. The Rust -encoder/decoder (`wire.rs`) is normative for v1; the P5b-step-2 Idris2 -decoder MUST decode this identical byte stream into `Grammar.idr`'s -`Statement` (the certified type), proven total. Cross-language -conformance is by shared golden vectors. +The `Statement` marshalling format. The Rust encoder/decoder +(`wire.rs`) is normative for v1; the P5b-step-2 Idris2 decoder MUST +decode this identical byte stream into `Grammar.idr`'s `Statement` +(the certified type), proven total. Cross-language conformance is by +shared golden vectors. Design: a deterministic, versioned, length-prefixed binary TLV. *Rationale:* no parser dependency / no extra trust surface (vs CBOR), -fully auditable, byte-exact, total decode. This is the boundary the -hypatia<->verisim seam crosses; it is specified, not improvised. +fully auditable, byte-exact, total decode. This is the byte +representation of the `Statement` that crosses the hypatia<->verisim +seam; it is specified, not improvised. + +NOTE: This format is *transport-agnostic*. It is the marshalling +payload over the C-ABI **fallback** tier (trusted-certifier +attestation, P5d) *and* the proof-term payload over the estate's +**typed-wasm** target, where it rides with a checker kernel the +consumer re-runs (proof-carrying code — re-checkable, TCB = checker +kernel; the P5c objective). Re-checkable transport is impossible only +over a C ABI, not over typed-wasm; see the two-tier boundary model in +`verification/proofs/VERIFICATION-STANCE.adoc` ("What is NOT +proven"), the authoritative catalogue. == Conventions @@ -96,3 +107,41 @@ parser). `from_wire(to_wire(s)) == s` for every `Statement` (the round-trip proptest is the machine witness; bit-exact float preservation is pinned by a golden vector). `to_wire` is deterministic (canonical: exactly one byte string per value). + +== Certified decoder + cross-language conformance (P5b step 2) + +This format is no longer Rust-only. `VclTotal.Interface.WireDecode` +(`src/interface/WireDecode.idr`, in the CI-gated proof corpus +`verification/proofs/vclut-core.ipkg`) is a **second, independent +decoder of this exact byte stream** into `Grammar.idr`'s *certified* +`Statement`. It is machine-checked `%default total` with **zero +proof-escape** (no `believe_me`/`postulate`/`assert_*`/`idris_crash`/ +`sorry`): the same trusted-boundary totality contract as the Rust +`from_wire`, but discharged by the type-checker rather than asserted — +recursion is bounded by an input-length *fuel* `Nat` (sound because +every node costs >= 1 discriminant byte). + +Cross-language conformance is a *proof*, not a test: +`src/interface/parse/tests/conformance_emit.rs` is the regeneration +oracle (emits the exact `to_wire` image of fixed `Statement`s); +`VclTotal.Interface.WireConformance` embeds those bytes verbatim and +proves `fromWire goldenN = Right expectedN` by `Refl`. The corpus build +fails unless the Idris decoder, evaluated at compile time on the Rust +encoder's bytes, reduces to exactly the expected certified `Statement`. +Fixtures cover the minimal case, the full clause/agent/nested-expression +surface, and the float path. + +=== Disclosed limitation — NaN payload + +Idris2 0.8.0 has no pure, bit-exact `Bits64 -> Double`. The certified +decoder reconstructs finite values (incl. subnormals and `f64::MIN`/ +`MAX`) and both infinities **bit-exactly** by exact power-of-two +scaling. A NaN bit-pattern decodes to *a* NaN; the NaN *payload* is NOT +preserved across the Idris `Double` boundary. This is disclosed, not +papered over: every consumer still agrees "this is NaN" (the +semantically meaningful invariant), and exhaustive bit-exact float +preservation — including the NaN payload — remains witnessed on the +Rust side by `wire.rs::golden_bit_exact_floats`. The omitted +`VqlType` slots are filled `TAny` (the grammar's documented +"unresolved type before type checking"; the 10-level checker resolves +them at Level 2+) — faithful, not a shortcut. diff --git a/src/interface/parse/tests/conformance_emit.rs b/src/interface/parse/tests/conformance_emit.rs new file mode 100644 index 0000000..eae0e56 --- /dev/null +++ b/src/interface/parse/tests/conformance_emit.rs @@ -0,0 +1,130 @@ +// SPDX-License-Identifier: PMPL-1.0-or-later +// Copyright (c) 2026 Jonathan D.A. Jewell (hyperpolymath) + +//! P5b cross-language conformance: the **regeneration oracle** for the +//! Idris-side `VclTotal.Interface.WireConformance` `Refl` proofs. +//! +//! These are the exact, byte-for-byte `to_wire` outputs the Idris total +//! decoder must reproduce structurally. Run: +//! +//! ```text +//! cargo test --manifest-path src/interface/parse/Cargo.toml --locked \ +//! --test conformance_emit -- --nocapture emit +//! ``` +//! +//! and transcribe each `Fn = [..]` line into the corresponding +//! `goldenN` byte list in `src/interface/WireConformance.idr`. The +//! fixtures are deliberately chosen so the Idris equality is +//! *definitional* (`Refl`): no arbitrary `f64` (F3 uses 2.5, exactly +//! representable — bit-exact reconstruction is the Rust proptest's job, +//! `wire.rs::golden_bit_exact_floats`). + +#![allow(clippy::unwrap_used, clippy::expect_used, clippy::panic)] + +use vcltotal_parse::ast::*; +use vcltotal_parse::to_wire; + +fn line(name: &str, s: &Statement) { + let b = to_wire(s); + let body = b.iter().map(u8::to_string).collect::>().join(","); + println!("{name} = [{body}]"); +} + +fn f1() -> Statement { + Statement { + select_items: vec![SelectItem::Star], + source: Source::Store("main".to_string()), + where_clause: None, + group_by: vec![], + having: None, + order_by: vec![], + limit: None, + offset: None, + proof_clause: None, + effect_decl: None, + version_const: None, + linear_annot: None, + epistemic_clause: None, + requested_level: SafetyLevel::SchemaBound, + } +} + +fn f2() -> Statement { + Statement { + select_items: vec![ + SelectItem::Field(FieldRef { + modality: Modality::Graph, + field_name: "id".to_string(), + }), + SelectItem::Star, + ], + source: Source::Octad("uuid-1".to_string()), + where_clause: Some(Expr::Compare( + CompOp::Eq, + Box::new(Expr::Field(FieldRef { + modality: Modality::Vector, + field_name: "x".to_string(), + })), + Box::new(Expr::Literal(Literal::Int(7))), + )), + group_by: vec![], + having: None, + order_by: vec![( + FieldRef { + modality: Modality::Temporal, + field_name: "t".to_string(), + }, + true, + )], + limit: Some(10), + offset: None, + proof_clause: Some(ProofClause::Witness("w".to_string())), + effect_decl: Some(EffectDecl::ReadWrite), + version_const: Some(VersionConstraint::AtLeast(3)), + linear_annot: Some(LinearAnnotation::UseOnce), + epistemic_clause: Some(EpistemicClause { + agents: vec![Agent::Engine, Agent::Prover("lean4".to_string())], + requirements: vec![EpistemicRequirement::Knows( + Agent::Engine, + Expr::Literal(Literal::Bool(true)), + )], + }), + requested_level: SafetyLevel::EpistemicSafe, + } +} + +fn f3() -> Statement { + Statement { + select_items: vec![SelectItem::Star], + source: Source::Store("s".to_string()), + where_clause: Some(Expr::Literal(Literal::Float(2.5))), + group_by: vec![], + having: None, + order_by: vec![], + limit: None, + offset: None, + proof_clause: None, + effect_decl: None, + version_const: None, + linear_annot: None, + epistemic_clause: None, + requested_level: SafetyLevel::ParseSafe, + } +} + +#[test] +fn emit() { + line("golden1", &f1()); + line("golden2", &f2()); + line("golden3", &f3()); +} + +/// Self-check: every fixture round-trips through the Rust codec, so the +/// emitted bytes are a faithful `to_wire` image (the Idris decoder is +/// then proven to agree on the same bytes). +#[test] +fn fixtures_roundtrip() { + for s in [f1(), f2(), f3()] { + assert_eq!(vcltotal_parse::from_wire(&to_wire(&s)).unwrap(), s); + } +} diff --git a/verification/proofs/README.adoc b/verification/proofs/README.adoc index 7430697..6f77f99 100644 --- a/verification/proofs/README.adoc +++ b/verification/proofs/README.adoc @@ -9,8 +9,9 @@ This directory holds the *machine-verified* proof artefacts for vcl-ut. `vclut-core.ipkg`:: The machine-check package for the full `VclTotal` proof corpus. `idris2 --build verification/proofs/vclut-core.ipkg` exits 0 over - **10 modules** (`ABI.{Types,Layout,LayoutProofs}` + - `Core.{Grammar,Schema,Decide,Levels,Checker,Composition,Epistemic}`) + **12 modules** (`ABI.{Types,Layout,LayoutProofs}` + + `Core.{Grammar,Schema,Decide,Levels,Checker,Composition,Epistemic}` + + `Interface.{WireDecode,WireConformance}`) under `%default total` with **zero proof-escape symbols**. The `corpus/` subtree is a module→path symlink tree into `src/` (edits to `src/**` are checked here verbatim). CI-gated by @@ -26,14 +27,30 @@ This directory holds the *machine-verified* proof artefacts for vcl-ut. soundness lemma `checkLevel4Sound`, a negative injection test, and the genuine compositionality proof `noRawUserInputCompose`. +`VclTotal.Interface.WireDecode` + `.WireConformance`:: + The certified half of the hypatia↔verisim marshalling seam (P5b + step 2, hyperpolymath/vcl-ut#25). `WireDecode` is a total + (`%default total`, zero proof-escape) decoder of the v1 wire format + (`src/interface/parse/WIRE-FORMAT.adoc`) into `Grammar.idr`'s + certified `Statement`; recursion is bounded by an input-length fuel + `Nat`. `WireConformance` proves, by `Refl`, that this decoder agrees + byte-for-byte with the trusted Rust encoder (`wire.rs::to_wire`) on + golden fixtures emitted by `src/interface/parse/tests/ + conformance_emit.rs` — cross-language conformance as a build-time + proof. Disclosed: the NaN *payload* is not preserved across the + Idris `Double` boundary (finite + infinite values are bit-exact); + the Rust proptest remains the exhaustive float witness. + `VERIFICATION-STANCE.adoc`:: The authoritative, proof-backed statement of what is and is not verified. *Read this before trusting any "verified"/"type-safe" - phrasing elsewhere in the repo.* It records the full Phase 0→4 + phrasing elsewhere in the repo.* It records the full Phase 0→5 history (the Phase-0 corpus did not compile and was never machine-checked; Phases 1–4 fixed that *honestly* — corpus compiles, CI-gated, L1–L10 sound + dependent `SafetyCertificate`, FFI - fail-closed) and the precisely-scoped residual OWED items. + fail-closed; Phase 5 / vcl-ut#25 reinforces the boundary — trusted + Rust parser+codec, certified Idris wire decoder) and the + precisely-scoped residual OWED items. == Verifying @@ -41,7 +58,7 @@ This directory holds the *machine-verified* proof artefacts for vcl-ut. ---- export IDRIS2_PREFIX= rm -rf verification/proofs/build # avoid stale-ttc false green -idris2 --build verification/proofs/vclut-core.ipkg # expect exit 0 (10 modules) +idris2 --build verification/proofs/vclut-core.ipkg # expect exit 0 (12 modules) cd verification/proofs && idris2 --check SafetyL4Model.idr # expect exit 0 ---- diff --git a/verification/proofs/VERIFICATION-STANCE.adoc b/verification/proofs/VERIFICATION-STANCE.adoc index b9ea21b..01c4961 100644 --- a/verification/proofs/VERIFICATION-STANCE.adoc +++ b/verification/proofs/VERIFICATION-STANCE.adoc @@ -40,9 +40,13 @@ typecheck. The concrete blockers were: `Composition` (forward refs, deleted `AllParameterised` still referenced, wrong `CertL6+` arities, non-reducing `rewrite`s). -*Phase 1–4 status (RESOLVED):* every blocker above is fixed -*honestly* and the corpus compiles (10 modules; Phase 4 = `LayoutProofs` -+ L6–L10 `composeJoin` closure, PR #24). **Phase 2** de-vacuized L2/L3/L5 +*Phase 1–4 status (RESOLVED); Phase 5 boundary reinforcement (in +progress):* every blocker above is fixed *honestly* and the corpus +compiles (now **12 modules**: Phase 4 = `LayoutProofs` + L6–L10 +`composeJoin` closure, PR #24; Phase 5 / vcl-ut#25 adds +`Interface.{WireDecode,WireConformance}` — the certified wire decoder ++ cross-language `Refl` conformance, P5b step 2). **Phase 2** de-vacuized +L2/L3/L5 (new shared decider module `VclTotal.Core.Decide`, evidence-carrying predicates, `checkLevel2/3/5Sound`, genuine `composeJoin` closure). **Phase 3** added soundness for L1 and L6–L10 @@ -248,21 +252,69 @@ OWED section. Int` returns a level *only* on the `Just`-certificate branch of `certifyRequested` (structurally — no proof-escape). + - *Inherent boundary (not a defect).* A C ABI cannot transport a - dependent `SafetyCertificate`; the honest model is *trusted-certifier - attestation* (as in proof-carrying code: the consumer trusts the - checker that minted the integer). What a hypatia/verisim caller - receives is therefore at most an attestation from the Idris certifier, - never a re-checkable proof object — by construction of any FFI. + *Boundary model — two tiers, not one (canonical; other mentions + refer here).* The achievable guarantee depends on the *transport*, + and the impossibility result is specific to an untyped one: + - *NAMED OWED (absent, not faked):* (1) **no string→`Statement` - parser exists anywhere in the repo** — the corpus certifies an - already-built AST, so the `verify(query_string)` FFI shape - *presupposes a component nobody has written*; (2) C-ABI - `Statement`/`OctadSchema` marshalling; (3) the Idris→C build that - would let the shim actually call `certifiedLevel`; (4) the real - backend symbol (`vclut_rs_verify`) is declared in intent but not - linked. Until (1)–(4) exist the FFI is honestly fail-closed. + ** *C-ABI tier (fallback).* A C ABI erases types to machine words, + so it cannot transport a re-checkable dependent + `SafetyCertificate`. Over a C ABI the honest ceiling is + *trusted-certifier attestation*: the consumer trusts the checker + that minted the token; the token is unforgeable and bound to + `(sha256(query), schema_id, level)` (P5d). This tier exists only + for consumers reachable solely over C. + ** *typed-wasm tier (estate target, the real objective).* The + estate's transport/compilation target is **typed-wasm**, whose + module boundary is *typed, not erased*. The proof term crosses + faithfully (no flattening to an `int`) and a value of the proof + type cannot be forged without the linear capability. Riding a + small **proof-checker kernel** along with it — which the consumer + runs — yields *proof-carrying code*: the consumer **re-validates** + the certificate. The TCB shrinks from "trust the whole certifier" + to "trust the checker kernel + the typed-wasm verifier". (typed- + wasm's own type system is linear/structural, *not* dependent, so + it does not re-check the safety predicate itself — the shipped + kernel does. That is standard PCC and is strictly stronger than + attestation.) ++ + So "a re-checkable dependent proof is impossible by construction" is + true *of a C ABI*, **not of any FFI**: the estate-aligned objective + (P5c) is the typed-wasm PCC tier; C-ABI attestation (P5d) is the + explicit, weaker fallback — not the ceiling. (The current Zig shim is + the C tier; the earlier `ffi/zig` line is fail-closed scaffolding, + not the architectural endpoint.) ++ + *NAMED OWED (absent, not faked) — Phase-5 progress folded in:* + (1) a string→`Statement` *frontend* now exists as the **trusted** + Rust/SPARK-grade `src/interface/parse` crate (P5a, #26): it parses to + a faithful Rust mirror of `Grammar.idr`, panic-free/total by + statically-enforced lints. It is *trusted*, not *certified* — it is + not itself the dependently-typed checker. (2) C-ABI `Statement` + marshalling: the *decode* direction is now **certified** — + `VclTotal.Interface.WireDecode` totally decodes the deterministic + versioned wire format (`src/interface/parse/WIRE-FORMAT.adoc`; + encoder `wire.rs::to_wire`, P5b step 1, #28) into the certified + `Statement` (`%default total`, zero proof-escape, fuel-bounded + recursion), and `VclTotal.Interface.WireConformance` proves it + byte-for-byte conformant with the Rust encoder by `Refl` (P5b + step 2). The *encode* direction (Idris `Statement`→wire) and + `OctadSchema` marshalling remain OWED. (3) the **typed-wasm PCC + transport** — the proof term + a checker kernel the consumer + re-runs, per the two-tier boundary model above — is the P5c + objective (the C-ABI/Zig shim is retained only as the fail-closed + attestation fallback, *not* the endpoint) and remains OWED; (4) the + real backend symbol (`vclut_rs_verify`) linkage remains OWED, as + does the P5d signed-attestation *fallback* contract. Until these + exist the FFI is honestly fail-closed. *Disclosed limitation:* the + certified decoder cannot reconstruct a NaN *payload* (Idris2 0.8.0 + has no pure bit-exact `Bits64 -> Double`); it decodes any NaN + bit-pattern to a NaN — finite and infinite values are bit-exact. + The exhaustive bit-exact float witness (incl. NaN bits) remains the + Rust proptest `wire.rs::golden_bit_exact_floats`. This is a + disclosed limitation, not a hidden gap. Per the two-tier model + above, a re-checkable proof is impossible only over the *C-ABI* + fallback tier; the estate-target *typed-wasm* tier (P5c) **is** + re-checkable (PCC). == Consumer guidance (hypatia / verisim) @@ -286,6 +338,17 @@ OWED section. deleted, not faked); its genuine alignment/no-padding theorems are OWED. `VclTotal.ABI.Foreign` contains *no proofs* (FFI plumbing) — do not treat its returned level tag as a verified claim. +* *Transport target (forward guidance).* The end goal is **not** + "trust our attestation". Over the estate's typed-wasm boundary the + certificate will be transported as a proof term re-validated by a + small checker kernel you run (proof-carrying code): you will be able + to *re-check*, with a TCB of the checker kernel + the typed-wasm + verifier, not the whole certifier (P5c). Trusted-certifier + attestation (P5d) is the explicit *fallback* for consumers reachable + only over a C ABI — weaker, and labelled as such. Until P5c lands, + today's guidance stands (advisory tag only); it is a *plumbing* + gap, and the ceiling is PCC, not attestation. See the two-tier + boundary model in "What is NOT proven". == Remediation roadmap (phased, tracked standards#124) @@ -329,9 +392,11 @@ OWED section. shim to `certifiedLevel`; the genuine `ABI.Layout` alignment/no-padding theorems; strengthen the L6–L10 predicates to the full checker semantics; `ABI.Foreign` plumbing; L3's disclosed subquery/heuristic - scoping. (A C ABI carrying a re-checkable dependent proof is *not* - on this list — it is impossible by construction; the honest target is - trusted-certifier attestation.) + scoping. (Re-checkable transport is *not* a gap to be closed over a + C ABI — impossible there by construction — but it **is** the + objective over the estate's typed-wasm target via PCC; see the + two-tier boundary model above. C-ABI attestation is the fallback + tier, not the ceiling.) . *Phase 4 (DONE in-corpus, this PR — discharges the Phase-3 "owed" layout + join-closure debt):* *(4a)* the genuine `ABI.Layout` alignment / no-padding / bounds @@ -352,11 +417,51 @@ OWED section. `checkLevelN` semantics; FFI transport; the additive↔ceil sliver; the unchanged Phase-3 string→`Statement`/marshalling/Idris→C items. This residual is *tracked, not forgotten*: the boundary-reinforcement - workstream (FFI proof-attestation transport + the missing + workstream (typed-wasm PCC proof-transport + the missing `String -> Statement` parser, priority 1; the secondary items above) - is hyperpolymath/vcl-ut#25. A C ABI carrying a *re-checkable* - dependent proof is impossible by construction — the honest target - there is trusted-certifier attestation, stated not hidden. + is hyperpolymath/vcl-ut#25. Re-checkable transport is impossible only + over a *C ABI*; over the estate's typed-wasm target it is the + objective (PCC, TCB = checker kernel), with C-ABI attestation as the + fallback tier — see the two-tier boundary model above. +. *Phase 5 (boundary reinforcement, hyperpolymath/vcl-ut#25 — in + progress):* the residual tracked above is being discharged in + precisely-scoped steps. *P5a (#26):* the **trusted** Rust/SPARK-grade + `src/interface/parse` parser (string → faithful Rust mirror of + `Grammar.idr`; panic-free/total by static lints) — the previously + *absent* string→AST frontend now exists (trusted, not certified). + *P5b step 1 (#28):* a deterministic, versioned, length-prefixed wire + codec (`wire.rs`, normative `WIRE-FORMAT.adoc`) — the C-ABI + marshalling format, no extra trust surface. *P5b step 2 (this PR):* + the **certified** decode side — `VclTotal.Interface.WireDecode` + totally decodes that exact format into the certified `Statement` + (`%default total`, ZERO proof-escape; recursion bounded by an + input-length fuel `Nat`, sound because every node costs ≥1 + discriminant byte; omitted `VqlType` slots filled `TAny` per the + grammar's documented pre-typecheck convention), and + `VclTotal.Interface.WireConformance` proves byte-for-byte agreement + with the Rust `to_wire` encoder by `Refl` on shared golden fixtures + (regeneration oracle `tests/conformance_emit.rs`) — cross-language + conformance as a build-time proof, not a test. The marshalling + *decode* side is therefore now machine-verified. + *Disclosed limitation (not a hidden gap):* the certified decoder + decodes any NaN bit-pattern to *a* NaN (Idris2 0.8.0 has no pure + bit-exact `Bits64 -> Double`); finite values (incl. subnormals, + `f64::MIN`/`MAX`) and both infinities are bit-exact. The exhaustive + bit-exact float witness, NaN payload included, stays the Rust + proptest `wire.rs::golden_bit_exact_floats`. *Still OWED (precisely):* + the *encode* direction (Idris→wire) + `OctadSchema` marshalling; + **P5c — typed-wasm PCC transport**: carry the proof term across the + *typed* (non-erasing) typed-wasm boundary plus a small proof-checker + kernel the consumer re-runs, so the certificate is **re-validated**, + not merely attested (TCB = checker kernel + typed-wasm verifier; see + the two-tier boundary model above). The pre-existing `ffi/zig` + fail-closed shim is retained *only* as the C-ABI attestation + fallback for non-typed-wasm consumers — it is not the architectural + endpoint. *P5d (fallback tier):* the signed-attestation contract + over `(sha256(query), schema_id, level)` + ADR + the OWED→RESOLVED + stance flip, for consumers reachable only over C. Re-checkable + transport is impossible *only* over a C ABI; over the estate target + (typed-wasm) it is the explicit P5c objective. == Provenance @@ -370,13 +475,19 @@ FFI fabricated-level lie deleted (fail-closed, `zig build test` green) + proof-gated `Checker.certifiedLevel` mint. Phase 4 2026-05-19: `ABI.LayoutProofs` (genuine alignment/no-padding/bounds, 10th ipkg module) + L6–L10 `composeJoin` closure (`l6..l9Compose`, -`epiStructJoin`); corpus rebuilt clean, 10 modules, zero proof-escape. -The corpus builds clean as -*9 modules* +`epiStructJoin`); corpus rebuilt clean, zero proof-escape. +Phase 5 / vcl-ut#25 2026-05-19: P5a trusted Rust parser (#26) + P5b +step 1 wire codec (#28); **P5b step 2 (this PR):** +`VclTotal.Interface.WireDecode` (certified total wire decoder) + +`VclTotal.Interface.WireConformance` (`Refl` cross-language conformance +vs the Rust encoder), corpus rebuilt clean. +The corpus now builds clean as +*12 modules* (`idris2 0.8.0 --build verification/proofs/vclut-core.ipkg`, exit 0, -`%default total`, zero proof-escape symbols — CI-enforced), the added -modules being `VclTotal.Core.Decide` (shared deciders) and -`VclTotal.ABI.Layout` (now sound). +`%default total`, zero proof-escape symbols — CI-enforced): +`VclTotal.ABI.{Types,Layout,LayoutProofs}` + +`VclTotal.Core.{Grammar,Schema,Decide,Levels,Checker,Composition, +Epistemic}` + `VclTotal.Interface.{WireDecode,WireConformance}`. No proof-escape symbol was used to force any green; every obligation that could not be honestly discharged is scoped OWED above rather than masked. @@ -399,16 +510,27 @@ Notes for reproduction / honesty: The residual risk is now purely at the *boundary*, and the boundary no longer lies: the FFI is *fail-closed* (no fabricated level; `zig build test` green) and a genuine proof-gated mint (`Checker.certifiedLevel`) -exists in the CI-gated corpus. What remains OWED is *plumbing and an -unwritten parser*, not a soundness gap: there is **no string→`Statement` -parser anywhere in the repo** (so the `verify(query_string)` FFI shape -presupposes a component nobody has written), no C-ABI `Statement` -marshalling, and no Idris→C build linking the shim to the certifier; -plus the L6–L10 *predicate* shallowness, the genuine `ABI.Layout` -alignment/no-padding theorems, `ABI.Foreign` plumbing, and L3's -disclosed subquery/heuristic scoping. A C ABI carrying a *re-checkable -dependent proof* is impossible by construction — the honest target is -trusted-certifier attestation, and that trust boundary is stated, not +exists in the CI-gated corpus. What remains OWED is *plumbing*, not a +soundness gap. Phase 5 / vcl-ut#25 has closed the previously-headline +"no parser" item: a string→AST frontend now exists as the **trusted** +Rust/SPARK-grade `src/interface/parse` parser (P5a, #26 — trusted, not +certified), and the `Statement` *decode* side is now **certified** +(`VclTotal.Interface.WireDecode`, total/zero-escape, proven byte-for-byte +conformant with the Rust encoder by `Refl` in +`VclTotal.Interface.WireConformance`, P5b #28 + this PR). Still OWED: +the *encode* direction + `OctadSchema` marshalling; **P5c — typed-wasm +PCC transport** (proof term + checker kernel the consumer re-runs, so +the certificate is re-validated, not merely attested; the `ffi/zig` +fail-closed shim retained only as the C-ABI attestation fallback); +the P5d signed-attestation *fallback* contract; plus the L6–L10 +*predicate* shallowness, the genuine `ABI.Layout` alignment/no-padding +theorems, `ABI.Foreign` plumbing, L3's disclosed subquery/heuristic +scoping, and the disclosed NaN-payload limitation of the Idris +`Double` boundary (finite/infinite values bit-exact; Rust proptest is +the exhaustive float witness). Re-checkable transport is impossible +*only* over a C ABI; over the estate's typed-wasm target it is the +explicit objective (PCC, TCB = checker kernel) — see the canonical +two-tier boundary model above. That trust boundary is stated, not hidden. The former headline residuals (*non-compiling corpus*, *vacuous L2/L3/L5 predicates*, *decorative certificate*, *unsound `ABI.Layout`*, *lying FFI*) are all resolved. diff --git a/verification/proofs/corpus/VclTotal/Interface/WireConformance.idr b/verification/proofs/corpus/VclTotal/Interface/WireConformance.idr new file mode 120000 index 0000000..bac0367 --- /dev/null +++ b/verification/proofs/corpus/VclTotal/Interface/WireConformance.idr @@ -0,0 +1 @@ +../../../../../src/interface/WireConformance.idr \ No newline at end of file diff --git a/verification/proofs/corpus/VclTotal/Interface/WireDecode.idr b/verification/proofs/corpus/VclTotal/Interface/WireDecode.idr new file mode 120000 index 0000000..8cf9b8f --- /dev/null +++ b/verification/proofs/corpus/VclTotal/Interface/WireDecode.idr @@ -0,0 +1 @@ +../../../../../src/interface/WireDecode.idr \ No newline at end of file diff --git a/verification/proofs/vclut-core.ipkg b/verification/proofs/vclut-core.ipkg index abd9cc0..61e4a1d 100644 --- a/verification/proofs/vclut-core.ipkg +++ b/verification/proofs/vclut-core.ipkg @@ -37,3 +37,5 @@ modules = VclTotal.ABI.Types , VclTotal.Core.Checker , VclTotal.Core.Composition , VclTotal.Core.Epistemic + , VclTotal.Interface.WireDecode + , VclTotal.Interface.WireConformance