hyperpolymath · hyperpolymath · May 21, 2026 · May 21, 2026 · May 21, 2026
diff --git a/PROOF-NEEDS.md b/PROOF-NEEDS.md
@@ -25,11 +25,11 @@ Classification key: **(J)** genuinely unavoidable, documented as an axiom;
 
 | # | Site | Function | Type | Class | Rationale |
 |---|------|----------|------|-------|-----------|
-| 1 | `SafetyLemmas.idr:53` | `charEqSound` | `(c1,c2:Char) -> c1 == c2 = True -> c1 = c2` | **J ✓** | `Char` is an opaque primitive; `==` is `prim__eqChar` (foreign `Bool`). Idris2 0.8.0 has no in-language soundness principle for primitive equality. Standard, well-understood axiom. **Externally validated** via backend-assurance harness (`docs/backend-assurance/prim__eqChar.md` + BEAM property test). |
-| 2 | `SafetyLemmas.idr:60` | `charEqSym` | `(x,y:Char) -> (x == y) = (y == x)` | **J ✓** | Symmetry of `prim__eqChar`. Same reason as #1 — opaque primitive, no decision procedure to recurse on. **Externally validated** (same harness as #1). |
-| 3 | `SafetyLemmas.idr:211` | `unpackLength` | `length (unpack s) = length s` | **J** | `unpack` = `prim__strToCharList` (foreign). `String` is opaque with no induction principle; the relation between primitive `String` length and `List Char` length is not reducible in-language. _Backend-assurance harness pending._ |
-| 4 | `SafetyLemmas.idr:219` | `appendLengthSum` | `length (s ++ t) = length s + length t` | **J ✓** | `++` on `String` = `prim__strAppend` (foreign). Length additivity is a backend-semantics guarantee, not type-level reducible. **Externally validated** via backend-assurance harness (`docs/backend-assurance/prim__strAppend.md` + BEAM property test). |
-| 5 | `SafetyLemmas.idr:226` | `substrLengthBound` | `LTE (length (substr start len s)) len` | **J ✓** | `substr` = `prim__strSubstr` (foreign). The "result no longer than `len`" bound is a primitive-semantics guarantee with no in-language proof. **Externally validated** via backend-assurance harness (`docs/backend-assurance/prim__strSubstr.md` + BEAM property test). |
+| 1 | `SafetyLemmas.idr:60` | `charEqSound` | `(c1,c2:Char) -> c1 == c2 = True -> c1 = c2` | **J ✓** | `Char` is an opaque primitive; `==` is `prim__eqChar` (foreign `Bool`). Idris2 0.8.0 has no in-language soundness principle for primitive equality. Standard, well-understood axiom. **Externally validated** via backend-assurance harness (`docs/backend-assurance/prim__eqChar.md` + BEAM property test). |
+| 2 | `SafetyLemmas.idr:67` | `charEqSym` | `(x,y:Char) -> (x == y) = (y == x)` | **J ✓** | Symmetry of `prim__eqChar`. Same reason as #1 — opaque primitive, no decision procedure to recurse on. **Externally validated** (same harness as #1). |
+| 3 | `SafetyLemmas.idr:218` | `unpackLength` | `length (unpack s) = length s` | **J ✓** | `unpack` = `prim__strToCharList` (foreign). `String` is opaque with no induction principle; the relation between primitive `String` length and `List Char` length is not reducible in-language. **Externally validated** via backend-assurance harness (`docs/backend-assurance/prim__strToCharList.md` + BEAM property test). |
+| 4 | `SafetyLemmas.idr:226` | `appendLengthSum` | `length (s ++ t) = length s + length t` | **J ✓** | `++` on `String` = `prim__strAppend` (foreign). Length additivity is a backend-semantics guarantee, not type-level reducible. **Externally validated** via backend-assurance harness (`docs/backend-assurance/prim__strAppend.md` + BEAM property test). |
+| 5 | `SafetyLemmas.idr:233` | `substrLengthBound` | `LTE (length (substr start len s)) len` | **J ✓** | `substr` = `prim__strSubstr` (foreign). The "result no longer than `len`" bound is a primitive-semantics guarantee with no in-language proof. **Externally validated** via backend-assurance harness (`docs/backend-assurance/prim__strSubstr.md` + BEAM property test). |
 
 **Verdict: 5/5 are class (J).** All five reduce to the same root cause:
 Idris2 treats `Char` and `String` as opaque primitive types whose
@@ -83,11 +83,11 @@ All five are class **(J)** — genuinely unavoidable in Idris2 0.8.0
 
 | Axiom | Site | Justification | Backend-assurance evidence |
 |-------|------|---------------|----------------------------|
-| `charEqSound` | `SafetyLemmas.idr:53` | Soundness of `prim__eqChar` — backend primitive correctness, externally validated | `docs/backend-assurance/prim__eqChar.md` + `elixir/test/backend_assurance/prim_eq_char_test.exs` |
-| `charEqSym` | `SafetyLemmas.idr:60` | Symmetry of `prim__eqChar` — backend primitive correctness, externally validated | `docs/backend-assurance/prim__eqChar.md` + `elixir/test/backend_assurance/prim_eq_char_test.exs` |
-| `unpackLength` | `SafetyLemmas.idr:211` | `prim__strToCharList` preserves length — backend primitive correctness | _pending_ |
-| `appendLengthSum` | `SafetyLemmas.idr:219` | `prim__strAppend` length semantics — not reducible at type level, externally validated | `docs/backend-assurance/prim__strAppend.md` + `elixir/test/backend_assurance/prim_str_append_test.exs` |
-| `substrLengthBound` | `SafetyLemmas.idr:226` | `prim__strSubstr` length bound — not reducible at type level, externally validated | `docs/backend-assurance/prim__strSubstr.md` + `elixir/test/backend_assurance/prim_str_substr_test.exs` |
+| `charEqSound` | `SafetyLemmas.idr:60` | Soundness of `prim__eqChar` — backend primitive correctness, externally validated | `docs/backend-assurance/prim__eqChar.md` + `elixir/test/backend_assurance/prim_eq_char_test.exs` |
+| `charEqSym` | `SafetyLemmas.idr:67` | Symmetry of `prim__eqChar` — backend primitive correctness, externally validated | `docs/backend-assurance/prim__eqChar.md` + `elixir/test/backend_assurance/prim_eq_char_test.exs` |
+| `unpackLength` | `SafetyLemmas.idr:218` | `prim__strToCharList` preserves length — backend primitive correctness, externally validated | `docs/backend-assurance/prim__strToCharList.md` + `elixir/test/backend_assurance/prim_str_to_char_list_test.exs` |
+| `appendLengthSum` | `SafetyLemmas.idr:226` | `prim__strAppend` length semantics — not reducible at type level, externally validated | `docs/backend-assurance/prim__strAppend.md` + `elixir/test/backend_assurance/prim_str_append_test.exs` |
+| `substrLengthBound` | `SafetyLemmas.idr:233` | `prim__strSubstr` length bound — not reducible at type level, externally validated | `docs/backend-assurance/prim__strSubstr.md` + `elixir/test/backend_assurance/prim_str_substr_test.exs` |
 
 Note: `logSafeBounded` in SafeAPIKey.idr no longer uses `believe_me` directly;
 it calls the documented SafetyLemmas axioms via structural proof.
@@ -115,10 +115,10 @@ from "we trust the backend" to "we read the lowering and randomly
 tested the operation".
 
 Primitives validated so far: `prim__eqChar` (covering `charEqSound`
-and `charEqSym`), `prim__strAppend` (covering `appendLengthSum`),
-`prim__strSubstr` (covering `substrLengthBound`). Remaining:
-`prim__strToCharList` (covering `unpackLength`). Tracked under epic
-#87 Tier C backend-assurance campaign.
+and `charEqSym`), `prim__strToCharList` (covering `unpackLength`),
+`prim__strAppend` (covering `appendLengthSum`), and `prim__strSubstr`
+(covering `substrLengthBound`). Tracked under epic #87 Tier C
+backend-assurance campaign.
 
 ## Priority Going Forward
 

diff --git a/docs/backend-assurance/README.md b/docs/backend-assurance/README.md
@@ -27,12 +27,11 @@ companion property-test harness lives under
 | Primitive          | Axioms covered                       | Trusted-extraction               | Property test                                                  |
 |--------------------|--------------------------------------|----------------------------------|----------------------------------------------------------------|
 | `prim__eqChar`     | `charEqSound`, `charEqSym`           | `prim__eqChar.md`                | `elixir/test/backend_assurance/prim_eq_char_test.exs`          |
-| `prim__strToCharList` | `unpackLength`                    | _pending_                        | _pending_                                                      |
+| `prim__strToCharList` | `unpackLength`                    | `prim__strToCharList.md`         | `elixir/test/backend_assurance/prim_str_to_char_list_test.exs` |
 | `prim__strAppend`  | `appendLengthSum`                    | `prim__strAppend.md`             | `elixir/test/backend_assurance/prim_str_append_test.exs`       |
 | `prim__strSubstr`  | `substrLengthBound`                  | `prim__strSubstr.md`             | `elixir/test/backend_assurance/prim_str_substr_test.exs`       |
 
-Each row is delivered as one PR per primitive. `prim__eqChar` is the
-first; the other three follow the same shape.
+Each row is delivered as one PR per primitive.
 
 ## Constraints
 

diff --git a/docs/backend-assurance/prim__strAppend.md b/docs/backend-assurance/prim__strAppend.md
@@ -55,18 +55,19 @@ compile-time-only artefacts; the runtime strings flowing through the
 system are BEAM UTF-8 binaries. On BEAM:
 
 - **String model.** Elixir strings are UTF-8 encoded binaries.
-  `String.length/1` returns the **codepoint count** (not the byte
-  count) by scanning the UTF-8 boundaries. This matches Idris2's
-  `length`-on-`String` semantics on Chez (codepoint count).
+  Idris2's `length`-on-`String` semantics on Chez are codepoint
+  count. Elixir's `String.length/1` counts grapheme clusters, so the
+  BEAM-side harness measures codepoint count explicitly via
+  `String.codepoints/1`.
 - **Concatenation lowering.** Elixir's `<>` on binaries is the
   built-in `bif erlang:'++'/2` for iolists, ultimately compiling to
   a byte-level binary append. UTF-8 is prefix-free: appending two
   valid UTF-8 byte sequences yields a valid UTF-8 byte sequence whose
   codepoint boundaries are exactly the boundaries of the operands.
 - **Length additivity.** Because UTF-8 is prefix-free, the codepoint
   boundaries of `s <> t` are exactly the boundaries of `s` followed
-  by the boundaries of `t`. Hence
-  `String.length(s <> t) == String.length(s) + String.length(t)`.
+  by the boundaries of `t`. Hence the codepoint count of `s <> t`
+  equals the codepoint count of `s` plus the codepoint count of `t`.
 
 The property test exercises this over random strings sampled from the
 legal codepoint range (excluding surrogates), plus explicit boundary
@@ -75,14 +76,14 @@ the empty-string identity case.
 
 ## Why this isn't circular
 
-The harness does not call `prim__strAppend`. It calls Elixir `<>` and
-`String.length/1` directly on UTF-8 binaries. The argument is: *the
-operation that Idris2 lowers `prim__strAppend` + `prim__strLength` to
-on the BEAM is `<>` + `String.length/1` on UTF-8 binaries*, so
-demonstrating those operations satisfy the property is sufficient.
-The trusted-extraction step is reading the lowering; the
-property-test step is verifying the operation behaves as the lowering
-claims.
+The harness does not call `prim__strAppend`. It calls Elixir `<>`
+directly on UTF-8 binaries and measures codepoint count explicitly.
+The argument is: *BEAM binary concatenation preserves the exact
+UTF-8 codepoint sequence of both operands*, so demonstrating that the
+resulting codepoint count is additive validates the backend operation
+at the semantic level the axiom uses. The trusted-extraction step is
+reading the lowering; the property-test step is verifying the
+operation behaves as the lowering claims.
 
 For Chez, we do not run a Scheme harness — R6RS is sufficient
 documentary evidence. If BoJ ever ships a backend whose string model
@@ -126,7 +127,7 @@ byte length** — see the *Honest framing* clause in
   for `prim__strAppend` and `prim__strLength`.
 - R6RS §6.7, §11.12 — Scheme string model and `string-append`
   specification.
-- Elixir `String` module documentation — UTF-8 codepoint counting via
-  `String.length/1`.
+- Elixir `String` module documentation — UTF-8 codepoint enumeration
+  via `String.codepoints/1`.
 - `PROOF-NEEDS.md` — axiom audit (2026-05-18) and class-(J) framing.
 - `src/abi/Boj/SafetyLemmas.idr` — axiom declaration (line 226).
diff --git a/docs/backend-assurance/prim__strSubstr.md b/docs/backend-assurance/prim__strSubstr.md
@@ -64,26 +64,28 @@ the R6RS guarantee for `substring` combined with the codegen's clamp.
 ## BEAM backend (Erlang / Elixir, where BoJ runs)
 
 BoJ's REST surface is Elixir on the BEAM. The runtime strings are
-UTF-8 encoded binaries; substring extraction goes through Elixir's
-`String.slice/3`.
+UTF-8 encoded binaries; the BEAM-side harness models substring over
+the decoded codepoint sequence (`String.to_charlist/1` + `Enum.slice/3`
++ `to_string/1`).
 
 On BEAM:
 
 - **String model.** As with `prim__strAppend`, strings are UTF-8
-  binaries and `String.length/1` returns codepoint count. The
-  `length` referred to in the axiom is codepoint count, matching
-  Idris2 semantics on Chez.
-- **`String.slice/3` semantics.** Per the Elixir `String` module
-  documentation, `String.slice(s, start, len)` returns at most `len`
-  codepoints from `s`, starting at codepoint offset `start`. The
-  function clamps to the string's actual codepoint length: when
-  `start ≥ length(s)`, the result is `""`; when `start + len >
-  length(s)`, the result is the suffix from `start` to the end (which
-  is strictly shorter than `len`).
-- **Length bound.** Combining the two facts:
-  `String.length(String.slice(s, start, len)) ≤ len` for all
-  `(start, len, s)`. The clamp can only *shorten* the result;
-  it never extends past `len`.
+  binaries. The `length` referred to in the axiom is codepoint count,
+  matching Idris2 semantics on Chez. Elixir's `String.length/1` and
+  `String.slice/3` are grapheme-oriented, so the BEAM-side harness
+  uses explicit codepoint operations.
+- **Codepoint-slice semantics.** The harness decodes `s` with
+  `String.to_charlist/1`, slices that codepoint list with
+  `Enum.slice(start, len)`, and re-encodes it with `to_string/1`.
+  This returns at most `len` codepoints from `s`, starting at
+  codepoint offset `start`. The slice clamps to the string's actual
+  codepoint length: when `start ≥ length(s)`, the result is `""`;
+  when `start + len > length(s)`, the result is the suffix from
+  `start` to the end (which is strictly shorter than `len`).
+- **Length bound.** Combining the two facts: the codepoint count of
+  the codepoint slice is `≤ len` for all `(start, len, s)`. The clamp
+  can only *shorten* the result; it never extends past `len`.
 
 The property test exercises this over random `(start, len, s)` tuples
 plus explicit boundary cases: `len = 0`, `start ≥ length(s)`,
@@ -93,12 +95,13 @@ where codepoint count differs from byte count.
 ## Why this isn't circular
 
 The harness does not call `prim__strSubstr`. It calls Elixir
-`String.slice/3` and `String.length/1` directly. The argument is:
-*the operation that Idris2 lowers `prim__strSubstr` to on the BEAM
-is `String.slice/3` on a UTF-8 binary*, so demonstrating
-`String.slice/3` satisfies the property is sufficient. The
-trusted-extraction step is reading the lowering; the property-test
-step is verifying the operation behaves as the lowering claims.
+charlist/codepoint operations directly. The argument is: *a
+codepoint-level substring operation over a UTF-8 binary clamps to at
+most the requested number of codepoints*, so demonstrating that
+operation satisfies the property validates the backend semantics the
+axiom uses. The trusted-extraction step is reading the lowering; the
+property-test step is verifying the operation behaves as the lowering
+claims.
 
 For Chez, we do not run a Scheme harness — the R6RS `substring`
 semantics combined with the codegen's clamp are sufficient
@@ -108,7 +111,7 @@ documentary evidence.
 
 - **`len = 0`.** The tight corner of the bound: result must be the
   empty string. R6RS `substring` with `end = start` returns `""`;
-  BEAM `String.slice/3` with `len = 0` returns `""`. Both satisfy
+  the BEAM codepoint slice with `len = 0` returns `""`. Both satisfy
   `LTE 0 0`.
 - **`start ≥ length(s)`** (start past end). Both backends clamp to
   the empty string; the bound `LTE 0 len` is trivial.
@@ -120,9 +123,9 @@ documentary evidence.
   result is `s` itself; the bound is tight (`LTE length(s)
   length(s)`).
 - **Multi-byte codepoints.** Tested via boundary strings covering
-  ASCII, Latin-1 supplement, CJK, and astral plane. `String.slice/3`
-  counts codepoints, not bytes, so a slice of length `len` from an
-  emoji-heavy string spans `4 * len` bytes but `len` codepoints.
+  ASCII, Latin-1 supplement, CJK, and astral plane. The harness slices
+  codepoints, not bytes, so a slice of length `len` from an emoji-heavy
+  string spans up to `4 * len` bytes but at most `len` codepoints.
 - **Surrogates** (`0xD800..0xDFFF`): excluded from the codepoint
   generator. Same rationale as `prim__strAppend`.
 - **Negative `start` / `len`.** Idris2's `substr` takes `Nat`, so
@@ -137,7 +140,7 @@ documentary evidence.
 - Idris2 0.8.0 `src/Compiler/Scheme/Chez.idr` — Chez codegen for
   `prim__strSubstr` (clamp + R6RS `substring`).
 - R6RS §11.12 — `substring` specification.
-- Elixir `String` module documentation — `String.slice/3` clamp
-  semantics.
+- Elixir `String` module documentation — `String.to_charlist/1`
+  semantics over UTF-8 binaries.
 - `PROOF-NEEDS.md` — axiom audit (2026-05-18) and class-(J) framing.
 - `src/abi/Boj/SafetyLemmas.idr` — axiom declaration (line 233).