docs(res-to-affine): corpus run + regex precision fixes (Refs #57) #319
Merged
First end-to-end run of the Phase-1 scanner against the estate's 491
deduplicated .res files (idaptik + gitbot-fleet) surfaced two
high-impact false-positive sources in the top-level regexes:
- `side-effect-import` fired on in-function `let _ = X.foo(...)`,
ReScript's normal "discard a chained call's return value" idiom
(1,181 hits, vast majority indented; LESSONS.md's anti-pattern is
the top-level form only).
- `mutable-global` fired on any `:=`, including local ref mutation
inside function bodies (653 hits, none at column 0 in the corpus).
Both regexes are now anchored at column 0, matching the LESSONS.md
anti-pattern shape (module-load side effect, module-scoped mutable).
`re_untyped_exn` also gains a `(^|...)` alternation so `raise` / `try`
at column 0 is no longer missed.
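
The effect of the anchoring can be illustrated with a small sketch. It is written in Python purely for illustration; the real scanner lives in `scanner.ml`. Every pattern and the ReScript sample below are hypothetical stand-ins, not the regexes from this change. In particular, `[ \t]` is only an assumed reading of the elided `(^|...)` alternation.

```python
import re

# Hypothetical ReScript snippet, invented for this illustration.
SAMPLE = """\
let _ = Setup.init()

let run = () => {
  let _ = Api.send(1)
  let n = ref(0)
  n := n.contents + 1
  if n.contents > 1 { raise(Not_found) }
}

try boot() catch {
| _ => ()
}
"""

# Assumed "before" patterns: they fire at any indentation.
SIDE_EFFECT_LOOSE = re.compile(r"let _ = ")
MUTABLE_LOOSE = re.compile(r":=")
# Assumed "before" exception pattern: needs a blank before the keyword,
# so a keyword at column 0 is missed.
EXN_LOOSE = re.compile(r"[ \t](?:raise|try)\b")

# Assumed "after" patterns: anchored at column 0 (start of a line).
SIDE_EFFECT_COL0 = re.compile(r"^let _ = ", re.MULTILINE)
MUTABLE_COL0 = re.compile(r"^\w[\w.]*\s*:=", re.MULTILINE)
# The (^|...) idea: accept either start of line or a preceding blank.
EXN_FIXED = re.compile(r"(?:^|[ \t])(?:raise|try)\b", re.MULTILINE)


def count(pattern, text):
    """Number of non-overlapping matches of pattern in text."""
    return len(pattern.findall(text))


for name, before, after in [
    ("side-effect-import", SIDE_EFFECT_LOOSE, SIDE_EFFECT_COL0),
    ("mutable-global", MUTABLE_LOOSE, MUTABLE_COL0),
    ("untyped-exception", EXN_LOOSE, EXN_FIXED),
]:
    print(name, count(before, SAMPLE), "->", count(after, SAMPLE))
```

On this sample the two anchored patterns drop the indented hits (2 to 1 and 1 to 0), while the exception pattern gains the column-0 `try` (1 to 2).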
Total markers across the corpus: 2,146 → 348 (−84%).
Files with any marker: 216 → 94. The 3-test snapshot suite is
unchanged — the synthetic fixture covers column-0 cases only.
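
As a quick arithmetic check of the figures above, here is a minimal sketch. `pct_change` is a hypothetical helper, not part of the tool.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


markers = pct_change(2146, 348)  # total markers across the corpus
files = pct_change(216, 94)      # files with any marker

print(f"markers: {markers:.1f}%")  # about -83.8, which rounds to the quoted -84
print(f"files:   {files:.1f}%")    # about -56.5
```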
Adds `tools/res-to-affine/CORPUS-RUN.md` (human report with corpus
stats, regex change-log, top-hotspots table, and Phase-2 follow-ups)
plus `CORPUS-RUN.json` (machine-readable sidecar). README links both.
The trade-off: we no longer flag module-load side effects nested
inside `module X = { ... }` blocks, nor `let x = ref(...)` top-level
declarations paired with their later `:=`. Those are scoped to
Phase 2's AST walker and listed in CORPUS-RUN.md as follow-ups.
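
The two gaps can be made concrete with a short sketch. It is in Python for illustration only, and the patterns and the ReScript sample are hypothetical stand-ins for the column-0 regexes, not the ones in `scanner.ml`.

```python
import re

# Assumed column-0 patterns, standing in for the real ones.
SIDE_EFFECT_COL0 = re.compile(r"^let _ = ", re.MULTILINE)
MUTABLE_COL0 = re.compile(r"^\w[\w.]*\s*:=", re.MULTILINE)

# Hypothetical ReScript snippet, invented for this illustration.
SAMPLE = """\
module Boot = {
  let _ = Registry.register()
}

let counter = ref(0)
counter := 1
"""


def hit_lines(pattern, text):
    """1-based line numbers on which pattern matches."""
    return [text.count("\n", 0, m.start()) + 1 for m in pattern.finditer(text)]


# The side effect on line 2 runs at module load, but it is indented
# inside the module block, so the column-0 pattern does not see it.
print(hit_lines(SIDE_EFFECT_COL0, SAMPLE))

# The ref declaration on line 5 is not flagged; only the later
# column-0 assignment on line 6 is.
print(hit_lines(MUTABLE_COL0, SAMPLE))
```

A line-based regex cannot tell a module block from a function body, which is why both cases need a walker that tracks nesting.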
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
First end-to-end run of the Phase-1 scanner (#314) against the estate's real `.res` surface: 491 deduplicated files across `idaptik` (475) and `gitbot-fleet` (16). The run surfaced two high-impact false-positive sources, which this PR fixes alongside the corpus report.
- `side-effect-import` was flagging in-function `let _ = X.f(...)` (ReScript's normal "discard chained-call return value" idiom), not just the LESSONS.md "module-load side effect" anti-pattern. 1,181 hits in the corpus, vast majority indented.
- `mutable-global` was flagging any `:=`, including intra-function local-ref mutation. 653 hits in the corpus, none at column 0.
Both regexes now anchor at column 0. `re_untyped_exn` also picks up column-0 `raise` / `try` via a `(^|...)` alternation.

Impact
[Table: marker counts by kind for `side-effect-import`, `mutable-global`, `raw-js`, `untyped-exception`]

Spot-checked the new top hotspots: every remaining `side-effect-import` hit is genuinely column-0 (e.g. `idaptik/src/Main.res:5`: `let _ = PixiSound.sound`); every `raw-js` and `untyped-exception` hit corresponds to a real `%raw(…)` block or `try` / `Js.Exn` / `Promise.catch` occurrence.

What's added
- `tools/res-to-affine/CORPUS-RUN.md`: human report (corpus stats, regex change-log, top hot-spots, Phase-2 follow-ups, reproducer).
- `tools/res-to-affine/CORPUS-RUN.json`: machine-readable sidecar.
- `tools/res-to-affine/README.md`: link to both.
- `tools/res-to-affine/scanner.ml`: three regex tightenings with an inline comment naming the column-0 vs in-function trade-off.
Trade-off
We no longer flag:
- module-load side effects nested inside `module X = { ... }` blocks
- top-level `let x = ref(...)` declarations (we still flag the `:=` that follows, if it is at column 0)
Both are scoped to Phase 2's AST walker and listed in `CORPUS-RUN.md` as follow-ups (item 2 + item 3).

Refs #57 (the issue is multi-phase; this is a precision pass on Phase 1, not a phase close).
Test plan
- `dune build tools/res-to-affine`: clean.
- `dune test tools/res-to-affine/`: 3/3 OK (snapshot byte-identical because the synthetic fixture covers column-0 cases only).
- Corpus tally matches the table above; remaining hits spot-checked.
🤖 Generated with Claude Code