This document is the realisation reference for SPEC §8 (Top-Level
Binding Environment): how the WebAssembly back-end in lib/codegen.ml
satisfies C1–C6, what data it carries to satisfy them, and what every
other back-end shipped in this repository does in the same role.
Where the spec is target-agnostic, this doc is concrete: it names OCaml types, files, line-level structure, and the loud-fail discipline. Behavioural claims in this document are claims about current code; the spec is the authority on behavioural requirements.
Companion to:
- SPEC.adoc §8 — what every back-end must do.
- issue #89 — original env-rules ticket.
The codegen environment is a ctx record (lib/codegen.ml, type context)
threaded through gen_decl for each top-level declaration in source order.
Two fields hold the bindings required by SPEC §8:
| Field | Purpose |
|---|---|
| func_indices | The single name → integer index map for all runtime-bearing top-level bindings. Non-negative indices point into the combined imports + defined functions index space. |
| globals | The WASM globals vector. One entry per |
Compile-time-only declarations (type, effect, trait, impl,
extern type) populate auxiliary fields (struct_layouts,
variant_tags, …) but do not enter func_indices.
The signed-integer trick in func_indices is internal to the back-end.
Lookups that reach func_indices use the sign of the stored value to
tell the two binding kinds apart: a call site emits call for a
non-negative index, and a variable reference emits global.get for a
negative one.
Reproduced from lib/codegen.ml:

    type context = {
      types : func_type list;        (* type section *)
      funcs : func list;             (* function definitions *)
      exports : export list;         (* exports *)
      imports : import list;         (* imports *)
      globals : global list;         (* global variables *)
      locals : (string * int) list;  (* local variable name -> index map *)
      next_local : int;
      loop_depth : int;
      func_indices : (string * int) list;
      (* Top-level name environment shared by functions and constants:
         - k >= 0: Wasm function index (imports + defined functions).
         - k < 0: Constant (global); actual global index is -(k+1).
         Entries inserted in source declaration order by gen_decl. *)
      lambda_funcs : func list;      (* lifted lambdas *)
      next_lambda_id : int;
      heap_ptr : int option;
      field_layouts : (string * (string * int) list) list;
      struct_layouts : (string * (string * int) list) list;
      fn_ret_structs : (string * string) list;
      variant_tags : (string * int) list;
      string_data : (string * int) list;
      next_string_offset : int;
      datas : data list;
      ownership_annots : (int * ownership_kind list * ownership_kind) list;
    }

Only the fields touched by SPEC §8 conformance are discussed here;
heap_ptr, string_data, datas, and ownership_annots are part of
unrelated lowering passes.
func_indices is a single association list keyed by the AffineScript
identifier. Two binding flavours share it; the sign of the value tells
them apart:
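| Source declaration | Key value | Decode |
|---|---|---|
| TopFn | func_idx (k >= 0) | Call k |
| TopExternFn | import index (k >= 0) | Call k |
| TopConst | -(global_idx + 1) (k < 0) | GlobalGet (-(k + 1)) |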
The encoding satisfies SPEC §8 C1–C2 with a single source-order insertion
and resolves both kinds with a single List.assoc_opt. No separate
const_indices table is needed.
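
The sign encoding can be sketched in isolation. This is a minimal illustration, not code from lib/codegen.ml: the binding type and the encode_global, decode, and resolve helpers are hypothetical names introduced here, and the environment is reduced to a bare association list.

```ocaml
(* Hypothetical helpers; none of these names come from lib/codegen.ml. *)
type binding =
  | Function of int  (* Wasm function index *)
  | Global of int    (* Wasm global index *)

(* Store a global index as a strictly negative sentinel. *)
let encode_global (global_idx : int) : int = -(global_idx + 1)

(* Recover the binding kind and its real index from a stored value. *)
let decode (k : int) : binding =
  if k >= 0 then Function k else Global (-(k + 1))

(* One List.assoc_opt resolves both kinds of binding. *)
let resolve (env : (string * int) list) (name : string) : binding option =
  Option.map decode (List.assoc_opt name env)

let () =
  let env =
    [ ("main", 2); ("withInput", 1); ("inputSuffix", encode_global 0) ]
  in
  assert (resolve env "inputSuffix" = Some (Global 0));
  assert (resolve env "withInput" = Some (Function 1));
  assert (resolve env "missing" = None)
```

Because global index 0 maps to -1, function index 0 and global index 0 never collide in the shared table.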
gen_decl : context → top_level → context result is invoked by
generate_module via List.fold_left over prog.prog_decls. Each
case is summarised below; consult lib/codegen.ml for the canonical
implementation.
- Build the WASM function type from the parameter list (all params are i32; result is i32). Append to ctx.types; record its index.
- Compute func_idx = import_func_count ctx + List.length ctx.funcs. This is the future WASM function index of the function about to be emitted.
- Register (fd.fd_name.name, func_idx) in func_indices before generating the body. This satisfies SPEC §8 C2 and admits self-recursion.
- Also record ownership annotations and, if the return type is a known struct, the return-struct mapping (fn_ret_structs).
- Generate the body against the augmented context.
- Append the emitted function to ctx.funcs.
- If the function is Public or its name is a reserved game-loop hook (main, init_state, step_state, get_state, mission_active), add an ExportFunc export.
ExprApp call-site lookup (in gen_expr): List.assoc name func_indices
returns a non-negative func_idx; the call site emits Call func_idx.
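
The register-before-body ordering in the steps above can be sketched with a toy context. This is an illustration under stated assumptions, not the real gen_decl: the ctx record here is a cut-down stand-in, import_func_count is modelled as a plain field rather than a function of the context, and function bodies are represented as strings.

```ocaml
(* Toy stand-in for the codegen context; field names loosely follow the
   real record but the types are simplified for illustration. *)
type ctx = {
  import_func_count : int;            (* assumption: precomputed count *)
  funcs : string list;                (* stand-in for emitted functions *)
  func_indices : (string * int) list;
}

(* Register the name first, then generate the body against the
   augmented context, then append the emitted function. *)
let gen_fn (ctx : ctx) (name : string) (gen_body : ctx -> string) : ctx =
  let func_idx = ctx.import_func_count + List.length ctx.funcs in
  let ctx' = { ctx with func_indices = (name, func_idx) :: ctx.func_indices } in
  let body = gen_body ctx' in
  { ctx' with funcs = ctx'.funcs @ [ body ] }

let () =
  let ctx = { import_func_count = 1; funcs = []; func_indices = [] } in
  (* A body that calls a function by name, here the function itself. *)
  let body_calling name c =
    match List.assoc_opt name c.func_indices with
    | Some idx -> Printf.sprintf "call %d" idx
    | None -> "unbound"
  in
  let ctx = gen_fn ctx "fact" (body_calling "fact") in
  assert (ctx.funcs = [ "call 1" ])
```

Had the registration happened after body generation, the self-call in this sketch would have resolved to "unbound".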
Emit a WASM import under module "env" with name fd.fd_name.name, and
register the alias in func_indices with the non-negative import index.
The body is not generated. This mirrors gen_imports (§5).
Behaviourally identical to §4.2. The parser produces a TopExternFn
record for the contemporary surface syntax; the legacy
TopFn _ with FnExtern branch remains for compatibility with older
front-end paths.
- Compile the initialiser against the current context via gen_expr. The result must be a constant expression (a single I32Const or analogous form); non-constant initialisers fail at WASM validation, per the loud-fail policy.
- Append a new global entry: { g_type = I32; g_mutable = false; g_init = init_code }
- Register (tc.tc_name.name, -(global_idx + 1)) in func_indices.
ExprVar lookup (in gen_expr, ExprVar arm): if List.assoc_opt name
ctx.func_indices returns Some k with k < 0, decode
global_idx = -(k + 1) and emit GlobalGet global_idx. This is the
path that closed #73.
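
A compact sketch of the TopConst path and its ExprVar counterpart follows. It assumes that global_idx is the length of the globals vector before the append, which the text above does not state explicitly. The global record is reduced to an i32 literal initialiser, and gen_const and gen_var are hypothetical names.

```ocaml
(* Simplified global: the initialiser is a single i32 literal. *)
type global = { g_mutable : bool; g_init : int }

type ctx = {
  globals : global list;
  func_indices : (string * int) list;
}

(* Append an immutable global and register the negative sentinel.
   Assumption: the new global's index is the current vector length. *)
let gen_const (ctx : ctx) (name : string) (value : int) : ctx =
  let global_idx = List.length ctx.globals in
  { globals = ctx.globals @ [ { g_mutable = false; g_init = value } ];
    func_indices = (name, -(global_idx + 1)) :: ctx.func_indices }

(* Variable reference: only negative entries decode to a global read. *)
let gen_var (ctx : ctx) (name : string) : (string, string) result =
  match List.assoc_opt name ctx.func_indices with
  | Some k when k < 0 -> Ok (Printf.sprintf "global.get %d" (-(k + 1)))
  | _ -> Error ("unbound variable: " ^ name)

let () =
  let ctx = gen_const { globals = []; func_indices = [] } "inputSuffix" 1024 in
  assert (List.assoc "inputSuffix" ctx.func_indices = (-1));
  assert (gen_var ctx "inputSuffix" = Ok "global.get 0");
  assert (gen_var ctx "nope" = Error "unbound variable: nope")
```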
- TyEnum: assign sequential tags to each variant and record them in variant_tags.
- TyStruct: compute the field layout (sequential 4-byte offsets, matching the ExprRecord store path) and record it in struct_layouts.
- TyAlias: no environment change.
None of these enter func_indices; they are compile-time bindings
under SPEC §8 C4.
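
The two compile-time tables can be illustrated with a pair of small functions. This sketch assumes that tags and offsets both start at 0, which the text implies but does not state. assign_tags and field_offsets are hypothetical names, and variants and fields are reduced to their names.

```ocaml
(* Sequential tags, one per variant, in declaration order.
   Assumption: the first variant receives tag 0. *)
let assign_tags (variants : string list) : (string * int) list =
  List.mapi (fun tag name -> (name, tag)) variants

(* Sequential 4-byte offsets, one i32 slot per field.
   Assumption: the first field sits at offset 0. *)
let field_offsets (fields : string list) : (string * int) list =
  List.mapi (fun i name -> (name, i * 4)) fields

let () =
  assert (assign_tags [ "Red"; "Green"; "Blue" ]
          = [ ("Red", 0); ("Green", 1); ("Blue", 2) ]);
  assert (field_offsets [ "x"; "y"; "z" ]
          = [ ("x", 0); ("y", 4); ("z", 8) ])
```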
No WASM artefact. Type is available to the type checker via the resolver; codegen returns the context unchanged. Compile-time-only under SPEC §8 C4.
gen_imports : Module_loader.t → import_decl list → context → context result
walks prog.prog_imports once at the start of generate_module,
before any local gen_decl call. For every imported item:
- Load the referenced module via Module_loader.
- Find the matching TopFn (or fail silently if absent; the resolver would have already errored).
- Intern the function type into ctx.types.
- Append a WASM import: module name = dotted module path (String.concat "." mod_path), function name = original declaration name, function type = interned index.
- Register the local alias (or original name) in func_indices with the positive import_func_idx.
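
The naming rules for a function import can be sketched as follows. The import record and its field names are simplified stand-ins rather than the types in lib/codegen.ml, and import_of and local_name are hypothetical helpers.

```ocaml
(* Simplified import record; field names are illustrative only. *)
type import = { i_module : string; i_name : string; i_type : int }

(* Module name is the dotted module path; the function name is the
   original declaration name, regardless of any local alias. *)
let import_of ~(mod_path : string list) ~(decl_name : string)
    ~(type_idx : int) : import =
  { i_module = String.concat "." mod_path;
    i_name = decl_name;
    i_type = type_idx }

(* The key registered in func_indices: the alias if one was given,
   otherwise the original name. *)
let local_name ~(alias : string option) ~(decl_name : string) : string =
  Option.value alias ~default:decl_name

let () =
  let imp = import_of ~mod_path:[ "game"; "ports" ] ~decl_name:"open_port" ~type_idx:0 in
  assert (imp.i_module = "game.ports");
  assert (imp.i_name = "open_port");
  assert (local_name ~alias:(Some "op") ~decl_name:"open_port" = "op");
  assert (local_name ~alias:None ~decl_name:"open_port" = "open_port")
```

Keeping the original name on the WASM import while registering the alias locally means the exporting module never needs to know how importers rename its functions.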
WASM module-linking for globals isn’t standard yet, so cross-module const support inlines the value into the importer’s module:
- Load the referenced module via Module_loader and locate the matching TopConst (matching on the original name; alias renaming happens at registration time).
- Compile the const’s initialiser against the importer’s context via gen_expr (same lowering as gen_decl TopConst, §4.4).
- Append the resulting global entry to ctx.globals.
- Register (local_name, -(global_idx + 1)) in func_indices, the same negative-sentinel encoding used for locally-declared consts (§3), so use-site lookup is uniform.
The importer keeps its own copy of the constant value; cross-module const identity is by value, not by reference. This is fine because AffineScript consts are immutable.
The two relevant arms in gen_expr:

    match lookup_local ctx id.name with
    | Ok idx -> Ok (ctx, [LocalGet idx])
    | Error _ ->
      match List.assoc_opt id.name ctx.variant_tags with
      | Some tag -> Ok (ctx, [I32Const tag])
      | None ->
        match List.assoc_opt id.name ctx.func_indices with
        | Some k when k < 0 -> Ok (ctx, [GlobalGet (-(k + 1))])
        | _ -> Error (UnboundVariable id.name)

Matches SPEC §8.3: local → enum tag → top-level. A function name
encountered in expression position falls through to UnboundVariable,
which is correct: bare function references in expression position are
not yet representable in WASM without closure boxing, so they are
rejected at the typechecker.

    match List.assoc_opt id.name ctx.func_indices with
    | Some func_idx -> ... emit (Call func_idx)
    | None ->
      match lookup_local ctx id.name with
      | Ok local_idx -> ... emit indirect closure call
      | Error _ -> Error (UnboundVariable ...)

A call site finds a positive func_idx and emits Call func_idx. The
typechecker rejects calls whose head is a constant, so a negative k
here would be a type-system bug; the codegen does not defensively check
the sign at call sites.
Cross-walking SPEC §8.5 against lib/codegen.ml:
| Criterion | How the WASM target satisfies it | Site |
|---|---|---|
| C1 | | |
| C2 | TopFn registers | §4.1, §4.4 |
| C3 | TopFn → WASM function; TopExternFn → import; TopConst → immutable global with the constant initialiser | §4.1, §4.3, §4.4 |
| C4 | TopType populates | §4.5–§4.7 |
| C5 | | |
| C6 | | |
Each non-WASM back-end has its own gen_decl (or equivalent). Status
of the SPEC §8 contract per target:
| Target | TopConst | TopFn |
|---|---|---|
| | | |
| | | |
| | | |
| | Same negative-sentinel discipline as | |
| | As | As |
| Other (Lua, Julia, C, WGSL, Faust, ONNX, Bash, Nickel, ReScript, LLVM, Verilog, Gleam, CUDA, Metal, OpenCL, MLIR, Why3, Lean, SPIR-V) | Implementation-specific. If a back-end cannot lower | Each emits a target-appropriate function definition; cross-module flow uses |
When auditing or adding a back-end, the env-population rule is the same as for WASM: register the name and emit a target-appropriate definition before any body that might reference it.

    const inputSuffix: String = ":in";

    fn withInput(port: String) -> String {
      port ++ inputSuffix
    }

    pub fn main() -> () {
      let p = withInput("front_left");
      println(p)
    }

WASM realisation, step by step (under lib/codegen.ml):
- gen_imports runs (no imports in this module).
- gen_decl TopConst inputSuffix: globals gains { I32, immutable, init = <data offset> }, func_indices = [("inputSuffix", -1)].
- gen_decl TopFn withInput: func_idx = 1 (imports + funcs so far), registered in func_indices before body generation. The body’s inputSuffix reference goes through ExprVar (§6.1), finds k = -1, decodes global_idx = 0, emits GlobalGet 0.
- gen_decl TopFn main: same recipe; the body’s withInput(…) call resolves to Call 1. pub triggers ExportFunc 2.
Resulting func_indices (most-recent-first by :: cons except TopFn
which appends): [("main", 2); ("withInput", 1); ("inputSuffix", -1)].
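
The walk-through can be replayed with a small fold. This is a sketch under a loud assumption: import_func_count is set to 1 purely to reproduce the indices quoted above, and the source does not say which import occupies function index 0. The decl type and populate are hypothetical, and every entry is consed onto the front of the list.

```ocaml
(* Toy top-level declarations: only the binding kind and name matter. *)
type decl = Const of string | Fn of string

(* Fold over declarations in source order, tracking how many globals
   and functions have been emitted so far. *)
let populate ~(import_func_count : int) (decls : decl list) : (string * int) list =
  let step (globals, funcs, env) = function
    | Const name -> (globals + 1, funcs, (name, -(globals + 1)) :: env)
    | Fn name -> (globals, funcs + 1, (name, import_func_count + funcs) :: env)
  in
  let _, _, env = List.fold_left step (0, 0, []) decls in
  env

let () =
  (* Assumption: one function import already exists, hence 1. *)
  let env =
    populate ~import_func_count:1
      [ Const "inputSuffix"; Fn "withInput"; Fn "main" ]
  in
  assert (env = [ ("main", 2); ("withInput", 1); ("inputSuffix", -1) ])
```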
- #73: Codegen.UnboundVariable for top-level const bindings (intra-module). Closed. Resolved by the Some k when k < 0 arm in ExprVar (lib/codegen.ml, lines 442–445). The negative-sentinel encoding is the load-bearing invariant; new back-ends adopting func_indices must preserve it.
- #107: Cross-module const imports dropped by gen_imports / flatten_imports. Closed. Both paths now thread TopConst:
  - Codegen.gen_imports matches TopConst alongside TopFn and inlines the initialiser as a fresh global on the importer (§5.2).
  - Module_loader.flatten_imports includes public consts in its inlined declaration set, with the same alias-renaming machinery used for fns, so non-WASM back-ends pick them up unchanged. Regression tests live in test/test_e2e.ml (E2E Xmod Other Codegens group, items 2–3).

- lib/codegen.ml: type context, gen_decl, gen_imports, generate_module; the WASM ExprVar / ExprApp arms.
- lib/codegen_gc.ml: WasmGC variant; same env discipline.
- lib/module_loader.ml: flatten_imports for non-WASM cross-module threading.
- lib/ast.ml: type top_level constructors enumerated in SPEC §8.1.
- bin/main.ml: pipeline wiring (parse → resolve → typecheck → codegen with loader threading).