[browser/wasi][coreCLR] Dedicated WASM GC PAL — replace mmap with posix_memalign and optimize memory operations#127328

Open
pavelsavara wants to merge 7 commits into dotnet:main from pavelsavara:browser_no_mmap

Conversation

Member

@pavelsavara pavelsavara commented Apr 23, 2026

Summary

This PR switches the WASM GC and PAL virtual-memory paths from mmap/munmap to posix_memalign/free. Emscripten's mmap implementation is fundamentally broken for the GC's needs: munmap cannot unmap partial allocations, mmap(PROT_NONE) still consumes linear memory, and MAP_FIXED doesn't work correctly. The change introduces a dedicated WASM GC OS interface (gc/wasm/gcenv.cpp) and updates the PAL layer to use the allocator-based approach.

Based on runtimelab's NativeAOT-LLVM gcenv.wasm.cpp (PR#1510, PR#3151).

Fixes #121036
Fixes #117813
Fixes #118943

Motivation

The CoreCLR GC on WASM targets (browser and WASI) previously shared the Unix gcenv.unix.cpp code path with #ifdef TARGET_WASM carve-outs. This was problematic because:

  1. mmap/munmap semantics don't work on WASM — Emscripten's munmap cannot release partial mappings, mmap(PROT_NONE) reserves real linear memory (no lazy commit), and MAP_FIXED is broken.
  2. Virtual memory doesn't exist on WASM — The linear memory model means reserve == commit. The existing code pretended otherwise, leading to wasted memory.
  3. Page size mismatch — WASM's memory.grow granularity is 64KB, but the GC works best with a 16KB page size for alignment and thresholds.

Changes

New files

  • src/coreclr/gc/wasm/gcenv.cpp — Complete WASM-specific GCToOSInterface implementation using posix_memalign/free. Includes an sbrk optimization that avoids unnecessary memset zeroing on fresh memory.grow pages (safe because single-threaded WASM has no concurrent sbrk calls).
  • src/coreclr/gc/wasm/CMakeLists.txt — Build file for the WASM GC PAL module.
  • src/native/minipal/wasm.h — Cross-platform minipal_getpagesize() helper that returns 16KB on WASM instead of the 64KB memory.grow granularity.
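
As a rough illustration of the allocator-based approach described for gcenv.cpp, the sketch below reserves an aligned, zeroed block with posix_memalign and releases it with free. The names ReserveZeroed, Release, AlignUp and kGcPageSize are hypothetical and are not taken from the PR; the real implementation may differ substantially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Hypothetical alignment for this sketch: the 16KB page size mentioned in the PR text.
constexpr std::size_t kGcPageSize = 16 * 1024;

// Round size up to a multiple of alignment (alignment must be a power of two).
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical stand-in for a "reserve" call. With reserve == commit,
// the block is allocated and zeroed up front.
void* ReserveZeroed(std::size_t size, std::size_t alignment = kGcPageSize)
{
    void* block = nullptr;
    const std::size_t rounded = AlignUp(size, alignment);
    if (rounded == 0 || posix_memalign(&block, alignment, rounded) != 0)
        return nullptr;
    std::memset(block, 0, rounded);
    return block;
}

// Hypothetical "release": the whole block goes back to the allocator.
// Partial release is not possible with this scheme.
void Release(void* block)
{
    free(block);
}
```

A caller would pair each ReserveZeroed with exactly one Release on the same pointer, which mirrors the constraint that partial unmapping is unavailable.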

Modified files

  • src/coreclr/gc/CMakeLists.txt — Routes WASM builds to the new wasm/ subdirectory instead of unix/.
  • src/coreclr/gc/init.cpp — Enables the large-pages code path on WASM (to skip VirtualDecommit, since decommit is meaningless on WASM). Caps segment sizes so initial segments fit within the hard limit. Enables auto-detection of the hard limit from WASM linear memory max (treating it like a container memory limit).
  • src/coreclr/gc/interface.cpp — Suppresses the 32-bit assert(!use_large_pages_p) for WASM.
  • src/coreclr/gc/unix/gcenv.unix.cpp — Removes all TARGET_WASM/__EMSCRIPTEN__ conditionals (WASM now has its own file). Also fixes a pre-existing bug: nanosleep returns -1 on EINTR, not EINTR itself.
  • src/coreclr/pal/src/map/virtual.cpp — Adds TARGET_WASM paths in ReserveVirtualMemory (using posix_memalign), VIRTUALCommitMemory (sentinel-based lazy zeroing), VirtualFree (decommit via sentinel or free), and VIRTUALReserveMemory (using free instead of munmap on error). Restructures the MEM_RELEASE path to check munmap return before calling VIRTUALReleaseMemory.
  • src/coreclr/pal/src/misc/sysinfo.cpp — Uses minipal_getpagesize() instead of getpagesize().

Key design decisions

Sentinel-based lazy zeroing (single-threaded path)

Instead of eagerly zeroing memory on decommit, the single-threaded path writes a non-zero sentinel byte at each page boundary. On recommit, the first byte is checked — if non-zero, the range is zeroed. This avoids double-zeroing in the common case where memory is decommitted and never recommitted.

Under multithreading (FEATURE_MULTITHREADING), the code falls back to unconditional zeroing on both decommit and commit to avoid races.
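
The single-threaded sentinel scheme can be sketched as follows. This is a minimal illustration, assuming page-aligned ranges; DecommitLazy, CommitLazy, kPageSize and the sentinel value 0xDD are hypothetical names and values chosen for the example, not the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kPageSize = 16 * 1024;  // assumed GC page size
constexpr std::uint8_t kSentinel = 0xDD;      // hypothetical non-zero marker

// Decommit: mark every page as dirty instead of zeroing it.
// `start` is assumed page-aligned and `size` a multiple of kPageSize.
void DecommitLazy(std::uint8_t* start, std::size_t size)
{
    for (std::size_t offset = 0; offset < size; offset += kPageSize)
        start[offset] = kSentinel;
}

// Commit: zero the whole range only if its first byte is non-zero.
void CommitLazy(std::uint8_t* start, std::size_t size)
{
    if (size != 0 && start[0] != 0)
        std::memset(start, 0, size);
}
```

Because the check reads a data byte, it cannot tell a decommitted range from a committed range whose first byte happens to be non-zero; the scheme relies on callers never recommitting live data.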

sbrk optimization

When posix_memalign returns memory at or above the previous sbrk(0) break, the allocation came from memory.grow, which guarantees zero-initialization per the WASM spec. Only recycled blocks below the old break need explicit zeroing. This is safe in the single-threaded build because there are no concurrent sbrk calls. The same approach is used by Mono's WASM mmap implementation.
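
A minimal sketch of the break comparison, assuming a single-threaded allocator whose heap only grows through sbrk (as on WASM). IsAboveOldBreak and AllocZeroedSkippingFreshPages are hypothetical names. On native platforms the allocator may obtain memory in other ways, so treat this as an illustration of the idea rather than portable code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdlib.h>  // posix_memalign, free
#include <unistd.h>  // sbrk

// True when the block starts at or above the break sampled before the allocation,
// i.e. it lies entirely in memory obtained after that sample.
bool IsAboveOldBreak(const void* block, const void* oldBreak)
{
    return reinterpret_cast<std::uintptr_t>(block) >=
           reinterpret_cast<std::uintptr_t>(oldBreak);
}

// Allocate a zeroed block, skipping the memset when the memory is known to be fresh.
void* AllocZeroedSkippingFreshPages(std::size_t alignment, std::size_t size)
{
    void* oldBreak = sbrk(0);  // break before the allocation
    void* block = nullptr;
    if (posix_memalign(&block, alignment, size) != 0)
        return nullptr;
    if (!IsAboveOldBreak(block, oldBreak))
        std::memset(block, 0, size);  // recycled block: may hold stale data
    return block;
}
```

The comparison is only meaningful if nothing else moves the break between the sbrk(0) sample and the allocation, which is why the description ties it to the single-threaded configuration.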

use_large_pages_p = true on WASM

The GC's large-pages mode skips VirtualDecommit for heap segments, which is exactly what WASM needs since decommit cannot return memory. The hard-limit auto-detection (75% of emscripten_get_heap_max()) is preserved rather than being tightened to actual segment sizes, leaving room for bookkeeping allocations.

16KB page size

WASM memory.grow uses 64KB pages, but the GC's alignment and threshold calculations work better with a finer granularity. The 16KB page size (minipal_getpagesize()) is used for GC page alignment while the 64KB WasmPageSize constant is used only when converting __builtin_wasm_memory_size counts to bytes.
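
To make the two granularities concrete, here is a small sketch with both constants side by side. The helper names are hypothetical; on WASM the page count would come from __builtin_wasm_memory_size, but the sketch takes it as a parameter so it compiles anywhere.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kGcPageSize   = 16 * 1024;  // value the PR describes for minipal_getpagesize() on WASM
constexpr std::size_t kWasmPageSize = 64 * 1024;  // memory.grow granularity

// Round a size up to the GC's 16KB page boundary.
constexpr std::size_t AlignUpToGcPage(std::size_t size)
{
    return (size + kGcPageSize - 1) & ~(kGcPageSize - 1);
}

// Convert a count of 64KB WASM pages to bytes.
constexpr std::size_t WasmPagesToBytes(std::size_t pageCount)
{
    return pageCount * kWasmPageSize;
}
```

For example, 512 WASM pages correspond to 32MB of linear memory, while a 1-byte request rounds up to a single 16KB GC page.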

Code review notes

Correctness

  • The nanosleep fix in gcenv.unix.cpp (checking == -1 && errno == EINTR instead of == EINTR) is a pre-existing bug fix that affects all Unix platforms, not just WASM.
  • The MEM_RELEASE restructuring in virtual.cpp flattens the error-checking logic from if (munmap == 0) { if (!Release) fail } else { fail } to if (munmap != 0) fail; if (!Release) fail. Behavior is unchanged, but the control flow is clearer.
  • VirtualReset on WASM returns false, forcing the GC to use the decommit+commit fallback path. The previous code returned true (pretending madvise worked), which silently did nothing.
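
The nanosleep point above can be illustrated with a retry loop. This is a sketch, not the code from gcenv.unix.cpp; SleepFully is a hypothetical helper. POSIX nanosleep returns -1 and sets errno to EINTR when a signal interrupts it, so comparing the return value to EINTR never matches.

```cpp
#include <cassert>
#include <cerrno>
#include <time.h>  // nanosleep, timespec (POSIX)

// Sleep for the full duration, resuming after signal interruptions.
// Returns false on a real error such as EINVAL.
bool SleepFully(long milliseconds)
{
    timespec request;
    request.tv_sec = milliseconds / 1000;
    request.tv_nsec = (milliseconds % 1000) * 1000000L;

    timespec remaining;
    while (nanosleep(&request, &remaining) == -1)
    {
        if (errno != EINTR)
            return false;      // not an interruption: give up
        request = remaining;   // interrupted: sleep for the time left
    }
    return true;
}
```

With the incorrect check (return value == EINTR), an interrupted sleep would end early instead of resuming with the remaining time.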

Thread safety

  • All sbrk-based optimizations are guarded by #ifndef FEATURE_MULTITHREADING.
  • The sentinel trick is only used in single-threaded mode; MT mode zeroes unconditionally.

@pavelsavara pavelsavara added this to the 11.0.0 milestone Apr 23, 2026
@pavelsavara pavelsavara self-assigned this Apr 23, 2026
Copilot AI review requested due to automatic review settings April 23, 2026 16:55
@pavelsavara pavelsavara added arch-wasm WebAssembly architecture area-GC-coreclr labels Apr 23, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This PR separates WebAssembly-specific GC OS interface behavior from the shared Unix implementation by introducing a dedicated gcenv.wasm.cpp, and adjusts the PAL virtual memory implementation on WASM to avoid relying on Emscripten’s incomplete mmap/munmap support.

Changes:

  • Added a dedicated WASM GCToOSInterface implementation (gcenv.wasm.cpp) and CMake wiring for building it.
  • Routed WASM GC builds to the new gc/wasm directory and removed WASM-specific #ifdef paths from gcenv.unix.cpp.
  • Updated PAL virtual memory reserve/release on WASM to use posix_memalign/free instead of mmap/munmap.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/coreclr/pal/src/map/virtual.cpp | Switches WASM reserve/release behavior to posix_memalign/free and adjusts related error/cleanup paths. |
| src/coreclr/gc/wasm/gcenv.wasm.cpp | New WASM-specific GC OS interface implementation (virtual memory, CPU/NUMA stubs, memory stats). |
| src/coreclr/gc/wasm/CMakeLists.txt | Adds build definition for the WASM GC PAL object library. |
| src/coreclr/gc/unix/gcenv.unix.cpp | Removes WASM-specific branches and fixes nanosleep EINTR retry logic. |
| src/coreclr/gc/CMakeLists.txt | Routes WASM builds to gc/wasm instead of gc/unix. |

Comment thread src/coreclr/pal/src/map/virtual.cpp
Comment thread src/coreclr/pal/src/map/virtual.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
@pavelsavara pavelsavara changed the title [wasm][coreclr] Extract WASM-specific GC memory management into dedicated file [browser/wasi] Dedicated WASM GC PAL — replace mmap with posix_memalign and optimize memory operations Apr 23, 2026
@pavelsavara pavelsavara changed the title [browser/wasi] Dedicated WASM GC PAL — replace mmap with posix_memalign and optimize memory operations [browser/wasi][coreCLR] Dedicated WASM GC PAL — replace mmap with posix_memalign and optimize memory operations Apr 24, 2026
Copilot AI review requested due to automatic review settings April 24, 2026 08:20
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/pal/src/map/virtual.cpp Outdated
Comment thread src/coreclr/pal/src/map/virtual.cpp Outdated
Comment thread src/coreclr/pal/src/map/virtual.cpp
Copilot AI review requested due to automatic review settings April 24, 2026 11:37
@pavelsavara pavelsavara marked this pull request as ready for review April 24, 2026 11:43
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Comment thread src/coreclr/gc/wasm/gcenv.cpp
Comment thread src/coreclr/gc/wasm/CMakeLists.txt
Comment thread src/coreclr/gc/wasm/gcenv.cpp
Comment thread src/coreclr/pal/src/map/virtual.cpp
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.cpp
Comment thread src/coreclr/gc/wasm/gcenv.cpp
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/gc/wasm/gcenv.cpp
Comment thread src/native/minipal/wasm.h Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/native/minipal/wasm.h Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.cpp
Comment thread src/coreclr/gc/wasm/gcenv.wasm.cpp Outdated
size_t GetRestrictedPhysicalMemoryLimit()
{
    // We must return a valid value here since you can't "overcommit" memory in WASM.
    return GetTotalPhysicalMemory_Wasm();
Member

This should just return 0. The restricted memory limit is meant to be a limit that, e.g., Kubernetes uses to cap the amount of memory available to a container. It doesn't apply to WASM, as we can use all the memory reported by GetTotalPhysicalMemory.

Contributor

@SingleAccretion SingleAccretion Apr 24, 2026

This is one of the pieces that I ran into with #118943 and dotnet/runtimelab#3150. Unfortunately, I don't quite remember the details anymore, but they were along the lines of GC only taking the total physical memory as "hard limit" if we set the 'restricted' bit. But as #3150 says we don't have this code on 32 bit, so it is broken either way.

Member

I am not sure if I understand the need for the limit on WASM. The GetTotalPhysicalMemory returns emscripten_get_heap_max. Do you mean that it is not the amount of memory we can use and it should be further limited?

Contributor

@SingleAccretion SingleAccretion Apr 24, 2026

Do you mean that it is not the amount of memory we can use and it should be further limited?

No, the problem was that GC itself doesn't respect the physical memory amount when it is 'small' without heap_hard_limit set, see:

#ifndef USE_REGIONS
    if (gc_heap::heap_hard_limit)
    {
        if (gc_heap::heap_hard_limit_oh[soh])
        {
            // On 32bit we have next guarantees:
            // 0 <= seg_size_from_config <= 1Gb (from max_heap_hard_limit/2)
            // 0 <= (heap_hard_limit = heap_hard_limit_oh[soh] + heap_hard_limit_oh[loh] + heap_hard_limit_oh[poh]) < 4Gb (from gc_heap::compute_hard_limit_from_heap_limits)
            // 0 <= heap_hard_limit_oh[loh] <= 1Gb or < 2Gb
            // 0 <= heap_hard_limit_oh[poh] <= 1Gb or < 2Gb
            // 0 <= large_seg_size <= 1Gb or <= 2Gb (alignment and round up)
            // 0 <= pin_seg_size <= 1Gb or <= 2Gb (alignment and round up)
            // 0 <= soh_segment_size + large_seg_size + pin_seg_size <= 4Gb
            // 4Gb overflow is ok, because 0 size allocation will fail
            large_seg_size = max (gc_heap::adjust_segment_size_hard_limit (gc_heap::heap_hard_limit_oh[loh], nhp), seg_size_from_config);
            pin_seg_size = max (gc_heap::adjust_segment_size_hard_limit (gc_heap::heap_hard_limit_oh[poh], nhp), seg_size_from_config);
        }
        else
        {
            // On 32bit we have next guarantees:
            // 0 <= heap_hard_limit <= 1Gb (from gc_heap::compute_hard_limit)
            // 0 <= soh_segment_size <= 1Gb
            // 0 <= large_seg_size <= 1Gb
            // 0 <= pin_seg_size <= 1Gb
            // 0 <= soh_segment_size + large_seg_size + pin_seg_size <= 3Gb
#ifdef HOST_64BIT
            large_seg_size = gc_heap::use_large_pages_p ? gc_heap::soh_segment_size : gc_heap::soh_segment_size * 2;
#else //HOST_64BIT
            assert (!gc_heap::use_large_pages_p);
            large_seg_size = gc_heap::soh_segment_size;
#endif //HOST_64BIT
            pin_seg_size = large_seg_size;
        }
        if (gc_heap::use_large_pages_p)
            gc_heap::min_segment_size = min_segment_size_hard_limit;
    }
    else
    {
        large_seg_size = get_valid_segment_size (TRUE);
        pin_seg_size = large_seg_size;
    }
    assert (g_theGCHeap->IsValidSegmentSize (seg_size));
    assert (g_theGCHeap->IsValidSegmentSize (large_seg_size));
    assert (g_theGCHeap->IsValidSegmentSize (pin_seg_size));
    dprintf (1, ("%d heaps, soh seg size: %zd mb, loh: %zd mb\n",
        nhp,
        (seg_size / (size_t)1024 / 1024),
        (large_seg_size / 1024 / 1024)));
    gc_heap::min_uoh_segment_size = min (large_seg_size, pin_seg_size);
    if (gc_heap::min_segment_size == 0)
    {
        gc_heap::min_segment_size = min (seg_size, gc_heap::min_uoh_segment_size);
    }
#endif //!USE_REGIONS

(evidently this code has been changed a bit since #118943 was filed)

Member Author

I made an attempt at fixing #118943.
Added a WASM segment size cap: max_seg = round_down_power2(heap_hard_limit / 6) (min 1MB), preventing 3 initial segments from exceeding the heap. For a 32MB WASM heap: hard_limit=24MB → max_seg=4MB → total=12MB (fits).
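
The arithmetic in the comment above can be checked with a short sketch. The function names are hypothetical and the 75% factor is the hard-limit auto-detection ratio stated in the PR description; how the actual GC code orders the 1MB minimum and the rounding is an assumption here.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t MB = 1024 * 1024;

// Largest power of two that is <= value (value must be non-zero).
constexpr std::size_t RoundDownPower2(std::size_t value)
{
    std::size_t result = 1;
    while (result <= value / 2)
        result *= 2;
    return result;
}

// 75% of the maximum heap size, as described for hard-limit auto-detection.
constexpr std::size_t HardLimitFromHeapMax(std::size_t heapMax)
{
    return heapMax / 4 * 3;
}

// Segment cap: round_down_power2(hard_limit / 6), but never below 1MB.
constexpr std::size_t MaxSegmentSize(std::size_t hardLimit)
{
    const std::size_t cap = hardLimit / 6;
    return cap < MB ? MB : RoundDownPower2(cap);
}
```

For a 32MB heap this gives a 24MB hard limit and a 4MB cap, so the three initial segments total 12MB, matching the figures quoted in the comment.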

Comment thread src/coreclr/gc/CMakeLists.txt Outdated
Comment thread src/coreclr/pal/src/map/virtual.cpp Outdated
Comment thread src/coreclr/pal/src/map/virtual.cpp Outdated
Copilot AI review requested due to automatic review settings April 24, 2026 18:23
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/gc/wasm/gcenv.cpp
Comment thread src/coreclr/gc/wasm/gcenv.cpp
Copilot AI review requested due to automatic review settings April 27, 2026 17:06
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comment thread src/native/minipal/wasm.h
Comment thread src/coreclr/pal/src/map/virtual.cpp Outdated
Comment thread src/coreclr/gc/wasm/gcenv.cpp
Comment thread src/coreclr/gc/wasm/CMakeLists.txt
@pavelsavara
Member Author

I processed your feedback and updated the PR description above.
Please let me know if I understood it correctly, @janvorli @jkotas @SingleAccretion.

Copilot AI review requested due to automatic review settings April 27, 2026 18:21
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comment on lines +780 to 793
// On WASM, reserve == commit. If this range was previously decommitted,
// sentinels were placed at each page boundary. Check the first byte and
// zero the entire range if needed.
#ifdef FEATURE_MULTITHREADING
// Under MT, VirtualDecommit already zeroes the full range on decommit, and
// reserve already zeroes on allocation - so commit is a no-op.
(void)MemSize;
#else
if (MemSize && *(BYTE*)StartBoundary != 0)
{
ZeroMemory((LPVOID) StartBoundary, MemSize);
}
#endif
#endif

Copilot AI Apr 27, 2026

On TARGET_WASM, VIRTUALCommitMemory zeroes the entire aligned range whenever the first byte is non-zero. This breaks VirtualAlloc semantics for recommitting already committed memory: PAL test filemapping_memmgt/VirtualAlloc/test20 explicitly writes a value, calls VirtualAlloc(ptr, ..., MEM_COMMIT, ...) again, and expects the contents to be unchanged. With the current check (*(BYTE*)StartBoundary != 0) this will incorrectly zero valid committed data. The commit path needs a way to distinguish ‘recommit after MEM_DECOMMIT’ from ‘commit on already committed pages’ (e.g., always do the zeroing in the MEM_DECOMMIT path and make commit a no-op, or track decommitted state out-of-band instead of inspecting user data bytes).

Suggested change
- // On WASM, reserve == commit. If this range was previously decommitted,
- // sentinels were placed at each page boundary. Check the first byte and
- // zero the entire range if needed.
- #ifdef FEATURE_MULTITHREADING
- // Under MT, VirtualDecommit already zeroes the full range on decommit, and
- // reserve already zeroes on allocation - so commit is a no-op.
- (void)MemSize;
- #else
- if (MemSize && *(BYTE*)StartBoundary != 0)
- {
- ZeroMemory((LPVOID) StartBoundary, MemSize);
- }
- #endif
- #endif
+ // On WASM, reserve == commit, so MEM_COMMIT must be a no-op here.
+ // In particular, recommitting already committed memory must preserve its
+ // contents; using user data (for example, the first byte) to infer prior
+ // decommit state is incorrect and can spuriously zero valid committed data.
+ // Any zeroing required to model MEM_DECOMMIT must be handled by the
+ // decommit path or tracked out-of-band rather than during commit.
+ (void)MemSize;
+ #endif
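
One way to read the "tracked out-of-band" option in this review comment is a per-page flag kept outside the user data. The sketch below is only an illustration of that idea under simplifying assumptions (one region, page-aligned offsets and sizes); DecommitTracker and kPageSize are hypothetical and do not appear in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kPageSize = 16 * 1024;  // assumed page size

// Hypothetical tracker for one reserved region: one flag per page,
// stored outside the region so user data is never inspected.
class DecommitTracker
{
public:
    DecommitTracker(std::uint8_t* base, std::size_t size)
        : m_base(base), m_decommitted(size / kPageSize, false) {}

    // Mark pages as decommitted; contents are left untouched for now.
    void Decommit(std::size_t offset, std::size_t size)
    {
        for (std::size_t page = offset / kPageSize; page < (offset + size) / kPageSize; ++page)
            m_decommitted[page] = true;
    }

    // Zero only the pages that were decommitted; committed pages keep their data.
    void Commit(std::size_t offset, std::size_t size)
    {
        for (std::size_t page = offset / kPageSize; page < (offset + size) / kPageSize; ++page)
        {
            if (m_decommitted[page])
            {
                std::memset(m_base + page * kPageSize, 0, kPageSize);
                m_decommitted[page] = false;
            }
        }
    }

private:
    std::uint8_t* m_base;
    std::vector<bool> m_decommitted;
};
```

With this bookkeeping, committing an already committed page preserves its contents, which is the behavior the PAL test cited in the comment expects; the cost is the extra per-page state.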


Labels

arch-wasm WebAssembly architecture area-GC-coreclr


5 participants