Skip to content

Test build perf v2#835

Open
mjp41 wants to merge 35 commits intomicrosoft:mainfrom
mjp41:test-build-perf-v2
Open

Test build perf v2#835
mjp41 wants to merge 35 commits intomicrosoft:mainfrom
mjp41:test-build-perf-v2

Conversation

@mjp41
Copy link
Copy Markdown
Member

@mjp41 mjp41 commented Mar 28, 2026

No description provided.

@SchrodingerZhu
Copy link
Copy Markdown
Collaborator

We can also try modules if speed is a concern.

@mjp41
Copy link
Copy Markdown
Member Author

mjp41 commented Mar 28, 2026

We can also try modules if speed is a concern.

Mostly there is a lot of redundant work in the build, and I wanted to experiment with improving that.

@mjp41 mjp41 force-pushed the test-build-perf-v2 branch 4 times, most recently from e0efc18 to 8eb5e51 Compare March 30, 2026 13:21
mjp41 added 6 commits March 30, 2026 16:09
Pure file move. allocconfig.h contains only compile-time constants with
dependencies only on bits.h and mitigations.h (both in ds_core/).
Moving it enables sizeclassstatic.h (next step) to live in ds_core/
without pulling in ds/.

Changes:
- git mv ds/allocconfig.h -> ds_core/allocconfig.h
- Add #include bits.h and mitigations.h to moved file
- Update include paths in ds/ds.h, ds/mpmcstack.h
- Add allocconfig.h to ds_core/ds_core.h umbrella
Create ds_core/sizeclassstatic.h with compile-time sizeclass functions
extracted from mem/sizeclasstable.h:
- smallsizeclass_t (size_t typedef)
- size_to_sizeclass_const()
- NUM_SMALL_SIZECLASSES
- sizeclass_to_size_const()
- is_small_sizeclass()

Move sizeclasstable.h from mem/ to ds/ (depends on PAL, not allocator).
Remove redundant sizeclasstable.h includes from mem/ files.
Convert smallsizeclass_t from a size_t typedef to a struct with:
- Explicit construction from size_t (no implicit conversion FROM)
- Implicit conversion to size_t (for array indexing, comparisons)
- Pre/post increment operators

Provides type safety: alloc(size_t) (byte size) cannot be confused
with alloc(smallsizeclass_t) (sizeclass index).

Update all iteration sites to use explicit construction.
Add alloc(smallsizeclass_t) overload to corealloc and globalalloc that
skips the dynamic sizeclass lookup. Refactor small_alloc to accept
(smallsizeclass_t, size_t) and keep the single-arg (size_t) version
as a forwarding overload.

Update alloc<size>() to use compile-time sizeclass when the size is
a known small sizeclass.

Add libc::malloc_small(smallsizeclass_t) and
libc::malloc_small_zero(smallsizeclass_t) as non-template entry points.
These are general-purpose data structures, not allocator-specific.
Moving them reduces the dependency surface of mem/.
Add __malloc_start_pointer() and __malloc_last_byte_pointer() to libc.h,
alongside the existing __malloc_end_pointer(). These provide the
external_pointer functionality through the libc API surface.
@mjp41 mjp41 force-pushed the test-build-perf-v2 branch 2 times, most recently from ff3b9f1 to b87a6a0 Compare March 30, 2026 19:53
mjp41 added 16 commits March 31, 2026 09:47
Add snmalloc_testlib.h and snmalloc_testlib.cc. CMake build_test_library() creates a static library per flavour linked to all tests. Add SNMALLOC_BUILD_TESTING=OFF to Bazel, src/test/*.cc to Bazel filegroup, and sanitizer flags to build_test_library.
Convert simple alloc/dealloc tests to use snmalloc_testlib.h instead
of snmalloc.h:
- func/first_operation
- func/teardown
- func/multi_atexit
- func/multi_threadatexit

Replace MAX_SMALL_SIZECLASS_BITS with max_small_sizeclass_bits().
Replace alloc<Zero> with alloc<ZeroMem::YesZero>.
Add pal/pal.h include to testlib header for report_fatal_error.
Convert perf tests that use only the public API:
- perf/startup
- perf/contention
- perf/large_alloc
- perf/low_memory

Replace DefaultPal::tick() with pal_tick(), Aal::pause() with pal_pause().
Convert tests that use only the public API to snmalloc_testlib.h:
- func/bits, func/memory, func/memory_usage, func/pool
- func/protect_fork, func/redblack, func/statistics
- perf/external_pointer, perf/lotsofthreads, perf/post_teardown
- perf/singlethread

Replace DefaultPal/Aal calls with pal_*() wrappers.
Replace external_pointer<> with libc __malloc_*_pointer().
Replace alloc<Zero/Uninit> with alloc<ZeroMem::YesZero/NoZero>.
Fix include order in ds/ds.h for pool.h/pooled.h dependencies.
Convert tests that need allocator internals (sizeclasses, pagemap, etc.)
to use snmalloc_core.h alongside snmalloc_testlib.h:
- func/malloc (replace our_* with testlib_* prefix)
- func/pagemap
- func/release-rounding
- func/sizeclass
Move report_fatal_error() and message() template definitions from
pal/pal.h into ds_core/helpers.h. They now call error() and
message_impl() which are forward-declared non-template functions
in ds_core/defines.h, with implementations in pal/pal.h that
delegate to DefaultPal.

This breaks the dependency of ds_core/ code on pal/: any code
that includes ds_core/helpers.h can call report_fatal_error()
without needing PAL headers. The linker resolves error() and
message_impl() from the translation unit that includes pal.h.

Update testlib.h to include pal/pal_consts.h instead of pal/pal.h.
Fix memory.cc to use ds_aal/ds_aal.h for address_cast.
Fix protect_fork.cc to include ds_aal/prevent_fork.h explicitly.
Revert low-memory.cc to snmalloc.h (needs PalNotificationObject).
The header doesn't use errno itself. Tests that need it can
include it directly.
Use minimal includes for protect_fork (ds_core/helpers.h + ds_aal/prevent_fork.h). Fix missed testlib conversions for multi_atexit and multi_threadatexit.
Create ds_core/sizeclassconfig.h with the mitigation-independent
sizeclass constants (INTERMEDIATE_BITS, MIN_ALLOC_STEP_SIZE, etc.).
This file depends only on bits.h — no mitigations.h.

Update sizeclassstatic.h to include sizeclassconfig.h instead of
allocconfig.h. allocconfig.h now includes sizeclassconfig.h and
keeps only the mitigation-dependent constants (MIN_OBJECT_COUNT,
DEALLOC_BATCH_*, MAX_SLAB_SPAN_*, etc.).

This makes the testlib header chain fully mitigation-independent:
testlib.h -> sizeclassstatic.h -> sizeclassconfig.h -> bits.h
Move mitigations.h, allocconfig.h, and cheri.h from ds_core/ to a new
mitigations/ directory. These files depend on SNMALLOC_CHECK_CLIENT
compile-time flags and don't belong in ds_core/ (which should be
mitigation-independent).

New include hierarchy:
  ds_core/ → aal/ → ds_aal/ → mitigations/ → pal/ → ds/ → mem/

ds_core/ is now fully mitigation-independent. The testlib header
(which includes only ds_core/ headers) produces identical output
regardless of SNMALLOC_CHECK_CLIENT, enabling tests to be compiled
once and linked against both fast and check testlib variants.
Testlib-only tests are mitigation-independent: compile into shared OBJECT libraries. Also fix MSVC C4701 warning in cleanup_unused and add unistd.h to protect_fork.
Introduce SNMALLOC_API macro for public API functions (dealloc,
debug_teardown, libc::malloc, etc.) that need standalone definitions
when compiled into a static library.

Defaults to SNMALLOC_FAST_PATH_INLINE (normal header-only usage).
testlib.cc defines SNMALLOC_API as NOINLINE on MSVC before including
snmalloc.h, producing strong definitions that ARM64EC exit-thunks
can reference. On GCC/Clang, __attribute__((used)) via the existing
SNMALLOC_USED_FUNCTION annotation suffices.
Add a const void* overload of remaining_bytes alongside the existing
address_t version. The testlib API uses const void* so callers don't
need address_cast or ds_aal headers. Remove address_t from testlib.h.

This makes memory.cc a testlib-only test (no ds_aal.h needed),
adding it to the compile-once list.
Expose snmalloc's checked memcpy through the libc API surface:
  libc::memcpy<Checked, ReadsChecked>(dst, src, len)

Checked controls destination bounds checking, ReadsChecked controls
source bounds checking (defaults to same as Checked).

Explicit instantiations provided in testlib for the common variants:
<true,true>, <true,false>, <false,false>.
Replace DefaultPal::message() calls in START_TEST and INFO macros
with snmalloc::message() which is now defined in ds_core/helpers.h.
This removes the PAL dependency from test infrastructure, allowing
tests that only need ds_core/ or ds_aal/ headers to avoid including
pal/pal.h.
The seqset test is a unit test for ds_aal/seqset.h and only needs
ds_aal and ds_core headers. With test/helpers.h no longer depending
on DefaultPal, the pal.h include is unnecessary.
@mjp41 mjp41 force-pushed the test-build-perf-v2 branch from b87a6a0 to ab94ccb Compare March 31, 2026 08:48
Fix header includes that were missing.
@mjp41 mjp41 force-pushed the test-build-perf-v2 branch from ab94ccb to 98d7950 Compare March 31, 2026 09:55
@mjp41 mjp41 requested a review from SchrodingerZhu March 31, 2026 09:55
@mjp41
Copy link
Copy Markdown
Member Author

mjp41 commented Mar 31, 2026

@SchrodingerZhu there is a lot of changes here. The individual commits are all pretty self-contained.

If you don't have time, just let me know, and I'll take a second pass on the commits myself.

Comment thread src/snmalloc/mem/corealloc.h Outdated
@SchrodingerZhu
Copy link
Copy Markdown
Collaborator

LGTM in general, the change duplicates the symbols on slow paths:

schrodingerzy@promaxgb10-edf0:~/Documents/snmalloc/build$ llvm-nm-21 libsnmallocshim.so --demangle  | grep __malloc_end_pointer
000000000001e9d8 t snmalloc::libc::__malloc_end_pointer(void*)
000000000001e9d8 T __malloc_end_pointer

Is this expected?

@mjp41
Copy link
Copy Markdown
Member Author

mjp41 commented Apr 24, 2026

LGTM in general, the change duplicates the symbols on slow paths:

schrodingerzy@promaxgb10-edf0:~/Documents/snmalloc/build$ llvm-nm-21 libsnmallocshim.so --demangle  | grep __malloc_end_pointer
000000000001e9d8 t snmalloc::libc::__malloc_end_pointer(void*)
000000000001e9d8 T __malloc_end_pointer

Is this expected?

This is a good question. I have not been precise on what symbols should be exposed by each library.

What do you think is the right thing? Should the libsnmallocshim.so just provide a libc interface or also the snmalloc::libc interface?

@mjp41 mjp41 marked this pull request as ready for review April 29, 2026 13:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a precompiled snmalloc “test library” to reduce test build times by avoiding recompilation of the allocator in every test translation unit, alongside supporting refactors (sizeclass helpers, mitigation layering, and a few libc/global API adjustments) to make more tests mitigation-independent.

Changes:

  • Add snmalloc_testlib (thin header + single compiled TU) and update many perf/func tests to include it instead of <snmalloc/snmalloc.h>.
  • Refactor sizeclass helpers/types and add small-sizeclass allocation fast paths (alloc(smallsizeclass_t) / libc::malloc_small*).
  • Add a mitigations/ layer in the include hierarchy and update build wiring (CMake + Bazel) to reuse mitigation-independent test objects across flavours.

Reviewed changes

Copilot reviewed 52 out of 54 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/test/snmalloc_testlib.h New thin header exposing a minimal test-facing API surface for linking against a precompiled allocator TU.
src/test/snmalloc_testlib.cc New single TU that includes full snmalloc and provides wrapper/instantiated symbols for test consumers.
src/test/perf/startup/startup.cc Switch to testlib header; route tick/pause through testlib wrappers.
src/test/perf/singlethread/singlethread.cc Switch to testlib header; use ZeroMem-based alloc API.
src/test/perf/post_teardown/post-teardown.cc Switch to testlib header; replace direct teardown call with debug_teardown() wrapper.
src/test/perf/lotsofthreads/lotsofthread.cc Switch to testlib header; continue using libc namespace shims.
src/test/perf/large_alloc/large_alloc.cc Switch to testlib header include.
src/test/perf/external_pointer/externalpointer.cc Switch to testlib header; use exported __malloc_start_pointer and runtime pal_address_bits().
src/test/perf/contention/contention.cc Switch to testlib header; route tick/pause through testlib wrappers.
src/test/helpers.h Use snmalloc::message<> directly instead of constructing MessageBuilder + DefaultPal::message.
src/test/func/teardown/teardown.cc Switch to testlib header; use ZeroMem alloc and max_small_sizeclass_bits() wrapper.
src/test/func/statistics/stats.cc Switch to testlib header include.
src/test/func/sizeclass/sizeclass.cc Include core + testlib; simplify small-size loop using is_small_sizeclass.
src/test/func/seqset/seqset.cc Remove <snmalloc/snmalloc.h> dependency by including specific DS headers.
src/test/func/release-rounding/rounding.cc Include core + testlib; adjust loops and casts for smallsizeclass_t.
src/test/func/redblack/redblack.cc Switch to testlib header include and include specific redblack tree header.
src/test/func/protect_fork/protect_fork.cc Remove <snmalloc/snmalloc.h> dependency by including specific AAL/core headers.
src/test/func/pool/pool.cc Include ds/pool.h + testlib; adjust debug API usage and remove systematic opt logic.
src/test/func/pagemap/pagemap.cc Include core + testlib instead of snmalloc.h.
src/test/func/multi_threadatexit/multi_threadatexit.cc Switch to testlib header when RUN_TEST is enabled.
src/test/func/multi_atexit/multi_atexit.cc Switch to testlib header when RUN_TEST is enabled.
src/test/func/memory_usage/memory_usage.cc Replace local inclusion of override sources with testlib-prefixed exported C symbols.
src/test/func/memory/memory.cc Switch to testlib header; use const sizeclass helpers and libc __malloc_*_pointer APIs.
src/test/func/malloc/malloc.cc Switch to testlib-exported testlib_* malloc symbols instead of compiling override/malloc.cc into the test.
src/test/func/first_operation/first_operation.cc Switch to testlib header; use ZeroMem alloc and max_small_sizeclass_bits() wrapper.
src/test/func/bits/bits.cc Switch to testlib header include.
src/snmalloc/pal/pal.h Introduce message_impl and mark error used; pull in mitigations layer.
src/snmalloc/mitigations/mitigations_all.h New umbrella header for mitigation configuration/checks.
src/snmalloc/mitigations/mitigations.h Update include path to ds_core/defines.h.
src/snmalloc/mitigations/cheri.h New CHERI-specific mitigation checks (e.g., deallocation checks).
src/snmalloc/mitigations/allocconfig.h New mitigation-dependent allocator constants previously in ds/allocconfig.h.
src/snmalloc/mem/secondary/gwp_asan.h Update include to use ds/sizeclasstable.h.
src/snmalloc/mem/secondary/default.h Update mitigation include path to new mitigations layer.
src/snmalloc/mem/remotecache.h Remove now-unneeded include of sizeclasstable.
src/snmalloc/mem/metadata.h Remove now-unneeded include of sizeclasstable.
src/snmalloc/mem/mem.h Repoint pool/pooled includes to ds/ and trim unused includes.
src/snmalloc/mem/corealloc.h Add alloc(smallsizeclass_t) and precomputed-sizeclass small allocation path; loop type updates.
src/snmalloc/mem/backend_concept.h Fix include path for sizeclasstable.
src/snmalloc/global/libc.h Add malloc_small*, __malloc_{start,last_byte}_pointer, and export a checked memcpy wrapper; adjust linkage macros.
src/snmalloc/global/globalalloc.h Add linkage macros, refactor cleanup_unused, add remaining_bytes(const void*), add compile-time smallsizeclass fast-path allocation, and export dealloc/debug_teardown via SNMALLOC_API.
src/snmalloc/ds_core/sizeclassstatic.h New small-sizeclass wrapper type and constexpr helpers (is_small_sizeclass, size_to_sizeclass_const, etc.).
src/snmalloc/ds_core/sizeclassconfig.h New consolidated sizeclass configuration constants previously in ds/allocconfig.h.
src/snmalloc/ds_core/helpers.h Move report_fatal_error / message here and route through error / message_impl.
src/snmalloc/ds_core/ds_core.h Remove mitigation-specific headers; include new sizeclass config/static helpers.
src/snmalloc/ds_core/defines.h Add forward decl for message_impl; move UNUSED earlier and clarify forward decl comments.
src/snmalloc/ds/sizeclasstable.h Switch to smallsizeclass_t and new constexpr helpers; adjust lookup table construction accordingly.
src/snmalloc/ds/pooled.h Update include path for backend concept.
src/snmalloc/ds/pool.h New pool implementation moved into ds/ layer (used for allocator bootstrap, etc.).
src/snmalloc/ds/mpmcstack.h Remove include of deleted ds/allocconfig.h.
src/snmalloc/ds/ds.h Update includes to new pool/pooled/sizeclasstable and drop deleted allocconfig include.
src/snmalloc/ds/allocconfig.h Deleted; functionality split into ds_core/sizeclassconfig.h and mitigations/allocconfig.h.
src/snmalloc/README.md Document new mitigations/ layer and its intended include boundaries.
CMakeLists.txt Build snmalloc-testlib-{fast,check} and reuse object libraries for testlib-only tests across flavours; install mitigations headers.
BUILD.bazel Include test .cc files in the filegroup and disable CMake testing for Bazel builds.

Comment thread src/snmalloc/pal/pal.h
Comment on lines +71 to +77
[[noreturn]] SNMALLOC_SLOW_PATH SNMALLOC_USED_FUNCTION inline void
error(const char* const str)
{
DefaultPal::error(str);
}

SNMALLOC_USED_FUNCTION inline void message_impl(const char* const str)
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error / message_impl are defined as SNMALLOC_USED_FUNCTION inline here, but SNMALLOC_USED_FUNCTION is empty on MSVC. For translation units that instantiate snmalloc::report_fatal_error<> / snmalloc::message<> (from ds_core/helpers.h) without including pal/pal.h (e.g. tests that only include snmalloc_testlib.h), this can produce unresolved externals because no TU is forced to emit these symbols. Consider defining these via the SNMALLOC_API / SNMALLOC_API_SLOW linkage macros (so snmalloc_testlib.cc can force strong definitions on MSVC), or otherwise provide guaranteed strong/out-of-line definitions that are always linked in.

Suggested change
[[noreturn]] SNMALLOC_SLOW_PATH SNMALLOC_USED_FUNCTION inline void
error(const char* const str)
{
DefaultPal::error(str);
}
SNMALLOC_USED_FUNCTION inline void message_impl(const char* const str)
[[noreturn]] SNMALLOC_SLOW_PATH SNMALLOC_API_SLOW void
error(const char* const str)
{
DefaultPal::error(str);
}
SNMALLOC_API void message_impl(const char* const str)

Copilot uses AI. Check for mistakes.
Comment thread src/snmalloc/ds_core/sizeclassstatic.h
Comment thread src/test/snmalloc_testlib.cc Outdated
Comment on lines +106 to +119
// ScopedAllocHandle is declared in snmalloc_testlib.h; define methods here.
struct ScopedAllocHandle
{
TestScopedAllocator* ptr;
ScopedAllocHandle();
~ScopedAllocHandle();
ScopedAllocHandle(const ScopedAllocHandle&) = delete;
ScopedAllocHandle& operator=(const ScopedAllocHandle&) = delete;
void* alloc(size_t size);
void dealloc(void* p);
void dealloc(void* p, size_t size);
ScopedAllocHandle* operator->();
const ScopedAllocHandle* operator->() const;
};
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ScopedAllocHandle is fully defined in snmalloc_testlib.h, but this TU redefines an identical struct ScopedAllocHandle instead of including the header and providing out-of-line method definitions. This is ODR-fragile (any header change must be manually mirrored here) and can easily turn into UB if the definitions ever diverge. Prefer including <test/snmalloc_testlib.h> and removing the duplicate struct definition, keeping only the qualified method definitions.

Copilot uses AI. Check for mistakes.
@SchrodingerZhu
Copy link
Copy Markdown
Collaborator

What do you think is the right thing? Should the libsnmallocshim.so just provide a libc interface or also the snmalloc::libc interface?

I think it is fine to have the current version. I initially thought the refactor may add a layer of indirection but the compiler aliased the symbol directly.

@SchrodingerZhu
Copy link
Copy Markdown
Collaborator

SchrodingerZhu commented Apr 29, 2026

Ccache does not work well with cl. You can try the ccache action with sccache variant. Do not use the sccache action as GHA cache object service is pretty bad in terms of ratelimit.

@SchrodingerZhu
Copy link
Copy Markdown
Collaborator

SchrodingerZhu commented Apr 29, 2026

See also https://github.com/llvm/llvm-project/blob/6591086b6ed41cfac31fd33b3c3e34708763a352/.github/workflows/libc-overlay-tests.yml#L60

Line 87 may also be useful:

    # Use MinSizeRel to reduce the size of the build.
    # Notice that CMP0141=NEW and MSVC_DEBUG_INFORMATION_FORMAT=Embedded are required
    # by the sccache tool

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants