threading: lock-free fast path for SemaphoreSlim.WaitAsync #125452
Open
thomhurst wants to merge 16 commits into dotnet:main from
Conversation
Author
@EgorBot -intel -amd -arm

```csharp
using System.Threading;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class SemaphoreSlimUncontended
{
    private SemaphoreSlim _sem = new SemaphoreSlim(1, 1);

    [Benchmark]
    public async Task WaitAsync_Release()
    {
        await _sem.WaitAsync();
        _sem.Release();
    }
}
```
Contributor
Pull request overview
This PR introduces a lock-free fast path in SemaphoreSlim.WaitAsync that attempts to acquire an available permit via CAS, avoiding taking m_lockObjAndDisposed when uncontended.
Changes:
- Added a CAS-based fast path to decrement m_currentCount when a permit appears immediately available.
- Added special-case handling to keep AvailableWaitHandle state consistent if it's initialized concurrently during the fast-path acquire.
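The CAS-based fast path described above can be sketched as follows. This is a minimal standalone model, not the actual SemaphoreSlim source: the class and `TryAcquireFast` are hypothetical, though the field name `m_currentCount` follows the PR's naming.

```csharp
using System;
using System.Threading;

// Minimal model of a lock-free permit acquire: CAS a permit out of the
// count before falling back to a locked slow path.
public sealed class FastPathDemo
{
    private int m_currentCount;

    public FastPathDemo(int initialCount) => m_currentCount = initialCount;

    // Returns true if a permit was acquired without taking any lock.
    public bool TryAcquireFast()
    {
        int observed = Volatile.Read(ref m_currentCount);
        while (observed > 0)
        {
            // CompareExchange returns the value it actually saw; success
            // means no other thread raced us between read and write.
            int prior = Interlocked.CompareExchange(ref m_currentCount, observed - 1, observed);
            if (prior == observed)
                return true;
            observed = prior; // lost the race; retry with the fresh value
        }
        return false; // no permit available: the caller takes the slow path
    }
}
```

On failure the caller falls through to the existing lock-protected slow path, so the fast path only ever removes a permit that was demonstrably available at the moment of the CAS.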
Member
Doesn't lock itself have fast paths for that?
Author
@EgorBot -intel -amd -arm

```csharp
using System.Threading;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class SemaphoreSlimUncontended
{
    private SemaphoreSlim _sem = new SemaphoreSlim(1, 1);

    [Benchmark]
    public async Task WaitAsync_Release()
    {
        await _sem.WaitAsync();
        _sem.Release();
    }
}
```
Author
@EgorBo I think just by not entering the lock we can save some time: EgorBot/Benchmarks#31
Use Interlocked.Add to apply a relative delta to m_currentCount rather than writing back an absolute snapshot-derived value, so concurrent lock-free decrements from the WaitAsync fast path are not overwritten.
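The lost-update hazard this commit fixes can be illustrated with a small model (hypothetical class; only the field name follows the PR): writing back a snapshot-derived absolute value erases any decrement that landed between the read and the write, while `Interlocked.Add` applies a relative delta that composes with concurrent CAS decrements.

```csharp
using System.Threading;

public sealed class ReleaseDemo
{
    private int m_currentCount;

    // Hazardous pattern: a fast-path decrement that lands between the
    // snapshot read and the plain write below is silently overwritten.
    public void ReleaseLostUpdate(int releaseCount)
    {
        int snapshot = m_currentCount;
        m_currentCount = snapshot + releaseCount; // absolute, snapshot-derived write
    }

    // Safe pattern: apply a relative delta atomically. Returns the new
    // count, which also gives the caller a coherent post-release value.
    public int Release(int releaseCount)
    {
        return Interlocked.Add(ref m_currentCount, releaseCount);
    }
}
```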
Replace plain --m_currentCount with a CAS loop to prevent a double grant when the lock-free WaitAsync fast path decrements m_currentCount between the > 0 check and the decrement in the slow path. WaitCore is safe because m_waitCount++ on lock entry blocks the CAS guard for its entire critical section. WaitAsyncCore has no such protection.
Apply the same CAS-loop pattern to WaitCore's m_currentCount decrement that was applied to WaitAsyncCore in the previous commit. A fast-path thread that read m_waitCount = 0 before WaitCore's m_waitCount++ can still race with WaitCore's check-at-404 / decrement-at-407 sequence. The CAS loop serializes both operations on m_currentCount atomically.
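The race described in the two commits above comes from the slow path checking and decrementing in two steps while holding a lock the fast path never takes. A sketch of the shape (hypothetical class; the unsafe pattern is shown only in the comment):

```csharp
using System.Threading;

public sealed class SlowPathDemo
{
    private int m_currentCount;
    private readonly object m_lock = new object();

    public SlowPathDemo(int initialCount) => m_currentCount = initialCount;

    public bool TryAcquireSlow()
    {
        lock (m_lock)
        {
            // Unsafe once a lock-free fast path exists: another thread can
            // CAS m_currentCount to 0 between the check and the decrement,
            // because the fast path does not take m_lock:
            //
            //     if (m_currentCount > 0) { --m_currentCount; return true; }
            //
            // A CAS loop makes the check and the decrement one atomic step:
            int observed = Volatile.Read(ref m_currentCount);
            while (observed > 0)
            {
                int prior = Interlocked.CompareExchange(ref m_currentCount, observed - 1, observed);
                if (prior == observed)
                    return true;
                observed = prior;
            }
            return false;
        }
    }
}
```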
Force-pushed from fb8aeb0 to 419dbb0
… stress test

The assert !waitSuccessful || m_currentCount > 0 in WaitCore could fire spuriously in Debug builds: the lock-free WaitAsync fast path runs outside the lock, so it can decrement m_currentCount to 0 between WaitUntilCountOrTimeout returning and the assert executing. Adds a stress test that races AvailableWaitHandle lazy initialization against WaitAsync fast-path acquires and verifies the handle is never signaled when CurrentCount == 0.
…t WaitAsync fast path
…phoreSlim

Also fix CS0420 in Release(): Volatile.Read(ref volatile_field) triggers a compiler error in the coreclr project build; replaced with a plain field read (already volatile) so the testhost can be rebuilt with the fixed implementation.
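The CS0420 issue mentioned above comes from passing a volatile field by reference. A hypothetical reproduction (not the PR's code):

```csharp
using System.Threading;

public sealed class VolatileReadDemo
{
    private volatile int m_count;

    // Passing a volatile field by ref raises CS0420 ("a reference to a
    // volatile field will not be treated as volatile"); builds that treat
    // warnings as errors, like the coreclr project, reject it:
    //
    //     int v = Volatile.Read(ref m_count);   // CS0420
    //
    // A plain read of a volatile field already has acquire semantics, so
    // the explicit Volatile.Read is redundant here:
    public int Count => m_count;
}
```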
…cessor task is always cancelled
- Extract duplicated CAS-decrement loop into TryDecrementCount() with AggressiveInlining, replacing inline copies in WaitCore and WaitAsyncCore
- Strengthen Assert.InRange to Assert.Equal in NeverUnderflows test
- Add bulk Release(2) concurrent stress test for Interlocked.Add delta math
- Add cancellation-during-fast-path stress test for count integrity
- Use m_currentCount (post-Add) instead of netCount for m_waitHandle.Set()
- Add UncontendedSync and MixedSyncAsync benchmarks for sync path coverage
Prevents the background task from pegging a CPU core in CI while still exercising concurrent lazy initialization of the wait handle.
Author
@EgorBot -intel -amd -arm

```csharp
using System.Threading;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class SemaphoreSlimUncontended
{
    private SemaphoreSlim _sem = new SemaphoreSlim(1, 1);

    [Benchmark]
    public async Task WaitAsync_Release()
    {
        await _sem.WaitAsync();
        _sem.Release();
    }
}
```
Member
@thomhurst, please address the copilot feedback.
… file

WaitAsync(CancellationToken) returns Task, not Task<bool>; the prior declaration didn't compile. Removed PerformanceTests/SemaphoreSlimBenchmarks.cs since it had no csproj and wasn't wired into any project; runtime perf benchmarks live in dotnet/performance, and the EgorBot inline benchmarks posted on the PR cover the relevant scenarios.
Author
@JulieLeeMSFT Pushed 526f1c4 to address the remaining Copilot threads
The comment narrated what the next several lines do; the variable name and surrounding structure already convey it.
Address review hazards in the WaitAsync CAS fast path:

- Make m_waitCount and m_asyncHead volatile. The fast path reads them without the lock; sync-waiter writes inside the lock must publish via release semantics rather than depending on the lock release that the fast path bypasses. Without this, ARM64 can let the fast path observe m_waitCount == 0 while a sync waiter is parked, stealing the slot and leaving the waiter blocked.
- Restructure AvailableWaitHandle init to publish-then-reflect: publish the handle unsignaled, full barrier, then conditionally Set based on the post-publish count read. Closes the race where ManualResetEvent allocation overlapped a fast-path CAS, leaving the handle Set with count == 0.
- WaitCore: loop instead of falling through with a stale waitSuccessful when TryDecrementCount loses to a fast-path acquirer. Fixes the case where Wait(Infinite) could return without owning a permit (silently dropped by the void overload, lying about acquisition for bool overloads).
- Release: use Interlocked.Add's return value for the MRE.Set sentinel so a fast-path decrement racing between the Add and the re-read doesn't mask the 0 -> positive transition.
- Strengthen the AvailableWaitHandle init test: allocate a fresh SemaphoreSlim per iteration so each iteration is a real attempt at the race. With a single semaphore the race only fires on the first AvailableWaitHandle access.
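The publish-then-reflect initialization described above can be sketched as follows. This is a hypothetical standalone model, not the SemaphoreSlim source; it uses a CAS to publish (which supplies the full barrier) and then re-reads the count so a racing fast-path decrement can never leave the handle signaled at count 0.

```csharp
using System.Threading;

public sealed class WaitHandleInitDemo
{
    private int m_currentCount;
    private ManualResetEvent m_waitHandle;

    public WaitHandleInitDemo(int initialCount) => m_currentCount = initialCount;

    public WaitHandle AvailableWaitHandle
    {
        get
        {
            if (m_waitHandle == null)
            {
                // Publish the handle unsignaled; a losing thread disposes its copy.
                var handle = new ManualResetEvent(initialState: false);
                if (Interlocked.CompareExchange(ref m_waitHandle, handle, null) != null)
                {
                    handle.Dispose();
                }
                else
                {
                    // The CAS above is a full barrier, so this read observes
                    // any fast-path decrement that raced the allocation: the
                    // handle is Set only if permits remain *after* publishing.
                    if (Volatile.Read(ref m_currentCount) > 0)
                        handle.Set();
                }
            }
            return m_waitHandle;
        }
    }
}
```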
Defensive: the prior placement was correct because a thrown OCE always short-circuits before re-entry, but moving 'oce' (and 'timedOut', already loop-local) inside makes the freshness invariant explicit and robust to future edits.
Author
@EgorBot -intel -amd -arm

```csharp
using System.Threading;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class SemaphoreSlimUncontended
{
    private SemaphoreSlim _sem = new SemaphoreSlim(1, 1);

    [Benchmark]
    public async Task WaitAsync_Release()
    {
        await _sem.WaitAsync();
        _sem.Release();
    }
}
```
Use a lock-free CAS fast path in SemaphoreSlim.WaitAsync to skip the Monitor lock when a permit is immediately available, improving uncontended throughput