fix: data race on last_storage_ptr_ cache in gil_safe_call_once_and_store by henryiii · Pull Request #6087 · pybind/pybind11

henryiii · 2026-06-11T21:11:02Z

🤖 AI text below 🤖

Problem

In include/pybind11/gil_safe_call_once.h, the PYBIND11_HAS_SUBINTERPRETER_SUPPORT branch of gil_safe_call_once_and_store caches the per-interpreter storage pointer in T *last_storage_ptr_ as a fast path for the single-interpreter case.

This plain pointer is:

written in call_once_and_store_result() (inside the std::call_once lambda) and in get_stored(), and
read in get_stored().

Under free-threaded CPython (Py_GIL_DISABLED), gil_scoped_acquire provides no mutual exclusion, so these concurrent unsynchronized reads/writes of a plain pointer are a C++ data race (undefined behavior).

Additionally, before this change get_stored() loaded the cached pointer before calling is_last_storage_valid(). The writer publishes the pointer (last_storage_ptr_) before setting the validity flag (is_initialized_by_at_least_one_interpreter_), so a reader must observe the flag first and only then load the pointer to get correct acquire/release ordering.
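To illustrate the ordering argument, here is a minimal sketch of the publish/observe protocol. The names cached_ptr, cache_valid, publish, and try_get_cached are hypothetical stand-ins chosen for this example; they are not the actual members of gil_safe_call_once_and_store.

```cpp
#include <atomic>

// Hypothetical stand-ins for last_storage_ptr_ and
// is_initialized_by_at_least_one_interpreter_ (not pybind11's actual code).
static std::atomic<int *> cached_ptr{nullptr};
static std::atomic<bool> cache_valid{false};

// Writer side: publish the pointer first, then set the validity flag.
// Both stores use the default std::memory_order_seq_cst.
inline void publish(int *storage) {
    cached_ptr.store(storage);
    cache_valid.store(true);
}

// Reader side: observe the flag first, and only then load the pointer.
// A reader that sees cache_valid == true also sees the pointer that was
// stored before the flag was set.
inline int *try_get_cached() {
    if (!cache_valid.load()) {
        return nullptr; // Not published yet; the caller takes a slow path.
    }
    return cached_ptr.load();
}
```

In this sketch, a reader that loaded cached_ptr before checking cache_valid could pair a stale pointer with a flag that was set afterwards, which is the ordering hazard described above.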

Fix

Make last_storage_ptr_ a std::atomic<T *> (<atomic> is already included in this branch). Default seq_cst operations pair with the existing atomic validity flag.
Restructure get_stored() to check is_last_storage_valid() first, and only load last_storage_ptr_ on the fast path; on the slow path, look up the per-interpreter storage and update the cache as before.
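The shape of the restructured lookup can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in gil_safe_call_once.h: the class cached_store, the integer interpreter_id key, the single_interpreter parameter, and the mutex-guarded map are all hypothetical and only stand in for the real per-interpreter storage lookup.

```cpp
#include <atomic>
#include <mutex>
#include <unordered_map>
#include <utility>

// Simplified, hypothetical model of a per-interpreter store with a
// single-entry pointer cache. The real pybind11 class differs.
template <typename T>
class cached_store {
public:
    // Writer: fill the slot, publish the pointer, then set the flag.
    T &store(int interpreter_id, T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        T &slot = storage_[interpreter_id];
        slot = std::move(value);
        last_storage_ptr_.store(&slot);
        is_initialized_.store(true);
        return slot;
    }

    // Reader: check validity first; load the cached pointer only on the
    // fast path. single_interpreter is a hypothetical stand-in for
    // whatever condition makes the cached pointer safe to reuse.
    T &get_stored(int interpreter_id, bool single_interpreter) {
        if (single_interpreter && is_initialized_.load()) {
            return *last_storage_ptr_.load();
        }
        // Slow path: look up the per-interpreter slot and refresh the cache.
        std::lock_guard<std::mutex> lock(mutex_);
        T &slot = storage_.at(interpreter_id);
        last_storage_ptr_.store(&slot);
        return slot;
    }

private:
    std::mutex mutex_;
    std::unordered_map<int, T> storage_; // element references stay valid on insert
    std::atomic<T *> last_storage_ptr_{nullptr};
    std::atomic<bool> is_initialized_{false};
};
```

The point of the sketch is only the control flow: the atomic pointer is never dereferenced unless the validity check has already succeeded, and the slow path keeps updating the cache as before.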

The change is minimal and behavior-preserving on the non-free-threaded build.

Not addressed here

The separate embedded finalize / re-init staleness concern (a stale last_storage_ptr_ surviving interpreter finalize + re-init within the same process) is not fixed by this PR. It is tracked separately in the issue.

Verification

Compiled a scratch translation unit including <pybind11/gil_safe_call_once.h> and instantiating pybind11::gil_safe_call_once_and_store<int> with c++ -std=c++17 -fsyntax-only against Python 3.14 (the subinterpreter branch is active by default for Python >= 3.12). Compiles cleanly. clang-format applied.

Part of #6084

…tore

Under free-threaded CPython (Py_GIL_DISABLED) the GIL provides no mutual exclusion, so the plain pointer last_storage_ptr_ was read and written concurrently without synchronization, a C++ data race. Make it a std::atomic<T *>.

Also reorder get_stored() to check is_last_storage_valid() before loading the cached pointer. The writer publishes the pointer before setting the validity flag, so the flag must be observed first for correct acquire/release ordering.

Assisted-by: ClaudeCode:claude-fable-5

rwgk

gpt-5.5:

I reviewed the change in gil_safe_call_once.h: making last_storage_ptr_ atomic and loading it only after the validity flag is observed matches the intended seq-cst publish/observe ordering. The change stays scoped to the subinterpreter-support branch and preserves the existing slow-path behavior.

The functional bug being fixed is a C++ data race that matters when Py_GIL_DISABLED is active, because the GIL no longer serializes reads/writes to last_storage_ptr_.

On normal GIL builds, the same code is compiled in the subinterpreter-support branch on Python 3.12+, but the existing GIL discipline already prevents the problematic concurrent unsynchronized access in the intended usage. So the atomic pointer and reordered load are mostly harmless correctness hardening there, not a user-visible behavioral fix.

One nuance: the reorder in get_stored() is conceptually tied to the atomic publish protocol. Once last_storage_ptr_ becomes atomic, checking is_initialized_by_at_least_one_interpreter_ before reading the pointer is the right way to make the memory ordering argument valid. So both changes belong together, but the reason they are needed is free-threaded Python.

henryiii mentioned this pull request Jun 11, 2026

Code review findings (Claude) #6084

Open

5 tasks

henryiii marked this pull request as ready for review June 15, 2026 19:08

henryiii force-pushed the fix/gil-safe-call-once-atomic-cache branch from 3f3c573 to faf4b5d Compare June 17, 2026 12:14

rwgk approved these changes Jun 23, 2026

henryiii merged commit 085b660 into pybind:master Jun 24, 2026
86 checks passed

henryiii deleted the fix/gil-safe-call-once-atomic-cache branch June 24, 2026 01:48

github-actions bot added the "needs changelog" label (Possibly needs a changelog entry) Jun 24, 2026

henryiii merged 1 commit into pybind:master from henryiii:fix/gil-safe-call-once-atomic-cache