Commit 7990313

Docs: Improve the C API documentation involving threads (GH-145520)
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
1 parent 7fbdc8f commit 7990313

1 file changed: 125 additions and 85 deletions

Doc/c-api/threads.rst

Lines changed: 125 additions & 85 deletions
@@ -10,43 +10,63 @@ Thread states and the global interpreter lock
    single: interpreter lock
    single: lock, interpreter

-Unless on a :term:`free-threaded <free threading>` build of :term:`CPython`,
-the Python interpreter is not fully thread-safe. In order to support
+Unless on a :term:`free-threaded build` of :term:`CPython`,
+the Python interpreter is generally not thread-safe. In order to support
 multi-threaded Python programs, there's a global lock, called the :term:`global
-interpreter lock` or :term:`GIL`, that must be held by the current thread before
-it can safely access Python objects. Without the lock, even the simplest
-operations could cause problems in a multi-threaded program: for example, when
+interpreter lock` or :term:`GIL`, that must be held by a thread before
+accessing Python objects. Without the lock, even the simplest operations
+could cause problems in a multi-threaded program: for example, when
 two threads simultaneously increment the reference count of the same object, the
 reference count could end up being incremented only once instead of twice.

+As such, only a thread that holds the GIL may operate on Python objects or
+invoke Python's C API.
+
 .. index:: single: setswitchinterval (in module sys)

-Therefore, the rule exists that only the thread that has acquired the
-:term:`GIL` may operate on Python objects or call Python/C API functions.
-In order to emulate concurrency of execution, the interpreter regularly
-tries to switch threads (see :func:`sys.setswitchinterval`). The lock is also
-released around potentially blocking I/O operations like reading or writing
-a file, so that other Python threads can run in the meantime.
+In order to emulate concurrency, the interpreter regularly tries to switch
+threads between bytecode instructions (see :func:`sys.setswitchinterval`).
+This is why locks are also necessary for thread-safety in pure-Python code.
+
+Additionally, the global interpreter lock is released around blocking I/O
+operations, such as reading or writing to a file. From the C API, this is done
+by :ref:`detaching the thread state <detaching-thread-state>`.
+

 .. index::
    single: PyThreadState (C type)

-The Python interpreter keeps some thread-specific bookkeeping information
-inside a data structure called :c:type:`PyThreadState`, known as a :term:`thread state`.
-Each OS thread has a thread-local pointer to a :c:type:`PyThreadState`; a thread state
+The Python interpreter keeps some thread-local information inside
+a data structure called :c:type:`PyThreadState`, known as a :term:`thread state`.
+Each thread has a thread-local pointer to a :c:type:`PyThreadState`; a thread state
 referenced by this pointer is considered to be :term:`attached <attached thread state>`.

 A thread can only have one :term:`attached thread state` at a time. An attached
-thread state is typically analogous with holding the :term:`GIL`, except on
-:term:`free-threaded <free threading>` builds. On builds with the :term:`GIL` enabled,
-:term:`attaching <attached thread state>` a thread state will block until the :term:`GIL`
-can be acquired. However, even on builds with the :term:`GIL` disabled, it is still required
-to have an attached thread state to call most of the C API.
+thread state is typically analogous with holding the GIL, except on
+free-threaded builds. On builds with the GIL enabled, attaching a thread state
+will block until the GIL can be acquired. However, even on builds with the GIL
+disabled, it is still required to have an attached thread state, as the interpreter
+needs to keep track of which threads may access Python objects.
+
+.. note::
+
+   Even on the free-threaded build, attaching a thread state may block, as the
+   GIL can be re-enabled or threads might be temporarily suspended (such as during
+   a garbage collection).
+
+Generally, there will always be an attached thread state when using Python's
+C API, including during embedding and when implementing methods, so it's uncommon
+to need to set up a thread state on your own. Only in some specific cases, such
+as in a :c:macro:`Py_BEGIN_ALLOW_THREADS` block or in a fresh thread, will the
+thread not have an attached thread state.
+If uncertain, check if :c:func:`PyThreadState_GetUnchecked` returns ``NULL``.

-In general, there will always be an :term:`attached thread state` when using Python's C API.
-Only in some specific cases (such as in a :c:macro:`Py_BEGIN_ALLOW_THREADS` block) will the
-thread not have an attached thread state. If uncertain, check if :c:func:`PyThreadState_GetUnchecked` returns
-``NULL``.
+If it turns out that you do need to create a thread state, call :c:func:`PyThreadState_New`
+followed by :c:func:`PyThreadState_Swap`, or use the dangerous
+:c:func:`PyGILState_Ensure` function.
+
+
+.. _detaching-thread-state:

 Detaching the thread state from extension code
 ----------------------------------------------
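
Annotation on the hunk above: the lost-update scenario it describes (two threads incrementing the same reference count) and the role of one global lock can be sketched in plain C with POSIX threads. This is a toy model only and says nothing about how CPython implements the GIL. All names are hypothetical: `toy_refcnt` stands in for a reference count, `toy_gil` for the global lock, and `run_locked_increments` is just a driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_MAX_THREADS 16

/* Toy stand-in for an object's reference count (hypothetical). */
static long toy_refcnt = 0;

/* Toy stand-in for the GIL: a single lock for the whole "interpreter". */
static pthread_mutex_t toy_gil = PTHREAD_MUTEX_INITIALIZER;

/* Each worker increments the shared counter `iterations` times. */
static void *
incref_worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&toy_gil);
        toy_refcnt++;  /* read-modify-write, done only while holding the lock */
        pthread_mutex_unlock(&toy_gil);
    }
    return NULL;
}

/* Runs `nthreads` workers and returns the final count, or -1 on error. */
static long
run_locked_increments(int nthreads, long iterations)
{
    pthread_t threads[TOY_MAX_THREADS];
    int started = 0;

    assert(nthreads >= 1 && nthreads <= TOY_MAX_THREADS);
    toy_refcnt = 0;

    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL,
                           incref_worker, &iterations) != 0) {
            break;
        }
    }
    /* Join every thread that was started before reading the result. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    return started == nthreads ? toy_refcnt : -1;
}
```

Since every increment happens with the lock held, no update is lost and the result equals `nthreads * iterations`. Dropping the lock/unlock pair would make the final count unpredictable, which is the failure mode the documentation warns about.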
@@ -86,28 +106,37 @@ The block above expands to the following code::

 Here is how these functions work:

-The :term:`attached thread state` holds the :term:`GIL` for the entire interpreter. When detaching
-the :term:`attached thread state`, the :term:`GIL` is released, allowing other threads to attach
-a thread state to their own thread, thus getting the :term:`GIL` and can start executing.
-The pointer to the prior :term:`attached thread state` is stored as a local variable.
-Upon reaching :c:macro:`Py_END_ALLOW_THREADS`, the thread state that was
-previously :term:`attached <attached thread state>` is passed to :c:func:`PyEval_RestoreThread`.
-This function will block until another releases its :term:`thread state <attached thread state>`,
-thus allowing the old :term:`thread state <attached thread state>` to get re-attached and the
-C API can be called again.
-
-For :term:`free-threaded <free threading>` builds, the :term:`GIL` is normally
-out of the question, but detaching the :term:`thread state <attached thread state>` is still required
-for blocking I/O and long operations. The difference is that threads don't have to wait for the :term:`GIL`
-to be released to attach their thread state, allowing true multi-core parallelism.
+The attached thread state implies that the GIL is held for the interpreter.
+To detach it, :c:func:`PyEval_SaveThread` is called and the result is stored
+in a local variable.
+
+By detaching the thread state, the GIL is released, which allows other threads
+to attach to the interpreter and execute while the current thread performs
+blocking I/O. When the I/O operation is complete, the old thread state is
+reattached by calling :c:func:`PyEval_RestoreThread`, which will wait until
+the GIL can be acquired.

 .. note::
-   Calling system I/O functions is the most common use case for detaching
-   the :term:`thread state <attached thread state>`, but it can also be useful before calling
-   long-running computations which don't need access to Python objects, such
-   as compression or cryptographic functions operating over memory buffers.
+   Performing blocking I/O is the most common use case for detaching
+   the thread state, but it is also useful to call it over long-running
+   native code that doesn't need access to Python objects or Python's C API.
    For example, the standard :mod:`zlib` and :mod:`hashlib` modules detach the
-   :term:`thread state <attached thread state>` when compressing or hashing data.
+   :term:`thread state <attached thread state>` when compressing or hashing
+   data.
+
+On a :term:`free-threaded build`, the :term:`GIL` is usually out of the question,
+but **detaching the thread state is still required**, because the interpreter
+periodically needs to block all threads to get a consistent view of Python objects
+without the risk of race conditions.
+For example, CPython currently suspends all threads for a short period of time
+while running the garbage collector.
+
+.. warning::
+
+   Detaching the thread state can lead to unexpected behavior during interpreter
+   finalization. See :ref:`cautions-regarding-runtime-finalization` for more
+   details.
+

 APIs
 ^^^^
@@ -149,73 +178,84 @@ example usage in the Python source distribution.
    declaration.


-.. _gilstate:
-
 Non-Python created threads
 --------------------------

 When threads are created using the dedicated Python APIs (such as the
-:mod:`threading` module), a thread state is automatically associated to them
-and the code shown above is therefore correct. However, when threads are
-created from C (for example by a third-party library with its own thread
-management), they don't hold the :term:`GIL`, because they don't have an
-:term:`attached thread state`.
+:mod:`threading` module), a thread state is automatically associated with them,
+However, when a thread is created from native code (for example, by a
+third-party library with its own thread management), it doesn't hold an
+attached thread state.

 If you need to call Python code from these threads (often this will be part
 of a callback API provided by the aforementioned third-party library),
 you must first register these threads with the interpreter by
-creating an :term:`attached thread state` before you can start using the Python/C
-API. When you are done, you should detach the :term:`thread state <attached thread state>`, and
-finally free it.
+creating a new thread state and attaching it.

-The :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` functions do
-all of the above automatically. The typical idiom for calling into Python
-from a C thread is::
+The most robust way to do this is through :c:func:`PyThreadState_New` followed
+by :c:func:`PyThreadState_Swap`.

-   PyGILState_STATE gstate;
-   gstate = PyGILState_Ensure();
+.. note::
+   ``PyThreadState_New`` requires an argument pointing to the desired
+   interpreter; such a pointer can be acquired via a call to
+   :c:func:`PyInterpreterState_Get` from the code where the thread was
+   created.
+
+For example::
+
+   /* The return value of PyInterpreterState_Get() from the
+      function that created this thread. */
+   PyInterpreterState *interp = thread_data->interp;
+
+   /* Create a new thread state for the interpreter. It does not start out
+      attached. */
+   PyThreadState *tstate = PyThreadState_New(interp);
+
+   /* Attach the thread state, which will acquire the GIL. */
+   PyThreadState_Swap(tstate);

    /* Perform Python actions here. */
    result = CallSomeFunction();
    /* evaluate result or handle exception */

-   /* Release the thread. No Python API allowed beyond this point. */
-   PyGILState_Release(gstate);
+   /* Destroy the thread state. No Python API allowed beyond this point. */
+   PyThreadState_Clear(tstate);
+   PyThreadState_DeleteCurrent();

-Note that the ``PyGILState_*`` functions assume there is only one global
-interpreter (created automatically by :c:func:`Py_Initialize`). Python
-supports the creation of additional interpreters (using
-:c:func:`Py_NewInterpreter`), but mixing multiple interpreters and the
-``PyGILState_*`` API is unsupported. This is because :c:func:`PyGILState_Ensure`
-and similar functions default to :term:`attaching <attached thread state>` a
-:term:`thread state` for the main interpreter, meaning that the thread can't safely
-interact with the calling subinterpreter.
+.. warning::

-Supporting subinterpreters in non-Python threads
-------------------------------------------------
+   If the interpreter finalized before ``PyThreadState_Swap`` was called, then
+   ``interp`` will be a dangling pointer!

-If you would like to support subinterpreters with non-Python created threads, you
-must use the ``PyThreadState_*`` API instead of the traditional ``PyGILState_*``
-API.
+.. _gilstate:

-In particular, you must store the interpreter state from the calling
-function and pass it to :c:func:`PyThreadState_New`, which will ensure that
-the :term:`thread state` is targeting the correct interpreter::
+Legacy API
+----------

-   /* The return value of PyInterpreterState_Get() from the
-      function that created this thread. */
-   PyInterpreterState *interp = ThreadData->interp;
-   PyThreadState *tstate = PyThreadState_New(interp);
-   PyThreadState_Swap(tstate);
+Another common pattern to call Python code from a non-Python thread is to use
+:c:func:`PyGILState_Ensure` followed by a call to :c:func:`PyGILState_Release`.

-   /* GIL of the subinterpreter is now held.
-      Perform Python actions here. */
+These functions do not work well when multiple interpreters exist in the Python
+process. If no Python interpreter has ever been used in the current thread (which
+is common for threads created outside Python), ``PyGILState_Ensure`` will create
+and attach a thread state for the "main" interpreter (the first interpreter in
+the Python process).
+
+Additionally, these functions have thread-safety issues during interpreter
+finalization. Using ``PyGILState_Ensure`` during finalization will likely
+crash the process.
+
+Usage of these functions look like such::
+
+   PyGILState_STATE gstate;
+   gstate = PyGILState_Ensure();
+
+   /* Perform Python actions here. */
    result = CallSomeFunction();
    /* evaluate result or handle exception */

-   /* Destroy the thread state. No Python API allowed beyond this point. */
-   PyThreadState_Clear(tstate);
-   PyThreadState_DeleteCurrent();
+   /* Release the thread. No Python API allowed beyond this point. */
+   PyGILState_Release(gstate);


 .. _fork-and-threads:
