Summary
_hal.component returns pyhalitem Python wrappers (from newpin, newparam, getpin) whose internal halitem.u pointer is a value-snapshot into HAL shared memory taken at creation time. When the owning component is exited (comp.exit() / hal_exit()), HAL detaches its shared memory via rtapi_shmem_delete, but the pyhalitem Python objects remain "live" with u pointers that now reference unmapped pages. Any subsequent read/write on those pins through _hal dereferences the freed memory: SIGSEGV on glibc 2.39 (Ubuntu 24.04), silent garbage read on older runtimes.
It is a use-after-free in the caller ("you closed the door and walked through it") but it is invisible from Python: no exception, no warning, just a SIGSEGV.
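For contrast, Python's standard mmap module shows what a Python caller would normally expect from a handle into memory that has been released: access after close() raises an exception instead of touching the stale mapping. This is only an analogy built on the standard library; the anonymous mapping stands in for HAL shared memory, and nothing here reflects how _hal behaves today.

```python
import mmap


def read_after_close() -> str:
    """Map one anonymous page, release it, then try to read from it."""
    # Anonymous mapping: an assumed stand-in for the HAL shared memory segment.
    region = mmap.mmap(-1, mmap.PAGESIZE)
    region[0:4] = (42).to_bytes(4, "little")
    region.close()  # analogous to the owning component being exited
    try:
        return f"read {region[0]}"
    except ValueError as exc:
        # The mmap object remembers it was closed and refuses the access.
        return f"ValueError: {exc}"


if __name__ == "__main__":
    print(read_after_close())
```

The point of the comparison is that the handle itself knows it is dead, which is the piece pyhalitem is missing.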
Where
src/hal/halmodule.cc pyhal_pin_new: pypin->pin = *pin; snapshots halitem by value. No back-ref to the owning halobject.
src/hal/halmodule.cc pyhal_read_common / pyhal_write_common: dereference item->u unconditionally.
src/hal/halmodule.cc pyhal_exit_impl: tears down the component without invalidating outstanding pyhalitem wrappers.
src/hal/hal_lib.c:354: on last hal_exit, rtapi_shmem_delete actually detaches the pages, which is when the bug turns into a hard SIGSEGV.
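One possible shape for closing the gaps listed above, modelled in pure Python so it can be run without HAL: each wrapper keeps a back-reference to its owner, the owner tracks outstanding wrappers weakly, and exit invalidates them so later access raises. ModelComponent and ModelPin are hypothetical names for illustration; an actual fix would live in the C++ of halmodule.cc and might look quite different.

```python
import weakref


class ModelPin:
    """Hypothetical wrapper that keeps a back-reference to its owner."""

    def __init__(self, owner, name):
        self._owner = owner  # set to None when the owner exits
        self.name = name

    def _storage(self):
        if self._owner is None:
            raise RuntimeError(f"pin {self.name!r}: owning component has exited")
        return self._owner._storage

    def get(self):
        return self._storage()[self.name]

    def set(self, value):
        self._storage()[self.name] = value


class ModelComponent:
    """Hypothetical stand-in for a component that owns pin storage."""

    def __init__(self, name):
        self.name = name
        self._storage = {}              # stands in for HAL shared memory
        self._pins = weakref.WeakSet()  # tracks wrappers without keeping them alive

    def newpin(self, pin_name, value=0):
        self._storage[pin_name] = value
        pin = ModelPin(self, pin_name)
        self._pins.add(pin)
        return pin

    def exit(self):
        # Invalidate every outstanding wrapper before releasing the storage.
        for pin in list(self._pins):
            pin._owner = None
        self._storage.clear()


if __name__ == "__main__":
    comp = ModelComponent("demo")
    pin = comp.newpin("in", 3)
    comp.exit()
    try:
        pin.get()
    except RuntimeError as exc:
        print(exc)
```

The weak set is a design choice in this sketch: the component can reach every live wrapper at exit time without extending any wrapper's lifetime.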
Reproducer
Branch debug/4054-segfault on grandixximo/linuxcnc carries an instrumented halmodule.cc that logs u, ownership, and a mincore() mapping check on every pyhal_read_common. Running the qtdragon ui-smoke test from #4054 in an ubuntu:24.04 docker yields:
[268780.891937 pid=12426] PYHAL_EXIT_CALLED halobject=0x... hal_id=84 name=qtdragon
Python stack:
File "/work/bin/qtvcp", line 527, in shutdown
HAL.exit()
[268780.892578] hal_exit name=qtdragon hal_id=84 map_size=59
[268780.892604] hal_exit marked 59 u's as dead for hal_id=84
[268780.900233] READ u=0x7f4537a97fb0 known=1 alive=0 u_mapped=0
^^^^^^^^^^^^ shmem gone
[268780.900240] READ UNSAFE - aborting deref
u_mapped=0 is mincore() reporting the page no longer mapped. Without the safety guard the deref is a SIGSEGV. Happy to share the instrumented diff and the docker repro script.
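For anyone triaging output from the instrumented build, a small helper along these lines can pull out the reads that hit dead or unmapped pins. The line format is assumed purely from the excerpt above (READ u=... known=... alive=... u_mapped=...); find_stale_reads is a hypothetical name, and the regex would need adjusting if the real log differs.

```python
import re

# Pattern assumed from the log excerpt in this report, not from the patch itself.
READ_RE = re.compile(
    r"READ u=(?P<u>0x[0-9a-fA-F]+) known=(?P<known>\d) "
    r"alive=(?P<alive>\d) u_mapped=(?P<mapped>\d)"
)


def find_stale_reads(log_text):
    """Return the u addresses of READ lines flagged alive=0 or u_mapped=0."""
    stale = []
    for line in log_text.splitlines():
        match = READ_RE.search(line)
        if match and (match["alive"] == "0" or match["mapped"] == "0"):
            stale.append(int(match["u"], 16))
    return stale


SAMPLE = """\
[268780.892604] hal_exit marked 59 u's as dead for hal_id=84
[268780.900233] READ u=0x7f4537a97fb0 known=1 alive=0 u_mapped=0
[268780.900240] READ UNSAFE - aborting deref
"""

if __name__ == "__main__":
    for address in find_stale_reads(SAMPLE):
        print(hex(address))
```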
Real-world callers that hit this
qtvcp's QPin.REGISTRY + its update_all QTimer: the timer ticks once more after HAL.exit() and reads dead pins. Worked around for qtdragon by stopping the timer first, in "qtvcp: stop QPin update timer before hal_exit to prevent shutdown SIGSEGV" (#4062). Other screens (gmoccapy, axis, touchy) follow similar lifecycle patterns and may carry the same latent bug.
Any other code that keeps a long-lived Python list of pin handles across process shutdown or a component lifecycle change.
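On the caller side, the safe ordering is to stop whatever polls the pin handles before the component is exited. A minimal model of that ordering follows; PinPoller and shutdown are hypothetical stand-ins, not qtvcp's actual QPin or QTimer code, and the lambda readers stand in for real pin reads.

```python
class PinPoller:
    """Hypothetical registry of pin readers driven by a periodic tick."""

    def __init__(self):
        self._readers = []
        self.running = True

    def register(self, reader):
        self._readers.append(reader)

    def tick(self):
        # A stopped poller must never touch its handles again.
        if not self.running:
            return []
        return [reader() for reader in self._readers]

    def stop(self):
        self.running = False
        self._readers.clear()  # drop the handles so nothing can reuse them


def shutdown(poller, exit_component):
    """Stop polling first, then exit the component."""
    poller.stop()
    exit_component()


def demo_shutdown_order():
    events = []
    poller = PinPoller()
    poller.register(lambda: events.append("read") or 0)
    poller.tick()                                    # normal operation
    shutdown(poller, lambda: events.append("exit"))
    poller.tick()                                    # late tick: must be a no-op
    return events


if __name__ == "__main__":
    print(demo_shutdown_order())
```

This only protects callers that remember the ordering, which is why a guard inside the binding would still be worthwhile.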
Related
halmodule.cc to port the HAL port type; keeps the by-value pyhalitem.pin snapshot, so this bug survives that PR.