Fix snapshot automount race causing AVL tree panic #17943
+87
−37
Motivation and Context
This fixes a race condition that causes kernel panics when multiple processes simultaneously access a fresh snapshot. The bug triggers a VERIFY() assertion failure in the AVL tree code when concurrent threads attempt to add identical entries during snapshot automount.
Description
The race condition occurs in zfsctl_snapshot_mount() due to a time-of-check to time-of-use (TOCTOU) bug between checking whether a snapshot is mounted and adding it to the AVL tree. The sequence is:
1. Concurrent threads call zfsctl_snapshot_ismounted() - returns FALSE
2. Each thread runs the mount helper (call_usermodehelper())
3. Each thread tries to add the same entry to the AVL tree - VERIFY() assertion fails
The fix adds a pending entry mechanism with per-entry mutex synchronization. The first mount thread creates a pending AVL entry and holds se_mtx during helper execution. Concurrent mounts find the pending entry and return success without spawning duplicate helpers, preventing the AVL panic.
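The pending-entry idea can be illustrated with a small user-space sketch. This is not the code from this PR: the linked list stands in for the AVL tree, pthread mutexes stand in for kernel mutexes, and snap_entry_t, registry_lock, snapshot_mount(), run_mount_helper(), and helper_runs are hypothetical names chosen for illustration. Only se_mtx and the "pending" notion come from the description above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a snapshot entry; not the real ZFS structure. */
typedef struct snap_entry {
	char name[64];
	bool pending;			/* true while the helper is running */
	pthread_mutex_t se_mtx;		/* held by the first mounter */
	struct snap_entry *next;
} snap_entry_t;

static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static snap_entry_t *registry = NULL;	/* list standing in for the AVL tree */
static atomic_int helper_runs = 0;	/* counts simulated helper launches */

/* Caller must hold registry_lock. */
static snap_entry_t *
registry_find(const char *name)
{
	for (snap_entry_t *se = registry; se != NULL; se = se->next)
		if (strcmp(se->name, name) == 0)
			return (se);
	return (NULL);
}

/* Stand-in for launching the user-mode mount helper. */
static void
run_mount_helper(void)
{
	atomic_fetch_add(&helper_runs, 1);
}

int
snapshot_mount(const char *name)
{
	pthread_mutex_lock(&registry_lock);
	if (registry_find(name) != NULL) {
		/* Mounted or pending: succeed without a second helper. */
		pthread_mutex_unlock(&registry_lock);
		return (0);
	}

	snap_entry_t *se = calloc(1, sizeof (*se));
	if (se == NULL) {
		pthread_mutex_unlock(&registry_lock);
		return (-1);
	}
	strncpy(se->name, name, sizeof (se->name) - 1);
	se->pending = true;
	pthread_mutex_init(&se->se_mtx, NULL);
	pthread_mutex_lock(&se->se_mtx);

	/* Check and insert happen under one lock, so no duplicate insert. */
	assert(registry_find(name) == NULL);
	se->next = registry;
	registry = se;
	pthread_mutex_unlock(&registry_lock);

	run_mount_helper();		/* only the inserting thread gets here */

	se->pending = false;
	pthread_mutex_unlock(&se->se_mtx);
	return (0);
}

/* Thread entry point for exercising snapshot_mount() concurrently. */
void *
mount_thread(void *arg)
{
	return ((void *)(long)snapshot_mount((const char *)arg));
}
```

Starting several threads on mount_thread() with the same snapshot name should leave helper_runs at 1, because the lookup and the insertion of the pending entry happen under the same lock. Error handling for a failed helper (removing the pending entry) is omitted from this sketch.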
Kernel Stack Trace:
How Has This Been Tested?
Reproduction script:
Results:
Types of changes
Checklist:
Signed-off-by.