
snapshot: fix delete race producing stuck open_ref / empty clone entries#1018

Open
schmidt-scaled wants to merge 1 commit into main from fix/snapshot-delete-race

Conversation

@schmidt-scaled
Contributor

Summary

Three independent fixes that together close the "Cannot remove snapshot because it is open" / EBUSY (-16) state, in which a snapshot ends up with a non-zero open_ref but no clone entries and can only be cleared by restarting the host node (incident: aws_dual_soak 2026-04-30, 14 stuck snapshots).

Fixes

  1. Random VUID dedupe & range bump (simplyblock_core/utils/__init__.py)

    • get_random_vuid's range grows from 10k → 1M.
    • Both get_random_vuid and get_random_snapshot_vuid now dedupe against the numeric suffixes parsed out of existing CLN_/LVOL_/SNAP_ bdev names cluster-wide.
    • With ~10k lvols+snaps, the legacy 10k range hit roughly 50% birthday-collision probability; those collisions triggered the SPDK "lvol with name ... already exists" rejection, the mgmt async delete, and the reuse-during-deletion sequence.
  2. Reject snapshot/clone ops on pending-deletion targets (snapshot_controller.add / snapshot_controller.clone)

    • add rejects when the source lvol is STATUS_IN_DELETION.
    • clone rejects when the source snapshot has deleted=True or is STATUS_IN_DELETION.
    • This closes the window between an async delete being issued and a fresh create slipping through against the same blob.
  3. Snapshot delete waits for clone SPDK delete (snapshot_controller.delete)

    • A clone counts as "gone" only once its deletion_status field has been set (i.e. the leader's delete_lvol_from_node has returned). An IN_DELETION clone with no deletion_status still blocks SPDK, so the snapshot is soft-deleted instead.
    • The clone's own delete-completion path re-triggers the hard delete once SPDK has actually released the bdev.
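Fix 1's dedupe can be sketched roughly as follows. This is illustrative only: the helper names, the retry cap, and the bdev-name shapes are assumptions, not the actual simplyblock_core code; the suffix regex mirrors the _BDEV_NAME_NUMERIC_SUFFIX pattern quoted in the review thread.

```python
import random
import re

# Mirrors the _BDEV_NAME_NUMERIC_SUFFIX pattern from the diff; everything
# else here is an illustrative assumption, not the real implementation.
_SUFFIX = re.compile(r'(?:^|[/_])(\d+)\s*$')

def used_bdev_numbers(bdev_names):
    """Collect numeric suffixes already claimed by CLN_/LVOL_/SNAP_ names."""
    used = set()
    for name in bdev_names:
        m = _SUFFIX.search(name)
        if m:
            used.add(int(m.group(1)))
    return used

def get_random_vuid_sketch(bdev_names, lo=1, hi=1_000_000, attempts=100):
    """Draw from the widened 1M range, rejecting suffixes already in use."""
    used = used_bdev_numbers(bdev_names)
    for _ in range(attempts):
        vuid = random.randint(lo, hi)
        if vuid not in used:
            return vuid
    raise RuntimeError("no free VUID found after %d attempts" % attempts)
```

The rejection loop is cheap because the used-suffix set is built once per allocation; widening the range to 1M is what makes a retry almost always succeed on the first draw.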
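Fixes 2 and 3 both reduce to small state checks. The sketch below is a hedged illustration using plain dicts and callbacks; the status and field names (STATUS_IN_DELETION, deleted, deletion_status) come from the PR text, but the data shapes and function signatures are assumptions, not the real snapshot_controller API.

```python
# Hypothetical status constant, named after the PR text.
STATUS_IN_DELETION = "in_deletion"

def clone_allowed(snapshot):
    """Fix 2: refuse to clone from a snapshot that is pending deletion."""
    return (not snapshot.get("deleted")
            and snapshot.get("status") != STATUS_IN_DELETION)

def clone_still_blocking(clone):
    # Fix 3: a clone only stops blocking once deletion_status is set, i.e.
    # the leader's delete_lvol_from_node has returned for it.
    return (clone.get("status") == STATUS_IN_DELETION
            and not clone.get("deletion_status"))

def delete_snapshot(snapshot, clones, hard_delete, soft_delete):
    """Hard-delete only when no clone's SPDK-side delete is still in flight."""
    if any(clone_still_blocking(c) for c in clones):
        # SPDK still holds a clone's bdev open: defer. The clone's own
        # delete-completion path re-triggers the hard delete later.
        soft_delete(snapshot)
        return False
    hard_delete(snapshot)
    return True
```

The key design point is that IN_DELETION alone is no longer evidence the clone is gone; only the recorded deletion_status is, which is what prevents the premature SPDK call that returned EBUSY.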

Test plan

  • tests/test_snapshot_delete_race.py — 10 unit tests covering all three fixes (random vuid dedupe, add/clone rejection, delete waits for in-flight clone)
  • full tests/ suite green (running)
  • soak verification against AWS cluster after merge

🤖 Generated with Claude Code

Commit message

Three independent fixes that together close the
"Cannot remove snapshot because it is open" / EBUSY (-16) state where
the snapshot ends up with non-zero open_ref but no clone entries and
can only be cleared by restarting the host node.

1. Bump random VUID space from 10k to 1M and dedupe against existing
   CLN_/LVOL_/SNAP_ bdev-name numeric suffixes. With ~10k lvols+snaps
   the legacy 10k range hit ~50% birthday-collision probability,
   producing repeated SPDK "lvol with name already exists" rejections
   that triggered the async-delete-then-reuse sequence below.

2. snapshot_controller.add and .clone reject ops on a target that is
   in pending deletion (lvol STATUS_IN_DELETION; snapshot
   STATUS_IN_DELETION or deleted=True). Closes the window between an
   async delete being issued and a fresh create slipping through
   against the same blob, which left snapshot parent metadata
   partially overwritten by the new clone's lineage.

3. snapshot_controller.delete blocks the snapshot's hard-delete while
   any clone's SPDK-side delete is still in flight. Previously any
   IN_DELETION clone was treated as "already gone" and the snap
   delete proceeded to call SPDK, which returned EBUSY because the
   clone's bdev was still open. Now a clone counts as gone only when
   its deletion_status field has been set (i.e. the leader's
   delete_lvol_from_node returned). Otherwise the snapshot is
   soft-deleted; the clone's own delete-completion path will
   re-trigger the hard delete once SPDK has actually released it.

Tests: tests/test_snapshot_delete_race.py covers all three fixes
(10 tests, all green).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review thread code context:

_BDEV_NAME_NUMERIC_SUFFIX = re.compile(r'(?:^|[/_])(\d+)\s*$')


def _used_bdev_name_numbers(db_controller):
Collaborator

This function can be written like this:

def _used_bdev_name_numbers(db_controller):
    used = set()
    for lvol in db_controller.get_lvols():
        used.add(lvol.vuid)
    for snap in db_controller.get_snapshots():
        used.add(snap.vuid)
    return used
