MDEV-38843: test for BF applier rollback failure on apply error #5309
hemantdangi-gc wants to merge 1 commit into
Issue: When an applier fails to apply a write set and its rollback also fails, log_dummy_write_set() was skipped, leaving commit order stuck and locking the cluster (fixed in wsrep-lib).
Solution: Add a 3-node test that injects an applier rollback failure via simulate_rollback_failure_in_applier and verifies the node loses the inconsistency vote and disconnects instead of hanging.
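To illustrate why a skipped log_dummy_write_set() can stall every later write set, here is a minimal toy model of ordered commit. It is a sketch only: `CommitOrder`, `handle_apply_error`, and the `always_log_dummy` flag are hypothetical names for illustration, not wsrep-lib's API, and it assumes the fix amounts to releasing the failed seqno even when rollback fails.

```cpp
#include <cstdint>
#include <set>

// Hypothetical toy model of ordered commit; NOT wsrep-lib's real interface.
class CommitOrder {
public:
    // Mark a seqno as finished (committed, or logged as a dummy write set).
    void release(std::uint64_t seqno) {
        finished_.insert(seqno);
        // Advance past every consecutive finished seqno.
        while (finished_.erase(last_left_ + 1) == 1) {
            ++last_left_;
        }
    }

    // A write set may commit only after all earlier seqnos have left.
    bool can_commit(std::uint64_t seqno) const {
        return seqno == last_left_ + 1;
    }

    std::uint64_t last_left() const { return last_left_; }

private:
    std::uint64_t last_left_ = 0;
    std::set<std::uint64_t> finished_;
};

// Hypothetical applier error path. `rollback_ok` stands in for the rollback
// result. With `always_log_dummy == false`, a failed rollback returns early
// and the seqno is never released (the stuck behavior described in the issue).
// With `always_log_dummy == true`, the seqno is released regardless.
bool handle_apply_error(CommitOrder& order, std::uint64_t seqno,
                        bool rollback_ok, bool always_log_dummy) {
    if (!rollback_ok && !always_log_dummy) {
        return false;  // seqno never released: later write sets wait forever
    }
    order.release(seqno);  // stands in for log_dummy_write_set()
    return rollback_ok;
}
```

In this model, a failed rollback on seqno 1 with `always_log_dummy` set to false leaves `can_commit(2)` false indefinitely, which mirrors the lockup. With the flag set to true, seqno 2 becomes eligible to commit even though the rollback itself still reports failure.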
Code Review
This pull request addresses MDEV-38843, where a brute-force applier failure on a node could cause a complete cluster lockup. It introduces a new test case (MDEV-38843) to verify that a node experiencing an apply and rollback failure properly disconnects from the cluster rather than hanging, allowing the remaining nodes to continue. In sql/wsrep_high_priority_service.cc, a debug injection point simulate_rollback_failure_in_applier is added to simulate rollback failures and log warnings. Additionally, the wsrep-lib submodule is updated. There are no review comments, so I have no feedback to provide.