MDEV-38843: test for BF applier rollback failure on apply error #5309

Open
hemantdangi-gc wants to merge 1 commit into MariaDB:10.11 from mariadb-corporation:10.11_MDEV-38843_orig

@hemantdangi-gc

Contributor

Issue: When an applier fails to apply a write set and its rollback also fails, log_dummy_write_set() was skipped, leaving commit order stuck and locking the cluster (fixed in wsrep-lib).

Solution: Add a 3-node test that injects an applier rollback failure via simulate_rollback_failure_in_applier and verifies the node loses the inconsistency vote and disconnects instead of hanging.
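The commit-order stall described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions: `CommitOrder`, `finish`, `can_commit`, and `later_commit_can_proceed` are hypothetical names invented for this example and are not the wsrep-lib or MariaDB API. The model assumes that a write set with sequence number N may commit only after every earlier sequence number has been marked finished, and that logging a dummy write set is what marks a failed write set as finished.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Toy model of strict commit ordering (hypothetical, NOT the wsrep-lib API).
class CommitOrder {
public:
    // Mark a seqno as finished: either committed, or logged as a dummy
    // write set after a failed apply.
    void finish(std::uint64_t seqno) {
        assert(seqno > last_left_);
        finished_.insert(seqno);
        // Advance past every contiguous finished seqno.
        while (finished_.erase(last_left_ + 1) == 1) {
            ++last_left_;
        }
    }

    // A write set may commit only when all earlier seqnos have finished.
    bool can_commit(std::uint64_t seqno) const {
        return seqno == last_left_ + 1;
    }

private:
    std::uint64_t last_left_ = 0;       // highest seqno that has left the order
    std::set<std::uint64_t> finished_;  // finished seqnos waiting for a gap to close
};

// Seqno 1 fails to apply and its rollback fails too. Returns whether
// seqno 2 is then allowed to commit.
bool later_commit_can_proceed(bool log_dummy_on_failure) {
    CommitOrder order;
    if (log_dummy_on_failure) {
        order.finish(1);  // the dummy write set releases the slot of seqno 1
    }
    // Otherwise seqno 1 never leaves the commit order.
    return order.can_commit(2);
}
```

In this model, `later_commit_can_proceed(false)` returns `false`: seqno 2 waits forever behind seqno 1, which mirrors the lockup described above. `later_commit_can_proceed(true)` returns `true`, because the failed write set still releases its position in the order.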

@gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request addresses MDEV-38843, where a brute-force applier failure on a node could cause a complete cluster lockup. It introduces a new test case (MDEV-38843) to verify that a node experiencing an apply and rollback failure properly disconnects from the cluster rather than hanging, allowing the remaining nodes to continue. In sql/wsrep_high_priority_service.cc, a debug injection point simulate_rollback_failure_in_applier is added to simulate rollback failures and log warnings. Additionally, the wsrep-lib submodule is updated. There are no review comments, so I have no feedback to provide.
