MDEV-39459 Fix bad sync pattern for chain replication MTR tests #5074
FarihaIS wants to merge 1 commit into
In chain replication (1->2->3), syncing only server_3 after save_master_gtid on server_1 does not guarantee server_2 has committed, because server_2's binlog dump thread can send events to server_3 before commit_ordered() completes on server_2. Fix affected rpl tests by syncing server_2 before server_3, and update result files accordingly.

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
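For illustration, the corrected sync order described above might look like the following in an MTR test. This is a minimal sketch, not the actual diff: it assumes the connection names server_1, server_2 and server_3 created by the usual rpl topology setup, and the standard include files include/save_master_gtid.inc and include/sync_with_master_gtid.inc. The surrounding statements in each affected test will differ.

```
# Sketch only: assumes a 1->2->3 topology with connections
# named server_1, server_2 and server_3.

# Record the GTID position reached on the head of the chain.
--connection server_1
--source include/save_master_gtid.inc

# Wait until the intermediate server has applied up to that position,
# then record its own position for the next hop.
--connection server_2
--source include/sync_with_master_gtid.inc
--source include/save_master_gtid.inc

# Only now wait on the tail of the chain.
--connection server_3
--source include/sync_with_master_gtid.inc
```

Syncing server_2 explicitly means that later statements run on server_2 no longer rely on server_3 having caught up as evidence that server_2 has committed.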
Code Review
This pull request enhances the reliability of replication tests by adding explicit synchronization steps for intermediate servers in chained replication topologies. Specifically, it introduces connections to 'server_2' to sync with 'server_1' and save the GTID state before 'server_3' attempts to synchronize. These changes are applied across multiple test cases and result files to ensure consistent behavior in multi-tier replication environments. I have no feedback to provide as no review comments were included.
gkodinov left a comment
Thank you for your contribution! This is a preliminary review.
LGTM. And a very good catch. Please stay tuned for the final review.
One thing to consider: this is a test bug fix. Would you be open to backporting this to the lowest affected version (10.11, I suppose)?
@gkodinov thank you for the review!
Yes, of course - just one small question: when I was trying to rebase my changes onto 10.11, I noticed that my original PR against Would you still like me to rebase this on 10.11? What do you think?
@FarihaIS yes please, this should go into 10.11. Thanks for finding the specific tests which don't exist in 10.11, that is actually good. We should fix those tests in the versions in which they were added. Please create separate PRs for each version that has tests that need to be fixed.
Description
In chain replication (1->2->3), several rpl MTR tests synchronize by calling save_master_gtid on server_1 followed by sync_with_master_gtid on server_3, skipping server_2.
This is unsafe because server_2's binlog dump thread can send events to server_3 before commit_ordered() completes on server_2. Any operations on server_2 that depend on the replicated data can then fail intermittently.
Fix by explicitly syncing server_2 before server_3 in all affected tests, and update result files accordingly.
Release Notes
N/A
How can this PR be tested?
Execute the rpl test suite in mysql-test-run:
Basing the PR against the correct MariaDB version
Copyright
All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.