Skip to content

MDEV-39459 Fix bad sync pattern for chain replication MTR tests#5074

Open
FarihaIS wants to merge 1 commit into
MariaDB:mainfrom
FarihaIS:mdev-39459
Open

MDEV-39459 Fix bad sync pattern for chain replication MTR tests#5074
FarihaIS wants to merge 1 commit into
MariaDB:mainfrom
FarihaIS:mdev-39459

Conversation

@FarihaIS
Copy link
Copy Markdown
Contributor

@FarihaIS FarihaIS commented May 13, 2026

Description

In chain replication (1->2->3), several rpl MTR tests synchronize by calling save_master_gtid on server_1 followed by sync_with_master_gtid on server_3, skipping server_2.

This is unsafe because server_2's binlog dump thread can send events to server_3 before commit_ordered() completes on server_2. Any operations on server_2 that depend on the replicated data can then fail intermittently.

Fix by explicitly syncing server_2 before server_3 in all affected tests, and update result files accordingly.

Release Notes

N/A

How can this PR be tested?

Execute the rpl test suite in mysql-test-run:

--------------------------------------------------------------------------
The servers were restarted 559 times
Spent 3203.500 of 545 seconds executing testcases

Completed: All 1230 tests were successful.
83 tests were skipped, 69 by the test itself.

Basing the PR against the correct MariaDB version

  • This is a bug fix, and the PR is based against the main branch.

Copyright

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.

In chain replication (1->2->3), syncing only server_3 after
save_master_gtid on server_1 does not guarantee server_2 has committed,
because server_2's binlog dump thread can send events to server_3 before
commit_ordered() completes on server_2.

Fix affected rpl tests by syncing server_2 before server_3, and update
result files accordingly.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the reliability of replication tests by adding explicit synchronization steps for intermediate servers in chained replication topologies. Specifically, it introduces connections to 'server_2' to sync with 'server_1' and save the GTID state before 'server_3' attempts to synchronize. These changes are applied across multiple test cases and result files to ensure consistent behavior in multi-tier replication environments. I have no feedback to provide as no review comments were included.

@FarihaIS FarihaIS marked this pull request as ready for review May 14, 2026 00:19
@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label May 14, 2026
@gkodinov gkodinov self-assigned this May 14, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! This is a preliminary review.

LGTM. And a very good catch. Please stay tuned for the final review.

@gkodinov
Copy link
Copy Markdown
Member

One thing to consider: this is a test bug fix. Would you be open to backporting this to the lowest affected version (10.11 I suppose) ?

@gkodinov gkodinov requested a review from bnestere May 14, 2026 10:37
@gkodinov gkodinov assigned bnestere and unassigned gkodinov May 14, 2026
@FarihaIS
Copy link
Copy Markdown
Contributor Author

@gkodinov thank you for the review!

One thing to consider: this is a test bug fix. Would you be open to backporting this to the lowest affected version (10.11 I suppose) ?

Yes, of course - just one small question: when I was trying to rebase my changes onto 10.11, I noticed that my original PR against main contained fixes to these files that are not tracked on 10.11 I believe:

mysql-test/suite/rpl/r/rpl_extra_col_slave_rebinlog.result
mysql-test/suite/rpl/r/rpl_master_slave_mismatched_columns_full.result
mysql-test/suite/rpl/r/rpl_row_img_sequence_full_nodup.result
mysql-test/suite/rpl/t/rpl_extra_col_slave_rebinlog.test
mysql-test/suite/rpl/t/rpl_master_slave_mismatched_columns_full.inc`

Would you still like me to rebase this on 10.11? What do you think?

@bnestere
Copy link
Copy Markdown
Contributor

@FarihaIS yes please, this should go into 10.11. Thanks for finding the specific tests which don't exist in 10.11, that is actually good. We should fix those tests in the versions which they were added. Please create separate PRs for each version that has tests that need to be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

3 participants