HDDS-8703. Integration test for SnapshotDeletingService during OM failover#10024
Open
arunsarin85 wants to merge 2 commits into apache:master
adoroszlai (Contributor) reviewed on Apr 2, 2026 and left a comment:
Thanks @arunsarin85 for the patch.
   * consistent. (HDDS-8703)
   */
  @Test
  public void testSnapshotDeletingServiceWithMultipleSnapshotsDuringFailover()
Please make this test parameterized with numSnapshots 1 and 3, then testSnapshotDeletingServiceDuringOMFailover can be removed, and this one renamed to testSnapshotDeletingServiceDuringOMFailover.
@ParameterizedTest
@ValueSource(ints = { 1, 3 })
void testSnapshotDeletingServiceDuringOMFailover(int numSnapshots)
arunsarin85 (Contributor, Author) replied:
Thanks for the suggestion! Updated as recommended.
The two separate @Test methods have been merged into a single @ParameterizedTest:
@ParameterizedTest
@ValueSource(ints = {1, 3})
public void testSnapshotDeletingServiceDuringOMFailover(int numSnapshots)
numSnapshots=1 covers the single-snapshot failover scenario
numSnapshots=3 covers the multi-snapshot backlog scenario
What changes were proposed in this pull request?
Added two integration tests to TestOzoneManagerHASnapshot that verify SnapshotDeletingService (SDS) behaves correctly when an OM leader failover happens while snapshot cleanup is pending.
testSnapshotDeletingServiceDuringOMFailover: Simulates SDS being blocked on the old leader (via suspend()) while a snapshot is queued for deletion. Triggers a leader failover by shutting down the old leader. Verifies that the new leader's SDS independently picks up the pending SNAPSHOT_DELETED entry, purges it from the DB, and leaves the snapshot chain in a consistent state.
testSnapshotDeletingServiceWithMultipleSnapshotsDuringFailover: Extends the above scenario to 3 snapshots queued for deletion simultaneously before the failover. Verifies that the new leader's SDS correctly processes the full backlog and that chain integrity holds after all cleanups complete.
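The flow both tests exercise can be illustrated with a small toy model. This is only a sketch in plain Java, not the test code from this patch: ToyOm, Status, runSdsOnce and purgedAfterFailover are hypothetical names, and the shared map is an assumed stand-in for OM state replicated across nodes. It shows why entries marked deleted while the old leader's SDS is suspended are still purged by the new leader.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: none of these types are part of Ozone. */
public class SdsFailoverSketch {

  /** Hypothetical snapshot states; DELETED plays the role of SNAPSHOT_DELETED. */
  enum Status { ACTIVE, DELETED }

  /** Stand-in for one OM node. The table is shared to mimic replicated state. */
  static final class ToyOm {
    final Map<String, Status> snapshotTable;
    boolean sdsSuspended;
    boolean running = true;

    ToyOm(Map<String, Status> replicatedTable) {
      this.snapshotTable = replicatedTable;
    }

    /** One SDS pass: purge entries marked DELETED unless suspended or stopped. */
    int runSdsOnce() {
      if (!running || sdsSuspended) {
        return 0;
      }
      int before = snapshotTable.size();
      snapshotTable.values().removeIf(s -> s == Status.DELETED);
      return before - snapshotTable.size();
    }
  }

  /** Returns how many snapshots the new leader purges after the failover. */
  static int purgedAfterFailover(int numSnapshots) {
    Map<String, Status> table = new LinkedHashMap<>();
    ToyOm oldLeader = new ToyOm(table);
    ToyOm newLeader = new ToyOm(table);

    // One snapshot that is never deleted, to check it survives the cleanup.
    table.put("keep", Status.ACTIVE);
    for (int i = 0; i < numSnapshots; i++) {
      table.put("snap-" + i, Status.ACTIVE);
    }

    // Block SDS on the old leader, then queue the deletions.
    oldLeader.sdsSuspended = true;
    for (int i = 0; i < numSnapshots; i++) {
      table.put("snap-" + i, Status.DELETED);
    }
    if (oldLeader.runSdsOnce() != 0) {
      throw new IllegalStateException("suspended SDS must not purge anything");
    }

    // Failover: the old leader stops and the new leader's SDS takes over.
    oldLeader.running = false;
    int purged = newLeader.runSdsOnce();

    if (table.size() != 1 || table.get("keep") != Status.ACTIVE) {
      throw new IllegalStateException("only the undeleted snapshot should remain");
    }
    return purged;
  }

  public static void main(String[] args) {
    // Mirrors the two scenarios: a single snapshot and a backlog of three.
    for (int n : new int[] {1, 3}) {
      System.out.println("numSnapshots=" + n + " purged=" + purgedAfterFailover(n));
    }
  }
}
```

Running main covers the single-snapshot case and the three-snapshot backlog. The real tests run against a MiniOzoneHAClusterImpl and wait for the background service rather than calling a purge method directly.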
Please describe your PR in detail:
SnapshotDeletingService (SDS) is a background service on the OM leader responsible for cleaning up deleted snapshots. Two @Test methods are added:
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-8703
How was this patch tested?
The two new tests were run locally against a 3-node MiniOzoneHAClusterImpl:
mvn test -pl hadoop-ozone/integration-test -am \
  -Dtest="TestOzoneManagerHASnapshot#testSnapshotDeletingServiceDuringOMFailover+testSnapshotDeletingServiceWithMultipleSnapshotsDuringFailover" \
  -DfailIfNoTests=false
Result:
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.88 s

BUILD SUCCESS