Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #395 +/- ##
==========================================
- Coverage 63.15% 58.78% -4.37%
==========================================
Files 32 38 +6
Lines 1900 6122 +4222
Branches 204 800 +596
==========================================
+ Hits 1200 3599 +2399
- Misses 600 2122 +1522
- Partials 100 401 +301
Add comprehensive scrub infrastructure to detect data corruption and inconsistencies across replicas in HomeObject. This is phase 1 of the scrubber implementation.

- Implements deep and shallow scrubbing for PG metadata, shards, and blobs
- Supports periodic and manual scrub triggering modes
- Uses priority queue (MPMCPriorityQueue) for scrub task scheduling
- Persists scrub metadata using superblocks to track last scrub times
- Coordinates scrub operations across all replicas in a PG

1. **Deep Scrub**: Full data integrity verification
   - PG metadata validation
   - Shard existence and consistency checks
   - Blob hash verification (reads data and computes checksums)
   - Detects corrupted, missing, and inconsistent data across replicas
2. **Shallow Scrub**: Lightweight metadata-only verification
   - Shard existence checks
   - Blob index validation (no data reads)
   - Faster execution for routine checks

- FlatBuffer-based serialization for scrub requests and responses
- Leader sends scrub requests to all replicas
- Followers return scrub maps with their local state
- Retry logic with configurable timeouts for reliability

- **ShallowScrubReport**: Tracks missing shards and blobs per peer
- **DeepScrubReport**: Extends shallow report with:
  - Corrupted blobs/shards with error details
  - Inconsistent blobs (different hashes across replicas)
  - Corrupted PG metadata

- Scrubs data in configurable ranges to avoid timeouts
- Shard range: 2M shards per request
- Blob range: Based on HDD IOPS for deep scrub, 2M for shallow
- Early cancellation support for graceful shutdown

1. **DeepScrubTest**: Verifies detection of:
   - Missing blobs on followers
   - Missing shards on followers
   - Corrupted blob data (IO errors)
   - Inconsistent blob hashes across replicas
2. **MPMCPriorityQueue Tests**: Lock-free queue validation
   - Concurrent push/pop operations
   - Priority ordering verification
   - Thread safety under contention
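To make the report semantics above concrete, here is a minimal sketch of how a leader might compare the scrub maps returned by each replica and derive missing and inconsistent blobs. The PR's actual `ShallowScrubReport` / `DeepScrubReport` layouts are not shown here, so every type and function name below (`ScrubMap`, `ScrubReport`, `compare_scrub_maps`) is hypothetical, and blob hashes are simplified to 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins; the real HomeObject types are not shown in this PR text.
using PeerId = std::string;
using BlobId = std::uint64_t;
using BlobHash = std::uint64_t;
using ScrubMap = std::map<BlobId, BlobHash>;  // blob id -> hash seen on one replica

struct ScrubReport {
    std::map<PeerId, std::set<BlobId>> missing_blobs;  // blobs a given peer lacks
    std::set<BlobId> inconsistent_blobs;               // hashes differ across replicas
};

// Compare the per-replica scrub maps and collect discrepancies.
ScrubReport compare_scrub_maps(const std::map<PeerId, ScrubMap>& maps) {
    assert(!maps.empty() && "at least one replica scrub map is required");
    ScrubReport report;

    // Union of every blob id reported by any replica.
    std::set<BlobId> all_blobs;
    for (const auto& peer_entry : maps) {
        for (const auto& blob_entry : peer_entry.second) {
            all_blobs.insert(blob_entry.first);
        }
    }

    for (BlobId id : all_blobs) {
        std::set<BlobHash> hashes;  // distinct hashes observed for this blob
        for (const auto& peer_entry : maps) {
            const auto it = peer_entry.second.find(id);
            if (it == peer_entry.second.end()) {
                report.missing_blobs[peer_entry.first].insert(id);
            } else {
                hashes.insert(it->second);
            }
        }
        if (hashes.size() > 1) {
            report.inconsistent_blobs.insert(id);
        }
    }
    return report;
}
```

In this sketch a shallow scrub would only populate `missing_blobs`, while a deep scrub would also fill `inconsistent_blobs`, since only the deep path reads data and computes hashes. Deciding which replica holds the correct copy is deliberately left out.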
a28ce6e to
8ae8569
This PR implements the framework and basic logic of the scrubber, including:
1. Thread model
2. Scrubber RPC
3. Local scrub: deep and shallow scrub for PG, shard, and blob
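As an illustration of the thread model, the sketch below shows one way scrub tasks could be handed from producers to worker threads through a priority queue. It is not the PR's `MPMCPriorityQueue`, which is described as lock-free; this is a simplified mutex-guarded stand-in. The `ScrubTask` fields, the `ScrubPriority` levels, and the rule that manual scrubs run before periodic ones are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <optional>
#include <queue>
#include <vector>

// Hypothetical priority levels: assume manually triggered scrubs outrank periodic ones.
enum class ScrubPriority : std::uint8_t { Periodic = 0, Manual = 1 };

// Hypothetical task descriptor; the PR's real task type is not shown.
struct ScrubTask {
    std::uint16_t pg_id;
    bool deep;  // true for deep scrub, false for shallow scrub
    ScrubPriority priority;
};

// Mutex-based multi-producer/multi-consumer priority queue (not lock-free).
class ScrubTaskQueue {
public:
    void push(ScrubTask task) {
        std::lock_guard<std::mutex> lock(mtx_);
        heap_.push(task);
    }

    // Returns the highest-priority task, or std::nullopt if the queue is empty.
    std::optional<ScrubTask> try_pop() {
        std::lock_guard<std::mutex> lock(mtx_);
        if (heap_.empty()) {
            return std::nullopt;
        }
        ScrubTask task = heap_.top();
        heap_.pop();
        return task;
    }

private:
    // std::priority_queue is a max-heap: the "largest" element under this
    // comparator (the highest priority) is returned by top().
    struct ByPriority {
        bool operator()(const ScrubTask& a, const ScrubTask& b) const {
            return a.priority < b.priority;
        }
    };

    std::mutex mtx_;
    std::priority_queue<ScrubTask, std::vector<ScrubTask>, ByPriority> heap_;
};
```

Worker threads would call `try_pop()` in a loop and run the local scrub for each task they receive. Tasks of equal priority are returned in no guaranteed order here; a real scheduler would likely add a sequence number or timestamp as a tie-breaker.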