feat(chunk): add chunk get peer diagnostics #97
grumbach
left a comment
A few real issues worth addressing before merge:
- **high** `ant-core/Cargo.toml:40`: swaps `ant-protocol` from a registry pin to a git branch (`rc-2026.5.4`), which transitively pulls a second `saorsa-core 0.24.5-rc.1` into `Cargo.lock`. The PR's own description admits `cargo clippy --all-targets` fails on the RC branch due to multiple `saorsa-core` versions and `MultiAddr`/`P2PNode` type mismatches. Best to land the protocol bump in its own PR, or block this one until the RC merges to crates.io. We don't want the diagnostic PR carrying the dep-bump fallout.
- **high** `ant-cli/src/commands/data/chunk.rs:34`: `--peer-count` works without `--all-peers`. It routes the default `chunk get` through `chunk_get_from_closest_peers(addr, N)`, expanding the queried set well beyond `close_group_size`. This silently breaks the close-group invariant assumed by AIMD/budget accounting and the cache-write path, which could be contributing to the exact symptom you're chasing. Gate `--peer-count` behind `--all-peers`, or cap `N <= close_group_size` in the early-return path.
- **high** `ant-core/src/data/client/chunk.rs:362`: `chunk_get_from_closest_peer_group` dispatches all N peer GETs concurrently via `FuturesUnordered` with no per-command deadline and no concurrency cap. With `--peer-count 200` this fans out 200 simultaneous chunk RPCs and waits for the slowest 10s timeout. Bound with `buffer_unordered(K)` and an overall timeout, or document the diagnostic-only intent in `--help`.
- **medium** `ant-cli/src/commands/data/chunk.rs:54-72`: the CLI claims success ("Saved closest successful chunk to ...") whenever any peer returned the chunk, even if N-1 timed out. The summary line is honest but the headline isn't. Suggest `Saved chunk (1/N peers responded successfully)` so the operator can't misread a near-total failure as a clean hit.
- **medium (tests)**: only `xor_distance_decimal` has new tests. `chunk_get_from_closest_peer_group` (ordering, summary counts, cache-write on partial success) and the CLI flag plumbing have none. At least add a unit test that exercises the result-sorting and the summary tallies with stubbed `chunk_get_from_peer` outcomes.
- **low** `ant-cli/src/commands/data/chunk.rs:166,168,186,203,211,215,219`: format args inconsistently inlined. Project style is `format!("foo {var}")`; bind locals (`let display = path.display();`) where needed.
- **low** `ant-core/src/data/client/chunk.rs:368`: the pre-dispatch `peers.sort_by_key(...)` is dead work; results are re-sorted by `xor_distance` at line 401 after `FuturesUnordered` completion order randomises them. Drop the pre-sort.
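For the `--peer-count` item, the gating could look roughly like the sketch below. `effective_peer_count` is a hypothetical helper, not a function from this PR, and the exact error text is an assumption; it only illustrates keeping the default path pinned to the close group while letting the diagnostic sweep widen the set.

```rust
/// Hypothetical helper: decides how many closest peers a `chunk get` may query.
///
/// `all_peers` and `requested` mirror the `--all-peers` and `--peer-count`
/// flags; `close_group_size` is whatever the client's close-group size is.
fn effective_peer_count(
    all_peers: bool,
    requested: Option<usize>,
    close_group_size: usize,
) -> Result<usize, String> {
    match (all_peers, requested) {
        // Default path: always query exactly the close group.
        (false, None) => Ok(close_group_size),
        // `--peer-count` on its own is rejected, so the default path
        // never silently widens past the close group.
        (false, Some(_)) => Err("--peer-count requires --all-peers".to_string()),
        // Diagnostic sweep without an explicit count falls back to the close group.
        (true, None) => Ok(close_group_size),
        // Diagnostic sweep with an explicit count may go wider.
        (true, Some(n)) => Ok(n),
    }
}
```

With an illustrative close-group size of 7, `effective_peer_count(false, Some(20), 7)` is an error while `effective_peer_count(true, Some(20), 7)` yields 20.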
No unwrap/expect/panic introduced in non-test paths. peer_xor_distance consolidation into mod.rs and the quote.rs call-site updates are a clean dedup.
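To make the fan-out concern concrete, here is a std-only sketch of a bounded sweep with an overall deadline. It deliberately uses a small thread pool and a channel as a stand-in for `buffer_unordered(K)` plus an async timeout, so it is not the code in this PR; `bounded_sweep` and its `probe` callback are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::{Duration, Instant};

/// Runs `probe(i)` for peers `0..peer_count` with at most `limit` probes in
/// flight, and stops collecting results once `overall` has elapsed.
/// Returns `(peer_index, success)` pairs in completion order.
fn bounded_sweep<F>(
    peer_count: usize,
    limit: usize,
    overall: Duration,
    probe: F,
) -> Vec<(usize, bool)>
where
    F: Fn(usize) -> bool + Send + Sync + 'static,
{
    let next = Arc::new(AtomicUsize::new(0));
    let probe = Arc::new(probe);
    let (tx, rx) = mpsc::channel::<(usize, bool)>();

    // Spawn at most `limit` workers; each one pulls the next peer index.
    for _ in 0..limit.max(1).min(peer_count) {
        let next = Arc::clone(&next);
        let probe = Arc::clone(&probe);
        let tx = tx.clone();
        thread::spawn(move || loop {
            let i = next.fetch_add(1, Ordering::SeqCst);
            if i >= peer_count {
                break;
            }
            let ok = (*probe)(i);
            // The receiver is gone once the deadline has passed; stop then.
            if tx.send((i, ok)).is_err() {
                break;
            }
        });
    }
    drop(tx);

    // Collect until every peer has reported or the overall deadline hits.
    let deadline = Instant::now() + overall;
    let mut results = Vec::new();
    while results.len() < peer_count {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(r) => results.push(r),
            Err(_) => break, // timed out, or all workers finished
        }
    }
    results
}
```

The point of the shape is that a large peer count costs at most `limit` simultaneous probes and never blocks longer than `overall`; probes still running at the deadline are simply abandoned in this sketch.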
6cb527a to 1e5ef30
dirvine
left a comment
Review: PR #97 — feat(chunk): add chunk get peer diagnostics
I reviewed the current diff (4 commits merged, latest 3191af9). Mick has addressed all of Anselme's review items from the previous round. Checking each:
Previously flagged items — status
| # | Issue | Status |
|---|---|---|
| 1 | `Cargo.toml` `ant-protocol` git dep pin | ✅ Removed: no `Cargo.toml` diff present anymore |
| 2 | `--peer-count` works without `--all-peers` | ✅ Fixed: `requires = "all_peers"` in clap + runtime bail |
| 3 | Unbounded concurrent fan-out in diagnostic sweep | ✅ Fixed: `buffer_unordered(concurrency_limit)` + `diagnostic_peer_get_overall_timeout` |
| 4 | Misleading success headline on partial failure | ✅ Fixed: shows "x / y peers responded successfully" |
| 5 | Insufficient test coverage | ✅ Added: sorting, timeout scaling, summary tallies, CLI flag validation |
| 6 | Inconsistent format-arg style | ✅ Fixed: consistent `{var}` inlining with `let display` bindings |
| 7 | Dead pre-sort before `FuturesUnordered` | ✅ Removed: `sort_chunk_peer_get_results` runs after collection |
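For readers following items 4, 5 and 7, the post-collection sort, the summary tallies and the headline can be pictured with the sketch below. The types and function names (`PeerResult`, `Outcome`, `sort_results`, `summarize`, `headline`) are hypothetical stand-ins rather than the PR's actual definitions, and the headline wording follows the suggestion from the first review.

```rust
/// Hypothetical per-peer outcome of a diagnostic chunk GET.
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    Found,
    NotFound,
    TimedOut,
}

/// Hypothetical result row: XOR distance to the chunk address plus outcome.
#[derive(Debug, Clone)]
struct PeerResult {
    distance: [u8; 32],
    outcome: Outcome,
}

/// Orders rows closest-first. Comparing the byte arrays lexicographically
/// matches comparing the big-endian integers they encode.
fn sort_results(results: &mut [PeerResult]) {
    results.sort_by(|a, b| a.distance.cmp(&b.distance));
}

#[derive(Debug, Default, PartialEq)]
struct Summary {
    found: usize,
    not_found: usize,
    timed_out: usize,
}

/// Counts each outcome once; the order of `results` does not matter.
fn summarize(results: &[PeerResult]) -> Summary {
    let mut summary = Summary::default();
    for r in results {
        match r.outcome {
            Outcome::Found => summary.found += 1,
            Outcome::NotFound => summary.not_found += 1,
            Outcome::TimedOut => summary.timed_out += 1,
        }
    }
    summary
}

/// Headline that makes a partial success visible to the operator.
fn headline(summary: &Summary) -> String {
    let found = summary.found;
    let total = summary.found + summary.not_found + summary.timed_out;
    format!("Saved chunk ({found}/{total} peers responded successfully)")
}
```

Because sorting happens once on the collected rows, the completion order of the concurrent GETs has no effect on what the operator sees.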
Additional observations
- `chunk_get` now routes through `chunk_get_from_closest_peers`: this changes the hot-path to query `close_group_size` peers and return the first success, matching the existing behavior of the old inlined loop. No regression risk.
- `chunk_get_from_closest_peer_group` caches on partial success: caching found chunks during diagnostics is harmless and improves cache-hit rates on retries. Fine.
- Schema-safe: no new dependencies, no protobuf/serialization changes, no public API additions beyond the CLI.
- `closest_peers(count)` extracted from `close_group_peers()`: clean dedup. The `quote.rs` `xor_distance` → `peer_xor_distance` migration is equally clean.
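As an illustration of the dedup in the last point, the relationship between the two helpers can be sketched as follows. This assumes 32-byte identifiers and plain slices; the real signatures in `ant-core` are not shown in this thread, so the parameter lists here are assumptions.

```rust
/// Hypothetical 32-byte identifier used for both peers and chunk addresses.
type Id = [u8; 32];

/// Byte-wise XOR of two identifiers, read as a big-endian distance.
fn peer_xor_distance(a: &Id, b: &Id) -> Id {
    let mut out = [0u8; 32];
    for (o, (x, y)) in out.iter_mut().zip(a.iter().zip(b.iter())) {
        *o = x ^ y;
    }
    out
}

/// Returns up to `count` peers, closest to `target` first.
fn closest_peers(peers: &[Id], target: &Id, count: usize) -> Vec<Id> {
    let mut sorted = peers.to_vec();
    sorted.sort_by_key(|p| peer_xor_distance(p, target));
    sorted.truncate(count);
    sorted
}

/// The close group is the same selection at a fixed size, so it can
/// delegate instead of duplicating the sort.
fn close_group_peers(peers: &[Id], target: &Id, close_group_size: usize) -> Vec<Id> {
    closest_peers(peers, target, close_group_size)
}
```

Keeping one distance function and one selection routine means the default path and the diagnostic sweep cannot drift apart in how they rank peers.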
Verdict
Approve. All review-blocking items resolved. Clean, well-tested, and safe for the RC branch.
3191af9 to 4fbdf53
Summary
- `ant chunk get --all-peers` to query every selected closest peer and print ranked per-peer results
- `--peer-count` / `--peers` to choose how many closest peers to try, e.g. 20 instead of the default close-group size

Verification
- `cargo fmt --all -- --check`
- `cargo test -p ant-cli xor_distance_decimal`
- `cargo clippy -p ant-core --lib -- -D warnings`
- `cargo clippy -p ant-cli -- -D warnings`

Note
- `cargo clippy --all-targets --all-features -- -D warnings` currently fails on the RC branch outside this diff due to multiple `saorsa_core` versions causing `MultiAddr`/`P2PNode` type mismatches in devnet/test-support code.