Skip to content

feat(chunk): add chunk get peer diagnostics#97

Merged
jacderida merged 4 commits into
rc-2026.5.4from
feat/chunk-get-peer-count-rc
May 27, 2026
Merged

feat(chunk): add chunk get peer diagnostics#97
jacderida merged 4 commits into
rc-2026.5.4from
feat/chunk-get-peer-count-rc

Conversation

@mickvandijke
Copy link
Copy Markdown
Contributor

Summary

  • add ant chunk get --all-peers to query every selected closest peer and print ranked per-peer results
  • print XOR distance as a decimal number
  • add --peer-count / --peers to choose how many closest peers to try, e.g. 20 instead of the default close-group size

Verification

  • cargo fmt --all -- --check
  • cargo test -p ant-cli xor_distance_decimal
  • cargo clippy -p ant-core --lib -- -D warnings
  • cargo clippy -p ant-cli -- -D warnings

Note

  • cargo clippy --all-targets --all-features -- -D warnings currently fails on the RC branch outside this diff due multiple saorsa_core versions causing MultiAddr / P2PNode type mismatches in devnet/test-support code.

Copy link
Copy Markdown
Contributor

@grumbach grumbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few real issues worth addressing before merge:

  • high ant-core/Cargo.toml:40 — swaps ant-protocol from a registry pin to a git branch (rc-2026.5.4), which transitively pulls a second saorsa-core 0.24.5-rc.1 into Cargo.lock. The PR's own description admits cargo clippy --all-targets fails on the RC branch due to multiple saorsa-core versions and MultiAddr/P2PNode type mismatches. Best to land the protocol bump in its own PR, or block this one until the RC merges to crates.io. We don't want the diagnostic PR carrying the dep-bump fallout.

  • high ant-cli/src/commands/data/chunk.rs:34--peer-count works without --all-peers: it routes the default chunk get through chunk_get_from_closest_peers(addr, N), expanding the queried set well beyond close_group_size. This silently breaks the close-group invariant assumed by AIMD/budget accounting and the cache-write path — which could be contributing to the exact symptom you're chasing. Gate --peer-count behind --all-peers, or cap N <= close_group_size in the early-return path.

  • high ant-core/src/data/client/chunk.rs:362chunk_get_from_closest_peer_group dispatches all N peer GETs concurrently via FuturesUnordered with no per-command deadline and no concurrency cap. With --peer-count 200 this fans out 200 simultaneous chunk RPCs and waits for the slowest 10s timeout. Bound with buffer_unordered(K) and an overall timeout, or document the diagnostic-only intent in --help.

  • medium ant-cli/src/commands/data/chunk.rs:54-72 — CLI claims success ("Saved closest successful chunk to ...") whenever any peer returned the chunk, even if N-1 timed out. The summary line is honest but the headline isn't. Suggest Saved chunk (1/N peers responded successfully) so the operator can't misread a near-total failure as a clean hit.

  • medium (tests) — only xor_distance_decimal has new tests. chunk_get_from_closest_peer_group (ordering, summary counts, cache-write on partial success) and the CLI flag plumbing have none. At least a unit test that exercises the result-sorting and the summary tallies with stubbed chunk_get_from_peer outcomes.

  • low ant-cli/src/commands/data/chunk.rs:166,168,186,203,211,215,219 — format args inconsistently inlined. Project style is format!("foo {var}"); bind locals (let display = path.display();) where needed.

  • low ant-core/src/data/client/chunk.rs:368 — pre-dispatch peers.sort_by_key(...) is dead work; results are re-sorted by xor_distance at line 401 after FuturesUnordered completion order randomises them. Drop the pre-sort.

No unwrap/expect/panic introduced in non-test paths. peer_xor_distance consolidation into mod.rs and the quote.rs call-site updates are a clean dedup.

@mickvandijke mickvandijke force-pushed the feat/chunk-get-peer-count-rc branch from 6cb527a to 1e5ef30 Compare May 26, 2026 14:37
Copy link
Copy Markdown

@dirvine dirvine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: PR #97 — feat(chunk): add chunk get peer diagnostics

I reviewed the current diff (4 commits merged, latest 3191af9). Mick has addressed all of Anselme's review items from the previous round. Checking each:

Previously flagged items — status

# Issue Status
1 Cargo.toml ant-protocol git dep pin Removed — no Cargo.toml diff present anymore
2 --peer-count works without --all-peers Fixedrequires = "all_peers" in clap + runtime bail
3 Unbounded concurrent fan-out in diagnostic sweep Fixedbuffer_unordered(concurrency_limit) + diagnostic_peer_get_overall_timeout
4 Misleading success headline on partial failure Fixed — shows "x / y peers responded successfully"
5 Insufficient test coverage Added — sorting, timeout scaling, summary tallies, CLI flag validation
6 Inconsistent format-arg style Fixed — consistent {var} inlining with let display bindings
7 Dead pre-sort before FutUnordered Removedsort_chunk_peer_get_results runs after collection

Additional observations

  1. chunk_get now routes through chunk_get_from_closest_peers — This changes the hot-path to query close_group_size peers and return the first success, matching the existing behavior of the old inlined loop. No regression risk.

  2. chunk_get_from_closest_peer_group caches on partial success — Caching found chunks during diagnostics is harmless and improves cache-hit rates on retries. Fine.

  3. Schema-safe — No new dependencies, no protobuf/serialization changes, no public API additions beyond the CLI.

  4. closest_peers(count) extracted from close_group_peers() — Clean dedup. The quote.rs xor_distancepeer_xor_distance migration is equally clean.

Verdict

Approve. All review-blocking items resolved. Clean, well-tested, and safe for the RC branch.

@mickvandijke mickvandijke force-pushed the feat/chunk-get-peer-count-rc branch from 3191af9 to 4fbdf53 Compare May 27, 2026 10:12
@jacderida jacderida merged commit 0201c23 into rc-2026.5.4 May 27, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants