test: extract site explorer power shelf tests#2250
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis PR consolidates test infrastructure by establishing a centralized ChangesAPI Core Test Support Consolidation
New Test-Harness Crate Infrastructure
Power-Shelf Test Suite Migration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes This PR exhibits substantial scope: a new crate with non-trivial test infrastructure (harness builder with background task management, state controller iteration, RPC integration), significant test suite migration (removal of 13 tests from api-core, addition of 12 comprehensive integration tests in site-explorer), and API surface expansion. The test-harness builder requires careful review of lifecycle management (cancellation tokens, JoinSet cleanup, drop guards), transaction handling, and metric wiring. The power-shelf test suite is dense with multiple independent test scenarios, database assertions, metric validation, and state machine behavior. Cross-module refactoring (endpoint-explorer re-exports, logging consolidation) adds moderate coupling complexity. Heterogeneous changes across configuration, types, infrastructure, and integration tests require separate reasoning per layer. Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (10)
crates/site-explorer/tests/power_shelf.rs (8)
1305-1314: 💤 Low valueRemove commented-out assertion code.
This large block of commented-out state version verification logic should be deleted if it's no longer needed, or re-enabled if it's required for the test.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 1305 - 1314, Remove the large commented-out assertion block that builds state_versions from final_state (the commented lines referencing state_versions, final_state, and the assert! about state_versions.len() > 1); either delete these comment lines entirely if the verification is no longer needed, or re-enable them by uncommenting and ensuring the test compiles and uses state_versions/state_version properly inside the test function where final_state is available.
1087-1094: 💤 Low valueRemove commented-out code and second early return.
Lines 1089-1091 contain commented-out debug
println!statements that should be deleted. Lines 1092-1094 contain anotherif 1 == 1 { return Ok(()); }block that's redundant given the earlier return at line 1067-1069. Clean up this section entirely.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 1087 - 1094, Remove the commented-out debug loop and the redundant early-return by deleting the lines containing the commented println!/env.run_power_shelf_controller_iteration() block and the `if 1 == 1 { return Ok(()); }` statement so the test no longer contains dead commented code or a second early return; search for the `println!`/`env.run_power_shelf_controller_iteration().await` snippet and the `if 1 == 1` block in the test and remove them, leaving the function flow as originally intended with only the earlier return.
254-366: 💤 Low valueRemove commented-out code.
Line 350 contains a commented-out call to
explorer.run_single_iteration().await.unwrap();. Either remove this line entirely or document why it's disabled with a clear TODO/FIXME comment.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 254 - 366, In test_site_explorer_power_shelf_discovery_with_static_ip remove the commented-out call to explorer.run_single_iteration().await.unwrap(); (or replace it with a one-line TODO/FIXME explaining why it's disabled) so there is no stray commented code left in the test; locate the commented line referencing explorer.run_single_iteration() and either delete it or add a concise TODO/FIXME comment directly above the line documenting the reason for disabling the extra iteration.
1481-1491: ⚡ Quick winAvoid unused variable pattern.
Line 1481 uses
let _history_by_ids = ...followed by commented-out assertions at lines 1489-1491. If this query result is genuinely unused, uselet _ = ...(without a variable name) to make the intent explicit. Otherwise, re-enable the assertions or remove the query entirely.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 1481 - 1491, The local binding `_history_by_ids` is unused after calling `db::state_history::find_by_object_ids` in the test; either replace `let _history_by_ids = ...` with `let _ = db::state_history::find_by_object_ids(...).await?;` to explicitly discard the result, or re-enable the commented assertions that reference `history_by_ids` (and rename `_history_by_ids` to `history_by_ids`) so the query result is actually asserted against; update the call site around `db::state_history::find_by_object_ids`, the `_history_by_ids` binding, and the commented lines that assert `contains_key(&power_shelf1_id)`/`contains_key(&power_shelf2_id)` accordingly.
1700-1720: ⚖️ Poor tradeoffFragile JSON comparison using string manipulation.
Lines 1701-1702 and 1717-1719 compare JSON by removing spaces with
.replace(" ", ""). This approach is brittle and fails if the serialization format changes in other ways (key ordering, escaping, etc.). Consider either:
- Deserializing both strings back to the state enum and comparing structurally
- Using a proper JSON equality library
- Comparing the typed
PowerShelfControllerStatevalues directly without round-tripping through JSON🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 1700 - 1720, The test currently compares JSON strings by stripping spaces (history_entry.state.replace(" ", "") vs serde_json::to_string(&state)?), which is brittle; instead parse the stored JSON and compare structurally — either deserialize history_entry.state into the concrete type (e.g., PowerShelfControllerState) and assert equality with state, or deserialize both into serde_json::Value via serde_json::from_str and compare the Value to serde_json::to_value(&state). Update the assertions around history_entry.state and found_entry.unwrap().state to perform these structural deserializations/comparisons rather than string replacement.
1279-1286: 💤 Low valueRemove commented-out code and redundant early return.
Lines 1281-1283 contain commented-out iteration code. Lines 1284-1286 contain another
if 1 == 1 { return Ok(()); }block that's redundant. Remove this entire section.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 1279 - 1286, Remove the commented-out loop and the redundant early-return block: delete the three commented lines calling env.run_power_shelf_controller_iteration() and the subsequent if 1 == 1 { return Ok(()); } so the test continues normally; look for the env.run_power_shelf_controller_iteration() commented loop and the literal if 1 == 1 return Ok(()) and remove those lines together.
127-1755: 🏗️ Heavy liftConsider extracting common test setup patterns.
The test suite contains significant code duplication:
EndpointExplorationReportconstruction (appears ~11 times with similar fields)- DHCP discovery + power shelf creation flow (repeated in multiple tests)
- Expected power shelf database seeding
Extract these into helper methods on
Envor standalone functions to improve maintainability. For example:impl Env { fn create_basic_exploration_report(&self, serial: &str, model: &str) -> EndpointExplorationReport { EndpointExplorationReport { endpoint_type: EndpointType::Bmc, vendor: Some(bmc_vendor::BMCVendor::Nvidia), systems: vec![ComputerSystem { serial_number: Some(serial.to_string()), ..Default::default() }], chassis: vec![Chassis { model: Some(model.to_string()), ..Default::default() }], model: Some(model.to_string()), ..Default::default() } } async fn setup_power_shelf_with_dhcp(&self, power_shelf: &mut FakePowerShelf) -> Result<(), Box<dyn std::error::Error>> { let response = self.api() .discover_dhcp( DhcpDiscovery::builder( power_shelf.bmc_mac_address.to_string(), power_shelf.relay_address.to_string(), ) .tonic_request(), ) .await? .into_inner(); power_shelf.ip = response.address; Ok(()) } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 127 - 1755, Several tests duplicate constructing EndpointExplorationReport, running DHCP discovery to set FakePowerShelf.ip, and seeding expected_power_shelf rows; extract helpers to reduce duplication: add Env::create_basic_exploration_report(&self, serial: &str, model: &str) returning EndpointExplorationReport and Env::setup_power_shelf_with_dhcp(&self, power_shelf: &mut FakePowerShelf) -> Result<(), Box<dyn std::error::Error>> to perform the DhcpDiscovery call and set power_shelf.ip, plus a helper Env::seed_expected_power_shelf(&self, power_shelf: &FakePowerShelf) that wraps db::expected_power_shelf::create; update tests like test_site_explorer_power_shelf_discovery, test_site_explorer_power_shelf_with_static_ip, test_site_explorer_power_shelf_creation_limit, etc. to call these helpers and replace the repeated EndpointExplorationReport literals with create_basic_exploration_report calls.
1513-1520: 💤 Low valueRemove commented-out iteration code.
Lines 1515-1517 contain commented-out controller iteration calls that should be deleted.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/site-explorer/tests/power_shelf.rs` around lines 1513 - 1520, Delete the commented-out controller iteration block (the three lines calling env.run_power_shelf_controller_iteration() that are prefixed with //) so the test no longer contains dead commented code; keep the surrounding TODO comment if desired and leave the remaining early return logic unchanged—look for the env.run_power_shelf_controller_iteration() calls to locate the commented lines to remove.crates/test-harness/src/network/controller.rs (1)
68-110: ⚡ Quick winDocument the state iteration count requirement.
Both
create_underlay_segment()andcreate_admin_segment()run exactly 2 state controller iterations (lines 101-102, 145-146) without explaining why this specific count is needed. This "magic number" should be documented to clarify the provisioning lifecycle requirements.📋 Suggested documentation
let segment = self .api .create_network_segment(tonic::Request::new(request)) .await .expect("Unable to create network segment") .into_inner(); + // Run two iterations to fully provision the segment: + // 1st iteration: allocate resources (VLAN, VNI) + // 2nd iteration: transition to active state self.run_single_iteration().await; self.run_single_iteration().await; TestNetworkSegment {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/test-harness/src/network/controller.rs` around lines 68 - 110, Add a brief explanatory comment above the two run_single_iteration().await calls in both create_underlay_segment() and create_admin_segment() that documents why exactly two state controller iterations are required for provisioning (e.g., first iteration to enqueue/apply the network creation and a second to reconcile and persist resulting state/allocate IPs), mention any assumptions (such as async job scheduling or reconciliation semantics), and reference run_single_iteration() so future readers understand this “magic number” is intentional and tied to the controller’s two-step provisioning lifecycle.crates/test-harness/src/lib.rs (1)
57-72: 💤 Low valueConsider consolidating error handling in test setup.
The
test_domain()method chains multiple.unwrap()calls, including a particularly awkward.map().unwrap().unwrap()pattern at lines 68-70. While panicking on test setup failure is acceptable, the readability could be improved.♻️ Suggested refactor for clarity
pub async fn test_domain(&self) -> TestDomain { let name = "testharness.example.com"; - let id = self + let response = self .api .create_domain(Request::new(rpc::protos::dns::CreateDomainRequest { name: name.to_string(), })) .await - .unwrap() - .into_inner() - .id - .map(::carbide_uuid::domain::DomainId::try_from) - .unwrap() - .unwrap(); + .expect("create_domain RPC failed"); + + let proto_id = response + .into_inner() + .id + .expect("create_domain returned no id"); + + let id = ::carbide_uuid::domain::DomainId::try_from(proto_id) + .expect("invalid DomainId from create_domain"); + TestDomain { id, name } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/test-harness/src/lib.rs` around lines 57 - 72, The test_domain function uses multiple chained .unwrap() calls (notably the .map(::carbide_uuid::domain::DomainId::try_from).unwrap().unwrap()) which hurts readability; replace these with explicit, contextual failures by using expect() with clear messages on the RPC/result unwraps and convert the map/unwrap/unwrap into .map(|s| ::carbide_uuid::domain::DomainId::try_from(s).expect("failed to parse DomainId from response")) (or alternatively pattern-match the Option/Result to produce a clear expect message) so callers still panic on setup failure but the failure points (create_domain RPC, missing id, parse error) are explicit; update test_domain, create_domain call handling and the DomainId conversion accordingly while preserving the returned TestDomain { id, name }.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/api-core/src/tests/site_explorer.rs`:
- Line 3239: Remove the redundant local import of the Forge trait by deleting
the line that says "use rpc::forge::forge_server::Forge;" in this test file; the
Forge trait is already imported at the module level, so remove the duplicate to
avoid unused/duplicate imports while leaving the module-level import intact.
- Around line 3292-3303: The test function
test_get_machine_position_info_no_endpoint is misnamed and miscommented because
create_managed_host actually creates an explored endpoint; rename the test
(e.g., test_get_machine_position_info_endpoint_without_position) and update the
leading comment to state that an explored endpoint exists but contains no
position data, and remove the redundant use rpc::forge::forge_server::Forge
import (duplicate of the previous test) so imports are not repeated; keep
existing variables like dpu_machine_id and the rest of the test logic unchanged.
In `@crates/site-explorer/tests/power_shelf.rs`:
- Around line 1319-1520: The test_power_shelf_state_history_multiple test is
prematurely short-circuited by the unconditional early return (if 1 == 1 {
return Ok(()); }), so the later state-history assertions never run; remove the
early-return block (the "if 1 == 1 { return Ok(()); }" stub) and either
implement the intended controller iterations and assertions (call
env.run_power_shelf_controller_iteration() the required number of times and then
run the assertions that follow) or, if not ready, mark the test with #[ignore]
and keep the rest of the test intact so the current assertions are either
executed or intentionally skipped.
- Around line 929-1069: The test_site_explorer_creates_power_shelf test is
short-circuited by the early return (if 1 == 1 { return Ok(()); }) which
prevents the final controller state assertions from running; either remove that
conditional return and call/run the controller iteration
(env.run_power_shelf_controller_iteration().await) and then re-enable the
subsequent assertions that verify state transitions, or mark the test with
#[ignore] and add a TODO explaining which state-machine wiring is required to
enable the controller assertions; update references in the test to
test_site_explorer_creates_power_shelf,
env.run_power_shelf_controller_iteration, and the post-return assertions
accordingly.
- Around line 1117-1247: The test_power_shelf_state_history test is being
short-circuited by the unconditional early return (if 1 == 1 { return Ok(());
}), so remove that guard and either (A) invoke the real controller iteration
(call env.run_power_shelf_controller_iteration().await; or the proper async
method that advances the power shelf state machine) and then run the remaining
state history assertions against db::state_history::for_object, or (B) if the
state machine isn't ready, mark the test with #[ignore] and add a brief comment
explaining why; ensure references to created_power_shelf.id / power_shelf_id and
the post-iteration assertions remain reachable.
In `@crates/test-harness/Cargo.toml`:
- Around line 18-23: The Cargo.toml for the crate named "carbide-test-harness"
is missing the required package description; open the [package] block in that
Cargo.toml and add a descriptive description field (e.g., description = "Short
one-line summary of the crate's purpose") so the package metadata includes a
clear description for the carbide-test-harness crate.
---
Nitpick comments:
In `@crates/site-explorer/tests/power_shelf.rs`:
- Around line 1305-1314: Remove the large commented-out assertion block that
builds state_versions from final_state (the commented lines referencing
state_versions, final_state, and the assert! about state_versions.len() > 1);
either delete these comment lines entirely if the verification is no longer
needed, or re-enable them by uncommenting and ensuring the test compiles and
uses state_versions/state_version properly inside the test function where
final_state is available.
- Around line 1087-1094: Remove the commented-out debug loop and the redundant
early-return by deleting the lines containing the commented
println!/env.run_power_shelf_controller_iteration() block and the `if 1 == 1 {
return Ok(()); }` statement so the test no longer contains dead commented code
or a second early return; search for the
`println!`/`env.run_power_shelf_controller_iteration().await` snippet and the
`if 1 == 1` block in the test and remove them, leaving the function flow as
originally intended with only the earlier return.
- Around line 254-366: In
test_site_explorer_power_shelf_discovery_with_static_ip remove the commented-out
call to explorer.run_single_iteration().await.unwrap(); (or replace it with a
one-line TODO/FIXME explaining why it's disabled) so there is no stray commented
code left in the test; locate the commented line referencing
explorer.run_single_iteration() and either delete it or add a concise TODO/FIXME
comment directly above the line documenting the reason for disabling the extra
iteration.
- Around line 1481-1491: The local binding `_history_by_ids` is unused after
calling `db::state_history::find_by_object_ids` in the test; either replace `let
_history_by_ids = ...` with `let _ =
db::state_history::find_by_object_ids(...).await?;` to explicitly discard the
result, or re-enable the commented assertions that reference `history_by_ids`
(and rename `_history_by_ids` to `history_by_ids`) so the query result is
actually asserted against; update the call site around
`db::state_history::find_by_object_ids`, the `_history_by_ids` binding, and the
commented lines that assert
`contains_key(&power_shelf1_id)`/`contains_key(&power_shelf2_id)` accordingly.
- Around line 1700-1720: The test currently compares JSON strings by stripping
spaces (history_entry.state.replace(" ", "") vs serde_json::to_string(&state)?),
which is brittle; instead parse the stored JSON and compare structurally —
either deserialize history_entry.state into the concrete type (e.g.,
PowerShelfControllerState) and assert equality with state, or deserialize both
into serde_json::Value via serde_json::from_str and compare the Value to
serde_json::to_value(&state). Update the assertions around history_entry.state
and found_entry.unwrap().state to perform these structural
deserializations/comparisons rather than string replacement.
- Around line 1279-1286: Remove the commented-out loop and the redundant
early-return block: delete the three commented lines calling
env.run_power_shelf_controller_iteration() and the subsequent if 1 == 1 { return
Ok(()); } so the test continues normally; look for the
env.run_power_shelf_controller_iteration() commented loop and the literal if 1
== 1 return Ok(()) and remove those lines together.
- Around line 127-1755: Several tests duplicate constructing
EndpointExplorationReport, running DHCP discovery to set FakePowerShelf.ip, and
seeding expected_power_shelf rows; extract helpers to reduce duplication: add
Env::create_basic_exploration_report(&self, serial: &str, model: &str) returning
EndpointExplorationReport and Env::setup_power_shelf_with_dhcp(&self,
power_shelf: &mut FakePowerShelf) -> Result<(), Box<dyn std::error::Error>> to
perform the DhcpDiscovery call and set power_shelf.ip, plus a helper
Env::seed_expected_power_shelf(&self, power_shelf: &FakePowerShelf) that wraps
db::expected_power_shelf::create; update tests like
test_site_explorer_power_shelf_discovery,
test_site_explorer_power_shelf_with_static_ip,
test_site_explorer_power_shelf_creation_limit, etc. to call these helpers and
replace the repeated EndpointExplorationReport literals with
create_basic_exploration_report calls.
- Around line 1513-1520: Delete the commented-out controller iteration block
(the three lines calling env.run_power_shelf_controller_iteration() that are
prefixed with //) so the test no longer contains dead commented code; keep the
surrounding TODO comment if desired and leave the remaining early return logic
unchanged—look for the env.run_power_shelf_controller_iteration() calls to
locate the commented lines to remove.
In `@crates/test-harness/src/lib.rs`:
- Around line 57-72: The test_domain function uses multiple chained .unwrap()
calls (notably the
.map(::carbide_uuid::domain::DomainId::try_from).unwrap().unwrap()) which hurts
readability; replace these with explicit, contextual failures by using expect()
with clear messages on the RPC/result unwraps and convert the map/unwrap/unwrap
into .map(|s| ::carbide_uuid::domain::DomainId::try_from(s).expect("failed to
parse DomainId from response")) (or alternatively pattern-match the
Option/Result to produce a clear expect message) so callers still panic on setup
failure but the failure points (create_domain RPC, missing id, parse error) are
explicit; update test_domain, create_domain call handling and the DomainId
conversion accordingly while preserving the returned TestDomain { id, name }.
In `@crates/test-harness/src/network/controller.rs`:
- Around line 68-110: Add a brief explanatory comment above the two
run_single_iteration().await calls in both create_underlay_segment() and
create_admin_segment() that documents why exactly two state controller
iterations are required for provisioning (e.g., first iteration to enqueue/apply
the network creation and a second to reconcile and persist resulting
state/allocate IPs), mention any assumptions (such as async job scheduling or
reconciliation semantics), and reference run_single_iteration() so future
readers understand this “magic number” is intentional and tied to the
controller’s two-step provisioning lifecycle.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4790ace5-95d1-4ea4-a758-15de3d1d2bfe
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
crates/api-core/src/test_support/endpoint_explorer.rscrates/api-core/src/test_support/mod.rscrates/api-core/src/tests/common/api_fixtures/mod.rscrates/api-core/src/tests/mod.rscrates/api-core/src/tests/site_explorer.rscrates/site-explorer/Cargo.tomlcrates/site-explorer/tests/power_shelf.rscrates/test-harness/Cargo.tomlcrates/test-harness/src/builder.rscrates/test-harness/src/dns.rscrates/test-harness/src/lib.rscrates/test-harness/src/network/controller.rscrates/test-harness/src/network/mod.rscrates/test-harness/src/network/segment.rscrates/test-harness/src/prelude.rs
b5a448b to
fa2e40f
Compare
fa2e40f to
c0cbea2
Compare
|
🌿 Preview your docs: https://nvidia-preview-pull-request-2250.docs.buildwithfern.com/infra-controller |
Move power shelf integration coverage out of api-core site explorer tests into the site-explorer crate, and introduce a shared test harness foundation for cross-crate test setup. Signed-off-by: Dmitry Porokh <dporokh@nvidia.com>
c0cbea2 to
2dedd22
Compare
Description
Move power shelf integration coverage out of api-core site explorer tests into the site-explorer crate, and introduce a shared test harness foundation for cross-crate test setup.
Type of Change
Related Issues (Optional)
#2001
Breaking Changes
Testing
Additional Notes