feat: Initial changes for Astra support#2216
Conversation
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
/ok to test f65bcb9 |
|
🌿 Preview your docs: https://nvidia-preview-pull-request-2216.docs.buildwithfern.com/infra-controller |
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-06-04 22:06:16 UTC | Commit: f65bcb9 |
|
/ok to test 37ce636 |
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
|
/ok to test eec95f7 |
| #[clap(help = "PCI name (e.g. 5e:00.0)")] | ||
| pub pci_name: String, | ||
| #[clap(help = "Interface type (e.g. SVPC or ASTRA)")] | ||
| pub interface_type: i32, |
There was a problem hiding this comment.
this should be able to be a DpaInterfaceType -- there will be the correct implementations on it to allow it to naturally be set as svpc or astra (or whatever the Display variants are) -- and then you get validation for free
| CREATE TYPE dpa_interface_type AS ENUM ('Svpc', 'Astra'); | ||
|
|
||
| ALTER TABLE dpa_interfaces | ||
| ADD COLUMN interface_type dpa_interface_type NOT NULL; |
There was a problem hiding this comment.
...so i guess we're just standardizing as "DPA" on the name for E/W then? lol. it works. since S-VPC is the new FW name, it's easy to roll off the tongue to say "DPA" as E/W.
..but this also means, what if we have an E/W card that we won't run SVPC on, and its not ASTRA, but we want to manage it? I think it makes sense to have some other variant that still gives us the ability to flash stock firmware, put mlxconfig parameters on it, and lock it down, even if its not SVPC.
| client: Arc<MqtteaClient>, | ||
| dpa_info: &Arc<DpaInfo>, | ||
| hb_interval: TimeDelta, | ||
| metrics: &mut DpaMonitorMetrics, |
There was a problem hiding this comment.
maybe we could break this down into a couple of "contexts" (like a context with services, and a context with data), so we could get rid of #[allow(clippy::too_many_arguments)] -- or if any of those could be a part of the SvpcInterfaceHandler then you could just refer to them as self.
|
|
||
| let mut vni = 0_u32; | ||
|
|
||
| let mut this_nic_configured_attachments = spx_config |
There was a problem hiding this comment.
yeah again, we do try to avoid mut -- i think it's even in the STYLE_GUIDE.md -- sometimes its an annoying exercise, sometimes its a fun exercise, sometimes its a hopeless exercise, but if you can rewrite the flow such that you avoid mut, it usually comes out very idiomatic
| .to_string(), | ||
| )); | ||
| } | ||
|
|
There was a problem hiding this comment.
So one thing you could do is express this as "configured" and "observed" -- you already got the "observed" part, so rephrase the top half as the "configured" half and it will naturally all come together later. Like you could introduce a new at_most_one:
fn at_most_one<I: Iterator>(mut iter: I, ctx: &str) -> DpaManagerResult<Option<I::Item>> {
let first = iter.next();
if first.is_some() && iter.next().is_some() {
tracing::error!("{ctx}: more than one match");
return Err(DpaManagerError::InvalidArgument(format!("{ctx}: more than one match")));
}
Ok(first)
}
...and let configured become:
let configured = at_most_one(
instance.config.spxconfig.spx_attachments
.iter()
.filter(|a| a.mac_address.as_deref() == Some(this_mac_str.as_str())),
"reconcile_assigned_state configured attachments",
)?;
...and then observed becomes:
let observed = at_most_one(
machine.spx_status_observation
.iter()
.flat_map(|o| &o.spx_attachments)
.filter(|a| a.mac_address == this_mac),
"reconcile_assigned_state observed attachments",
)?;
and then your need_creation can turn into an action of sorts match configured?
- Nothing configured, then the action is SetVni 0
- Something configured, then look up the VNI and see if it needs to be SetVni or a Heartbeat.
And then based on the action, you either monitor.send_set_vni_command or monitor.do_heartbeat.
The action could probably be some type of new MonitorAction enum with SetVni(vni, version) or Heartbeat(vni) variants, so then the match on action can just match on the enum type.
Sorry, lol. Super long feedback on this.
| DpaInterfaceType interface_type = 21; | ||
| } | ||
|
|
||
| enum DpaInterfaceType { |
There was a problem hiding this comment.
I think 0 should be our basic E/W NIC type, not SVPC or ASTRA (e.g. don't default to SVPC).
|
|
||
| enum DpaInterfaceType { | ||
| Svpc = 0; | ||
| Astra = 1; |
There was a problem hiding this comment.
you can SVPC and ASTRA these -- Rust will naturally stub them out as Svpc and Astra. If you want to be compatible with other proto parsing libraries (whatever the C one is), these should be:
DPA_INTERFACE_TYPE_SVPC
DPA_INTERFACE_TYPE_ASTRA
Rust knows the right thing to do and still calls them Svpc and Astra in its generated code.
Signed-off-by: Srinivasa Murthy <srmurthy@nvidia.com>
Description
Changes to core to support both SVPC and ASTRA. SVPC is code complete. Some initial changes for Astra.
Type of Change
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes