
[shimV2] added network controller implementation #2633

Open
rawahars wants to merge 1 commit into microsoft:main from rawahars:network-controller

Conversation

@rawahars
Contributor

Summary

This change adds the network controller implementation for V2 shims, which manages the network lifecycle for a single pod running inside a UVM. The implementation provides a clear lifecycle state machine and separates platform-specific logic for LCOW and WCOW. The controller will be initialized from VMController, which injects the low-level managers that perform VM host and guest network operations.

The main changes are grouped below.

Network controller implementation:

  • Implemented the Controller interface and its concrete Manager type, providing Setup and Teardown methods to manage HCN namespaces and endpoints for a pod (internal/controller/network/interface.go, internal/controller/network/network.go).

Platform-specific guest operations:

  • Added platform-specific files for LCOW and WCOW, implementing guest-side network namespace and endpoint management with proper separation via build tags (internal/controller/network/network_lcow.go, internal/controller/network/network_wcow.go).

Lifecycle state management:

  • Defined a State type to track the network lifecycle, including transitions for setup, teardown, and error handling (internal/controller/network/state.go).
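The lifecycle state machine described above is only partially visible in the diff excerpts further down (only `StateInvalid` appears there). As a rough sketch of what such a `State` type might look like, with the other state names and transitions being illustrative assumptions rather than the PR's actual definitions:

```go
package main

import "fmt"

// State tracks where a pod's network is in its lifecycle. Only StateInvalid
// is visible in the diff excerpts; the other states here are assumptions.
type State int

const (
	StateUninitialized State = iota // controller created, Setup not yet called
	StateConfigured                 // Setup completed successfully
	StateTornDown                   // Teardown completed successfully
	StateInvalid                    // an error left the network partially configured
)

func (s State) String() string {
	switch s {
	case StateUninitialized:
		return "Uninitialized"
	case StateConfigured:
		return "Configured"
	case StateTornDown:
		return "TornDown"
	case StateInvalid:
		return "Invalid"
	default:
		return "Unknown"
	}
}

func main() {
	// Typical happy path: Uninitialized -> Configured -> TornDown.
	fmt.Println(StateUninitialized, "->", StateConfigured, "->", StateTornDown)
}
```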


Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
@rawahars rawahars requested a review from a team as a code owner March 17, 2026 21:57
// SetupOptions holds the configuration required to set up the network for a pod.
type SetupOptions struct {
	// PodID is the identifier of the pod whose network is being configured.
	PodID string
Contributor

Why does the network need the PodID? Should it be a PodController -> Network mapping instead?

Contributor Author

The PodID is logged by the controller to provide immediate correlation between log messages, namespaces, and their respective pods in multi-pod deployments.

The network controller would be part of the PodController instance itself.


// Hot-add all endpoints in the namespace to the guest.
for _, endpoint := range endpoints {
	nicGUID, err := guid.NewV4()
Contributor

The old code did this too. Do we have to generate new GUIDs, or can we use the same ones from the endpoint so that the logs are easier to correlate? At the very least we need to log a pivot trace here: this endpoint ID is this new GUID.

Contributor Author

As we discussed for LM, UVM resources would have separate IDs from those on the host.
I can add both IDs to the context so that the same gets logged within GCS.
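The pivot trace the reviewer asks for could look roughly like the sketch below: generate a fresh NIC GUID for the UVM side and emit one log line tying it to the host-side endpoint ID. `newNICID` is a stdlib stand-in for `guid.NewV4` from go-winio, and the endpoint ID shown is a made-up example.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newNICID is a stand-in for guid.NewV4 (github.com/Microsoft/go-winio/pkg/guid):
// it produces a random version-4 GUID string.
func newNICID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	hostEndpointID := "aaaabbbb-cccc-dddd-eeee-ffff00001111" // example host-side HCN endpoint ID
	nicID, err := newNICID()
	if err != nil {
		panic(err)
	}
	// Pivot trace: one line correlating the host endpoint with the new UVM NIC GUID.
	fmt.Printf("hot-add endpoint=%s as nicID=%s\n", hostEndpointID, nicID)
}
```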


if len(teardownErrs) > 0 {
	// If any errors were encountered during teardown, mark the state as invalid.
	m.netState = StateInvalid
Contributor

Are retries safe in this state?

Contributor Author

Yes. The Invalid state is meant to signify that the container is not in a valid Configured state.
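A retry-safe teardown of this shape could be sketched as below. This is an assumption about the design, not the PR's code: the manager keeps tracking only the endpoints that failed to delete, marks itself `StateInvalid`, and a retry simply re-attempts what is left. `deleteEndpoint` and the `failOnce` knob are hypothetical stand-ins for the real HCN delete call and a transient failure.

```go
package main

import (
	"errors"
	"fmt"
)

type State int

const (
	StateConfigured State = iota
	StateInvalid
	StateTornDown
)

type Manager struct {
	netState  State
	endpoints []string        // endpoint IDs still attached
	failOnce  map[string]bool // simulates a transient failure (illustration only)
}

// deleteEndpoint is a hypothetical stand-in for the real HCN endpoint delete.
func (m *Manager) deleteEndpoint(id string) error {
	if m.failOnce[id] {
		delete(m.failOnce, id)
		return errors.New("transient failure deleting " + id)
	}
	return nil
}

func (m *Manager) Teardown() error {
	var remaining []string
	var teardownErrs []error
	for _, ep := range m.endpoints {
		if err := m.deleteEndpoint(ep); err != nil {
			teardownErrs = append(teardownErrs, err)
			remaining = append(remaining, ep) // keep it for the next retry
		}
	}
	m.endpoints = remaining
	if len(teardownErrs) > 0 {
		// Invalid means "not in a valid Configured state"; retrying just
		// re-attempts the endpoints that are still tracked.
		m.netState = StateInvalid
		return errors.Join(teardownErrs...)
	}
	m.netState = StateTornDown
	return nil
}

func main() {
	m := &Manager{
		netState:  StateConfigured,
		endpoints: []string{"ep-1", "ep-2"},
		failOnce:  map[string]bool{"ep-2": true},
	}
	err := m.Teardown()
	fmt.Println("first attempt failed:", err != nil, "state invalid:", m.netState == StateInvalid)
	err = m.Teardown()
	fmt.Println("retry succeeded:", err == nil, "torn down:", m.netState == StateTornDown)
}
```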

// Ensure the endpoint named "eth0" is added first when multiple endpoints are present,
// so it maps to eth0 inside the guest. CNI results aren't available here, so we rely
// on the endpoint name suffix as a heuristic.
cmp := func(a, b *hcn.HostComputeEndpoint) int {
Contributor

Uh... this won't work for multi-pod. Do we need to maintain this nomenclature?

Contributor Author

It would work for multi-pod since each pod gets its own controller. This sorting happens on the host side so that we can determine which NIC is added first. Once the NIC is added to the UVM, we send a guest request to move it into the pod namespace. Therefore, even if the NIC is the third one on the guest side, it would be the first to be moved into the pod namespace and hence becomes eth0 in that namespace.
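The heuristic under discussion could be sketched as a stable sort that moves the endpoint whose name ends in "eth0" to the front. `hostComputeEndpoint` below is a stdlib stand-in for `hcn.HostComputeEndpoint`, and the comparison logic is an assumption based on the comment in the diff, not the PR's actual code.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// hostComputeEndpoint stands in for hcn.HostComputeEndpoint.
type hostComputeEndpoint struct {
	Name string
}

// sortEndpoints orders endpoints so the one whose name ends in "eth0" is
// hot-added first and therefore becomes eth0 inside the pod's namespace.
// The sort is stable, so the relative order of other endpoints is kept.
func sortEndpoints(endpoints []*hostComputeEndpoint) {
	cmp := func(a, b *hostComputeEndpoint) int {
		aEth0 := strings.HasSuffix(a.Name, "eth0")
		bEth0 := strings.HasSuffix(b.Name, "eth0")
		switch {
		case aEth0 && !bEth0:
			return -1
		case bEth0 && !aEth0:
			return 1
		default:
			return 0
		}
	}
	slices.SortStableFunc(endpoints, cmp)
}

func main() {
	eps := []*hostComputeEndpoint{
		{Name: "pod-eth2"}, {Name: "pod-eth0"}, {Name: "pod-eth1"},
	}
	sortEndpoints(eps)
	for _, ep := range eps {
		fmt.Println(ep.Name)
	}
	// Prints pod-eth0 first; pod-eth2 and pod-eth1 keep their relative order.
}
```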
