net: add a new flag NET_FLAG_INCLUDE_VNET_HEADER#493
net: add a new flag NET_FLAG_INCLUDE_VNET_HEADER#493akerouanton wants to merge 1 commit intocontainers:mainfrom
Conversation
mtjhrc
left a comment
There was a problem hiding this comment.
Thanks for PR, it certainly makes sense to expose the header to interested backends.
I am curious though, could you share more detail, what backend you would use with this?
Please fix the mentioned confusion between stream/datagram sockets and features/flags. Also I guess it would be nice to also implement this for stream sockets too, but not strictly necessary.
src/libkrun/src/lib.rs
Outdated
| if (features & !(NET_ALL_FEATURES | NET_FLAG_INCLUDE_VNET_HEADER)) != 0 { | ||
| return -libc::EINVAL; | ||
| } |
There was a problem hiding this comment.
This check is incorrect on 2 levels:
- as it stands this PR only adds support for this feature for unix datagrams, not streams (which this function is configuring) .
- the NET_FLAG_INCLUDE_VNET_HEADER is a flag not a feature (as correctly in the krun_add_net_unixgram).
| return -libc::EINVAL; | ||
| } | ||
|
|
||
| let include_vnet_header: bool = flags & NET_FLAG_INCLUDE_VNET_HEADER != 0; |
There was a problem hiding this comment.
As implied above, include_vnet_header, should always be false here, because it's not implemented for the stream socket.
| let backend = if let Some(path) = path { | ||
| VirtioNetBackend::UnixgramPath(path, send_vfkit_magic) | ||
| } else { | ||
| VirtioNetBackend::UnixgramFd(fd) | ||
| }; |
There was a problem hiding this comment.
If you only implement the flag for the unixgram sockets, I think it would be better to put the bool inside the enum (like send_vfkit_magic is).
Note: I am not sure why send_vfkit_magic would not passed into VirtioNetBackend::UnixgramFd, that seems like a bug.
I tried for a couple hours to get a kernel stacktrace to pinpoint exactly where's the issue and why this PR fixes it, but to no avail. Let me describe that with plain words instead, but let me know if you want something more concrete. I don't plan to use this flag with any existing backend right now. I'm working on a container runtime that leverages https://github.com/containerd/nerdbox and has its own virtual switch that connects multiple VMs. Every VM gets one or more virtio-net devices, and each interface is connected to a bridge interface (i.e. With TSO enabled, VM-to-VM connections are working correctly — packets are GSO'd on the sending side, and GRO'd on the receiving side. However, container-to-container connections are broken. After some debugging, I realized that disabling GSO on the sending veth fixed the issue and that was probably caused by invalid GSO fields in virtio-net headers. I believe that with the above topology GRO is supposed to happen at the veth level (although I might be wrong), and missing GSO fields would cause With this change, my virtual switch is forwarding virtio-net headers sent by the sending side without recomputing L4 checksums (which was required for VM-to-VM scenarios to work), and it fixes container-to-container communications too. I also think that this should help with host-to-[VM|container] communications as it'd allow to send large frames with partial checksums.
Sorry for the nasty mistakes, let me fix them! |
420da29 to
ab2fe8b
Compare
|
@mtjhrc I added
The Anyway, this should be in a better shape. PTAL. |
|
@mtjhrc PTAAL |
|
Hi, sorry for the late reply. |
This flag should be used to indicate to libkrun that downstream network backend wants to receive and transmit the virtio-net header along with Ethernet frames. Network backends using this flag can then forward unmodified headers to another VM or build a sensible virtio_net_hdr (e.g. with GSO fields correctly set) such that receiving VM handles GSO'd frames properly. Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
ab2fe8b to
fa669df
Compare
|
@mtjhrc @slp Do you plan to get #530 ready and merge it any time soon (i.e. before the next minor)? Otherwise, maybe we can consider merging this PR first and then get #530 in a subsequent release?
I fixed that in a previous iteration, and fixed conflicts yesterday. I think it's good to go now. |
I think neither this one nor #530 qualify for a minor release. This will need to wait for 1.18, sorry. |
|
To clarify, when I said next minor, I meant 1.18 not 1.17.x 🙂 |
We should be able to get #530 before 1.18 anyway, but just in case: Regarding this PR: Since you care about the The unixgram implementation looks OK. |
I'd prefer not touching net code in 1.17, but it'll definitely go in 1.18. |
I meant merged into main before 1.18 release, not released as 1.17.x 😅 |
This flag should be used to indicate to libkrun that downstream network backend wants to receive and transmit the virtio-net header along with Ethernet frames.
Network backends using this flag can then forward unmodified headers to another VM or build a sensible virtio_net_hdr (e.g. with GSO fields correctly set) such that receiving VM handles GSO'd frames properly.