netvsp: Add VLAN support#3417
Conversation
Add a new VMM test that validates guest-side VLAN (802.1Q) sub-interface creation and configuration on the synthetic NIC (netvsp). The test: - Boots a Linux VM with a NIC (via consomme backend) - Finds the NIC by MAC address in sysfs - Creates a VLAN sub-interface (ID 100) using ip-link - Verifies 802.1Q protocol and VLAN ID in interface details - Assigns an IP address and brings the interface up - Sends TX smoke traffic (ping) to exercise the netvsp VLAN PPI path - Verifies at least one TX packet was transmitted - Confirms the parent interface remains operational - Cleans up the VLAN interface This exercises the guest netvsc driver's VLAN support and the netvsp VLAN PPI (Per-Packet Information) metadata extraction path. Full end-to-end VLAN datapath validation would require a VLAN-aware backend; the current consomme backend ignores VLAN metadata, so this test focuses on guest-side configuration correctness and TX path stability. Also fix a pre-existing build error in gdma/src/resolver.rs where the cfg(test)-gated reject_tx_with_vlan_error field was referenced unconditionally in struct initialization. The test took a little over 2m running under WSL2, which seems long. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the `reject_tx_with_vlan_error` field from VportConfig, Vport, and TxRxTask, which was gated behind #[cfg(test)] and caused net_mana tests to fail to compile (gdma is built as a regular dependency, not in test mode, when net_mana tests run). Instead, have the BNIC emulator inspect the `inject_vlan_pri_tag` field in the TX long OOB — matching how real MANA hardware detects and rejects 802.1Q VLAN tag insertion requests with CQE_TX_VLAN_TAGGING_VIOLATION. On the net_mana side, translate TxMetadata's vlan_enabled flag into the MANA OOB's inject_vlan_pri_tag/vlan_id/pcp/dei fields, and force the long OOB format when VLAN tagging is requested. The VLAN fallback test now sends an actual VLAN-tagged packet rather than relying on a synthetic per-vport flag. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds VLAN (802.1Q) tagging support end-to-end for Underhill’s NetVSP when using the MANA backend, by plumbing VLAN metadata through net_backend and the NetVSP RNDIS PPI paths, and adding targeted unit/integration tests to prevent regressions.
Changes:
- Extend
net_backendTX/RX metadata to carry VLAN information (and add a transport header offset field for offload parsing). - Teach NetVSP to parse/emit VLAN RNDIS PPI and propagate VLAN metadata into TX/RX paths (with stats + tests).
- Add MANA/GDMA VLAN preservation handling and new tests, plus a VMM test that exercises guest VLAN sub-interface configuration.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| vmm_tests/vmm_tests/tests/tests/multiarch/vlan.rs | New multi-arch VMM test that configures a guest VLAN interface and performs a TX smoke check. |
| vmm_tests/vmm_tests/tests/tests/multiarch.rs | Registers the new VLAN test module. |
| vm/devices/net/netvsp/src/test.rs | Enhances test endpoint plumbing to carry RX metadata and capture TX metadata; adds VLAN PPI parsing/tests. |
| vm/devices/net/netvsp/src/rndisprot.rs | Introduces VLAN PPI type + EthVlanInfo encoding helpers. |
| vm/devices/net/netvsp/src/lib.rs | Parses VLAN PPI into TxMetadata, adjusts header-length logic, and adds VLAN counters. |
| vm/devices/net/netvsp/src/buffers.rs | Emits VLAN PPI on RX when RxMetadata.vlan is present. |
| vm/devices/net/net_tap/src/lib.rs | Updates RX metadata construction to include vlan: None. |
| vm/devices/net/net_mana/src/test.rs | Adds a VLAN-preserving loopback endpoint and multiple VLAN-focused tests. |
| vm/devices/net/net_mana/src/lib.rs | Plumbs VLAN into MANA TX OOB (inject) and RX metadata (detect). |
| vm/devices/net/net_mana/Cargo.toml | Adds parking_lot as a dev-dependency for the new tests. |
| vm/devices/net/net_consomme/src/lib.rs | Updates RX metadata construction to include vlan: None. |
| vm/devices/net/net_backend/src/tests.rs | Adds a helper to retrieve written RX metadata in tests. |
| vm/devices/net/net_backend/src/lib.rs | Adds VlanMetadata + VLAN fields on RX/TX metadata; adds tcp_header_offset. |
| vm/devices/net/gdma/src/bnic.rs | Propagates VLAN through GDMA RX flags and TX metadata. |
| Cargo.lock | Records new dependency usage (parking_lot) via dev-deps. |
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
| /// VLAN IDs are 12 bits. This will silently reject any bits outside of | ||
| /// the range. |
There was a problem hiding this comment.
I was debating whether we should validate / check... but I think in both cases TX and RX were we get the values from sources we should just trust. Do we log the VLAN somewhere for triage?
There was a problem hiding this comment.
We don't at all so far--I'll see about logging state that traffic was going over a vlan on error, though.
… removed counters that seemed counterproductive.
| )); | ||
| } | ||
|
|
||
| metadata.l3_len = metadata.transport_header_offset - metadata.l2_len as u16; |
There was a problem hiding this comment.
There used to be logic to attempt to get a bit of IPv4 or 40 for IPv6 and now we return invalid offset. TBD: Were there old VMs with netvsc that needed those fallbacks that will now be seeing TX failures? Could kusto tell us? (how many l3_len == 40???)
Copilot suggests we could add a warn_ratelimit to the fallback, let a UH release go out, and then check how often it fires.
|
|
||
| let sge0 = sqe.sgl().first().context("no sgl")?; | ||
| let total_len: usize = sqe.sgl().iter().map(|sge| sge.size as usize).sum(); | ||
| let l2_len = if oob.l_oob.inject_vlan_pri_tag() { |
There was a problem hiding this comment.
check oob.s_oob.pkt_fmt() == MANA_LONG_PKT_FMT before reading any l_oob fields.
| } | ||
| } | ||
|
|
||
| // TODO: USO support is not present. |
There was a problem hiding this comment.
Optional: If you intend to do this in a followup PR, you could open a github issue for the work and link that here.
| offset += size_of::<PerPacketInfo>() as u32; | ||
| if let Some(vlan_ppi) = vlan { | ||
| self.buffers.write_at(offset, vlan_ppi.as_bytes()); | ||
| } |
There was a problem hiding this comment.
Optional: Now that there can be multiple PPI, a future PR could introduce a new PPI where write_at goes past RX_HEADER_LEN (unlikely, but possible). You could a debug assert here to check that size_of header + 2x PPI < RX_HEADER_LEN so that blows up at compile time.
Presently, VTL0 with VF passthrough supports VLANs with the MANA NIC, but when Underhill switches that connection over to synthetic path, the Underhill NetVSP implementation drops VLAN tags and therefore breaks connections for the duration of the synthetic handoff; this is leading to issues for users depending on VLAN support, and is a functionality gap with standard NetVSP.
This change adds full VLAN support to Underhill for NetVSP with the MANA backend, in all directions. It does not address other network interfaces at this time such as the consomme test network backend, the tap interface, or the virtio integration; that is left as future work to build off of the structures established here in the interest of expedience.