Skip to content

netvsp: Add VLAN support#3417

Open
ben-zen wants to merge 25 commits intomicrosoft:mainfrom
ben-zen:netvsp-vlan
Open

netvsp: Add VLAN support#3417
ben-zen wants to merge 25 commits intomicrosoft:mainfrom
ben-zen:netvsp-vlan

Conversation

@ben-zen
Copy link
Copy Markdown
Contributor

@ben-zen ben-zen commented May 2, 2026

Presently, VTL0 with VF passthrough supports VLANs with the MANA NIC, but when Underhill switches that connection over to synthetic path, the Underhill NetVSP implementation drops VLAN tags and therefore breaks connections for the duration of the synthetic handoff; this is leading to issues for users depending on VLAN support, and is a functionality gap with standard NetVSP.

This change adds full VLAN support to Underhill for NetVSP with the MANA backend, in all directions. It does not address other network interfaces at this time such as the consomme test network backend, the tap interface, or the virtio integration; that is left as future work to build off of the structures established here in the interest of expedience.

Ben Lewis and others added 15 commits April 29, 2026 00:12
Add a new VMM test that validates guest-side VLAN (802.1Q) sub-interface
creation and configuration on the synthetic NIC (netvsp).

The test:
- Boots a Linux VM with a NIC (via consomme backend)
- Finds the NIC by MAC address in sysfs
- Creates a VLAN sub-interface (ID 100) using ip-link
- Verifies 802.1Q protocol and VLAN ID in interface details
- Assigns an IP address and brings the interface up
- Sends TX smoke traffic (ping) to exercise the netvsp VLAN PPI path
- Verifies at least one TX packet was transmitted
- Confirms the parent interface remains operational
- Cleans up the VLAN interface

This exercises the guest netvsc driver's VLAN support and the netvsp
VLAN PPI (Per-Packet Information) metadata extraction path. Full
end-to-end VLAN datapath validation would require a VLAN-aware backend;
the current consomme backend ignores VLAN metadata, so this test focuses
on guest-side configuration correctness and TX path stability.

Also fix a pre-existing build error in gdma/src/resolver.rs where the
cfg(test)-gated reject_tx_with_vlan_error field was referenced
unconditionally in struct initialization.

The test took a little over 2m running under WSL2, which seems long.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the `reject_tx_with_vlan_error` field from VportConfig, Vport,
and TxRxTask, which was gated behind #[cfg(test)] and caused net_mana
tests to fail to compile (gdma is built as a regular dependency, not in
test mode, when net_mana tests run).

Instead, have the BNIC emulator inspect the `inject_vlan_pri_tag` field
in the TX long OOB — matching how real MANA hardware detects and rejects
802.1Q VLAN tag insertion requests with CQE_TX_VLAN_TAGGING_VIOLATION.

On the net_mana side, translate TxMetadata's vlan_enabled flag into the
MANA OOB's inject_vlan_pri_tag/vlan_id/pcp/dei fields, and force the
long OOB format when VLAN tagging is requested.

The VLAN fallback test now sends an actual VLAN-tagged packet rather
than relying on a synthetic per-vport flag.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 2, 2026 05:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds VLAN (802.1Q) tagging support end-to-end for Underhill’s NetVSP when using the MANA backend, by plumbing VLAN metadata through net_backend and the NetVSP RNDIS PPI paths, and adding targeted unit/integration tests to prevent regressions.

Changes:

  • Extend net_backend TX/RX metadata to carry VLAN information (and add a transport header offset field for offload parsing).
  • Teach NetVSP to parse/emit VLAN RNDIS PPI and propagate VLAN metadata into TX/RX paths (with stats + tests).
  • Add MANA/GDMA VLAN preservation handling and new tests, plus a VMM test that exercises guest VLAN sub-interface configuration.

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
vmm_tests/vmm_tests/tests/tests/multiarch/vlan.rs New multi-arch VMM test that configures a guest VLAN interface and performs a TX smoke check.
vmm_tests/vmm_tests/tests/tests/multiarch.rs Registers the new VLAN test module.
vm/devices/net/netvsp/src/test.rs Enhances test endpoint plumbing to carry RX metadata and capture TX metadata; adds VLAN PPI parsing/tests.
vm/devices/net/netvsp/src/rndisprot.rs Introduces VLAN PPI type + EthVlanInfo encoding helpers.
vm/devices/net/netvsp/src/lib.rs Parses VLAN PPI into TxMetadata, adjusts header-length logic, and adds VLAN counters.
vm/devices/net/netvsp/src/buffers.rs Emits VLAN PPI on RX when RxMetadata.vlan is present.
vm/devices/net/net_tap/src/lib.rs Updates RX metadata construction to include vlan: None.
vm/devices/net/net_mana/src/test.rs Adds a VLAN-preserving loopback endpoint and multiple VLAN-focused tests.
vm/devices/net/net_mana/src/lib.rs Plumbs VLAN into MANA TX OOB (inject) and RX metadata (detect).
vm/devices/net/net_mana/Cargo.toml Adds parking_lot as a dev-dependency for the new tests.
vm/devices/net/net_consomme/src/lib.rs Updates RX metadata construction to include vlan: None.
vm/devices/net/net_backend/src/tests.rs Adds a helper to retrieve written RX metadata in tests.
vm/devices/net/net_backend/src/lib.rs Adds VlanMetadata + VLAN fields on RX/TX metadata; adds tcp_header_offset.
vm/devices/net/gdma/src/bnic.rs Propagates VLAN through GDMA RX flags and TX metadata.
Cargo.lock Records new dependency usage (parking_lot) via dev-deps.

Comment thread vm/devices/net/net_backend/src/lib.rs Outdated
Comment thread vm/devices/net/netvsp/src/rndisprot.rs
Comment thread vm/devices/net/netvsp/src/lib.rs Outdated
Comment thread vm/devices/net/net_mana/src/lib.rs Outdated
@ben-zen ben-zen marked this pull request as ready for review May 4, 2026 17:54
@ben-zen ben-zen requested a review from a team as a code owner May 4, 2026 17:54
Copilot AI review requested due to automatic review settings May 4, 2026 17:54
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 6 comments.

Comment thread vm/devices/net/netvsp/src/lib.rs Outdated
Comment thread vm/devices/net/gdma/src/bnic.rs Outdated
Comment thread vmm_tests/vmm_tests/tests/tests/multiarch/vlan.rs
Comment thread vm/devices/net/netvsp/src/test.rs
Comment thread vm/devices/net/net_backend/src/lib.rs
Comment thread vm/devices/net/net_mana/src/lib.rs Outdated
Copilot AI review requested due to automatic review settings May 4, 2026 20:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.

Comment thread vm/devices/net/netvsp/src/lib.rs
Comment thread vm/devices/net/gdma/src/bnic.rs Outdated
Comment thread vm/devices/net/net_mana/src/test.rs Outdated
Comment thread vm/devices/net/net_mana/src/test.rs Outdated
Comment thread vmm_tests/vmm_tests/tests/tests/multiarch/vlan.rs Outdated
Comment thread vm/devices/net/netvsp/src/lib.rs Outdated
@github-actions github-actions Bot added the unsafe Related to unsafe code label May 4, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 4, 2026

⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon

Comment thread vm/devices/net/gdma/src/bnic.rs Outdated
Comment thread vm/devices/net/gdma/src/bnic.rs Outdated
Comment thread vm/devices/net/net_backend/src/lib.rs Outdated
Comment thread vm/devices/net/net_mana/src/test.rs Outdated
Comment thread vm/devices/net/netvsp/src/lib.rs
Comment on lines +738 to +739
/// VLAN IDs are 12 bits. This will silently reject any bits outside of
/// the range.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was debating whether we should validate / check... but I think in both cases TX and RX were we get the values from sources we should just trust. Do we log the VLAN somewhere for triage?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't at all so far--I'll see about logging state that traffic was going over a vlan on error, though.

Comment thread vm/devices/net/net_mana/src/lib.rs
Comment thread vm/devices/net/gdma/src/bnic.rs Outdated
Comment thread vm/devices/net/netvsp/src/rndisprot.rs
… removed counters that seemed counterproductive.
Copilot AI review requested due to automatic review settings May 6, 2026 18:13
Copy link
Copy Markdown
Contributor

@erfrimod erfrimod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

));
}

metadata.l3_len = metadata.transport_header_offset - metadata.l2_len as u16;
Copy link
Copy Markdown
Contributor

@erfrimod erfrimod May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There used to be logic to attempt to get a bit of IPv4 or 40 for IPv6 and now we return invalid offset. TBD: Were there old VMs with netvsc that needed those fallbacks that will now be seeing TX failures? Could kusto tell us? (how many l3_len == 40???)

Copilot suggests we could add a warn_ratelimit to the fallback, let a UH release go out, and then check how often it fires.


let sge0 = sqe.sgl().first().context("no sgl")?;
let total_len: usize = sqe.sgl().iter().map(|sge| sge.size as usize).sum();
let l2_len = if oob.l_oob.inject_vlan_pri_tag() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check oob.s_oob.pkt_fmt() == MANA_LONG_PKT_FMT before reading any l_oob fields.

}
}

// TODO: USO support is not present.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: If you intend to do this in a followup PR, you could open a github issue for the work and link that here.

offset += size_of::<PerPacketInfo>() as u32;
if let Some(vlan_ppi) = vlan {
self.buffers.write_at(offset, vlan_ppi.as_bytes());
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: Now that there can be multiple PPI, a future PR could introduce a new PPI where write_at goes past RX_HEADER_LEN (unlikely, but possible). You could a debug assert here to check that size_of header + 2x PPI < RX_HEADER_LEN so that blows up at compile time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

unsafe Related to unsafe code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants