KVM: Hot-plugged NIC gets non-sequential PCI slot causing unpredictable interface naming in guest #12825

@jmsperu

Description

When hot-plugging a NIC to a running KVM VM via CloudStack, the new NIC receives a non-sequential PCI slot address, resulting in unpredictable interface naming in the guest OS (e.g., ens9 instead of the expected ens5). This causes guest network configuration to fail because CloudStack and cloud-init expect sequential interface naming.

Root Cause

In LibvirtPlugNicCommandWrapper.java, the execute() method calls vifDriver.plug() which creates an InterfaceDef without setting a PCI slot (line 64). When vm.attachDevice() is called, libvirt auto-assigns the next free PCI slot. Since non-NIC PCI devices (virtio-serial controller, virtio-disk, memballoon, watchdog) occupy slots immediately after the existing NICs, the hot-plugged NIC gets a much higher slot number.

Typical PCI slot layout for a CloudStack KVM VM:

Slot  Device
0x01  USB + IDE controllers
0x02  Video (cirrus)
0x03  NIC net0 → guest sees ens3
0x04  NIC net1 → guest sees ens4
0x05  virtio-serial controller
0x06  virtio-disk
0x07  memballoon
0x08  watchdog
0x09  Hot-plugged NIC → guest sees ens9 (should be ens5)
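
To make the layout above concrete, here is a minimal sketch of the behaviour described: when no slot is requested, the lowest unused slot is handed out. The class and method names are hypothetical, and the sketch assumes both that libvirt picks the lowest free slot on bus 0 and that the guest name is simply "ens" plus the decimal slot number, which is what the table shows for this setup.

```java
import java.util.Set;

final class AutoSlotDemo {
    // Highest slot number on a conventional PCI bus (slots 0x00..0x1f).
    static final int MAX_SLOT = 0x1f;

    // Assumption: auto-assignment returns the lowest unused slot, starting at 1
    // (slot 0 is normally the host bridge). Returns -1 if the bus is full.
    static int lowestFreeSlot(Set<Integer> usedSlots) {
        for (int slot = 1; slot <= MAX_SLOT; slot++) {
            if (!usedSlots.contains(slot)) {
                return slot;
            }
        }
        return -1;
    }

    // Assumption: the guest derives the name from the decimal slot number,
    // as in the table (0x03 -> ens3, 0x09 -> ens9).
    static String guestNameForSlot(int slot) {
        return "ens" + slot;
    }
}
```

With slots 0x01 through 0x08 occupied, `lowestFreeSlot` returns 9 and the guest name becomes ens9, which is the situation reported here.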

Note: LibvirtReplugNicCommandWrapper already handles this correctly by preserving the PCI slot when re-plugging (interfaceDef.setSlot(oldPluggedNic.getSlot())), but LibvirtPlugNicCommandWrapper has no slot assignment logic.

Evidence (from production CloudStack 4.20)

Before fix — hot-plugged NIC at PCI slot 0x09:

$ virsh dumpxml i-2-1681-VM | grep -A6 'net2'
      <alias name='net2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>

$ virsh qemu-agent-command i-2-1681-VM '{"execute":"guest-network-get-interfaces"}'
"name": "ens3", "hardware-address": "02:01:01:76:01:35"
"name": "ens4", "hardware-address": "02:01:03:15:00:01"  
"name": "ens9", "hardware-address": "02:01:03:4c:00:08"  ← non-sequential

$ ip a    # run inside the guest
4: ens9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN

After relocating non-NIC devices to high PCI slots (0x10+) — hot-plugged NIC gets slot 0x04:

$ virsh qemu-agent-command i-2-1681-VM '{"execute":"guest-network-get-interfaces"}'
"name": "ens3", "hardware-address": "02:01:01:76:01:35"
"name": "ens4", "hardware-address": "02:aa:bb:cc:dd:01"  ← sequential!

Impact

  • Hot-plugged NICs get ens9, ens10, etc. instead of sequential names
  • Guest network configuration (cloud-init, CloudStack userdata) fails for the new NIC
  • NIC shows as DOWN with no IP address in the guest
  • Only a full VM reboot reassigns sequential PCI slots

Fix

This PR adds explicit PCI slot assignment in LibvirtPlugNicCommandWrapper:

  1. Query the domain XML to find all PCI slots currently in use
  2. Find the highest PCI slot used by existing NICs
  3. Assign the next free slot after the last NIC slot to the new NIC
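
The three steps above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `NicSlotPicker` and `nextNicSlot` are hypothetical names, the domain XML is taken as a plain string, and only PCI addresses on bus 0x00 are considered.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class NicSlotPicker {
    static final int MAX_SLOT = 0x1f;

    // Returns the first free bus-0 slot above the highest NIC slot, or -1 if none.
    static int nextNicSlot(String domainXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(domainXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse domain XML", e);
        }

        Set<Integer> usedSlots = new HashSet<>();
        int highestNicSlot = -1;

        // Step 1: collect every PCI slot in use on bus 0.
        NodeList addresses = doc.getElementsByTagName("address");
        for (int i = 0; i < addresses.getLength(); i++) {
            Element addr = (Element) addresses.item(i);
            if (!"pci".equals(addr.getAttribute("type")) || addr.getAttribute("slot").isEmpty()) {
                continue;
            }
            String bus = addr.getAttribute("bus");
            if (!bus.isEmpty() && Integer.decode(bus) != 0) {
                continue; // simplification: ignore devices behind other buses
            }
            int slot = Integer.decode(addr.getAttribute("slot"));
            usedSlots.add(slot);

            // Step 2: remember the highest slot that belongs to an <interface>.
            Node parent = addr.getParentNode();
            if (parent != null && "interface".equals(parent.getNodeName())) {
                highestNicSlot = Math.max(highestNicSlot, slot);
            }
        }

        // Step 3: first free slot after the last NIC (never slot 0).
        for (int slot = Math.max(highestNicSlot + 1, 1); slot <= MAX_SLOT; slot++) {
            if (!usedSlots.contains(slot)) {
                return slot;
            }
        }
        return -1;
    }
}
```

With the typical layout shown earlier (NICs at 0x03 and 0x04, other devices at 0x05 through 0x08), this search still lands on 0x09, because the slots directly after the NICs are taken. It only yields 0x05 once the non-NIC devices sit at higher slots.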

Note: For full sequential naming, a follow-up change is needed to assign non-NIC PCI devices to higher slot numbers (e.g., 0x10+) during VM creation in LibvirtComputingResource.createDevicesDef(). This would leave slots 0x03-0x0f (3-15) available for NICs, so hot-plugged NICs would get sequential names as long as those slots are not exhausted.
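
One way to picture the follow-up change is a slot allocator with two separate ranges. The sketch below is a hypothetical standalone class, not the existing createDevicesDef() code; the range boundaries (0x03-0x0f for NICs, 0x10-0x1f for everything else) are taken from the note above, and slots 0x01 and 0x02 are assumed to stay with the USB/IDE controllers and video device.

```java
import java.util.BitSet;

final class PartitionedSlotAllocator {
    static final int NIC_FIRST = 0x03;
    static final int NIC_LAST = 0x0f;
    static final int OTHER_FIRST = 0x10;
    static final int OTHER_LAST = 0x1f;

    // One bit per PCI slot on bus 0; a set bit means the slot is taken.
    private final BitSet used = new BitSet(OTHER_LAST + 1);

    // Next free slot in the NIC range (boot-time or hot-plugged NICs).
    int allocateNic() {
        return allocate(NIC_FIRST, NIC_LAST);
    }

    // Next free slot in the high range (virtio-serial, disks, memballoon, watchdog, ...).
    int allocateOther() {
        return allocate(OTHER_FIRST, OTHER_LAST);
    }

    private int allocate(int first, int last) {
        int slot = used.nextClearBit(first);
        if (slot > last) {
            throw new IllegalStateException(
                    String.format("No free PCI slot in range 0x%02x-0x%02x", first, last));
        }
        used.set(slot);
        return slot;
    }
}
```

Under these assumptions, a VM created with two NICs and four other devices would use slots 0x03, 0x04 and 0x10 through 0x13, so a NIC hot-plugged later would land on 0x05.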

How to Reproduce

  1. Create a KVM VM with 2 NICs
  2. While the VM is running, add a third NIC via the CloudStack API/UI
  3. Inside the guest, run ip link show: the new NIC appears as ens9 (not ens5)
  4. The new NIC is DOWN with no IP configuration

Versions

  • CloudStack 4.20
  • QEMU 8.2, Libvirt 9.x
  • Guest: Debian 12 (systemd-networkd with predictable interface naming)
