[libvirt_manager] generate unique MAC per interface in generate_macs #3802

Open
fultonj wants to merge 1 commit into openstack-k8s-operators:main from fultonj:fix/OSPRH-28297-unique-mac-per-interface

fultonj commented Mar 25, 2026

Previously, a single MAC address (_fixed_mac) was generated once per VM and assigned to all its network interfaces. For OCP master VMs with multiple networks (e.g. ocppr, ocpbm, osp_trunk, osp_trunk), all interfaces shared the same MAC.

This caused the networking_mapper to always resolve to the first interface (enp5s0, on the ocppr bridge) when looking up any network by MAC, since all MACs matched equally. The incorrect interface name then flowed into ci_gen_kustomize_values, setting enp5s0 as the macvlan master in the ctlplane NetworkAttachmentDefinition. Bootstrap pods sent ARP on the wrong bridge and never reached compute nodes, causing wait_for_connection to time out.
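The failure mode above can be illustrated with a minimal sketch. It is not the networking_mapper code: `find_interface_by_mac` is a hypothetical helper that assumes a first-match lookup, the MAC values are made up, and the interface name `enp6s0` is an assumption (only `enp5s0` and `enp7s0` are named in this description).

```python
from typing import Optional


def find_interface_by_mac(interfaces: list[dict], mac: str) -> Optional[str]:
    """Return the name of the first interface whose MAC matches, else None."""
    for iface in interfaces:
        if iface["mac"].lower() == mac.lower():
            return iface["name"]
    return None


# Hypothetical inventory for one VM: every interface carries the same MAC,
# as happened when a single _fixed_mac was reused for all networks.
SHARED_MAC = "52:54:00:aa:bb:cc"  # made-up value
shared = [
    {"name": "enp5s0", "network": "ocppr", "mac": SHARED_MAC},
    {"name": "enp6s0", "network": "ocpbm", "mac": SHARED_MAC},  # name assumed
    {"name": "enp7s0", "network": "osp_trunk", "mac": SHARED_MAC},
]

# Same inventory with one distinct (made-up) MAC per interface.
unique = [
    {"name": "enp5s0", "network": "ocppr", "mac": "52:54:00:aa:bb:01"},
    {"name": "enp6s0", "network": "ocpbm", "mac": "52:54:00:aa:bb:02"},
    {"name": "enp7s0", "network": "osp_trunk", "mac": "52:54:00:aa:bb:03"},
]

# Looking up the osp_trunk MAC: the shared case stops at the first interface.
print(find_interface_by_mac(shared, shared[2]["mac"]))  # enp5s0
print(find_interface_by_mac(unique, unique[2]["mac"]))  # enp7s0
```

With identical MACs, any lookup that returns the first match cannot distinguish the networks, so it yields `enp5s0` regardless of which network was requested.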

Fix: generate a unique MAC per (VM, network-index) pair by incorporating loop.index0 into the seed, ensuring each interface gets a distinct MAC. The networking_mapper can then correctly identify enp7s0 (on cifmw-osp_trunk) as the ctlplane interface.
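As a rough illustration of the seeding idea (not the actual Jinja2 change in generate_macs), the sketch below derives a deterministic MAC from a seed built from the VM name and the zero-based network index. The function `seeded_mac`, the VM name `ocp-master-0`, the `52:54:00` prefix, and the use of SHA-256 are all assumptions made for this example.

```python
import hashlib


def seeded_mac(vm_name: str, net_index: int, prefix: str = "52:54:00") -> str:
    """Derive a deterministic MAC from a (VM, network-index) seed.

    Hypothetical helper: the prefix and the hashing scheme are assumptions.
    """
    seed = f"{vm_name}-{net_index}"
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    # Use the first three digest bytes as the host-specific part of the MAC.
    suffix = ":".join(f"{byte:02x}" for byte in digest[:3])
    return f"{prefix}:{suffix}"


# Network list from the description; osp_trunk appears twice.
networks = ["ocppr", "ocpbm", "osp_trunk", "osp_trunk"]

# enumerate() plays the role of loop.index0: a zero-based position per network.
macs = [seeded_mac("ocp-master-0", index) for index, _ in enumerate(networks)]

for network, mac in zip(networks, macs):
    print(f"{network}: {mac}")
```

Seeding on the index rather than the network name keeps the seeds distinct even when the same network name appears more than once. In this sketch, a seeded hash makes MAC collisions very unlikely rather than strictly impossible, and the same seed always reproduces the same MAC.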

Fixes: OSPRH-28297

Assisted-By: Claude Sonnet 4.6 <noreply@anthropic.com>

openshift-ci bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign danpawlik for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ed13079b284a4700b6016b8e524acf9d

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 28m 30s
podified-multinode-edpm-deployment-crc FAILURE in 34m 48s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 29m 49s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 15m 14s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 20s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 15s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 48s
cifmw-molecule-libvirt_manager FAILURE in 16m 33s
✔️ cifmw-molecule-reproducer SUCCESS in 16m 32s

Previously, a single MAC address (_fixed_mac) was generated once per
VM and assigned to all its network interfaces. For OCP master VMs with
multiple networks (e.g. ocppr, ocpbm, osp_trunk, osp_trunk), all
interfaces shared the same MAC.

This caused the networking_mapper to always resolve to the first
interface (enp5s0, on the ocppr bridge) when looking up any network
by MAC, since all MACs matched equally. The incorrect interface name
then flowed into ci_gen_kustomize_values, setting enp5s0 as the
macvlan master in the ctlplane NetworkAttachmentDefinition. Bootstrap
pods sent ARP on the wrong bridge and never reached compute nodes,
causing wait_for_connection to time out.

Fix: generate a unique MAC per (VM, network-index) pair by
incorporating loop.index0 into the seed, ensuring each interface gets
a distinct MAC. The networking_mapper can then correctly identify
enp7s0 (on cifmw-osp_trunk) as the ctlplane interface.

Fixes: OSPRH-28297

Assisted-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>

fultonj force-pushed the fix/OSPRH-28297-unique-mac-per-interface branch from d1182fc to 553a9a1 on March 30, 2026 18:46

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/1e94add55022422b8cc361458d7226dd

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 30m 14s
podified-multinode-edpm-deployment-crc FAILURE in 35m 00s
cifmw-crc-podified-edpm-baremetal RETRY_LIMIT Host unreachable in 1h 16m 50s
cifmw-crc-podified-edpm-baremetal-minor-update RETRY_LIMIT Host unreachable in 1h 01m 06s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 02s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 43s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 39s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 41m 34s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 17s

fultonj commented Mar 31, 2026

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/1927f3ee0b1344b7b8726a97be8de104

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 45m 23s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 27m 33s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 16m 14s
cifmw-crc-podified-edpm-baremetal-minor-update RETRY_LIMIT in 33m 46s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 24s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 06s
✔️ cifmw-pod-pre-commit SUCCESS in 9m 08s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 42m 58s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 45s

fultonj commented Apr 3, 2026

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f9d30e3a15fc40ef8fedb1215a198ca2

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 21m 53s
podified-multinode-edpm-deployment-crc FAILURE in 21m 02s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 30m 56s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 05m 02s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 09s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 06s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 33s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 41m 14s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 11s
