From 42bd256a1918e5ca88f70bac47a653790cc4badb Mon Sep 17 00:00:00 2001
From: Tomas Hruby
Date: Fri, 6 Mar 2026 12:44:42 -0800
Subject: [PATCH 1/2] Document MKE VXLAN port conflict for eBPF dataplane

MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789, which conflicts with Calico's VXLAN in eBPF flow mode (BTF/kernel v5.8+). The flow-mode device acts as a catch-all and the kernel rejects it with EADDRINUSE. The fix is to change the VXLAN port (e.g. to 4790) before enabling eBPF.

Changes across Calico OSS (next + 3.31) and Enterprise (next + 3.23-1):
- enabling-ebpf.mdx: prerequisite section to change VXLAN port on MKE
- install.mdx: caution admonition in the MKE tab
- troubleshoot-ebpf.mdx: diagnosis and fix for VXLAN device DOWN

Co-Authored-By: Claude Opus 4.6
---
 .../operations/ebpf/enabling-ebpf.mdx | 26 ++++++++++
 calico-enterprise/operations/ebpf/install.mdx | 15 ++++++
 .../operations/ebpf/troubleshoot-ebpf.mdx | 49 +++++++++++++++++++
 .../operations/ebpf/enabling-ebpf.mdx | 26 ++++++++++
 .../operations/ebpf/install.mdx | 15 ++++++
 .../operations/ebpf/troubleshoot-ebpf.mdx | 49 +++++++++++++++++++
 calico/operations/ebpf/enabling-ebpf.mdx | 26 ++++++++++
 calico/operations/ebpf/install.mdx | 15 ++++++
 calico/operations/ebpf/troubleshoot-ebpf.mdx | 49 +++++++++++++++++++
 .../operations/ebpf/enabling-ebpf.mdx | 26 ++++++++++
 .../version-3.31/operations/ebpf/install.mdx | 15 ++++++
 .../operations/ebpf/troubleshoot-ebpf.mdx | 49 +++++++++++++++++++
 12 files changed, 360 insertions(+)

diff --git a/calico-enterprise/operations/ebpf/enabling-ebpf.mdx b/calico-enterprise/operations/ebpf/enabling-ebpf.mdx
index 418509bc85..ae28cfdbaf 100644
--- a/calico-enterprise/operations/ebpf/enabling-ebpf.mdx
+++ b/calico-enterprise/operations/ebpf/enabling-ebpf.mdx
@@ -185,6 +185,32 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `kube-proxy`
will write its iptables rules and Felix will try to clean them up resulting in iptables flapping between the two.

+### MKE: Change the VXLAN port before enabling eBPF
+
+:::caution
+
+MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
+When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
+flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+
+**You must change the VXLAN port before enabling eBPF on MKE clusters:**
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Wait for all calico-node pods to recreate the VXLAN device on the new port, then verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+```
+
+Confirm the device shows `dstport 4790` (or your chosen port) and is UP before proceeding.
+Ensure that the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::
+
 ### Enable eBPF mode

 To enable eBPF mode, change the `spec.calicoNetwork.linuxDataplane` parameter in the operator's `Installation`
diff --git a/calico-enterprise/operations/ebpf/install.mdx b/calico-enterprise/operations/ebpf/install.mdx
index 96984ed807..10ac5dfa50 100644
--- a/calico-enterprise/operations/ebpf/install.mdx
+++ b/calico-enterprise/operations/ebpf/install.mdx
@@ -175,6 +175,21 @@ can do that by setting `--kube-proxy-mode=disabled` and `--kube-default-drop-mas
 More details can be found in [the MKE documentation](https://docs.mirantis.com/mke/current/install/predeployment/configure-networking/cluster-service-networking-options.html)

+:::caution VXLAN port conflict
+
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices.
In eBPF mode with BTF support (kernel v5.8+),
+$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+You must change the VXLAN port before enabling eBPF:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+See [Troubleshoot eBPF mode](troubleshoot-ebpf.mdx#mke-vxlan-device-down) for more details.
+
+:::
+
diff --git a/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx b/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx
index b8850f1028..49d6479095 100644
--- a/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx
+++ b/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx
@@ -396,6 +396,55 @@ e2e` command. The command resets the profiling data after dumping it.
 ```

+## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+
+On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:
+
+```
+Failed to set tunnel device up error=address already in use
+```
+
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+
+In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.
+
+**Diagnosis:**
+
+```bash
+# Confirm the VXLAN device is DOWN
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+
+# Check what holds port 4789 (shows a kernel-owned socket with no process name)
+kubectl exec -n calico-system -- ss -ulnp sport = :4789
+
+# Find Docker Swarm VXLAN devices in Docker network namespaces
+for ns in /run/docker/netns/*; do
+  echo "=== $(basename $ns) ==="
+  nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+done
+```
+
+**Fix:** Change Calico's VXLAN port to avoid the conflict:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Felix will recreate `vxlan.calico` on the new port within seconds. Verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+# Should show: vxlan external ... dstport 4790 ... state UP
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::tip
+
+For MKE clusters, set the VXLAN port **before** enabling eBPF mode to avoid any disruption.
+
+:::
+
 ## Debug high CPU usage

 If you notice `$[noderunning]` using high CPU:
diff --git a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx
index b409a20f0f..c17544fcf2 100644
--- a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx
+++ b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx
@@ -185,6 +185,32 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `kube-proxy` will write its iptables rules and Felix will try to clean them up resulting in iptables flapping between the two.

+### MKE: Change the VXLAN port before enabling eBPF
+
+:::caution
+
+MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
+When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
+flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+
+**You must change the VXLAN port before enabling eBPF on MKE clusters:**
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Wait for all calico-node pods to recreate the VXLAN device on the new port, then verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+```
+
+Confirm the device shows `dstport 4790` (or your chosen port) and is UP before proceeding.
+Ensure that the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::
+
 ### Enable eBPF mode

 To enable eBPF mode, change the `spec.calicoNetwork.linuxDataplane` parameter in the operator's `Installation`
diff --git a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx
index 631dff33d5..39a6fd35b9 100644
--- a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx
+++ b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx
@@ -175,6 +175,21 @@ can do that by setting `--kube-proxy-mode=disabled` and `--kube-default-drop-mas
 More details can be found in [the MKE documentation](https://docs.mirantis.com/mke/current/install/predeployment/configure-networking/cluster-service-networking-options.html)

+:::caution VXLAN port conflict
+
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices.
In eBPF mode with BTF support (kernel v5.8+),
+$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+You must change the VXLAN port before enabling eBPF:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+See [Troubleshoot eBPF mode](troubleshoot-ebpf.mdx#mke-vxlan-device-down) for more details.
+
+:::
+
diff --git a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx
index b8850f1028..49d6479095 100644
--- a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx
+++ b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx
@@ -396,6 +396,55 @@ e2e` command. The command resets the profiling data after dumping it.
 ```

+## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+
+On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:
+
+```
+Failed to set tunnel device up error=address already in use
+```
+
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+
+In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.
+
+**Diagnosis:**
+
+```bash
+# Confirm the VXLAN device is DOWN
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+
+# Check what holds port 4789 (shows a kernel-owned socket with no process name)
+kubectl exec -n calico-system -- ss -ulnp sport = :4789
+
+# Find Docker Swarm VXLAN devices in Docker network namespaces
+for ns in /run/docker/netns/*; do
+  echo "=== $(basename $ns) ==="
+  nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+done
+```
+
+**Fix:** Change Calico's VXLAN port to avoid the conflict:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Felix will recreate `vxlan.calico` on the new port within seconds. Verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+# Should show: vxlan external ... dstport 4790 ... state UP
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::tip
+
+For MKE clusters, set the VXLAN port **before** enabling eBPF mode to avoid any disruption.
+
+:::
+
 ## Debug high CPU usage

 If you notice `$[noderunning]` using high CPU:
diff --git a/calico/operations/ebpf/enabling-ebpf.mdx b/calico/operations/ebpf/enabling-ebpf.mdx
index 406e5d4461..cfd426e49d 100644
--- a/calico/operations/ebpf/enabling-ebpf.mdx
+++ b/calico/operations/ebpf/enabling-ebpf.mdx
@@ -326,6 +326,32 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `kube-proxy` will write its iptables rules and Felix will try to clean them up resulting in iptables flapping between the two.

+### MKE: Change the VXLAN port before enabling eBPF
+
+:::caution
+
+MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
+When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
+flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+
+**You must change the VXLAN port before enabling eBPF on MKE clusters:**
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Wait for all calico-node pods to recreate the VXLAN device on the new port, then verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+```
+
+Confirm the device shows `dstport 4790` (or your chosen port) and is UP before proceeding.
+Ensure that the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::
+
 ### Enable eBPF mode

 **The next step depends on whether you installed $[prodname] using the operator, or a manifest:**
diff --git a/calico/operations/ebpf/install.mdx b/calico/operations/ebpf/install.mdx
index 6ac8643320..7ebb15d953 100644
--- a/calico/operations/ebpf/install.mdx
+++ b/calico/operations/ebpf/install.mdx
@@ -203,6 +203,21 @@ can do that by setting `--kube-proxy-mode=disabled` and `--kube-default-drop-mas
 More details can be found in [the MKE documentation](https://docs.mirantis.com/mke/current/install/predeployment/configure-networking/cluster-service-networking-options.html)

+:::caution VXLAN port conflict
+
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),
+$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+You must change the VXLAN port before enabling eBPF:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+See [Troubleshoot eBPF mode](troubleshoot-ebpf.mdx#mke-vxlan-device-down) for more details.
+
+:::
+
diff --git a/calico/operations/ebpf/troubleshoot-ebpf.mdx b/calico/operations/ebpf/troubleshoot-ebpf.mdx
index b1be84f35b..c307172b01 100644
--- a/calico/operations/ebpf/troubleshoot-ebpf.mdx
+++ b/calico/operations/ebpf/troubleshoot-ebpf.mdx
@@ -396,6 +396,55 @@ e2e` command. The command resets the profiling data after dumping it.
 ```

+## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+
+On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:
+
+```
+Failed to set tunnel device up error=address already in use
+```
+
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+
+In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.
+
+**Diagnosis:**
+
+```bash
+# Confirm the VXLAN device is DOWN
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+
+# Check what holds port 4789 (shows a kernel-owned socket with no process name)
+kubectl exec -n calico-system -- ss -ulnp sport = :4789
+
+# Find Docker Swarm VXLAN devices in Docker network namespaces
+for ns in /run/docker/netns/*; do
+  echo "=== $(basename $ns) ==="
+  nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+done
+```
+
+**Fix:** Change Calico's VXLAN port to avoid the conflict:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Felix will recreate `vxlan.calico` on the new port within seconds. Verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+# Should show: vxlan external ... dstport 4790 ... state UP
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::tip
+
+For MKE clusters, set the VXLAN port **before** enabling eBPF mode to avoid any disruption.
+
+:::
+
 ## Debug high CPU usage

 If you notice `$[noderunning]` using high CPU:
diff --git a/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx b/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx
index c92eed2ef0..b700f6b8f8 100644
--- a/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx
+++ b/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx
@@ -326,6 +326,32 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `kube-proxy` will write its iptables rules and Felix will try to clean them up resulting in iptables flapping between the two.

+### MKE: Change the VXLAN port before enabling eBPF
+
+:::caution
+
+MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
+When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
+flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+
+**You must change the VXLAN port before enabling eBPF on MKE clusters:**
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Wait for all calico-node pods to recreate the VXLAN device on the new port, then verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+```
+
+Confirm the device shows `dstport 4790` (or your chosen port) and is UP before proceeding.
+Ensure that the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::
+
 ### Enable eBPF mode

 **The next step depends on whether you installed $[prodname] using the operator, or a manifest:**
diff --git a/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx b/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx
index 6210bf4d4c..7d2f02970f 100644
--- a/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx
+++ b/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx
@@ -203,6 +203,21 @@ can do that by setting `--kube-proxy-mode=disabled` and `--kube-default-drop-mas
 More details can be found in [the MKE documentation](https://docs.mirantis.com/mke/current/install/predeployment/configure-networking/cluster-service-networking-options.html)

+:::caution VXLAN port conflict
+
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),
+$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+You must change the VXLAN port before enabling eBPF:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+See [Troubleshoot eBPF mode](troubleshoot-ebpf.mdx#mke-vxlan-device-down) for more details.
+
+:::
+
diff --git a/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx b/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx
index b1be84f35b..c307172b01 100644
--- a/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx
+++ b/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx
@@ -396,6 +396,55 @@ e2e` command. The command resets the profiling data after dumping it.
 ```

+## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+
+On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:
+
+```
+Failed to set tunnel device up error=address already in use
+```
+
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+
+In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.
+
+**Diagnosis:**
+
+```bash
+# Confirm the VXLAN device is DOWN
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+
+# Check what holds port 4789 (shows a kernel-owned socket with no process name)
+kubectl exec -n calico-system -- ss -ulnp sport = :4789
+
+# Find Docker Swarm VXLAN devices in Docker network namespaces
+for ns in /run/docker/netns/*; do
+  echo "=== $(basename $ns) ==="
+  nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+done
+```
+
+**Fix:** Change Calico's VXLAN port to avoid the conflict:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Felix will recreate `vxlan.calico` on the new port within seconds. Verify on each node:
+
+```bash
+kubectl exec -n calico-system -- ip -d link show vxlan.calico
+# Should show: vxlan external ... dstport 4790 ... state UP
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::tip
+
+For MKE clusters, set the VXLAN port **before** enabling eBPF mode to avoid any disruption.
+
+:::
+
 ## Debug high CPU usage

 If you notice `$[noderunning]` using high CPU:

From 730ec85dc3dd8df6e722540eaf7d3ed7f16856b8 Mon Sep 17 00:00:00 2001
From: Tomas Hruby
Date: Fri, 6 Mar 2026 14:28:47 -0800
Subject: [PATCH 2/2] Address PR review feedback for MKE VXLAN port docs

- Remove unsupported {#heading-id} syntax from troubleshoot headings (Docusaurus auto-generates the slug from the heading text)
- Clarify BTF wording: "when BTF is available (typically v5.8+)" instead of "with BTF support (v5.8+)" to avoid conflating BTF with kernel version
- Add "(run on the node via SSH)" note and quote $ns in the Docker netns diagnosis loop to clarify execution context

Co-Authored-By: Claude Opus 4.6
---
 calico-enterprise/operations/ebpf/enabling-ebpf.mdx | 2 +-
 calico-enterprise/operations/ebpf/install.mdx | 2 +-
 calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx | 8 ++++----
 .../version-3.23-1/operations/ebpf/enabling-ebpf.mdx | 2 +-
 .../version-3.23-1/operations/ebpf/install.mdx | 2 +-
 .../version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx | 8 ++++----
 calico/operations/ebpf/enabling-ebpf.mdx | 2 +-
 calico/operations/ebpf/install.mdx | 2 +-
 calico/operations/ebpf/troubleshoot-ebpf.mdx | 8 ++++----
 .../version-3.31/operations/ebpf/enabling-ebpf.mdx | 2 +-
 .../version-3.31/operations/ebpf/install.mdx | 2 +-
 .../version-3.31/operations/ebpf/troubleshoot-ebpf.mdx | 8 ++++----
 12 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/calico-enterprise/operations/ebpf/enabling-ebpf.mdx b/calico-enterprise/operations/ebpf/enabling-ebpf.mdx
index ae28cfdbaf..9f074fa2a0 100644
--- a/calico-enterprise/operations/ebpf/enabling-ebpf.mdx
+++ b/calico-enterprise/operations/ebpf/enabling-ebpf.mdx
@@ -190,7 +190,7 @@ If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `k
 :::caution

 MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
-When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
+When $[prodname] switches to eBPF mode on kernels where BTF is available (typically v5.8+), Felix creates the `vxlan.calico` device in
 flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
 already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.

diff --git a/calico-enterprise/operations/ebpf/install.mdx b/calico-enterprise/operations/ebpf/install.mdx
index 10ac5dfa50..26b763d9fa 100644
--- a/calico-enterprise/operations/ebpf/install.mdx
+++ b/calico-enterprise/operations/ebpf/install.mdx
@@ -177,7 +177,7 @@ More details can be found in [the MKE documentation](https://docs.mirantis.com/m
 :::caution VXLAN port conflict

-MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode, when BTF is available on the node (typically kernel v5.8+),
 $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
 You must change the VXLAN port before enabling eBPF:

diff --git a/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx b/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx
index 49d6479095..c68010958c 100644
--- a/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx
+++ b/calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx
@@ -396,7 +396,7 @@ e2e` command. The command resets the profiling data after dumping it.
 ```

-## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+## MKE: VXLAN device DOWN

 On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:

@@ -404,7 +404,7 @@ On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DO
 Failed to set tunnel device up error=address already in use
 ```

-This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode, when BTF is available on the node (typically kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.

 In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.

@@ -417,9 +417,9 @@ kubectl exec -n calico-system -- ip -d link show vxlan.calico
 # Check what holds port 4789 (shows a kernel-owned socket with no process name)
 kubectl exec -n calico-system -- ss -ulnp sport = :4789

-# Find Docker Swarm VXLAN devices in Docker network namespaces
+# Find Docker Swarm VXLAN devices in Docker network namespaces (run on the node via SSH)
 for ns in /run/docker/netns/*; do
-  echo "=== $(basename $ns) ==="
+  echo "=== $(basename "$ns") ==="
   nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
 done
 ```
diff --git a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx
index c17544fcf2..1e6798ab57 100644
--- a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx
+++ b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx
@@ -190,7 +190,7 @@ If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `k
 :::caution

 MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
-When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
+When $[prodname] switches to eBPF mode on kernels where BTF is available (typically v5.8+), Felix creates the `vxlan.calico` device in
 flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
 already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.

diff --git a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx
index 39a6fd35b9..3fb6d9ae20 100644
--- a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx
+++ b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx
@@ -177,7 +177,7 @@ More details can be found in [the MKE documentation](https://docs.mirantis.com/m
 :::caution VXLAN port conflict

-MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode, when BTF is available on the node (typically kernel v5.8+),
 $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
 You must change the VXLAN port before enabling eBPF:

diff --git a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx
index 49d6479095..c68010958c 100644
--- a/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx
+++ b/calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx
@@ -396,7 +396,7 @@ e2e` command. The command resets the profiling data after dumping it.
 ```

-## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+## MKE: VXLAN device DOWN

 On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:

@@ -404,7 +404,7 @@ On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DO
 Failed to set tunnel device up error=address already in use
 ```

-This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`).
In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode, when BTF is available on the node (typically kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.

 In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.

@@ -417,9 +417,9 @@ kubectl exec -n calico-system -- ip -d link show vxlan.calico
 # Check what holds port 4789 (shows a kernel-owned socket with no process name)
 kubectl exec -n calico-system -- ss -ulnp sport = :4789

-# Find Docker Swarm VXLAN devices in Docker network namespaces
+# Find Docker Swarm VXLAN devices in Docker network namespaces (run on the node via SSH)
 for ns in /run/docker/netns/*; do
-  echo "=== $(basename $ns) ==="
+  echo "=== $(basename "$ns") ==="
   nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
 done
 ```
diff --git a/calico/operations/ebpf/enabling-ebpf.mdx b/calico/operations/ebpf/enabling-ebpf.mdx
index cfd426e49d..0ddc3b6f78 100644
--- a/calico/operations/ebpf/enabling-ebpf.mdx
+++ b/calico/operations/ebpf/enabling-ebpf.mdx
@@ -331,7 +331,7 @@ If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `k
 :::caution

 MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
-When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in +When $[prodname] switches to eBPF mode on kernels where BTF is available (typically v5.8+), Felix creates the `vxlan.calico` device in flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's) already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors. diff --git a/calico/operations/ebpf/install.mdx b/calico/operations/ebpf/install.mdx index 7ebb15d953..815cd431fa 100644 --- a/calico/operations/ebpf/install.mdx +++ b/calico/operations/ebpf/install.mdx @@ -205,7 +205,7 @@ More details can be found in [the MKE documentation](https://docs.mirantis.com/m :::caution VXLAN port conflict -MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+), +MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode, when BTF is available on the node (typically kernel v5.8+), $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port. You must change the VXLAN port before enabling eBPF: diff --git a/calico/operations/ebpf/troubleshoot-ebpf.mdx b/calico/operations/ebpf/troubleshoot-ebpf.mdx index c307172b01..aab1018ac5 100644 --- a/calico/operations/ebpf/troubleshoot-ebpf.mdx +++ b/calico/operations/ebpf/troubleshoot-ebpf.mdx @@ -396,7 +396,7 @@ e2e` command. The command resets the profiling data after dumping it. 
``` -## MKE: VXLAN device DOWN {#mke-vxlan-device-down} +## MKE: VXLAN device DOWN On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing: @@ -404,7 +404,7 @@ On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DO Failed to set tunnel device up error=address already in use ``` -This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789. +This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode, when BTF is available on the node (typically kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789. In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port. 
@@ -417,9 +417,9 @@ kubectl exec -n calico-system -- ip -d link show vxlan.calico # Check what holds port 4789 (shows a kernel-owned socket with no process name) kubectl exec -n calico-system -- ss -ulnp sport = :4789 -# Find Docker Swarm VXLAN devices in Docker network namespaces +# Find Docker Swarm VXLAN devices in Docker network namespaces (run on the node via SSH) for ns in /run/docker/netns/*; do - echo "=== $(basename $ns) ===" + echo "=== $(basename "$ns") ===" nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null done ``` diff --git a/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx b/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx index b700f6b8f8..3f7325d25e 100644 --- a/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx +++ b/calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx @@ -331,7 +331,7 @@ If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` is enabled then `k :::caution MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces. -When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in +When $[prodname] switches to eBPF mode on kernels where BTF is available (typically v5.8+), Felix creates the `vxlan.calico` device in flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's) already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors. 
diff --git a/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx b/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx index 7d2f02970f..2199f0f7ab 100644 --- a/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx +++ b/calico_versioned_docs/version-3.31/operations/ebpf/install.mdx @@ -205,7 +205,7 @@ More details can be found in [the MKE documentation](https://docs.mirantis.com/m :::caution VXLAN port conflict -MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+), +MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode, when BTF is available on the node (typically kernel v5.8+), $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port. You must change the VXLAN port before enabling eBPF: diff --git a/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx b/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx index c307172b01..aab1018ac5 100644 --- a/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx +++ b/calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx @@ -396,7 +396,7 @@ e2e` command. The command resets the profiling data after dumping it. ``` -## MKE: VXLAN device DOWN {#mke-vxlan-device-down} +## MKE: VXLAN device DOWN On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing: @@ -404,7 +404,7 @@ On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DO Failed to set tunnel device up error=address already in use ``` -This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). 
In eBPF mode with BTF support (kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789. +This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode, when BTF is available on the node (typically kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789. In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port. @@ -417,9 +417,9 @@ kubectl exec -n calico-system -- ip -d link show vxlan.calico # Check what holds port 4789 (shows a kernel-owned socket with no process name) kubectl exec -n calico-system -- ss -ulnp sport = :4789 -# Find Docker Swarm VXLAN devices in Docker network namespaces +# Find Docker Swarm VXLAN devices in Docker network namespaces (run on the node via SSH) for ns in /run/docker/netns/*; do - echo "=== $(basename $ns) ===" + echo "=== $(basename "$ns") ===" nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null done ```
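The verification step described in these hunks (confirm that `vxlan.calico` shows the new `dstport` and is UP) can be scripted. The following is a minimal sketch, not part of Calico or of this patch: the function name `check_vxlan_port` is hypothetical, and the default of 4790 is only the example port used above. It assumes the output of `ip -d link show vxlan.calico` is supplied on standard input.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not a Calico tool): reads the output of
# `ip -d link show vxlan.calico` on stdin and checks the port and link state.
check_vxlan_port() {
  local want_port="${1:-4790}"   # assumed default: the example port from the docs
  local out
  out="$(cat)"

  # `ip -d link` prints the VXLAN UDP port as "dstport <n>".
  if ! grep -qE "dstport ${want_port}( |\$)" <<<"$out"; then
    echo "vxlan.calico is not using dstport ${want_port}" >&2
    return 1
  fi

  # Interface flags are listed in angle brackets, e.g. <BROADCAST,MULTICAST,UP,LOWER_UP>.
  # Match UP only as a whole flag, so LOWER_UP alone does not count.
  if ! grep -qE '<([^>]*,)?UP(,[^>]*)?>' <<<"$out"; then
    echo "vxlan.calico is on dstport ${want_port} but is not UP" >&2
    return 1
  fi

  echo "vxlan.calico is UP on dstport ${want_port}"
}
```

One possible use is to pipe the `kubectl exec ... ip -d link show vxlan.calico` command from the docs into `check_vxlan_port 4790` for each calico-node pod. Reading from stdin keeps the check independent of how the output is collected.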