tigera · tomastigera · Mar 6, 2026 · Mar 6, 2026 · Mar 8, 2026 · Copilot
@@ -185,6 +185,38 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` are enabled, `kube-proxy` writes its iptables rules and Felix tries to clean them up, so the iptables rules flap between the two.
 
+You should also set `bpfKubeProxyHealthzPort` to `0` to disable the health check server in $[prodname]'s BPF kube-proxy replacement, which by default binds to port 10256 and would conflict with the Kubernetes `kube-proxy` already running on the node. The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation.
+
+```bash
+kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyHealthzPort": 0}}'
-You should also set `bpfKubeProxyHealthzPort` to `0` to disable the health check server in $[prodname]'s BPF kube-proxy replacement, which by default binds to port 10256 and would conflict with the Kubernetes `kube-proxy` already running on the node. The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation.
-
-```bash
-kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyHealthzPort": 0}}'
+You should also set `bpfKubeProxyHealthzPort` to an unused port. By default, the health check server in $[prodname]'s BPF kube-proxy replacement binds to port 10256, which conflicts with the Kubernetes `kube-proxy` already running on the node. The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation; the new port value only avoids the conflict. For example:
+
+```bash
+kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyHealthzPort": 11256}}'
+```
+
+### MKE: Change the VXLAN port before enabling eBPF
+
+:::caution
+
+MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
+When $[prodname] switches to eBPF mode on kernels where BTF is available (typically v5.8+), Felix creates the `vxlan.calico` device in
+flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+
+**You must change the VXLAN port before enabling eBPF on MKE clusters:**
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Wait for all calico-node pods to recreate the VXLAN device on the new port, then verify on each node:
+
+```bash
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+```
+
+Confirm the device shows `dstport 4790` (or your chosen port) and is UP before proceeding.
+Ensure that the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::
+
 ### Enable eBPF mode
 
 To enable eBPF mode, change the `spec.calicoNetwork.linuxDataplane` parameter in the operator's `Installation`

@@ -175,6 +175,21 @@ can do that by setting `--kube-proxy-mode=disabled` and `--kube-default-drop-mas
 
 More details can be found in [the MKE documentation](https://docs.mirantis.com/mke/current/install/predeployment/configure-networking/cluster-service-networking-options.html)
 
+:::caution VXLAN port conflict
+
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode, when BTF is available on the node (typically kernel v5.8+),
+$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+You must change the VXLAN port before enabling eBPF:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+See [Troubleshoot eBPF mode](troubleshoot-ebpf.mdx#mke-vxlan-device-down) for more details.
+
+:::
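
To tell whether a node is likely affected, you can check whether its kernel exposes BTF. This is a minimal sketch: `btf_available` is a hypothetical helper, and it assumes that the presence of `/sys/kernel/btf/vmlinux` (where Linux publishes BTF when built with `CONFIG_DEBUG_INFO_BTF`) is a reasonable proxy for the condition described above. Whether Felix uses this exact check is not confirmed here.

```bash
# Hypothetical helper: succeeds if the kernel exposes BTF type information.
# The optional path argument exists only to make the check easy to test.
btf_available() {
  [ -r "${1:-/sys/kernel/btf/vmlinux}" ]
}

# Run on the node (for example, via SSH).
if btf_available; then
  echo "BTF available: change the VXLAN port before enabling eBPF"
else
  echo "BTF not found at /sys/kernel/btf/vmlinux"
fi
```

Changing the VXLAN port is harmless on nodes without BTF, so treat this check as informational only.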
+
 </TabItem>
 
 </Tabs>

@@ -396,6 +396,55 @@ e2e` command. The command resets the profiling data after dumping it.
 
 ```
 
+## MKE: VXLAN device DOWN
+
+On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:
+
+```
+Failed to set tunnel device up error=address already in use
+```
+
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode, when BTF is available on the node (typically kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+
+In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.
+
+**Diagnosis:**
+
+```bash
+# Confirm the VXLAN device is DOWN
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+
+# Check what holds port 4789 (shows a kernel-owned socket with no process name)
+kubectl exec -n calico-system <calico-node-pod> -- ss -ulnp sport = :4789
+
+# Find Docker Swarm VXLAN devices in Docker network namespaces (run on the node via SSH)
+for ns in /run/docker/netns/*; do
+  echo "=== $(basename "$ns") ==="
+  nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+done
+```
+
+**Fix:** Change Calico's VXLAN port to avoid the conflict:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Felix will recreate `vxlan.calico` on the new port within seconds. Verify on each node:
+
+```bash
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+# Should show: vxlan external ... dstport 4790 ... state UP
+```
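
The manual check above can be scripted. The sketch below is an illustration only: `vxlan_ready` is a hypothetical helper that reads `ip -d link show vxlan.calico` output on stdin. It looks for the `UP` flag in the angle-bracket flag list and for the expected `dstport` value, and it assumes that this output layout matches what your nodes print.

```bash
# Hypothetical helper: succeeds only if the piped `ip -d link show` output
# contains the UP flag and the expected VXLAN destination port.
vxlan_ready() {
  local port="$1" out
  out="$(cat)"
  # Match "dstport <port>" followed by a space or end of line.
  printf '%s\n' "$out" | grep -Eq "dstport ${port}( |\$)" &&
    # Match UP as a whole entry in the <FLAG,FLAG,...> list.
    printf '%s\n' "$out" | grep -Eq '<([^>]*,)?UP(,[^>]*)?>'
}
```

For example, `kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico | vxlan_ready 4790 && echo ready` prints `ready` only when both conditions hold.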
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::tip
+
+For MKE clusters, set the VXLAN port **before** enabling eBPF mode to avoid any disruption.
+
+:::
+
 ## Debug high CPU usage
 
 If you notice `$[noderunning]` using high CPU:

@@ -185,6 +185,38 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` are enabled, `kube-proxy` writes its iptables rules and Felix tries to clean them up, so the iptables rules flap between the two.
 
+You should also change `bpfKubeProxyHealthzPort` to an unused port to avoid conflicting with the Kubernetes `kube-proxy`'s default health check port (10256). The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation. Changing the health check port of the Kubernetes `kube-proxy` is typically not possible on managed platforms such as AKS. Choose a port that is not already in use on your nodes (for example, 10258; note that 10257 may be used by containerd).
-You should also change `bpfKubeProxyHealthzPort` to an unused port to avoid conflicting with the Kubernetes `kube-proxy`'s default health check port (10256). The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation. Changing the health check port of the Kubernetes `kube-proxy` is typically not possible on managed platforms such as AKS. Choose a port that is not already in use on your nodes (for example, 10258; note that 10257 may be used by containerd).
+You should also change `bpfKubeProxyHealthzPort` to an unused port to avoid conflicting with the Kubernetes `kube-proxy`'s default health check port (10256). The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation. Changing the health check port of the Kubernetes `kube-proxy` is typically not possible on managed platforms such as AKS. Choose a port that is not already in use on your nodes (for example, 10258), avoid typical Kubernetes reserved ports such as 10257 and 10259, and verify port availability by checking which ports are currently in use on your nodes.
+
+```bash
+kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyHealthzPort": 10258}}'
+```
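
One way to check whether a candidate port is already taken on a node, as a minimal sketch: `port_in_use` is a hypothetical helper (not part of $[prodname] or Kubernetes). It reads `ss -Htln` output on stdin and assumes the usual column layout, where the fourth field is the local address and port.

```bash
# Hypothetical helper: succeeds if the given TCP port appears as a local
# listening port in `ss -Htln` output supplied on stdin.
port_in_use() {
  local port="$1"
  # Field 4 is "Local Address:Port"; the port is the text after the last colon.
  awk -v p="$port" '
    { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

On a node, `ss -Htln | port_in_use 10258` exits 0 if something is already listening on TCP port 10258, in which case you should pick a different value for `bpfKubeProxyHealthzPort`.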
+
+### MKE: Change the VXLAN port before enabling eBPF
+
+:::caution
+
+MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
+When $[prodname] switches to eBPF mode on kernels where BTF is available (typically v5.8+), Felix creates the `vxlan.calico` device in
+flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+
+**You must change the VXLAN port before enabling eBPF on MKE clusters:**
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Wait for all calico-node pods to recreate the VXLAN device on the new port, then verify on each node:
+
+```bash
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+```
+
+Confirm the device shows `dstport 4790` (or your chosen port) and is UP before proceeding.
+Ensure that the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::
+
 ### Enable eBPF mode
 
 To enable eBPF mode, change the `spec.calicoNetwork.linuxDataplane` parameter in the operator's `Installation`

@@ -175,6 +175,21 @@ can do that by setting `--kube-proxy-mode=disabled` and `--kube-default-drop-mas
 
 More details can be found in [the MKE documentation](https://docs.mirantis.com/mke/current/install/predeployment/configure-networking/cluster-service-networking-options.html)
 
+:::caution VXLAN port conflict
+
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode, when BTF is available on the node (typically kernel v5.8+),
+$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+You must change the VXLAN port before enabling eBPF:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
-You must change the VXLAN port before enabling eBPF:
-
-```bash
-kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+You must configure $[prodname] to use a different VXLAN port after installing the Tigera Operator and Calico CRDs, but before enabling eBPF. For example:
+
+```bash
+cat <<EOF | kubectl apply -f -
+apiVersion: crd.projectcalico.org/v1
+kind: FelixConfiguration
+metadata:
+  name: default
+spec:
+  vxlanPort: 4790
+EOF
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+See [Troubleshoot eBPF mode](troubleshoot-ebpf.mdx#mke-vxlan-device-down) for more details.
+
+:::
+
 </TabItem>
 
 </Tabs>

@@ -396,6 +396,55 @@ e2e` command. The command resets the profiling data after dumping it.
 
 ```
 
+## MKE: VXLAN device DOWN
+
+On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:
+
+```
+Failed to set tunnel device up error=address already in use
+```
+
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode, when BTF is available on the node (typically kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+
+In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.
+
+**Diagnosis:**
+
+```bash
+# Confirm the VXLAN device is DOWN
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+
+# Check what holds port 4789 (shows a kernel-owned socket with no process name)
+kubectl exec -n calico-system <calico-node-pod> -- ss -ulnp sport = :4789
+
+# Find Docker Swarm VXLAN devices in Docker network namespaces (run on the node via SSH)
+for ns in /run/docker/netns/*; do
+  echo "=== $(basename "$ns") ==="
+  nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+done
+```
+
+**Fix:** Change Calico's VXLAN port to avoid the conflict:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Felix will recreate `vxlan.calico` on the new port within seconds. Verify on each node:
+
+```bash
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+# Should show: vxlan external ... dstport 4790 ... state UP
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::tip
+
+For MKE clusters, set the VXLAN port **before** enabling eBPF mode to avoid any disruption.
+
+:::
+
 ## Debug high CPU usage
 
 If you notice `$[noderunning]` using high CPU:

@@ -326,6 +326,38 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` are enabled, `kube-proxy` writes its iptables rules and Felix tries to clean them up, so the iptables rules flap between the two.
 
+You should also set `bpfKubeProxyHealthzPort` to `0` to disable the health check server in $[prodname]'s BPF kube-proxy replacement, which by default binds to port 10256 and would conflict with the Kubernetes `kube-proxy` already running on the node. The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation.
+
+```bash
+kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyHealthzPort": 0}}'
+```
+
+### MKE: Change the VXLAN port before enabling eBPF
+
+:::caution
+
+MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
+When $[prodname] switches to eBPF mode on kernels where BTF is available (typically v5.8+), Felix creates the `vxlan.calico` device in
+flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+
+**You must change the VXLAN port before enabling eBPF on MKE clusters:**
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Wait for all calico-node pods to recreate the VXLAN device on the new port, then verify on each node:
+
+```bash
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+```
+
+Confirm the device shows `dstport 4790` (or your chosen port) and is UP before proceeding.
+Ensure that the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::
+
 ### Enable eBPF mode
 
 **The next step depends on whether you installed $[prodname] using the operator, or a manifest:**

@@ -203,6 +203,21 @@ can do that by setting `--kube-proxy-mode=disabled` and `--kube-default-drop-mas
 
 More details can be found in [the MKE documentation](https://docs.mirantis.com/mke/current/install/predeployment/configure-networking/cluster-service-networking-options.html)
 
+:::caution VXLAN port conflict
+
+MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode, when BTF is available on the node (typically kernel v5.8+),
+$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+You must change the VXLAN port before enabling eBPF:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+See [Troubleshoot eBPF mode](troubleshoot-ebpf.mdx#mke-vxlan-device-down) for more details.
+
+:::
+
 </TabItem>
 
 </Tabs>

@@ -396,6 +396,55 @@ e2e` command. The command resets the profiling data after dumping it.
 
 ```
 
+## MKE: VXLAN device DOWN
+
+On MKE clusters, after enabling eBPF mode, the `vxlan.calico` device may stay DOWN with Felix logs showing:
+
+```
+Failed to set tunnel device up error=address already in use
+```
+
+This happens because MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789 inside Docker network namespaces (`/run/docker/netns/`). In eBPF mode, when BTF is available on the node (typically kernel v5.8+), Felix creates `vxlan.calico` in flow mode (`external`), which acts as a catch-all on its UDP port. The kernel rejects this because Docker Swarm's VXLAN device already holds port 4789.
+
+In iptables mode this conflict doesn't occur because Calico's VXLAN device binds with a specific VNI (4096), which can coexist with Docker Swarm's VXLAN on the same port.
+
+**Diagnosis:**
+
+```bash
+# Confirm the VXLAN device is DOWN
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+
+# Check what holds port 4789 (shows a kernel-owned socket with no process name)
+kubectl exec -n calico-system <calico-node-pod> -- ss -ulnp sport = :4789
+
+# Find Docker Swarm VXLAN devices in Docker network namespaces (run on the node via SSH)
+for ns in /run/docker/netns/*; do
+  echo "=== $(basename "$ns") ==="
+  nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+done
+```
+
+**Fix:** Change Calico's VXLAN port to avoid the conflict:
+
+```bash
+kubectl patch felixconfiguration default --type merge -p '{"spec":{"vxlanPort":4790}}'
+```
+
+Felix will recreate `vxlan.calico` on the new port within seconds. Verify on each node:
+
+```bash
+kubectl exec -n calico-system <calico-node-pod> -- ip -d link show vxlan.calico
+# Should show: vxlan external ... dstport 4790 ... state UP
+```
+
+Ensure the chosen UDP port is allowed by your underlying network between all nodes.
+
+:::tip
+
+For MKE clusters, set the VXLAN port **before** enabling eBPF mode to avoid any disruption.
+
+:::
+
 ## Debug high CPU usage
 
 If you notice `$[noderunning]` using high CPU:

@@ -318,6 +318,12 @@ kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptable
 
 If both `kube-proxy` and `BPFKubeProxyIptablesCleanupEnabled` are enabled, `kube-proxy` writes its iptables rules and Felix tries to clean them up, so the iptables rules flap between the two.
 
+You should also change `bpfKubeProxyHealthzPort` to an unused port to avoid conflicting with the Kubernetes `kube-proxy`'s default health check port (10256). The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation. Changing the health check port of the Kubernetes `kube-proxy` is typically not possible on managed platforms such as AKS. Choose a port that is not already in use on your nodes (for example, 10258; note that 10257 may be used by containerd).
-You should also change `bpfKubeProxyHealthzPort` to an unused port to avoid conflicting with the Kubernetes `kube-proxy`'s default health check port (10256). The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation. Changing the health check port of the Kubernetes `kube-proxy` is typically not possible on managed platforms such as AKS. Choose a port that is not already in use on your nodes (for example, 10258; note that 10257 may be used by containerd).
+You should also change `bpfKubeProxyHealthzPort` to an unused port to avoid conflicting with the Kubernetes `kube-proxy`'s default health check port (10256). The Kubernetes `kube-proxy` can serve the health check equally well, so there is no degradation. Changing the health check port of the Kubernetes `kube-proxy` is typically not possible on managed platforms such as AKS. Choose a port that is not already in use on your nodes (for example, 10258); note that other Kubernetes control plane components commonly use ports such as 10257 and 10259, so verify an unused port on your nodes with tools like `ss` or `netstat` before selecting one.
+
+```bash
+kubectl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyHealthzPort": 10258}}'
+```
+
 ### Enable eBPF mode
 
 **The next step depends on whether you installed $[prodname] using the operator, or a manifest:**