Document MKE VXLAN port conflict for eBPF dataplane #2574
tomastigera wants to merge 2 commits into tigera:main
MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789, which conflicts with Calico's VXLAN in eBPF flow mode (BTF/kernel v5.8+). The flow-mode device acts as a catch-all and the kernel rejects it with EADDRINUSE. The fix is to change the VXLAN port (e.g. to 4790) before enabling eBPF.

Changes across Calico OSS (next + 3.31) and Enterprise (next + 3.23-1):
- enabling-ebpf.mdx: prerequisite section to change VXLAN port on MKE
- install.mdx: caution admonition in the MKE tab
- troubleshoot-ebpf.mdx: diagnosis and fix for VXLAN device DOWN

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Documents a Mirantis Kubernetes Engine (MKE) Docker Swarm VXLAN UDP port (4789) conflict that can prevent vxlan.calico from coming up when enabling Calico eBPF mode, and provides prerequisites + troubleshooting guidance across OSS and Enterprise docs (next and selected versioned docs).
Changes:
- Added an MKE-specific prerequisite in `enabling-ebpf.mdx` to change Calico's VXLAN port before enabling eBPF.
- Added an MKE caution in `install.mdx` warning about the VXLAN port conflict and linking to troubleshooting.
- Added an MKE troubleshooting section describing symptoms, diagnosis commands, and the fix (`vxlanPort` change).
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| calico/operations/ebpf/enabling-ebpf.mdx | Adds MKE prerequisite section to change VXLAN port before enabling eBPF. |
| calico/operations/ebpf/install.mdx | Adds MKE caution admonition about VXLAN port 4789 conflict and links to troubleshooting. |
| calico/operations/ebpf/troubleshoot-ebpf.mdx | Adds MKE troubleshooting section for vxlan.calico DOWN and port-conflict diagnosis/fix. |
| calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx | Backports MKE VXLAN port prerequisite to v3.31 docs. |
| calico_versioned_docs/version-3.31/operations/ebpf/install.mdx | Backports MKE caution admonition and troubleshooting link to v3.31 docs. |
| calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx | Backports MKE troubleshooting section to v3.31 docs. |
| calico-enterprise/operations/ebpf/enabling-ebpf.mdx | Adds MKE prerequisite section to change VXLAN port before enabling eBPF (Enterprise next). |
| calico-enterprise/operations/ebpf/install.mdx | Adds MKE caution admonition about VXLAN port conflict (Enterprise next). |
| calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx | Adds MKE troubleshooting section for vxlan.calico DOWN (Enterprise next). |
| calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx | Backports MKE VXLAN port prerequisite to Enterprise v3.23-1 docs. |
| calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx | Backports MKE caution admonition and troubleshooting link to Enterprise v3.23-1 docs. |
| calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx | Backports MKE troubleshooting section to Enterprise v3.23-1 docs. |
| ## MKE: VXLAN device DOWN {#mke-vxlan-device-down} |
The {#mke-vxlan-device-down} custom heading ID syntax does not appear to be supported in this repo’s Docusaurus/remark configuration (no heading-id remark plugin). It will likely render the {#...} literally and the generated anchor may not match the cross-links. Prefer relying on the auto-generated heading slug (which should already be mke-vxlan-device-down for this heading), or use an explicit HTML anchor element if a fixed ID is required.
| ## MKE: VXLAN device DOWN {#mke-vxlan-device-down} | |
| ## MKE: VXLAN device DOWN |
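To illustrate the auto-generated slug this comment relies on, here is a minimal sketch that approximates heading-slug generation in shell: lowercase the text, drop punctuation, and turn each space into a hyphen. `slugify` is a hypothetical helper, not the slugger Docusaurus actually uses, and it only handles ASCII headings.

```shell
# Approximate a Markdown heading slug (ASCII only).
# Hypothetical helper for illustration; not Docusaurus's own implementation.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9 _-]//g' -e 's/ /-/g'
}
```

Under this approximation, `slugify "MKE: VXLAN device DOWN"` prints `mke-vxlan-device-down`, which matches the cross-link target. If the `{#...}` suffix were treated as literal heading text, the same function would produce a different slug, which is the mismatch the comment warns about.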
| ## MKE: VXLAN device DOWN {#mke-vxlan-device-down} |
The {#mke-vxlan-device-down} custom heading ID syntax does not appear to be supported in this repo’s Docusaurus/remark configuration. It will likely render literally and the resulting anchor may not match #mke-vxlan-device-down links. Prefer relying on the auto-generated slug for the heading, or add an explicit HTML anchor if you need a fixed ID.
| ## MKE: VXLAN device DOWN {#mke-vxlan-device-down} | |
| <a id="mke-vxlan-device-down"></a> | |
| ## MKE: VXLAN device DOWN |
| :::caution | ||
| MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces. | ||
| When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in |
“kernels with BTF support (v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that triggers this behavior. Consider rewording to the actual prerequisite (e.g., kernel v5.8+ supporting VXLAN flow/external mode) and mention BTF/CO-RE separately if needed.
| When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in | |
| When $[prodname] switches to eBPF mode on Linux kernels v5.8+ that support VXLAN flow/external mode, Felix creates the `vxlan.calico` device in |
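The distinction drawn here (kernel version on one side, BTF availability on the other) can be checked separately on a node. The following is a minimal sketch, assuming the v5.8 threshold quoted in the PR text; `kernel_at_least` and `has_btf` are hypothetical helper names, and which of the two conditions Felix actually keys on is not settled by this thread.

```shell
# Succeeds if a kernel release string (for example the output of `uname -r`)
# is at least MAJOR.MINOR. Hypothetical helper for illustration.
kernel_at_least() {
  release=$1 want_major=$2 want_minor=$3
  major=${release%%.*}          # text before the first dot
  rest=${release#*.}            # text after the first dot
  minor=${rest%%[!0-9]*}        # leading digits of the remainder
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

# Succeeds if the kernel exposes BTF type information.
# The path argument is optional and defaults to the standard sysfs location.
has_btf() {
  [ -r "${1:-/sys/kernel/btf/vmlinux}" ]
}
```

On a node, `kernel_at_least "$(uname -r)" 5 8` and `has_btf` can then be reported independently, since a v5.8+ kernel built without `CONFIG_DEBUG_INFO_BTF` passes the first check but not the second.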
| # Find Docker Swarm VXLAN devices in Docker network namespaces | ||
| for ns in /run/docker/netns/*; do | ||
| echo "=== $(basename $ns) ===" | ||
| nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null | ||
| done |
The diagnosis snippet mixes kubectl exec ... commands with a local shell for ns in /run/docker/netns/*; do ... loop. It’s unclear where the loop should be executed, and copy/paste will fail from an admin workstation. Wrap the loop in kubectl exec ... -- sh -c '...' or explicitly direct readers to run it on the node (SSH).
| # Find Docker Swarm VXLAN devices in Docker network namespaces | |
| for ns in /run/docker/netns/*; do | |
| echo "=== $(basename $ns) ===" | |
| nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null | |
| done | |
| # Find Docker Swarm VXLAN devices in Docker network namespaces (from the calico-node pod) | |
| kubectl exec -n calico-system <calico-node-pod> -- sh -c 'for ns in /run/docker/netns/*; do | |
| echo "=== $(basename "$ns") ===" | |
| nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null | |
| done' |
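Wherever the loop ends up running, its output still has to be read by eye to spot the shared port. As a rough sketch, the filter below reduces `ip -d link show type vxlan` output to one `<device> <dstport>` pair per line. `vxlan_ports` is a hypothetical helper, and it assumes the iproute2 output contains a `dstport <n>` token on each device's detail line, which may vary between iproute2 versions.

```shell
# Read `ip -d link show type vxlan` output on stdin and print
# "<device> <dstport>" for each VXLAN device found.
# Hypothetical helper; assumes a "dstport <n>" token in the detail lines.
vxlan_ports() {
  awk '
    # Header line such as "5: vxlan.calico: <...>": remember the device name.
    /^[0-9]+: / { dev = $2; sub(/:$/, "", dev); sub(/@.*/, "", dev) }
    # Any line carrying "dstport <n>": report it for the current device.
    {
      for (i = 1; i < NF; i++)
        if ($i == "dstport") print dev, $(i + 1)
    }
  '
}
```

For example, `ip -d link show type vxlan | vxlan_ports` on the node, and `nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null | vxlan_ports` inside the loop. If `vxlan.calico` and a Docker Swarm device report the same port, that is the conflict described in this PR.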
| When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in | ||
| flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's) | ||
| already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors. |
“kernels with BTF support (v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that triggers this behavior. Consider rewording to the actual prerequisite (e.g., kernel v5.8+ supporting VXLAN flow/external mode) and mention BTF/CO-RE separately if needed.
| When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in | |
| flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's) | |
| already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors. | |
| When $[prodname] switches to eBPF mode on Linux kernels v5.8+ that support VXLAN flow/external mode, Felix creates the | |
| `vxlan.calico` device in flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN | |
| device (Docker Swarm's) already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` | |
| errors. |
| MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+), | ||
| $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port. | ||
| You must change the VXLAN port before enabling eBPF: |
“eBPF mode with BTF support (kernel v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that causes this port conflict. Consider rephrasing to the actual kernel prerequisite (typically v5.8+ supporting VXLAN external/flow mode) and keep BTF/CO-RE recommendations separate.
| MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+), | |
| $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port. | |
| You must change the VXLAN port before enabling eBPF: | |
| MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. On kernels that support VXLAN flow/external mode (typically v5.8+), | |
| $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port. | |
| This conflict is due to VXLAN behavior and is independent of BTF/CO-RE support. You must change the VXLAN port before enabling eBPF: |
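For context, the port change itself could look roughly like the sketch below. It assumes the setting is the `vxlanPort` field of the `default` FelixConfiguration resource and that a JSON merge patch is an acceptable way to set it; the exact steps in the PR's docs are not shown in this thread, so treat the resource name and the `kubectl` invocation as assumptions. `vxlan_port_patch` is a hypothetical helper that only builds and validates the patch body.

```shell
# Build a JSON merge patch that sets spec.vxlanPort.
# Hypothetical helper; rejects anything that is not a valid UDP port.
vxlan_port_patch() {
  port=$1
  case $port in
    # Empty, leading zero, non-digit characters, or more than five digits.
    ''|0*|*[!0-9]*|??????*)
      echo "invalid port: $port" >&2
      return 1 ;;
  esac
  if [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  printf '{"spec":{"vxlanPort":%s}}\n' "$port"
}
```

A possible use, under the assumptions above: `kubectl patch felixconfiguration default --type=merge -p "$(vxlan_port_patch 4790)"`. Port 4790 is the example value from the PR description; any free UDP port that differs from Docker Swarm's 4789 would serve the same purpose.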
| # Find Docker Swarm VXLAN devices in Docker network namespaces | ||
| for ns in /run/docker/netns/*; do | ||
| echo "=== $(basename $ns) ===" | ||
| nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null | ||
| done |
The diagnosis snippet mixes kubectl exec ... commands with a local shell for ns in /run/docker/netns/*; do ... loop. As written, it’s unclear where the loop should be executed (admin workstation vs inside the calico-node container vs SSH on the node), and copy/pasting the whole block will fail on most machines. Consider wrapping the loop in a kubectl exec ... -- sh -c '...' (or explicitly instruct readers to SSH to the node before running that loop).
calico/operations/ebpf/install.mdx
| :::caution VXLAN port conflict | ||
| MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+), |
“eBPF mode with BTF support (kernel v5.8+)” is likely conflating separate concepts: kernel version does not guarantee BTF, and the VXLAN flow/external-mode behavior isn’t specifically tied to BTF. Reword to describe the actual prerequisite (e.g., kernels that support VXLAN flow/external mode, typically v5.8+) and mention BTF/CO-RE separately if needed.
| MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+), | |
| MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. On kernels that support VXLAN flow/external mode (typically v5.8+), |
| When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in | ||
| flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's) |
“kernels with BTF support (v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that triggers this port-binding behavior. Consider rephrasing to the actual kernel prerequisite (e.g., v5.8+ with VXLAN external/flow mode) and keep BTF/CO-RE recommendations separate.
| When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in | |
| flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's) | |
| When $[prodname] switches to eBPF mode on Linux kernels v5.8+ that support VXLAN external/flow mode, Felix creates the `vxlan.calico` device in | |
| flow mode (external mode), which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's) |
- Remove unsupported {#heading-id} syntax from troubleshoot headings
(Docusaurus auto-generates the slug from the heading text)
- Clarify BTF wording: "when BTF is available (typically v5.8+)" instead
of "with BTF support (v5.8+)" to avoid conflating BTF with kernel version
- Add "(run on the node via SSH)" note and quote $ns in the Docker netns
diagnosis loop to clarify execution context
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Summary
EADDRINUSE. vxlan.calico stays DOWN.
Files changed (12 total)
Across Calico OSS (next + 3.31) and Enterprise (next + 3.23-1):
Calico Cloud skipped — MKE is listed as unsupported for eBPF on that product.
Test plan
#mke-vxlan-device-down
🤖 Generated with Claude Code