Document MKE VXLAN port conflict for eBPF dataplane #2574

Open
tomastigera wants to merge 2 commits into tigera:main from tomastigera:tomas-mke-ebpf-vxlan-port

Conversation

@tomastigera
Contributor

Summary

  • MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port 4789, which conflicts with Calico's VXLAN in eBPF flow mode (BTF/kernel v5.8+). The flow-mode device acts as a catch-all and the kernel rejects it with EADDRINUSE.
  • Documents the prerequisite to change the VXLAN port (e.g. to 4790) before enabling eBPF on MKE clusters.
  • Adds a troubleshooting section with diagnosis commands and fix for when vxlan.calico stays DOWN.
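
As a rough illustration of the prerequisite above, the port change could be applied by patching the `vxlanPort` field of the default FelixConfiguration. This is a minimal sketch, not the exact commands from the changed docs (this page does not show them). The helper name `vxlan_port_patch` is hypothetical, and the commented `kubectl patch` line assumes the cluster exposes `felixconfiguration` resources to kubectl.

```shell
#!/bin/sh
# Hypothetical helper: build a JSON merge patch that sets Felix's VXLAN port.
vxlan_port_patch() {
  port="$1"
  # Reject empty or non-numeric input before it reaches the cluster.
  case "$port" in
    ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
  esac
  # Valid UDP ports are 1-65535.
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  printf '{"spec":{"vxlanPort":%s}}\n' "$port"
}

# Print the patch body for port 4790 (the example port from the summary).
vxlan_port_patch 4790

# To apply it on a live cluster (assumption: kubectl can reach FelixConfiguration):
#   kubectl patch felixconfiguration default --type merge -p "$(vxlan_port_patch 4790)"
```

Building the patch in a function keeps the input validation separate from the cluster call, so the body can be inspected before anything is applied.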

Files changed (12 total)

Across Calico OSS (next + 3.31) and Enterprise (next + 3.23-1):

  • enabling-ebpf.mdx: New "MKE: Change the VXLAN port before enabling eBPF" prerequisite section with caution admonition
  • install.mdx: Caution admonition in the MKE tab about VXLAN port conflict
  • troubleshoot-ebpf.mdx: New "MKE: VXLAN device DOWN" section with symptom, root cause, diagnosis, and fix

Calico Cloud skipped — MKE is listed as unsupported for eBPF on that product.

Test plan

  • Verify docs build successfully
  • Check enabling-ebpf pages render the caution admonition correctly for MKE
  • Check install pages render the caution in the MKE tab
  • Check troubleshoot pages render the new section with anchor link #mke-vxlan-device-down
  • Verify cross-links from install.mdx to troubleshoot-ebpf.mdx#mke-vxlan-device-down resolve

🤖 Generated with Claude Code

MKE's Docker Swarm overlay networking creates VXLAN devices on UDP port
4789, which conflicts with Calico's VXLAN in eBPF flow mode (BTF/kernel
v5.8+). The flow-mode device acts as a catch-all and the kernel rejects
it with EADDRINUSE. The fix is to change the VXLAN port (e.g. to 4790)
before enabling eBPF.

Changes across Calico OSS (next + 3.31) and Enterprise (next + 3.23-1):
- enabling-ebpf.mdx: prerequisite section to change VXLAN port on MKE
- install.mdx: caution admonition in the MKE tab
- troubleshoot-ebpf.mdx: diagnosis and fix for VXLAN device DOWN

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@tomastigera tomastigera requested a review from a team as a code owner March 6, 2026 20:44
Copilot AI review requested due to automatic review settings March 6, 2026 20:44
@netlify
netlify bot commented Mar 6, 2026

Deploy Preview for calico-docs-preview-next ready!

Name Link
🔨 Latest commit 730ec85
🔍 Latest deploy log https://app.netlify.com/projects/calico-docs-preview-next/deploys/69ab55275fd6ab00086b8860
😎 Deploy Preview https://deploy-preview-2574--calico-docs-preview-next.netlify.app

@netlify
netlify bot commented Mar 6, 2026

Deploy Preview succeeded!

Built without sensitive environment variables

Name Link
🔨 Latest commit 730ec85
🔍 Latest deploy log https://app.netlify.com/projects/tigera/deploys/69ab552714484000087205fd
😎 Deploy Preview https://deploy-preview-2574--tigera.netlify.app

Lighthouse
1 paths audited
Performance: 66 (🔴 down 2 from production)
Accessibility: 98 (no change from production)
Best Practices: 92 (no change from production)
SEO: 100 (no change from production)
PWA: -

Contributor

Copilot AI left a comment

Pull request overview

Documents a Mirantis Kubernetes Engine (MKE) Docker Swarm VXLAN UDP port (4789) conflict that can prevent vxlan.calico from coming up when enabling Calico eBPF mode, and provides prerequisites + troubleshooting guidance across OSS and Enterprise docs (next and selected versioned docs).

Changes:

  • Added an MKE-specific prerequisite in enabling-ebpf.mdx to change Calico’s VXLAN port before enabling eBPF.
  • Added an MKE caution in install.mdx warning about the VXLAN port conflict and linking to troubleshooting.
  • Added an MKE troubleshooting section describing symptoms, diagnosis commands, and the fix (vxlanPort change).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 20 comments.

| File | Description |
| --- | --- |
| calico/operations/ebpf/enabling-ebpf.mdx | Adds MKE prerequisite section to change VXLAN port before enabling eBPF. |
| calico/operations/ebpf/install.mdx | Adds MKE caution admonition about VXLAN port 4789 conflict and links to troubleshooting. |
| calico/operations/ebpf/troubleshoot-ebpf.mdx | Adds MKE troubleshooting section for vxlan.calico DOWN and port-conflict diagnosis/fix. |
| calico_versioned_docs/version-3.31/operations/ebpf/enabling-ebpf.mdx | Backports MKE VXLAN port prerequisite to v3.31 docs. |
| calico_versioned_docs/version-3.31/operations/ebpf/install.mdx | Backports MKE caution admonition and troubleshooting link to v3.31 docs. |
| calico_versioned_docs/version-3.31/operations/ebpf/troubleshoot-ebpf.mdx | Backports MKE troubleshooting section to v3.31 docs. |
| calico-enterprise/operations/ebpf/enabling-ebpf.mdx | Adds MKE prerequisite section to change VXLAN port before enabling eBPF (Enterprise next). |
| calico-enterprise/operations/ebpf/install.mdx | Adds MKE caution admonition about VXLAN port conflict (Enterprise next). |
| calico-enterprise/operations/ebpf/troubleshoot-ebpf.mdx | Adds MKE troubleshooting section for vxlan.calico DOWN (Enterprise next). |
| calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/enabling-ebpf.mdx | Backports MKE VXLAN port prerequisite to Enterprise v3.23-1 docs. |
| calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/install.mdx | Backports MKE caution admonition and troubleshooting link to Enterprise v3.23-1 docs. |
| calico-enterprise_versioned_docs/version-3.23-1/operations/ebpf/troubleshoot-ebpf.mdx | Backports MKE troubleshooting section to Enterprise v3.23-1 docs. |


```

## MKE: VXLAN device DOWN {#mke-vxlan-device-down}

Copilot AI Mar 6, 2026

The `{#mke-vxlan-device-down}` custom heading ID syntax does not appear to be supported in this repo’s Docusaurus/remark configuration (no heading-id remark plugin). It will likely render the `{#...}` literally and the generated anchor may not match the cross-links. Prefer relying on the auto-generated heading slug (which should already be `mke-vxlan-device-down` for this heading), or use an explicit HTML anchor element if a fixed ID is required.

Suggested change
- ## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+ ## MKE: VXLAN device DOWN

Copilot uses AI. Check for mistakes.

```

## MKE: VXLAN device DOWN {#mke-vxlan-device-down}

Copilot AI Mar 6, 2026

The `{#mke-vxlan-device-down}` custom heading ID syntax does not appear to be supported in this repo’s Docusaurus/remark configuration. It will likely render literally and the resulting anchor may not match `#mke-vxlan-device-down` links. Prefer relying on the auto-generated slug for the heading, or add an explicit HTML anchor if you need a fixed ID.

Suggested change
- ## MKE: VXLAN device DOWN {#mke-vxlan-device-down}
+ <a id="mke-vxlan-device-down"></a>
+ ## MKE: VXLAN device DOWN

Copilot uses AI. Check for mistakes.
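
The claim in the two comments above, that the auto-generated slug for this heading is already `mke-vxlan-device-down`, can be sanity-checked with a small approximation of the slug rule (lowercase, drop punctuation, turn spaces into hyphens). This is a sketch only: `heading_slug` is a hypothetical helper that handles plain ASCII headings and does not reproduce everything the real Docusaurus slugger does (Unicode, duplicate headings).

```shell
#!/bin/sh
# Hypothetical helper: approximate a heading slug for plain ASCII headings.
heading_slug() {
  printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' |          # lowercase everything
    sed -e 's/[^a-z0-9 _-]//g' -e 's/ /-/g'  # drop punctuation, spaces -> hyphens
}

heading_slug 'MKE: VXLAN device DOWN'
```

If the printed slug matches the fragment used in the cross-links from `install.mdx`, the explicit `{#...}` ID is redundant for this heading.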
:::caution

MKE uses Docker Swarm overlay networking, which creates VXLAN devices on UDP port 4789 inside Docker network namespaces.
When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in

Copilot AI Mar 6, 2026

“kernels with BTF support (v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that triggers this behavior. Consider rewording to the actual prerequisite (e.g., kernel v5.8+ supporting VXLAN flow/external mode) and mention BTF/CO-RE separately if needed.

Suggested change
- When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
+ When $[prodname] switches to eBPF mode on Linux kernels v5.8+ that support VXLAN flow/external mode, Felix creates the `vxlan.calico` device in

Copilot uses AI. Check for mistakes.
Comment on lines +420 to +424
# Find Docker Swarm VXLAN devices in Docker network namespaces
for ns in /run/docker/netns/*; do
echo "=== $(basename $ns) ==="
nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
done

Copilot AI Mar 6, 2026

The diagnosis snippet mixes `kubectl exec ...` commands with a local shell `for ns in /run/docker/netns/*; do ...` loop. It’s unclear where the loop should be executed, and copy/paste will fail from an admin workstation. Wrap the loop in `kubectl exec ... -- sh -c '...'` or explicitly direct readers to run it on the node (SSH).

Suggested change
- # Find Docker Swarm VXLAN devices in Docker network namespaces
- for ns in /run/docker/netns/*; do
- echo "=== $(basename $ns) ==="
- nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
- done
+ # Find Docker Swarm VXLAN devices in Docker network namespaces (from the calico-node pod)
+ kubectl exec -n calico-system <calico-node-pod> -- sh -c 'for ns in /run/docker/netns/*; do
+ echo "=== $(basename "$ns") ==="
+ nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
+ done'

Copilot uses AI. Check for mistakes.
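
Whichever way the loop is run, its output still has to be read for the UDP port. A minimal sketch of that step, assuming the `ip -d link show type vxlan` output contains `dstport <n>` tokens: `vxlan_dstports` is a hypothetical helper, and the sample text is simplified and illustrative, not real node output.

```shell
#!/bin/sh
# Hypothetical helper: read `ip -d link show type vxlan` output on stdin and
# print each distinct VXLAN destination port once, in numeric order.
vxlan_dstports() {
  grep -o 'dstport [0-9][0-9]*' | awk '{ print $2 }' | sort -un
}

# Simplified, illustrative sample (two devices sharing port 4789).
sample='5: vx-001000-abcde: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 state UNKNOWN
    vxlan id 4096 srcport 0 0 dstport 4789 proxy l2miss l3miss
9: vxlan.calico: <BROADCAST,MULTICAST> mtu 1450 state DOWN
    vxlan srcport 0 0 dstport 4789 nolearning'

printf '%s\n' "$sample" | vxlan_dstports
```

If a port printed for the Docker namespaces equals the port Calico is configured to use, that is the conflict this PR documents.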
Comment on lines +193 to +195
When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.

Copilot AI Mar 6, 2026

“kernels with BTF support (v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that triggers this behavior. Consider rewording to the actual prerequisite (e.g., kernel v5.8+ supporting VXLAN flow/external mode) and mention BTF/CO-RE separately if needed.

Suggested change
- When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
- flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
- already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use` errors.
+ When $[prodname] switches to eBPF mode on Linux kernels v5.8+ that support VXLAN flow/external mode, Felix creates the
+ `vxlan.calico` device in flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN
+ device (Docker Swarm's) already holds the same port, causing `vxlan.calico` to stay DOWN with `address already in use`
+ errors.

Copilot uses AI. Check for mistakes.
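
The distinction raised above, kernel version versus BTF availability, can be checked as two independent facts on a node. A minimal sketch: `kernel_at_least` and `btf_available` are hypothetical helper names, and the presence of `/sys/kernel/btf/vmlinux` is used as the BTF indicator.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string (default: `uname -r`)
# is at least the given MAJOR.MINOR version.
kernel_at_least() {
  want_major="${1%%.*}"
  want_minor="${1#*.}"
  have="${2:-$(uname -r)}"
  have_major="${have%%.*}"
  rest="${have#*.}"
  have_minor="${rest%%[!0-9]*}"   # keep leading digits, e.g. "15" from "15.0-91-generic"
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Hypothetical helper: succeed if the kernel exposes its BTF type information.
btf_available() {
  [ -r /sys/kernel/btf/vmlinux ]
}

if kernel_at_least 5.8; then echo "kernel: 5.8 or newer"; else echo "kernel: older than 5.8"; fi
if btf_available; then echo "BTF: present"; else echo "BTF: not found"; fi
```

Reporting the two results separately avoids implying that one guarantees the other, which is the wording concern in this comment.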
Comment on lines +180 to +182
MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),
$[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
You must change the VXLAN port before enabling eBPF:

Copilot AI Mar 6, 2026

“eBPF mode with BTF support (kernel v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that causes this port conflict. Consider rephrasing to the actual kernel prerequisite (typically v5.8+ supporting VXLAN external/flow mode) and keep BTF/CO-RE recommendations separate.

Suggested change
- MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),
- $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
- You must change the VXLAN port before enabling eBPF:
+ MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. On kernels that support VXLAN flow/external mode (typically v5.8+),
+ $[prodname] creates the `vxlan.calico` device in flow mode, which conflicts with Docker Swarm's use of the same port.
+ This conflict is due to VXLAN behavior and is independent of BTF/CO-RE support. You must change the VXLAN port before enabling eBPF:

Copilot uses AI. Check for mistakes.
Comment on lines +420 to +424
# Find Docker Swarm VXLAN devices in Docker network namespaces
for ns in /run/docker/netns/*; do
echo "=== $(basename $ns) ==="
nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
done

Copilot AI Mar 6, 2026

The diagnosis snippet mixes `kubectl exec ...` commands with a local shell `for ns in /run/docker/netns/*; do ...` loop. As written, it’s unclear where the loop should be executed (admin workstation vs inside the calico-node container vs SSH on the node), and copy/pasting the whole block will fail on most machines. Consider wrapping the loop in a `kubectl exec ... -- sh -c '...'` (or explicitly instruct readers to SSH to the node before running that loop).

Copilot uses AI. Check for mistakes.

:::caution VXLAN port conflict

MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),

Copilot AI Mar 6, 2026

“eBPF mode with BTF support (kernel v5.8+)” is likely conflating separate concepts: kernel version does not guarantee BTF, and the VXLAN flow/external-mode behavior isn’t specifically tied to BTF. Reword to describe the actual prerequisite (e.g., kernels that support VXLAN flow/external mode, typically v5.8+) and mention BTF/CO-RE separately if needed.

Suggested change
- MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. In eBPF mode with BTF support (kernel v5.8+),
+ MKE's Docker Swarm overlay networking uses UDP port 4789 for its own VXLAN devices. On kernels that support VXLAN flow/external mode (typically v5.8+),

Copilot uses AI. Check for mistakes.
Comment on lines +334 to +335
When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)

Copilot AI Mar 6, 2026

“kernels with BTF support (v5.8+)” is likely conflating CO-RE/BTF with the kernel VXLAN flow/external-mode feature that triggers this port-binding behavior. Consider rephrasing to the actual kernel prerequisite (e.g., v5.8+ with VXLAN external/flow mode) and keep BTF/CO-RE recommendations separate.

Suggested change
- When $[prodname] switches to eBPF mode on kernels with BTF support (v5.8+), Felix creates the `vxlan.calico` device in
- flow mode, which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)
+ When $[prodname] switches to eBPF mode on Linux kernels v5.8+ that support VXLAN external/flow mode, Felix creates the `vxlan.calico` device in
+ flow mode (external mode), which acts as a catch-all on its UDP port. The kernel rejects this when another VXLAN device (Docker Swarm's)

Copilot uses AI. Check for mistakes.
Comment on lines +420 to +424
# Find Docker Swarm VXLAN devices in Docker network namespaces
for ns in /run/docker/netns/*; do
echo "=== $(basename $ns) ==="
nsenter --net="$ns" ip -d link show type vxlan 2>/dev/null
done

Copilot AI Mar 6, 2026

The diagnosis snippet mixes `kubectl exec ...` commands with a local shell `for ns in /run/docker/netns/*; do ...` loop. It’s unclear where the loop should be executed, and copy/paste will fail from an admin workstation. Wrap the loop in `kubectl exec ... -- sh -c '...'` or explicitly direct readers to run it on the node (SSH).

Copilot uses AI. Check for mistakes.
- Remove unsupported {#heading-id} syntax from troubleshoot headings
  (Docusaurus auto-generates the slug from the heading text)
- Clarify BTF wording: "when BTF is available (typically v5.8+)" instead
  of "with BTF support (v5.8+)" to avoid conflating BTF with kernel version
- Add "(run on the node via SSH)" note and quote $ns in the Docker netns
  diagnosis loop to clarify execution context

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>