feat(bootstrap): add Podman socket fallback for macOS #502

Open
craigamcw wants to merge 1 commit into NVIDIA:main from craigamcw:feat/podman-macos-support

@craigamcw

Implemented feature with help from Claude Code.

Add additive Podman support on macOS without changing any Linux paths, K3s logic, policy engine, or inference routing.

Socket discovery fallback chain:

  1. $DOCKER_HOST
  2. $CONTAINER_HOST
  3. /var/run/docker.sock (bollard default)
  4. Podman socket via podman machine inspect (macOS only)
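The priority order above can be sketched as a pure function. This is a minimal illustration, not the PR's `check_docker_available()` implementation: `SocketSource`, `discover_socket`, and the closure that stands in for the `podman machine inspect` lookup are hypothetical names, and the inputs are passed in explicitly so the ordering can be exercised without touching the real environment.

```rust
/// Where a usable container socket was found.
/// (Hypothetical type for illustration; not taken from the PR.)
#[derive(Debug, PartialEq)]
enum SocketSource {
    DockerHost(String),
    ContainerHost(String),
    DefaultDockerSock,
    PodmanMachine(String),
}

/// Walk the fallback chain in priority order and return the first hit.
/// `podman_machine_socket` stands in for the `podman machine inspect`
/// lookup and is only invoked when every earlier step has failed.
fn discover_socket(
    docker_host: Option<&str>,
    container_host: Option<&str>,
    default_sock_exists: bool,
    is_macos: bool,
    podman_machine_socket: impl FnOnce() -> Option<String>,
) -> Option<SocketSource> {
    // 1. $DOCKER_HOST (an empty value is treated as unset in this sketch)
    if let Some(host) = docker_host.filter(|h| !h.is_empty()) {
        return Some(SocketSource::DockerHost(host.to_string()));
    }
    // 2. $CONTAINER_HOST
    if let Some(host) = container_host.filter(|h| !h.is_empty()) {
        return Some(SocketSource::ContainerHost(host.to_string()));
    }
    // 3. /var/run/docker.sock
    if default_sock_exists {
        return Some(SocketSource::DefaultDockerSock);
    }
    // 4. Podman machine socket, macOS only
    if is_macos {
        return podman_machine_socket().map(SocketSource::PodmanMachine);
    }
    None
}
```

A real caller would presumably feed this from `std::env::var` and a filesystem check on `/var/run/docker.sock`; treating empty variables as unset is an assumption of the sketch.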

Container runtime adaptations when Podman is detected:

  • security_opt: unmask /sys/fs/cgroup and /dev/kmsg
  • kubelet feature gate: KubeletInUserNamespace=true
  • kubelet args: cgroups-per-qos=false, enforce-node-allocatable= (empty value)
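As a rough sketch of how these adaptations might be gated on runtime detection: the function names below are hypothetical, and the exact string encodings (k3s `--kubelet-arg=` flags, `unmask=` security options) are assumptions about how the values listed above would be passed, not a copy of the PR's code.

```rust
/// Extra k3s server arguments applied only when the runtime is Podman.
/// (Hypothetical helper; the flag spelling is an assumption.)
fn podman_kubelet_args(is_podman: bool) -> Vec<String> {
    if !is_podman {
        // Docker and Linux paths are left untouched.
        return Vec::new();
    }
    vec![
        "--kubelet-arg=feature-gates=KubeletInUserNamespace=true".to_string(),
        "--kubelet-arg=cgroups-per-qos=false".to_string(),
        // Intentionally empty value: no node-allocatable enforcement.
        "--kubelet-arg=enforce-node-allocatable=".to_string(),
    ]
}

/// Extra security options for the k3s container under Podman.
/// (Hypothetical helper; the `unmask=` encoding is an assumption.)
fn podman_security_opts(is_podman: bool) -> Vec<String> {
    if !is_podman {
        return Vec::new();
    }
    vec![
        "unmask=/sys/fs/cgroup".to_string(),
        "unmask=/dev/kmsg".to_string(),
    ]
}
```

Returning empty vectors for non-Podman runtimes keeps the change additive, which matches the stated goal of not altering existing Linux or Docker behavior.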

Image push reliability:

  • Extended timeout (120s → 600s) for Unix socket connections
  • Fallback from bollard put_archive API to docker cp CLI for large image transfers that fail over the Podman API socket
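The two reliability measures can be sketched independently of bollard. In this illustration `push_timeout` and `upload_with_fallback` are hypothetical names, detecting a Unix socket by its `unix://` prefix is an assumption, and the closures stand in for the real `put_archive` call and the `docker cp` invocation.

```rust
use std::time::Duration;

const DEFAULT_TIMEOUT: Duration = Duration::from_secs(120);
const UNIX_SOCKET_TIMEOUT: Duration = Duration::from_secs(600);

/// Pick the longer timeout when the host looks like a Unix socket.
/// (The prefix checks are an assumption of this sketch.)
fn push_timeout(host: &str) -> Duration {
    if host.starts_with("unix://") || host.starts_with('/') {
        UNIX_SOCKET_TIMEOUT
    } else {
        DEFAULT_TIMEOUT
    }
}

/// Try the API upload first; if it fails, try the CLI fallback.
/// Returns which path succeeded, or an error that reports both failures.
fn upload_with_fallback<P, F>(primary: P, fallback: F) -> Result<&'static str, String>
where
    P: FnOnce() -> Result<(), String>,
    F: FnOnce() -> Result<(), String>,
{
    match primary() {
        Ok(()) => Ok("api"),
        Err(api_err) => fallback().map(|()| "cli").map_err(|cli_err| {
            format!(
                "put_archive failed: {}; docker cp fallback failed: {}",
                api_err, cli_err
            )
        }),
    }
}
```

Keeping both error messages in the final error means a failed upload still shows why the API path was abandoned.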

Also adds documentation for Podman setup in quickstart, support matrix, and a new troubleshooting page.

Summary

Adds Podman as a supported container runtime on macOS. OpenShell now auto-discovers the Podman machine socket, configures k3s kubelet flags for rootful Podman compatibility, and falls back to docker cp for reliable large image uploads. No Linux paths, K3s core logic, policy engine, or inference routing are changed.

Related Issue

N/A — feature contribution (Podman on macOS was previously unsupported)

Changes

  • Socket discovery fallback chain in check_docker_available(): $DOCKER_HOST > $CONTAINER_HOST > /var/run/docker.sock > Podman socket via podman machine inspect (macOS only)
  • Podman runtime detection (is_podman_runtime()) via Docker version API response
  • k3s container adaptations when Podman detected: security_opt unmask for cgroup/kmsg, kubelet KubeletInUserNamespace=true, cgroups-per-qos=false, enforce-node-allocatable= (empty value)
  • Image push reliability: extended timeout (120s → 600s) for Unix socket connections; docker cp CLI fallback when bollard put_archive API fails on large payloads
  • Documentation: Podman setup in quickstart, support matrix entry, new troubleshooting page
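For the runtime detection item, one plausible shape is a check over the component names reported by the version endpoint. The PR does not say which field `is_podman_runtime()` inspects, so this is only a sketch under the assumption that Podman's Docker-compatible version response names a component such as "Podman Engine". The function below takes the names as plain strings rather than a bollard type.

```rust
/// Return true if any reported component name mentions Podman.
/// (Assumption: the version response lists a component like "Podman Engine";
/// the field actually inspected by the PR is not stated.)
fn is_podman_runtime(component_names: &[&str]) -> bool {
    component_names
        .iter()
        .any(|name| name.to_ascii_lowercase().contains("podman"))
}
```

The case-insensitive substring match is deliberately loose so that minor naming differences between Podman releases would not defeat detection.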

Testing

  • [ Y ] mise run pre-commit passes
  • [ Y ] Unit tests added/updated (8 new tests: socket discovery, extended timeout, tar wrapping, docker cp fallback lifecycle and error messages — 97 total, 0 failures)
  • [ Y ] E2E tests added/updated (if applicable): manually verified end-to-end. nemoclaw onboard completes successfully on a macOS M4 Mini (16 GB) with Podman 5.8.1 (rootful mode, Apple Hypervisor backend)

Checklist

  • [ Y ] Follows Conventional Commits
  • [ Y ] Commits are signed off (DCO)
  • [ Y ] Architecture docs updated (if applicable): docs/get-started/quickstart.md, docs/reference/support-matrix.md, docs/reference/troubleshooting.md (new)

Signed-off-by: Craig <craig@epic28.com>
@craigamcw requested a review from a team as a code owner on March 20, 2026, 15:35

@github-actions bot commented Mar 20, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

@craigamcw
Author

I have read the DCO document and I hereby sign the DCO.

@craigamcw
Author

recheck

@johntmyers added the test:e2e (Requires end-to-end coverage) label on Mar 20, 2026
@drew self-assigned this on Mar 20, 2026