procd jail incorrect mount behavior when openwrt is run in a container (namespaces)? #22
Open
Description
See the following strace of dnsmasq being started under ujail:
strace: Process 74923 attached
[pid 74923] execve("/sbin/ujail", ["/sbin/ujail", "-t", "5", "-n", "dnsmasq", "-u", "-l", "-r", "/bin/ubus", "-r", "/etc/TZ", "-r", "/etc/dnsmasq.conf", "-r", "/etc/ethers", "-r", "/etc/group", "-r", "/etc/hosts", "-r", "/etc/passwd", "-w", "/tmp/dhcp.leases", "-r", "/tmp/dnsmasq.cfg01411c.d", "-r", "/tmp/hosts", "-r", "/tmp/resolv.conf.d", "-r", "/usr/bin/jshn", "-r", ...], 0x7fff4ddbc2f8 /* 10 vars */) = 0
[pid 74923] clone(child_stack=0x5636ee84a718, flags=CLONE_NEWNS|CLONE_NEWIPC|CLONE_NEWPID|SIGCHLDstrace: Process 74924 attached) = 4367
[pid 74924] mount("none", "/", 0x5636ee7446dd, MS_REC|MS_PRIVATE, NULL) = 0
[pid 74924] mount("tmpfs", "/tmp/ujail-EJimnB", "tmpfs", MS_NOATIME, "mode=0755") = 0
[pid 74924] mount("/bin/sh", "/tmp/ujail-EJimnB/bin/sh", 0x5636ee745e09, MS_BIND, NULL) = 0
[pid 74924] mount("/bin/sh", "/tmp/ujail-EJimnB/bin/sh", 0x5636ee7446dd, MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = 0
[pid 74924] mount("/bin/ubus", "/tmp/ujail-EJimnB/bin/ubus", 0x5636ee745e09, MS_BIND, NULL) = 0
[pid 74924] mount("/bin/ubus", "/tmp/ujail-EJimnB/bin/ubus", 0x5636ee7446dd, MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = 0
[pid 74924] mount(NULL, "/tmp/ujail-EJimnB/dev", "tmpfs", MS_NOSUID|MS_NOEXEC|MS_NOATIME, "size=1M") = 0
[pid 74924] mount("/dev/log", "/tmp/ujail-EJimnB/dev/log", 0x5636ee745e09, MS_BIND, NULL) = 0
[pid 74924] mount("/dev/log", "/tmp/ujail-EJimnB/dev/log", 0x5636ee7446dd, MS_REMOUNT|MS_BIND, NULL) = -1 EPERM (Operation not permitted)
[pid 74924] +++ exited with 1 +++
[pid 74923] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=4367, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
[pid 74923] +++ exited with 1 +++
In mount_namespaces(7), "overview of Linux mount namespaces" [5], it is stated that:
The mount flags MS_RDONLY, MS_NOSUID, MS_NOEXEC, and the
"atime" flags (MS_NOATIME, MS_NODIRATIME, MS_RELATIME)
settings become locked when propagated from a more privileged
to a less privileged mount namespace, and may not be changed
in the less privileged mount namespace.
This point is illustrated in the following example where, in
a more privileged mount namespace, we create a bind mount
that is marked as read-only. For security reasons, it should
not be possible to make the mount writable in a less
privileged mount namespace, and indeed the kernel prevents
this:
$ sudo mkdir /mnt/dir
$ sudo mount --bind -o ro /some/path /mnt/dir
$ sudo unshare --user --map-root-user --mount \
mount -o remount,rw /mnt/dir
mount: /mnt/dir: permission denied.
There is a workaround in lxc/lxc-ci#586 that unjails dnsmasq, but I guess the proper fix is to OR in the mount's original flags (which may include nosuid) when remounting?
bash-5.2# cat /proc/mounts | grep "/dev "
tmpfs /dev tmpfs rw,nosuid,size=4096k,nr_inodes=65536,mode=755,uid=921370624,gid=921370624,inode64 0 0
sysfs /sys/dev sysfs ro,nosuid,nodev,noexec,relatime 0 0
Note: ftrace seems to suggest that the EPERM comes from can_change_locked_flags().