toolbox/toolbox-cmd-initContainer-B...

77 lines
3.3 KiB
Diff

From 1cc9e07b7c36fe9f9784b40b58f0a2a3694dd328 Mon Sep 17 00:00:00 2001
From: Debarshi Ray <rishi@fedoraproject.org>
Date: Thu, 13 Jul 2023 13:08:40 +0200
Subject: [PATCH] cmd/initContainer: Be aware of security hardened mount points
Sometimes locations such as /var/lib/flatpak, /var/lib/systemd/coredump
and /var/log/journal sit on security hardened mount points that are
marked as 'nosuid,nodev,noexec' [1]. In such cases, when Toolbx is used
rootless, an attempt to bind mount these locations read-only at runtime
with mount(8) fails because of permission problems:
# mount --rbind -o ro <source> <containerPath>
mount: <containerPath>: filesystem was mounted, but any subsequent
operation failed: Unknown error 5005.
(Note that the above error message from mount(8) was subsequently
improved to show something more meaningful than 'Unknown error' [2].)
The problem is that 'init-container' is running inside the container's
mount and user namespace, and the source paths were mounted inside the
host's namespace with 'nosuid,nodev,noexec'. The above mount(8) call
tries to remove the 'nosuid,nodev,noexec' flags from the mount point and
replace them with only 'ro', which is something that can't be done from
a child namespace.
Note that this doesn't fail when Toolbx is running as root. This is
because the container uses the host's user namespace and is able to
remove the 'nosuid,nodev,noexec' flags from the mount point and replace
them with only 'ro'. Even though it doesn't fail, the flags shouldn't
get replaced like that inside the container, because it removes the
security hardening of those mount points.
There's actually no benefit in bind mounting these paths as read-only.
It was historically done this way 'just to be safe' because a user isn't
expected to write to these locations from inside a container. However,
Toolbx doesn't intend to provide any heightened security beyond what's
already available on the host.
Hence, it's better to get out of the way and leave it to the permissions
on the source location from the host operating system to guard the
castle. This is accomplished by not passing any file system options to
mount(8) [1].
Based on an idea from Si.
[1] https://man7.org/linux/man-pages/man8/mount.8.html
[2] util-linux commit 9420ca34dc8b6f0f
https://github.com/util-linux/util-linux/commit/9420ca34dc8b6f0f
https://github.com/util-linux/util-linux/pull/2376
https://github.com/containers/toolbox/issues/911
---
src/cmd/initContainer.go | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/cmd/initContainer.go b/src/cmd/initContainer.go
index 222aa42e1036..41b825b33f58 100644
--- a/src/cmd/initContainer.go
+++ b/src/cmd/initContainer.go
@@ -62,10 +62,10 @@ var (
{"/run/udev/data", "/run/host/run/udev/data", ""},
{"/run/udev/tags", "/run/host/run/udev/tags", ""},
{"/tmp", "/run/host/tmp", "rslave"},
- {"/var/lib/flatpak", "/run/host/var/lib/flatpak", "ro"},
+ {"/var/lib/flatpak", "/run/host/var/lib/flatpak", ""},
{"/var/lib/libvirt", "/run/host/var/lib/libvirt", ""},
- {"/var/lib/systemd/coredump", "/run/host/var/lib/systemd/coredump", "ro"},
- {"/var/log/journal", "/run/host/var/log/journal", "ro"},
+ {"/var/lib/systemd/coredump", "/run/host/var/lib/systemd/coredump", ""},
+ {"/var/log/journal", "/run/host/var/log/journal", ""},
{"/var/mnt", "/run/host/var/mnt", "rslave"},
}
)
--
2.41.0